TF-Lite uses gemmlowp for matrix multiplication, which stores the results of uint8 matrix products in int32 accumulators. The biases, which are quantized at higher precision as int32, can then be added directly to those accumulators. Finally, the int32 result is scaled back down to 8 bits.
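The pipeline above can be sketched in a few lines of NumPy. This is a minimal illustration, not TF-Lite's actual kernel: the shapes, bias values, and the `multiplier` are made up, and real implementations replace the floating-point multiplier with a fixed-point multiply plus shift.

```python
import numpy as np

# Two small uint8 matrices standing in for quantized activations/weights.
lhs = np.array([[12, 250], [3, 97]], dtype=np.uint8)
rhs = np.array([[200, 5], [14, 60]], dtype=np.uint8)
bias = np.array([1000, -500], dtype=np.int32)  # bias quantized directly to int32

# Accumulate the uint8 products in int32 so nothing overflows.
acc = lhs.astype(np.int32) @ rhs.astype(np.int32)
acc += bias

# Requantize: scale the int32 accumulator back into [0, 255] uint8.
# (Illustrative scale; in gemmlowp terms it plays the role of
# lhs_scale * rhs_scale / result_scale.)
multiplier = 0.01
result = np.clip(np.round(acc * multiplier), 0, 255).astype(np.uint8)
```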

https://github.com/google/gemmlowp/blob/master/doc/quantization.md

 

https://github.com/google/gemmlowp/blob/master/doc/quantization_example.cc

  • Quantization as an affine map
    • The mapping between a quantized uint8 value and a real value can be written as
      real_value = A * quantized_value + B              (1)

      real_value = C * (quantized_value + D)            (2)
  • For implementation convenience (e.g. zero-padding), the quantized value corresponding to the real value 0 is called the zero-point.
  • Quantizing a matrix multiplication
    • Rewrite the matrix product in terms of the quantized values.
    • Given two real matrices (lhs_real_matrix, rhs_real_matrix), each entry satisfies
      lhs_real_value[i] = lhs_scale * (lhs_quantized_value[i] - lhs_zero_point)

      rhs_real_value[i] = rhs_scale * (rhs_quantized_value[i] - rhs_zero_point)   (3)
    • Each entry of the matrix product, result_real_value, is then
    • ...더보기
      result_real_value
           = Sum_over_i(lhs_real_value[i] * rhs_real_value[i])
           = Sum_over_i(
                  lhs_scale * (lhs_quantized_value[i] - lhs_zero_point) *
                  rhs_scale * (rhs_quantized_value[i] - rhs_zero_point) )
           = lhs_scale * rhs_scale * Sum_over_i(
                  (lhs_quantized_value[i] - lhs_zero_point) *
                  (rhs_quantized_value[i] - rhs_zero_point)
              )                                                                               (4)
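Equation (4) can be checked numerically: the dot product of the real values must equal lhs_scale * rhs_scale times the dot product of the zero-point-shifted quantized values. The scales, zero-points, and vectors below are made-up example values.

```python
import numpy as np

# Illustrative quantization parameters (not from any real model).
lhs_scale, lhs_zero_point = 0.05, 120
rhs_scale, rhs_zero_point = 0.02, 100

lhs_q = np.array([130, 110, 200], dtype=np.uint8)
rhs_q = np.array([90, 150, 100], dtype=np.uint8)

# Recover the real values from the quantized ones, as in equation (3).
lhs_real = lhs_scale * (lhs_q.astype(np.int32) - lhs_zero_point)
rhs_real = rhs_scale * (rhs_q.astype(np.int32) - rhs_zero_point)

# Equation (4): pull both scales out of the sum and accumulate
# the zero-point-shifted products in integer arithmetic.
int_sum = np.sum((lhs_q.astype(np.int32) - lhs_zero_point) *
                 (rhs_q.astype(np.int32) - rhs_zero_point))
result_real = lhs_scale * rhs_scale * int_sum

assert np.isclose(result_real, np.dot(lhs_real, rhs_real))
```

The practical point is that `int_sum` only involves integer operations, so the expensive inner loop stays in int32; the two scales are applied once at the end.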
