TF-Lite uses gemmlowp for matrix multiplication, which stores the results of uint8 matrix products in int32 accumulators. The biases, which are quantized at higher precision as int32, can then be added directly to those accumulators. Finally, the int32 result is scaled back down to 8 bits.
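The pipeline above can be sketched in a few lines of NumPy. This is a minimal illustration, not TF-Lite's actual kernel: the shapes, bias values, and the `multiplier` are made up, and real implementations replace the floating-point multiplier with a fixed-point multiply plus shift.

```python
import numpy as np

# Two small uint8 matrices standing in for quantized activations/weights.
lhs = np.array([[12, 250], [3, 97]], dtype=np.uint8)
rhs = np.array([[200, 5], [14, 60]], dtype=np.uint8)
bias = np.array([1000, -500], dtype=np.int32)  # bias quantized directly to int32

# Accumulate the uint8 products in int32 so nothing overflows.
acc = lhs.astype(np.int32) @ rhs.astype(np.int32)
acc += bias

# Requantize: scale the int32 accumulator back into [0, 255] uint8.
# (Illustrative scale; in gemmlowp terms it plays the role of
# lhs_scale * rhs_scale / result_scale.)
multiplier = 0.01
result = np.clip(np.round(acc * multiplier), 0, 255).astype(np.uint8)
```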

https://github.com/google/gemmlowp/blob/master/doc/quantization.md

 

https://github.com/google/gemmlowp/blob/master/doc/quantization_example.cc

  • Quantization as an affine map
    • The mapping between a quantized uint8 value and a real value can be written as
      real_value = A * quantized_value + B              (1)

      real_value = C * (quantized_value + D)            (2)
  • For implementation convenience (e.g. zero-padding), the quantized value corresponding to the real value 0 is called the zero-point.
  • Quantizing a matrix multiplication
    • Rewrite the matrix product in terms of the quantized values.
    • Given two real matrices (lhs_real_matrix, rhs_real_matrix), each entry satisfies
      lhs_real_value[i] = lhs_scale * (lhs_quantized_value[i] - lhs_zero_point)

      rhs_real_value[i] = rhs_scale * (rhs_quantized_value[i] - rhs_zero_point)   (3)
    • Each entry of the matrix product, result_real_value, is then
    • ...더보기
      result_real_value
           = Sum_over_i(lhs_real_value[i] * rhs_real_value[i])
           = Sum_over_i(
                  lhs_scale * (lhs_quantized_value[i] - lhs_zero_point) *
                  rhs_scale * (rhs_quantized_value[i] - rhs_zero_point) )
           = lhs_scale * rhs_scale * Sum_over_i(
                  (lhs_quantized_value[i] - lhs_zero_point) *
                  (rhs_quantized_value[i] - rhs_zero_point)
              )                                                                               (4)
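Equation (4) can be checked numerically: the dot product of the real values must equal lhs_scale * rhs_scale times the dot product of the zero-point-shifted quantized values. The scales, zero-points, and vectors below are made-up example values.

```python
import numpy as np

# Illustrative quantization parameters (not from any real model).
lhs_scale, lhs_zero_point = 0.05, 120
rhs_scale, rhs_zero_point = 0.02, 100

lhs_q = np.array([130, 110, 200], dtype=np.uint8)
rhs_q = np.array([90, 150, 100], dtype=np.uint8)

# Recover the real values from the quantized ones, as in equation (3).
lhs_real = lhs_scale * (lhs_q.astype(np.int32) - lhs_zero_point)
rhs_real = rhs_scale * (rhs_q.astype(np.int32) - rhs_zero_point)

# Equation (4): pull both scales out of the sum and accumulate
# the zero-point-shifted products in integer arithmetic.
int_sum = np.sum((lhs_q.astype(np.int32) - lhs_zero_point) *
                 (rhs_q.astype(np.int32) - rhs_zero_point))
result_real = lhs_scale * rhs_scale * int_sum

assert np.isclose(result_real, np.dot(lhs_real, rhs_real))
```

The practical point is that `int_sum` only involves integer operations, so the expensive inner loop stays in int32; the two scales are applied once at the end.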
