TF-Lite uses gemmlowp for matrix multiplication, which accumulates the products of uint8 matrices in int32. Biases, quantized at higher precision as int32, can then be added directly to the int32 accumulator. Finally, the int32 result is scaled back down to uint8.
https://github.com/google/gemmlowp/blob/master/doc/quantization.md
https://github.com/google/gemmlowp/blob/master/doc/quantization_example.cc
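The pipeline above can be sketched in NumPy. The matrices, bias, output multiplier, and output zero-point below are made-up illustration values, not taken from a real model; real gemmlowp uses a fixed-point multiplier plus shift rather than a float multiplier for the final rescale.

```python
import numpy as np

# Hypothetical quantized activations/weights (uint8) and int32-quantized bias.
lhs = np.array([[130, 140], [120, 110]], dtype=np.uint8)
rhs = np.array([[100, 90], [80, 70]], dtype=np.uint8)
bias_int32 = np.array([1000, -500], dtype=np.int32)

# uint8 products are accumulated in int32, as in gemmlowp; the int32 bias
# is then added directly to the accumulator.
acc = lhs.astype(np.int32) @ rhs.astype(np.int32) + bias_int32

# Finally, scale the int32 accumulator back down to uint8
# (multiplier/zero-point chosen only for illustration).
output_multiplier, output_zero_point = 0.005, 128
out = np.clip(np.round(acc * output_multiplier) + output_zero_point,
              0, 255).astype(np.uint8)
```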
- Quantization as an affine map
- The mapping between a quantized uint8 value and the real value it represents can be written as an affine map:

real_value = A * quantized_value + B (1)
real_value = C * (quantized_value + D) (2)

- The quantized value corresponding to real value 0 is called the zero-point. For implementation advantages such as zero-padding, real 0 must map exactly onto a quantized value.
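A minimal sketch of this affine map, assuming an illustrative real range of [-1.0, 1.0] (the scale and zero-point are derived from that assumed range, not from any real model):

```python
import numpy as np

# Form (2): real = scale * (quantized - zero_point), over an assumed range.
real_min, real_max = -1.0, 1.0
scale = (real_max - real_min) / 255.0
zero_point = int(round(-real_min / scale))  # quantized value representing real 0

def quantize(x):
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q):
    return scale * (q.astype(np.int32) - zero_point)

# Real 0 maps exactly onto the zero-point, so padding the quantized
# buffer with zero_point is exactly padding with real 0.
assert quantize(np.array([0.0]))[0] == zero_point
```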
- Quantizing a matrix multiplication
- We rewrite the matrix multiplication in terms of quantized values.
- Given two real matrices, lhs_real_matrix and rhs_real_matrix, each entry is
lhs_real_value[i] = lhs_scale * (lhs_quantized_value[i] - lhs_zero_point)
rhs_real_value[i] = rhs_scale * (rhs_quantized_value[i] - rhs_zero_point) (3)

- The matrix product result_real_value is then
result_real_value
= Sum_over_i(lhs_real_value[i] * rhs_real_value[i])
= Sum_over_i(
    lhs_scale * (lhs_quantized_value[i] - lhs_zero_point) *
    rhs_scale * (rhs_quantized_value[i] - rhs_zero_point) )
= lhs_scale * rhs_scale * Sum_over_i(
    (lhs_quantized_value[i] - lhs_zero_point) *
    (rhs_quantized_value[i] - rhs_zero_point)
) (4)
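Equation (4) can be checked numerically: the integer-only sum, scaled once by lhs_scale * rhs_scale at the end, matches the product of the dequantized real matrices. The scales and zero-points below are arbitrary illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative quantization parameters (not from a real model).
lhs_scale, lhs_zero_point = 0.02, 120
rhs_scale, rhs_zero_point = 0.05, 130

lhs_q = rng.integers(0, 256, size=(3, 4), dtype=np.uint8)
rhs_q = rng.integers(0, 256, size=(4, 2), dtype=np.uint8)

# Left-hand side of (4): product of the dequantized real matrices.
lhs_real = lhs_scale * (lhs_q.astype(np.int32) - lhs_zero_point)
rhs_real = rhs_scale * (rhs_q.astype(np.int32) - rhs_zero_point)
result_real = lhs_real @ rhs_real

# Right-hand side of (4): integer-only accumulation, scaled once at the end.
int_sum = ((lhs_q.astype(np.int32) - lhs_zero_point) @
           (rhs_q.astype(np.int32) - rhs_zero_point))
result_from_int = lhs_scale * rhs_scale * int_sum

assert np.allclose(result_real, result_from_int)
```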