Gemmlowp

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. Quantization scheme, Equation (1): \[ r = S(q - Z) \] where S is the scale and Z is the zero-point.
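To make Equation (1) concrete, here is a minimal Python sketch of the mapping in both directions. The unsigned 8-bit range (0 to 255), the function names, and the example values of S and Z are assumptions chosen for illustration; they are not taken from the post or from gemmlowp's API.

```python
def quantize(r, scale, zero_point, qmin=0, qmax=255):
    """Map a real value r to an integer q so that r is approximately S * (q - Z)."""
    q = round(r / scale) + zero_point
    # Clamp to the assumed uint8 range so out-of-range inputs saturate.
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Equation (1): r = S * (q - Z)."""
    return scale * (q - zero_point)


# Hypothetical parameters, for illustration only.
scale, zero_point = 0.5, 128

q = quantize(3.2, scale, zero_point)
r = dequantize(q, scale, zero_point)
print(q, r)
```

Note that the real value 0 maps exactly to q = Z, so zero is representable without error under this scheme.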

Quantization

2019-03-17

A Summary of Quantization Algorithms

https://nervanasystems.github.io/distiller/algo_quantization/index.html The quantization algorithms are based on range-based linear quantization. Breaking down these terms: Linear: means a float value is quantized by multiplying it with a numeric constant (the scale factor).

Quantization

2019-03-16

TensorRT Quantization Guide

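TensorRT quantization guide. Symmetric linear quantization: \[ \text{TensorValues} = \text{FP32 scale factor} \times \text{int8 array} \] One FP32 scale factor is used for the entire int8 tensor. Q: How should the scale factor be set? Non-saturating approach: map |max| to 127. As the figure below shows, the mapping above generally results in …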
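The non-saturating mapping described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the example tensor are hypothetical, and it does not use or represent the TensorRT API.

```python
def symmetric_scale(values, qmax=127):
    """Non-saturating scale: the largest magnitude |max| maps to qmax (127)."""
    max_abs = max(abs(v) for v in values)
    # Fall back to 1.0 for an all-zero tensor to avoid division by zero.
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize_symmetric(values, scale, qmax=127):
    """Round each value to the nearest int8 step, clamped to [-qmax, qmax]."""
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize_symmetric(q_values, scale):
    """TensorValues = FP32 scale factor * int8 array."""
    return [scale * q for q in q_values]


# Hypothetical tensor values, for illustration only.
weights = [-2.54, 0.0, 0.5, 1.0]

scale = symmetric_scale(weights)          # one scale for the whole tensor
q = quantize_symmetric(weights, scale)
approx = dequantize_symmetric(q, scale)
print(scale, q, approx)
```

Because the scale is set by the single largest magnitude, one outlier stretches the range and leaves fewer int8 steps for the remaining values.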

Category: Quantization
