Model Quantization

https://www.zhihu.com/question/362455124?sort=created

Intel's model-compression toolkit: distiller (key resource)

https://github.com/NervanaSystems/distiller (supports PyTorch; officially tested on PyTorch 1.3; the top-starred result when searching "PyTorch Pruning" on GitHub)

Microsoft's AutoML toolkit NNI also includes a model-compression module: https://github.com/microsoft/nni/blob/master/examples/model_compress/QAT_torch_quantizer.py

https://nni.readthedocs.io/zh/latest/Compressor/Quantizer.html

https://github.com/microsoft/nni/blob/master/examples/model_compress/DoReFaQuantizer_torch_mnist.py

PyTorch's built-in quantization tooling (PyTorch 1.3+)

https://zhuanlan.zhihu.com/p/81026071

https://github.com/pytorch/glow/blob/master/docs/Quantization.md

https://github.com/pytorch/QNNPACK
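Before diving into those links, it helps to see what "quantizing the weights" actually means. Below is a minimal pure-Python sketch of per-tensor symmetric INT8 quantization, which (as I understand it) is the scheme PyTorch's dynamic quantization applies to Linear weights; this is an illustration of the math, not the actual torch API.

```python
def quantize_symmetric_int8(weights):
    """Per-tensor symmetric INT8 quantization: pick one scale from the
    max magnitude, then map each float to an integer in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the INT8 values."""
    return [qi * scale for qi in q]

w = [0.5, -1.27, 0.031, 1.0]
q, s = quantize_symmetric_int8(w)   # q = [50, -127, 3, 100]
w_hat = dequantize(q, s)            # close to w; error bounded by scale/2
```

The round trip loses at most half a quantization step per weight, which is why INT8 usually costs little accuracy while cutting storage 4x.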

Judging from the examples, NNI's quantization looks quite easy to use (PyTorch's official quantization docs made my head spin; they're not suited to quantization beginners).


NNI ships four quantizers:

1. Naive Quantizer, 2. QAT Quantizer, 3. DoReFa Quantizer, 4. BNN Quantizer

Of these, #1 seems the most basic: it simply casts FP32 to INT8 at inference time, so I'd rather not use it. #4 is a binarized neural network; what is that exactly? Bitwise arithmetic sounds appealing, but I don't know whether deployment hides any pitfalls.
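To answer my own question about #4: in a binarized network every weight is collapsed to its sign, plus one real-valued scaling factor per tensor, so dot products reduce to XNOR + popcount on hardware. A rough sketch of the idea (in the style of BinaryConnect/XNOR-Net, not NNI's actual implementation):

```python
def binarize(weights):
    """Binarize weights: keep only the sign of each weight, plus one
    real-valued scale alpha = mean(|w|) for the whole tensor."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, alpha

def binary_dot(signs_a, signs_b, alpha_a, alpha_b):
    """Dot product of two binarized vectors. On real hardware the
    +1/-1 products collapse to XNOR + popcount; here we just count
    matching signs: dot = 2 * matches - n."""
    n = len(signs_a)
    matches = sum(1 for a, b in zip(signs_a, signs_b) if a == b)
    return alpha_a * alpha_b * (2 * matches - n)
```

That XNOR/popcount trick is the whole appeal; the deployment "pitfall" is that you need a backend that actually executes it bitwise, otherwise you get no speedup.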

#2: Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018). NNI's docs say it does not support batch-norm folding; not sure how much that matters.

#3: DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients (Megvii/Face++, 2016). The example looks simple. https://arxiv.org/abs/1606.06160, https://blog.csdn.net/langzi453/article/details/88172080
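DoReFa's weight quantization really is simple: squash weights into [0, 1] with tanh, snap to one of 2^k levels, then shift back to [-1, 1]. A pure-Python sketch of the paper's weight-quantization rule as I read it:

```python
import math

def quantize_k(r, k):
    """DoReFa's basic quantizer: map r in [0, 1] to one of 2^k levels."""
    n = (1 << k) - 1
    return round(r * n) / n

def dorefa_weight(w, w_all, k):
    """DoReFa k-bit weight quantization: squash with tanh into [-1, 1]
    (normalized by the max over all weights w_all), shift to [0, 1],
    quantize, then shift back to [-1, 1]."""
    max_abs = max(abs(math.tanh(x)) for x in w_all)
    r = math.tanh(w) / (2 * max_abs) + 0.5
    return 2 * quantize_k(r, k) - 1
```

With k=1 this degenerates to binarization, which is why DoReFa and BNN are often discussed together.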

Is NNI's quantization only simulation rather than real acceleration? https://github.com/microsoft/nni/issues/2332

https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py

Papers with Code GitHub ranking for quantization:

https://paperswithcode.com/search?q_meta=&q=Quantizer

#1: Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018)

https://paperswithcode.com/paper/quantization-and-training-of-neural-networks    TensorFlow only, which puts me off

#2: Training with Quantization Noise for Extreme Model Compression (2020)

https://paperswithcode.com/paper/training-with-quantization-noise-for-extreme PyTorch implementation
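The Quant-Noise idea is easy to state: at each training step, fake-quantize only a random fraction of the weights, so most gradients bypass the non-differentiable rounding entirely. A toy sketch of that idea (my own simplification, using symmetric int8 as the underlying quantizer; the paper also covers product quantization):

```python
import random

def quant_noise(weights, p, bits=8, rng=None):
    """One Quant-Noise step (sketch): fake-quantize a random fraction
    p of the weights, leave the rest in full precision."""
    rng = rng or random.Random(0)
    qmax = (1 << (bits - 1)) - 1            # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / qmax
    return [round(w / scale) * scale if rng.random() < p else w
            for w in weights]
```

p = 1.0 recovers ordinary QAT; the paper's point is that p < 1 trains better at very low bitwidths.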


Model-compression benchmarks: https://paperswithcode.com/task/model-compression

Quantization benchmarks: https://paperswithcode.com/task/quantization


Another DoReFa implementation: https://github.com/666DZY666/model-compression

Another QAT implementation: https://github.com/Xilinx/brevitas
