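Deep learning models are often hard to deploy in certain scenarios and on certain devices because of their computational complexity or parameter redundancy. Techniques such as model compression, optimized acceleration, and heterogeneous computing are needed to overcome these bottlenecks.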
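Because deep learning models are sparse, they can be pruned into structurally compact networks. Pruning methods fall into two groups: structured pruning and unstructured pruning.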
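The difference between the two pruning styles can be illustrated with a minimal sketch. It assumes magnitude-based criteria (absolute value for individual weights, L1 norm for whole rows), which is only one common choice and not necessarily the criterion used by any method discussed here. The function names `unstructured_prune` and `structured_prune` are hypothetical, and a weight matrix is modeled as a plain list of rows, one row per output channel.

```python
from typing import List

Matrix = List[List[float]]


def unstructured_prune(weights: Matrix, sparsity: float) -> Matrix:
    """Zero out the smallest-magnitude individual weights; the shape is unchanged.

    Assumption: magnitude is used as the importance score. Ties at the
    threshold are all zeroed, so slightly more than `sparsity` may be pruned.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)  # number of weights to zero out
    if k == 0:
        return [row[:] for row in weights]
    threshold = flat[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def structured_prune(weights: Matrix, keep: int) -> Matrix:
    """Keep the `keep` rows (output channels) with the largest L1 norm.

    Whole rows are removed, so the result is a smaller dense matrix.
    """
    norms = [sum(abs(w) for w in row) for row in weights]
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original channel order
    return [weights[i][:] for i in kept]


if __name__ == "__main__":
    W = [[0.9, -0.05, 0.4],
         [0.01, 0.02, -0.03],
         [-0.7, 0.6, 0.1]]
    print(unstructured_prune(W, 0.5))  # same 3x3 shape, small entries set to 0.0
    print(structured_prune(W, 2))      # 2x3 matrix, weakest row removed
```

Unstructured pruning leaves a sparse matrix of the original shape, which typically needs sparse kernels or hardware support to run faster. Structured pruning yields a smaller dense matrix that ordinary dense kernels can execute directly.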
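Model quantization means that weights or activation outputs are clustered onto a set of discrete, low-precision values. It usually depends on support from a specific algorithm library or hardware platform.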
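As a rough illustration of mapping values onto discrete low-precision points, the sketch below applies symmetric, per-tensor, 8-bit linear quantization. This particular scheme is an assumption chosen for simplicity; real libraries and hardware platforms differ in calibration, rounding, and granularity. The names `quantize_int8` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] with a single shared scale.

    Assumption: symmetric per-tensor quantization, with the scale taken
    from the largest absolute value (no calibration data is used).
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer codes."""
    return [x * scale for x in q]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    codes, scale = quantize_int8(weights)
    print(codes, scale)
    print(dequantize(codes, scale))  # close to the original weights
```

Each value is stored as an 8-bit integer code plus one shared floating-point scale, and the reconstruction error per value is bounded by about half the scale.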
For a discussion of knowledge distillation, see:
https://blog.csdn.net/nature553863/article/details/80568658
Discrimination-aware Channel Pruning is a stage-wise channel selection pruning strategy that incorporates discrimination-aware auxiliary losses. For details, see:
https://blog.csdn.net/nature553863/article/details/83822895
Source: https://blog.csdn.net/nature553863/article/details/81083955