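Model compression and acceleration technique | Description |
Parameter pruning (A) | Define a criterion for parameter importance, use it to judge how important each network parameter is, and remove the redundant parameters |
Parameter quantization (A) | Quantize network parameters from 32-bit full-precision floating point to a lower bit width |
Low-rank decomposition (A) | Reduce and decompose high-dimensional parameter vectors into sparse low-dimensional vectors |
Parameter sharing (A) | Use structured matrices or clustering to map the network's internal parameters |
Compact networks (B) | Design new lightweight networks at three levels: convolution kernels, special layers, and network structure |
Knowledge distillation (B) | Distill the information of a larger teacher model into a smaller student model |
Hybrid approaches (A+B) | A combination of the methods above |
A: parameter compression; B: structure compression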
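To make the first two rows of the table concrete, the following is a minimal sketch in plain Python, not a reference implementation. It assumes absolute magnitude as the importance criterion for pruning and symmetric uniform rounding for quantization; neither choice is prescribed by the table, and the function names `magnitude_prune`, `quantize_uniform`, and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the `sparsity` fraction of weights with the smallest |w|.

    Assumption: absolute value serves as the importance criterion.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_uniform(weights: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed `bits`-bit integer codes plus one shared scale.

    Assumption: symmetric uniform quantization with a per-tensor scale.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2")
    qmax = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2]

pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.9, 0.0, 0.3, -0.7, 0.0, 0.0]

codes, scale = quantize_uniform(weights, bits=8)
restored = dequantize(codes, scale)
max_error = max(abs(r - w) for r, w in zip(restored, weights))
print(codes, max_error <= scale / 2 + 1e-12)
```

Both functions keep the network structure unchanged and only alter the stored parameters, which is why the table places them in category A. In practice either step is usually followed by fine-tuning to recover accuracy, which this sketch leaves out.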