Deep learning model compression and acceleration exploit the redundancy in a neural network's parameters and structure to slim the model down, yielding a model with fewer parameters and a leaner architecture without hurting task performance. The compressed model needs less compute and memory than the original and can therefore serve a much wider range of applications. As deep learning grows ever more popular, the strong demand for deploying these models has drawn particular attention to "small models" that occupy little memory, require little compute, and still maintain high accuracy. Exploiting network redundancy for model compression and acceleration has attracted broad interest from both academia and industry, and new work keeps emerging.
This post is a summary and study based on the survey 《深度学习模型压缩与加速综述》 (Survey of Deep Learning Model Compression and Acceleration), published in Journal of Software (软件学报) in 2021.
Related posts:
Deep Learning Model Compression and Acceleration Techniques (1): Parameter Pruning
Deep Learning Model Compression and Acceleration Techniques (2): Parameter Quantization
Deep Learning Model Compression and Acceleration Techniques (3): Low-Rank Decomposition
Deep Learning Model Compression and Acceleration Techniques (4): Parameter Sharing
Deep Learning Model Compression and Acceleration Techniques (5): Compact Networks
Deep Learning Model Compression and Acceleration Techniques (6): Knowledge Distillation
Deep Learning Model Compression and Acceleration Techniques (7): Hybrid Approaches
Compression and acceleration technique | Description |
---|---|
Parameter pruning (A) | Design a criterion for parameter importance, use it to judge how important each network parameter is, and remove the redundant ones |
Parameter quantization (A) | Quantize network parameters from 32-bit full-precision floating point to lower bit widths (see the sketch after this table) |
Low-rank decomposition (A) | Decompose high-dimensional parameter vectors into sparse low-dimensional vectors |
Parameter sharing (A) | Map the network's internal parameters using structured matrices or clustering |
Compact networks (B) | Design new lightweight networks at three levels: convolution kernels, special layers, and overall network structure |
Knowledge distillation (B) | Distill the knowledge of a larger teacher model into a smaller student model |
Hybrid approaches (A+B) | Combinations of the methods above |
A: compresses parameters; B: compresses structure
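As a toy illustration of the class-A idea in the parameter quantization row, the sketch below maps a 32-bit float weight tensor to int8 using simple symmetric linear quantization. This is my own illustrative example, not code from the survey; the function names and the scale rule are assumptions chosen for clarity.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric linear quantization: float32 weights -> int8 values plus a scale."""
    scale = max(np.abs(weights).max() / 127.0, 1e-12)   # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from the int8 representation."""
    return q.astype(np.float32) * scale

# A random "layer" of weights: storage drops to a quarter, at a small error cost.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
print(w.nbytes, "bytes ->", q.nbytes, "bytes,",
      "max abs error:", np.abs(w - dequantize(q, scale)).max())
```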
Designing new, more compact network architectures is an emerging idea for network compression and acceleration: specially structured filters, layers, or even whole networks are constructed and trained from scratch, reaching performance suitable for deployment on resource-constrained devices such as mobile platforms. Unlike parameter-compression methods, there is no need to store a pretrained model or to fine-tune it to recover accuracy, which lowers the time cost; the resulting networks feature small storage footprints, low computation, and good performance. The drawbacks are that the special structures are hard to combine with other compression and acceleration methods, and that such networks generalize less well, making them unsuitable as pretrained models for helping to train other models.
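To make the filter-level idea concrete, here is a minimal sketch of a depthwise separable convolution, the building block popularized by MobileNet. This is my own illustrative PyTorch code, not the survey's: the class name, channel counts, and the BatchNorm/ReLU placement are assumptions chosen for clarity.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """MobileNet-style block: a per-channel (depthwise) 3x3 convolution
    followed by a 1x1 (pointwise) convolution that mixes channels.
    Relative to a standard 3x3 convolution, parameters and FLOPs shrink
    by roughly a factor of 1/out_channels + 1/9."""

    def __init__(self, in_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        # groups=in_channels makes each 3x3 filter see only its own input channel.
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        # The 1x1 convolution recombines information across channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.relu(self.bn1(self.depthwise(x)))
        x = self.relu(self.bn2(self.pointwise(x)))
        return x

# Quick check: parameter count vs. a standard 3x3 convolution of the same shape.
block = DepthwiseSeparableConv(64, 128)
standard = nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False)
print(sum(p.numel() for p in block.parameters()),        # ~9.2K parameters
      "vs", sum(p.numel() for p in standard.parameters()))  # ~73.7K parameters
```

Because such a block is trained from scratch as part of the architecture, there is no pretrained model to store and no fine-tuning stage, which is exactly the time-cost advantage described above.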
Main reference: 高晗, 田育龙, 许封元, 仲盛. 深度学习模型压缩与加速综述 [Survey of deep learning model compression and acceleration]. 软件学报 (Journal of Software), 2021, 32(1): 68-92. DOI: 10.13328/j.cnki.jos.006096.