Deep Compression/Acceleration (a summary of model compression and acceleration)

Index of model compression papers

  • Structure
      • [ICCV2019] Searching for MobileNetV3
      • [BMVC2018] IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks
      • [CVPR2018] IGCV2: Interleaved Structured Sparse Convolutional Neural Networks
      • [CVPR2018] MobileNetV2: Inverted Residuals and Linear Bottlenecks
      • [ECCV2018] ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
  • Quantization
      • Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
      • [FPGA2017] FINN: A Framework for Fast, Scalable Binarized Neural Network Inference
      • [arXiv2016] DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
      • [ECCV2016] XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
      • [arXiv2016] Ternary Weight Networks
      • [CVPR2018] Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
      • [JMLR2018] Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
      • [CVPR2018] Two-Step Quantization for Low-bit Neural Networks
  • Pruning
    • Channel pruning
      • [NIPS2018] Discrimination-aware Channel Pruning for Deep Neural Networks
      • [ICCV2017] Channel Pruning for Accelerating Very Deep Neural Networks
      • [ECCV2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
      • [ICCV2017] Learning Efficient Convolutional Networks through Network Slimming
      • [ICLR2018] Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers
      • [CVPR2018] NISP: Pruning Networks using Neuron Importance Score Propagation
      • [ICCV2017] ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
    • Sparsity
      • SBNet: Sparse Blocks Network for Fast Inference
      • To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression
      • Submanifold Sparse Convolutional Networks
      • MorphNet: Fast & Simple Resource-Constrained Learning of Deep Network Structure
  • Fusion
  • Distillation
      • [NIPS2014] Distilling the Knowledge in a Neural Network
  • Comprehensive
      • [ICLR2016] Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding
      • Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification

Based on my own understanding, research on model compression falls into the following seven directions:

Structure

[ICCV2019] Searching for MobileNetV3

  • intro: neural architecture search (NAS), reinforcement learning; a sketch of its hard-swish activation follows below
  • arxiv: https://arxiv.org/abs/1905.02244
  • github: https://github.com/xiaolai-sqlai/mobilenetv3
  • github: https://github.com/leaderj1001/MobileNetV3-Pytorch
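
MobileNetV3 is found with platform-aware NAS plus manual refinements such as the hard-swish activation and squeeze-and-excite blocks. Below is a minimal, illustrative PyTorch sketch of hard-swish only (not the authors' code); the test values are arbitrary.

```python
import torch
import torch.nn.functional as F

def hard_swish(x: torch.Tensor) -> torch.Tensor:
    # h-swish(x) = x * ReLU6(x + 3) / 6 -- a cheap, piecewise-linear approximation of swish
    return x * F.relu6(x + 3.0) / 6.0

print(hard_swish(torch.linspace(-4.0, 4.0, 9)))
```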

[BMVC2018] IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks

  • intro:
  • arxiv:https://arxiv.org/abs/1806.00178
  • github:https://github.com/homles11/IGCV3

[CVPR2018] IGCV2: Interleaved Structured Sparse Convolutional Neural Networks

  • intro:
  • arxiv:https://arxiv.org/abs/1804.06202
  • github: same as above (the IGCV3 repository)

[CVPR2018] MobileNetV2: Inverted Residuals and Linear Bottlenecks

  • intro: inverted residual blocks with linear bottlenecks (minimal sketch below)
  • arxiv:https://arxiv.org/abs/1801.04381
  • github:https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet
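
A minimal sketch of the inverted residual block (1x1 expansion, 3x3 depthwise convolution, 1x1 linear projection, with a residual connection when shapes match). The expansion factor of 6 and the channel sizes below are illustrative defaults, not taken from this list.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Expand -> depthwise conv -> linear 1x1 projection; skip connection when stride=1 and c_in==c_out."""
    def __init__(self, c_in: int, c_out: int, stride: int = 1, expand: int = 6):
        super().__init__()
        hidden = c_in * expand
        self.use_res = stride == 1 and c_in == c_out
        self.block = nn.Sequential(
            nn.Conv2d(c_in, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, c_out, 1, bias=False), nn.BatchNorm2d(c_out),  # linear bottleneck: no activation
        )

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_res else y

print(InvertedResidual(32, 32)(torch.randn(1, 32, 56, 56)).shape)  # torch.Size([1, 32, 56, 56])
```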

[ECCV2018] ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

  • intro: practical design guidelines; channel split and channel shuffle (shuffle sketched below)
  • arxiv:https://arxiv.org/abs/1807.11164
  • github:
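
A recurring idea in the ShuffleNet family is the channel shuffle that mixes information across grouped branches. A minimal sketch (groups=2 is chosen only for the demo):

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    # Reorder channels so each group after the shuffle contains channels from every group before it.
    n, c, h, w = x.shape
    return x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous().view(n, c, h, w)

x = torch.arange(8.0).view(1, 8, 1, 1)
print(channel_shuffle(x, 2).flatten())  # channels 0..7 -> 0, 4, 1, 5, 2, 6, 3, 7
```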

Quantization

Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1

  • intro: binary (1-bit) networks; a minimal sketch of sign binarization with a straight-through estimator follows below
  • arxiv:https://arxiv.org/abs/1602.02830
  • github: https://github.com/MatthieuCourbariaux/BinaryNet
    https://github.com/itayhubara/BinaryNet
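
The core trick is to binarize to {-1, +1} in the forward pass and use a straight-through estimator (gradient passed only where |x| <= 1) in the backward pass. A minimal illustrative sketch, not the reference implementation:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Forward: sign(x) in {-1, +1}. Backward: straight-through estimator, clipped to |x| <= 1."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)

w = torch.randn(4, requires_grad=True)
BinarizeSTE.apply(w).sum().backward()
print(w.grad)  # 1 where |w| <= 1, else 0
```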

[FPGA2017] FINN: A Framework for Fast, Scalable Binarized Neural Network Inference

  • intro: binarized networks (FPGA inference)
  • pdf:http://www.idi.ntnu.no/~yamanu/2017-fpga-finn-preprint.pdf
  • github:https://github.com/Xilinx/FINN

[arXiv2016] DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients

  • intro: low bit-width weights, activations, and gradients; the core k-bit quantizer is sketched below
  • arxiv:https://arxiv.org/abs/1606.06160
  • github:https://github.com/tensorpack/tensorpack/tree/master/examples/DoReFa-Net
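
The building block of DoReFa is a uniform k-bit quantizer applied to values already mapped into [0, 1]; the straight-through estimator and the separate weight/activation/gradient transforms are omitted in this minimal sketch:

```python
import torch

def quantize_k(x: torch.Tensor, k: int) -> torch.Tensor:
    """Uniform k-bit quantization of a tensor whose values lie in [0, 1]."""
    n = float(2 ** k - 1)
    return torch.round(x * n) / n

a = torch.rand(5)
print(a, quantize_k(a, 2))  # 2 bits: values snapped to {0, 1/3, 2/3, 1}
```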

[ECCV2016] XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

  • intro: from the Darknet team; binary weights with a per-filter scaling factor (sketched below)
  • arxiv:https://arxiv.org/abs/1603.05279
  • github:https://github.com/allenai/XNOR-Net
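
XNOR-Net approximates each real-valued filter W by alpha * sign(W), where alpha is the mean absolute value of that filter, so convolutions reduce to binary operations plus a scaling. A minimal sketch of the weight approximation only:

```python
import torch

def binarize_filter(w: torch.Tensor):
    """Approximate a conv weight tensor (out_ch, in_ch, kH, kW) by alpha * sign(w), one alpha per filter."""
    alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)
    return alpha * torch.sign(w), alpha

w = torch.randn(8, 3, 3, 3)
w_bin, alpha = binarize_filter(w)
print(alpha.flatten())
```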

[arXiv2016] Ternary Weight Networks

  • intro: ternary weights {-α, 0, +α} (sketched below)
  • arxiv:https://arxiv.org/abs/1605.04711
  • github:https://github.com/fengfu-chris/caffe-twns
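
TWN constrains weights to {-alpha, 0, +alpha} using a threshold delta ≈ 0.7 · E[|W|] and a scaling factor fitted to the surviving weights. A minimal sketch of that approximation:

```python
import torch

def ternarize(w: torch.Tensor) -> torch.Tensor:
    """Ternary weights {-alpha, 0, +alpha} with delta = 0.7 * mean(|w|), as in the paper's approximation."""
    delta = 0.7 * w.abs().mean()
    mask = (w.abs() > delta).to(w.dtype)
    alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1.0)  # mean |w| over kept entries
    return alpha * torch.sign(w) * mask

print(ternarize(torch.randn(3, 3)))
```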

[CVPR2018] Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

  • intro: from Google; the underlying scale/zero-point (affine) quantization is sketched below
  • arxiv:https://arxiv.org/abs/1712.05877
  • github:https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize
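
The scheme represents real values as r ≈ S·(q − Z) with an integer q, a real scale S, and an integer zero-point Z, so inference can run on integer arithmetic only. A minimal NumPy sketch of the affine quantize/dequantize step (8-bit, per-tensor; not Google's implementation):

```python
import numpy as np

def affine_quantize(x: np.ndarray, num_bits: int = 8):
    """Asymmetric affine quantization: x ~= scale * (q - zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(float(x.min()), 0.0), max(float(x.max()), 0.0)  # range must contain 0 so Z is exact
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

x = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
q, s, z = affine_quantize(x)
print(q, s * (q.astype(np.float32) - z))  # dequantized values approximate x
```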

[JMLR2018] Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations

  • intro:QNNs
  • arxiv:https://arxiv.org/abs/1609.07061
  • github:https://github.com/peisuke/qnn

[CVPR2018] Two-Step Quantization for Low-bit Neural Networks

  • intro:
  • paper:http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Two-Step_Quantization_for_CVPR_2018_paper.pdf
  • github:

Pruning

Channel pruning

[NIPS2018] Discrimination-aware Channel Pruning for Deep Neural Networks

  • intro:
  • arxiv:https://arxiv.org/abs/1810.11809
  • github:https://github.com/Tencent/PocketFlow (supports DisChnPrunedLearner)

[ICCV2017] Channel Pruning for Accelerating Very Deep Neural Networks

  • intro: LASSO regression for channel selection (toy sketch below)
  • arxiv:https://arxiv.org/abs/1707.06168
  • github:https://github.com/yihui-he/channel-pruning
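
The paper selects which input channels to keep by solving a LASSO regression that minimizes the reconstruction error of the next layer's output. The toy below only mimics that selection on synthetic data; the sizes and the alpha value are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Toy stand-in: Y is the layer output, X[:, c] is channel c's contribution.
# LASSO drives some channel coefficients to zero; those channels can be pruned.
rng = np.random.default_rng(0)
n_samples, n_channels = 200, 16
X = rng.normal(size=(n_samples, n_channels))
true_beta = np.zeros(n_channels)
true_beta[:4] = 1.0                      # only 4 channels actually matter
Y = X @ true_beta + 0.01 * rng.normal(size=n_samples)

beta = Lasso(alpha=0.05, fit_intercept=False).fit(X, Y).coef_
print("channels kept:", np.flatnonzero(np.abs(beta) > 1e-3))
```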

[ECCV2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices

  • intro: automated, learning-based compression (AutoML)
  • arxiv:https://arxiv.org/abs/1802.03494
  • Chinese translation of the paper: https://www.jiqizhixin.com/articles/AutoML-for-Model-Compression-and-Acceleration-on-Mobile-Devices
  • github:https://github.com/Tencent/PocketFlow

[ICCV2017] Learning Efficient Convolutional Networks through Network Slimming

  • intro: Zhuang Liu; L1-regularized BatchNorm scaling factors (sketched below)
  • arxiv:https://arxiv.org/abs/1708.06519
  • github:https://github.com/Eric-mingjie/network-slimming
    https://github.com/foolwood/pytorch-slimming
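
Network Slimming trains with an L1 penalty on the BatchNorm scale factors (gamma) and afterwards removes the channels whose gamma is small. A minimal sketch of the penalty and the channel selection; the lambda and keep ratio are placeholders:

```python
import torch
import torch.nn as nn

def bn_l1_penalty(model: nn.Module, lam: float = 1e-4) -> torch.Tensor:
    """Sparsity-inducing L1 term on every BatchNorm gamma; add it to the task loss during training."""
    return lam * sum(m.weight.abs().sum() for m in model.modules() if isinstance(m, nn.BatchNorm2d))

def channels_to_keep(bn: nn.BatchNorm2d, keep_ratio: float = 0.5) -> torch.Tensor:
    """Indices of the channels whose |gamma| lies in the top keep_ratio fraction."""
    k = max(1, int(bn.num_features * keep_ratio))
    return torch.topk(bn.weight.abs(), k).indices

net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
print(bn_l1_penalty(net).item(), channels_to_keep(net[1]))
```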

[ICLR2018] Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers

  • intro:
  • arxiv:https://arxiv.org/abs/1802.00124
  • github:[PyTorch]https://github.com/jack-willturner/batchnorm-pruning
    [TensorFlow]https://github.com/bobye/batchnorm_prune

[CVPR2018] NISP: Pruning Networks using Neuron Importance Score Propagation

  • intro:
  • arxiv:https://arxiv.org/abs/1711.05908

[ICCV2017] ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression

  • intro:
  • web:http://lamda.nju.edu.cn/luojh/project/ThiNet_ICCV17/ThiNet_ICCV17_CN.html
  • github:https://github.com/Roll920/ThiNet
    https://github.com/Roll920/ThiNet_Code

Sparsity

SBNet: Sparse Blocks Network for Fast Inference

  • intro: Uber
  • arxiv:https://arxiv.org/abs/1801.02108
  • github:https://github.com/uber/sbnet

To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression

  • intro: sparsity via gradual magnitude pruning; the sparsity schedule is sketched below
  • arxiv:https://arxiv.org/abs/1710.01878
  • github:https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/model_pruning
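
The paper's gradual pruning schedule raises the sparsity from s_i to s_f over n pruning steps as s_t = s_f + (s_i − s_f)·(1 − t/n)^3. A minimal sketch; the step counts below are placeholders:

```python
def sparsity_at_step(step: int, s_init: float = 0.0, s_final: float = 0.9,
                     t0: int = 0, n_steps: int = 10000) -> float:
    """Cubic ramp from s_init to s_final between step t0 and t0 + n_steps."""
    t = min(max(step - t0, 0), n_steps)
    return s_final + (s_init - s_final) * (1.0 - t / n_steps) ** 3

for step in (0, 2500, 5000, 10000):
    print(step, round(sparsity_at_step(step), 3))
```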

Submanifold Sparse Convolutional Networks

  • intro:Facebook
  • arxiv:https://arxiv.org/abs/1706.01307
  • github:https://github.com/facebookresearch/SparseConvNet

MorphNet: Fast & Simple Resource-Constrained Learning of Deep Network Structure

  • intro: Google, regularization-based structure learning
  • arxiv: https://arxiv.org/abs/1711.06798
  • github: https://github.com/google-research/morph-net

Fusion

Distillation

[NIPS2014] Distilling the Knowledge in a Neural Network

  • intro: by Hinton; the distillation loss is sketched below
  • arxiv:https://arxiv.org/abs/1503.02531
  • github:https://github.com/peterliht/knowledge-distillation-pytorch
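
The student minimizes a weighted sum of the ordinary cross-entropy and the KL divergence to the teacher's temperature-softened predictions (scaled by T^2 so gradient magnitudes stay comparable). A minimal sketch; T and alpha below are arbitrary choices:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T: float = 4.0, alpha: float = 0.9):
    """alpha * T^2 * KL(teacher_soft || student_soft) + (1 - alpha) * CE(student, labels)."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s, t = torch.randn(8, 10), torch.randn(8, 10)
print(distillation_loss(s, t, torch.randint(0, 10, (8,))).item())
```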

Comprehensive

[ICLR2016] Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding

  • intro: seminal work that opened up this line of research; the pruning and weight-sharing stages are sketched below
  • arxiv:https://arxiv.org/abs/1510.00149
  • github:https://github.com/songhan
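
Deep Compression chains three stages: magnitude pruning, weight sharing via k-means (the "trained quantization" stage), and Huffman coding. The NumPy toy below sketches only the first two stages; the sparsity level and number of clusters are placeholders.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float = 0.7) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= thresh, w, 0.0)

def kmeans_share(w: np.ndarray, n_clusters: int = 16, iters: int = 20) -> np.ndarray:
    """Replace each nonzero weight by the nearest of n_clusters shared centroids (linear init)."""
    nz = w[w != 0]
    centroids = np.linspace(nz.min(), nz.max(), n_clusters)
    for _ in range(iters):
        assign = np.argmin(np.abs(nz[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(assign == k):
                centroids[k] = nz[assign == k].mean()
    out = w.copy()
    out[w != 0] = centroids[np.argmin(np.abs(nz[:, None] - centroids[None, :]), axis=1)]
    return out

w = magnitude_prune(np.random.randn(32, 32))
print(np.count_nonzero(w), np.unique(kmeans_share(w)).size)  # ~30% nonzero, <= 17 distinct values
```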

Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification

  • intro: extensive experiments; well suited to engineering practice
  • arxiv:https://arxiv.org/abs/1709.02929
