CNN Network Quantization - Quantized Convolutional Neural Networks for Mobile Devices


CVPR 2016

GitHub code: https://github.com/jiaxiang-wu/quantized-cnn

This paper quantizes CNNs to compress model size and speed up inference, at only a small cost in accuracy.

[Figure 1]

At test time, a CNN's run time is dominated by the convolutional layers, while most of its parameters sit in the fully-connected layers. The goal is therefore to accelerate the convolutional layers and to compress the parameter space of the fully-connected layers. Product quantization is used for both.
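The core product-quantization step can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact solver: plain k-means is used to learn the sub-codebooks. The D input dimensions of a weight matrix are split into M subspaces, and each sub-vector is replaced by the nearest of K learned sub-codewords.

```python
import numpy as np

def product_quantize(W, M, K, n_iter=20, seed=0):
    """Product-quantize weight matrix W (D x N): split the D input
    dimensions into M subspaces and learn K sub-codewords per subspace
    with plain k-means (a simplified stand-in for the paper's solver)."""
    rng = np.random.default_rng(seed)
    D, N = W.shape
    d = D // M                                  # sub-vector length per subspace
    codebooks = np.empty((M, K, d))
    codes = np.empty((M, N), dtype=np.int64)
    for m in range(M):
        S = W[m * d:(m + 1) * d, :].T           # N sub-vectors of length d
        C = S[rng.choice(N, K, replace=False)]  # init centers from data points
        for _ in range(n_iter):
            dist = ((S[:, None, :] - C[None]) ** 2).sum(-1)
            assign = dist.argmin(1)             # nearest sub-codeword per column
            for k in range(K):
                pts = S[assign == k]
                if len(pts):
                    C[k] = pts.mean(0)          # recompute center
        codebooks[m], codes[m] = C, assign
    return codebooks, codes

def reconstruct(codebooks, codes):
    """Rebuild the approximate weight matrix from codebooks and codes."""
    M, K, d = codebooks.shape
    return np.concatenate([codebooks[m][codes[m]].T for m in range(M)], axis=0)
```

Storing only the (M, K, d) codebooks plus an index per (subspace, column) pair is what yields the compression.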

3 Quantized CNN
3.1. Quantizing the Fully-connected Layer
[Figure 2]
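A minimal sketch of the quantized fully-connected forward pass, assuming codebooks of shape (M, K, d) and sub-codeword indices of shape (M, N) for N output neurons (a layout chosen here for illustration): a table of M×K inner products is pre-computed once per input, after which every output neuron costs only M look-ups and additions instead of a full dot product.

```python
import numpy as np

def fc_quantized_forward(x, codebooks, codes):
    """Approximate y = W^T x for a product-quantized FC layer.
    Pre-computes a table of inner products between each input
    sub-vector and every sub-codeword, then answers each output
    neuron with M table look-ups and additions."""
    M, K, d = codebooks.shape
    # table[m, k] = <x_m, c_{m,k}>  -- computed once per input vector
    table = np.einsum('mkd,md->mk', codebooks, x.reshape(M, d))
    # gather: each output neuron sums its M assigned inner products
    return table[np.arange(M)[:, None], codes].sum(axis=0)
```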

3.2. Quantizing the Convolutional Layer
Similar to the fully-connected layer, we pre-compute the look-up tables of inner products with the input feature maps.
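A sketch of that idea, as an illustration rather than the paper's exact implementation: the kernel's input-channel axis is split into M subspaces, inner products between every spatial position of the input and every sub-codeword are computed once, and the (valid, stride-1) convolution then reduces to table look-ups and additions.

```python
import numpy as np

def conv_quantized_forward(x, codebooks, codes, kh, kw):
    """Approximate a valid convolution whose kernel's input-channel
    axis has been product-quantized. x: (C, H, W) feature map;
    codebooks: (M, K, d) with M*d == C; codes: (M, kh, kw, N) sub-codeword
    indices for the N output channels. Inner products between every input
    position and every sub-codeword are pre-computed once, so the
    convolution itself becomes look-ups and additions."""
    C, H, W = x.shape
    M, K, d = codebooks.shape
    N = codes.shape[-1]
    # table[m, k, h, w] = <x[m*d:(m+1)*d, h, w], c_{m,k}>
    table = np.einsum('mkd,mdhw->mkhw', codebooks, x.reshape(M, d, H, W))
    Ho, Wo = H - kh + 1, W - kw + 1
    y = np.zeros((N, Ho, Wo))
    for i in range(kh):
        for j in range(kw):
            # codes[:, i, j, :] selects, per subspace and output channel,
            # which pre-computed inner product to accumulate
            idx = codes[:, i, j, :]                                   # (M, N)
            y += table[np.arange(M)[:, None], idx][..., i:i + Ho, j:j + Wo].sum(axis=0)
    return y
```

The table is shared across all kernel offsets and output channels, which is where the savings over a direct convolution come from.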

3.3. Quantization with Error Correction
Error correction is applied when quantizing each layer, so that the accumulated quantization error across layers does not grow too large.
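One way to sketch this layer-by-layer correction (a simplification of the paper's block coordinate descent, with `quantize_layer` as a hypothetical per-layer fitting helper): when fitting layer l, the calibration inputs are fed through the already-quantized previous layers, so each layer's codebooks are fit against the error-bearing activations it will actually see at test time.

```python
import numpy as np

def quantize_network_with_error_correction(weights, X, quantize_layer):
    """Quantize a stack of FC layers one by one with error correction
    (simplified sketch). `quantize_layer(W, A)` is a hypothetical helper
    that returns a quantized copy of W fitted on activations A."""
    A_q = X                                    # activations through the quantized net
    q_weights = []
    for W in weights:
        # fit on the quantized activations, not the clean full-precision ones
        W_q = quantize_layer(W, A_q)
        q_weights.append(W_q)
        A_q = np.maximum(A_q @ W_q, 0.0)       # ReLU forward with the quantized layer
    return q_weights
```

Fitting against quantized activations is what keeps the per-layer errors from simply compounding.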

3.4. Computation Complexity

[Figure 3]

The reduction in computation and storage overhead largely depends on two hyper-parameters: M (the number of subspaces) and K (the number of sub-codewords in each subspace).
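The trade-off can be made concrete for a fully-connected layer. Under the usual accounting (my reading, not the paper's exact formulas): the full layer needs c_in·c_out multiplications and 4·c_in·c_out bytes of float32 weights, while the quantized layer needs K·c_in multiplications to build the look-up table (plus M additions per output) and stores the codebooks plus log2(K)-bit codes.

```python
import numpy as np

def pq_fc_costs(c_in, c_out, M, K):
    """Theoretical speed-up and compression of a product-quantized
    FC layer vs. the full-precision one (float32, one input vector)."""
    full_mults = c_in * c_out
    q_mults = K * c_in                           # table: M * K * (c_in / M)
    full_bytes = 4 * c_in * c_out
    # codebooks (float32) + one log2(K)-bit code per (subspace, output) pair
    q_bytes = 4 * K * c_in + M * c_out * np.log2(K) / 8
    return full_mults / q_mults, full_bytes / q_bytes

# e.g. a 4096 x 4096 layer with M = 512 subspaces, K = 128 sub-codewords
speedup, compression = pq_fc_costs(c_in=4096, c_out=4096, M=512, K=128)
```

Larger K improves accuracy but shrinks both ratios, which is exactly the M/K trade-off the quoted sentence refers to.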

5.1. Results on MNIST
[Figure 4]

5.2. Results on ILSVRC-12
[Figure 5]

5.3. Results on Mobile Devices
[Figure 6]

5.4. Theoretical vs. Realistic Speed-up

[Figure 7]
