Model Compression and Acceleration for DNN

A Survey of Model Compression and Acceleration for Deep Neural Networks

Techniques for compacting and accelerating CNNs are roughly categorized into four schemes:

1. Parameter pruning and sharing: explore the redundancy in the model parameters and remove redundant and uncritical ones [vector quantization, binary coding, sparse constraints]; see the weight-sharing sketch after this list

a. model quantization and binarization; b. parameter sharing; c. structural matrix

2. Low-rank factorization: use matrix/tensor decomposition to estimate the informative parameters of the deep CNNs

3. Transferred/compact convolutional filters: design special structural convolutional filters to reduce the storage and computational complexity

4. Knowledge distillation: learn a distilled model by training a more compact neural network to reproduce the output of a larger network (the distillation loss is sketched below)
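
As a concrete illustration of scheme 1, here is a minimal NumPy sketch of weight sharing via 1-D k-means quantization, in the spirit of the vector-quantization approaches the survey cites; the cluster count and iteration budget below are arbitrary choices for the example, not values from the survey.

```python
import numpy as np

def kmeans_quantize(W, n_clusters=16, n_iter=20):
    """Weight sharing via 1-D k-means: every weight in a cluster is replaced
    by its centroid, so only the per-weight cluster index (log2(n_clusters)
    bits each) plus a tiny codebook need to be stored."""
    w = W.ravel()
    centroids = np.linspace(w.min(), w.max(), n_clusters)  # uniform init
    for _ in range(n_iter):
        # Assign each weight to its nearest centroid, then recompute centroids.
        idx = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            if np.any(idx == c):
                centroids[c] = w[idx == c].mean()
    return centroids[idx].reshape(W.shape)

W = np.random.randn(64, 64)
W_q = kmeans_quantize(W)   # 16 shared values -> 4-bit indices per weight
print("quantization error:", np.linalg.norm(W - W_q) / np.linalg.norm(W))
```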
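
And a sketch of scheme 4's training signal: the softened-softmax distillation loss in the style of Hinton et al., again in plain NumPy. The temperature `T` and mixing weight `alpha` are illustrative hyperparameters, not prescribed by the survey.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; larger T produces a softer distribution.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend of KL(teacher || student) at temperature T (scaled by T^2, as in
    Hinton et al.) and ordinary cross-entropy against the hard labels."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)),
                          axis=-1)) * T * T
    p_hard = softmax(student_logits)  # T = 1 predictions for the hard loss
    hard = np.mean(-np.log(p_hard[np.arange(len(labels)), labels] + 1e-12))
    return alpha * soft + (1 - alpha) * hard

# Toy usage: a batch of two 3-class examples.
teacher = np.array([[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]])
student = np.array([[1.5, 0.2, -0.5], [0.0, 2.0, 0.5]])
labels = np.array([0, 1])
print(distillation_loss(student, teacher, labels))
```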

It makes sense to combine two or three of these schemes to maximize the compression/speedup rate. For some specific applications, like object detection, which require both convolutional and fully connected layers, one can compress the convolutional layers with low-rank factorization and the fully connected layers with a pruning method (see the sketch below).
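
A minimal sketch of that combination, assuming plain NumPy and hypothetical layer shapes: truncated SVD stands in for the low-rank factorization of a (flattened) convolutional weight, and magnitude thresholding stands in for pruning the fully connected weight.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) as A @ B with A: m x rank, B: rank x n via
    truncated SVD. One dense layer becomes two thinner ones; storage drops
    from m*n to rank*(m + n)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb singular values into A
    B = Vt[:rank, :]
    return A, B

def magnitude_prune(W, sparsity=0.9):
    """Zero out the smallest-magnitude weights, keeping the top
    (1 - sparsity) fraction."""
    k = int(W.size * sparsity)
    thresh = np.partition(np.abs(W).ravel(), k)[k]
    mask = np.abs(W) >= thresh
    return W * mask, mask

# Toy demo: factorize a "conv-like" weight reshaped to 2-D, prune an "FC" weight.
W_conv = np.random.randn(256, 1152)   # e.g. 256 filters of 3x3x128, flattened
A, B = low_rank_factorize(W_conv, rank=64)
print("low-rank error:", np.linalg.norm(W_conv - A @ B) / np.linalg.norm(W_conv))

W_fc = np.random.randn(4096, 1000)
W_sparse, mask = magnitude_prune(W_fc, sparsity=0.9)
print("remaining weights:", mask.mean())
```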

DNNDK does not currently support TensorFlow models for external users.

TensorFlow Lite and its list of supported models

TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded devices. It enables on-device machine learning inference with low latency and a small binary size.
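
For orientation, the conversion itself is a short call with the TF 2.x converter API. This is a minimal sketch; `saved_model_dir` and `model.tflite` are placeholder paths, not anything from this post.

```python
import tensorflow as tf

# Convert a SavedModel to a .tflite flatbuffer.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
# Optionally let the converter apply its default optimizations
# (e.g. weight quantization), which shrinks the binary further.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```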

[Image 1]
[Image 2]

Effect of detected-object size on each model's accuracy

[Image 3]

List of models supported by the server-side TensorFlow Object Detection API

[Image 4]

I retrained the gauge-recognition model and the digit-recognition model with ssd_mobilenet. After training, the model size dropped from 200+ MB to around 22 MB. The object-size problem is very evident: gauges are basically recognized accurately, but smaller ones are sometimes missed. And since the digits on a gauge are all tiny, apart from a few they essentially cannot be recognized at all...

Based on the data in the table above, there is no point in trying the other models. Meanwhile, there is no ready-made open-source compression code for the TensorFlow + ResNet101 combination: most of what I found targets Caffe or Keras, and the work that does cover ResNet compression only goes up to small networks like ResNet16 and ResNet32, for reasons unknown...

So, in my view, if the model really has to be compressed, the more efficient approach with a better chance of success is to switch to Caffe for training...
