[MNN Study Notes 2] Model Compression and Quantization

Contents

1. What quantization does

2. Building

3. Using the quantization tool


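1. What quantization does

    Quantization converts the network's main operators (convolutions) from floating-point computation to low-precision Int8 computation, which reduces the model size and improves performance.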
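    To make the idea concrete, here is a minimal sketch of symmetric per-tensor Int8 quantization in plain Python. It is only an illustration of the general technique, not MNN's actual implementation: the function names `quantize_int8` and `dequantize` and the sample weights are made up for this example, and MNN's own tool chooses scales using calibration data and its configured quantization methods.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Hypothetical helper for illustration only; not an MNN API.
    """
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the Int8 codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.03, 1.0]  # made-up float weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

    Each weight is stored in 1 byte instead of the 4 bytes of a float32, which is where the size reduction comes from. The cost is a rounding error of at most half a scale step per value.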

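2. Building

(1) Build macro

    When building MNN, enable the MNN_BUILD_QUANTOOLS macro to build the quantization tools.

(2) Build outputs

    Tool for quantizing a model: quantized.out

    Tool for comparing the quantized model against the float model: testQuanModel.out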

3. Using the quantization tool


cd MNN/build
./quantized.out ../benchmark/models/mobilenet_v1.caffe.mnn ../benchmark/models/mobilenet_v1_quant.caffe.mnn mobilenetCaffeConfig.json

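    Note: mobilenetCaffeConfig.json is located in MNN/tools/quantization/. The command above expects it in the current directory, so either copy it into MNN/build first or pass its relative path (../tools/quantization/mobilenetCaffeConfig.json) instead.

    After running the command above, the quantized network is saved in MNN/benchmark/models/. Next, compare the run time of the model before and after quantization: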

cd MNN/build/
./benchmark.out ../benchmark/models/ 10


# ----------------- output ------------------- #
MNN benchmark
Forward type: **CPU** thread=4** precision=2
--------> Benchmarking... loop = 10
[ - ] mobilenet_v1_quant.caffe.mnn    max =   75.278ms  min =   74.261ms  avg =   74.763ms
[ - ] mobilenet_v1.caffe.mnn      max =  107.799ms  min =   94.063ms  avg =   98.914ms
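
    As a quick sanity check on these numbers, the short Python sketch below computes the speedup from the two average latencies reported above. The variable names are only for illustration, and the result applies to this one machine and run, not to MNN quantization in general.

```python
# Average latencies copied from the benchmark output above (milliseconds).
float_avg_ms = 98.914   # mobilenet_v1.caffe.mnn
quant_avg_ms = 74.763   # mobilenet_v1_quant.caffe.mnn

speedup = float_avg_ms / quant_avg_ms
reduction_pct = (1 - quant_avg_ms / float_avg_ms) * 100

print(f"speedup: {speedup:.2f}x, latency reduction: {reduction_pct:.1f}%")
# prints: speedup: 1.32x, latency reduction: 24.4%
```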

    Finally, compare the per-layer run time of the model before and after quantization, starting with the float model:

cd MNN/build/
./MNNV2Basic.out ../benchmark/models/mobilenet_v1.caffe.mnn 50


# ----------------- output ------------------- #
Open Model ../benchmark/models/mobilenet_v1.caffe.mnn
test_main, 199, cost time: 36.553001 ms
===========> Session Resize Done.
===========> Session Start running...
Input: 1, 224, 224, 3
Run 50 time:
                 prob run 50 average cost 0.704100 ms, 0.727 %, FlopsRate: 0.000 %
                pool6 run 50 average cost 0.821800 ms, 0.849 %, FlopsRate: 0.000 %
           conv5_6/dw run 50 average cost 0.872040 ms, 0.901 %, FlopsRate: 0.040 %
             conv6/dw run 50 average cost 0.885800 ms, 0.915 %, FlopsRate: 0.079 %
           conv5_5/dw run 50 average cost 0.977860 ms, 1.010 %, FlopsRate: 0.159 %
           conv5_1/dw run 50 average cost 0.984140 ms, 1.017 %, FlopsRate: 0.159 %
           conv5_2/dw run 50 average cost 0.996680 ms, 1.030 %, FlopsRate: 0.159 %
           conv5_4/dw run 50 average cost 0.997360 ms, 1.030 %, FlopsRate: 0.159 %
           conv4_2/dw run 50 average cost 1.052500 ms, 1.087 %, FlopsRate: 0.079 %
           conv5_3/dw run 50 average cost 1.073480 ms, 1.109 %, FlopsRate: 0.159 %
           conv4_1/dw run 50 average cost 1.310980 ms, 1.354 %, FlopsRate: 0.318 %
           conv3_2/dw run 50 average cost 1.534120 ms, 1.585 %, FlopsRate: 0.159 %
                  fc7 run 50 average cost 1.652180 ms, 1.707 %, FlopsRate: 0.180 %
           conv3_1/dw run 50 average cost 1.987280 ms, 2.053 %, FlopsRate: 0.635 %
           conv2_2/dw run 50 average cost 2.183100 ms, 2.255 %, FlopsRate: 0.318 %
           conv2_1/dw run 50 average cost 2.194480 ms, 2.267 %, FlopsRate: 0.635 %
                conv1 run 50 average cost 2.954881 ms, 3.052 %, FlopsRate: 1.906 %
          conv4_2/sep run 50 average cost 3.274820 ms, 3.383 %, FlopsRate: 4.517 %
          conv3_2/sep run 50 average cost 3.464180 ms, 3.579 %, FlopsRate: 4.517 %
          conv2_2/sep run 50 average cost 3.914620 ms, 4.044 %, FlopsRate: 4.517 %
          conv5_6/sep run 50 average cost 3.952840 ms, 4.083 %, FlopsRate: 4.517 %
          conv2_1/sep run 50 average cost 5.019800 ms, 5.186 %, FlopsRate: 4.517 %
          conv4_1/sep run 50 average cost 6.121240 ms, 6.323 %, FlopsRate: 9.034 %
          conv5_1/sep run 50 average cost 6.463820 ms, 6.677 %, FlopsRate: 9.034 %
          conv5_2/sep run 50 average cost 6.497541 ms, 6.712 %, FlopsRate: 9.034 %
          conv3_1/sep run 50 average cost 6.564321 ms, 6.781 %, FlopsRate: 9.034 %
          conv5_3/sep run 50 average cost 6.629260 ms, 6.848 %, FlopsRate: 9.034 %
          conv5_4/sep run 50 average cost 6.718319 ms, 6.940 %, FlopsRate: 9.034 %
          conv5_5/sep run 50 average cost 7.202300 ms, 7.440 %, FlopsRate: 9.034 %
            conv6/sep run 50 average cost 7.380579 ms, 7.624 %, FlopsRate: 9.034 %
Avg= 96.802780 ms, min= 94.481003 ms, max= 107.938004 ms

    Then run the same test on the quantized model:

cd MNN/build/
./MNNV2Basic.out ../benchmark/models/mobilenet_v1_quant.caffe.mnn 50


# ----------------- output ------------------- #
Open Model ../benchmark/models/mobilenet_v1_quant.caffe.mnn
test_main, 199, cost time: 86.402000 ms
===========> Session Resize Done.
===========> Session Start running...
Input: 1, 224, 224, 3
Run 50 time:
     pool6___FloatToInt8___ run 50 average cost 0.687860 ms, 0.897 %, FlopsRate: 0.000 %
                       prob run 50 average cost 0.697700 ms, 0.910 %, FlopsRate: 0.000 %
___Int8ToFloat___for__pool6 run 50 average cost 0.731080 ms, 0.953 %, FlopsRate: 0.008 %
                      pool6 run 50 average cost 0.746080 ms, 0.973 %, FlopsRate: 0.000 %
                 conv5_6/dw run 50 average cost 0.821800 ms, 1.072 %, FlopsRate: 0.040 %
                        fc7 run 50 average cost 0.833360 ms, 1.087 %, FlopsRate: 0.180 %
  ___Int8ToFloat___For_prob run 50 average cost 0.845900 ms, 1.103 %, FlopsRate: 0.000 %
                 conv4_2/dw run 50 average cost 0.874820 ms, 1.141 %, FlopsRate: 0.079 %
                   conv6/dw run 50 average cost 0.943360 ms, 1.230 %, FlopsRate: 0.079 %
      data___FloatToInt8___ run 50 average cost 0.946700 ms, 1.234 %, FlopsRate: 0.034 %
                 conv3_2/dw run 50 average cost 1.026780 ms, 1.339 %, FlopsRate: 0.159 %
                 conv5_5/dw run 50 average cost 1.049580 ms, 1.369 %, FlopsRate: 0.159 %
                 conv5_4/dw run 50 average cost 1.054000 ms, 1.374 %, FlopsRate: 0.159 %
                 conv5_3/dw run 50 average cost 1.055960 ms, 1.377 %, FlopsRate: 0.159 %
                 conv5_2/dw run 50 average cost 1.058860 ms, 1.381 %, FlopsRate: 0.159 %
                 conv5_1/dw run 50 average cost 1.062060 ms, 1.385 %, FlopsRate: 0.159 %
                 conv4_1/dw run 50 average cost 1.340380 ms, 1.748 %, FlopsRate: 0.317 %
                 conv2_2/dw run 50 average cost 1.341880 ms, 1.750 %, FlopsRate: 0.317 %
                 conv2_1/dw run 50 average cost 1.774560 ms, 2.314 %, FlopsRate: 0.635 %
                 conv3_1/dw run 50 average cost 2.050640 ms, 2.674 %, FlopsRate: 0.635 %
                conv4_2/sep run 50 average cost 2.851060 ms, 3.717 %, FlopsRate: 4.515 %
                conv3_2/sep run 50 average cost 2.941639 ms, 3.835 %, FlopsRate: 4.515 %
                conv5_6/sep run 50 average cost 2.961320 ms, 3.861 %, FlopsRate: 4.515 %
                      conv1 run 50 average cost 3.237540 ms, 4.221 %, FlopsRate: 1.905 %
                conv2_2/sep run 50 average cost 3.478861 ms, 4.536 %, FlopsRate: 4.515 %
                conv5_5/sep run 50 average cost 4.189919 ms, 5.463 %, FlopsRate: 9.030 %
                conv2_1/sep run 50 average cost 4.200140 ms, 5.476 %, FlopsRate: 4.515 %
                conv5_1/sep run 50 average cost 4.201300 ms, 5.478 %, FlopsRate: 9.030 %
                conv5_3/sep run 50 average cost 4.221800 ms, 5.505 %, FlopsRate: 9.030 %
                conv5_2/sep run 50 average cost 4.269680 ms, 5.567 %, FlopsRate: 9.030 %
                conv5_4/sep run 50 average cost 4.351900 ms, 5.674 %, FlopsRate: 9.030 %
                conv4_1/sep run 50 average cost 4.569061 ms, 5.957 %, FlopsRate: 9.030 %
                  conv6/sep run 50 average cost 4.663580 ms, 6.081 %, FlopsRate: 9.030 %
                conv3_1/sep run 50 average cost 5.220601 ms, 6.807 %, FlopsRate: 9.030 %
Avg= 76.695343 ms, min= 73.639000 ms, max= 86.840004 ms
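
    The two per-layer listings are easier to compare side by side. The Python sketch below parses lines in the format printed above and reports the float/quantized time ratio for layers that appear in both runs. The regular expression, the helper names `parse_layer_times` and `compare`, and the choice of sample lines are my own assumptions for illustration; only a few lines from each log are included, and in practice you would feed in the full saved output.

```python
import re

# Matches lines such as:
#   conv6/sep run 50 average cost 7.380579 ms, 7.624 %, FlopsRate: 9.034 %
LAYER_LINE = re.compile(r"^\s*(\S+) run \d+ average cost ([\d.]+) ms")


def parse_layer_times(text):
    """Return {layer name: average cost in ms} for every matching line."""
    times = {}
    for line in text.splitlines():
        match = LAYER_LINE.match(line)
        if match:
            times[match.group(1)] = float(match.group(2))
    return times


def compare(float_times, quant_times):
    """Rows of (layer, float ms, quant ms, ratio), largest speedup first."""
    rows = [
        (name, f_ms, quant_times[name], f_ms / quant_times[name])
        for name, f_ms in float_times.items()
        if name in quant_times
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# A few lines copied from the two logs above.
FLOAT_LOG = """
            conv6/sep run 50 average cost 7.380579 ms, 7.624 %, FlopsRate: 9.034 %
          conv5_5/sep run 50 average cost 7.202300 ms, 7.440 %, FlopsRate: 9.034 %
                conv1 run 50 average cost 2.954881 ms, 3.052 %, FlopsRate: 1.906 %
"""
QUANT_LOG = """
                  conv6/sep run 50 average cost 4.663580 ms, 6.081 %, FlopsRate: 9.030 %
                conv5_5/sep run 50 average cost 4.189919 ms, 5.463 %, FlopsRate: 9.030 %
                      conv1 run 50 average cost 3.237540 ms, 4.221 %, FlopsRate: 1.905 %
      data___FloatToInt8___ run 50 average cost 0.946700 ms, 1.234 %, FlopsRate: 0.034 %
"""

rows = compare(parse_layer_times(FLOAT_LOG), parse_layer_times(QUANT_LOG))
for name, f_ms, q_ms, ratio in rows:
    print(f"{name:>12}  float={f_ms:.3f} ms  quant={q_ms:.3f} ms  ratio={ratio:.2f}")
```

    In this excerpt the pointwise (sep) convolutions get noticeably faster, while conv1 is slightly slower in the quantized run. The quantized model also contains extra FloatToInt8 and Int8ToFloat conversion layers that have no counterpart in the float model, so they are skipped by the comparison.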

Reference: MNN model compression and quantization

 
