Contents
1. What quantization does
2. Building
3. Using the quantization tool
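1. What quantization does
Quantization converts the main operators in a network (convolutions) from floating-point computation to low-precision Int8 computation, which reduces model size and improves performance.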
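To make the float-to-Int8 idea concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a generic illustration, not MNN's implementation: the function names `quantize_int8` and `dequantize` are hypothetical, and the way quantized.out actually picks its scales may differ. Each value is stored in 1 byte instead of the 4 bytes of a float32, at the cost of a rounding error of at most half a scale step.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale.

    Hypothetical helper for illustration; not part of MNN.
    """
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding error per element is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

A single scale per tensor is the simplest choice; per-channel scales usually lose less accuracy but need more bookkeeping.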
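2. Building
(1) Build macro
Enable the MNN_BUILD_QUANTOOLS macro when building MNN (for example, by passing -DMNN_BUILD_QUANTOOLS=ON to cmake) to build the quantization tools.
(2) Build outputs
Tool that quantizes a model: quantized.out
Tool that compares the quantized model with the float model: testQuanModel.out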
3. Using the quantization tool
cd MNN/build
./quantized.out ../benchmark/models/mobilenet_v1.caffe.mnn ../benchmark/models/mobilenet_v1_quant.caffe.mnn mobilenetCaffeConfig.json
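Note: mobilenetCaffeConfig.json is located in MNN/tools/quantization/, so copy it into the working directory or pass its path on the command line.
Once the command above finishes, the quantized network is saved under MNN/benchmark/models/. Next, compare the run time of the model before and after quantization: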
cd MNN/build/
./benchmark.out ../benchmark/models/ 10
# ----------------- output ------------------- #
MNN benchmark
Forward type: **CPU** thread=4** precision=2
--------> Benchmarking... loop = 10
[ - ] mobilenet_v1_quant.caffe.mnn    max = 75.278ms  min = 74.261ms  avg = 74.763ms
[ - ] mobilenet_v1.caffe.mnn          max = 107.799ms  min = 94.063ms  avg = 98.914ms
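The two averages reported by benchmark.out can be turned into a speedup figure with a few lines of arithmetic. The sketch below only restates the numbers from the output above; the variable names are illustrative.

```python
# Averages copied from the benchmark.out output (milliseconds).
float_avg_ms = 98.914   # mobilenet_v1.caffe.mnn
quant_avg_ms = 74.763   # mobilenet_v1_quant.caffe.mnn

# How many times faster the quantized model runs.
speedup = float_avg_ms / quant_avg_ms
# Fraction of the float model's latency that quantization removes.
latency_reduction = (float_avg_ms - quant_avg_ms) / float_avg_ms

print(f"speedup: {speedup:.2f}x")
print(f"latency reduction: {latency_reduction:.1%}")
```

For this run the quantized model is roughly 1.3x faster, well short of the 4x reduction in weight storage, since not every layer benefits equally.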
Finally, compare the per-layer run time of the model before and after quantization:
cd MNN/build/
./MNNV2Basic.out ../benchmark/models/mobilenet_v1.caffe.mnn 50
# ----------------- output ------------------- #
Open Model ../benchmark/models/mobilenet_v1.caffe.mnn
test_main, 199, cost time: 36.553001 ms
===========> Session Resize Done.
===========> Session Start running...
Input: 1, 224, 224, 3
Run 50 time:
prob run 50 average cost 0.704100 ms, 0.727 %, FlopsRate: 0.000 %
pool6 run 50 average cost 0.821800 ms, 0.849 %, FlopsRate: 0.000 %
conv5_6/dw run 50 average cost 0.872040 ms, 0.901 %, FlopsRate: 0.040 %
conv6/dw run 50 average cost 0.885800 ms, 0.915 %, FlopsRate: 0.079 %
conv5_5/dw run 50 average cost 0.977860 ms, 1.010 %, FlopsRate: 0.159 %
conv5_1/dw run 50 average cost 0.984140 ms, 1.017 %, FlopsRate: 0.159 %
conv5_2/dw run 50 average cost 0.996680 ms, 1.030 %, FlopsRate: 0.159 %
conv5_4/dw run 50 average cost 0.997360 ms, 1.030 %, FlopsRate: 0.159 %
conv4_2/dw run 50 average cost 1.052500 ms, 1.087 %, FlopsRate: 0.079 %
conv5_3/dw run 50 average cost 1.073480 ms, 1.109 %, FlopsRate: 0.159 %
conv4_1/dw run 50 average cost 1.310980 ms, 1.354 %, FlopsRate: 0.318 %
conv3_2/dw run 50 average cost 1.534120 ms, 1.585 %, FlopsRate: 0.159 %
fc7 run 50 average cost 1.652180 ms, 1.707 %, FlopsRate: 0.180 %
conv3_1/dw run 50 average cost 1.987280 ms, 2.053 %, FlopsRate: 0.635 %
conv2_2/dw run 50 average cost 2.183100 ms, 2.255 %, FlopsRate: 0.318 %
conv2_1/dw run 50 average cost 2.194480 ms, 2.267 %, FlopsRate: 0.635 %
conv1 run 50 average cost 2.954881 ms, 3.052 %, FlopsRate: 1.906 %
conv4_2/sep run 50 average cost 3.274820 ms, 3.383 %, FlopsRate: 4.517 %
conv3_2/sep run 50 average cost 3.464180 ms, 3.579 %, FlopsRate: 4.517 %
conv2_2/sep run 50 average cost 3.914620 ms, 4.044 %, FlopsRate: 4.517 %
conv5_6/sep run 50 average cost 3.952840 ms, 4.083 %, FlopsRate: 4.517 %
conv2_1/sep run 50 average cost 5.019800 ms, 5.186 %, FlopsRate: 4.517 %
conv4_1/sep run 50 average cost 6.121240 ms, 6.323 %, FlopsRate: 9.034 %
conv5_1/sep run 50 average cost 6.463820 ms, 6.677 %, FlopsRate: 9.034 %
conv5_2/sep run 50 average cost 6.497541 ms, 6.712 %, FlopsRate: 9.034 %
conv3_1/sep run 50 average cost 6.564321 ms, 6.781 %, FlopsRate: 9.034 %
conv5_3/sep run 50 average cost 6.629260 ms, 6.848 %, FlopsRate: 9.034 %
conv5_4/sep run 50 average cost 6.718319 ms, 6.940 %, FlopsRate: 9.034 %
conv5_5/sep run 50 average cost 7.202300 ms, 7.440 %, FlopsRate: 9.034 %
conv6/sep run 50 average cost 7.380579 ms, 7.624 %, FlopsRate: 9.034 %
Avg= 96.802780 ms, min= 94.481003 ms, max= 107.938004 ms
cd MNN/build/
./MNNV2Basic.out ../benchmark/models/mobilenet_v1_quant.caffe.mnn 50
# ----------------- output ------------------- #
Open Model ../benchmark/models/mobilenet_v1_quant.caffe.mnn
test_main, 199, cost time: 86.402000 ms
===========> Session Resize Done.
===========> Session Start running...
Input: 1, 224, 224, 3
Run 50 time:
pool6___FloatToInt8___ run 50 average cost 0.687860 ms, 0.897 %, FlopsRate: 0.000 %
prob run 50 average cost 0.697700 ms, 0.910 %, FlopsRate: 0.000 %
___Int8ToFloat___for__pool6 run 50 average cost 0.731080 ms, 0.953 %, FlopsRate: 0.008 %
pool6 run 50 average cost 0.746080 ms, 0.973 %, FlopsRate: 0.000 %
conv5_6/dw run 50 average cost 0.821800 ms, 1.072 %, FlopsRate: 0.040 %
fc7 run 50 average cost 0.833360 ms, 1.087 %, FlopsRate: 0.180 %
___Int8ToFloat___For_prob run 50 average cost 0.845900 ms, 1.103 %, FlopsRate: 0.000 %
conv4_2/dw run 50 average cost 0.874820 ms, 1.141 %, FlopsRate: 0.079 %
conv6/dw run 50 average cost 0.943360 ms, 1.230 %, FlopsRate: 0.079 %
data___FloatToInt8___ run 50 average cost 0.946700 ms, 1.234 %, FlopsRate: 0.034 %
conv3_2/dw run 50 average cost 1.026780 ms, 1.339 %, FlopsRate: 0.159 %
conv5_5/dw run 50 average cost 1.049580 ms, 1.369 %, FlopsRate: 0.159 %
conv5_4/dw run 50 average cost 1.054000 ms, 1.374 %, FlopsRate: 0.159 %
conv5_3/dw run 50 average cost 1.055960 ms, 1.377 %, FlopsRate: 0.159 %
conv5_2/dw run 50 average cost 1.058860 ms, 1.381 %, FlopsRate: 0.159 %
conv5_1/dw run 50 average cost 1.062060 ms, 1.385 %, FlopsRate: 0.159 %
conv4_1/dw run 50 average cost 1.340380 ms, 1.748 %, FlopsRate: 0.317 %
conv2_2/dw run 50 average cost 1.341880 ms, 1.750 %, FlopsRate: 0.317 %
conv2_1/dw run 50 average cost 1.774560 ms, 2.314 %, FlopsRate: 0.635 %
conv3_1/dw run 50 average cost 2.050640 ms, 2.674 %, FlopsRate: 0.635 %
conv4_2/sep run 50 average cost 2.851060 ms, 3.717 %, FlopsRate: 4.515 %
conv3_2/sep run 50 average cost 2.941639 ms, 3.835 %, FlopsRate: 4.515 %
conv5_6/sep run 50 average cost 2.961320 ms, 3.861 %, FlopsRate: 4.515 %
conv1 run 50 average cost 3.237540 ms, 4.221 %, FlopsRate: 1.905 %
conv2_2/sep run 50 average cost 3.478861 ms, 4.536 %, FlopsRate: 4.515 %
conv5_5/sep run 50 average cost 4.189919 ms, 5.463 %, FlopsRate: 9.030 %
conv2_1/sep run 50 average cost 4.200140 ms, 5.476 %, FlopsRate: 4.515 %
conv5_1/sep run 50 average cost 4.201300 ms, 5.478 %, FlopsRate: 9.030 %
conv5_3/sep run 50 average cost 4.221800 ms, 5.505 %, FlopsRate: 9.030 %
conv5_2/sep run 50 average cost 4.269680 ms, 5.567 %, FlopsRate: 9.030 %
conv5_4/sep run 50 average cost 4.351900 ms, 5.674 %, FlopsRate: 9.030 %
conv4_1/sep run 50 average cost 4.569061 ms, 5.957 %, FlopsRate: 9.030 %
conv6/sep run 50 average cost 4.663580 ms, 6.081 %, FlopsRate: 9.030 %
conv3_1/sep run 50 average cost 5.220601 ms, 6.807 %, FlopsRate: 9.030 %
Avg= 76.695343 ms, min= 73.639000 ms, max= 86.840004 ms
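Comparing the two per-layer logs by eye is tedious, so a small script can help. The sketch below assumes the log format shown in the MNNV2Basic.out output above ("<layer> run <N> average cost <X> ms"); the helper name `parse_layer_costs` is hypothetical, and only a few entries from each log are pasted in to keep the example short.

```python
import re

# Matches entries such as "fc7 run 50 average cost 1.652180 ms".
# findall scans the whole text, so it works whether or not each entry
# sits on its own line.
LAYER_RE = re.compile(r"(\S+)\s+run\s+\d+\s+average cost\s+([\d.]+)\s+ms")


def parse_layer_costs(log_text):
    """Return {layer_name: average_cost_ms} from MNNV2Basic.out output text."""
    return {name: float(cost) for name, cost in LAYER_RE.findall(log_text)}


# Excerpts copied from the float-model and quantized-model logs.
float_log = """
fc7 run 50 average cost 1.652180 ms, 1.707 %, FlopsRate: 0.180 %
conv1 run 50 average cost 2.954881 ms, 3.052 %, FlopsRate: 1.906 %
conv6/sep run 50 average cost 7.380579 ms, 7.624 %, FlopsRate: 9.034 %
"""
quant_log = """
fc7 run 50 average cost 0.833360 ms, 1.087 %, FlopsRate: 0.180 %
conv1 run 50 average cost 3.237540 ms, 4.221 %, FlopsRate: 1.905 %
conv6/sep run 50 average cost 4.663580 ms, 6.081 %, FlopsRate: 9.030 %
data___FloatToInt8___ run 50 average cost 0.946700 ms, 1.234 %, FlopsRate: 0.034 %
"""

float_costs = parse_layer_costs(float_log)
quant_costs = parse_layer_costs(quant_log)

# Float/quantized cost ratio for layers present in both models (>1 means faster).
ratios = {name: float_costs[name] / quant_costs[name]
          for name in float_costs if name in quant_costs}
# Layers that exist only in the quantized model (conversion overhead).
added_layers = sorted(set(quant_costs) - set(float_costs))

for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.2f}x")
print("added by quantization:", added_layers)
```

On these excerpts the gain is uneven: fc7 and conv6/sep get noticeably faster, conv1 is slightly slower after quantization, and the quantized model carries extra FloatToInt8 / Int8ToFloat conversion layers that the float model does not have.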
Reference link: MNN Model Compression and Quantization (MNN模型压缩与量化)