Notes on compressing YOLOv5 models

If a model gets much smaller, does it also run faster?

Clearly not: run time is not proportional to the number of parameters. I ran a comparison on several yolov5n variants and got the following results:

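==============================================================================================================
| Model               | Compression                                   | Model size (ONNX) | Mean run time (ncnn) |
| yolov5n             | original yolov5n model                        | 7.1 MB            | 0.0411731 s          |
| yolov5n_0.3p        | pruned, ratio 0.3                             | 4.3 MB            | 0.0395994 s          |
| yolov5n_0.6p        | pruned, ratio 0.6                             | 2.3 MB            | 0.0413083 s          |
| yolov5n_0.7p        | pruned, ratio 0.7                             | 1.9 MB            | 0.041187 s           |
| yolov5n_0.7change_p | pruned, ratio 0.7, with optimized channels    | 2.0 MB            | 0.0356186 s          |
==============================================================================================================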
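To make the mismatch between size and speed concrete, the figures in the table above can be turned into ratios against the baseline. This is only arithmetic on the reported numbers; the helper name `relative_to_baseline` is made up for this illustration.

```python
# Figures copied from the table above: (ONNX size in MB, mean ncnn run time in s).
results = {
    "yolov5n": (7.1, 0.0411731),
    "yolov5n_0.3p": (4.3, 0.0395994),
    "yolov5n_0.6p": (2.3, 0.0413083),
    "yolov5n_0.7p": (1.9, 0.041187),
    "yolov5n_0.7change_p": (2.0, 0.0356186),
}


def relative_to_baseline(results, baseline="yolov5n"):
    """Return {name: (size_ratio, time_ratio)} relative to the baseline model."""
    base_size, base_time = results[baseline]
    return {
        name: (size / base_size, time / base_time)
        for name, (size, time) in results.items()
    }


ratios = relative_to_baseline(results)
for name, (size_ratio, time_ratio) in ratios.items():
    print(f"{name:22s} size x{size_ratio:.2f}  time x{time_ratio:.2f}")
```

The plain 0.7 model is roughly a quarter of the original size yet takes essentially the same time, while the optimized 0.7 model needs about 13% less time than the baseline.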

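A note on how the models were compressed: the initial version trains the model with a sparsity penalty, then prunes channels according to the magnitude of the BN layer weights. During pruning, however, you will see the following: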

prune0.7
==============================================================================================
|	layer name               |         origin channels     |         remaining channels  |
|	model.0.bn               |         16                  |         13                  |
|	model.1.bn               |         32                  |         31                  |
|	model.2.cv1.bn           |         16                  |         16                  |
|	model.2.cv2.bn           |         16                  |         15                  |
|	model.2.cv3.bn           |         32                  |         32                  |
|	model.2.m.0.cv1.bn       |         16                  |         16                  |
|	model.2.m.0.cv2.bn       |         16                  |         16                  |
|	model.3.bn               |         64                  |         57                  |
|	model.4.cv1.bn           |         32                  |         32                  |
|	model.4.cv2.bn           |         32                  |         28                  |
|	model.4.cv3.bn           |         64                  |         57                  |
|	model.4.m.0.cv1.bn       |         32                  |         32                  |
|	model.4.m.0.cv2.bn       |         32                  |         32                  |
|	model.4.m.1.cv1.bn       |         32                  |         32                  |
|	model.4.m.1.cv2.bn       |         32                  |         32                  |
|	model.5.bn               |         128                 |         66                  |
|	model.6.cv1.bn           |         64                  |         64                  |
|	model.6.cv2.bn           |         64                  |         30                  |
|	model.6.cv3.bn           |         128                 |         54                  |
|	model.6.m.0.cv1.bn       |         64                  |         64                  |
|	model.6.m.0.cv2.bn       |         64                  |         64                  |
|	model.6.m.1.cv1.bn       |         64                  |         64                  |
|	model.6.m.1.cv2.bn       |         64                  |         64                  |
|	model.6.m.2.cv1.bn       |         64                  |         64                  |
|	model.6.m.2.cv2.bn       |         64                  |         64                  |
|	model.7.bn               |         256                 |         1                   |
|	model.8.cv1.bn           |         128                 |         128                 |
|	model.8.cv2.bn           |         128                 |         1                   |
|	model.8.cv3.bn           |         256                 |         4                   |
|	model.8.m.0.cv1.bn       |         128                 |         128                 |
|	model.8.m.0.cv2.bn       |         128                 |         128                 |
|	model.9.cv1.bn           |         128                 |         1                   |
|	model.9.cv2.bn           |         256                 |         18                  |
|	model.10.bn              |         128                 |         13                  |
|	model.13.cv1.bn          |         64                  |         56                  |
|	model.13.cv2.bn          |         64                  |         8                   |
|	model.13.cv3.bn          |         128                 |         50                  |
|	model.13.m.0.cv1.bn      |         64                  |         47                  |
|	model.13.m.0.cv2.bn      |         64                  |         51                  |
|	model.14.bn              |         64                  |         35                  |
|	model.17.cv1.bn          |         32                  |         18                  |
|	model.17.cv2.bn          |         32                  |         5                   |
|	model.17.cv3.bn          |         64                  |         60                  |
|	model.17.m.0.cv1.bn      |         32                  |         22                  |
|	model.17.m.0.cv2.bn      |         32                  |         27                  |
|	model.18.bn              |         64                  |         17                  |
|	model.20.cv1.bn          |         64                  |         19                  |
|	model.20.cv2.bn          |         64                  |         9                   |
|	model.20.cv3.bn          |         128                 |         75                  |
|	model.20.m.0.cv1.bn      |         64                  |         17                  |
|	model.20.m.0.cv2.bn      |         64                  |         38                  |
|	model.21.bn              |         128                 |         18                  |
|	model.23.cv1.bn          |         128                 |         11                  |
|	model.23.cv2.bn          |         128                 |         9                   |
|	model.23.cv3.bn          |         256                 |         63                  |
|	model.23.m.0.cv1.bn      |         128                 |         10                  |
|	model.23.m.0.cv2.bn      |         128                 |         30                  |
=====================================================================================

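Here you can see that the pruned model contains channel changes such as 128 -> 1, which is not a very reasonable design. I once read in a blog post that channel counts are best kept at values such as 4, 8, 16, 32, that is, powers of two. That gave me an idea: still rank channels by the magnitude of their BN layer weights, but adjust how many channels are kept. Comparing the plain 0.7 pruning with the optimized 0.7 pruning, the optimized model is slightly larger (2.0 vs. 1.9), but its run time improves noticeably (0.0356 s vs. 0.0412 s).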
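The mask functions below take a threshold `thre`, but the post does not show where it comes from. As a rough sketch, assuming a network-slimming style setup (an L1 penalty on the BN scale factors during sparsity training, then one global threshold over all BN layers), it might look like this. The names `global_bn_threshold` and `l1_subgradient` are hypothetical, and plain NumPy arrays stand in for each BatchNorm layer's `weight` tensor.

```python
import numpy as np


def l1_subgradient(gamma, lam):
    """Subgradient of lam * |gamma|; in sparsity training a term like this
    would be added to the gradient of each BN scale factor."""
    return lam * np.sign(gamma)


def global_bn_threshold(bn_gammas, prune_ratio):
    """Value below which roughly `prune_ratio` of all |gamma| fall.

    bn_gammas: list of 1-D arrays, one per BN layer (stand-ins for the
    layers' weight tensors).
    """
    all_gammas = np.sort(np.abs(np.concatenate(bn_gammas)))
    cut = min(int(len(all_gammas) * prune_ratio), len(all_gammas) - 1)
    return all_gammas[cut]


# Toy layers: the third one has uniformly small scale factors.
layers = [
    np.linspace(0.5, 1.0, 16),
    np.linspace(0.4, 0.9, 32),
    np.linspace(0.001, 0.05, 64),
]
thre = global_bn_threshold(layers, 0.7)
remaining = [int((np.abs(g) >= thre).sum()) for g in layers]
print(thre, remaining)
```

Because the threshold is global, a layer whose scale factors are all small can lose every channel, which is one plausible explanation for the 128 -> 1 rows in the log.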

import torch


def obtain_bn_mask_change(bn_module, thre):
    # Baseline mask: keep channels whose |BN weight| is at least the threshold.
    weights = bn_module.weight.data.abs()
    thre = thre.to(weights.device)
    mask = weights.ge(thre).float()

    # If every channel falls below the threshold, keep the strongest one
    # so that the layer is never pruned down to zero channels.
    if int(mask.sum().item()) == 0:
        max_val = weights.max().item() - 0.00001
        mask = weights.ge(max_val).float()

    return mask


def getchangemasklength(mask_length):
    # Snap the surviving channel count to the nearest value in vals
    # (on a tie, the smaller value wins).
    vals = [8, 16, 32, 64, 128, 256, 512]
    return min(vals, key=lambda val: abs(mask_length - val))


def obtain_bn_maskchange(bn_module, thre):
    # Optimized mask: count the channels that pass the threshold, snap that
    # count to a hardware-friendly value, then keep that many channels.
    weights = bn_module.weight.data.abs()
    thre = thre.to(weights.device)
    mask_length = int(weights.ge(thre).sum().item())

    # Never ask for more channels than the layer actually has.
    top = min(getchangemasklength(mask_length), weights.numel())

    # Keep the `top` channels with the largest |BN weight|.
    keep = torch.topk(weights, top).indices
    mask = torch.zeros_like(weights)
    mask[keep] = 1.0

    # Float mask on the same device as the layer, like obtain_bn_mask_change.
    return mask
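As a worked example of the channel-count adjustment, the sketch below applies the same nearest-value rule to a few rows from the prune0.7 log and builds a top-k mask for a toy layer. It uses NumPy only so it runs without a GPU; `snap_channels` and `topk_mask` are illustrative names, not functions from the original code, and capping at the layer width is an assumption.

```python
import numpy as np

ALIGNED = [8, 16, 32, 64, 128, 256, 512]


def snap_channels(count, num_channels):
    """Nearest aligned count (ties go to the smaller value), capped at the layer width."""
    nearest = min(ALIGNED, key=lambda v: abs(count - v))
    return min(nearest, num_channels)


def topk_mask(gammas, keep):
    """Boolean mask selecting the `keep` largest |gamma| values."""
    order = np.argsort(-np.abs(gammas), kind="stable")
    mask = np.zeros(len(gammas), dtype=bool)
    mask[order[:keep]] = True
    return mask


# (origin channels, remaining channels) taken from the prune0.7 log.
log_rows = {
    "model.0.bn": (16, 13),
    "model.5.bn": (128, 66),
    "model.7.bn": (256, 1),
    "model.8.cv3.bn": (256, 4),
    "model.20.cv3.bn": (128, 75),
}
snapped = {name: snap_channels(rem, orig) for name, (orig, rem) in log_rows.items()}
for name, (orig, rem) in log_rows.items():
    print(f"{name:18s} {orig:4d} -> {rem:3d} -> {snapped[name]:3d}")

toy_gammas = np.array([0.9, 0.01, 0.5, 0.02, 0.7])
toy_mask = topk_mask(toy_gammas, 2)
```

Under this rule a layer such as model.7.bn would keep 8 channels instead of 1, which is consistent with the optimized model being slightly larger than the plain 0.7 one.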
