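I have recently been refactoring the YOLOv5 code, and this chapter walks through the network structure of YOLOv5s.
YOLOv5s is usually described as having three parts: backbone, neck, and head. In the YAML config, the neck and the detection head are both listed under `head`.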
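- nc: number of classes in the dataset
- depth_multiple: model depth factor (scales the depth of the network)
- width_multiple: model channel factor (scales the width of the network)
- anchors: anchor boxes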
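The [from, number, module, args] parameters
- from: which layer the input comes from. -1 means the previous layer; [-1, 6] means the previous layer and layer 6.
- number: how many identical modules are stacked, before scaling by depth_multiple. A value of 9 means nine identical modules.
- module: the module name. These modules are defined in common.py.
- args: the initialization arguments of the class, parsed and passed to the module when it is built.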
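To make the `from` field concrete, here is a minimal sketch of how a layer could pick its inputs from the outputs of earlier layers. The helper `resolve_inputs` and the placeholder strings are hypothetical and only illustrate the indexing rule; this is not the code YOLOv5 itself uses.

```python
def resolve_inputs(from_field, outputs):
    """Return the input(s) of a layer, given the outputs of all earlier layers.

    from_field is an int (single input) or a list of ints (several inputs).
    Negative values count back from the most recent layer, so -1 is the
    previous layer.
    """
    if isinstance(from_field, int):
        return outputs[from_field]
    return [outputs[i] for i in from_field]


# Placeholder outputs for layers 0..11 (strings stand in for tensors).
outputs = [f"out{i}" for i in range(12)]

single = resolve_inputs(-1, outputs)      # previous layer only
fused = resolve_inputs([-1, 6], outputs)  # previous layer and layer 6
print(single)
print(fused)
```

A layer with `from = [-1, 6]` therefore receives a list of two inputs, which is what a Concat module needs.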
nc: 80  # number of classes in the dataset, i.e. how many classes you want to detect
depth_multiple: 0.33  # model depth multiple (scales the depth of the network)
width_multiple: 0.50  # layer channel multiple (scales the width of the network)
anchors:  # anchor sizes applied to each feature-map level
  # 9 anchors in total. P is the feature-map level: P3/8 is the level-3 feature map, downsampled to 1/8 of the input
  - [10,13, 16,30, 33,23]  # P3/8: three anchors [10,13], [16,30], [33,23]
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
# YOLOv5s v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2: the feature map becomes 1/2 of the input size
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4: the feature map becomes 1/4 of the input size
   [-1, 3, C3, [128]],  # 2: feature-map size unchanged
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8: the feature map becomes 1/8 of the input size
   [-1, 6, C3, [256]],  # 4: feature-map size unchanged
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16: the feature map becomes 1/16 of the input size
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]
# YOLOv5s v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 11: channels unchanged, height and width doubled
   [[-1, 6], 1, Concat, [1]],  # 12 cat backbone P4: fuse with the output of layer 6
   [-1, 3, C3, [512, False]],  # 13
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3: fuse with the output of layer 4
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4: fuse with the output of layer 14
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5: fuse with the output of layer 10
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)
   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
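Each row of `anchors` in the config is a flat list of six numbers that reads as three (width, height) pairs. The sketch below groups them; the helper name `to_pairs` and the `strides` list are made up for illustration, with the stride values taken from the P3/8, P4/16, and P5/32 comments.

```python
# Anchor rows copied from the config above, one row per detection level.
anchors = [
    [10, 13, 16, 30, 33, 23],        # P3/8
    [30, 61, 62, 45, 59, 119],       # P4/16
    [116, 90, 156, 198, 373, 326],   # P5/32
]
strides = [8, 16, 32]  # downsampling factor of each level


def to_pairs(flat):
    """Group a flat [w0, h0, w1, h1, ...] list into (w, h) tuples."""
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


anchor_pairs = [to_pairs(row) for row in anchors]
for stride, pairs in zip(strides, anchor_pairs):
    print(f"stride {stride}: {pairs}")
```

The small anchors sit on the high-resolution P3 map and the large anchors on the coarse P5 map, which gives nine anchors in total.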
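YOLOv5 ships five configuration files: n, s, m, l, and x. They differ only in the two parameters depth_multiple and width_multiple; everything else is identical.
- yolov5s.yaml uses depth_multiple 0.33 and width_multiple 0.50
- Example: for [-1, 1, Conv, [64, 6, 2, 2]], a width_multiple of 0.5 gives 64 * 0.5 = 32 output channels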
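The scaling rule can be sketched in a few lines. The helpers below follow the rounding behavior as I understand it from the YOLOv5 model parser (channels rounded up to a multiple of 8, repeat counts rounded and kept at 1 or more); the names `scale_channels` and `scale_depth` are my own, so treat this as an approximation and not the library's exact code.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_channels(channels, width_multiple):
    """Scale an output-channel count from the YAML by width_multiple."""
    return make_divisible(channels * width_multiple, 8)


def scale_depth(number, depth_multiple):
    """Scale a repeat count from the YAML by depth_multiple (never below 1)."""
    return max(round(number * depth_multiple), 1) if number > 1 else number


# yolov5s: depth_multiple 0.33, width_multiple 0.50
print(scale_channels(64, 0.50))                    # 32
print([scale_depth(n, 0.33) for n in (3, 6, 9)])   # [1, 2, 3]

# yolov5l: both multiples are 1.0, so the YAML values are used as written
print(scale_channels(64, 1.0))                     # 64
print([scale_depth(n, 1.0) for n in (3, 6, 9)])    # [3, 6, 9]
```

This is why the C3 blocks written as 3, 6, 9 in the YAML are repeated 1, 2, 3 times in YOLOv5s.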
Layer | from | module | arguments | input | output |
---|---|---|---|---|---|
0 | -1 | Conv | [3, 32, 6, 2, 2] | [3, 640, 640] | [32, 320, 320] |
1 | -1 | Conv | [32, 64, 3, 2] | [32, 320, 320] | [64, 160, 160] |
2 | -1 | C3 | [64, 64, 1] | [64, 160, 160] | [64, 160, 160] |
3 | -1 | Conv | [64, 128, 3, 2] | [64, 160, 160] | [128, 80, 80] |
4 | -1 | C3 | [128, 128, 2] | [128, 80, 80] | [128, 80, 80] |
5 | -1 | Conv | [128, 256, 3, 2] | [128, 80, 80] | [256, 40, 40] |
6 | -1 | C3 | [256, 256, 3] | [256, 40, 40] | [256, 40, 40] |
7 | -1 | Conv | [256, 512, 3, 2] | [256, 40, 40] | [512, 20, 20] |
8 | -1 | C3 | [512, 512, 1] | [512, 20, 20] | [512, 20, 20] |
9 | -1 | SPPF | [512, 512, 5] | [512, 20, 20] | [512, 20, 20] |
10 | -1 | Conv | [512, 256, 1, 1] | [512, 20, 20] | [256, 20, 20] |
11 | -1 | Upsample | [None, 2, 'nearest'] | [256, 20, 20] | [256, 40, 40] |
12 | [-1, 6] | Concat | [1] | [1, 256, 40, 40],[1, 256, 40, 40] | [512, 40, 40] |
13 | -1 | C3 | [512, 256, 1, False] | [512, 40, 40] | [256, 40, 40] |
14 | -1 | Conv | [256, 128, 1, 1] | [256, 40, 40] | [128, 40, 40] |
15 | -1 | Upsample | [None, 2, 'nearest'] | [128, 40, 40] | [128, 80, 80] |
16 | [-1, 4] | Concat | [1] | [1, 128, 80, 80],[1, 128, 80, 80] | [256, 80, 80] |
17 | -1 | C3 | [256, 128, 1, False] | [256, 80, 80] | [128, 80, 80] |
18 | -1 | Conv | [128, 128, 3, 2] | [128, 80, 80] | [128, 40, 40] |
19 | [-1, 14] | Concat | [1] | [1, 128, 40, 40],[1, 128, 40, 40] | [256, 40, 40] |
20 | -1 | C3 | [256, 256, 1, False] | [256, 40, 40] | [256, 40, 40] |
21 | -1 | Conv | [256, 256, 3, 2] | [256, 40, 40] | [256, 20, 20] |
22 | [-1, 10] | Concat | [1] | [1, 256, 20, 20],[1, 256, 20, 20] | [512, 20, 20] |
23 | -1 | C3 | [512, 512, 1, False] | [512, 20, 20] | [512, 20, 20] |
24 | [17, 20, 23] | Detect | [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]] | [1, 128, 80, 80],[1, 256, 40, 40],[1, 512, 20, 20] | [1, 3, 80, 80, 85],[1, 3, 40, 40, 85],[1, 3, 20, 20, 85] |
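The spatial sizes in the table can be checked with the standard convolution output-size formula, out = floor((in + 2p - k) / s) + 1. The sketch below traces the five stride-2 Conv layers of the backbone for a 640x640 input. It assumes that a Conv without an explicit padding argument pads by k // 2; `conv_out` is a helper name invented for this example.

```python
def conv_out(size, k, s, p=None):
    """Output height or width of a convolution with kernel k, stride s, padding p."""
    if p is None:
        p = k // 2  # assumed default padding when none is given
    return (size + 2 * p - k) // s + 1


# (kernel, stride, padding) for backbone layers 0, 1, 3, 5, 7
backbone_convs = [(6, 2, 2), (3, 2, None), (3, 2, None), (3, 2, None), (3, 2, None)]

size = 640
sizes = []
for k, s, p in backbone_convs:
    size = conv_out(size, k, s, p)
    sizes.append(size)

print(sizes)  # [320, 160, 80, 40, 20]
```

The last three values, 80, 40, and 20, are the P3, P4, and P5 grid sizes that feed the Detect layer.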
Note: in yolov5l.yaml, depth_multiple and width_multiple are both 1.
Layer | from | module | arguments | input | output |
---|---|---|---|---|---|
0 | -1 | Conv | [3, 64, 6, 2, 2] | [3, 640, 640] | [64, 320, 320] |
1 | -1 | Conv | [64, 128, 3, 2] | [64, 320, 320] | [128, 160, 160] |
2 | -1 | C3 | [128, 128, 3] | [128, 160, 160] | [128, 160, 160] |
3 | -1 | Conv | [128, 256, 3, 2] | [128, 160, 160] | [256, 80, 80] |
4 | -1 | C3 | [256, 256, 6] | [256, 80, 80] | [256, 80, 80] |
5 | -1 | Conv | [256, 512, 3, 2] | [256, 80, 80] | [512, 40, 40] |
6 | -1 | C3 | [512, 512, 9] | [512, 40, 40] | [512, 40, 40] |
7 | -1 | Conv | [512, 1024, 3, 2] | [512, 40, 40] | [1024, 20, 20] |
8 | -1 | C3 | [1024, 1024, 3] | [1024, 20, 20] | [1024, 20, 20] |
9 | -1 | SPPF | [1024, 1024, 5] | [1024, 20, 20] | [1024, 20, 20] |
10 | -1 | Conv | [1024, 512, 1, 1] | [1024, 20, 20] | [512, 20, 20] |
11 | -1 | Upsample | [None, 2, 'nearest'] | [512, 20, 20] | [512, 40, 40] |
12 | [-1, 6] | Concat | [1] | [1, 512, 40, 40],[1, 512, 40, 40] | [1024, 40, 40] |
13 | -1 | C3 | [1024, 512, 3, False] | [1024, 40, 40] | [512, 40, 40] |
14 | -1 | Conv | [512, 256, 1, 1] | [512, 40, 40] | [256, 40, 40] |
15 | -1 | Upsample | [None, 2, 'nearest'] | [256, 40, 40] | [256, 80, 80] |
16 | [-1, 4] | Concat | [1] | [1, 256, 80, 80],[1, 256, 80, 80] | [512, 80, 80] |
17 | -1 | C3 | [512, 256, 3, False] | [512, 80, 80] | [256, 80, 80] |
18 | -1 | Conv | [256, 256, 3, 2] | [256, 80, 80] | [256, 40, 40] |
19 | [-1, 14] | Concat | [1] | [1, 256, 40, 40],[1, 256, 40, 40] | [512, 40, 40] |
20 | -1 | C3 | [512, 512, 3, False] | [512, 40, 40] | [512, 40, 40] |
21 | -1 | Conv | [512, 512, 3, 2] | [512, 40, 40] | [512, 20, 20] |
22 | [-1, 10] | Concat | [1] | [1, 512, 20, 20],[1, 512, 20, 20] | [1024, 20, 20] |
23 | -1 | C3 | [1024, 1024, 3, False] | [1024, 20, 20] | [1024, 20, 20] |
24 | [17, 20, 23] | Detect | [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [256, 512, 1024]] | [1, 256, 80, 80],[1, 512, 40, 40],[1, 1024, 20, 20] | [1, 3, 80, 80, 85],[1, 3, 40, 40, 85],[1, 3, 20, 20, 85] |
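The Detect output shapes are the same for YOLOv5s and YOLOv5l because they depend only on the number of classes, the number of anchors per level, and the grid sizes, not on the channel width. A small sketch of that arithmetic follows; the variable names are chosen for this example. The last dimension of 85 is 4 box coordinates, 1 objectness score, and 80 class scores.

```python
nc = 80                    # number of classes
na = 3                     # anchors per detection level
grid_sizes = [80, 40, 20]  # P3, P4, P5 grids for a 640x640 input

no = nc + 5  # outputs per anchor: 4 box values + 1 objectness + nc class scores

# Output shape of each detection level for a batch of one image.
shapes = [(1, na, g, g, no) for g in grid_sizes]

# Total number of candidate boxes across the three levels.
total_boxes = sum(na * g * g for g in grid_sizes)

print(shapes)
print(total_boxes)  # 25200
```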
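Once you can read the code and compare it against a diagram of the network structure, the YOLO model is easy to understand. A later post will cover how to train with YOLOv5.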