Taking cfg/yolov7-w6-pose.yaml from yolov7 as an example:
# parameters
nc: 1 # number of classes
nkpt: 4 # number of key points
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
dw_conv_kpt: True
anchors:
- [ 19,27, 44,40, 38,94 ] # P3/8
- [ 96,68, 86,152, 180,137 ] # P4/16
- [ 140,301, 303,264, 238,542 ] # P5/32
- [ 436,615, 739,380, 925,792 ] # P6/64
# yolov7 backbone
backbone:
[[-1, 1, ReOrg, []], # 0
[-1, 1, Conv, [64, 3, 1]], # 1-P1/2
[-1, 1, Conv, [128, 3, 2]], # 2-P2/4
[-1, 1, Conv, [64, 1, 1]],
[-2, 1, Conv, [64, 1, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [128, 1, 1]], # 10
[-1, 1, Conv, [256, 3, 2]], # 11-P3/8
[-1, 1, Conv, [128, 1, 1]],
[-2, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 19
[-1, 1, Conv, [512, 3, 2]], # 20-P4/16
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]], # 28
[-1, 1, Conv, [768, 3, 2]], # 29-P5/32
[-1, 1, Conv, [384, 1, 1]],
[-2, 1, Conv, [384, 1, 1]],
[-1, 1, Conv, [384, 3, 1]],
[-1, 1, Conv, [384, 3, 1]],
[-1, 1, Conv, [384, 3, 1]],
[-1, 1, Conv, [384, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [768, 1, 1]], # 37
[-1, 1, Conv, [1024, 3, 2]], # 38-P6/64
[-1, 1, Conv, [512, 1, 1]],
[-2, 1, Conv, [512, 1, 1]],
[-1, 1, Conv, [512, 3, 1]],
[-1, 1, Conv, [512, 3, 1]],
[-1, 1, Conv, [512, 3, 1]],
[-1, 1, Conv, [512, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [1024, 1, 1]], # 46
]
# yolov7 head
head:
[[-1, 1, SPPCSPC, [512]], # 47
[-1, 1, Conv, [384, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[37, 1, Conv, [384, 1, 1]], # route backbone P5
[[-1, -2], 1, Concat, [1]],
[-1, 1, Conv, [384, 1, 1]],
[-2, 1, Conv, [384, 1, 1]],
[-1, 1, Conv, [192, 3, 1]],
[-1, 1, Conv, [192, 3, 1]],
[-1, 1, Conv, [192, 3, 1]],
[-1, 1, Conv, [192, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [384, 1, 1]], # 59
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[28, 1, Conv, [256, 1, 1]], # route backbone P4
[[-1, -2], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 71
[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[19, 1, Conv, [128, 1, 1]], # route backbone P3
[[-1, -2], 1, Concat, [1]],
[-1, 1, Conv, [128, 1, 1]],
[-2, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [128, 1, 1]], # 83
[-1, 1, Conv, [256, 3, 2]],
[[-1, 71], 1, Concat, [1]], # cat
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 93
[-1, 1, Conv, [384, 3, 2]],
[[-1, 59], 1, Concat, [1]], # cat
[-1, 1, Conv, [384, 1, 1]],
[-2, 1, Conv, [384, 1, 1]],
[-1, 1, Conv, [192, 3, 1]],
[-1, 1, Conv, [192, 3, 1]],
[-1, 1, Conv, [192, 3, 1]],
[-1, 1, Conv, [192, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [384, 1, 1]], # 103
[-1, 1, Conv, [512, 3, 2]],
[[-1, 47], 1, Concat, [1]], # cat
[-1, 1, Conv, [512, 1, 1]],
[-2, 1, Conv, [512, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]], # 113
[83, 1, Conv, [256, 3, 1]],
[93, 1, Conv, [512, 3, 1]],
[103, 1, Conv, [768, 3, 1]],
[113, 1, Conv, [1024, 3, 1]],
[[114,115,116,117], 1, IKeypoint, [nc, anchors, nkpt]], # Detect(P3, P4, P5, P6)
]
The parameters at the top of the file:
nc: 1 # number of classes
nkpt: 4 # number of key points
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
dw_conv_kpt: True
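nc: the number of classes for the task. For example, when detecting people, cars, and dogs, nc=3.
nkpt: the number of keypoints. For example, for 17-keypoint human pose detection, nkpt=17.
depth_multiple: the multiplier that scales the model depth.
width_multiple: the multiplier that scales the model width (layer channels).
The backbone section: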
[-1, 1, Conv, [128, 3, 2]], # 2-P2/4
[-1, 1, Conv, [64, 1, 1]],
[-2, 1, Conv, [64, 1, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [128, 1, 1]], # 10
This part starts at P2 (layer 2) and continues downward in order through layer 3, layer 4, …, layer 10.
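The layer numbering of this excerpt can be checked with a short sketch. It numbers the nine entries by position, starting at 2, and converts the first field of each entry into absolute layer numbers. The helper name `resolve_sources` is hypothetical, and treating negative values as offsets from the current layer is an assumption about how the config is read, not code taken from yolov7.

```python
# Backbone excerpt from the config: the entries for layers 2 through 10.
# Each entry is [from, number, module, args].
EXCERPT = [
    [-1, 1, "Conv", [128, 3, 2]],  # 2-P2/4
    [-1, 1, "Conv", [64, 1, 1]],
    [-2, 1, "Conv", [64, 1, 1]],
    [-1, 1, "Conv", [64, 3, 1]],
    [-1, 1, "Conv", [64, 3, 1]],
    [-1, 1, "Conv", [64, 3, 1]],
    [-1, 1, "Conv", [64, 3, 1]],
    [[-1, -3, -5, -6], 1, "Concat", [1]],
    [-1, 1, "Conv", [128, 1, 1]],  # 10
]


def resolve_sources(entries, first_index):
    """Return (layer_index, absolute_source_indices, module) per entry.

    Assumption: a negative `from` value is an offset relative to the
    current layer, and a non-negative value is an absolute layer index.
    """
    resolved = []
    for offset, (src, _number, module, _args) in enumerate(entries):
        index = first_index + offset
        sources = src if isinstance(src, list) else [src]
        absolute = [index + s if s < 0 else s for s in sources]
        resolved.append((index, absolute, module))
    return resolved


layers = resolve_sources(EXCERPT, first_index=2)
for index, sources, module in layers:
    print(f"layer {index:>2}: {module:<6} <- {sources}")
```

Under these assumptions the Concat entry is layer 9 and joins the outputs of layers 8, 6, 4, and 3, while the final Conv is layer 10, matching the `# 10` comment in the config.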
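Where:
-1: the current layer takes its input from the previous layer; a value of -2 means the current layer (i) takes its input from layer (i-2).
1: the number of repeats; multiplied by the model depth hyperparameter (depth_multiple), it controls the depth of the model.
Conv: the specific network layer to build.
The arguments in the last field depend on the layer type:
Conv: output channels, kernel size, stride
SPP: output channels, kernel size
Focus: output channels, kernel size
BottleneckCSP: output channels, whether to enable shortcut
Concat: the dimension along which to concatenate
Detect: number of classes, anchors
The tail section: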
[[114,115,116,117], 1, IKeypoint, [nc, anchors, nkpt]], # Detect(P3, P4, P5, P6)
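[114,115,116,117]: the layers that supply the feature maps to the detection layer; here there are 4 detection layers.
IKeypoint: the keypoint detection layer.
[nc, anchors, nkpt]:
  nc: the number of classes
  anchors: the anchors
  nkpt: the number of keypoints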
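As a worked example of how these three arguments fit together, the sketch below derives the sizes implied by this config. The per-anchor layout is an assumption: 4 box values, 1 objectness score, and nc class scores, plus (x, y, confidence) for each keypoint. The text above does not state this layout, so treat the resulting numbers as illustrative.

```python
# Values copied from the config above.
nc = 1
nkpt = 4
anchors = [
    [19, 27, 44, 40, 38, 94],        # P3/8
    [96, 68, 86, 152, 180, 137],     # P4/16
    [140, 301, 303, 264, 238, 542],  # P5/32
    [436, 615, 739, 380, 925, 792],  # P6/64
]
detect_inputs = [114, 115, 116, 117]

# One anchor row per input feature map.
num_detection_layers = len(anchors)
# Each anchor is a (width, height) pair, so a row of 6 numbers holds 3 anchors.
anchors_per_layer = len(anchors[0]) // 2

# Assumed layout per anchor: box (4) + objectness (1) + class scores (nc),
# followed by (x, y, confidence) for every keypoint.
box_outputs = nc + 5
keypoint_outputs = 3 * nkpt
outputs_per_anchor = box_outputs + keypoint_outputs
channels_per_layer = anchors_per_layer * outputs_per_anchor

# The number of source layers must match the number of anchor rows.
assert len(detect_inputs) == num_detection_layers

print("detection layers:", num_detection_layers)
print("anchors per layer:", anchors_per_layer)
print("outputs per anchor:", outputs_per_anchor)
print("output channels per layer:", channels_per_layer)
```

With nc=1 and nkpt=4 this gives 4 detection layers, 3 anchors per layer, and, under the assumed layout, 18 values per anchor. Changing nkpt to 17 for human pose would raise the per-anchor size accordingly.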