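Abstract: This work proposes a feature pyramid network that combines context augmentation with feature refinement. Features produced by multi-scale dilated convolutions are fused and injected into the feature pyramid network from top to bottom to supplement contextual information. Channel and spatial feature refinement mechanisms are introduced to suppress the conflicts that arise during multi-scale feature fusion and to keep tiny objects from being drowned out by conflicting information. In addition, a copy-reduce-paste data augmentation method is proposed; it increases the contribution of tiny objects to the loss during training, which keeps training more balanced. Experimental results show that the network reaches an average precision (AP) of 16.9% (IoU=0.5:0.95) on the VOC dataset, which is 3.9% higher than YOLOv4, 7.7% higher than CenterNet, and 5.3% higher than RefineDet.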
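The abstract does not spell out how copy-reduce-paste is carried out. One plausible reading is sketched below with NumPy: copy each labelled object, shrink it, and paste the smaller copy at a random position so that more tiny instances contribute to the loss. The function name `copy_reduce_paste`, the `scale` parameter, and the random placement rule are assumptions made for illustration; checks for overlap with existing objects are left out.

```python
import numpy as np


def copy_reduce_paste(image, boxes, scale=0.5, rng=None):
    """Copy each boxed object, shrink it, and paste it at a random location.

    image: H x W x C uint8 array; boxes: list of (x1, y1, x2, y2) integer tuples.
    Returns the augmented image and the extended box list.
    """
    rng = rng or np.random.default_rng()
    out = image.copy()
    new_boxes = list(boxes)
    h, w = image.shape[:2]
    for x1, y1, x2, y2 in boxes:
        patch = image[y1:y2, x1:x2]
        ph, pw = patch.shape[:2]
        nh, nw = max(1, int(ph * scale)), max(1, int(pw * scale))
        # Nearest-neighbour downscale by index sampling (no extra dependencies).
        rows = np.arange(nh) * ph // nh
        cols = np.arange(nw) * pw // nw
        small = patch[rows][:, cols]
        # Random top-left corner chosen so the shrunken patch fits inside the image.
        px = int(rng.integers(0, w - nw + 1))
        py = int(rng.integers(0, h - nh + 1))
        out[py:py + nh, px:px + nw] = small
        new_boxes.append((px, py, px + nw, py + nh))
    return out, new_boxes
```

Each original box yields one additional, smaller box, so the number of small targets per image doubles in this simplified form.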
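Context augmentation: dilated (atrous) convolutions with different dilation rates are applied to C5 to obtain semantic information under different receptive fields. The kernel size is 3×3 and the dilation rates are 1, 3, and 5. In effect, this is a cascade of multi-scale dilated convolutions.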
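A minimal PyTorch sketch of this step, assuming the three dilated branches run in parallel on C5 and are merged by concatenation followed by a 1×1 convolution. The text does not say how the branches are combined, so the fusion step and the class name `ContextAugmentation` are assumptions rather than the author's implementation.

```python
import torch
import torch.nn as nn


class ContextAugmentation(nn.Module):
    """Hypothetical sketch: parallel 3x3 dilated convolutions on C5, fused by concat + 1x1 conv."""

    def __init__(self, channels, rates=(1, 3, 5)):
        super().__init__()
        # For a 3x3 kernel, padding == dilation keeps the spatial size unchanged.
        self.branches = nn.ModuleList(
            [nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r) for r in rates]
        )
        # Assumed fusion: concatenate the branches, then project back to `channels`.
        self.fuse = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

    def forward(self, c5):
        return self.fuse(torch.cat([branch(c5) for branch in self.branches], dim=1))


c5 = torch.randn(1, 8, 20, 20)
print(ContextAugmentation(8)(c5).shape)  # torch.Size([1, 8, 20, 20])
```

With rates 1, 3, and 5 the branches cover 3×3, 7×7, and 11×11 receptive fields on the same input, which is what supplies the extra context.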
# To request the code, contact @马化腾: 1444151069
import torch
import torch.nn as nn


class H4(nn.Module):
    # Fourth head: builds the F2 feature map
    def __init__(self, c):
        super().__init__()
        # Three upsampling ops with scale factors 2, 4 and 8, applied to F3, F4 and F5.
        # nn.ModuleList (rather than a plain list) registers them as submodules.
        self.ups = nn.ModuleList([nn.Upsample(scale_factor=2 ** i) for i in range(1, 4)])
        # 1x1 convolution applied to C2
        self.conv = nn.Conv2d(c, c, 1, 1)

    def forward(self, x):
        # x holds the feature maps [F3, F4, F5, C2], matching the route in the head config
        feas = [up(f) for up, f in zip(self.ups, x)]
        feas.append(self.conv(x[3]))
        return torch.cat(feas, dim=1)
# Anchor changes:
anchors:
  - [5,6, 8,15, 16,12]  # F2
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, CoordAtt, [1024, 5]],  # 9 CA <-- Coordinate Attention [out_channel, reduction]
  ]
# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],  # F5
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13 F4
   [[-1, 10, 4], 1, H3, [256]],  # route [F4,F5,C3] -> F3
   [[-1, -2, 10, 2], 1, H4, [128]],  # route [F3,F4,F5,C2] -> F2
   [[10, 13, 14, 15], 1, FRM, [4736,0]],  # Centered on F5
   [[10, 13, 14, 15], 1, FRM, [4736,1]],  # Centered on F4
   [[10, 13, 14, 15], 1, FRM, [4736,2]],  # Centered on F3
   [[10, 13, 14, 15], 1, FRM, [4736,3]],  # Centered on F2
   [[19, 18, 17, 16], 1, Detect, [nc, anchors]],  # Detect(F2, P3, P4, P5)
  ]
class FRM(nn.Module):
    # Implements the FRM as an nn.Module
    def __init__(self, c1, m=0):...
    def forward(self, x):
        # x is the list of input feature maps
        feas = []
        for i, l in enumerate(self.layer1):
            feas.append(l(x[i]))
        C1 = self.conv(torch.cat(feas, dim=1))
        C1_1 = self.cp(C1)
        C1_2 = self.sfm(C1)
        C1_1 = split_multipy(C1_1, feas)
        C1_2 = split_multipy(C1_2, feas)
        return self.cat_conv(C1_1+C1_2)
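The forward pass above derives masks from the concatenated features and applies them back to the per-level feature maps. As a rough illustration of that idea, and not the author's `split_multipy` (whose body is not shown), the sketch below fuses same-shaped feature maps with per-pixel softmax weights. The name `weighted_fuse` and the `(B, N, H, W)` layout of the score tensor are assumptions.

```python
import torch


def weighted_fuse(logits, feats):
    """Fuse same-shaped feature maps with per-pixel softmax weights.

    logits: (B, N, H, W) tensor, one channel of scores per feature level (assumed layout).
    feats:  list of N tensors, each of shape (B, C, H, W).
    """
    # Softmax over the level axis: at every pixel the N weights sum to 1.
    alphas = torch.softmax(logits, dim=1)
    return sum(alphas[:, i:i + 1] * f for i, f in enumerate(feats))


feats = [torch.randn(1, 4, 8, 8) for _ in range(3)]
fused = weighted_fuse(torch.randn(1, 3, 8, 8), feats)
print(fused.shape)  # torch.Size([1, 4, 8, 8])
```

Because the weights compete across levels at each position, a level that disagrees with the others can be down-weighted locally instead of being averaged in.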
yolo: cfg=yolov5s1.yaml, device=cpu, profile=False
YOLOv5 2022-8-3 torch 1.7.1+cu101 CPU
                 from  n    params  module                                  arguments
  0                -1  1      3520  models.common.Conv                      [3, 32, 6, 2, 2]
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  2                -1  1     18816  models.common.C3                        [64, 64, 1]
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  4                -1  2    115712  models.common.C3                        [128, 128, 2]
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  6                -1  3    625152  models.common.C3                        [256, 256, 3]
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]
  8                -1  1   1182720  models.common.C3                        [512, 512, 1]
  9                -1  1    158002  models.common.CoordAtt                  [512, 512, 5]
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]
 14       [-1, 10, 4]  1    409728  models.h3h4.H3                          [128]
 15   [-1, -2, 10, 2]  1    102464  models.h3h4.H4                          [64]
 16  [10, 13, 14, 15]  1    390100  models.frm.FRM                          [2368, 0]
 17  [10, 13, 14, 15]  1    390100  models.frm.FRM                          [2368, 1]
 18  [10, 13, 14, 15]  1    390100  models.frm.FRM                          [2368, 2]
 19  [10, 13, 14, 15]  1    390100  models.frm.FRM                          [2368, 3]
 20  [19, 18, 17, 16]  1    131580  Detect                                  [80, [[5, 6, 8, 15, 16, 12], [10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 128, 128, 128]]
Model Summary: 242 layers, 6370302 parameters, 6370302 gradients
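The FRM input width of 2368 in the summary can be reproduced from the channel counts of the layers it reads. The arithmetic below is a sketch under two assumptions: H3 concatenates its inputs the same way H4 does (H3's code is not shown), and the 4736 in the YAML is halved by the yolov5s width multiple, as the logged value suggests.

```python
# Channel counts read from the yolov5s summary (width multiple already applied).
f5 = 256  # layer 10: Conv output
f4 = 256  # layer 13: C3 output
c3 = 128  # layer 4: backbone C3 output at stride 8
c2 = 64   # layer 2: backbone C3 output at stride 4

# Assumption: H3 concatenates [up(F4), up(F5), conv(C3)] along the channel axis.
f3 = f4 + f5 + c3
# H4 concatenates [up(F3), up(F4), up(F5), conv(C2)] along the channel axis.
f2 = f3 + f4 + f5 + c2

# Each FRM reads layers [10, 13, 14, 15], i.e. F5, F4, F3 and F2.
frm_in = f5 + f4 + f3 + f2
print(f3, f2, frm_in)  # 640 1216 2368
```

Doubling 2368 gives the 4736 written in the head config, which is the same sum at full width.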