YOLOv4 Discussion, Part 9: (3) The Hybrid-Domain Attention Mechanism CBAM -- Adding Attention Mechanisms to the Network with Darknet YOLOv4

Contents

  • Basic Concepts
  • Configuration
    • CAM module configuration
    • CBAM
  • Training Results
  • Summary

In the previous two articles, "YOLOv4 Discussion, Part 7: Adding Attention Modules to a Darknet YOLOv4 Network, the SE Module" (https://blog.csdn.net/qq_41736617/article/details/118424585) and "YOLOv4 Discussion, Part 8: (2) The SAM Module -- Adding Attention Mechanisms to the Network with Darknet YOLOv4" (https://blog.csdn.net/qq_41736617/article/details/118496945), we covered how to add the SE and SAM modules. In this article we add the CBAM module to Darknet.

Basic Concepts

In Darknet, the CBAM module can be built by chaining two kinds of layers, CAM and SAM. The schematic is as follows:
(Figure 1: CBAM schematic)
One loose end from the previous articles is the CAM module itself. Its schematic:
(Figure 2: CAM schematic)
First, apply global max pooling and global average pooling to the input feature map, which yields two C×1×1 vectors. Each vector is then passed through a Shared MLP, which can be implemented (as in the SE module) as two convolutions that shrink and then restore the channel count. The two branch outputs are summed, a Sigmoid is applied, and the result is multiplied (scale) channel-wise with the input feature map.
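The computation above can be sketched in NumPy (a minimal illustration, not Darknet code; the `channel_attention` name, the weight shapes, and the toy values are assumptions for this sketch; the linear activations match the cfg below, while the original CBAM paper inserts a ReLU between the two layers):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """CAM sketch. x: feature map (C, H, W); w1: (C//r, C) and w2: (C, C//r)
    are the two 1x1-conv (MLP) weight matrices, linear activations as in the cfg."""
    avg = x.mean(axis=(1, 2))                        # global average pooling -> (C,)
    mx  = x.max(axis=(1, 2))                         # global max pooling     -> (C,)
    s = sigmoid(w2 @ (w1 @ avg) + w2 @ (w1 @ mx))    # sum the branches, then Sigmoid
    return x * s[:, None, None]                      # per-channel scale of the input

# toy usage: 8 channels, reduction 8 -> 2 -> 8
x = np.linspace(-1.0, 1.0, 8 * 4 * 4).reshape(8, 4, 4)
w1 = np.full((2, 8), 0.1)
w2 = np.full((8, 2), 0.1)
out = channel_attention(x, w1, w2)
print(out.shape)  # (8, 4, 4)
```

Because the attention weights come out of a Sigmoid, each channel of the output is the input channel scaled by a factor in (0, 1).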

Configuration

As in the previous two articles, we start from yolov3-tiny.cfg. Adding the RES, CAM, and CBAM modules means inserting the content marked with #### into the configuration file:

CAM module configuration

Here each pooling branch gets its own MLP in place of a truly Shared MLP. From the convolutional_layer.c source we know that when batch normalization is enabled, the bias is disabled (bias = False). The two convolutional layers use the linear activation, i.e. values pass through unchanged.
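The bias = False point can be checked with a tiny numerical sketch: batch normalization subtracts the per-channel mean, so any constant bias added before it cancels out (the `batchnorm` helper below is a simplified stand-in, not Darknet's implementation):

```python
import numpy as np

def batchnorm(x, eps=1e-5):
    """Simplified per-channel batch norm over a (N, C) batch (no gamma/beta)."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 4))
bias = np.array([1.0, -2.0, 0.5, 3.0])   # a per-channel conv bias
# The mean absorbs any constant bias, so the normalized output is unchanged:
print(np.allclose(batchnorm(x), batchnorm(x + bias)))  # True
```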


[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2


[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#### added content begins ####

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
##### RES begins #####
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
### CAM begins ###
#### global avgpooling (per channel) ####
[avgpool]
channelpool = 0

[convolutional]
batch_normalize=1
filters=8
size=1
stride=1
pad=1
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=linear
#### global maxpooling (per channel) ##
[route]
layers = -4

[maxpool]
size=13
stride=13
padding=0

#####MLP########
[convolutional]
batch_normalize=1
filters=8
size=1
stride=1
pad=1
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=linear
##### sum the outputs of the two pooling branches #####
[shortcut]
from=-5
activation=logistic

[scale_channels]
from = -9
#scale_wh = 1
activation= linear

#### CAM ends ###
[shortcut]
from=-12
activation=linear

#### RES ends ####
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky


[route]
layers = -1,-19

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
#### added content ends ####
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1
......
......
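A detail worth noting in the cfg above: Darknet's [avgpool] always pools globally, but [maxpool] with size=13, stride=13 only behaves as a global max pool when the feature map is exactly 13×13 at that point in the network (an assumption about the input resolution here). A naive sketch of that behavior:

```python
import numpy as np

def darknet_maxpool(x, size, stride):
    """Naive [maxpool] sketch for a (C, H, W) map, padding = 0."""
    c, h, w = x.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.empty((c, oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[:, i, j] = x[:, i * stride:i * stride + size,
                                j * stride:j * stride + size].max(axis=(1, 2))
    return out

x = np.arange(2 * 13 * 13, dtype=float).reshape(2, 13, 13)
pooled = darknet_maxpool(x, size=13, stride=13)
print(pooled.shape)  # (2, 1, 1) -- one value per channel, i.e. a global max
```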

CBAM

There is nothing new here: CBAM is simply the CAM block followed by the SAM block.
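The SAM half can be sketched the same way (a minimal NumPy illustration, not Darknet code; note that the cfg below produces a 128-channel mask via filters=128, while this sketch uses the single-map form from the original CBAM design, and the `spatial_attention` name and toy values are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(x, kernel):
    """SAM sketch. x: (C, H, W); kernel: (2, k, k), the conv applied to the
    stacked max/avg maps ('same' padding, one output attention map)."""
    mx  = x.max(axis=0)                    # max across channels  -> (H, W)
    avg = x.mean(axis=0)                   # mean across channels -> (H, W)
    stacked = np.stack([mx, avg])          # (2, H, W): the [route] concat
    k = kernel.shape[-1]
    p = k // 2
    padded = np.pad(stacked, ((0, 0), (p, p), (p, p)))
    h, w = x.shape[1:]
    att = np.empty((h, w))
    for i in range(h):                     # naive 'same' convolution
        for j in range(w):
            att[i, j] = (padded[:, i:i + k, j:j + k] * kernel).sum()
    return x * sigmoid(att)[None]          # one spatial mask for all channels

x = np.linspace(-1.0, 1.0, 4 * 6 * 6).reshape(4, 6, 6)
out = spatial_attention(x, np.full((2, 7, 7), 0.01))
print(out.shape)  # (4, 6, 6)
```

Running the input through `channel_attention` first and feeding the result into `spatial_attention` gives the full CBAM order: channel attention, then spatial attention.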

......
......
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2


[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#### added content begins ####

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

##### RES begins #####
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
### CBAM begins ###
### CAM begins ###
#### global avgpooling (per channel) ####
[avgpool]
channelpool = 0

[convolutional]
batch_normalize=1
filters=8
size=1
stride=1
pad=1
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=linear
#### global maxpooling (per channel) ##
[route]
layers = -4

[maxpool]
size=13
stride=13
padding=0

#####MLP########
[convolutional]
batch_normalize=1
filters=8
size=1
stride=1
pad=1
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=linear
##### sum the outputs of the two pooling branches #####
[shortcut]
from=-5
activation=logistic

[scale_channels]
from = -9
#scale_wh = 1
activation= linear

#### CAM ends ###
#### SAM begins ###
[maxpool]
maxpool_depth = 1
out_channels = 1

[route]
layers = -2

[avgpool]
channelpool = 1

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=128
size=7
stride=1
pad=1
activation=logistic #Sigmoid

[sam]
from = -6
activation= linear
#### SAM ends ###
#### CBAM ends ####

[shortcut]
from=-18
activation=linear
#### RES ends ####
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky


[route]
layers = -1,-25

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
#### added content ends ####
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1
......
......

Training Results

As before, we go straight to the result screenshots. Detailed hyperparameter tuning is omitted, and no comparison is made between different network structures; the goal is simply to show that the modification works.

Results after 2400 training iterations:
(Figure 3: detection results after 2400 iterations)
Results after 3000 training iterations:
(Figure 4: detection results after 3000 iterations)

Summary

No LOSS curve this time: it was not saved at the start, and later runs kept failing with the following error:

CUDA Error: an illegal memory access was encountered: Resource temporarily unavailable

This article will be updated once that problem is solved.
