Paper: https://arxiv.org/abs/1910.03151
Code: https://github.com/BangguWu/ECANet
ECANet improves on the SENet module: it proposes a local cross-channel interaction strategy without dimensionality reduction (the ECA module) and a method for adaptively selecting the 1D convolution kernel size, which together yield better performance. Recently many works have refined channel and spatial attention and obtained gains, e.g. SKNet, SANet, and ResNeSt. Attention mechanisms really do deliver!
To this end, the authors propose a local cross-channel interaction strategy without dimensionality reduction, which can be implemented efficiently with a 1D convolution. They further propose a method for adaptively selecting the 1D convolution kernel size, which determines the coverage of local cross-channel interaction.
Concretely, given the input features, the SE block first applies global average pooling to each channel independently, then uses two fully connected (FC) layers with a nonlinearity in between, followed by a Sigmoid function to generate the channel weights. The two FC layers are designed to capture nonlinear cross-channel interaction, with dimensionality reduction used to control model complexity. Although this strategy has been widely adopted in later channel attention modules, the authors' experiments show that dimensionality reduction has side effects on channel attention prediction, and that capturing dependencies across all channels is both inefficient and unnecessary.
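For comparison, here is a minimal sketch of the SE block just described; the reduction ratio r is the dimensionality-reduction knob (r = 16 is the common default, an assumption here rather than something fixed by this post):

import torch
from torch import nn

class SELayer(nn.Module):
    """SE block sketch: GAP -> FC (reduce by r) -> ReLU -> FC (restore) -> Sigmoid."""
    def __init__(self, channel, r=16):
        super(SELayer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // r, bias=False),  # dimensionality reduction
            nn.ReLU(inplace=True),
            nn.Linear(channel // r, channel, bias=False),  # restore dimension
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        return x * y.expand_as(x)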
After channel-wise global average pooling without dimensionality reduction, ECA captures local cross-channel interaction by considering each channel together with its k neighbors. This is shown to preserve both model efficiency and effectiveness. Note that ECA can be implemented efficiently by a fast 1D convolution of size k, where the kernel size k determines the coverage of local cross-channel interaction, i.e., how many neighbors participate in the attention prediction for a given channel. To avoid manually tuning k via cross-validation, the paper proposes a method to determine k adaptively, where the coverage of interaction (i.e., the kernel size k) is proportional to the channel dimension.
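Concretely, the paper maps the channel dimension C to the kernel size via k = ψ(C) = |log₂(C)/γ + b/γ|_odd with γ = 2 and b = 1, where |t|_odd denotes the nearest odd number. A short sketch of this mapping:

import math

def adaptive_kernel_size(channels, gamma=2, b=1):
    """k = psi(C): nearest odd number to (log2(C) + b) / gamma."""
    t = int(abs((math.log(channels, 2) + b) / gamma))
    return t if t % 2 else t + 1  # force an odd kernel size

# e.g. adaptive_kernel_size(64) == 3, adaptive_kernel_size(256) == 5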
import torch
from torch import nn


class eca_layer(nn.Module):
    """Constructs an ECA module.

    Args:
        channel: Number of channels of the input feature map
        k_size: Adaptive selection of kernel size
    """
    def __init__(self, channel, k_size=3):
        super(eca_layer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=(k_size - 1) // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # x: input features with shape [b, c, h, w]
        b, c, h, w = x.size()

        # feature descriptor on the global spatial information: [b, c, 1, 1]
        y = self.avg_pool(x)

        # 1D convolution across channels: squeeze to [b, c, 1], transpose to
        # [b, 1, c] so each channel interacts with its k_size neighbors,
        # then restore the [b, c, 1, 1] shape
        y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)

        # generate channel weights in (0, 1)
        y = self.sigmoid(y)

        # reweight the input feature map channel-wise
        return x * y.expand_as(x)
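A quick sanity check of the module on a random feature map (the shapes here are illustrative, not from the paper):

x = torch.randn(2, 64, 32, 32)            # [b, c, h, w]
eca = eca_layer(channel=64, k_size=3)
out = eca(x)
print(out.shape)                          # torch.Size([2, 64, 32, 32])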
# MobileNetV2 backbone with an ECA module appended to each inverted residual block
from torch import nn
from .eca_module import eca_layer

__all__ = ['ECA_MobileNetV2', 'eca_mobilenet_v2']

model_urls = {
    'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth',
}
class ConvBNReLU(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )
class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, stride, expand_ratio, k_size):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]

        hidden_dim = int(round(inp * expand_ratio))
        self.use_res_connect = self.stride == 1 and inp == oup

        layers = []
        if expand_ratio != 1:
            # pw
            layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1))
        layers.extend([
            # dw
            ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        ])
        # ECA channel attention after the linear bottleneck
        layers.append(eca_layer(oup, k_size))
        self.conv = nn.Sequential(*layers)

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)
class ECA_MobileNetV2(nn.Module):
    def __init__(self, num_classes=1000, width_mult=1.0):
        super(ECA_MobileNetV2, self).__init__()
        block = InvertedResidual
        input_channel = 32
        last_channel = 1280
        inverted_residual_setting = [
            # t, c, n, s
            [1, 16, 1, 1],
            [6, 24, 2, 2],
            [6, 32, 3, 2],
            [6, 64, 4, 2],
            [6, 96, 3, 1],
            [6, 160, 3, 2],
            [6, 320, 1, 1],
        ]

        # building first layer
        input_channel = int(input_channel * width_mult)
        self.last_channel = int(last_channel * max(1.0, width_mult))
        features = [ConvBNReLU(3, input_channel, stride=2)]

        # building inverted residual blocks
        for t, c, n, s in inverted_residual_setting:
            output_channel = int(c * width_mult)
            for i in range(n):
                # fixed ECA kernel size per stage: 1 for narrow stages
                # (c <= 96 channels), 3 for the wider stages
                if c <= 96:
                    ksize = 1
                else:
                    ksize = 3
                stride = s if i == 0 else 1
                features.append(block(input_channel, output_channel, stride, expand_ratio=t, k_size=ksize))
                input_channel = output_channel

        # building last several layers
        features.append(ConvBNReLU(input_channel, self.last_channel, kernel_size=1))
        # make it nn.Sequential
        self.features = nn.Sequential(*features)

        # building classifier
        self.classifier = nn.Sequential(
            nn.Dropout(0.25),
            nn.Linear(self.last_channel, num_classes),
        )

        # weight initialization
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                if m.bias is not None:
                    nn.init.zeros_(m.bias)

    def forward(self, x):
        x = self.features(x)
        # global average pooling over the spatial dimensions
        x = x.mean(-1).mean(-1)
        x = self.classifier(x)
        return x
def eca_mobilenet_v2(pretrained=False, progress=True, **kwargs):
    """
    Constructs an ECA_MobileNetV2 architecture.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    model = ECA_MobileNetV2(**kwargs)
    # Loading vanilla MobileNetV2 weights is left disabled here; enabling it
    # would also require importing load_state_dict_from_url (e.g. from torch.hub).
    # if pretrained:
    #     state_dict = load_state_dict_from_url(model_urls['mobilenet_v2'],
    #                                           progress=progress)
    #     model.load_state_dict(state_dict)
    return model
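A minimal usage sketch, assuming the two files above are on the import path and using the standard 224x224 ImageNet input size:

import torch

model = eca_mobilenet_v2(num_classes=1000)
x = torch.randn(1, 3, 224, 224)
logits = model(x)
print(logits.shape)  # torch.Size([1, 1000])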