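GhostNet is built from Ghost Bottlenecks, which are in turn built from Ghost modules.
As shown in the figure above, the first layer of GhostNet is a standard convolutional layer with 16 filters, followed by a series of Ghost Bottlenecks with gradually increasing channel counts. These Ghost Bottlenecks are grouped into stages according to the size of their input feature maps.
Every Ghost Bottleneck uses stride 1, except the last one in each stage, which uses stride 2.
At the end of the network, global average pooling and a convolutional layer transform the feature maps into a 1280-dimensional feature vector for final classification.
To customize the network, the number of channels in every layer can be multiplied by a width multiplier α.
α controls the width of the whole network: a smaller α gives a smaller and faster model, at the cost of lower accuracy.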
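To make the two ends of the network concrete, here is a minimal shape-tracing sketch. It is not the full model: the 224x224 input and the 7x7x160 stand-in for the last bottleneck output are assumptions for illustration, and the 160 -> 960 -> 1280 widths are taken from the code in this post.

```python
import torch
import torch.nn as nn

# Stem: a standard 3x3 convolution with 16 filters (stride 2, padding 1).
stem = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)

# Head: 1x1 convolution, global average pooling, then a 1280-d projection.
head_conv = nn.Sequential(
    nn.Conv2d(160, 960, kernel_size=1, bias=False),
    nn.BatchNorm2d(960),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d((1, 1)),
)
project = nn.Linear(960, 1280, bias=False)

stem.eval()
head_conv.eval()
with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)   # assumed ImageNet-sized input
    stem_out = stem(image)
    last_map = torch.randn(1, 160, 7, 7)  # stand-in for the last bottleneck output
    pooled = head_conv(last_map)
    embedding = project(pooled.flatten(1))

print(stem_out.shape)   # torch.Size([1, 16, 112, 112])
print(pooled.shape)     # torch.Size([1, 960, 1, 1])
print(embedding.shape)  # torch.Size([1, 1280])
```

The stride-2 stem halves the resolution, and the pooled 960-channel map has to be flattened before the linear projection can produce the 1280-dimensional vector.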
def _make_divisible(v, divisor, min_value=None):
    """
    This function is taken from the original tf repo.
    It ensures that all layers have a channel number that is divisible by
    `divisor` (8 in the original repo; GhostNet below passes 4).
    It can be seen here:
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
    """
    if min_value is None:
        min_value = divisor
    # Round v to the nearest multiple of divisor, but never below min_value.
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Make sure that round down does not go down by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v
class GhostNet(nn.Module):
    def __init__(self, cfgs, num_classes=1000, width_mult=1.):  # width_mult: width multiplier
        super(GhostNet, self).__init__()
        # setting of inverted residual blocks
        self.cfgs = cfgs

        # building first layer
        output_channel = _make_divisible(16 * width_mult, 4)  # make output_channel divisible by 4
        layers = [nn.Sequential(
            nn.Conv2d(3, output_channel, 3, 2, 1, bias=False),
            nn.BatchNorm2d(output_channel),
            nn.ReLU(inplace=True)
        )]
        input_channel = output_channel  # this output width becomes the next block's input width

        # building inverted residual blocks
        block = GhostBottleneck
        for k, exp_size, c, use_se, s in self.cfgs:  # k: kernel_size, s: stride
            # c: output channels, the #out column of Table 1 in the paper
            output_channel = _make_divisible(c * width_mult, 4)
            # exp_size: expansion size, i.e. the hidden channel count inside the bottleneck (#exp in Table 1)
            hidden_channel = _make_divisible(exp_size * width_mult, 4)
            layers.append(block(input_channel, hidden_channel, output_channel, k, s, use_se))
            input_channel = output_channel
        self.features = nn.Sequential(*layers)

        # building last several layers
        # exp_size still holds the value from the last row of cfgs (960)
        output_channel = _make_divisible(exp_size * width_mult, 4)
        self.squeeze = nn.Sequential(
            nn.Conv2d(input_channel, output_channel, 1, 1, 0, bias=False),
            nn.BatchNorm2d(output_channel),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((1, 1)),
        )
        input_channel = output_channel

        output_channel = 1280
        self.classifier = nn.Sequential(  # the last two layers in Table 1
            nn.Linear(input_channel, output_channel, bias=False),
            nn.BatchNorm1d(output_channel),
            nn.ReLU(inplace=True),
            nn.Dropout(0.2),
            nn.Linear(output_channel, num_classes),
        )

        self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        x = self.squeeze(x)
        x = x.view(x.size(0), -1)  # flatten [N, C, 1, 1] into an [N, C] matrix
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():  # self.modules() yields the network itself and all of its descendant modules
            if isinstance(m, nn.Conv2d):
                # Kaiming initialization.
                # mode can be 'fan_in' (default) or 'fan_out':
                # 'fan_in' preserves the magnitude of the weight variance in the forward pass,
                # 'fan_out' preserves the magnitudes in the backward pass.
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                # BatchNorm starts as an identity transform: scale 1, shift 0.
                m.weight.data.fill_(1)
                m.bias.data.zero_()
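To see what mode='fan_out' means numerically, the following sketch initializes one convolution the same way and compares the measured standard deviation of its weights with the value sqrt(2 / fan_out). The layer size (64 -> 128 channels, 3x3 kernel) and the seed are arbitrary choices for illustration.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the measured value is repeatable

conv = nn.Conv2d(64, 128, kernel_size=3, bias=False)
nn.init.kaiming_normal_(conv.weight, mode='fan_out', nonlinearity='relu')

# For a conv weight of shape [out, in, kH, kW]:
#   fan_in  = in  * kH * kW
#   fan_out = out * kH * kW
fan_in = 64 * 3 * 3
fan_out = 128 * 3 * 3

# With the ReLU gain sqrt(2), the target standard deviation is sqrt(2 / fan_out).
expected_std = math.sqrt(2.0 / fan_out)
measured_std = conv.weight.std().item()

print(f"fan_in={fan_in}, fan_out={fan_out}")
print(f"expected std {expected_std:.4f}, measured std {measured_std:.4f}")
```

The two values should agree to within sampling noise; with mode='fan_in' the target would instead be sqrt(2 / fan_in).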
def ghost_net(**kwargs):
    """
    Constructs a GhostNet model
    """
    cfgs = [  # corresponds to Table 1 in the paper
        # k: kernel size, t: expansion size, c: output channels, SE: use SE module, s: stride
        # k, t, c, SE, s
        [3,  16,  16, 0, 1],
        [3,  48,  24, 0, 2],
        [3,  72,  24, 0, 1],
        [5,  72,  40, 1, 2],
        [5, 120,  40, 1, 1],
        [3, 240,  80, 0, 2],
        [3, 200,  80, 0, 1],
        [3, 184,  80, 0, 1],
        [3, 184,  80, 0, 1],
        [3, 480, 112, 1, 1],
        [3, 672, 112, 1, 1],
        [5, 672, 160, 1, 2],
        [5, 960, 160, 0, 1],
        [5, 960, 160, 1, 1],
        [5, 960, 160, 0, 1],
        [5, 960, 160, 1, 1]
    ]
    return GhostNet(cfgs, **kwargs)
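The stride column of the configuration determines how the feature map shrinks through the network. The sketch below traces the spatial size block by block, assuming a 224x224 input and assuming that every stride-2 block exactly halves the resolution (which holds when the padding is k // 2, as in the first layer).

```python
# Strides copied from the s column of cfgs above, in order.
strides = [1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1]

size = 224   # assumed input resolution; the code itself does not fix it
size //= 2   # the first 3x3 convolution has stride 2
sizes = []
for s in strides:
    size //= s  # exact here because every size is divisible by its stride
    sizes.append(size)

print(sizes)
# [112, 56, 56, 28, 28, 14, 14, 14, 14, 14, 14, 7, 7, 7, 7, 7]

total_downsampling = 224 // sizes[-1]
print(total_downsampling)  # 32
```

Under these assumptions the last bottleneck outputs a 7x7 map, which the adaptive average pooling then reduces to 1x1.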
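The view function reshapes the four-dimensional tensor into a two-dimensional matrix.
As the code in the figure above shows, tensor a has shape [2, 4, 3, 3].
After running b = a.view(a.size(0), -1), b has shape [2, 36], where 2 is the size of dimension 0 of a and 36 = 4x3x3.
After running c = a.view(a.size(1), -1), c has shape [4, 18], where 4 is the size of dimension 1 of a and 18 = 2x3x3.
The same applies to d = a.view(a.size(2), -1) and e = a.view(a.size(3), -1):
d has shape [3, 24] and e has shape [3, 24], since 24 = 72 / 3.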
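The shapes above can be reproduced with a short script. The tensor contents (0 to 71 via torch.arange) are an assumption made here so the element order is easy to inspect; the original figure may have used different values.

```python
import torch

a = torch.arange(2 * 4 * 3 * 3).view(2, 4, 3, 3)  # 72 elements, shape [2, 4, 3, 3]

b = a.view(a.size(0), -1)
c = a.view(a.size(1), -1)
d = a.view(a.size(2), -1)
e = a.view(a.size(3), -1)

print(b.shape, c.shape, d.shape, e.shape)
# torch.Size([2, 36]) torch.Size([4, 18]) torch.Size([3, 24]) torch.Size([3, 24])

# view never reorders elements: it only re-chunks the flat memory layout.
# So c[0] is simply the first 18 values, not "everything in channel 0".
print(c[0].tolist() == list(range(18)))  # True

# Grouping by channel needs a permute first; reshape then copies the data.
per_channel = a.permute(1, 0, 2, 3).reshape(a.size(1), -1)
print(per_channel[0].tolist())
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 36, 37, 38, 39, 40, 41, 42, 43, 44]
```

Only b = a.view(a.size(0), -1) keeps each sample's data in its own row, which is why forward uses exactly that form before the classifier.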
If you spot any mistakes, corrections are very welcome. Thanks!