Author: Horizon Max
ShuffleNet is an extremely computation-efficient CNN architecture, designed specifically for mobile devices with very limited computing power;
Through two new operations, pointwise group convolution and channel shuffle, it greatly reduces computational cost while maintaining accuracy;
On the ImageNet classification task, ShuffleNet achieves a lower top-1 error than the recent MobileNet (lower by an absolute 7.8%);
On an ARM-based mobile device, ShuffleNet achieves roughly a 13x actual speedup over AlexNet while maintaining comparable accuracy;
Paper: ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
Among state-of-the-art base architectures, networks such as Xception and ResNeXt become less efficient when scaled down to very small sizes, because their dense 1×1 convolutions are expensive;
To address this, the authors propose pointwise group convolution to reduce the computational complexity of the 1×1 convolutions;
To overcome the side effects of pointwise group convolution, they propose a channel shuffle operation that lets information flow across feature channels;
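As a quick illustration of that cost saving (my own sketch, not code from the paper): a 1×1 convolution with g groups uses g times fewer weights than a dense one, since each group only connects to 1/g of the input channels.
import torch.nn as nn

in_c, out_c, g = 240, 240, 3                      # example widths; g = number of groups
dense   = nn.Conv2d(in_c, out_c, kernel_size=1, bias=False)
grouped = nn.Conv2d(in_c, out_c, kernel_size=1, groups=g, bias=False)
print(dense.weight.numel())     # 240 * 240     = 57600
print(grouped.weight.numel())   # 240 * 240 / 3 = 19200, i.e. g times fewer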
The concept of group convolution was first introduced in AlexNet, where it was used to distribute the model across two GPUs;
The depthwise separable convolutions used in Xception and MobileNet further confirm its effectiveness;
In small networks, the expensive pointwise convolutions leave room for only a limited number of channels under a given complexity budget, which severely hurts accuracy;
The most direct solution is to use channel-sparse connections, e.g. group convolution, which greatly reduces the computational cost;
This, however, raises a problem: each output channel is derived from only a small fraction of the input channels, which blocks information flow between channel groups and weakens the representational power of the network;
The authors generalize group convolution and depthwise separable convolution into a new form by introducing the channel shuffle operation;
Channel shuffle allows a group convolution to take its inputs from different groups, so that the output channels become related to all of the input channels;
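To make the shuffle concrete, here is a toy example of my own (it mirrors the channel_shuffle helper defined in the code further below) on a tensor with 6 channels split into 3 groups:
import torch

x = torch.arange(6).view(1, 6, 1, 1)   # channels 0..5; groups before shuffle: (0,1) (2,3) (4,5)
g = 3
b, c, h, w = x.size()
x = x.view(b, g, c // g, h, w).transpose(1, 2).contiguous().view(b, -1, h, w)
print(x.flatten().tolist())             # [0, 2, 4, 1, 3, 5]: channels from the 3 groups are interleaved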
Figure: the ShuffleNet unit, built from the residual block and channel shuffle;
(a): depthwise convolution; (b): pointwise group convolution; (c): pointwise group convolution with stride=2
The role of the second 1×1 GConv is to change the channel dimension back so that it matches the shortcut branch;
(1) Effect of the number of filters on model accuracy (1× denotes the baseline width, 0.5× means the channel counts are scaled down by half):
(2) Effect of the number of groups (g) in group convolution on model accuracy (see the sketch below):
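As a rough back-of-the-envelope sketch of why a larger g allows wider feature maps (the channel widths come from the stage-2 configurations in the code below; it counts only a single C-to-C 1×1 group convolution, so it is illustrative rather than an exact complexity comparison):
# weights of a C -> C 1x1 group convolution: C * C / g
for g, C in [(1, 144), (2, 200), (3, 240), (4, 272), (8, 384)]:
    print(g, C, C * C // g)
# g=1: 20736, g=2: 20000, g=3: 19200, g=4: 18496, g=8: 18432
# roughly the same cost, yet g=8 affords 384 channels where g=1 only affords 144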
# Here is the code:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchinfo import summary
def channel_shuffle(x, groups):
batchsize, num_channels, height, width = x.size()
channels_per_group = num_channels // groups
# reshape: b, num_channels, h, w --> b, groups, channels_per_group, h, w
x = x.view(batchsize, groups, channels_per_group, height, width)
    # channel shuffle: swap the group and channel dimensions
    x = torch.transpose(x, 1, 2).contiguous()
# flatten
x = x.view(batchsize, -1, height, width)
return x
class shuffleNet_unit(nn.Module):
expansion = 1
    def __init__(self, in_channels, out_channels, stride, groups):
        super(shuffleNet_unit, self).__init__()
        mid_channels = out_channels // 4   # bottleneck width: 1/4 of the unit's output channels
        self.stride = stride
        # The first unit of Stage2 receives the 24-channel stem output, which is too
        # narrow for group convolution, so that unit falls back to groups=1.
        if in_channels == 24:
            self.groups = 1
        else:
            self.groups = groups
        # first 1x1 pointwise group convolution (BN + ReLU)
        self.GConv1 = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, kernel_size=1, stride=1, groups=self.groups, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True)
        )
        # 3x3 convolution with stride 1 or 2, no ReLU afterwards (as in the paper).
        # Note: the paper makes this a depthwise convolution (groups=mid_channels);
        # this implementation groups it by self.groups instead, which the summary
        # output below reflects.
        self.DWConv = nn.Sequential(
            nn.Conv2d(mid_channels, mid_channels, kernel_size=3, stride=self.stride, padding=1, groups=self.groups, bias=False),
            nn.BatchNorm2d(mid_channels)
        )
        # second 1x1 pointwise group convolution, restoring the channel dimension to
        # match the shortcut branch (ReLU is applied after the add/concat)
        self.GConv2 = nn.Sequential(
            nn.Conv2d(mid_channels, out_channels, kernel_size=1, stride=1, groups=self.groups, bias=False),
            nn.BatchNorm2d(out_channels)
        )
        # stride=2 units downsample the shortcut with 3x3 average pooling and later
        # concatenate it with the branch; stride=1 units use an identity shortcut.
        if self.stride == 2:
            self.shortcut = nn.Sequential(
                nn.AvgPool2d(kernel_size=3, stride=2, padding=1)
            )
        else:
            self.shortcut = nn.Sequential()
def forward(self, x):
out = self.GConv1(x)
out = channel_shuffle(out, groups=self.groups)
out = self.DWConv(out)
out = self.GConv2(out)
short = self.shortcut(x)
        if self.stride == 2:
            # downsampling unit: concatenate the branch with the pooled shortcut along channels
            out = F.relu(torch.cat([out, short], dim=1))
        else:
            # regular unit: element-wise addition with the identity shortcut
            out = F.relu(out + short)
return out
class ShuffleNet(nn.Module):
def __init__(self, groups, num_layers, num_channels, num_classes=1000):
super(ShuffleNet, self).__init__()
self.groups = groups
self.conv1 = nn.Sequential(
nn.Conv2d(3, 24, 3, 2, 1, bias=False),
nn.BatchNorm2d(24),
nn.ReLU(inplace=True),
)
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
self.stage2 = self.make_layers(24, num_channels[0], num_layers[0], groups)
self.stage3 = self.make_layers(num_channels[0], num_channels[1], num_layers[1], groups)
self.stage4 = self.make_layers(num_channels[1], num_channels[2], num_layers[2], groups)
self.globalpool = nn.AvgPool2d(kernel_size=7, stride=1)
self.fc = nn.Linear(num_channels[2], num_classes)
    def make_layers(self, in_channels, out_channels, num_layers, groups):
        layers = []
        # The first unit of each stage downsamples (stride=2); its branch outputs
        # (out_channels - in_channels) channels so that concatenating with the
        # pooled shortcut yields exactly out_channels.
        layers.append(shuffleNet_unit(in_channels, out_channels - in_channels, 2, groups))
in_channels = out_channels
for i in range(num_layers - 1):
layers.append(shuffleNet_unit(in_channels, out_channels, 1, groups))
return nn.Sequential(*layers)
def forward(self, x):
x = self.conv1(x)
x = self.maxpool(x)
x = self.stage2(x)
x = self.stage3(x)
x = self.stage4(x)
x = self.globalpool(x)
x = x.view(x.size(0), -1)
out = self.fc(x)
return out
def ShuffleNet_g1(**kwargs):
num_layers = [4, 8, 4]
num_channels = [144, 288, 576]
model = ShuffleNet(1, num_layers, num_channels, **kwargs)
return model
def ShuffleNet_g2(**kwargs):
num_layers = [4, 8, 4]
num_channels = [200, 400, 800]
model = ShuffleNet(2, num_layers, num_channels, **kwargs)
return model
def ShuffleNet_g3(**kwargs):
num_layers = [4, 8, 4]
num_channels = [240, 480, 960]
model = ShuffleNet(3, num_layers, num_channels, **kwargs)
return model
def ShuffleNet_g4(**kwargs):
num_layers = [4, 8, 4]
num_channels = [272, 544, 1088]
model = ShuffleNet(4, num_layers, num_channels, **kwargs)
return model
def ShuffleNet_g8(**kwargs):
num_layers = [4, 8, 4]
num_channels = [384, 768, 1536]
model = ShuffleNet(8, num_layers, num_channels, **kwargs)
return model
def test():
net = ShuffleNet_g8()
y = net(torch.randn(1, 3, 224, 224))
print(y.size())
summary(net, (1, 3, 224, 224), depth=5)
if __name__ == '__main__':
test()
Output:
torch.Size([1, 1000])
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
ShuffleNet -- --
├─Sequential: 1-1 [1, 24, 112, 112] --
│ └─Conv2d: 2-1 [1, 24, 112, 112] 648
│ └─BatchNorm2d: 2-2 [1, 24, 112, 112] 48
│ └─ReLU: 2-3 [1, 24, 112, 112] --
├─MaxPool2d: 1-2 [1, 24, 56, 56] --
├─Sequential: 1-3 [1, 384, 28, 28] --
│ └─shuffleNet_unit: 2-4 [1, 384, 28, 28] --
│ │ └─Sequential: 3-1 [1, 90, 56, 56] --
│ │ │ └─Conv2d: 4-1 [1, 90, 56, 56] 2,160
│ │ │ └─BatchNorm2d: 4-2 [1, 90, 56, 56] 180
│ │ │ └─ReLU: 4-3 [1, 90, 56, 56] --
│ │ └─Sequential: 3-2 [1, 90, 28, 28] --
│ │ │ └─Conv2d: 4-4 [1, 90, 28, 28] 72,900
│ │ │ └─BatchNorm2d: 4-5 [1, 90, 28, 28] 180
│ │ └─Sequential: 3-3 [1, 360, 28, 28] --
│ │ │ └─Conv2d: 4-6 [1, 360, 28, 28] 32,400
│ │ │ └─BatchNorm2d: 4-7 [1, 360, 28, 28] 720
│ │ └─Sequential: 3-4 [1, 24, 28, 28] --
│ │ │ └─AvgPool2d: 4-8 [1, 24, 28, 28] --
│ └─shuffleNet_unit: 2-5 [1, 384, 28, 28] --
│ │ └─Sequential: 3-5 [1, 96, 28, 28] --
│ │ │ └─Conv2d: 4-9 [1, 96, 28, 28] 4,608
│ │ │ └─BatchNorm2d: 4-10 [1, 96, 28, 28] 192
│ │ │ └─ReLU: 4-11 [1, 96, 28, 28] --
│ │ └─Sequential: 3-6 [1, 96, 28, 28] --
│ │ │ └─Conv2d: 4-12 [1, 96, 28, 28] 10,368
│ │ │ └─BatchNorm2d: 4-13 [1, 96, 28, 28] 192
│ │ └─Sequential: 3-7 [1, 384, 28, 28] --
│ │ │ └─Conv2d: 4-14 [1, 384, 28, 28] 4,608
│ │ │ └─BatchNorm2d: 4-15 [1, 384, 28, 28] 768
│ │ └─Sequential: 3-8 [1, 384, 28, 28] --
│ └─shuffleNet_unit: 2-6 [1, 384, 28, 28] --
│ │ └─Sequential: 3-9 [1, 96, 28, 28] --
│ │ │ └─Conv2d: 4-16 [1, 96, 28, 28] 4,608
│ │ │ └─BatchNorm2d: 4-17 [1, 96, 28, 28] 192
│ │ │ └─ReLU: 4-18 [1, 96, 28, 28] --
│ │ └─Sequential: 3-10 [1, 96, 28, 28] --
│ │ │ └─Conv2d: 4-19 [1, 96, 28, 28] 10,368
│ │ │ └─BatchNorm2d: 4-20 [1, 96, 28, 28] 192
│ │ └─Sequential: 3-11 [1, 384, 28, 28] --
│ │ │ └─Conv2d: 4-21 [1, 384, 28, 28] 4,608
│ │ │ └─BatchNorm2d: 4-22 [1, 384, 28, 28] 768
│ │ └─Sequential: 3-12 [1, 384, 28, 28] --
│ └─shuffleNet_unit: 2-7 [1, 384, 28, 28] --
│ │ └─Sequential: 3-13 [1, 96, 28, 28] --
│ │ │ └─Conv2d: 4-23 [1, 96, 28, 28] 4,608
│ │ │ └─BatchNorm2d: 4-24 [1, 96, 28, 28] 192
│ │ │ └─ReLU: 4-25 [1, 96, 28, 28] --
│ │ └─Sequential: 3-14 [1, 96, 28, 28] --
│ │ │ └─Conv2d: 4-26 [1, 96, 28, 28] 10,368
│ │ │ └─BatchNorm2d: 4-27 [1, 96, 28, 28] 192
│ │ └─Sequential: 3-15 [1, 384, 28, 28] --
│ │ │ └─Conv2d: 4-28 [1, 384, 28, 28] 4,608
│ │ │ └─BatchNorm2d: 4-29 [1, 384, 28, 28] 768
│ │ └─Sequential: 3-16 [1, 384, 28, 28] --
├─Sequential: 1-4 [1, 768, 14, 14] --
│ └─shuffleNet_unit: 2-8 [1, 768, 14, 14] --
│ │ └─Sequential: 3-17 [1, 96, 28, 28] --
│ │ │ └─Conv2d: 4-30 [1, 96, 28, 28] 4,608
│ │ │ └─BatchNorm2d: 4-31 [1, 96, 28, 28] 192
│ │ │ └─ReLU: 4-32 [1, 96, 28, 28] --
│ │ └─Sequential: 3-18 [1, 96, 14, 14] --
│ │ │ └─Conv2d: 4-33 [1, 96, 14, 14] 10,368
│ │ │ └─BatchNorm2d: 4-34 [1, 96, 14, 14] 192
│ │ └─Sequential: 3-19 [1, 384, 14, 14] --
│ │ │ └─Conv2d: 4-35 [1, 384, 14, 14] 4,608
│ │ │ └─BatchNorm2d: 4-36 [1, 384, 14, 14] 768
│ │ └─Sequential: 3-20 [1, 384, 14, 14] --
│ │ │ └─AvgPool2d: 4-37 [1, 384, 14, 14] --
│ └─shuffleNet_unit: 2-9 [1, 768, 14, 14] --
│ │ └─Sequential: 3-21 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-38 [1, 192, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-39 [1, 192, 14, 14] 384
│ │ │ └─ReLU: 4-40 [1, 192, 14, 14] --
│ │ └─Sequential: 3-22 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-41 [1, 192, 14, 14] 41,472
│ │ │ └─BatchNorm2d: 4-42 [1, 192, 14, 14] 384
│ │ └─Sequential: 3-23 [1, 768, 14, 14] --
│ │ │ └─Conv2d: 4-43 [1, 768, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-44 [1, 768, 14, 14] 1,536
│ │ └─Sequential: 3-24 [1, 768, 14, 14] --
│ └─shuffleNet_unit: 2-10 [1, 768, 14, 14] --
│ │ └─Sequential: 3-25 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-45 [1, 192, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-46 [1, 192, 14, 14] 384
│ │ │ └─ReLU: 4-47 [1, 192, 14, 14] --
│ │ └─Sequential: 3-26 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-48 [1, 192, 14, 14] 41,472
│ │ │ └─BatchNorm2d: 4-49 [1, 192, 14, 14] 384
│ │ └─Sequential: 3-27 [1, 768, 14, 14] --
│ │ │ └─Conv2d: 4-50 [1, 768, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-51 [1, 768, 14, 14] 1,536
│ │ └─Sequential: 3-28 [1, 768, 14, 14] --
│ └─shuffleNet_unit: 2-11 [1, 768, 14, 14] --
│ │ └─Sequential: 3-29 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-52 [1, 192, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-53 [1, 192, 14, 14] 384
│ │ │ └─ReLU: 4-54 [1, 192, 14, 14] --
│ │ └─Sequential: 3-30 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-55 [1, 192, 14, 14] 41,472
│ │ │ └─BatchNorm2d: 4-56 [1, 192, 14, 14] 384
│ │ └─Sequential: 3-31 [1, 768, 14, 14] --
│ │ │ └─Conv2d: 4-57 [1, 768, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-58 [1, 768, 14, 14] 1,536
│ │ └─Sequential: 3-32 [1, 768, 14, 14] --
│ └─shuffleNet_unit: 2-12 [1, 768, 14, 14] --
│ │ └─Sequential: 3-33 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-59 [1, 192, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-60 [1, 192, 14, 14] 384
│ │ │ └─ReLU: 4-61 [1, 192, 14, 14] --
│ │ └─Sequential: 3-34 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-62 [1, 192, 14, 14] 41,472
│ │ │ └─BatchNorm2d: 4-63 [1, 192, 14, 14] 384
│ │ └─Sequential: 3-35 [1, 768, 14, 14] --
│ │ │ └─Conv2d: 4-64 [1, 768, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-65 [1, 768, 14, 14] 1,536
│ │ └─Sequential: 3-36 [1, 768, 14, 14] --
│ └─shuffleNet_unit: 2-13 [1, 768, 14, 14] --
│ │ └─Sequential: 3-37 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-66 [1, 192, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-67 [1, 192, 14, 14] 384
│ │ │ └─ReLU: 4-68 [1, 192, 14, 14] --
│ │ └─Sequential: 3-38 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-69 [1, 192, 14, 14] 41,472
│ │ │ └─BatchNorm2d: 4-70 [1, 192, 14, 14] 384
│ │ └─Sequential: 3-39 [1, 768, 14, 14] --
│ │ │ └─Conv2d: 4-71 [1, 768, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-72 [1, 768, 14, 14] 1,536
│ │ └─Sequential: 3-40 [1, 768, 14, 14] --
│ └─shuffleNet_unit: 2-14 [1, 768, 14, 14] --
│ │ └─Sequential: 3-41 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-73 [1, 192, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-74 [1, 192, 14, 14] 384
│ │ │ └─ReLU: 4-75 [1, 192, 14, 14] --
│ │ └─Sequential: 3-42 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-76 [1, 192, 14, 14] 41,472
│ │ │ └─BatchNorm2d: 4-77 [1, 192, 14, 14] 384
│ │ └─Sequential: 3-43 [1, 768, 14, 14] --
│ │ │ └─Conv2d: 4-78 [1, 768, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-79 [1, 768, 14, 14] 1,536
│ │ └─Sequential: 3-44 [1, 768, 14, 14] --
│ └─shuffleNet_unit: 2-15 [1, 768, 14, 14] --
│ │ └─Sequential: 3-45 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-80 [1, 192, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-81 [1, 192, 14, 14] 384
│ │ │ └─ReLU: 4-82 [1, 192, 14, 14] --
│ │ └─Sequential: 3-46 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-83 [1, 192, 14, 14] 41,472
│ │ │ └─BatchNorm2d: 4-84 [1, 192, 14, 14] 384
│ │ └─Sequential: 3-47 [1, 768, 14, 14] --
│ │ │ └─Conv2d: 4-85 [1, 768, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-86 [1, 768, 14, 14] 1,536
│ │ └─Sequential: 3-48 [1, 768, 14, 14] --
├─Sequential: 1-5 [1, 1536, 7, 7] --
│ └─shuffleNet_unit: 2-16 [1, 1536, 7, 7] --
│ │ └─Sequential: 3-49 [1, 192, 14, 14] --
│ │ │ └─Conv2d: 4-87 [1, 192, 14, 14] 18,432
│ │ │ └─BatchNorm2d: 4-88 [1, 192, 14, 14] 384
│ │ │ └─ReLU: 4-89 [1, 192, 14, 14] --
│ │ └─Sequential: 3-50 [1, 192, 7, 7] --
│ │ │ └─Conv2d: 4-90 [1, 192, 7, 7] 41,472
│ │ │ └─BatchNorm2d: 4-91 [1, 192, 7, 7] 384
│ │ └─Sequential: 3-51 [1, 768, 7, 7] --
│ │ │ └─Conv2d: 4-92 [1, 768, 7, 7] 18,432
│ │ │ └─BatchNorm2d: 4-93 [1, 768, 7, 7] 1,536
│ │ └─Sequential: 3-52 [1, 768, 7, 7] --
│ │ │ └─AvgPool2d: 4-94 [1, 768, 7, 7] --
│ └─shuffleNet_unit: 2-17 [1, 1536, 7, 7] --
│ │ └─Sequential: 3-53 [1, 384, 7, 7] --
│ │ │ └─Conv2d: 4-95 [1, 384, 7, 7] 73,728
│ │ │ └─BatchNorm2d: 4-96 [1, 384, 7, 7] 768
│ │ │ └─ReLU: 4-97 [1, 384, 7, 7] --
│ │ └─Sequential: 3-54 [1, 384, 7, 7] --
│ │ │ └─Conv2d: 4-98 [1, 384, 7, 7] 165,888
│ │ │ └─BatchNorm2d: 4-99 [1, 384, 7, 7] 768
│ │ └─Sequential: 3-55 [1, 1536, 7, 7] --
│ │ │ └─Conv2d: 4-100 [1, 1536, 7, 7] 73,728
│ │ │ └─BatchNorm2d: 4-101 [1, 1536, 7, 7] 3,072
│ │ └─Sequential: 3-56 [1, 1536, 7, 7] --
│ └─shuffleNet_unit: 2-18 [1, 1536, 7, 7] --
│ │ └─Sequential: 3-57 [1, 384, 7, 7] --
│ │ │ └─Conv2d: 4-102 [1, 384, 7, 7] 73,728
│ │ │ └─BatchNorm2d: 4-103 [1, 384, 7, 7] 768
│ │ │ └─ReLU: 4-104 [1, 384, 7, 7] --
│ │ └─Sequential: 3-58 [1, 384, 7, 7] --
│ │ │ └─Conv2d: 4-105 [1, 384, 7, 7] 165,888
│ │ │ └─BatchNorm2d: 4-106 [1, 384, 7, 7] 768
│ │ └─Sequential: 3-59 [1, 1536, 7, 7] --
│ │ │ └─Conv2d: 4-107 [1, 1536, 7, 7] 73,728
│ │ │ └─BatchNorm2d: 4-108 [1, 1536, 7, 7] 3,072
│ │ └─Sequential: 3-60 [1, 1536, 7, 7] --
│ └─shuffleNet_unit: 2-19 [1, 1536, 7, 7] --
│ │ └─Sequential: 3-61 [1, 384, 7, 7] --
│ │ │ └─Conv2d: 4-109 [1, 384, 7, 7] 73,728
│ │ │ └─BatchNorm2d: 4-110 [1, 384, 7, 7] 768
│ │ │ └─ReLU: 4-111 [1, 384, 7, 7] --
│ │ └─Sequential: 3-62 [1, 384, 7, 7] --
│ │ │ └─Conv2d: 4-112 [1, 384, 7, 7] 165,888
│ │ │ └─BatchNorm2d: 4-113 [1, 384, 7, 7] 768
│ │ └─Sequential: 3-63 [1, 1536, 7, 7] --
│ │ │ └─Conv2d: 4-114 [1, 1536, 7, 7] 73,728
│ │ │ └─BatchNorm2d: 4-115 [1, 1536, 7, 7] 3,072
│ │ └─Sequential: 3-64 [1, 1536, 7, 7] --
├─AvgPool2d: 1-6 [1, 1536, 1, 1] --
├─Linear: 1-7 [1, 1000] 1,537,000
==========================================================================================
Total params: 3,328,156
Trainable params: 3,328,156
Non-trainable params: 0
Total mult-adds (M): 311.73
==========================================================================================
Input size (MB): 0.60
Forward/backward pass size (MB): 71.43
Params size (MB): 13.31
Estimated Total Size (MB): 85.35
==========================================================================================