This article implements image classification with MXNet's Gluon API, using a number of different network architectures.
In neural networks and deep learning, Yann LeCun is a founding figure. In 1998 he published a lengthy paper in the Proceedings of the IEEE that laid out the convolution-pooling-fully-connected network structure; the seven-layer network it described was named LeNet-5, and it earned LeCun the title of father of the convolutional neural network. From the classic LeNet-5 to EfficientNet, recently billed as the best image classification network, CNNs have advanced through the continuous effort and innovation of many researchers.
In many downstream vision algorithms these classification networks play a pivotal role: they serve as the backbone feature extractors, and to a large extent they determine feature extraction quality.
This article implements each classification architecture in MXNet. The architectures are introduced one by one below; swapping one network for another later only requires minor changes.
Paper: https://ieeexplore.ieee.org/document/726791
Network architecture:
LeNet, proposed by Yann LeCun, is a classic convolutional neural network and one of the origins of the modern CNN. LeCun applied it to postal (ZIP) code recognition for the post office, where it showed good learning and recognition ability. LeNet, also called LeNet-5, has one input layer, two convolutional layers, two pooling layers and three fully connected layers (the last of which is the output layer).
LeNet-5 was put into practical use in 1998, initially for handwritten character recognition. The rise of convolutional neural networks is generally traced back to the LeNet family of LeCun et al., which makes them the creators of the CNN and LeNet-5 their classic work.
LeNet-5 consists of 7 layers: convolution layers C1, C3 and C5, subsampling (pooling) layers S2 and S4, a fully connected layer F6, and an output layer that the paper describes as Gaussian connections, with softmax used for classification. To match the model's input size, the 28x28 MNIST images are padded to 32x32 pixels. Layer by layer:
C1: 6 different 5x5 convolution kernels, stride 1, no zero padding, giving 6 feature maps of 28x28.
S2: max pooling with a 2x2 window and stride 2, giving 6 feature maps of 14x14.
C3: 16 different 5x5 convolution kernels, stride 1, no zero padding, giving 16 feature maps of 10x10.
S4: max pooling with a 2x2 window and stride 2, giving 16 feature maps of 5x5.
C5: 120 different 5x5 convolution kernels, stride 1, no zero padding, giving 120 feature maps of 1x1.
F6: the 120 1x1 maps are flattened and fed into a fully connected hidden layer of 84 neurons with a sigmoid activation.
Output: a final layer of 10 neurons used with softmax for classification.
Code implementation (Gluon):
from mxnet.gluon import nn

class _LeNet(nn.HybridBlock):
def __init__(self, classes=1000, **kwargs):
super(_LeNet, self).__init__(**kwargs)
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
with self.features.name_scope():
self.features.add(nn.Conv2D(6, kernel_size=5))
self.features.add(nn.BatchNorm())
self.features.add(nn.Activation('sigmoid'))
self.features.add(nn.MaxPool2D(pool_size=2, strides=2))
self.features.add(nn.Conv2D(16, kernel_size=5))
self.features.add(nn.BatchNorm())
self.features.add(nn.Activation('sigmoid'))
self.features.add(nn.MaxPool2D(pool_size=2, strides=2))
self.features.add(nn.Dense(120))
self.features.add(nn.BatchNorm())
self.features.add(nn.Activation('sigmoid'))
self.features.add(nn.Dense(84))
self.features.add(nn.BatchNorm())
self.features.add(nn.Activation('sigmoid'))
self.output = nn.Dense(classes)
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
def get_lenet(class_num):
return _LeNet(classes=class_num)
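A minimal smoke test (the class count, batch size and the 1x1x32x32 grayscale input below are illustrative, following the padded MNIST size described above):
import mxnet as mx
net = get_lenet(class_num=10)
net.initialize(mx.init.Xavier())
net.hybridize()
x = mx.nd.random.uniform(shape=(1, 1, 32, 32))  # one 32x32 grayscale image
print(net(x).shape)  # expected: (1, 10)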
Paper: https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
Network architecture:
AlexNet was designed by Hinton and his student Alex Krizhevsky and won the 2012 ImageNet competition. After that year, deeper and deeper networks were proposed, such as VGG and GoogLeNet. Compared with traditional machine-learning classifiers, AlexNet's results were already remarkable.
AlexNet introduced several relatively new techniques: it was the first CNN to successfully apply tricks such as ReLU activations, Dropout and local response normalization (LRN), and it used GPUs to accelerate the computation.
AlexNet carried LeNet's ideas forward, applying the basic principles of the CNN to a much deeper and wider network; the points above (ReLU, Dropout, LRN and GPU training) are its main new contributions.
Code implementation (Gluon):
from mxnet.gluon import nn

class _AlexNet(nn.HybridBlock):
    def __init__(self, classes=1000, alpha=4, **kwargs):
        super(_AlexNet, self).__init__(**kwargs)
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
with self.features.name_scope():
self.features.add(nn.Conv2D(96//alpha, kernel_size=11, strides=4, activation='relu'))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
self.features.add(nn.Conv2D(256//alpha, kernel_size=5, padding=2, activation='relu'))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
self.features.add(nn.Conv2D(384//alpha, kernel_size=3, padding=1, activation='relu'))
self.features.add(nn.Conv2D(384//alpha, kernel_size=3, padding=1, activation='relu'))
self.features.add(nn.Conv2D(256//alpha, kernel_size=3, padding=1, activation='relu'))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
self.features.add(nn.Flatten())
self.features.add(nn.Dense(4096//alpha, activation='relu'))
self.features.add(nn.Dropout(0.5))
self.features.add(nn.Dense(4096//alpha, activation='relu'))
self.features.add(nn.Dropout(0.5))
self.output = nn.Dense(classes)
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
def get_alexnet(class_num):
net = _AlexNet(classes=class_num)
return net
Paper: https://arxiv.org/pdf/1409.1556.pdf
Network architecture:
The VGG model took second place in the ILSVRC 2014 competition (the winner was GoogLeNet), but VGG outperforms GoogLeNet on many transfer-learning tasks, and it is often the first choice when extracting CNN features from images. Its drawback is size: about 140M parameters, which demands more storage, but the model is still well worth studying.
The name "VGG" stands for the Visual Geometry Group at the University of Oxford, part of the Robotics Research Group founded in 1985, whose research ranges from machine learning to mobile robotics.
VGG's defining characteristics: it uses only small stacked 3x3 convolutions (several in a row emulate a larger receptive field with fewer parameters and more non-linearity), 2x2 max pooling between stages, and a channel count that doubles from stage to stage, giving a very uniform and simple architecture.
Code implementation (Gluon):
from mxnet.gluon import nn
from mxnet.init import Xavier

class _VGG(nn.HybridBlock):
def __init__(self, layers, filters, classes=1000, batch_norm=False, **kwargs):
super(_VGG, self).__init__(**kwargs)
assert len(layers) == len(filters)
with self.name_scope():
self.features = self._make_features(layers, filters, batch_norm)
self.features.add(nn.Dense(4096, activation='relu', weight_initializer='normal', bias_initializer='zeros'))
self.features.add(nn.Dropout(rate=0.5))
self.features.add(nn.Dense(4096, activation='relu', weight_initializer='normal', bias_initializer='zeros'))
self.features.add(nn.Dropout(rate=0.5))
self.output = nn.Dense(classes, weight_initializer='normal', bias_initializer='zeros')
def _make_features(self, layers, filters, batch_norm):
featurizer = nn.HybridSequential(prefix='')
for i, num in enumerate(layers):
for _ in range(num):
featurizer.add(nn.Conv2D(filters[i], kernel_size=3, padding=1, weight_initializer=Xavier(rnd_type='gaussian', factor_type='out', magnitude=2), bias_initializer='zeros'))
if batch_norm:
featurizer.add(nn.BatchNorm())
featurizer.add(nn.Activation('relu'))
featurizer.add(nn.MaxPool2D(strides=2))
return featurizer
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
_vgg_spec = {11: ([1, 1, 2, 2, 2], [64, 128, 256, 512, 512]),
13: ([2, 2, 2, 2, 2], [64, 128, 256, 512, 512]),
16: ([2, 2, 3, 3, 3], [64, 128, 256, 512, 512]),
19: ([2, 2, 4, 4, 4], [64, 128, 256, 512, 512])}
def get_vgg(num_layers,class_num,bn = False, **kwargs):
if bn:
kwargs['batch_norm'] = True
layers, filters = _vgg_spec[num_layers]
net = _VGG(layers, filters,classes = class_num, **kwargs)
return net
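One way to build VGG-16 with batch normalization from the spec table above (the class count and input size are illustrative):
import mxnet as mx
net = get_vgg(16, class_num=100, bn=True)
net.initialize(mx.init.Xavier())
net.hybridize()
x = mx.nd.random.uniform(shape=(1, 3, 224, 224))
print(net(x).shape)  # expected: (1, 100)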
Paper: https://arxiv.org/pdf/1409.4842.pdf
Network architecture:
GoogLeNet is a deep learning architecture proposed by Christian Szegedy in 2014. Earlier architectures such as AlexNet and VGG improved results mainly by making the network deeper (more layers), but more layers bring side effects such as overfitting, vanishing gradients and exploding gradients. The Inception module improves results from a different angle: it uses compute more efficiently, so more features can be extracted for the same amount of computation.
The basic Inception module is shown in Figure 1 of the paper, and the whole network is a stack of such modules. Its two main contributions are: using 1x1 convolutions to raise or lower the channel dimension, and convolving at several kernel sizes in parallel and then concatenating the results.
1x1 convolutions
Benefit 1: stacking more convolutions within the same receptive field extracts richer features. The idea comes from Network in Network, and all three 1x1 convolutions in the module serve this purpose.
The left side of Figure 2 is a traditional (linear) convolution layer, with a single convolution per scale; the right side is the Network in Network (NIN) structure: an ordinary convolution (say 3x3) followed immediately by a 1x1 convolution. For a single pixel, the 1x1 convolution is equivalent to a fully connected computation over all channels at that position, which is why it is drawn as a fully connected layer. Note that in NIN both the first 3x3 convolution and the added 1x1 convolution are followed by an activation (e.g. ReLU). Chaining the two convolutions composes more non-linearity: if the first 3x3 convolution plus activation roughly behaves like f1(x) = ax^2 + bx + c and the second 1x1 convolution plus activation like f2(x) = mx^2 + nx + q, then f2(f1(x)) is clearly more non-linear than f1(x) alone and can model more complex features. NIN resembles a conventional multi-layer network, but a conventional stack spans different receptive-field sizes (with pooling between layers) to extract higher-level features, whereas NIN stacks layers at the same scale (no pooling in between) to extract stronger non-linearity within the same receptive field.
Benefit 2: 1x1 convolutions reduce the channel dimension and hence the computation; the 1x1 convolutions placed before the 3x3 and 5x5 convolutions play this role. When a layer's input has many channels, convolving it directly is very expensive; reducing the channel count first makes the subsequent convolution much cheaper. Figure 3 compares the multiplication counts of the two schemes: for an input with 192 channels at 32x32 and an output of 256 channels, a direct 3x3 convolution needs 192x256x3x3x32x32 = 452,984,832 multiplications, whereas first reducing to 96 channels with a 1x1 convolution and then restoring 256 channels with a 3x3 convolution needs 192x96x1x1x32x32 + 96x256x3x3x32x32 = 245,366,784 multiplications, roughly half. Does squeezing down to 96 channels in the middle hurt the result? No: as long as the final output still has 256 channels, the intermediate reduction acts like compression and does not affect the final training result.
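The savings can be checked with a few lines of arithmetic (the 192-channel 32x32 input and 256-channel output are the numbers used in the example above):
# Direct 3x3 convolution: 192 -> 256 channels on a 32x32 feature map
direct = 192 * 256 * 3 * 3 * 32 * 32  # 452,984,832 multiplications
# 1x1 reduction to 96 channels, then a 3x3 convolution back up to 256 channels
reduced = 192 * 96 * 1 * 1 * 32 * 32 + 96 * 256 * 3 * 3 * 32 * 32  # 245,366,784 multiplications
print(direct, reduced, reduced / direct)  # the ratio is roughly 0.54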
Convolving at multiple scales and concatenating
Figure 2 shows the input split into four branches that are convolved or pooled with filters of different sizes and finally concatenated along the channel dimension. Szegedy explains the benefit of this structure from several angles:
Explanation 1: intuitively, convolving at several scales at once extracts features at different scales; richer features make the final classification more accurate.
Explanation 2: decomposing a sparse matrix into dense sub-matrices speeds up convergence. In the example of Figure 4, the left side is a sparse matrix (many zeros, scattered unevenly); convolving it with a 2x2 matrix requires visiting every element. If, as on the right, the sparse matrix is first decomposed into two dense sub-matrices that are each convolved with the 2x2 matrix, the regions full of zeros can be skipped and the computation drops sharply. Applied to Inception, the decomposition happens along the feature (channel) dimension. A traditional convolution layer convolves its input with kernels of a single size (say 3x3) and outputs a fixed number of channels (say 256) that are spread fairly uniformly over that one scale, which can be seen as a sparsely distributed feature set. An Inception module extracts features at several scales (1x1, 3x3, 5x5), so its 256 output channels are no longer uniform: strongly correlated features are grouped together (say 96 channels from the 1x1 branch, 96 from the 3x3 branch and 64 from the 5x5 branch), forming several densely distributed sub-sets. Because correlated features are grouped and weakly related, non-essential features are suppressed, the same 256 output channels carry less redundant information; passing such "purer" feature sets layer by layer into the backward computation naturally converges faster.
Explanation 3: the Hebbian principle. Hebb's rule is a neuroscience theory describing how neurons change during learning, often summarized as "fire together, wire together": two neurons or neural systems that are repeatedly excited at the same time form an association in which the excitation of one promotes the other. For example, a dog salivates when it sees meat; after repeated stimulation, the neurons that recognize meat and those controlling salivation become mutually reinforcing and "wired" together, so the dog salivates even faster the next time it sees meat. Applied to Inception, this again means grouping strongly correlated features together, much like explanation 2: the 1x1, 3x3 and 5x5 features are kept separate. Since the ultimate goal of training is to extract independent features, pre-grouping correlated features helps convergence.
One branch of the Inception module uses max pooling, because the authors consider pooling itself a useful feature extractor; note that this pooling uses stride 1, so it does not shrink the feature map.
Code implementation (Gluon):
from mxnet.gluon import nn
from mxnet.gluon.contrib.nn import HybridConcurrent

def _make_basic_conv(in_channels, channels, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
out = nn.HybridSequential(prefix='')
out.add(nn.Conv2D(in_channels=in_channels, channels=channels, use_bias=False, **kwargs))
out.add(norm_layer(in_channels=channels, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
out.add(nn.Activation('relu'))
return out
def _make_branch(use_pool, norm_layer, norm_kwargs, *conv_settings):
out = nn.HybridSequential(prefix='')
if use_pool == 'avg':
out.add(nn.AvgPool2D(pool_size=3, strides=1, padding=1))
elif use_pool == 'max':
out.add(nn.MaxPool2D(pool_size=3, strides=1, padding=1))
setting_names = ['in_channels', 'channels', 'kernel_size', 'strides', 'padding']
for setting in conv_settings:
kwargs = {}
for i, value in enumerate(setting):
if value is not None:
if setting_names[i] == 'in_channels':
in_channels = value
elif setting_names[i] == 'channels':
channels = value
else:
kwargs[setting_names[i]] = value
out.add(_make_basic_conv(in_channels, channels, norm_layer, norm_kwargs, **kwargs))
return out
def _make_Mixed_3a(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 64, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 96, 1, None, None), (96, 128, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 16, 1, None, None), (16, 32, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_Mixed_3b(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 128, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 128, 1, None, None), (128, 192, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 32, 1, None, None), (32, 96, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_Mixed_4a(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 192, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 96, 1, None, None), (96, 208, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 16, 1, None, None), (16, 48, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_Mixed_4b(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 160, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 112, 1, None, None), (112, 224, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 24, 1, None, None), (24, 64, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_Mixed_4c(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 128, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 128, 1, None, None), (128, 256, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 24, 1, None, None), (24, 64, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_Mixed_4d(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 112, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 144, 1, None, None), (144, 288, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 32, 1, None, None), (32, 64, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_Mixed_4e(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 256, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 160, 1, None, None), (160, 320, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 32, 1, None, None), (32, 128, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_Mixed_5a(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 256, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 160, 1, None, None), (160, 320, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 32, 1, None, None), (32, 128, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_Mixed_5b(in_channels, pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 384, 1, None, None)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 192, 1, None, None), (192, 384, 3, None, 1)))
out.add(_make_branch(None, norm_layer, norm_kwargs, (in_channels, 48, 1, None, None), (48, 128, 3, None, 1)))
out.add(_make_branch('max', norm_layer, norm_kwargs, (in_channels, pool_features, 1, None, None)))
return out
def _make_aux(in_channels, classes, norm_layer, norm_kwargs):
out = nn.HybridSequential(prefix='')
out.add(nn.AvgPool2D(pool_size=5, strides=3))
out.add(_make_basic_conv(in_channels=in_channels, channels=128, kernel_size=1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
out.add(nn.Flatten())
out.add(nn.Dense(units=1024, in_units=2048))
out.add(nn.Activation('relu'))
out.add(nn.Dropout(0.7))
out.add(nn.Dense(units=classes, in_units=1024))
return out
class _GoogLeNet(nn.HybridBlock):
def __init__(self, classes=1000, norm_layer=nn.BatchNorm, dropout_ratio=0.4, aux_logits=False,norm_kwargs=None, partial_bn=False, **kwargs):
super(_GoogLeNet, self).__init__(**kwargs)
self.dropout_ratio = dropout_ratio
self.aux_logits = aux_logits
with self.name_scope():
self.conv1 = _make_basic_conv(in_channels=3, channels=64, kernel_size=7,
strides=2, padding=3,
norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.maxpool1 = nn.MaxPool2D(pool_size=3, strides=2, ceil_mode=True)
if partial_bn:
if norm_kwargs is not None:
norm_kwargs['use_global_stats'] = True
else:
norm_kwargs = {}
norm_kwargs['use_global_stats'] = True
self.conv2 = _make_basic_conv(in_channels=64, channels=64, kernel_size=1,
norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.conv3 = _make_basic_conv(in_channels=64, channels=192,
kernel_size=3, padding=1,
norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.maxpool2 = nn.MaxPool2D(pool_size=3, strides=2, ceil_mode=True)
self.inception3a = _make_Mixed_3a(192, 32, 'Mixed_3a_', norm_layer, norm_kwargs)
self.inception3b = _make_Mixed_3b(256, 64, 'Mixed_3b_', norm_layer, norm_kwargs)
self.maxpool3 = nn.MaxPool2D(pool_size=3, strides=2, ceil_mode=True)
self.inception4a = _make_Mixed_4a(480, 64, 'Mixed_4a_', norm_layer, norm_kwargs)
self.inception4b = _make_Mixed_4b(512, 64, 'Mixed_4b_', norm_layer, norm_kwargs)
self.inception4c = _make_Mixed_4c(512, 64, 'Mixed_4c_', norm_layer, norm_kwargs)
self.inception4d = _make_Mixed_4d(512, 64, 'Mixed_4d_', norm_layer, norm_kwargs)
self.inception4e = _make_Mixed_4e(528, 128, 'Mixed_4e_', norm_layer, norm_kwargs)
self.maxpool4 = nn.MaxPool2D(pool_size=2, strides=2)
self.inception5a = _make_Mixed_5a(832, 128, 'Mixed_5a_', norm_layer, norm_kwargs)
self.inception5b = _make_Mixed_5b(832, 128, 'Mixed_5b_', norm_layer, norm_kwargs)
if self.aux_logits:
self.aux1 = _make_aux(512, classes, norm_layer, norm_kwargs)
self.aux2 = _make_aux(528, classes, norm_layer, norm_kwargs)
self.head = nn.HybridSequential(prefix='')
self.avgpool = nn.AvgPool2D(pool_size=7)
self.dropout = nn.Dropout(self.dropout_ratio)
self.output = nn.Dense(units=classes, in_units=1024)
self.head.add(self.avgpool)
self.head.add(self.dropout)
self.head.add(self.output)
def hybrid_forward(self, F, x):
x = self.conv1(x)
x = self.maxpool1(x)
x = self.conv2(x)
x = self.conv3(x)
x = self.maxpool2(x)
x = self.inception3a(x)
x = self.inception3b(x)
x = self.maxpool3(x)
x = self.inception4a(x)
if self.aux_logits:
aux1 = self.aux1(x)
x = self.inception4b(x)
x = self.inception4c(x)
x = self.inception4d(x)
if self.aux_logits:
aux2 = self.aux2(x)
x = self.inception4e(x)
x = self.maxpool4(x)
x = self.inception5a(x)
x = self.inception5b(x)
x = self.head(x)
if self.aux_logits:
return (x, aux2, aux1)
return x
def get_googlenet(class_num=1000, dropout_ratio=0.4, aux_logits=False,partial_bn=False, **kwargs):
net = _GoogLeNet(classes=class_num, partial_bn=partial_bn, dropout_ratio=dropout_ratio, aux_logits=aux_logits, **kwargs)
return net
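A usage sketch (class count, batch size and the 224x224 input are illustrative); with aux_logits=True the forward pass returns the main logits plus the two auxiliary heads, as in hybrid_forward above:
import mxnet as mx
net = get_googlenet(class_num=10, aux_logits=True)
net.initialize(mx.init.Xavier())
x = mx.nd.random.uniform(shape=(2, 3, 224, 224))
out, aux2, aux1 = net(x)  # three sets of logits for training-time supervision
print(out.shape, aux2.shape, aux1.shape)  # each (2, 10)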
Paper: https://arxiv.org/pdf/1512.03385.pdf
Network architecture (VGG vs ResNet):
ResNet was created by Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun.
It won the image classification and object detection tasks of the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Residual networks are easy to optimize and can gain accuracy from considerably increased depth: the skip connections inside the residual blocks alleviate the vanishing-gradient problem that otherwise comes with making a deep network deeper.
Code implementation (Gluon):
from mxnet.gluon import nn

def _conv3x3(channels, stride, in_channels):
return nn.Conv2D(channels, kernel_size=3, strides=stride, padding=1, use_bias=False, in_channels=in_channels)
class _BasicBlockV1(nn.HybridBlock):
def __init__(self, channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_BasicBlockV1, self).__init__(**kwargs)
self.body = nn.HybridSequential(prefix='')
self.body.add(_conv3x3(channels, stride, in_channels))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(_conv3x3(channels, 1, channels))
if not last_gamma:
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.body.add(norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs)))
if use_se:
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Dense(channels // 16, use_bias=False))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Dense(channels, use_bias=False))
self.se.add(nn.Activation('sigmoid'))
else:
self.se = None
if downsample:
self.downsample = nn.HybridSequential(prefix='')
self.downsample.add(nn.Conv2D(channels, kernel_size=1, strides=stride, use_bias=False, in_channels=in_channels))
self.downsample.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.body(x)
if self.se:
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w.expand_dims(axis=2).expand_dims(axis=2))
if self.downsample:
residual = self.downsample(residual)
x = F.Activation(residual+x, act_type='relu')
return x
class _BottleneckV1(nn.HybridBlock):
def __init__(self, channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_BottleneckV1, self).__init__(**kwargs)
self.body = nn.HybridSequential(prefix='')
self.body.add(nn.Conv2D(channels//4, kernel_size=1, strides=stride))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(_conv3x3(channels//4, 1, channels//4))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(nn.Conv2D(channels, kernel_size=1, strides=1))
if use_se:
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Dense(channels // 16, use_bias=False))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Dense(channels, use_bias=False))
self.se.add(nn.Activation('sigmoid'))
else:
self.se = None
if not last_gamma:
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.body.add(norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs)))
if downsample:
self.downsample = nn.HybridSequential(prefix='')
self.downsample.add(nn.Conv2D(channels, kernel_size=1, strides=stride, use_bias=False, in_channels=in_channels))
self.downsample.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.body(x)
if self.se:
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w.expand_dims(axis=2).expand_dims(axis=2))
if self.downsample:
residual = self.downsample(residual)
x = F.Activation(x + residual, act_type='relu')
return x
class _BasicBlockV2(nn.HybridBlock):
def __init__(self, channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_BasicBlockV2, self).__init__(**kwargs)
self.bn1 = norm_layer(**({} if norm_kwargs is None else norm_kwargs))
self.conv1 = _conv3x3(channels, stride, in_channels)
if not last_gamma:
self.bn2 = norm_layer(**({} if norm_kwargs is None else norm_kwargs))
else:
self.bn2 = norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs))
self.conv2 = _conv3x3(channels, 1, channels)
if use_se:
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Dense(channels // 16, use_bias=False))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Dense(channels, use_bias=False))
self.se.add(nn.Activation('sigmoid'))
else:
self.se = None
if downsample:
self.downsample = nn.Conv2D(channels, 1, stride, use_bias=False, in_channels=in_channels)
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.bn1(x)
x = F.Activation(x, act_type='relu')
if self.downsample:
residual = self.downsample(x)
x = self.conv1(x)
x = self.bn2(x)
x = F.Activation(x, act_type='relu')
x = self.conv2(x)
if self.se:
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w.expand_dims(axis=2).expand_dims(axis=2))
return x + residual
class _BottleneckV2(nn.HybridBlock):
def __init__(self, channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_BottleneckV2, self).__init__(**kwargs)
self.bn1 = norm_layer(**({} if norm_kwargs is None else norm_kwargs))
self.conv1 = nn.Conv2D(channels//4, kernel_size=1, strides=1, use_bias=False)
self.bn2 = norm_layer(**({} if norm_kwargs is None else norm_kwargs))
self.conv2 = _conv3x3(channels//4, stride, channels//4)
if not last_gamma:
self.bn3 = norm_layer(**({} if norm_kwargs is None else norm_kwargs))
else:
self.bn3 = norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs))
self.conv3 = nn.Conv2D(channels, kernel_size=1, strides=1, use_bias=False)
if use_se:
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Dense(channels // 16, use_bias=False))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Dense(channels, use_bias=False))
self.se.add(nn.Activation('sigmoid'))
else:
self.se = None
if downsample:
self.downsample = nn.Conv2D(channels, 1, stride, use_bias=False, in_channels=in_channels)
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.bn1(x)
x = F.Activation(x, act_type='relu')
if self.downsample:
residual = self.downsample(x)
x = self.conv1(x)
x = self.bn2(x)
x = F.Activation(x, act_type='relu')
x = self.conv2(x)
x = self.bn3(x)
x = F.Activation(x, act_type='relu')
x = self.conv3(x)
if self.se:
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w.expand_dims(axis=2).expand_dims(axis=2))
return x + residual
class _ResNetV1(nn.HybridBlock):
def __init__(self, block, layers, channels, classes=1000, thumbnail=False, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_ResNetV1, self).__init__(**kwargs)
assert len(layers) == len(channels) - 1
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
if thumbnail:
self.features.add(_conv3x3(channels[0], 1, 0))
else:
self.features.add(nn.Conv2D(channels[0], 7, 2, 3, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.MaxPool2D(3, 2, 1))
for i, num_layer in enumerate(layers):
stride = 1 if i == 0 else 2
self.features.add(self._make_layer(block, num_layer, channels[i+1], stride, i+1, in_channels=channels[i], last_gamma=last_gamma, use_se=use_se, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.features.add(nn.GlobalAvgPool2D())
self.output = nn.Dense(classes, in_units=channels[-1])
def _make_layer(self, block, layers, channels, stride, stage_index, in_channels=0, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
layer = nn.HybridSequential(prefix='stage%d_'%stage_index)
with layer.name_scope():
layer.add(block(channels, stride, channels != in_channels, in_channels=in_channels, last_gamma=last_gamma, use_se=use_se, prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
for _ in range(layers-1):
layer.add(block(channels, 1, False, in_channels=channels, last_gamma=last_gamma, use_se=use_se, prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
return layer
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
class _ResNetV2(nn.HybridBlock):
def __init__(self, block, layers, channels, classes=1000, thumbnail=False, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_ResNetV2, self).__init__(**kwargs)
assert len(layers) == len(channels) - 1
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
self.features.add(norm_layer(scale=False, center=False, **({} if norm_kwargs is None else norm_kwargs)))
if thumbnail:
self.features.add(_conv3x3(channels[0], 1, 0))
else:
self.features.add(nn.Conv2D(channels[0], 7, 2, 3, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.MaxPool2D(3, 2, 1))
in_channels = channels[0]
for i, num_layer in enumerate(layers):
stride = 1 if i == 0 else 2
self.features.add(self._make_layer(block, num_layer, channels[i+1], stride, i+1, in_channels=in_channels, last_gamma=last_gamma, use_se=use_se, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
in_channels = channels[i+1]
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.GlobalAvgPool2D())
self.features.add(nn.Flatten())
self.output = nn.Dense(classes, in_units=in_channels)
def _make_layer(self, block, layers, channels, stride, stage_index, in_channels=0,
last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
layer = nn.HybridSequential(prefix='stage%d_'%stage_index)
with layer.name_scope():
layer.add(block(channels, stride, channels != in_channels, in_channels=in_channels, last_gamma=last_gamma, use_se=use_se, prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
for _ in range(layers-1):
layer.add(block(channels, 1, False, in_channels=channels, last_gamma=last_gamma, use_se=use_se, prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
return layer
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
_resnet_spec = {18: ('basic_block', [2, 2, 2, 2], [64, 64, 128, 256, 512]),
34: ('basic_block', [3, 4, 6, 3], [64, 64, 128, 256, 512]),
50: ('bottle_neck', [3, 4, 6, 3], [64, 256, 512, 1024, 2048]),
101: ('bottle_neck', [3, 4, 23, 3], [64, 256, 512, 1024, 2048]),
152: ('bottle_neck', [3, 8, 36, 3], [64, 256, 512, 1024, 2048])}
_resnet_net_versions = [_ResNetV1, _ResNetV2]
_resnet_block_versions = [{'basic_block': _BasicBlockV1, 'bottle_neck': _BottleneckV1}, {'basic_block': _BasicBlockV2, 'bottle_neck': _BottleneckV2}]
def get_resnet(version, num_layers,class_num,use_se=False):
block_type, layers, channels = _resnet_spec[num_layers]
resnet_class = _resnet_net_versions[version-1]
block_class = _resnet_block_versions[version-1][block_type]
net = resnet_class(block_class, layers, channels,use_se=use_se, classes=class_num)
return net
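A usage sketch built from the spec table above, here ResNet-50 v1 (class count and input size are illustrative):
import mxnet as mx
net = get_resnet(version=1, num_layers=50, class_num=10)
net.initialize(mx.init.Xavier())
net.hybridize()
x = mx.nd.random.uniform(shape=(1, 3, 224, 224))
print(net(x).shape)  # expected: (1, 10)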
Paper: https://arxiv.org/pdf/1704.04861.pdf
Network architecture:
MobileNet V1 was developed at Google and published in 2017. Its key innovation is the depthwise separable convolution, and the whole network is essentially a stack of depthwise separable blocks.
The depthwise separable convolution has proven to be an effective design for lightweight networks; it consists of a depthwise convolution followed by a pointwise convolution.
Compared with a standard convolution, the depthwise convolution splits the kernel into single-channel kernels and convolves each input channel separately, leaving the depth of the feature map unchanged, so the output has the same number of channels as the input.
The pointwise convolution is simply a 1x1 convolution; its main job is to raise or lower the channel dimension of the feature maps.
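The parameter savings of splitting a standard convolution into a depthwise plus a pointwise part can be illustrated with a quick count (the kernel size and channel numbers below are arbitrary examples):
k, c_in, c_out = 3, 32, 64
standard = k * k * c_in * c_out         # 18,432 weights in a plain 3x3 convolution
depthwise = k * k * c_in                # 288 weights: one 3x3 kernel per input channel
pointwise = 1 * 1 * c_in * c_out        # 2,048 weights in the 1x1 projection
print(standard, depthwise + pointwise)  # 18432 vs 2336, roughly 8x fewer parameters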
Code implementation (Gluon):
from mxnet.gluon import nn

class _ReLU6(nn.HybridBlock):
def __init__(self, **kwargs):
super(_ReLU6, self).__init__(**kwargs)
def hybrid_forward(self, F, x):
return F.clip(x, 0, 6, name="relu6")
def _add_conv(out, channels=1, kernel=1, stride=1, pad=0, num_group=1, active=True, relu6=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
out.add(nn.Conv2D(channels, kernel, stride, pad, groups=num_group, use_bias=False))
out.add(norm_layer(scale=True, **({} if norm_kwargs is None else norm_kwargs)))
if active:
out.add(_ReLU6() if relu6 else nn.Activation('relu'))
def _add_conv_dw(out, dw_channels, channels, stride, relu6=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
_add_conv(out, channels=dw_channels, kernel=3, stride=stride,
pad=1, num_group=dw_channels, relu6=relu6,
norm_layer=norm_layer, norm_kwargs=norm_kwargs)
_add_conv(out, channels=channels, relu6=relu6,
norm_layer=norm_layer, norm_kwargs=norm_kwargs)
class _LinearBottleneck(nn.HybridBlock):
def __init__(self, in_channels, channels, t, stride, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_LinearBottleneck, self).__init__(**kwargs)
self.use_shortcut = stride == 1 and in_channels == channels
with self.name_scope():
self.out = nn.HybridSequential()
if t != 1:
_add_conv(self.out, in_channels * t, relu6=True, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
_add_conv(self.out, in_channels * t, kernel=3, stride=stride, pad=1, num_group=in_channels * t, relu6=True, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
_add_conv(self.out, channels, active=False, relu6=True, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
def hybrid_forward(self, F, x):
out = self.out(x)
if self.use_shortcut:
out = F.elemwise_add(out, x)
return out
class _MobileNetV1(nn.HybridBlock):
def __init__(self, multiplier=1.0, classes=1000, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_MobileNetV1, self).__init__(**kwargs)
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
with self.features.name_scope():
_add_conv(self.features, channels=int(32 * multiplier), kernel=3, pad=1, stride=2, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
dw_channels = [int(x * multiplier) for x in [32, 64] + [128] * 2 + [256] * 2 + [512] * 6 + [1024]]
channels = [int(x * multiplier) for x in [64] + [128] * 2 + [256] * 2 + [512] * 6 + [1024] * 2]
strides = [1, 2] * 3 + [1] * 5 + [2, 1]
for dwc, c, s in zip(dw_channels, channels, strides):
_add_conv_dw(self.features, dw_channels=dwc, channels=c, stride=s, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.features.add(nn.GlobalAvgPool2D())
self.features.add(nn.Flatten())
self.output = nn.Dense(classes)
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
def get_mobilenet_v1(multiplier,class_num, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
net = _MobileNetV1(multiplier,classes = class_num, norm_layer=norm_layer, norm_kwargs=norm_kwargs, **kwargs)
return net
Paper: https://arxiv.org/pdf/1801.04381.pdf
Network architecture:
MobileNet V1's main weaknesses lie in its depthwise convolutions: the depthwise layers operate in a low-dimensional space where the following ReLU easily destroys information and many kernels end up training towards zero, and the plain stacked structure has no shortcut connections for reusing features.
V2 keeps a depthwise separable structure similar to V1, and the differences address exactly these weaknesses of V1's depthwise convolution: each block first expands the channels with a 1x1 convolution before the depthwise convolution (the inverted residual design), drops the activation after the final 1x1 projection (the linear bottleneck), and adds a shortcut connection when the stride is 1 and the input and output channel counts match, as in the _LinearBottleneck block below.
Code implementation (Gluon):
from mxnet.gluon import nn

class _ReLU6(nn.HybridBlock):
def __init__(self, **kwargs):
super(_ReLU6, self).__init__(**kwargs)
def hybrid_forward(self, F, x):
return F.clip(x, 0, 6, name="relu6")
def _add_conv(out, channels=1, kernel=1, stride=1, pad=0, num_group=1, active=True, relu6=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
out.add(nn.Conv2D(channels, kernel, stride, pad, groups=num_group, use_bias=False))
out.add(norm_layer(scale=True, **({} if norm_kwargs is None else norm_kwargs)))
if active:
out.add(_ReLU6() if relu6 else nn.Activation('relu'))
def _add_conv_dw(out, dw_channels, channels, stride, relu6=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
_add_conv(out, channels=dw_channels, kernel=3, stride=stride,
pad=1, num_group=dw_channels, relu6=relu6,
norm_layer=norm_layer, norm_kwargs=norm_kwargs)
_add_conv(out, channels=channels, relu6=relu6,
norm_layer=norm_layer, norm_kwargs=norm_kwargs)
class _LinearBottleneck(nn.HybridBlock):
def __init__(self, in_channels, channels, t, stride, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_LinearBottleneck, self).__init__(**kwargs)
self.use_shortcut = stride == 1 and in_channels == channels
with self.name_scope():
self.out = nn.HybridSequential()
if t != 1:
_add_conv(self.out, in_channels * t, relu6=True, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
_add_conv(self.out, in_channels * t, kernel=3, stride=stride, pad=1, num_group=in_channels * t, relu6=True, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
_add_conv(self.out, channels, active=False, relu6=True, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
def hybrid_forward(self, F, x):
out = self.out(x)
if self.use_shortcut:
out = F.elemwise_add(out, x)
return out
class _MobileNetV2(nn.HybridBlock):
def __init__(self, multiplier=1.0, classes=1000, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_MobileNetV2, self).__init__(**kwargs)
with self.name_scope():
self.features = nn.HybridSequential(prefix='features_')
with self.features.name_scope():
_add_conv(self.features, int(32 * multiplier), kernel=3, stride=2, pad=1, relu6=True, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
in_channels_group = [int(x * multiplier) for x in [32] + [16] + [24] * 2 + [32] * 3 + [64] * 4 + [96] * 3 + [160] * 3]
channels_group = [int(x * multiplier) for x in [16] + [24] * 2 + [32] * 3 + [64] * 4 + [96] * 3 + [160] * 3 + [320]]
ts = [1] + [6] * 16
strides = [1, 2] * 2 + [1, 1, 2] + [1] * 6 + [2] + [1] * 3
for in_c, c, t, s in zip(in_channels_group, channels_group, ts, strides):
self.features.add(_LinearBottleneck(in_channels=in_c, channels=c, t=t, stride=s, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
last_channels = int(1280 * multiplier) if multiplier > 1.0 else 1280
_add_conv(self.features, last_channels, relu6=True, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.features.add(nn.GlobalAvgPool2D())
self.output = nn.HybridSequential(prefix='output_')
with self.output.name_scope():
self.output.add(nn.Conv2D(classes, 1, use_bias=False, prefix='pred_'), nn.Flatten())
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
def get_mobilenet_v2(multiplier,class_num, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
net = _MobileNetV2(multiplier,classes = class_num, norm_layer=norm_layer, norm_kwargs=norm_kwargs, **kwargs)
return net
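A usage sketch with width multiplier 1.0 (class count and input size are illustrative):
import mxnet as mx
net = get_mobilenet_v2(1.0, class_num=10)
net.initialize(mx.init.Xavier())
net.hybridize()
x = mx.nd.random.uniform(shape=(1, 3, 224, 224))
print(net(x).shape)  # expected: (1, 10)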
Paper: https://arxiv.org/pdf/1905.02244.pdf
Network architecture:
Summarizing the MobileNetV3 paper, three results are worth noting:
V3-Large 1.0 reaches a Top-1 accuracy of 75.2%, versus 72% for V2 1.0, an improvement of 3.2 percentage points.
Inference is also faster: V3-Large 1.0 takes 51 ms on a Pixel 1 (P-1) phone versus 64 ms for V2, so V3 is clearly both more accurate and faster than V2.
V3-Small reaches a Top-1 of 67.4%, versus 60.8% for V2 0.35 (0.35 is the width multiplier), an improvement of 6.6 percentage points.
Code implementation (Gluon):
import numpy as np
from mxnet.gluon import nn

class _ReLU6(nn.HybridBlock):
def __init__(self, **kwargs):
super(_ReLU6, self).__init__(**kwargs)
def hybrid_forward(self, F, x):
return F.clip(x, 0, 6, name="relu6")
class _HardSigmoid(nn.HybridBlock):
def __init__(self, **kwargs):
super(_HardSigmoid, self).__init__(**kwargs)
self.act = _ReLU6()
def hybrid_forward(self, F, x):
return self.act(x + 3.) / 6.
class _HardSwish(nn.HybridBlock):
def __init__(self, **kwargs):
super(_HardSwish, self).__init__(**kwargs)
self.act = _HardSigmoid()
def hybrid_forward(self, F, x):
return x * self.act(x)
def _make_divisible(x, divisible_by=8):
return int(np.ceil(x * 1. / divisible_by) * divisible_by)
class _Activation_mobilenetv3(nn.HybridBlock):
def __init__(self, act_func, **kwargs):
super(_Activation_mobilenetv3, self).__init__(**kwargs)
if act_func == "relu":
self.act = nn.Activation('relu')
elif act_func == "relu6":
self.act = _ReLU6()
elif act_func == "hard_sigmoid":
self.act = _HardSigmoid()
elif act_func == "swish":
self.act = nn.Swish()
elif act_func == "hard_swish":
self.act = _HardSwish()
elif act_func == "leaky":
self.act = nn.LeakyReLU(alpha=0.375)
else:
raise NotImplementedError
def hybrid_forward(self, F, x):
return self.act(x)
class _SE(nn.HybridBlock):
def __init__(self, num_out, ratio=4, act_func=("relu", "hard_sigmoid"), use_bn=False, prefix='', **kwargs):
super(_SE, self).__init__(**kwargs)
self.use_bn = use_bn
num_mid = _make_divisible(num_out // ratio)
self.pool = nn.GlobalAvgPool2D()
self.conv1 = nn.Conv2D(channels=num_mid, kernel_size=1, use_bias=True, prefix=('%s_fc1_' % prefix))
self.act1 = _Activation_mobilenetv3(act_func[0])
self.conv2 = nn.Conv2D(channels=num_out, kernel_size=1, use_bias=True, prefix=('%s_fc2_' % prefix))
self.act2 = _Activation_mobilenetv3(act_func[1])
def hybrid_forward(self, F, x):
out = self.pool(x)
out = self.conv1(out)
out = self.act1(out)
out = self.conv2(out)
out = self.act2(out)
return F.broadcast_mul(x, out)
class _Unit(nn.HybridBlock):
def __init__(self, num_out, kernel_size=1, strides=1, pad=0, num_groups=1, use_act=True, act_type="relu", prefix='', norm_layer=nn.BatchNorm, **kwargs):
super(_Unit, self).__init__(**kwargs)
self.use_act = use_act
self.conv = nn.Conv2D(channels=num_out, kernel_size=kernel_size, strides=strides, padding=pad, groups=num_groups, use_bias=False, prefix='%s-conv2d_'%prefix)
self.bn = norm_layer(prefix='%s-batchnorm_'%prefix)
if use_act is True:
self.act = _Activation_mobilenetv3(act_type)
def hybrid_forward(self, F, x):
out = self.conv(x)
out = self.bn(out)
if self.use_act:
out = self.act(out)
return out
class _ResUnit(nn.HybridBlock):
def __init__(self, num_in, num_mid, num_out, kernel_size, act_type="relu", use_se=False, strides=1, prefix='', norm_layer=nn.BatchNorm, **kwargs):
super(_ResUnit, self).__init__(**kwargs)
with self.name_scope():
self.use_se = use_se
self.first_conv = (num_out != num_mid)
self.use_short_cut_conv = True
if self.first_conv:
self.expand = _Unit(num_mid, kernel_size=1, strides=1, pad=0, act_type=act_type, prefix='%s-exp'%prefix, norm_layer=norm_layer)
self.conv1 = _Unit(num_mid, kernel_size=kernel_size, strides=strides, pad=self._get_pad(kernel_size), act_type=act_type, num_groups=num_mid, prefix='%s-depthwise'%prefix, norm_layer=norm_layer)
if use_se:
self.se = _SE(num_mid, prefix='%s-se'%prefix)
self.conv2 = _Unit(num_out, kernel_size=1, strides=1, pad=0, act_type=act_type, use_act=False, prefix='%s-linear'%prefix, norm_layer=norm_layer)
if num_in != num_out or strides != 1:
self.use_short_cut_conv = False
def hybrid_forward(self, F, x):
out = self.expand(x) if self.first_conv else x
out = self.conv1(out)
if self.use_se:
out = self.se(out)
out = self.conv2(out)
if self.use_short_cut_conv:
return x + out
else:
return out
def _get_pad(self, kernel_size):
if kernel_size == 1:
return 0
elif kernel_size == 3:
return 1
elif kernel_size == 5:
return 2
elif kernel_size == 7:
return 3
else:
raise NotImplementedError
class _MobileNetV3(nn.HybridBlock):
def __init__(self, cfg, cls_ch_squeeze, cls_ch_expand, multiplier=1., classes=1000, norm_kwargs=None, last_gamma=False, final_drop=0., use_global_stats=False, name_prefix='', norm_layer=nn.BatchNorm):
super(_MobileNetV3, self).__init__(prefix=name_prefix)
norm_kwargs = norm_kwargs if norm_kwargs is not None else {}
if use_global_stats:
norm_kwargs['use_global_stats'] = True
# initialize residual networks
k = multiplier
self.last_gamma = last_gamma
self.norm_kwargs = norm_kwargs
self.inplanes = 16
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
self.features.add(nn.Conv2D(channels=_make_divisible(k*self.inplanes), kernel_size=3, padding=1, strides=2, use_bias=False, prefix='first-3x3-conv-conv2d_'))
self.features.add(norm_layer(prefix='first-3x3-conv-batchnorm_'))
self.features.add(_HardSwish())
i = 0
for layer_cfg in cfg:
layer = self._make_layer(kernel_size=layer_cfg[0], exp_ch=_make_divisible(k * layer_cfg[1]), out_channel=_make_divisible(k * layer_cfg[2]), use_se=layer_cfg[3], act_func=layer_cfg[4], stride=layer_cfg[5], prefix='seq-%d'%i,)
self.features.add(layer)
i += 1
self.features.add(nn.Conv2D(channels= _make_divisible(k*cls_ch_squeeze), kernel_size=1, padding=0, strides=1, use_bias=False, prefix='last-1x1-conv1-conv2d_'))
self.features.add(norm_layer(prefix='last-1x1-conv1-batchnorm_', **({} if norm_kwargs is None else norm_kwargs)))
self.features.add(_HardSwish())
self.features.add(nn.GlobalAvgPool2D())
self.features.add(nn.Conv2D(channels=cls_ch_expand, kernel_size=1, padding=0, strides=1, use_bias=False, prefix='last-1x1-conv2-conv2d_'))
self.features.add(_HardSwish())
if final_drop > 0:
self.features.add(nn.Dropout(final_drop))
self.output = nn.HybridSequential(prefix='output_')
with self.output.name_scope():
self.output.add(
nn.Conv2D(in_channels=cls_ch_expand, channels=classes, kernel_size=1, prefix='fc_'),
nn.Flatten())
def _make_layer(self, kernel_size, exp_ch, out_channel, use_se, act_func, stride=1, prefix=''):
mid_planes = exp_ch
out_planes = out_channel
layer = _ResUnit(self.inplanes, mid_planes, out_planes, kernel_size, act_func, strides=stride, use_se=use_se, prefix=prefix)
self.inplanes = out_planes
return layer
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
def get_mobilenet_v3(version, multiplier=1.,class_num=1000, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
if version == "large":
cfg = [
# k, exp, c, se, nl, s,
[3, 16, 16, False, 'relu', 1],
[3, 64, 24, False, 'relu', 2],
[3, 72, 24, False, 'relu', 1],
[5, 72, 40, True, 'relu', 2],
[5, 120, 40, True, 'relu', 1],
[5, 120, 40, True, 'relu', 1],
[3, 240, 80, False, 'hard_swish', 2],
[3, 200, 80, False, 'hard_swish', 1],
[3, 184, 80, False, 'hard_swish', 1],
[3, 184, 80, False, 'hard_swish', 1],
[3, 480, 112, True, 'hard_swish', 1],
[3, 672, 112, True, 'hard_swish', 1],
[5, 672, 160, True, 'hard_swish', 2],
[5, 960, 160, True, 'hard_swish', 1],
[5, 960, 160, True, 'hard_swish', 1],
]
cls_ch_squeeze = 960
cls_ch_expand = 1280
elif version == "small":
cfg = [
# k, exp, c, se, nl, s,
[3, 16, 16, True, 'relu', 2],
[3, 72, 24, False, 'relu', 2],
[3, 88, 24, False, 'relu', 1],
[5, 96, 40, True, 'hard_swish', 2],
[5, 240, 40, True, 'hard_swish', 1],
[5, 240, 40, True, 'hard_swish', 1],
[5, 120, 48, True, 'hard_swish', 1],
[5, 144, 48, True, 'hard_swish', 1],
[5, 288, 96, True, 'hard_swish', 2],
[5, 576, 96, True, 'hard_swish', 1],
[5, 576, 96, True, 'hard_swish', 1],
]
cls_ch_squeeze = 576
cls_ch_expand = 1280
else:
raise NotImplementedError
net = _MobileNetV3(cfg, cls_ch_squeeze, cls_ch_expand,classes=class_num, multiplier=multiplier, final_drop=0.2, norm_layer=norm_layer, **kwargs)
return net
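A usage sketch for the two variants defined above (class count and the 224x224 input are illustrative):
import mxnet as mx
net = get_mobilenet_v3("large", multiplier=1.0, class_num=10)
net.initialize(mx.init.Xavier())
x = mx.nd.random.uniform(shape=(1, 3, 224, 224))
print(net(x).shape)  # expected: (1, 10)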
Paper: https://arxiv.org/pdf/1608.06993.pdf
Network architecture:
The classic DenseNet (Dense Convolutional Network) was proposed by Gao Huang et al. in 2017 in the paper "Densely Connected Convolutional Networks" (https://arxiv.org/pdf/1608.06993.pdf).
DenseNet connects each layer to the others in a feed-forward fashion. A traditional L-layer convolutional network has L connections; DenseNet has L(L+1)/2, because the input of every layer is the concatenation of the outputs of all preceding layers.
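The channel bookkeeping implied by this dense connectivity can be sketched for the DenseNet-121 spec used in the code below (64 initial features, growth rate 32, blocks [6, 12, 24, 16]); it mirrors the num_features arithmetic inside _DenseNet:
num_init_features, growth_rate, block_config = 64, 32, [6, 12, 24, 16]
num_features = num_init_features
for i, num_layers in enumerate(block_config):
    num_features += num_layers * growth_rate  # every dense layer appends growth_rate channels
    if i != len(block_config) - 1:
        num_features //= 2                    # each transition layer halves the channel count
print(num_features)  # 1024 channels enter the final classifier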
Code implementation (Gluon):
from mxnet.gluon import nn
from mxnet.gluon.contrib.nn import HybridConcurrent, Identity

def _make_dense_block(num_layers, bn_size, growth_rate, dropout, stage_index, norm_layer, norm_kwargs):
out = nn.HybridSequential(prefix='stage%d_'%stage_index)
with out.name_scope():
for _ in range(num_layers):
out.add(_make_dense_layer(growth_rate, bn_size, dropout, norm_layer, norm_kwargs))
return out
def _make_dense_layer(growth_rate, bn_size, dropout, norm_layer, norm_kwargs):
new_features = nn.HybridSequential(prefix='')
new_features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
new_features.add(nn.Activation('relu'))
new_features.add(nn.Conv2D(bn_size * growth_rate, kernel_size=1, use_bias=False))
new_features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
new_features.add(nn.Activation('relu'))
new_features.add(nn.Conv2D(growth_rate, kernel_size=3, padding=1, use_bias=False))
if dropout:
new_features.add(nn.Dropout(dropout))
out = HybridConcurrent(axis=1, prefix='')
out.add(Identity())
out.add(new_features)
return out
def _make_transition(num_output_features, norm_layer, norm_kwargs):
out = nn.HybridSequential(prefix='')
out.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
out.add(nn.Activation('relu'))
out.add(nn.Conv2D(num_output_features, kernel_size=1, use_bias=False))
out.add(nn.AvgPool2D(pool_size=2, strides=2))
return out
class _DenseNet(nn.HybridBlock):
def __init__(self, num_init_features, growth_rate, block_config, bn_size=4, dropout=0, classes=1000, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_DenseNet, self).__init__(**kwargs)
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
self.features.add(nn.Conv2D(num_init_features, kernel_size=7, strides=2, padding=3, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2, padding=1))
# Add dense blocks
num_features = num_init_features
for i, num_layers in enumerate(block_config):
self.features.add(_make_dense_block(num_layers, bn_size, growth_rate, dropout, i+1, norm_layer, norm_kwargs))
num_features = num_features + num_layers * growth_rate
if i != len(block_config) - 1:
self.features.add(_make_transition(num_features // 2, norm_layer, norm_kwargs))
num_features = num_features // 2
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.AvgPool2D(pool_size=7))
self.features.add(nn.Flatten())
self.output = nn.Dense(classes)
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
_densenet_spec = {121: (64, 32, [6, 12, 24, 16]),
161: (96, 48, [6, 12, 36, 24]),
169: (64, 32, [6, 12, 32, 32]),
201: (64, 32, [6, 12, 48, 32])}
def get_densenet(num_layers, class_num, **kwargs):
num_init_features, growth_rate, block_config = _densenet_spec[num_layers]
net = _DenseNet(num_init_features, growth_rate, block_config,classes=class_num, **kwargs)
return net
Paper: https://arxiv.org/pdf/2004.08955.pdf
Network architecture:
On ImageNet classification, ResNeSt surpasses its predecessors ResNet, ResNeXt, SENet and EfficientNet. A Faster R-CNN with a ResNeSt-50 backbone reaches an mAP 3.08% higher than one with ResNet-50, and a DeepLabV3 with ResNeSt-50 reaches an mIoU 3.02% higher than with ResNet-50, a very noticeable gain.
Code implementation (Gluon):
import math
import mxnet as mx
from mxnet.gluon import nn
from mxnet.gluon.nn import Conv2D

class _ResNeSt(nn.HybridBlock):
def __init__(self, block, layers, cardinality=1, bottleneck_width=64, classes=1000, dilated=False, dilation=1, norm_layer=nn.BatchNorm, norm_kwargs=None, last_gamma=False, deep_stem=False, stem_width=32, avg_down=False, final_drop=0.0, use_global_stats=False, name_prefix='', dropblock_prob=0.0, input_size=224, use_splat=False, radix=2, avd=False, avd_first=False, split_drop_ratio=0):
self.cardinality = cardinality
self.bottleneck_width = bottleneck_width
self.inplanes = stem_width * 2 if deep_stem else 64
self.radix = radix
self.split_drop_ratio = split_drop_ratio
self.avd_first = avd_first
super(_ResNeSt, self).__init__(prefix=name_prefix)
norm_kwargs = norm_kwargs if norm_kwargs is not None else {}
if use_global_stats:
norm_kwargs['use_global_stats'] = True
self.norm_kwargs = norm_kwargs
with self.name_scope():
if not deep_stem:
self.conv1 = nn.Conv2D(channels=64, kernel_size=7, strides=2, padding=3, use_bias=False, in_channels=3)
else:
self.conv1 = nn.HybridSequential(prefix='conv1')
self.conv1.add(nn.Conv2D(channels=stem_width, kernel_size=3, strides=2, padding=1, use_bias=False, in_channels=3))
self.conv1.add(norm_layer(in_channels=stem_width, **norm_kwargs))
self.conv1.add(nn.Activation('relu'))
self.conv1.add(nn.Conv2D(channels=stem_width, kernel_size=3, strides=1, padding=1, use_bias=False, in_channels=stem_width))
self.conv1.add(norm_layer(in_channels=stem_width, **norm_kwargs))
self.conv1.add(nn.Activation('relu'))
self.conv1.add(nn.Conv2D(channels=stem_width * 2, kernel_size=3, strides=1, padding=1, use_bias=False, in_channels=stem_width))
input_size = _update_input_size(input_size, 2)
self.bn1 = norm_layer(in_channels=64 if not deep_stem else stem_width * 2, **norm_kwargs)
self.relu = nn.Activation('relu')
self.maxpool = nn.MaxPool2D(pool_size=3, strides=2, padding=1)
input_size = _update_input_size(input_size, 2)
self.layer1 = self._make_layer(1, block, 64, layers[0], avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, use_splat=use_splat, avd=avd)
self.layer2 = self._make_layer(2, block, 128, layers[1], strides=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, use_splat=use_splat, avd=avd)
input_size = _update_input_size(input_size, 2)
if dilated or dilation == 4:
self.layer3 = self._make_layer(3, block, 256, layers[2], strides=1, dilation=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd)
self.layer4 = self._make_layer(4, block, 512, layers[3], strides=1, dilation=4, pre_dilation=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd)
elif dilation == 3:
self.layer3 = self._make_layer(3, block, 256, layers[2], strides=1, dilation=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd)
self.layer4 = self._make_layer(4, block, 512, layers[3], strides=2, dilation=2, pre_dilation=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd)
elif dilation == 2:
self.layer3 = self._make_layer(3, block, 256, layers[2], strides=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd)
self.layer4 = self._make_layer(4, block, 512, layers[3], strides=1, dilation=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd)
else:
self.layer3 = self._make_layer(3, block, 256, layers[2], strides=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd)
input_size = _update_input_size(input_size, 2)
self.layer4 = self._make_layer(4, block, 512, layers[3], strides=2, avg_down=avg_down, norm_layer=norm_layer, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd)
input_size = _update_input_size(input_size, 2)
self.avgpool = nn.GlobalAvgPool2D()
self.flat = nn.Flatten()
self.drop = None
if final_drop > 0.0:
self.drop = nn.Dropout(final_drop)
self.fc = nn.Dense(in_units=512 * block.expansion, units=classes)
def _make_layer(self, stage_index, block, planes, blocks, strides=1, dilation=1, pre_dilation=1, avg_down=False, norm_layer=None, last_gamma=False, dropblock_prob=0, input_size=224, use_splat=False, avd=False):
downsample = None
if strides != 1 or self.inplanes != planes * block.expansion:
downsample = nn.HybridSequential(prefix='down%d_' % stage_index)
with downsample.name_scope():
if avg_down:
if pre_dilation == 1:
downsample.add(nn.AvgPool2D(pool_size=strides, strides=strides, ceil_mode=True, count_include_pad=False))
elif strides == 1:
downsample.add(nn.AvgPool2D(pool_size=1, strides=1, ceil_mode=True, count_include_pad=False))
else:
downsample.add(
nn.AvgPool2D(pool_size=pre_dilation * strides, strides=strides, padding=1, ceil_mode=True, count_include_pad=False))
downsample.add(nn.Conv2D(channels=planes * block.expansion, kernel_size=1, strides=1, use_bias=False, in_channels=self.inplanes))
downsample.add(norm_layer(in_channels=planes * block.expansion, **self.norm_kwargs))
else:
downsample.add(nn.Conv2D(channels=planes * block.expansion, kernel_size=1, strides=strides, use_bias=False, in_channels=self.inplanes))
downsample.add(norm_layer(in_channels=planes * block.expansion, **self.norm_kwargs))
layers = nn.HybridSequential(prefix='layers%d_' % stage_index)
with layers.name_scope():
if dilation in (1, 2):
layers.add(block(planes, cardinality=self.cardinality, bottleneck_width=self.bottleneck_width, strides=strides, dilation=pre_dilation, downsample=downsample, previous_dilation=dilation, norm_layer=norm_layer, norm_kwargs=self.norm_kwargs, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd, avd_first=self.avd_first, radix=self.radix, in_channels=self.inplanes, split_drop_ratio=self.split_drop_ratio))
elif dilation == 4:
layers.add(block(planes, cardinality=self.cardinality, bottleneck_width=self.bottleneck_width, strides=strides, dilation=pre_dilation, downsample=downsample, previous_dilation=dilation, norm_layer=norm_layer, norm_kwargs=self.norm_kwargs, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd, avd_first=self.avd_first, radix=self.radix, in_channels=self.inplanes, split_drop_ratio=self.split_drop_ratio))
else:
raise RuntimeError("=> unknown dilation size: {}".format(dilation))
input_size = _update_input_size(input_size, strides)
self.inplanes = planes * block.expansion
for i in range(1, blocks):
layers.add(block(planes, cardinality=self.cardinality, bottleneck_width=self.bottleneck_width, dilation=dilation, previous_dilation=dilation, norm_layer=norm_layer, norm_kwargs=self.norm_kwargs, last_gamma=last_gamma, dropblock_prob=dropblock_prob, input_size=input_size, use_splat=use_splat, avd=avd, avd_first=self.avd_first, radix=self.radix, in_channels=self.inplanes, split_drop_ratio=self.split_drop_ratio))
return layers
def hybrid_forward(self, F, x):
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
x = self.maxpool(x)
x = self.layer1(x)
x = self.layer2(x)
x = self.layer3(x)
x = self.layer4(x)
x = self.avgpool(x)
x = self.flat(x)
if self.drop is not None:
x = self.drop(x)
x = self.fc(x)
return x
class _DropBlock(nn.HybridBlock):
def __init__(self, drop_prob, block_size, c, h, w):
super(_DropBlock, self).__init__()
self.drop_prob = drop_prob
self.block_size = block_size
self.c, self.h, self.w = c, h, w
self.numel = c * h * w
pad_h = max((block_size - 1), 0)
pad_w = max((block_size - 1), 0)
self.padding = (pad_h//2, pad_h-pad_h//2, pad_w//2, pad_w-pad_w//2)
self.dtype = 'float32'
def hybrid_forward(self, F, x):
if not mx.autograd.is_training() or self.drop_prob <= 0:
return x
gamma = self.drop_prob * (self.h * self.w) / (self.block_size ** 2) / ((self.w - self.block_size + 1) * (self.h - self.block_size + 1))
# generate mask
mask = F.random.uniform(0, 1, shape=(1, self.c, self.h, self.w), dtype=self.dtype) < gamma
mask = F.Pooling(mask, pool_type='max',
kernel=(self.block_size, self.block_size), pad=self.padding)
mask = 1 - mask
y = F.broadcast_mul(F.broadcast_mul(x, mask), (1.0 * self.numel / mask.sum(axis=0, exclude=True).expand_dims(1).expand_dims(1).expand_dims(1)))
return y
def cast(self, dtype):
super(_DropBlock, self).cast(dtype)
self.dtype = dtype
def __repr__(self):
reprstr = self.__class__.__name__ + '(' +'drop_prob: {}, block_size{}'.format(self.drop_prob, self.block_size) +')'
return reprstr
def _update_input_size(input_size, stride):
sh, sw = (stride, stride) if isinstance(stride, int) else stride
ih, iw = (input_size, input_size) if isinstance(input_size, int) else input_size
oh, ow = math.ceil(ih / sh), math.ceil(iw / sw)
input_size = (oh, ow)
return input_size
class _SplitAttentionConv(nn.HybridBlock):
def __init__(self, channels, kernel_size, strides=(1, 1), padding=(0, 0), dilation=(1, 1), groups=1, radix=2, in_channels=None, r=2, norm_layer=nn.BatchNorm, norm_kwargs=None, drop_ratio=0, *args, **kwargs):
super(_SplitAttentionConv, self).__init__()
norm_kwargs = norm_kwargs if norm_kwargs is not None else {}
inter_channels = max(in_channels*radix//2//r, 32)
self.radix = radix
self.cardinality = groups
self.conv = nn.Conv2D(channels*radix, kernel_size, strides, padding, dilation,
groups=groups*radix, *args, in_channels=in_channels, **kwargs)
self.use_bn = norm_layer is not None
if self.use_bn:
self.bn = norm_layer(in_channels=channels*radix, **norm_kwargs)
self.relu = nn.Activation('relu')
self.fc1 = nn.Conv2D(inter_channels, 1, in_channels=channels, groups=self.cardinality)
if self.use_bn:
self.bn1 = norm_layer(in_channels=inter_channels, **norm_kwargs)
self.relu1 = nn.Activation('relu')
if drop_ratio > 0:
self.drop = nn.Dropout(drop_ratio)
else:
self.drop = None
self.fc2 = nn.Conv2D(channels*radix, 1, in_channels=inter_channels, groups=self.cardinality)
self.channels = channels
def hybrid_forward(self, F, x):
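# Split-Attention: the grouped conv emits radix*channels feature maps; they are split into radix groups, their sum is globally pooled and passed through two 1x1 convs, and a per-group softmax (sigmoid when radix == 1) produces weights that re-scale each split before the splits are summed back together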
x = self.conv(x)
if self.use_bn:
x = self.bn(x)
x = self.relu(x)
if self.radix > 1:
splited = F.reshape(x.expand_dims(1), (0, self.radix, self.channels, 0, 0))
gap = F.sum(splited, axis=1)
else:
gap = x
gap = F.contrib.AdaptiveAvgPooling2D(gap, 1)
gap = self.fc1(gap)
if self.use_bn:
gap = self.bn1(gap)
atten = self.relu1(gap)
if self.drop:
atten = self.drop(atten)
atten = self.fc2(atten).reshape((0, self.cardinality, self.radix, -1)).swapaxes(1, 2)
if self.radix > 1:
atten = F.softmax(atten, axis=1).reshape((0, self.radix, -1, 1, 1))
else:
atten = F.sigmoid(atten).reshape((0, -1, 1, 1))
if self.radix > 1:
outs = F.broadcast_mul(atten, splited)
out = F.sum(outs, axis=1)
else:
out = F.broadcast_mul(atten, x)
return out
class _Bottleneck(nn.HybridBlock):
expansion = 4
def __init__(self, channels, cardinality=1, bottleneck_width=64, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, norm_kwargs=None, last_gamma=False, dropblock_prob=0, input_size=None, use_splat=False, radix=2, avd=False, avd_first=False, in_channels=None, split_drop_ratio=0, **kwargs):
super(_Bottleneck, self).__init__()
group_width = int(channels * (bottleneck_width / 64.)) * cardinality
norm_kwargs = norm_kwargs if norm_kwargs is not None else {}
self.dropblock_prob = dropblock_prob
self.use_splat = use_splat
self.avd = avd and (strides > 1 or previous_dilation != dilation)
self.avd_first = avd_first
if self.dropblock_prob > 0:
self.dropblock1 = _DropBlock(dropblock_prob, 3, group_width, *input_size)
if self.avd:
if avd_first:
input_size = _update_input_size(input_size, strides)
self.dropblock2 = _DropBlock(dropblock_prob, 3, group_width, *input_size)
if not avd_first:
input_size = _update_input_size(input_size, strides)
else:
input_size = _update_input_size(input_size, strides)
self.dropblock2 = _DropBlock(dropblock_prob, 3, group_width, *input_size)
self.dropblock3 = _DropBlock(dropblock_prob, 3, channels * 4, *input_size)
self.conv1 = nn.Conv2D(channels=group_width, kernel_size=1, use_bias=False, in_channels=in_channels)
self.bn1 = norm_layer(in_channels=group_width, **norm_kwargs)
self.relu1 = nn.Activation('relu')
if self.use_splat:
self.conv2 = _SplitAttentionConv(channels=group_width, kernel_size=3,
strides=1 if self.avd else strides,
padding=dilation, dilation=dilation, groups=cardinality, use_bias=False, in_channels=group_width, norm_layer=norm_layer, norm_kwargs=norm_kwargs, radix=radix, drop_ratio=split_drop_ratio, **kwargs)
else:
self.conv2 = nn.Conv2D(channels=group_width, kernel_size=3,
strides=1 if self.avd else strides,
padding=dilation, dilation=dilation, groups=cardinality,
use_bias=False, in_channels=group_width, **kwargs)
self.bn2 = norm_layer(in_channels=group_width, **norm_kwargs)
self.relu2 = nn.Activation('relu')
self.conv3 = nn.Conv2D(channels=channels * 4, kernel_size=1, use_bias=False,
in_channels=group_width)
if not last_gamma:
self.bn3 = norm_layer(in_channels=channels * 4, **norm_kwargs)
else:
self.bn3 = norm_layer(in_channels=channels * 4, gamma_initializer='zeros',
**norm_kwargs)
if self.avd:
self.avd_layer = nn.AvgPool2D(3, strides, padding=1)
self.relu3 = nn.Activation('relu')
self.downsample = downsample
self.dilation = dilation
self.strides = strides
def hybrid_forward(self, F, x):
residual = x
out = self.conv1(x)
out = self.bn1(out)
if self.dropblock_prob > 0:
out = self.dropblock1(out)
out = self.relu1(out)
if self.avd and self.avd_first:
out = self.avd_layer(out)
if self.use_splat:
out = self.conv2(out)
if self.dropblock_prob > 0:
out = self.dropblock2(out)
else:
out = self.conv2(out)
out = self.bn2(out)
if self.dropblock_prob > 0:
out = self.dropblock2(out)
out = self.relu2(out)
if self.avd and not self.avd_first:
out = self.avd_layer(out)
out = self.conv3(out)
out = self.bn3(out)
if self.downsample is not None:
residual = self.downsample(x)
if self.dropblock_prob > 0:
out = self.dropblock3(out)
out = out + residual
out = self.relu3(out)
return out
def get_resnest(num_layers,class_num,image_size):
if num_layers==14:
net = _ResNeSt(_Bottleneck, [1, 1, 1, 1],
radix=2, cardinality=1, bottleneck_width=64,
deep_stem=True, avg_down=True,
avd=True, avd_first=False,
use_splat=True, dropblock_prob=0.0,
name_prefix='resnest_14',classes=class_num,input_size=image_size)
elif num_layers == 26:
net = _ResNeSt(_Bottleneck, [2, 2, 2, 2],
radix=2, cardinality=1, bottleneck_width=64,
deep_stem=True, avg_down=True,
avd=True, avd_first=False,
use_splat=True, dropblock_prob=0.1,
name_prefix='resnest_26', classes=class_num,input_size=image_size)
elif num_layers == 50:
net = _ResNeSt(_Bottleneck, [3, 4, 6, 3],
radix=2, cardinality=1, bottleneck_width=64,
deep_stem=True, avg_down=True,
avd=True, avd_first=False,
use_splat=True, dropblock_prob=0.1,
name_prefix='resnest_50', classes=class_num,input_size=image_size)
elif num_layers == 101:
net = _ResNeSt(_Bottleneck, [3, 4, 23, 3],
radix=2, cardinality=1, bottleneck_width=64,
deep_stem=True, avg_down=True, stem_width=64,
avd=True, avd_first=False, use_splat=True, dropblock_prob=0.1,
name_prefix='resnest_101', classes=class_num,input_size=image_size)
elif num_layers == 200:
net = _ResNeSt(_Bottleneck, [3, 24, 36, 3], deep_stem=True, avg_down=True, stem_width=64,
avd=True, use_splat=True, dropblock_prob=0.1, final_drop=0.2,
name_prefix='resnest_200', classes=class_num,input_size=image_size)
elif num_layers == 269:
net = _ResNeSt(_Bottleneck, [3, 30, 48, 8], deep_stem=True, avg_down=True, stem_width=64,
avd=True, use_splat=True, dropblock_prob=0.1, final_drop=0.2,
name_prefix='resnest_269', classes=class_num,input_size=image_size)
else:
net = None
return net
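A minimal usage sketch of the factory above (assuming mxnet is imported as mx, the `_ResNeSt` definition earlier in this post is in scope, and the 224x224 input with 10 classes are arbitrary example values):
import mxnet as mx
net = get_resnest(num_layers=50, class_num=10, image_size=224)
net.initialize(mx.init.Xavier())
x = mx.nd.random.uniform(shape=(1, 3, 224, 224))  # dummy NCHW batch
y = net(x)  # output shape: (1, 10)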
Paper: https://arxiv.org/pdf/1611.05431.pdf
Network structure:
ResNeXt improves on ResNet by adopting a multi-branch strategy. In the paper the authors present three equivalent forms of the building block, and the final ResNeXt is built from form (c). This differs from Inception: in Inception each branch has a different topology (for example a 1x1 convolution, a 3x3 convolution and a 5x5 convolution), whereas ResNeXt uses the same topology in every branch (implemented as a grouped convolution, with the number of branches called the cardinality) and improves accuracy while keeping the parameter count essentially unchanged.
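To make that concrete, here is a back-of-the-envelope sketch (illustrative only, using the 256-d bottleneck example from the paper) checking that a ResNeXt 32x4d block built with grouped convolution costs roughly the same number of parameters as the plain 64-wide ResNet bottleneck it replaces:
def bottleneck_params(c_in, width, c_out, groups=1, k=3):
    # 1x1 reduce + kxk (grouped) conv + 1x1 expand, biases ignored
    return c_in * width + width * (width // groups) * k * k + width * c_out
resnet_block = bottleneck_params(256, 64, 256, groups=1)     # ~70k parameters
resnext_block = bottleneck_params(256, 128, 256, groups=32)  # ~70k parameters, 32 "branches"
print(resnet_block, resnext_block)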
Code implementation (gluon):
class _Block_resnext(nn.HybridBlock):
def __init__(self, channels, cardinality, bottleneck_width, stride, downsample=False, last_gamma=False, use_se=False, avg_down=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_Block_resnext, self).__init__(**kwargs)
D = int(math.floor(channels * (bottleneck_width / 64)))
group_width = cardinality * D
self.body = nn.HybridSequential(prefix='')
self.body.add(nn.Conv2D(group_width, kernel_size=1, use_bias=False))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(nn.Conv2D(group_width, kernel_size=3, strides=stride, padding=1, groups=cardinality, use_bias=False))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(nn.Conv2D(channels * 4, kernel_size=1, use_bias=False))
if last_gamma:
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.body.add(norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs)))
if use_se:
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Conv2D(channels // 4, kernel_size=1, padding=0))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Conv2D(channels * 4, kernel_size=1, padding=0))
self.se.add(nn.Activation('sigmoid'))
else:
self.se = None
if downsample:
self.downsample = nn.HybridSequential(prefix='')
if avg_down:
self.downsample.add(nn.AvgPool2D(pool_size=stride, strides=stride, ceil_mode=True, count_include_pad=False))
self.downsample.add(nn.Conv2D(channels=channels * 4, kernel_size=1, strides=1, use_bias=False))
else:
self.downsample.add(nn.Conv2D(channels * 4, kernel_size=1, strides=stride, use_bias=False))
self.downsample.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.body(x)
if self.se:
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w)
if self.downsample:
residual = self.downsample(residual)
x = F.Activation(x + residual, act_type='relu')
return x
class _ResNext(nn.HybridBlock):
def __init__(self, layers, cardinality, bottleneck_width, classes=1000, last_gamma=False, use_se=False, deep_stem=False, avg_down=False, stem_width=64, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_ResNext, self).__init__(**kwargs)
self.cardinality = cardinality
self.bottleneck_width = bottleneck_width
channels = 64
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
if not deep_stem:
self.features.add(nn.Conv2D(channels=64, kernel_size=7, strides=2, padding=3, use_bias=False))
else:
self.features.add(nn.Conv2D(channels=stem_width, kernel_size=3, strides=2, padding=1, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.Conv2D(channels=stem_width, kernel_size=3, strides=1, padding=1, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.Conv2D(channels=stem_width * 2, kernel_size=3, strides=1, padding=1, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.MaxPool2D(3, 2, 1))
for i, num_layer in enumerate(layers):
stride = 1 if i == 0 else 2
self.features.add(self._make_layer(channels, num_layer, stride, last_gamma, use_se, False if i == 0 else avg_down, i + 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
channels *= 2
self.features.add(nn.GlobalAvgPool2D())
self.output = nn.Dense(classes)
def _make_layer(self, channels, num_layers, stride, last_gamma, use_se, avg_down, stage_index, norm_layer=nn.BatchNorm, norm_kwargs=None):
layer = nn.HybridSequential(prefix='stage%d_' % stage_index)
with layer.name_scope():
layer.add(_Block_resnext(channels, self.cardinality, self.bottleneck_width, stride, True, last_gamma=last_gamma, use_se=use_se, avg_down=avg_down, prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
for _ in range(num_layers - 1):
layer.add(_Block_resnext(channels, self.cardinality, self.bottleneck_width, 1, False, last_gamma=last_gamma, use_se=use_se, prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
return layer
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
_resnext_spec = {50: [3, 4, 6, 3], 101: [3, 4, 23, 3]}
def get_resnextnet(version,num_layers,class_num,use_se=True):
def get_resnext(num_layers,cardinality=32,class_num=1000, bottleneck_width=4, use_se=False, deep_stem=False, avg_down=False, **kwargs):
layers = _resnext_spec[num_layers]
# use_se is passed explicitly below; copying it into kwargs as well would raise "got multiple values for keyword argument 'use_se'"
net = _ResNext(layers, cardinality, bottleneck_width, classes=class_num, use_se=use_se, deep_stem=deep_stem, avg_down=avg_down, **kwargs)
return net
if version=='101e' or version=='50e':
net_model = get_resnext(num_layers=(int)(version.replace('e','')),cardinality=num_layers,class_num=class_num, bottleneck_width = 4,deep_stem=True, avg_down=True)
else:
net_model = get_resnext(num_layers=(int)(version), cardinality=num_layers, class_num=class_num,bottleneck_width=4)
return net_model
Paper: https://arxiv.org/pdf/1709.01507.pdf
Network structure:
SENet was proposed in the paper "Squeeze-and-Excitation Networks" and is applied to image tasks. The key idea is to model the relationships between channels and use them to recalibrate the channel-wise features, which improves the representational power of the network. The recalibration uses global information to strengthen informative features and suppress less useful ones.
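The recalibration itself takes only a few lines; the minimal sketch below (illustrative, using the paper's default reduction ratio r=16, and not the exact block used in the code that follows) shows the squeeze-excite-scale steps:
from mxnet.gluon import nn
class MiniSE(nn.HybridBlock):
    def __init__(self, channels, r=16, **kwargs):
        super(MiniSE, self).__init__(**kwargs)
        self.fc1 = nn.Conv2D(channels // r, kernel_size=1)  # excitation: reduce to C/r channels
        self.fc2 = nn.Conv2D(channels, kernel_size=1)        # excitation: expand back to C channels
    def hybrid_forward(self, F, x):
        w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)  # squeeze: global average pool -> (N, C, 1, 1)
        w = F.sigmoid(self.fc2(F.relu(self.fc1(w))))          # per-channel gate in (0, 1)
        return F.broadcast_mul(x, w)                          # scale: re-weight the input channels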
Code implementation (gluon):
class _SEBlock(nn.HybridBlock):
def __init__(self, channels, cardinality, bottleneck_width, stride, downsample=False, downsample_kernel_size=3, avg_down=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_SEBlock, self).__init__(**kwargs)
D = int(math.floor(channels * (bottleneck_width / 64)))
group_width = cardinality * D
self.body = nn.HybridSequential(prefix='')
self.body.add(nn.Conv2D(group_width // 2, kernel_size=1, use_bias=False))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(nn.Conv2D(group_width, kernel_size=3, strides=stride, padding=1, groups=cardinality, use_bias=False))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(nn.Conv2D(channels * 4, kernel_size=1, use_bias=False))
self.body.add(norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs)))
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Conv2D(channels // 4, kernel_size=1, padding=0))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Conv2D(channels * 4, kernel_size=1, padding=0))
self.se.add(nn.Activation('sigmoid'))
if downsample:
self.downsample = nn.HybridSequential(prefix='')
if avg_down:
self.downsample.add(nn.AvgPool2D(pool_size=stride, strides=stride, ceil_mode=True, count_include_pad=False))
self.downsample.add(nn.Conv2D(channels=channels * 4, kernel_size=1, strides=1, use_bias=False))
else:
downsample_padding = 1 if downsample_kernel_size == 3 else 0
self.downsample.add(nn.Conv2D(channels * 4, kernel_size=downsample_kernel_size, strides=stride, padding=downsample_padding, use_bias=False))
self.downsample.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.body(x)
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w)
if self.downsample:
residual = self.downsample(residual)
x = F.Activation(x + residual, act_type='relu')
return x
class _SENet(nn.HybridBlock):
def __init__(self, layers, cardinality, bottleneck_width, avg_down=False, classes=1000, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(_SENet, self).__init__(**kwargs)
self.cardinality = cardinality
self.bottleneck_width = bottleneck_width
channels = 64
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
self.features.add(nn.Conv2D(channels, 3, 2, 1, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.Conv2D(channels, 3, 1, 1, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.Conv2D(channels * 2, 3, 1, 1, use_bias=False))
self.features.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.features.add(nn.Activation('relu'))
self.features.add(nn.MaxPool2D(3, 2, ceil_mode=True))
for i, num_layer in enumerate(layers):
stride = 1 if i == 0 else 2
self.features.add(self._make_layer(channels, num_layer, stride, i + 1, avg_down=(False if i == 0 else avg_down), norm_layer=norm_layer, norm_kwargs=norm_kwargs))
channels *= 2
self.features.add(nn.GlobalAvgPool2D())
self.features.add(nn.Dropout(0.2))
self.output = nn.Dense(classes)
def _make_layer(self, channels, num_layers, stride, stage_index, avg_down=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
layer = nn.HybridSequential(prefix='stage%d_' % stage_index)
downsample_kernel_size = 1 if stage_index == 1 else 3
with layer.name_scope():
layer.add(_SEBlock(channels, self.cardinality, self.bottleneck_width, stride, True, downsample_kernel_size, avg_down=avg_down, prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
for _ in range(num_layers - 1):
layer.add(_SEBlock(channels, self.cardinality, self.bottleneck_width, 1, False, prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
return layer
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
_senet_spec = {50: [3, 4, 6, 3],
101: [3, 4, 23, 3],
152: [3, 8, 36, 3]}
def get_senet(version,num_layers,class_num, cardinality=64, bottleneck_width=4, **kwargs):
layers = _senet_spec[num_layers]
if 'e' in version:
avg_down = True
else:
avg_down = False
net = _SENet(layers, cardinality, bottleneck_width, avg_down,classes=class_num, **kwargs)
return net
Code implementation (gluon):
class _SeparableConv2d(nn.HybridBlock):
def __init__(self, inplanes, planes, kernel_size=3, stride=1, dilation=1, bias=False, norm_layer=None, norm_kwargs=None):
super(_SeparableConv2d, self).__init__()
norm_kwargs = norm_kwargs if norm_kwargs is not None else {}
self.kernel_size = kernel_size
self.dilation = dilation
self.conv1 = nn.Conv2D(in_channels=inplanes, channels=inplanes, kernel_size=kernel_size, strides=stride, padding=0, dilation=dilation, groups=inplanes, use_bias=bias)
self.bn = norm_layer(in_channels=inplanes, **norm_kwargs)
self.pointwise = nn.Conv2D(in_channels=inplanes, channels=planes, kernel_size=1, use_bias=bias)
def hybrid_forward(self, F, x):
x = self.fixed_padding(x, F, self.kernel_size, dilation=self.dilation)
x = self.conv1(x)
x = self.bn(x)
x = self.pointwise(x)
return x
def fixed_padding(self, inputs, F, kernel_size, dilation):
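# TensorFlow-style 'SAME' padding that accounts for dilation: the dilated kernel spans kernel_size + (kernel_size - 1) * (dilation - 1) pixels, and the total padding is split between the two sides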
kernel_size_effective = kernel_size + (kernel_size - 1) * (dilation - 1)
pad_total = kernel_size_effective - 1
pad_beg = pad_total // 2
pad_end = pad_total - pad_beg
padded_inputs = F.pad(inputs, mode="constant", constant_value=0, pad_width=(0, 0, 0, 0, pad_beg, pad_end, pad_beg, pad_end))
return padded_inputs
class _Block(nn.HybridBlock):
def __init__(self, inplanes, planes, reps, stride=1, dilation=1, norm_layer=None, norm_kwargs=None, start_with_relu=True, grow_first=True, is_last=False):
super(_Block, self).__init__()
norm_kwargs = norm_kwargs if norm_kwargs is not None else {}
if planes != inplanes or stride != 1:
self.skip = nn.Conv2D(in_channels=inplanes, channels=planes, kernel_size=1, strides=stride, use_bias=False)
self.skipbn = norm_layer(in_channels=planes, **norm_kwargs)
else:
self.skip = None
self.relu = nn.Activation('relu')
self.rep = nn.HybridSequential()
filters = inplanes
if grow_first:
if start_with_relu:
self.rep.add(self.relu)
self.rep.add(_SeparableConv2d(inplanes, planes, 3, 1, dilation, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.rep.add(norm_layer(in_channels=planes, **norm_kwargs))
filters = planes
for i in range(reps - 1):
if grow_first or start_with_relu:
self.rep.add(self.relu)
self.rep.add(_SeparableConv2d(filters, filters, 3, 1, dilation, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.rep.add(norm_layer(in_channels=filters, **norm_kwargs))
if not grow_first:
self.rep.add(self.relu)
self.rep.add(_SeparableConv2d(inplanes, planes, 3, 1, dilation, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.rep.add(norm_layer(in_channels=planes, **norm_kwargs))
if stride != 1:
self.rep.add(self.relu)
self.rep.add(_SeparableConv2d(planes, planes, 3, stride, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.rep.add(norm_layer(in_channels=planes, **norm_kwargs))
elif is_last:
self.rep.add(self.relu)
self.rep.add(_SeparableConv2d(planes, planes, 3, 1, dilation, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.rep.add(norm_layer(in_channels=planes, **norm_kwargs))
def hybrid_forward(self, F, inp):
x = self.rep(inp)
if self.skip is not None:
skip = self.skip(inp)
skip = self.skipbn(skip)
else:
skip = inp
x = x + skip
return x
class _Xception65(nn.HybridBlock):
def __init__(self, classes=1000, output_stride=32, norm_layer=nn.BatchNorm, norm_kwargs=None):
super(_Xception65, self).__init__()
norm_kwargs = norm_kwargs if norm_kwargs is not None else {}
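# output_stride sets the overall downsampling factor of the backbone; for 16 and 8 the later blocks keep stride 1 and rely on dilated convolutions instead, which matters mainly when the network is reused as a backbone for dense prediction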
if output_stride == 32:
entry_block3_stride = 2
exit_block20_stride = 2
middle_block_dilation = 1
exit_block_dilations = (1, 1)
elif output_stride == 16:
entry_block3_stride = 2
exit_block20_stride = 1
middle_block_dilation = 1
exit_block_dilations = (1, 2)
elif output_stride == 8:
entry_block3_stride = 1
exit_block20_stride = 1
middle_block_dilation = 2
exit_block_dilations = (2, 4)
else:
raise NotImplementedError
with self.name_scope():
self.conv1 = nn.Conv2D(in_channels=3, channels=32, kernel_size=3, strides=2, padding=1, use_bias=False)
self.bn1 = norm_layer(in_channels=32, **norm_kwargs)
self.relu = nn.Activation('relu')
self.conv2 = nn.Conv2D(in_channels=32, channels=64, kernel_size=3, strides=1, padding=1, use_bias=False)
self.bn2 = norm_layer(in_channels=64, **norm_kwargs)
self.block1 = _Block(64, 128, reps=2, stride=2, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=False)
self.block2 = _Block(128, 256, reps=2, stride=2, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=False, grow_first=True)
self.block3 = _Block(256, 728, reps=2, stride=entry_block3_stride, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=True, grow_first=True, is_last=True)
# Middle flow
self.midflow = nn.HybridSequential()
for i in range(4, 20):
self.midflow.add((_Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=True, grow_first=True)))
# Exit flow
self.block20 = _Block(728, 1024, reps=2, stride=exit_block20_stride, dilation=exit_block_dilations[0], norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=True, grow_first=False, is_last=True)
self.conv3 = _SeparableConv2d(1024, 1536, 3, stride=1, dilation=exit_block_dilations[1], norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.bn3 = norm_layer(in_channels=1536, **norm_kwargs)
self.conv4 = _SeparableConv2d(1536, 1536, 3, stride=1, dilation=exit_block_dilations[1], norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.bn4 = norm_layer(in_channels=1536, **norm_kwargs)
self.conv5 = _SeparableConv2d(1536, 2048, 3, stride=1, dilation=exit_block_dilations[1], norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.bn5 = norm_layer(in_channels=2048, **norm_kwargs)
self.avgpool = nn.GlobalAvgPool2D()
self.flat = nn.Flatten()
self.fc = nn.Dense(in_units=2048, units=classes)
def hybrid_forward(self, F, x):
# Entry flow
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
x = self.conv2(x)
x = self.bn2(x)
x = self.relu(x)
x = self.block1(x)
# add relu here
x = self.relu(x)
#c1 = x
x = self.block2(x)
#c2 = x
x = self.block3(x)
# Middle flow
x = self.midflow(x)
#c3 = x
# Exit flow
x = self.block20(x)
x = self.relu(x)
x = self.conv3(x)
x = self.bn3(x)
x = self.relu(x)
x = self.conv4(x)
x = self.bn4(x)
x = self.relu(x)
x = self.conv5(x)
x = self.bn5(x)
x = self.relu(x)
x = self.avgpool(x)
x = self.flat(x)
x = self.fc(x)
return x
class _Xception71(nn.HybridBlock):
def __init__(self, classes=1000, output_stride=32, norm_layer=nn.BatchNorm, norm_kwargs=None):
super(_Xception71, self).__init__()
norm_kwargs = norm_kwargs if norm_kwargs is not None else {}
if output_stride == 32:
entry_block3_stride = 2
exit_block20_stride = 2
middle_block_dilation = 1
exit_block_dilations = (1, 1)
elif output_stride == 16:
entry_block3_stride = 2
exit_block20_stride = 1
middle_block_dilation = 1
exit_block_dilations = (1, 2)
elif output_stride == 8:
entry_block3_stride = 1
exit_block20_stride = 1
middle_block_dilation = 2
exit_block_dilations = (2, 4)
else:
raise NotImplementedError
# Entry flow
with self.name_scope():
self.conv1 = nn.Conv2D(in_channels=3, channels=32, kernel_size=3, strides=2, padding=1, use_bias=False)
self.bn1 = norm_layer(in_channels=32, **norm_kwargs)
self.relu = nn.Activation('relu')
self.conv2 = nn.Conv2D(in_channels=32, channels=64, kernel_size=3, strides=1, padding=1, use_bias=False)
self.bn2 = norm_layer(in_channels=64, **norm_kwargs)
self.block1 = _Block(64, 128, reps=2, stride=2, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=False)
self.block2 = nn.HybridSequential()
self.block2.add(_Block(128, 256, reps=2, stride=1, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=False, grow_first=True))
self.block2.add(_Block(256, 256, reps=2, stride=2, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=False, grow_first=True))
self.block2.add(_Block(256, 728, reps=2, stride=2, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=False, grow_first=True))
self.block3 = _Block(728, 728, reps=2, stride=entry_block3_stride, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=True, grow_first=True, is_last=True)
# Middle flow
self.midflow = nn.HybridSequential()
for i in range(4, 20):
self.midflow.add((_Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=True, grow_first=True)))
# Exit flow
self.block20 = _Block(728, 1024, reps=2, stride=exit_block20_stride, dilation=exit_block_dilations[0], norm_layer=norm_layer, norm_kwargs=norm_kwargs, start_with_relu=True, grow_first=False, is_last=True)
self.conv3 = _SeparableConv2d(1024, 1536, 3, stride=1, dilation=exit_block_dilations[1], norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.bn3 = norm_layer(in_channels=1536, **norm_kwargs)
self.conv4 = _SeparableConv2d(1536, 1536, 3, stride=1, dilation=exit_block_dilations[1], norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.bn4 = norm_layer(in_channels=1536, **norm_kwargs)
self.conv5 = _SeparableConv2d(1536, 2048, 3, stride=1, dilation=exit_block_dilations[1], norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.bn5 = norm_layer(in_channels=2048, **norm_kwargs)
self.avgpool = nn.GlobalAvgPool2D()
self.flat = nn.Flatten()
self.fc = nn.Dense(in_units=2048, units=classes)
def hybrid_forward(self, F, x):
# Entry flow
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
x = self.conv2(x)
x = self.bn2(x)
x = self.relu(x)
x = self.block1(x)
# add relu here
x = self.relu(x)
# low_level_feat = x
x = self.block2(x)
#c2 = x
x = self.block3(x)
# Middle flow
x = self.midflow(x)
#c3 = x
# Exit flow
x = self.block20(x)
x = self.relu(x)
x = self.conv3(x)
x = self.bn3(x)
x = self.relu(x)
x = self.conv4(x)
x = self.bn4(x)
x = self.relu(x)
x = self.conv5(x)
x = self.bn5(x)
x = self.relu(x)
x = self.avgpool(x)
x = self.flat(x)
x = self.fc(x)
return x
def get_xception(version, class_num, **kwargs):
if version == 65:
net = _Xception65(classes=class_num,**kwargs)
elif version == 71:
net = _Xception71(classes=class_num, **kwargs)
else:
net = None
return net
Code implementation (gluon):
def _make_basic_conv_inceptionv3(norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
out = nn.HybridSequential(prefix='')
out.add(nn.Conv2D(use_bias=False, **kwargs))
out.add(norm_layer(epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
out.add(nn.Activation('relu'))
return out
def _make_branch_inceptionv3(use_pool, norm_layer, norm_kwargs, *conv_settings):
out = nn.HybridSequential(prefix='')
if use_pool == 'avg':
out.add(nn.AvgPool2D(pool_size=3, strides=1, padding=1))
elif use_pool == 'max':
out.add(nn.MaxPool2D(pool_size=3, strides=2))
setting_names = ['channels', 'kernel_size', 'strides', 'padding']
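# each conv setting tuple is (channels, kernel_size, strides, padding); None entries are skipped so the Conv2D defaults apply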
for setting in conv_settings:
kwargs = {}
for i, value in enumerate(setting):
if value is not None:
kwargs[setting_names[i]] = value
out.add(_make_basic_conv_inceptionv3(norm_layer, norm_kwargs, **kwargs))
return out
def _make_A(pool_features, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (64, 1, None, None)))
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (48, 1, None, None), (64, 5, None, 2)))
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (64, 1, None, None), (96, 3, None, 1), (96, 3, None, 1)))
out.add(_make_branch_inceptionv3('avg', norm_layer, norm_kwargs, (pool_features, 1, None, None)))
return out
def _make_B(prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (384, 3, 2, None)))
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (64, 1, None, None), (96, 3, None, 1), (96, 3, 2, None)))
out.add(_make_branch_inceptionv3('max', norm_layer, norm_kwargs))
return out
def _make_C(channels_7x7, prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (192, 1, None, None)))
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (channels_7x7, 1, None, None), (channels_7x7, (1, 7), None, (0, 3)), (192, (7, 1), None, (3, 0))))
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (channels_7x7, 1, None, None), (channels_7x7, (7, 1), None, (3, 0)), (channels_7x7, (1, 7), None, (0, 3)), (channels_7x7, (7, 1), None, (3, 0)), (192, (1, 7), None, (0, 3))))
out.add(_make_branch_inceptionv3('avg', norm_layer, norm_kwargs, (192, 1, None, None)))
return out
def _make_D(prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (192, 1, None, None), (320, 3, 2, None)))
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (192, 1, None, None), (192, (1, 7), None, (0, 3)), (192, (7, 1), None, (3, 0)), (192, 3, 2, None)))
out.add(_make_branch_inceptionv3('max', norm_layer, norm_kwargs))
return out
def _make_E(prefix, norm_layer, norm_kwargs):
out = HybridConcurrent(axis=1, prefix=prefix)
with out.name_scope():
out.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (320, 1, None, None)))
branch_3x3 = nn.HybridSequential(prefix='')
out.add(branch_3x3)
branch_3x3.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (384, 1, None, None)))
branch_3x3_split = HybridConcurrent(axis=1, prefix='')
branch_3x3_split.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (384, (1, 3), None, (0, 1))))
branch_3x3_split.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (384, (3, 1), None, (1, 0))))
branch_3x3.add(branch_3x3_split)
branch_3x3dbl = nn.HybridSequential(prefix='')
out.add(branch_3x3dbl)
branch_3x3dbl.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (448, 1, None, None), (384, 3, None, 1)))
branch_3x3dbl_split = HybridConcurrent(axis=1, prefix='')
branch_3x3dbl.add(branch_3x3dbl_split)
branch_3x3dbl_split.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (384, (1, 3), None, (0, 1))))
branch_3x3dbl_split.add(_make_branch_inceptionv3(None, norm_layer, norm_kwargs, (384, (3, 1), None, (1, 0))))
out.add(_make_branch_inceptionv3('avg', norm_layer, norm_kwargs, (192, 1, None, None)))
return out
def make_aux_inceptionv3(classes, norm_layer, norm_kwargs):
out = nn.HybridSequential(prefix='')
out.add(nn.AvgPool2D(pool_size=5, strides=3))
out.add(_make_basic_conv_inceptionv3(channels=128, kernel_size=1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
out.add(_make_basic_conv_inceptionv3(channels=768, kernel_size=5, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
out.add(nn.Flatten())
out.add(nn.Dense(classes))
return out
class _Inception3(nn.HybridBlock):
def __init__(self, classes=1000, norm_layer=nn.BatchNorm, norm_kwargs=None, partial_bn=False, **kwargs):
super(_Inception3, self).__init__(**kwargs)
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
self.features.add(_make_basic_conv_inceptionv3(channels=32, kernel_size=3, strides=2, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
if partial_bn:
if norm_kwargs is not None:
norm_kwargs['use_global_stats'] = True
else:
norm_kwargs = {}
norm_kwargs['use_global_stats'] = True
self.features.add(_make_basic_conv_inceptionv3(channels=32, kernel_size=3, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.features.add(_make_basic_conv_inceptionv3(channels=64, kernel_size=3, padding=1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
self.features.add(_make_basic_conv_inceptionv3(channels=80, kernel_size=1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.features.add(_make_basic_conv_inceptionv3(channels=192, kernel_size=3, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
self.features.add(_make_A(32, 'A1_', norm_layer, norm_kwargs))
self.features.add(_make_A(64, 'A2_', norm_layer, norm_kwargs))
self.features.add(_make_A(64, 'A3_', norm_layer, norm_kwargs))
self.features.add(_make_B('B_', norm_layer, norm_kwargs))
self.features.add(_make_C(128, 'C1_', norm_layer, norm_kwargs))
self.features.add(_make_C(160, 'C2_', norm_layer, norm_kwargs))
self.features.add(_make_C(160, 'C3_', norm_layer, norm_kwargs))
self.features.add(_make_C(192, 'C4_', norm_layer, norm_kwargs))
self.features.add(_make_D('D_', norm_layer, norm_kwargs))
self.features.add(_make_E('E1_', norm_layer, norm_kwargs))
self.features.add(_make_E('E2_', norm_layer, norm_kwargs))
self.features.add(nn.AvgPool2D(pool_size=8))
self.features.add(nn.Dropout(0.5))
self.output = nn.Dense(classes)
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
def get_inception_v3(class_num, partial_bn=False, **kwargs):
net = _Inception3(classes=class_num,partial_bn = partial_bn,**kwargs)
return net
def _make_fire(squeeze_channels, expand1x1_channels, expand3x3_channels):
out = nn.HybridSequential(prefix='')
out.add(_make_fire_conv(squeeze_channels, 1))
paths = HybridConcurrent(axis=1, prefix='')
paths.add(_make_fire_conv(expand1x1_channels, 1))
paths.add(_make_fire_conv(expand3x3_channels, 3, 1))
out.add(paths)
return out
def _make_fire_conv(channels, kernel_size, padding=0):
out = nn.HybridSequential(prefix='')
out.add(nn.Conv2D(channels, kernel_size, padding=padding))
out.add(nn.Activation('relu'))
return out
class _SqueezeNet(nn.HybridBlock):
def __init__(self, version, classes=1000, **kwargs):
super(_SqueezeNet, self).__init__(**kwargs)
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
if version == '1.0':
self.features.add(nn.Conv2D(96, kernel_size=7, strides=2))
self.features.add(nn.Activation('relu'))
self.features.add(nn.MaxPool2D(pool_size=2, strides=2, ceil_mode=True))
self.features.add(_make_fire(16, 64, 64))
self.features.add(_make_fire(16, 64, 64))
self.features.add(_make_fire(32, 128, 128))
self.features.add(nn.MaxPool2D(pool_size=2, strides=2, ceil_mode=True))
self.features.add(_make_fire(32, 128, 128))
self.features.add(_make_fire(48, 192, 192))
self.features.add(_make_fire(48, 192, 192))
self.features.add(_make_fire(64, 256, 256))
self.features.add(nn.MaxPool2D(pool_size=2, strides=2, ceil_mode=True))
self.features.add(_make_fire(64, 256, 256))
else:
self.features.add(nn.Conv2D(64, kernel_size=3, strides=2))
self.features.add(nn.Activation('relu'))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2, ceil_mode=True))
self.features.add(_make_fire(16, 64, 64))
self.features.add(_make_fire(16, 64, 64))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2, ceil_mode=True))
self.features.add(_make_fire(32, 128, 128))
self.features.add(_make_fire(32, 128, 128))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2, ceil_mode=True))
self.features.add(_make_fire(48, 192, 192))
self.features.add(_make_fire(48, 192, 192))
self.features.add(_make_fire(64, 256, 256))
self.features.add(_make_fire(64, 256, 256))
self.features.add(nn.Dropout(0.5))
self.output = nn.HybridSequential(prefix='')
self.output.add(nn.Conv2D(classes, kernel_size=1))
self.output.add(nn.Activation('relu'))
self.output.add(nn.GlobalAvgPool2D())  # pool the per-class maps down to (N, classes, 1, 1) before flattening
self.output.add(nn.Flatten())
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
def get_squeezenet(version,class_num, **kwargs):
net = _SqueezeNet(version,classes = class_num, **kwargs)
return net
Code implementation (gluon):
class _NinNet(nn.HybridBlock):
def __init__(self, classes=1000, **kwargs):
super(_NinNet, self).__init__(**kwargs)
with self.name_scope():
self.features = nn.HybridSequential(prefix='')
with self.features.name_scope():
self.features.add(nn.Conv2D(96, 11, 4, 0, activation='relu'))
self.features.add(nn.Conv2D(96, kernel_size=1, activation='relu'))
self.features.add(nn.Conv2D(96, kernel_size=1, activation='relu'))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
self.features.add(nn.Conv2D(256, 5, 1, 2, activation='relu'))
self.features.add(nn.Conv2D(256, kernel_size=1, activation='relu'))
self.features.add(nn.Conv2D(256, kernel_size=1, activation='relu'))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
self.features.add(nn.Conv2D(384, 3, 1, 1, activation='relu'))
self.features.add(nn.Conv2D(384, kernel_size=1, activation='relu'))
self.features.add(nn.Conv2D(384, kernel_size=1, activation='relu'))
self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
self.features.add(nn.Dropout(0.5))
self.features.add(nn.Conv2D(classes, 3, 1, 1, activation='relu'))
self.features.add(nn.Conv2D(classes, kernel_size=1, activation='relu'))
self.features.add(nn.Conv2D(classes, kernel_size=1, activation='relu'))
self.features.add(nn.GlobalAvgPool2D())
self.features.add(nn.Flatten())
self.output = nn.Dense(classes)
def hybrid_forward(self, F, x):
x = self.features(x)
x = self.output(x)
return x
def get_ninnet(class_num):
return _NinNet(classes=class_num)
Code implementation (gluon):
class MaxPoolPad(nn.HybridBlock):
def __init__(self):
super(MaxPoolPad, self).__init__()
self.pool = nn.MaxPool2D(3, strides=2, padding=1)
def hybrid_forward(self, F, x):
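# pad one extra row/column of zeros on the top/left, pool, then slice that row/column off again; this mimics the asymmetric 'SAME' padding of the reference NASNet implementation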
x = F.pad(x, pad_width=(0, 0, 0, 0, 1, 0, 1, 0), mode='constant', constant_value=0)
x = self.pool(x)
x = F.slice(x, begin=(0, 0, 1, 1), end=(None, None, None, None))
return x
class AvgPoolPad(nn.HybridBlock):
def __init__(self, stride=2, padding=1):
super(AvgPoolPad, self).__init__()
self.pool = nn.AvgPool2D(3, strides=stride, padding=padding, count_include_pad=False)
def hybrid_forward(self, F, x):
x = F.pad(x, pad_width=(0, 0, 0, 0, 1, 0, 1, 0), mode='constant', constant_value=0)
x = self.pool(x)
x = F.slice(x, begin=(0, 0, 1, 1), end=(None, None, None, None))
return x
class SeparableConv2d_nasnet(nn.HybridBlock):
def __init__(self, in_channels, channels, dw_kernel, dw_stride, dw_padding, use_bias=False):
super(SeparableConv2d_nasnet, self).__init__()
self.body = nn.HybridSequential(prefix='')
self.body.add(nn.Conv2D(in_channels, kernel_size=dw_kernel, strides=dw_stride, padding=dw_padding, use_bias=use_bias, groups=in_channels))
self.body.add(nn.Conv2D(channels, kernel_size=1, strides=1, use_bias=use_bias))
def hybrid_forward(self, F, x):
x = self.body(x)
return x
class BranchSeparables(nn.HybridBlock):
def __init__(self, in_channels, out_channels, kernel_size, stride, padding, norm_layer, norm_kwargs, use_bias=False):
super(BranchSeparables, self).__init__()
self.body = nn.HybridSequential(prefix='')
self.body.add(nn.Activation('relu'))
self.body.add(SeparableConv2d_nasnet(in_channels, in_channels, kernel_size, stride, padding, use_bias=use_bias))
self.body.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(SeparableConv2d_nasnet(in_channels, out_channels, kernel_size, 1, padding, use_bias=use_bias))
self.body.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
def hybrid_forward(self, F, x):
x = self.body(x)
return x
class BranchSeparablesStem(nn.HybridBlock):
def __init__(self, in_channels, out_channels, kernel_size, stride, padding, norm_layer, norm_kwargs, use_bias=False):
super(BranchSeparablesStem, self).__init__()
self.body = nn.HybridSequential(prefix='')
self.body.add(nn.Activation('relu'))
self.body.add(SeparableConv2d_nasnet(in_channels, out_channels, kernel_size, stride, padding, use_bias=use_bias))
self.body.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(SeparableConv2d_nasnet(out_channels, out_channels, kernel_size, 1, padding, use_bias=use_bias))
self.body.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
def hybrid_forward(self, F, x):
x = self.body(x)
return x
class BranchSeparablesReduction(nn.HybridBlock):
def __init__(self, in_channels, out_channels, kernel_size, stride, padding,
z_padding=1, use_bias=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
super(BranchSeparablesReduction, self).__init__()
self.z_padding = z_padding
self.separable = SeparableConv2d_nasnet(in_channels, in_channels, kernel_size, stride, padding, use_bias=use_bias)
self.body = nn.HybridSequential(prefix='')
self.body.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(SeparableConv2d_nasnet(in_channels, out_channels, kernel_size, 1, padding, use_bias=use_bias))
self.body.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
def hybrid_forward(self, F, x):
x = F.Activation(x, act_type='relu')
x = F.pad(x, pad_width=(0, 0, 0, 0, self.z_padding, 0, self.z_padding, 0), mode='constant', constant_value=0)
x = self.separable(x)
x = F.slice(x, begin=(0, 0, 1, 1), end=(None, None, None, None))
x = self.body(x)
return x
class CellStem0(nn.HybridBlock):
def __init__(self, stem_filters, norm_layer, norm_kwargs, num_filters=42):
super(CellStem0, self).__init__()
self.conv_1x1 = nn.HybridSequential(prefix='')
self.conv_1x1.add(nn.Activation('relu'))
self.conv_1x1.add(nn.Conv2D(num_filters, 1, strides=1, use_bias=False))
self.conv_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.comb_iter_0_left = BranchSeparables(num_filters, num_filters, 5, 2, 2, norm_layer, norm_kwargs)
self.comb_iter_0_right = BranchSeparablesStem(stem_filters, num_filters, 7, 2, 3, norm_layer, norm_kwargs)
self.comb_iter_1_left = nn.MaxPool2D(3, strides=2, padding=1)
self.comb_iter_1_right = BranchSeparablesStem(stem_filters, num_filters, 7, 2, 3, norm_layer, norm_kwargs)
self.comb_iter_2_left = nn.AvgPool2D(3, strides=2, padding=1, count_include_pad=False)
self.comb_iter_2_right = BranchSeparablesStem(stem_filters, num_filters, 5, 2, 2, norm_layer, norm_kwargs)
self.comb_iter_3_right = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_4_left = BranchSeparables(num_filters, num_filters, 3, 1, 1, norm_layer, norm_kwargs)
self.comb_iter_4_right = nn.MaxPool2D(3, strides=2, padding=1)
def hybrid_forward(self, F, x):
x1 = self.conv_1x1(x)
x_comb_iter_0_left = self.comb_iter_0_left(x1)
x_comb_iter_0_right = self.comb_iter_0_right(x)
x_comb_iter_0 = x_comb_iter_0_left + x_comb_iter_0_right
x_comb_iter_1_left = self.comb_iter_1_left(x1)
x_comb_iter_1_right = self.comb_iter_1_right(x)
x_comb_iter_1 = x_comb_iter_1_left + x_comb_iter_1_right
x_comb_iter_2_left = self.comb_iter_2_left(x1)
x_comb_iter_2_right = self.comb_iter_2_right(x)
x_comb_iter_2 = x_comb_iter_2_left + x_comb_iter_2_right
x_comb_iter_3_right = self.comb_iter_3_right(x_comb_iter_0)
x_comb_iter_3 = x_comb_iter_3_right + x_comb_iter_1
x_comb_iter_4_left = self.comb_iter_4_left(x_comb_iter_0)
x_comb_iter_4_right = self.comb_iter_4_right(x1)
x_comb_iter_4 = x_comb_iter_4_left + x_comb_iter_4_right
x_out = F.concat(x_comb_iter_1, x_comb_iter_2, x_comb_iter_3, x_comb_iter_4, dim=1)
return x_out
class CellStem1(nn.HybridBlock):
def __init__(self, num_filters, norm_layer, norm_kwargs):
super(CellStem1, self).__init__()
self.conv_1x1 = nn.HybridSequential(prefix='')
self.conv_1x1.add(nn.Activation('relu'))
self.conv_1x1.add(nn.Conv2D(num_filters, 1, strides=1, use_bias=False))
self.conv_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.path_1 = nn.HybridSequential(prefix='')
self.path_1.add(nn.AvgPool2D(1, strides=2, count_include_pad=False))
self.path_1.add(nn.Conv2D(num_filters//2, 1, strides=1, use_bias=False))
self.path_2 = nn.HybridSequential(prefix='')
# No nn.ZeroPad2D in gluon
self.path_2.add(nn.AvgPool2D(1, strides=2, count_include_pad=False))
self.path_2.add(nn.Conv2D(num_filters//2, 1, strides=1, use_bias=False))
self.final_path_bn = norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs))
self.comb_iter_0_left = BranchSeparables(num_filters, num_filters, 5, 2, 2, norm_layer, norm_kwargs)
self.comb_iter_0_right = BranchSeparables(num_filters, num_filters, 7, 2, 3, norm_layer, norm_kwargs)
self.comb_iter_1_left = nn.MaxPool2D(3, strides=2, padding=1)
self.comb_iter_1_right = BranchSeparables(num_filters, num_filters, 7, 2, 3, norm_layer, norm_kwargs)
self.comb_iter_2_left = nn.AvgPool2D(3, strides=2, padding=1, count_include_pad=False)
self.comb_iter_2_right = BranchSeparables(num_filters, num_filters, 5, 2, 2, norm_layer, norm_kwargs)
self.comb_iter_3_right = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_4_left = BranchSeparables(num_filters, num_filters, 3, 1, 1, norm_layer, norm_kwargs)
self.comb_iter_4_right = nn.MaxPool2D(3, strides=2, padding=1)
def hybrid_forward(self, F, x_conv0, x_stem_0):
x_left = self.conv_1x1(x_stem_0)
x_relu = F.Activation(x_conv0, act_type='relu')
x_path1 = self.path_1(x_relu)
x_path2 = F.pad(x_relu, pad_width=(0, 0, 0, 0, 0, 1, 0, 1), mode='constant', constant_value=0)
x_path2 = F.slice(x_path2, begin=(0, 0, 1, 1), end=(None, None, None, None))
x_path2 = self.path_2(x_path2)
x_right = self.final_path_bn(F.concat(x_path1, x_path2, dim=1))
x_comb_iter_0_left = self.comb_iter_0_left(x_left)
x_comb_iter_0_right = self.comb_iter_0_right(x_right)
x_comb_iter_0 = x_comb_iter_0_left + x_comb_iter_0_right
x_comb_iter_1_left = self.comb_iter_1_left(x_left)
x_comb_iter_1_right = self.comb_iter_1_right(x_right)
x_comb_iter_1 = x_comb_iter_1_left + x_comb_iter_1_right
x_comb_iter_2_left = self.comb_iter_2_left(x_left)
x_comb_iter_2_right = self.comb_iter_2_right(x_right)
x_comb_iter_2 = x_comb_iter_2_left + x_comb_iter_2_right
x_comb_iter_3_right = self.comb_iter_3_right(x_comb_iter_0)
x_comb_iter_3 = x_comb_iter_3_right + x_comb_iter_1
x_comb_iter_4_left = self.comb_iter_4_left(x_comb_iter_0)
x_comb_iter_4_right = self.comb_iter_4_right(x_left)
x_comb_iter_4 = x_comb_iter_4_left + x_comb_iter_4_right
x_out = F.concat(x_comb_iter_1, x_comb_iter_2, x_comb_iter_3, x_comb_iter_4, dim=1)
return x_out
class FirstCell(nn.HybridBlock):
def __init__(self, out_channels_left, out_channels_right, norm_layer, norm_kwargs):
super(FirstCell, self).__init__()
self.conv_1x1 = nn.HybridSequential(prefix='')
self.conv_1x1.add(nn.Activation('relu'))
self.conv_1x1.add(nn.Conv2D(out_channels_right, 1, strides=1, use_bias=False))
self.conv_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.path_1 = nn.HybridSequential(prefix='')
self.path_1.add(nn.AvgPool2D(1, strides=2, count_include_pad=False))
self.path_1.add(nn.Conv2D(out_channels_left, 1, strides=1, use_bias=False))
self.path_2 = nn.HybridSequential(prefix='')
# No nn.ZeroPad2D in gluon
self.path_2.add(nn.AvgPool2D(1, strides=2, count_include_pad=False))
self.path_2.add(nn.Conv2D(out_channels_left, 1, strides=1, use_bias=False))
self.final_path_bn = norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs))
self.comb_iter_0_left = BranchSeparables(out_channels_right, out_channels_right, 5, 1, 2, norm_layer, norm_kwargs)
self.comb_iter_0_right = BranchSeparables(out_channels_right, out_channels_right, 3, 1, 1, norm_layer, norm_kwargs)
self.comb_iter_1_left = BranchSeparables(out_channels_right, out_channels_right, 5, 1, 2, norm_layer, norm_kwargs)
self.comb_iter_1_right = BranchSeparables(out_channels_right, out_channels_right, 3, 1, 1, norm_layer, norm_kwargs)
self.comb_iter_2_left = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_3_left = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_3_right = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_4_left = BranchSeparables(out_channels_right, out_channels_right, 3, 1, 1, norm_layer, norm_kwargs)
def hybrid_forward(self, F, x, x_prev):
x_relu = F.Activation(x_prev, act_type='relu')
x_path1 = self.path_1(x_relu)
x_path2 = F.pad(x_relu, pad_width=(0, 0, 0, 0, 0, 1, 0, 1), mode='constant', constant_value=0)
x_path2 = F.slice(x_path2, begin=(0, 0, 1, 1), end=(None, None, None, None))
x_path2 = self.path_2(x_path2)
x_left = self.final_path_bn(F.concat(x_path1, x_path2, dim=1))
x_right = self.conv_1x1(x)
x_comb_iter_0_left = self.comb_iter_0_left(x_right)
x_comb_iter_0_right = self.comb_iter_0_right(x_left)
x_comb_iter_0 = x_comb_iter_0_left + x_comb_iter_0_right
x_comb_iter_1_left = self.comb_iter_1_left(x_left)
x_comb_iter_1_right = self.comb_iter_1_right(x_left)
x_comb_iter_1 = x_comb_iter_1_left + x_comb_iter_1_right
x_comb_iter_2_left = self.comb_iter_2_left(x_right)
x_comb_iter_2 = x_comb_iter_2_left + x_left
x_comb_iter_3_left = self.comb_iter_3_left(x_left)
x_comb_iter_3_right = self.comb_iter_3_right(x_left)
x_comb_iter_3 = x_comb_iter_3_left + x_comb_iter_3_right
x_comb_iter_4_left = self.comb_iter_4_left(x_right)
x_comb_iter_4 = x_comb_iter_4_left + x_right
x_out = F.concat(x_left, x_comb_iter_0, x_comb_iter_1, x_comb_iter_2,
x_comb_iter_3, x_comb_iter_4, dim=1)
return x_out, x
class NormalCell(nn.HybridBlock):
def __init__(self, out_channels_left, out_channels_right, norm_layer, norm_kwargs):
super(NormalCell, self).__init__()
self.conv_prev_1x1 = nn.HybridSequential(prefix='')
self.conv_prev_1x1.add(nn.Activation('relu'))
self.conv_prev_1x1.add(nn.Conv2D(out_channels_left, 1, strides=1, use_bias=False))
self.conv_prev_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.conv_1x1 = nn.HybridSequential(prefix='')
self.conv_1x1.add(nn.Activation('relu'))
self.conv_1x1.add(nn.Conv2D(out_channels_right, 1, strides=1, use_bias=False))
self.conv_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.comb_iter_0_left = BranchSeparables(out_channels_right, out_channels_right, 5, 1, 2, norm_layer, norm_kwargs)
self.comb_iter_0_right = BranchSeparables(out_channels_left, out_channels_left, 3, 1, 1, norm_layer, norm_kwargs)
self.comb_iter_1_left = BranchSeparables(out_channels_left, out_channels_left, 5, 1, 2, norm_layer, norm_kwargs)
self.comb_iter_1_right = BranchSeparables(out_channels_left, out_channels_left, 3, 1, 1, norm_layer, norm_kwargs)
self.comb_iter_2_left = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_3_left = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_3_right = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_4_left = BranchSeparables(out_channels_right, out_channels_right, 3, 1, 1, norm_layer, norm_kwargs)
def hybrid_forward(self, F, x, x_prev):
x_left = self.conv_prev_1x1(x_prev)
x_right = self.conv_1x1(x)
x_comb_iter_0_left = self.comb_iter_0_left(x_right)
x_comb_iter_0_right = self.comb_iter_0_right(x_left)
x_comb_iter_0 = x_comb_iter_0_left + x_comb_iter_0_right
x_comb_iter_1_left = self.comb_iter_1_left(x_left)
x_comb_iter_1_right = self.comb_iter_1_right(x_left)
x_comb_iter_1 = x_comb_iter_1_left + x_comb_iter_1_right
x_comb_iter_2_left = self.comb_iter_2_left(x_right)
x_comb_iter_2 = x_comb_iter_2_left + x_left
x_comb_iter_3_left = self.comb_iter_3_left(x_left)
x_comb_iter_3_right = self.comb_iter_3_right(x_left)
x_comb_iter_3 = x_comb_iter_3_left + x_comb_iter_3_right
x_comb_iter_4_left = self.comb_iter_4_left(x_right)
x_comb_iter_4 = x_comb_iter_4_left + x_right
x_out = F.concat(x_left, x_comb_iter_0, x_comb_iter_1, x_comb_iter_2, x_comb_iter_3, x_comb_iter_4, dim=1)
return x_out, x
class ReductionCell0(nn.HybridBlock):
def __init__(self, out_channels_left, out_channels_right, norm_layer, norm_kwargs):
super(ReductionCell0, self).__init__()
self.conv_prev_1x1 = nn.HybridSequential(prefix='')
self.conv_prev_1x1.add(nn.Activation('relu'))
self.conv_prev_1x1.add(nn.Conv2D(out_channels_left, 1, strides=1, use_bias=False))
self.conv_prev_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.conv_1x1 = nn.HybridSequential(prefix='')
self.conv_1x1.add(nn.Activation('relu'))
self.conv_1x1.add(nn.Conv2D(out_channels_right, 1, strides=1, use_bias=False))
self.conv_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.comb_iter_0_left = BranchSeparablesReduction(out_channels_right, out_channels_right, 5, 2, 2, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.comb_iter_0_right = BranchSeparablesReduction(out_channels_right, out_channels_right, 7, 2, 3, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.comb_iter_1_left = MaxPoolPad()
self.comb_iter_1_right = BranchSeparablesReduction(out_channels_right, out_channels_right, 7, 2, 3, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.comb_iter_2_left = AvgPoolPad()
self.comb_iter_2_right = BranchSeparablesReduction(out_channels_right, out_channels_right, 5, 2, 2, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.comb_iter_3_right = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_4_left = BranchSeparablesReduction(out_channels_right, out_channels_right, 3, 1, 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.comb_iter_4_right = MaxPoolPad()
def hybrid_forward(self, F, x, x_prev):
x_left = self.conv_prev_1x1(x_prev)
x_right = self.conv_1x1(x)
x_comb_iter_0_left = self.comb_iter_0_left(x_right)
x_comb_iter_0_right = self.comb_iter_0_right(x_left)
x_comb_iter_0 = x_comb_iter_0_left + x_comb_iter_0_right
x_comb_iter_1_left = self.comb_iter_1_left(x_right)
x_comb_iter_1_right = self.comb_iter_1_right(x_left)
x_comb_iter_1 = x_comb_iter_1_left + x_comb_iter_1_right
x_comb_iter_2_left = self.comb_iter_2_left(x_right)
x_comb_iter_2_right = self.comb_iter_2_right(x_left)
x_comb_iter_2 = x_comb_iter_2_left + x_comb_iter_2_right
x_comb_iter_3_right = self.comb_iter_3_right(x_comb_iter_0)
x_comb_iter_3 = x_comb_iter_3_right + x_comb_iter_1
x_comb_iter_4_left = self.comb_iter_4_left(x_comb_iter_0)
x_comb_iter_4_right = self.comb_iter_4_right(x_right)
x_comb_iter_4 = x_comb_iter_4_left + x_comb_iter_4_right
x_out = F.concat(x_comb_iter_1, x_comb_iter_2, x_comb_iter_3, x_comb_iter_4, dim=1)
return x_out, x
class ReductionCell1(nn.HybridBlock):
def __init__(self, out_channels_left, out_channels_right, norm_layer, norm_kwargs):
super(ReductionCell1, self).__init__()
self.conv_prev_1x1 = nn.HybridSequential(prefix='')
self.conv_prev_1x1.add(nn.Activation('relu'))
self.conv_prev_1x1.add(nn.Conv2D(out_channels_left, 1, strides=1, use_bias=False))
self.conv_prev_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.conv_1x1 = nn.HybridSequential(prefix='')
self.conv_1x1.add(nn.Activation('relu'))
self.conv_1x1.add(nn.Conv2D(out_channels_right, 1, strides=1, use_bias=False))
self.conv_1x1.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.comb_iter_0_left = BranchSeparables(out_channels_right, out_channels_right, 5, 2, 2, norm_layer, norm_kwargs)
self.comb_iter_0_right = BranchSeparables(out_channels_right, out_channels_right, 7, 2, 3, norm_layer, norm_kwargs)
self.comb_iter_1_left = nn.MaxPool2D(3, strides=2, padding=1)
self.comb_iter_1_right = BranchSeparables(out_channels_right, out_channels_right, 7, 2, 3, norm_layer, norm_kwargs)
self.comb_iter_2_left = nn.AvgPool2D(3, strides=2, padding=1, count_include_pad=False)
self.comb_iter_2_right = BranchSeparables(out_channels_right, out_channels_right, 5, 2, 2, norm_layer, norm_kwargs)
self.comb_iter_3_right = nn.AvgPool2D(3, strides=1, padding=1, count_include_pad=False)
self.comb_iter_4_left = BranchSeparables(out_channels_right, out_channels_right, 3, 1, 1, norm_layer, norm_kwargs)
self.comb_iter_4_right = nn.MaxPool2D(3, strides=2, padding=1)
def hybrid_forward(self, F, x, x_prev):
x_left = self.conv_prev_1x1(x_prev)
x_right = self.conv_1x1(x)
x_comb_iter_0_left = self.comb_iter_0_left(x_right)
x_comb_iter_0_right = self.comb_iter_0_right(x_left)
x_comb_iter_0 = x_comb_iter_0_left + x_comb_iter_0_right
x_comb_iter_1_left = self.comb_iter_1_left(x_right)
x_comb_iter_1_right = self.comb_iter_1_right(x_left)
x_comb_iter_1 = x_comb_iter_1_left + x_comb_iter_1_right
x_comb_iter_2_left = self.comb_iter_2_left(x_right)
x_comb_iter_2_right = self.comb_iter_2_right(x_left)
x_comb_iter_2 = x_comb_iter_2_left + x_comb_iter_2_right
x_comb_iter_3_right = self.comb_iter_3_right(x_comb_iter_0)
x_comb_iter_3 = x_comb_iter_3_right + x_comb_iter_1
x_comb_iter_4_left = self.comb_iter_4_left(x_comb_iter_0)
x_comb_iter_4_right = self.comb_iter_4_right(x_right)
x_comb_iter_4 = x_comb_iter_4_left + x_comb_iter_4_right
x_out = F.concat(x_comb_iter_1, x_comb_iter_2, x_comb_iter_3, x_comb_iter_4, dim=1)
return x_out, x
class NASNetALarge(nn.HybridBlock):
def __init__(self, repeat=6, penultimate_filters=4032, stem_filters=96, filters_multiplier=2, classes=1000, use_aux=False, norm_layer=nn.BatchNorm, norm_kwargs=None):
super(NASNetALarge, self).__init__()
filters = penultimate_filters // 24
self.conv0 = nn.HybridSequential(prefix='')
self.conv0.add(nn.Conv2D(stem_filters, 3, padding=0, strides=2, use_bias=False))
self.conv0.add(norm_layer(momentum=0.1, epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.cell_stem_0 = CellStem0(stem_filters, norm_layer, norm_kwargs, num_filters=filters // (filters_multiplier ** 2))
self.cell_stem_1 = CellStem1(filters // filters_multiplier, norm_layer, norm_kwargs)
self.norm_1 = nn.HybridSequential(prefix='')
self.norm_1.add(FirstCell(out_channels_left=filters//2, out_channels_right=filters, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
for _ in range(repeat - 1):
self.norm_1.add(NormalCell(out_channels_left=filters, out_channels_right=filters, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.reduction_cell_0 = ReductionCell0(out_channels_left=2*filters, out_channels_right=2*filters, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.norm_2 = nn.HybridSequential(prefix='')
self.norm_2.add(FirstCell(out_channels_left=filters, out_channels_right=2*filters, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
for _ in range(repeat - 1):
self.norm_2.add(NormalCell(out_channels_left=2*filters, out_channels_right=2*filters, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
if use_aux:
self.out_aux = nn.HybridSequential(prefix='')
self.out_aux.add(nn.Conv2D(filters // 3, kernel_size=1, use_bias=False))
self.out_aux.add(norm_layer(epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.out_aux.add(nn.Activation('relu'))
self.out_aux.add(nn.Conv2D(2*filters, kernel_size=5, use_bias=False))
self.out_aux.add(norm_layer(epsilon=0.001, **({} if norm_kwargs is None else norm_kwargs)))
self.out_aux.add(nn.Activation('relu'))
self.out_aux.add(nn.Dense(classes))
else:
self.out_aux = None
self.reduction_cell_1 = ReductionCell1(out_channels_left=4*filters, out_channels_right=4*filters, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.norm_3 = nn.HybridSequential(prefix='')
self.norm_3.add(FirstCell(out_channels_left=2*filters, out_channels_right=4*filters, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
for _ in range(repeat - 1):
self.norm_3.add(NormalCell(out_channels_left=4*filters, out_channels_right=4*filters, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
self.out = nn.HybridSequential(prefix='')
self.out.add(nn.Activation('relu'))
self.out.add(nn.GlobalAvgPool2D())
self.out.add(nn.Dropout(0.5))
self.out.add(nn.Dense(classes))
def hybrid_forward(self, F, x):
x_conv0 = self.conv0(x)
x_stem_0 = self.cell_stem_0(x_conv0)
x_stem_1 = self.cell_stem_1(x_conv0, x_stem_0)
x = x_stem_1
x_prev = x_stem_0
for cell in self.norm_1._children.values():
x, x_prev = cell(x, x_prev)
x, x_prev = self.reduction_cell_0(x, x_prev)
for cell in self.norm_2._children.values():
x, x_prev = cell(x, x_prev)
if self.out_aux:
x_aux = F.contrib.AdaptiveAvgPooling2D(x, output_size=5)
x_aux = self.out_aux(x_aux)
x, x_prev = self.reduction_cell_1(x, x_prev)
for cell in self.norm_3._children.values():
x, x_prev = cell(x, x_prev)
x = self.out(x)
if self.out_aux:
return x, x_aux
else:
return x
def get_nasnet(repeat=6, penultimate_filters=4032, class_num=1000, **kwargs):
net = NASNetALarge(repeat=repeat, penultimate_filters=penultimate_filters, classes=class_num, **kwargs)
return net
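A quick, assumed sanity check for the factory above (the light repeat=2 setting, the Xavier initializer and the 331x331 dummy input are illustrative choices, not values taken from this post):
import mxnet as mx
from mxnet import nd

net = get_nasnet(repeat=2, penultimate_filters=4032, class_num=14)  # repeat=2 only to keep the check fast
net.initialize(mx.init.Xavier())
dummy = nd.random.uniform(shape=(1, 3, 331, 331))  # NASNet-A Large is usually fed 331x331 crops
print(net(dummy).shape)  # expected: (1, 14)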
HRNet code implementation (gluon):
def _conv3x3(channels, stride, in_channels):
return nn.Conv2D(channels, kernel_size=3, strides=stride, padding=1, use_bias=False, in_channels=in_channels)
class BasicBlockV1(nn.HybridBlock):
def __init__(self, channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(BasicBlockV1, self).__init__(**kwargs)
self.body = nn.HybridSequential(prefix='')
self.body.add(_conv3x3(channels, stride, in_channels))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(_conv3x3(channels, 1, channels))
if not last_gamma:
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.body.add(norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs)))
if use_se:
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Dense(channels // 16, use_bias=False))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Dense(channels, use_bias=False))
self.se.add(nn.Activation('sigmoid'))
else:
self.se = None
if downsample:
self.downsample = nn.HybridSequential(prefix='')
self.downsample.add(nn.Conv2D(channels, kernel_size=1, strides=stride, use_bias=False, in_channels=in_channels))
self.downsample.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.body(x)
if self.se:
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w.expand_dims(axis=2).expand_dims(axis=2))
if self.downsample:
residual = self.downsample(residual)
x = F.Activation(residual+x, act_type='relu')
return x
class BottleneckV1(nn.HybridBlock):
def __init__(self, channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(BottleneckV1, self).__init__(**kwargs)
self.body = nn.HybridSequential(prefix='')
self.body.add(nn.Conv2D(channels//4, kernel_size=1, strides=stride))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(_conv3x3(channels//4, 1, channels//4))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(nn.Conv2D(channels, kernel_size=1, strides=1))
if use_se:
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Dense(channels // 16, use_bias=False))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Dense(channels, use_bias=False))
self.se.add(nn.Activation('sigmoid'))
else:
self.se = None
if not last_gamma:
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.body.add(norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs)))
if downsample:
self.downsample = nn.HybridSequential(prefix='')
self.downsample.add(nn.Conv2D(channels, kernel_size=1, strides=stride, use_bias=False, in_channels=in_channels))
self.downsample.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.body(x)
if self.se:
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w.expand_dims(axis=2).expand_dims(axis=2))
if self.downsample:
residual = self.downsample(residual)
x = F.Activation(x + residual, act_type='relu')
return x
class HRBasicBlock(BasicBlockV1):
expansion = 1
class HRBottleneck(BottleneckV1):
expansion = 4
class OrigHRBottleneck(nn.HybridBlock):
expansion = 4
def __init__(self, channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(OrigHRBottleneck, self).__init__(**kwargs)
self.body = nn.HybridSequential(prefix='')
# add use_bias=False here to match with the original implementation
self.body.add(nn.Conv2D(channels//4, kernel_size=1, strides=1, use_bias=False))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
self.body.add(_conv3x3(channels//4, stride, channels//4))
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
self.body.add(nn.Activation('relu'))
# add use_bias=False here to match with the original implementation
self.body.add(nn.Conv2D(channels, kernel_size=1, strides=1, use_bias=False))
if use_se:
self.se = nn.HybridSequential(prefix='')
self.se.add(nn.Dense(channels // 16, use_bias=False))
self.se.add(nn.Activation('relu'))
self.se.add(nn.Dense(channels, use_bias=False))
self.se.add(nn.Activation('sigmoid'))
else:
self.se = None
if not last_gamma:
self.body.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.body.add(norm_layer(gamma_initializer='zeros', **({} if norm_kwargs is None else norm_kwargs)))
if downsample:
self.downsample = nn.HybridSequential(prefix='')
self.downsample.add(nn.Conv2D(channels, kernel_size=1, strides=stride, use_bias=False, in_channels=in_channels))
self.downsample.add(norm_layer(**({} if norm_kwargs is None else norm_kwargs)))
else:
self.downsample = None
def hybrid_forward(self, F, x):
residual = x
x = self.body(x)
if self.se:
w = F.contrib.AdaptiveAvgPooling2D(x, output_size=1)
w = self.se(w)
x = F.broadcast_mul(x, w.expand_dims(axis=2).expand_dims(axis=2))
if self.downsample:
residual = self.downsample(residual)
x = F.Activation(x + residual, act_type='relu')
return x
class HighResolutionModule(nn.HybridBlock):
def __init__(self, num_branches, blocks, num_blocks, num_channels, fuse_method, num_inchannels=None, multi_scale_output=True, interp_type='nearest', norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(HighResolutionModule, self).__init__(**kwargs)
if num_inchannels is not None:
self.num_inchannels = num_inchannels
else:
self.num_inchannels = num_channels
self.fuse_method = fuse_method
self.num_branches = num_branches
self.multi_scale_output = multi_scale_output
self.interp_type = interp_type
self.branches = self._make_branches(
num_branches, blocks, num_blocks, num_channels)
self.fuse_layers = self._make_fuse_layers(norm_layer=norm_layer, norm_kwargs=norm_kwargs)
def _make_one_branch(self, branch_index, block, num_blocks, num_channels, stride=1):
downsample = stride != 1 or self.num_inchannels[branch_index] != num_channels[branch_index] * block.expansion
layers = nn.HybridSequential()
layers.add(block(num_channels[branch_index]* block.expansion, stride, downsample, self.num_inchannels[branch_index]))
self.num_inchannels[branch_index] = num_channels[branch_index] * block.expansion
for i in range(1, num_blocks[branch_index]):
layers.add(block(num_channels[branch_index]* block.expansion, 1, False, self.num_inchannels[branch_index]))
return layers
def _make_branches(self, num_branches, block, num_blocks, num_channels):
branches = nn.HybridSequential()
for i in range(num_branches):
branches.add(
self._make_one_branch(i, block, num_blocks, num_channels)
)
return branches
def _make_fuse_layers(self, norm_layer=nn.BatchNorm, norm_kwargs=None):
if self.num_branches == 1:
return None
num_branches = self.num_branches
num_inchannels = self.num_inchannels
fuse_layers = nn.HybridSequential()
for i in range(num_branches if self.multi_scale_output else 1):
fuse_layer = nn.HybridSequential()
for j in range(num_branches):
if j > i:
seq = nn.HybridSequential()
seq.add(
nn.Conv2D(num_inchannels[i], 1, 1, 0, use_bias=False),
norm_layer(**({} if norm_kwargs is None else norm_kwargs))
)
fuse_layer.add(seq)
elif j == i:
fuse_layer.add(contrib.nn.Identity())
else:
conv3x3s = nn.HybridSequential()
for k in range(i-j):
if k == i - j - 1:
num_outchannels_conv3x3 = num_inchannels[i]
conv3x3s.add(
nn.Conv2D(num_outchannels_conv3x3, 3, 2, 1, use_bias=False),
norm_layer(**({} if norm_kwargs is None else norm_kwargs))
)
else:
num_outchannels_conv3x3 = num_inchannels[j]
conv3x3s.add(
nn.Conv2D(num_outchannels_conv3x3, 3, 2, 1, use_bias=False),
norm_layer(**({} if norm_kwargs is None else norm_kwargs)),
nn.Activation('relu')
)
fuse_layer.add(conv3x3s)
fuse_layers.add(fuse_layer)
return fuse_layers
def get_num_inchannels(self):
return self.num_inchannels
def hybrid_forward(self, F, x, *args, **kwargs):
x = self.branches[0](x)
if self.num_branches == 1:
return [x]
X = []
X.append(x)
for i in range(1, self.num_branches):
X.append(self.branches[i](args[i-1]))
x_fuse = []
for i in range(len(self.fuse_layers)):
y = X[0] if i == 0 else self.fuse_layers[i][0](X[0])
for j in range(1, self.num_branches):
if j > i:
if self.interp_type == 'nearest':
y = F.broadcast_add(y, F.UpSampling(
self.fuse_layers[i][j](X[j]),
scale=2**(j-i),
sample_type='nearest'))
elif self.interp_type == 'bilinear':
y = F.broadcast_add(y, F.contrib.BilinearResize2D(
self.fuse_layers[i][j](X[j]),
scale_height=2**(j-i),
scale_width=2**(j-i),
align_corners=False
))
elif self.interp_type == 'bilinear_like':
y = F.broadcast_add(y, F.contrib.BilinearResize2D(
self.fuse_layers[i][j](X[j]),
like=X[i],
mode='like',
align_corners=False
))
else:
raise NotImplementedError
else:
y = y + self.fuse_layers[i][j](X[j])
x_fuse.append(F.relu(y))
return x_fuse
BLOCKS_DICT = {
'BASIC': HRBasicBlock,
'BOTTLENECK': OrigHRBottleneck
}
class HighResolutionBaseNet(nn.HybridBlock):
def __init__(self, cfg, stage_interp_type='nearest', norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(HighResolutionBaseNet, self).__init__()
self.stage_interp_type = stage_interp_type
self.conv1 = nn.Conv2D(64, kernel_size=3, strides=2, padding=1, use_bias=False)
self.bn1 = norm_layer(**({} if norm_kwargs is None else norm_kwargs))
self.conv2 = nn.Conv2D(64, kernel_size=3, strides=2, padding=1, use_bias=False)
self.bn2 = norm_layer(**({} if norm_kwargs is None else norm_kwargs))
self.stage1_cfg = cfg[0]
num_channels = self.stage1_cfg[3][0]
block = BLOCKS_DICT[self.stage1_cfg[1]]
num_blocks = self.stage1_cfg[2][0]
self.layer1 = self._make_layer(block, num_channels, num_blocks, inplanes=64)
stage1_out_channel = block.expansion*num_channels
self.stage2_cfg = cfg[1]
num_channels = self.stage2_cfg[3]
block = BLOCKS_DICT[self.stage2_cfg[1]]
num_channels = [num_channels[i] * block.expansion for i in range(len(num_channels))]
self.transition1 = self._make_transition_layer([stage1_out_channel], num_channels, norm_layer, norm_kwargs)
self.stage2, pre_stage_channels = self._make_stage(self.stage2_cfg, num_channels)
self.stage3_cfg = cfg[2]
num_channels = self.stage3_cfg[3]
block = BLOCKS_DICT[self.stage3_cfg[1]]
num_channels = [
num_channels[i] * block.expansion for i in range(len(num_channels))]
self.transition2 = self._make_transition_layer(
pre_stage_channels, num_channels, norm_layer, norm_kwargs)
self.stage3, pre_stage_channels = self._make_stage(
self.stage3_cfg, num_channels)
self.stage4_cfg = cfg[3]
num_channels = self.stage4_cfg[3]
block = BLOCKS_DICT[self.stage4_cfg[1]]
num_channels = [
num_channels[i] * block.expansion for i in range(len(num_channels))]
self.transition3 = self._make_transition_layer(
pre_stage_channels, num_channels, norm_layer, norm_kwargs)
self.stage4, pre_stage_channels = self._make_stage(
self.stage4_cfg, num_channels, multi_scale_output=True)
self.pre_stage_channels = pre_stage_channels
def _make_transition_layer(self, num_channels_pre_layer, num_channels_cur_layer, norm_layer=nn.BatchNorm, norm_kwargs=None):
num_branches_cur = len(num_channels_cur_layer)
num_branches_pre = len(num_channels_pre_layer)
transition_layers = nn.HybridSequential()
for i in range(num_branches_cur):
if i < num_branches_pre:
if num_channels_cur_layer[i] != num_channels_pre_layer[i]:
transition_layer = nn.HybridSequential()
transition_layer.add(
nn.Conv2D(num_channels_cur_layer[i], 3, 1, 1, use_bias=False, in_channels=num_channels_pre_layer[i]),
norm_layer(**({} if norm_kwargs is None else norm_kwargs)),
nn.Activation('relu')
)
transition_layers.add(transition_layer)
else:
transition_layers.add(contrib.nn.Identity())
else:
conv3x3s = nn.HybridSequential()
for j in range(i+1-num_branches_pre):
inchannels = num_channels_pre_layer[-1]
outchannels = num_channels_cur_layer[i] if j == i-num_branches_pre else inchannels
cba = nn.HybridSequential()
cba.add(
nn.Conv2D(outchannels, 3, 2, 1, use_bias=False, in_channels=inchannels),
norm_layer(**({} if norm_kwargs is None else norm_kwargs)),
nn.Activation('relu')
)
conv3x3s.add(cba)
transition_layers.add(conv3x3s)
return transition_layers
def _make_layer(self, block, planes, blocks, inplanes=0, stride=1):
downsample = stride != 1 or inplanes != planes * block.expansion
layers = nn.HybridSequential()
layers.add(block(planes* block.expansion, stride, downsample, inplanes))
for i in range(1, blocks):
layers.add(block(planes* block.expansion, 1, False, inplanes))
return layers
def _make_stage(self, layer_config, num_inchannels, multi_scale_output=True):
num_modules = layer_config[0]
num_blocks = layer_config[2]
num_branches = len(num_blocks)
num_channels = layer_config[3]
block = BLOCKS_DICT[layer_config[1]]
fuse_method = layer_config[4]
blocks = nn.HybridSequential()
for i in range(num_modules):
# multi_scale_output only takes effect for the last module of the stage
if not multi_scale_output and i == num_modules - 1:
reset_multi_scale_output = False
else:
reset_multi_scale_output = True
hrm = HighResolutionModule(num_branches, block, num_blocks, num_channels, fuse_method, num_inchannels, reset_multi_scale_output, self.stage_interp_type)
blocks.add(hrm)
num_inchannels = hrm.get_num_inchannels()
return blocks, num_inchannels
def hybrid_forward(self, F, x):
x = self.conv1(x)
x = self.bn1(x)
x = F.relu(x)
x = self.conv2(x)
x = self.bn2(x)
x = F.relu(x)
x = self.layer1(x)
x_list = []
for i in range(len(self.stage2_cfg[2])):
x_list.append(self.transition1[i](x))
y_list = x_list
for s in self.stage2:
y_list = s(*y_list)
x_list = []
for i in range(len(self.stage3_cfg[2])):
if i < len(self.stage2_cfg[2]):
x_list.append(self.transition2[i](y_list[i]))
else:
x_list.append(self.transition2[i](y_list[-1]))
y_list = x_list
for s in self.stage3:
y_list = s(*y_list)
x_list = []
for i in range(len(self.stage4_cfg[2])):
if i < len(self.stage3_cfg[2]):
x_list.append(self.transition3[i](y_list[i]))
else:
x_list.append(self.transition3[i](y_list[-1]))
y_list = x_list
for s in self.stage4:
y_list = s(*y_list)
return y_list
class HighResolutionClsNet(HighResolutionBaseNet):
def __init__(self, config, stage_interp_type='nearest', norm_layer=nn.BatchNorm, norm_kwargs=None,num_classes=1000, **kwargs):
super(HighResolutionClsNet, self).__init__(config, stage_interp_type=stage_interp_type, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
self.incre_blocks, self.downsamp_blocks, self.final_layer = self._make_head(self.pre_stage_channels, norm_layer, norm_kwargs)
self.avg = nn.GlobalAvgPool2D()
self.classifier = nn.Dense(num_classes, in_units=2048)
def hybrid_forward(self, F, x):
y_list = super(HighResolutionClsNet, self).hybrid_forward(F, x)
y = self.incre_blocks[0](y_list[0])
for i in range(len(self.downsamp_blocks)):
y = self.incre_blocks[i+1](y_list[i+1]) + self.downsamp_blocks[i](y)
y = self.final_layer(y)
y = self.avg(y)
y = self.classifier(y)
return y
def _make_head(self, pre_stage_channels, norm_layer=nn.BatchNorm, norm_kwargs=None):
head_block = BLOCKS_DICT['BOTTLENECK']
head_channels = [32, 64, 128, 256]
incre_blocks = nn.HybridSequential()
for i, channels in enumerate(pre_stage_channels):
incre_block = self._make_layer(head_block, head_channels[i], 1, channels, stride=1)
incre_blocks.add(incre_block)
downsamp_blocks = nn.HybridSequential()
for i in range(len(pre_stage_channels)-1):
in_channels = head_channels[i] * head_block.expansion
out_channels = head_channels[i+1] * head_block.expansion
downsamp_block = nn.HybridSequential()
downsamp_block.add(
nn.Conv2D(out_channels, 3, 2, 1, in_channels=in_channels),
norm_layer(**({} if norm_kwargs is None else norm_kwargs)),
nn.Activation('relu')
)
downsamp_blocks.add(downsamp_block)
final_layer = nn.HybridSequential()
final_layer.add(
nn.Conv2D(2048, 1, 1, 0, in_channels=head_channels[3] * head_block.expansion),
norm_layer(**({} if norm_kwargs is None else norm_kwargs)),
nn.Activation('relu')
)
return incre_blocks, downsamp_blocks, final_layer
HRNET_SPEC = {}
HRNET_SPEC['w18'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [4], [64], 'SUM'),
(1, 'BASIC', [4]*2, [18, 36], 'SUM'),
(4, 'BASIC', [4]*3, [18, 36, 72], 'SUM'),
(3, 'BASIC', [4]*4, [18, 36, 72, 144], 'SUM')
]
HRNET_SPEC['w18_small_v1'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [1], [32], 'SUM'),
(1, 'BASIC', [2]*2, [16, 32], 'SUM'),
(1, 'BASIC', [2]*3, [16, 32, 64], 'SUM'),
(1, 'BASIC', [2]*4, [16, 32, 64, 128], 'SUM')
]
HRNET_SPEC['w18_small_v2'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [2], [64], 'SUM'),
(1, 'BASIC', [2]*2, [18, 36], 'SUM'),
(3, 'BASIC', [2]*3, [18, 36, 72], 'SUM'),
(2, 'BASIC', [2]*4, [18, 36, 72, 144], 'SUM')
]
HRNET_SPEC['w30'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [4], [64], 'SUM'),
(1, 'BASIC', [4]*2, [30, 60], 'SUM'),
(4, 'BASIC', [4]*3, [30, 60, 120], 'SUM'),
(3, 'BASIC', [4]*4, [30, 60, 120, 240], 'SUM')
]
HRNET_SPEC['w32'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [4], [64], 'SUM'),
(1, 'BASIC', [4]*2, [32, 64], 'SUM'),
(4, 'BASIC', [4]*3, [32, 64, 128], 'SUM'),
(3, 'BASIC', [4]*4, [32, 64, 128, 256], 'SUM')
]
HRNET_SPEC['w40'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [4], [64], 'SUM'),
(1, 'BASIC', [4]*2, [40, 80], 'SUM'),
(4, 'BASIC', [4]*3, [40, 80, 160], 'SUM'),
(3, 'BASIC', [4]*4, [40, 80, 160, 320], 'SUM')
]
HRNET_SPEC['w44'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [4], [64], 'SUM'),
(1, 'BASIC', [4]*2, [44, 88], 'SUM'),
(4, 'BASIC', [4]*3, [44, 88, 176], 'SUM'),
(3, 'BASIC', [4]*4, [44, 88, 176, 352], 'SUM')
]
HRNET_SPEC['w48'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [4], [64], 'SUM'),
(1, 'BASIC', [4]*2, [48, 96], 'SUM'),
(4, 'BASIC', [4]*3, [48, 96, 192], 'SUM'),
(3, 'BASIC', [4]*4, [48, 96, 192, 384], 'SUM')
]
HRNET_SPEC['w64'] = [
#modules, block_type, blocks, channels, fuse_method
(1, 'BOTTLENECK', [4], [64], 'SUM'),
(1, 'BASIC', [4]*2, [64, 128], 'SUM'),
(4, 'BASIC', [4]*3, [64, 128, 256], 'SUM'),
(3, 'BASIC', [4]*4, [64, 128, 256, 512], 'SUM')
]
def get_hrnet(model_name, num_classes=1000, stage_interp_type='nearest', norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
spec = HRNET_SPEC[model_name]
net = HighResolutionClsNet(spec, stage_interp_type, norm_layer, norm_kwargs, num_classes=num_classes, **kwargs)
return net
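A hedged usage sketch (assumed, not from the source): any key of HRNET_SPEC can be passed as model_name; 'w18_small_v1' and the 224x224 dummy input are just example choices:
import mxnet as mx
from mxnet import nd

net = get_hrnet('w18_small_v1', num_classes=14)
net.initialize(mx.init.Xavier())
dummy = nd.random.uniform(shape=(1, 3, 224, 224))
print(net(dummy).shape)  # expected: (1, 14)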
DLA code implementation (gluon):
def conv3x3_dla(in_planes, out_planes, stride=1):
return nn.Conv2D(channels=out_planes, kernel_size=3, strides=stride, padding=1, use_bias=False, in_channels=in_planes)
class BasicBlock(nn.HybridBlock):
def __init__(self, inplanes, planes, stride=1, dilation=1, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(BasicBlock, self).__init__(**kwargs)
if norm_kwargs is None:
norm_kwargs = {}
with self.name_scope():
self.conv1 = nn.Conv2D(in_channels=inplanes, channels=planes, kernel_size=3, strides=stride, padding=dilation, use_bias=False, dilation=dilation)
self.bn1 = norm_layer(in_channels=planes, **norm_kwargs)
self.relu = nn.Activation('relu')
self.conv2 = nn.Conv2D(in_channels=planes, channels=planes, kernel_size=3, strides=1, padding=dilation, use_bias=False, dilation=dilation)
self.bn2 = norm_layer(in_channels=planes, **norm_kwargs)
self.stride = stride
def hybrid_forward(self, F, x, residual=None):
if residual is None:
residual = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out = out + residual
out = self.relu(out)
return out
class Bottleneck_dla(nn.HybridBlock):
expansion = 2
def __init__(self, inplanes, planes, stride=1, dilation=1, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(Bottleneck_dla, self).__init__(**kwargs)
if norm_kwargs is None:
norm_kwargs = {}
expansion = Bottleneck_dla.expansion
bottle_planes = planes // expansion
with self.name_scope():
self.conv1 = nn.Conv2D(in_channels=inplanes, channels=bottle_planes, kernel_size=1, use_bias=False)
self.bn1 = norm_layer(in_channels=bottle_planes, **norm_kwargs)
self.conv2 = nn.Conv2D(in_channels=bottle_planes, channels=bottle_planes, kernel_size=3, strides=stride, padding=dilation, use_bias=False, dilation=dilation)
self.bn2 = norm_layer(in_channels=bottle_planes, **norm_kwargs)
self.conv3 = nn.Conv2D(in_channels=bottle_planes, channels=planes, kernel_size=1, use_bias=False)
self.bn3 = norm_layer(**norm_kwargs)
self.relu = nn.Activation('relu')
self.stride = stride
def hybrid_forward(self, F, x, residual=None):
if residual is None:
residual = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out = self.relu(out)
out = self.conv3(out)
out = self.bn3(out)
out = out + residual
out = self.relu(out)
return out
class BottleneckX(nn.HybridBlock):
expansion = 2
cardinality = 32
def __init__(self, inplanes, planes, stride=1, dilation=1, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(BottleneckX, self).__init__(**kwargs)
if norm_kwargs is None:
norm_kwargs = {}
cardinality = BottleneckX.cardinality
bottle_planes = planes * cardinality // 32
with self.name_scope():
self.conv1 = nn.Conv2D(in_channels=inplanes, channels=bottle_planes, kernel_size=1, use_bias=False)
self.bn1 = norm_layer(in_channels=bottle_planes, **norm_kwargs)
self.conv2 = nn.Conv2D(in_channels=bottle_planes, channels=bottle_planes, kernel_size=3, strides=stride, padding=dilation, use_bias=False, dilation=dilation, groups=cardinality)
self.bn2 = norm_layer(in_channels=bottle_planes, **norm_kwargs)
self.conv3 = nn.Conv2D(in_channels=bottle_planes, channels=planes, kernel_size=1, use_bias=False)
self.bn3 = norm_layer(**norm_kwargs)
self.relu = nn.Activation('relu')
self.stride = stride
def hybrid_forward(self, F, x, residual=None):
if residual is None:
residual = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out = self.relu(out)
out = self.conv3(out)
out = self.bn3(out)
out = out + residual
out = self.relu(out)
return out
class Root(nn.HybridBlock):
def __init__(self, in_channels, out_channels, kernel_size, residual, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(Root, self).__init__(**kwargs)
if norm_kwargs is None:
norm_kwargs = {}
with self.name_scope():
self.conv = nn.Conv2D(in_channels=in_channels, channels=out_channels, kernel_size=1, strides=1, use_bias=False, padding=(kernel_size - 1) // 2)
self.bn = norm_layer(in_channels=out_channels, **norm_kwargs)
self.relu = nn.Activation('relu')
self.residual = residual
def hybrid_forward(self, F, *x):
children = x
x = self.conv(F.concat(*x, dim=1))
x = self.bn(x)
if self.residual:
x = x + children[0]
x = self.relu(x)
return x
class Tree(nn.HybridBlock):
def __init__(self, levels, block, in_channels, out_channels, stride=1, level_root=False, root_dim=0, root_kernel_size=1, dilation=1, root_residual=False, norm_layer=nn.BatchNorm, norm_kwargs=None, **kwargs):
super(Tree, self).__init__(**kwargs)
if norm_kwargs is None:
norm_kwargs = {}
if root_dim == 0:
root_dim = 2 * out_channels
if level_root:
root_dim = root_dim + in_channels
with self.name_scope():
self.downsample = nn.HybridSequential()
self.project = nn.HybridSequential()
if levels == 1:
self.tree1 = block(in_channels, out_channels, stride, dilation=dilation, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='block_tree_1_')
self.tree2 = block(out_channels, out_channels, 1, dilation=dilation, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='block_tree_2_')
if in_channels != out_channels:
self.project.add(*[
nn.Conv2D(in_channels=in_channels, channels=out_channels, kernel_size=1, strides=1, use_bias=False, prefix='proj_conv0_'),
norm_layer(in_channels=out_channels, prefix='proj_bn0_', **norm_kwargs)])
else:
self.tree1 = Tree(levels - 1, block, in_channels, out_channels, stride, root_dim=0, root_kernel_size=root_kernel_size, dilation=dilation, root_residual=root_residual, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='tree_1_')
self.tree2 = Tree(levels - 1, block, out_channels, out_channels, root_dim=root_dim + out_channels, root_kernel_size=root_kernel_size, dilation=dilation, root_residual=root_residual, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='tree_2_')
if levels == 1:
self.root = Root(root_dim, out_channels, root_kernel_size, root_residual, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='root_')
self.level_root = level_root
self.root_dim = root_dim
self.levels = levels
if stride > 1:
self.downsample.add(nn.MaxPool2D(stride, strides=stride, prefix='maxpool'))
def hybrid_forward(self, F, x, residual=None, children=None):
children = [] if children is None else children
bottom = self.downsample(x)
residual = self.project(bottom)
if self.level_root:
children.append(bottom)
x1 = self.tree1(x, residual)
if self.levels == 1:
x2 = self.tree2(x1)
x = self.root(x2, x1, *children)
else:
children.append(x1)
x = self.tree2(x1, None, children)
return x
class DLA(nn.HybridBlock):
def __init__(self, levels, channels, classes=1000, block=BasicBlock, momentum=0.9, norm_layer=nn.BatchNorm, norm_kwargs=None, residual_root=False, linear_root=False, use_feature=False, **kwargs):
super(DLA, self).__init__(**kwargs)
if norm_kwargs is None:
norm_kwargs = {}
norm_kwargs['momentum'] = momentum
self._use_feature = use_feature
self.channels = channels
self.base_layer = nn.HybridSequential('base')
self.base_layer.add(nn.Conv2D(in_channels=3, channels=channels[0], kernel_size=7, strides=1, padding=3, use_bias=False))
self.base_layer.add(norm_layer(in_channels=channels[0], **norm_kwargs))
self.base_layer.add(nn.Activation('relu'))
self.level0 = self._make_conv_level(channels[0], channels[0], levels[0], norm_layer, norm_kwargs)
self.level1 = self._make_conv_level(channels[0], channels[1], levels[1], norm_layer, norm_kwargs, stride=2)
self.level2 = Tree(levels[2], block, channels[1], channels[2], 2, level_root=False, root_residual=residual_root, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='level2_')
self.level3 = Tree(levels[3], block, channels[2], channels[3], 2, level_root=True, root_residual=residual_root, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='level3_')
self.level4 = Tree(levels[4], block, channels[3], channels[4], 2, level_root=True, root_residual=residual_root, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='level4_')
self.level5 = Tree(levels[5], block, channels[4], channels[5], 2, level_root=True, root_residual=residual_root, norm_layer=norm_layer, norm_kwargs=norm_kwargs, prefix='level5_')
if not self._use_feature:
self.global_avg_pool = nn.GlobalAvgPool2D()
self.fc = nn.Dense(units=classes)
def _make_level(self, block, inplanes, planes, blocks, norm_layer, norm_kwargs, stride=1):
downsample = None
if stride != 1 or inplanes != planes:
downsample = nn.HybridSequential()
downsample.add(*[
nn.MaxPool2D(stride, strides=stride),
nn.Conv2D(channels=planes, in_channels=inplanes, kernel_size=1, strides=1, use_bias=False),
norm_layer(in_channels=planes, **norm_kwargs)])
layers = []
layers.append(block(inplanes, planes, stride, norm_layer=norm_layer, norm_kwargs=norm_kwargs, downsample=downsample))
for _ in range(1, blocks):
layers.append(block(inplanes, planes, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
curr_level = nn.HybridSequential()
curr_level.add(*layers)
return curr_level
def _make_conv_level(self, inplanes, planes, convs, norm_layer, norm_kwargs, stride=1, dilation=1):
modules = []
for i in range(convs):
modules.extend([
nn.Conv2D(in_channels=inplanes, channels=planes, kernel_size=3, strides=stride if i == 0 else 1, padding=dilation, use_bias=False, dilation=dilation),
norm_layer(**norm_kwargs),
nn.Activation('relu')])
inplanes = planes
curr_level = nn.HybridSequential()
curr_level.add(*modules)
return curr_level
def hybrid_forward(self, F, x):
y = []
x = self.base_layer(x)
for i in range(6):
x = getattr(self, 'level{}'.format(i))(x)
if self._use_feature:
y.append(x)
else:
y.append(F.flatten(self.global_avg_pool(x)))
if self._use_feature:
return y
flat = F.concat(*y, dim=1)
out = self.fc(flat)
return out
def get_dla(class_num):
net = DLA(levels=[1, 1, 1, 2, 2, 1], channels=[16, 32, 64, 128, 256, 512], classes=class_num, block=BasicBlock)
return net
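get_dla above wires up a DLA-34-style configuration; a minimal, assumed smoke test:
import mxnet as mx
from mxnet import nd

net = get_dla(class_num=14)
net.initialize(mx.init.Xavier())
dummy = nd.random.uniform(shape=(1, 3, 224, 224))
print(net(dummy).shape)  # expected: (1, 14)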
Because the ImageNet dataset is very large and expensive to train on, this article uses a Chinese chess (Xiangqi) dataset instead. The red and black pieces add up to 14 classes in total, and after preprocessing each piece is cropped out as an individual image, as shown in the figure:
The dataset is organized as one folder per class, as shown in the figure:
For training, simply point the code at this directory; the training and validation sets are produced by randomly splitting the images in these folders according to the given split ratio (a minimal splitting sketch follows).
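The splitting logic itself is not shown in the excerpts below, so here is a hedged sketch of one possible way to do it; random_split, split_ratio, the fixed seed and the DataImage directory name are all illustrative assumptions:
import random
from mxnet import gluon

def random_split(dataset, split_ratio=0.9, seed=0):
    # Shuffle the sample indices with a fixed seed, then cut them at the ratio.
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * split_ratio)
    # SimpleDataset wraps plain lists of (image, label) samples; materialising every
    # sample eagerly like this is only reasonable for small datasets.
    train_set = gluon.data.SimpleDataset([dataset[i] for i in indices[:cut]])
    val_set = gluon.data.SimpleDataset([dataset[i] for i in indices[cut:]])
    return train_set, val_set

# ds = gluon.data.vision.ImageFolderDataset(r"./DataImage", flag=1)
# train_set, val_set = random_split(ds, split_ratio=0.9)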
# Excerpt from the Ctu_Classification constructor: pick the GPU/CPU contexts and record the input size.
self.ctx = [mx.gpu(USE_GPU[i]) for i in range(len(USE_GPU))] if len(USE_GPU) > 0 else [mx.cpu()]
self.image_size = image_size
def get_data_iter(TrainDir, ValDir, image_size, batch_size, num_workers):
ValDir = TrainDir if ValDir is None else ValDir
train_ds = gluon.data.vision.ImageFolderDataset(TrainDir, flag=1)
test_ds = gluon.data.vision.ImageFolderDataset(ValDir, flag=1)
class_names = train_ds.synsets
transform_train = transforms.Compose([
transforms.Resize(image_size),
# transforms.RandomResizedCrop(image_size),
transforms.RandomFlipLeftRight(),
transforms.RandomFlipTopBottom(),
transforms.RandomBrightness(0.5),
transforms.RandomColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
transforms.RandomLighting(alpha=0.1),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
transform_val = transforms.Compose([
transforms.Resize(image_size),
# transforms.CenterCrop(224),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
train_data = gluon.data.DataLoader(train_ds.transform_first(transform_train), batch_size=batch_size, shuffle=True, num_workers=num_workers)
val_data = gluon.data.DataLoader(test_ds.transform_first(transform_val), batch_size=batch_size, shuffle=False, num_workers=num_workers)
return train_data,val_data,class_names
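A hypothetical call to get_data_iter (the directory path below is a placeholder, not a path from the original project):
train_data, val_data, class_names = get_data_iter(
    TrainDir=r"./DataImage", ValDir=None,
    image_size=224, batch_size=4, num_workers=0)
for images, labels in train_data:
    print(images.shape, labels.shape)  # e.g. (4, 3, 224, 224) and (4,)
    break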
# Excerpt, presumably from InitModel: build the selected backbone through the get_classification_model factory.
self.model, self.model_name = get_classification_model(self.ModelType, len(self.class_names), version, num_layers, self.image_size, self.ctx, Pre_Model)
# Excerpt from the train() method: one epoch of the training loop. `trainer`, `metric`,
# the loss function `L` and `train_loss` are defined earlier in that method (not shown here).
with tqdm(total=len(self.train_iter), desc=f'Epoch_Train {epoch + 1}/{trainNum}', postfix=dict, mininterval=0.3) as pbar:
for i, batch in enumerate(self.train_iter):
data = gluon.utils.split_and_load(batch[0], ctx_list=self.ctx, batch_axis=0, even_split=False)
label = gluon.utils.split_and_load(batch[1], ctx_list=self.ctx, batch_axis=0, even_split=False)
with autograd.record():
outputs = [self.model(X) for X in data]
loss = [L(yhat, y) for yhat, y in zip(outputs, label)]
for l in loss:
l.backward()
trainer.step(batch[0].shape[0])
train_loss += sum([l.mean().asscalar() for l in loss]) / len(loss)
metric.update(label, outputs)
pbar.set_postfix(**{'train_loss': train_loss / (i+1),'step/s': time.time() - start_time})
pbar.update(1)
start_time = time.time()
_, train_acc = metric.get()
def predict(self,image):
start_timer = time.time()
img = nd.array(image)
img = self.transform_test(img)
img = img.expand_dims(axis=0)
# img = nd.transpose(img, (2, 0, 1))
img = img.as_in_context(self.ctx[0])
out = self.model(img)
preds = []
preds.extend(out.argmax(axis=1).astype(int).asnumpy())
out = nd.SoftmaxActivation(out)
out = out.asnumpy()[0]
return out[preds[0]],self.class_names[str(preds[0])],time.time()-start_timer
if __name__ == '__main__':
RunModel = False
if RunModel:
ctu = Ctu_Classification(USE_GPU=[0], image_size=224)
ctu.InitModel(r"E:\Ctu\Ctu_Project\WZ_DL\DataSet\DataSet_Classification_Chess\DataImage", batch_size=4, Pre_Model=None, ModelType=16, version=None, num_layers=0.25)
ctu.train(trainNum=500, learning_rate=0.001, optim=0, ModelPath="./Classification_Model",StopNum=30)
del ctu
else:
ctu = Ctu_Classification(USE_GPU=[0])
ctu.LoadModel(ModelPath="./Classification_Model_mobilenetv1_0.25")
cv2.namedWindow("origin", 0)
cv2.resizeWindow("origin", 640, 480)
for root, dirs, files in os.walk(r"E:\Ctu\Ctu_Project\WZ_DL\DataSet\DataSet_Classification_Chess\test"):
for f in files:
img_cv = cv2_imread(os.path.join(root, f), is_color=True, model=True)
res = ctu.predict(img_cv)
print(f, res)
cv2.imshow("origin", img_cv)
cv2.waitKey()
del ctu