PyTorch CNN: Implementing Classic Convolutional Neural Networks (LeNet, AlexNet, VGG, NIN, GoogLeNet, ResNet) and Understanding the Classic CNN Architectures from Code

Table of Contents

Classic CNN algorithms: papers and code implementations

1. LeNet-5

2. AlexNet

3. VGGNet

4. NIN

5. GoogLeNet/Inception

6. ResNet

Classic lightweight CNN architectures and their evolution

1. SqueezeNet

2. MobileNet

3. ShuffleNet

Full combined implementation

1. Testing each architecture on data


 

Classic CNN algorithms: papers and code implementations

1. LeNet-5

Related post: a detailed guide to the LeNet-5 algorithm: introduction (paper), architecture details, and example applications

#1998, LeNet-5: one of the earliest convolutional neural networks
def LeNet():
    """
        Two convolutional layers (each followed by a pooling layer) and two fully connected layers.
        In the paper, C5 is drawn as a convolutional layer but effectively acts as a fully connected
        layer, so the structure amounts to 2 conv layers + 3 fully connected layers (F6 among them).
    """
    net = nn.Sequential()
    with net.name_scope():
        net.add(
            nn.Conv2D(channels=20, kernel_size=5, activation='relu'),
            nn.MaxPool2D(pool_size=2, strides=2),
            nn.Conv2D(channels=50, kernel_size=3, activation='relu'),
            nn.MaxPool2D(pool_size=2, strides=2),
            #------------------------------------- two fully connected layers
            nn.Flatten(),
            nn.Dense(128, activation='relu'),
            nn.Dense(10)
        )
    return net
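
The listing above uses the MXNet Gluon API (the imports appear in the combined script at the end of this post). For comparison with the article's PyTorch theme, here is a minimal PyTorch sketch of the same layer layout; the 1x28x28 input size and the resulting 50*5*5 flatten size are assumptions (un-resized Fashion-MNIST), since PyTorch needs the input channels and flattened dimension spelled out explicitly:

import torch
import torch.nn as nn

# assumed input: 1x28x28 grayscale images, 10 classes
lenet = nn.Sequential(
    nn.Conv2d(1, 20, kernel_size=5), nn.ReLU(),    # 28 -> 24
    nn.MaxPool2d(kernel_size=2, stride=2),         # 24 -> 12
    nn.Conv2d(20, 50, kernel_size=3), nn.ReLU(),   # 12 -> 10
    nn.MaxPool2d(kernel_size=2, stride=2),         # 10 -> 5
    nn.Flatten(),
    nn.Linear(50 * 5 * 5, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
print(lenet(torch.randn(1, 1, 28, 28)).shape)      # torch.Size([1, 10])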

2. AlexNet

Related post: a detailed guide to the AlexNet algorithm: introduction, paper overview, design ideas, key steps, and implementation code

#2012, AlexNet: an extension of LeNet that benefited from much larger datasets and faster hardware
def AlexNet():
    """
        Five convolutional layers (with three max-pooling layers),
        two fully connected layers (each followed by dropout),
        and a final softmax output layer.
    """
    net = nn.Sequential()
    with net.name_scope():
        net.add(
            nn.Conv2D(channels=96, kernel_size=11, strides=4, activation='relu'),
            nn.MaxPool2D(pool_size=3, strides=2),
            nn.Conv2D(channels=256, kernel_size=5, padding=2, activation='relu'),
            nn.MaxPool2D(pool_size=3, strides=2),
            nn.Conv2D(channels=384, kernel_size=3, padding=1, activation='relu'),
            nn.Conv2D(channels=384, kernel_size=3, padding=1, activation='relu'),
            nn.Conv2D(channels=256, kernel_size=3, padding=1, activation='relu'),
            nn.MaxPool2D(pool_size=3, strides=2),
            #------------------------------------- two fully connected layers
            nn.Flatten(),
            nn.Dense(4096, activation='relu'),
            nn.Dropout(.5),
            nn.Dense(4096, activation='relu'),
            nn.Dropout(.5),
            nn.Dense(10)
        )
    return net
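
A comparable PyTorch sketch, assuming a 3x224x224 input as in the paper and 10 output classes to match the Gluon version above; the 256*5*5 flatten size follows from that input size:

import torch
import torch.nn as nn

alexnet = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),    # 224 -> 54
    nn.MaxPool2d(kernel_size=3, stride=2),                    # 54 -> 26
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),                    # 26 -> 12
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),                    # 12 -> 5
    nn.Flatten(),
    nn.Linear(256 * 5 * 5, 4096), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(4096, 10),
)
print(alexnet(torch.randn(1, 3, 224, 224)).shape)             # torch.Size([1, 10])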

 

3. VGGNet

Related post: a detailed guide to the VGGNet algorithm: introduction (paper), architecture details, and example applications

#2014, VGG-16 to VGG-19
#(1) Using a helper function and loops makes it easy to build networks with an arbitrary number of layers
def VGGNet(architecture):
    '''
    VGG-16: 16 weight layers = five conv groups (2+2+3+3+3 = 13 conv layers) + 3 fully connected layers.
    VGG-19: 16 conv layers (2+2+4+4+4) + 3 fully connected layers.
    '''
    def vgg_block(num_convs, channals):  # basic building block: several conv layers followed by one pooling layer
        """
        A key idea of VGG is to stack several conv layers with small (3x3) kernels, follow them with a
        pooling layer, and repeat this block many times. num_convs: number of conv layers; channals: channels.
        """
        net = nn.Sequential()
        for _ in range(num_convs):
            net.add(nn.Conv2D(channels=channals, kernel_size=3, padding=1, activation='relu'))
        net.add(nn.MaxPool2D(pool_size=2, strides=2))
        return net

    def vgg_stack(architecture):        # stack vgg_blocks
        """
        Builds the full convolutional part of the network from the architecture specification,
        a sequence of (num_convs, channels) pairs.
        """
        net = nn.Sequential()
        for (num_convs, channals) in architecture:
            net.add(vgg_block(num_convs, channals))
        return net
    
    net = nn.Sequential()
    with net.name_scope():
        net.add(
            vgg_stack(architecture),
            #------------------------------------- three fully connected layers: two hidden FC layers after the conv stages, then the output layer
            nn.Flatten(),
            nn.Dense(4096, activation='relu'),
            nn.Dropout(0.5),
            nn.Dense(4096, activation='relu'),
            nn.Dropout(0.5),
            nn.Dense(10)
        )
    return net
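
A hedged PyTorch sketch of the same block-and-stack idea; since PyTorch convolutions need the input channel count, the helper threads it through each block. The classifier sizes assume a 3x224x224 input (five poolings leave 7x7 feature maps), and the configuration shown is the VGG-16 layout:

import torch
import torch.nn as nn

def vgg_block(num_convs, in_channels, out_channels):
    # several 3x3 convolutions followed by one 2x2 max-pooling layer
    layers = []
    for _ in range(num_convs):
        layers += [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1), nn.ReLU()]
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

def vgg_net(architecture, in_channels=3, num_classes=10):
    blocks = []
    for num_convs, out_channels in architecture:
        blocks.append(vgg_block(num_convs, in_channels, out_channels))
        in_channels = out_channels
    return nn.Sequential(
        *blocks,
        nn.Flatten(),
        nn.Linear(in_channels * 7 * 7, 4096), nn.ReLU(), nn.Dropout(0.5),
        nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),
        nn.Linear(4096, num_classes),
    )

arch = ((2, 64), (2, 128), (3, 256), (3, 512), (3, 512))   # VGG-16: 2+2+3+3+3 conv layers
net = vgg_net(arch)
print(net(torch.randn(1, 3, 224, 224)).shape)              # torch.Size([1, 10])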

 

4. NIN

Related post: a detailed guide to the Network in Network algorithm: introduction (paper), architecture details, and example applications

#2014, NIN (Network in Network)
#(1) Besides 1x1 convolutions, NiN replaces the final fully connected layers with an mlpconv block whose channel count equals the number of classes, followed by a global average pooling layer that averages each channel down to one scalar.
def NiNNet():  # built by stacking mlpconv blocks
    '''
    NiN stacks mlpconv blocks and ends with a global average pooling layer instead of fully connected layers
    '''
    def mlpconv(channels, kernel_size, padding, strides=1, max_pooling=True):
        # built from one ordinary conv layer plus two kernel_size=1 conv layers (which act like per-pixel fully connected layers)
        net = nn.Sequential()
        net.add(
            nn.Conv2D(channels=channels, kernel_size=kernel_size, strides=strides, padding=padding, activation='relu'),
            nn.Conv2D(channels=channels, kernel_size=1, padding=0, strides=1, activation='relu'),
            nn.Conv2D(channels=channels, kernel_size=1, padding=0, strides=1, activation='relu'))
        if max_pooling:
            net.add(nn.MaxPool2D(pool_size=3, strides=2))
        return net
    net = nn.Sequential()
    with net.name_scope():
        net.add(
            mlpconv(96, 11, 0, strides=4),
            mlpconv(256, 5, 2),
            mlpconv(384, 3, 1),
            nn.Dropout(0.5),  
            mlpconv(10, 3, 1, max_pooling=False),  # 10 output channels, one per class
            nn.GlobalAvgPool2D(), # input is batch_size x 10 x 5 x 5; global average pooling turns it into batch_size x 10 x 1 x 1
            # global pooling also avoids having to work out the pool_size by hand
            nn.Flatten()  # reshape to batch_size x 10
        )
    return net
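
A minimal PyTorch sketch of the same design, assuming a 1x224x224 input; nn.AdaptiveAvgPool2d(1) plays the role of the global average pooling layer:

import torch
import torch.nn as nn

def mlpconv(in_channels, out_channels, kernel_size, stride=1, padding=0):
    # one ordinary convolution followed by two 1x1 convolutions
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding), nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=1), nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=1), nn.ReLU(),
    )

nin = nn.Sequential(
    mlpconv(1, 96, 11, stride=4), nn.MaxPool2d(3, stride=2),
    mlpconv(96, 256, 5, padding=2), nn.MaxPool2d(3, stride=2),
    mlpconv(256, 384, 3, padding=1), nn.MaxPool2d(3, stride=2),
    nn.Dropout(0.5),
    mlpconv(384, 10, 3, padding=1),   # 10 output channels, one per class
    nn.AdaptiveAvgPool2d(1),          # global average pooling: batch_size x 10 x 1 x 1
    nn.Flatten(),                     # batch_size x 10
)
print(nin(torch.randn(1, 1, 224, 224)).shape)   # torch.Size([1, 10])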

 

 

5. GoogLeNet/Inception

Related post: a detailed guide to the GoogLeNet (Inception V1) algorithm: introduction (paper), architecture details, and example applications
Related post: a detailed guide to the BN-Inception algorithm: introduction (paper), architecture details, and example applications
Related post: a detailed guide to the Inception V2/V3 algorithms: introduction (paper), architecture details, and example applications
Related post: a detailed guide to the Inception V4 / Inception-ResNet algorithms: introduction (paper), architecture details, and example applications

#2014, GoogLeNet:
#(1) GoogLeNet introduces the more structured Inception block, which allows wider channels and more layers while keeping the computational cost and model size within a reasonable range.
def GoogLeNet(num_class):
    '''
    A 22-layer network, counting layers with weights.
    Head: a conventional stem, Conv -> Pool -> Conv -> Pool.
    Middle: 9 stacked Inception modules (the paper also adds 2 auxiliary classifiers, not implemented here).
    Tail: global average pooling (no trainable parameters, so less prone to overfitting) in place of large fully connected layers before the classification output.
    '''
    class GoogleNet(nn.Block): # builds a deep network by chaining Inception blocks
        def __init__(self, num_classes, verbose=False, **kwargs):
            super(GoogleNet, self).__init__(**kwargs)
            self.verbose = verbose
            # add name_scope on the outer most Sequential
            with self.name_scope():
                b1 = nn.Sequential()     # -----------------block 1
                b1.add(
                    nn.Conv2D(64, kernel_size=7, strides=2,
                              padding=3, activation='relu'),
                    nn.MaxPool2D(pool_size=3, strides=2) )
                
                b2 = nn.Sequential()     # -----------------block 2
                b2.add(
                    nn.Conv2D(64, kernel_size=1),
                    nn.Conv2D(192, kernel_size=3, padding=1),
                    nn.MaxPool2D(pool_size=3, strides=2) )

                b3 = nn.Sequential()     # -----------------block 3
                b3.add(
                    Inception(64, 96, 128, 16, 32, 32),
                    Inception(128, 128, 192, 32, 96, 64),
                    nn.MaxPool2D(pool_size=3, strides=2) )

                b4 = nn.Sequential()     # -----------------block 4
                b4.add(
                    Inception(192, 96, 208, 16, 48, 64),
                    Inception(160, 112, 224, 24, 64, 64),
                    Inception(128, 128, 256, 24, 64, 64),
                    Inception(112, 144, 288, 32, 64, 64),
                    Inception(256, 160, 320, 32, 128, 128),
                    nn.MaxPool2D(pool_size=3, strides=2) )

                b5 = nn.Sequential()     # -----------------block 5
                b5.add(
                    Inception(256, 160, 320, 32, 128, 128),
                    Inception(384, 192, 384, 48, 128, 128),
                    nn.AvgPool2D(pool_size=2) )

                b6 = nn.Sequential()     # -----------------block 6
                b6.add(
                    nn.Flatten(),
                    nn.Dense(num_classes) )
                # chain blocks together
                self.net = nn.Sequential()
                self.net.add(b1, b2, b3, b4, b5, b6)

        def forward(self, x):
            out = x
            for i, b in enumerate(self.net):
                out = b(out)
                if self.verbose:
                    print('Block %d output: %s' % (i + 1, out.shape))
            return out

    class Inception(nn.Block):  # the parallel, multi-branch building block
        def __init__(self, n1_1, n2_1, n2_3, n3_1, n3_5, n4_1, **kwargs):
            super(Inception, self).__init__(**kwargs)
            self.p1_convs_1 = nn.Conv2D(n1_1, kernel_size=1, activation='relu')              # path 1: 1x1 conv
            self.p2_convs_1 = nn.Conv2D(n2_1, kernel_size=1, activation='relu')              # path 2: 1x1 conv -> 3x3 conv
            self.p2_convs_3 = nn.Conv2D(n2_3, kernel_size=3, padding=1, activation='relu')
            self.p3_convs_1 = nn.Conv2D(n3_1, kernel_size=1, activation='relu')              # path 3: 1x1 conv -> 5x5 conv
            self.p3_convs_5 = nn.Conv2D(n3_5, kernel_size=5, padding=2, activation='relu')
            self.p4_pool_3 = nn.MaxPool2D(pool_size=3, padding=1, strides=1)                 # path 4: 3x3 max pool -> 1x1 conv
            self.p4_convs_1 = nn.Conv2D(n4_1, kernel_size=1, activation='relu')

        def forward(self, x):
            p1 = self.p1_convs_1(x)
            p2 = self.p2_convs_3(self.p2_convs_1(x))
            p3 = self.p3_convs_5(self.p3_convs_1(x))
            p4 = self.p4_convs_1(self.p4_pool_3(x))
            return nd.concat(p1, p2, p3, p4, dim=1)

    net = GoogleNet(num_class)
    return net
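
For the Inception block itself, a minimal PyTorch sketch (PyTorch needs the input channel count as an extra argument); the 3x3 and 5x5 branches are padded so that all four paths keep the same spatial size and can be concatenated along the channel dimension:

import torch
import torch.nn as nn

class InceptionTorch(nn.Module):
    def __init__(self, in_c, n1_1, n2_1, n2_3, n3_1, n3_5, n4_1):
        super().__init__()
        self.p1 = nn.Sequential(nn.Conv2d(in_c, n1_1, 1), nn.ReLU())                 # path 1: 1x1 conv
        self.p2 = nn.Sequential(nn.Conv2d(in_c, n2_1, 1), nn.ReLU(),                 # path 2: 1x1 -> 3x3
                                nn.Conv2d(n2_1, n2_3, 3, padding=1), nn.ReLU())
        self.p3 = nn.Sequential(nn.Conv2d(in_c, n3_1, 1), nn.ReLU(),                 # path 3: 1x1 -> 5x5
                                nn.Conv2d(n3_1, n3_5, 5, padding=2), nn.ReLU())
        self.p4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),                # path 4: pool -> 1x1
                                nn.Conv2d(in_c, n4_1, 1), nn.ReLU())

    def forward(self, x):
        # concatenate the four branches along the channel dimension
        return torch.cat([self.p1(x), self.p2(x), self.p3(x), self.p4(x)], dim=1)

blk = InceptionTorch(192, 64, 96, 128, 16, 32, 32)        # first Inception block of GoogLeNet
print(blk(torch.randn(1, 192, 28, 28)).shape)             # torch.Size([1, 256, 28, 28])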

 

 

6. ResNet

Related post: a detailed guide to the ResNet algorithm: introduction (paper), architecture details, and example applications

#2015, ResNet: deep residual network. Skip connections across layers counteract the shrinking of gradients as they propagate back through many layers. The idea had been proposed before, but ResNet was the first to make it work really well.
def ResNet(num_classes): 
    '''
        Starting from a VGG-19-style design, the network is first deepened to 34 layers; the deepened
        network is called the plain network. Residual blocks are then added on top of the plain network
        to obtain the residual network.
    '''
    class Residual(nn.Block):
        """
        A residual block with a skip connection. ResNet keeps VGG's all-3x3-convolution design but inserts
        batch normalization between convolution and activation to speed up training. Each skip connection
        spans two conv layers. When the input and output shapes differ (same_shape=False), an extra 1x1
        convolution adjusts the channel count and strides=2 halves the height and width.
        """
        def __init__(self, channels, same_shape=True, **kwargs):
            super(Residual, self).__init__(**kwargs)
            self.same_shape = same_shape
            strides = 1 if same_shape else 2
            self.conv1 = nn.Conv2D(channels, kernel_size=3, padding=1, strides=strides)
            self.bn1 = nn.BatchNorm()
            self.conv2 = nn.Conv2D(channels, kernel_size=3, padding=1)
            self.bn2 = nn.BatchNorm()
            if not same_shape:
                self.conv3 = nn.Conv2D(channels, kernel_size=1, strides=strides)

        def forward(self, x):
            out = nd.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            if not self.same_shape:
                x = self.conv3(x)
            return nd.relu(out + x)

    class ResNet(nn.Block):
        """
        Just as GoogLeNet's body is a chain of Inception blocks, ResNet's body is a chain of Residual blocks.
        Note that no pooling layers are used to shrink the feature maps; downsampling is done by the
        strides=2 convolutions inside the channel-changing Residual blocks.
        """
        def __init__(self, num_classes, verbose=False, **kwargs):
            super(ResNet, self).__init__(**kwargs)
            self.verbose = verbose
            # add name_scope on the outermost Sequential
            with self.name_scope():
                b1 = nn.Conv2D(64, kernel_size=7, strides=2) # ------block 1
                b2 = nn.Sequential()                         # ------block 2
                b2.add(
                    nn.MaxPool2D(pool_size=3, strides=2),
                    Residual(64),
                    Residual(64)  )

                b3 = nn.Sequential()                         # ------block 3
                b3.add(
                    Residual(128, same_shape=False),
                    Residual(128) )

                b4 = nn.Sequential()                         # ------block 4
                b4.add(
                    Residual(256, same_shape=False),
                    Residual(256) )

                b5 = nn.Sequential()                         # ------block 5
                b5.add(
                    Residual(512, same_shape=False),
                    Residual(512) )
                
                b6 = nn.Sequential()                         # ------block 6
                b6.add(
                    nn.AvgPool2D(pool_size=3),
                    nn.Dense(num_classes)
                )
                # chain all blocks together
                self.net = nn.Sequential()
                self.net.add(b1, b2, b3, b4, b5, b6)

        def forward(self, x):
            out = x
            for i, b in enumerate(self.net):
                out = b(out)
                if self.verbose:
                    print('Block %d output: %s' % (i + 1, out.shape))
            return out
    net = ResNet(num_classes)
    return net
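
A minimal PyTorch sketch of the Residual block, with the input channel count made explicit as PyTorch requires:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualTorch(nn.Module):
    # two 3x3 conv + batch norm layers with a skip connection; a 1x1 convolution
    # adjusts the channels and halves the spatial size when same_shape=False
    def __init__(self, in_channels, channels, same_shape=True):
        super().__init__()
        stride = 1 if same_shape else 2
        self.conv1 = nn.Conv2d(in_channels, channels, 3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv3 = None if same_shape else nn.Conv2d(in_channels, channels, 1, stride=stride)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        if self.conv3 is not None:
            x = self.conv3(x)
        return F.relu(out + x)

blk = ResidualTorch(64, 128, same_shape=False)
print(blk(torch.randn(1, 64, 56, 56)).shape)   # torch.Size([1, 128, 28, 28])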

 

 

 

Classic lightweight CNN architectures and their evolution

1. SqueezeNet

Related post: a detailed guide to the SqueezeNet algorithm: introduction (paper), architecture details, and example applications
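
The linked article is not reproduced here; SqueezeNet's core building block is the Fire module, a 1x1 "squeeze" convolution followed by parallel 1x1 and 3x3 "expand" convolutions whose outputs are concatenated. A minimal PyTorch sketch, with channel numbers taken from the paper's fire2 module:

import torch
import torch.nn as nn

class Fire(nn.Module):
    def __init__(self, in_channels, squeeze, expand1x1, expand3x3):
        super().__init__()
        self.squeeze = nn.Sequential(nn.Conv2d(in_channels, squeeze, 1), nn.ReLU())
        self.expand1 = nn.Sequential(nn.Conv2d(squeeze, expand1x1, 1), nn.ReLU())
        self.expand3 = nn.Sequential(nn.Conv2d(squeeze, expand3x3, 3, padding=1), nn.ReLU())

    def forward(self, x):
        s = self.squeeze(x)
        # concatenate the two expand branches along the channel dimension
        return torch.cat([self.expand1(s), self.expand3(s)], dim=1)

blk = Fire(96, 16, 64, 64)
print(blk(torch.randn(1, 96, 55, 55)).shape)   # torch.Size([1, 128, 55, 55])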

 

2. MobileNet

Related post: a detailed guide to the MobileNet algorithm: introduction (paper), architecture details, and example applications
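
MobileNet's core building block is the depthwise separable convolution: a per-channel (depthwise) 3x3 convolution followed by a 1x1 pointwise convolution that mixes channels. A minimal PyTorch sketch of one such block:

import torch
import torch.nn as nn

def depthwise_separable(in_channels, out_channels, stride=1):
    # depthwise 3x3 conv (one filter per input channel) + pointwise 1x1 conv
    return nn.Sequential(
        nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels),
        nn.BatchNorm2d(in_channels), nn.ReLU(),
        nn.Conv2d(in_channels, out_channels, 1),
        nn.BatchNorm2d(out_channels), nn.ReLU(),
    )

blk = depthwise_separable(32, 64, stride=2)
print(blk(torch.randn(1, 32, 112, 112)).shape)   # torch.Size([1, 64, 56, 56])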

 

3. ShuffleNet

Related post: a detailed guide to the ShuffleNet algorithm: introduction (paper), architecture details, and example applications
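
ShuffleNet's distinctive operation is the channel shuffle, which interleaves channels across groups so that information can flow between the groups of a grouped convolution. A minimal PyTorch sketch:

import torch

def channel_shuffle(x, groups):
    # split the channels into groups, swap the group and channel axes, then flatten back
    n, c, h, w = x.shape
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

x = torch.arange(8).float().view(1, 8, 1, 1)
print(channel_shuffle(x, groups=2).flatten().tolist())   # [0.0, 4.0, 1.0, 5.0, 2.0, 6.0, 3.0, 7.0]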

 

Full combined implementation

1. Testing each architecture on data



# PyTorch CNN: implementing the classic convolutional neural networks (LeNet, AlexNet, VGG, NIN, GoogLeNet, ResNet) and understanding the classic CNN architectures from code

# Note: the reference code below is written with the MXNet Gluon API rather than torch.nn.
from mxnet import gluon, init, nd
from mxnet.gluon import nn
import utils   # helper module providing try_gpu, load_data_fashion_mnist and train (e.g. utils.py from the Gluon tutorials)

#1998, LeNet-5: one of the earliest convolutional neural networks
def LeNet():
    """
        Two convolutional layers (each followed by a pooling layer) and two fully connected layers.
        In the paper, C5 is drawn as a convolutional layer but effectively acts as a fully connected
        layer, so the structure amounts to 2 conv layers + 3 fully connected layers (F6 among them).
    """
    net = nn.Sequential()
    with net.name_scope():
        net.add(
            nn.Conv2D(channels=20, kernel_size=5, activation='relu'),
            nn.MaxPool2D(pool_size=2, strides=2),
            nn.Conv2D(channels=50, kernel_size=3, activation='relu'),
            nn.MaxPool2D(pool_size=2, strides=2),
            #------------------------------------- two fully connected layers
            nn.Flatten(),
            nn.Dense(128, activation='relu'),
            nn.Dense(10)
        )
    return net


#2012, AlexNet: an extension of LeNet that benefited from much larger datasets and faster hardware
def AlexNet():
    """
        Five convolutional layers (with three max-pooling layers),
        two fully connected layers (each followed by dropout),
        and a final softmax output layer.
    """
    net = nn.Sequential()
    with net.name_scope():
        net.add(
            nn.Conv2D(channels=96, kernel_size=11, strides=4, activation='relu'),
            nn.MaxPool2D(pool_size=3, strides=2),
            nn.Conv2D(channels=256, kernel_size=5, padding=2, activation='relu'),
            nn.MaxPool2D(pool_size=3, strides=2),
            nn.Conv2D(channels=384, kernel_size=3, padding=1, activation='relu'),
            nn.Conv2D(channels=384, kernel_size=3, padding=1, activation='relu'),
            nn.Conv2D(channels=256, kernel_size=3, padding=1, activation='relu'),
            nn.MaxPool2D(pool_size=3, strides=2),
            #------------------------------------- two fully connected layers
            nn.Flatten(),
            nn.Dense(4096, activation='relu'),
            nn.Dropout(.5),
            nn.Dense(4096, activation='relu'),
            nn.Dropout(.5),
            nn.Dense(10)
        )
    return net



#2014, VGG-16 to VGG-19
#(1) Using a helper function and loops makes it easy to build networks with an arbitrary number of layers
def VGGNet(architecture):
    '''
    VGG-16: 16 weight layers = five conv groups (2+2+3+3+3 = 13 conv layers) + 3 fully connected layers.
    VGG-19: 16 conv layers (2+2+4+4+4) + 3 fully connected layers.
    '''
    def vgg_block(num_convs, channals):  # basic building block: several conv layers followed by one pooling layer
        """
        A key idea of VGG is to stack several conv layers with small (3x3) kernels, follow them with a
        pooling layer, and repeat this block many times. num_convs: number of conv layers; channals: channels.
        """
        net = nn.Sequential()
        for _ in range(num_convs):
            net.add(nn.Conv2D(channels=channals, kernel_size=3, padding=1, activation='relu'))
        net.add(nn.MaxPool2D(pool_size=2, strides=2))
        return net

    def vgg_stack(architecture):        # stack vgg_blocks
        """
        Builds the full convolutional part of the network from the architecture specification,
        a sequence of (num_convs, channels) pairs.
        """
        net = nn.Sequential()
        for (num_convs, channals) in architecture:
            net.add(vgg_block(num_convs, channals))
        return net
    
    net = nn.Sequential()
    with net.name_scope():
        net.add(
            vgg_stack(architecture),
            #------------------------------------- three fully connected layers: two hidden FC layers after the conv stages, then the output layer
            nn.Flatten(),
            nn.Dense(4096, activation='relu'),
            nn.Dropout(0.5),
            nn.Dense(4096, activation='relu'),
            nn.Dropout(0.5),
            nn.Dense(10)
        )
    return net



#2014, NIN (Network in Network)
#(1) Besides 1x1 convolutions, NiN replaces the final fully connected layers with an mlpconv block whose channel count equals the number of classes, followed by a global average pooling layer that averages each channel down to one scalar.
def NiNNet():  # built by stacking mlpconv blocks
    '''
    NiN stacks mlpconv blocks and ends with a global average pooling layer instead of fully connected layers
    '''
    def mlpconv(channels, kernel_size, padding, strides=1, max_pooling=True):
        # built from one ordinary conv layer plus two kernel_size=1 conv layers (which act like per-pixel fully connected layers)
        net = nn.Sequential()
        net.add(
            nn.Conv2D(channels=channels, kernel_size=kernel_size, strides=strides, padding=padding, activation='relu'),
            nn.Conv2D(channels=channels, kernel_size=1, padding=0, strides=1, activation='relu'),
            nn.Conv2D(channels=channels, kernel_size=1, padding=0, strides=1, activation='relu'))
        if max_pooling:
            net.add(nn.MaxPool2D(pool_size=3, strides=2))
        return net
    net = nn.Sequential()
    with net.name_scope():
        net.add(
            mlpconv(96, 11, 0, strides=4),
            mlpconv(256, 5, 2),
            mlpconv(384, 3, 1),
            nn.Dropout(0.5),  
            mlpconv(10, 3, 1, max_pooling=False),  # 10 output channels, one per class
            nn.GlobalAvgPool2D(), # input is batch_size x 10 x 5 x 5; global average pooling turns it into batch_size x 10 x 1 x 1
            # global pooling also avoids having to work out the pool_size by hand
            nn.Flatten()  # reshape to batch_size x 10
        )
    return net


#2014, GoogLeNet:
#(1) GoogLeNet introduces the more structured Inception block, which allows wider channels and more layers while keeping the computational cost and model size within a reasonable range.
def GoogLeNet(num_class):
    '''
    A 22-layer network, counting layers with weights.
    Head: a conventional stem, Conv -> Pool -> Conv -> Pool.
    Middle: 9 stacked Inception modules (the paper also adds 2 auxiliary classifiers, not implemented here).
    Tail: global average pooling (no trainable parameters, so less prone to overfitting) in place of large fully connected layers before the classification output.
    '''
    class GoogleNet(nn.Block): # builds a deep network by chaining Inception blocks
        def __init__(self, num_classes, verbose=False, **kwargs):
            super(GoogleNet, self).__init__(**kwargs)
            self.verbose = verbose
            # add name_scope on the outer most Sequential
            with self.name_scope():
                b1 = nn.Sequential()     # -----------------block 1
                b1.add(
                    nn.Conv2D(64, kernel_size=7, strides=2,
                              padding=3, activation='relu'),
                    nn.MaxPool2D(pool_size=3, strides=2) )
                
                b2 = nn.Sequential()     # -----------------block 2
                b2.add(
                    nn.Conv2D(64, kernel_size=1),
                    nn.Conv2D(192, kernel_size=3, padding=1),
                    nn.MaxPool2D(pool_size=3, strides=2) )

                b3 = nn.Sequential()     # -----------------block 3
                b3.add(
                    Inception(64, 96, 128, 16, 32, 32),
                    Inception(128, 128, 192, 32, 96, 64),
                    nn.MaxPool2D(pool_size=3, strides=2) )

                b4 = nn.Sequential()     # -----------------block 4
                b4.add(
                    Inception(192, 96, 208, 16, 48, 64),
                    Inception(160, 112, 224, 24, 64, 64),
                    Inception(128, 128, 256, 24, 64, 64),
                    Inception(112, 144, 288, 32, 64, 64),
                    Inception(256, 160, 320, 32, 128, 128),
                    nn.MaxPool2D(pool_size=3, strides=2) )

                b5 = nn.Sequential()     # -----------------block 5
                b5.add(
                    Inception(256, 160, 320, 32, 128, 128),
                    Inception(384, 192, 384, 48, 128, 128),
                    nn.AvgPool2D(pool_size=2) )

                b6 = nn.Sequential()     # -----------------block 6
                b6.add(
                    nn.Flatten(),
                    nn.Dense(num_classes) )
                # chain blocks together
                self.net = nn.Sequential()
                self.net.add(b1, b2, b3, b4, b5, b6)

        def forward(self, x):
            out = x
            for i, b in enumerate(self.net):
                out = b(out)
                if self.verbose:
                    print('Block %d output: %s' % (i + 1, out.shape))
            return out

    class Inception(nn.Block):  # the parallel, multi-branch building block
        def __init__(self, n1_1, n2_1, n2_3, n3_1, n3_5, n4_1, **kwargs):
            super(Inception, self).__init__(**kwargs)
            self.p1_convs_1 = nn.Conv2D(n1_1, kernel_size=1, activation='relu')              # path 1: 1x1 conv
            self.p2_convs_1 = nn.Conv2D(n2_1, kernel_size=1, activation='relu')              # path 2: 1x1 conv -> 3x3 conv
            self.p2_convs_3 = nn.Conv2D(n2_3, kernel_size=3, padding=1, activation='relu')
            self.p3_convs_1 = nn.Conv2D(n3_1, kernel_size=1, activation='relu')              # path 3: 1x1 conv -> 5x5 conv
            self.p3_convs_5 = nn.Conv2D(n3_5, kernel_size=5, padding=2, activation='relu')
            self.p4_pool_3 = nn.MaxPool2D(pool_size=3, padding=1, strides=1)                 # path 4: 3x3 max pool -> 1x1 conv
            self.p4_convs_1 = nn.Conv2D(n4_1, kernel_size=1, activation='relu')

        def forward(self, x):
            p1 = self.p1_convs_1(x)
            p2 = self.p2_convs_3(self.p2_convs_1(x))
            p3 = self.p3_convs_5(self.p3_convs_1(x))
            p4 = self.p4_convs_1(self.p4_pool_3(x))
            return nd.concat(p1, p2, p3, p4, dim=1)

    net = GoogleNet(num_class)
    return net


#2015, ResNet: deep residual network. Skip connections across layers counteract the shrinking of gradients as they propagate back through many layers. The idea had been proposed before, but ResNet was the first to make it work really well.
def ResNet(num_classes): 
    '''
        Starting from a VGG-19-style design, the network is first deepened to 34 layers; the deepened
        network is called the plain network. Residual blocks are then added on top of the plain network
        to obtain the residual network.
    '''
    class Residual(nn.Block):
        """
        A residual block with a skip connection. ResNet keeps VGG's all-3x3-convolution design but inserts
        batch normalization between convolution and activation to speed up training. Each skip connection
        spans two conv layers. When the input and output shapes differ (same_shape=False), an extra 1x1
        convolution adjusts the channel count and strides=2 halves the height and width.
        """
        def __init__(self, channels, same_shape=True, **kwargs):
            super(Residual, self).__init__(**kwargs)
            self.same_shape = same_shape
            strides = 1 if same_shape else 2
            self.conv1 = nn.Conv2D(channels, kernel_size=3, padding=1, strides=strides)
            self.bn1 = nn.BatchNorm()
            self.conv2 = nn.Conv2D(channels, kernel_size=3, padding=1)
            self.bn2 = nn.BatchNorm()
            if not same_shape:
                self.conv3 = nn.Conv2D(channels, kernel_size=1, strides=strides)

        def forward(self, x):
            out = nd.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            if not self.same_shape:
                x = self.conv3(x)
            return nd.relu(out + x)

    class ResNet(nn.Block):
        """
        Just as GoogLeNet's body is a chain of Inception blocks, ResNet's body is a chain of Residual blocks.
        Note that no pooling layers are used to shrink the feature maps; downsampling is done by the
        strides=2 convolutions inside the channel-changing Residual blocks.
        """
        def __init__(self, num_classes, verbose=False, **kwargs):
            super(ResNet, self).__init__(**kwargs)
            self.verbose = verbose
            # add name_scope on the outermost Sequential
            with self.name_scope():
                b1 = nn.Conv2D(64, kernel_size=7, strides=2) # ------block 1
                b2 = nn.Sequential()                         # ------block 2
                b2.add(
                    nn.MaxPool2D(pool_size=3, strides=2),
                    Residual(64),
                    Residual(64)  )

                b3 = nn.Sequential()                         # ------block 3
                b3.add(
                    Residual(128, same_shape=False),
                    Residual(128) )

                b4 = nn.Sequential()                         # ------block 4
                b4.add(
                    Residual(256, same_shape=False),
                    Residual(256) )

                b5 = nn.Sequential()                         # ------block 5
                b5.add(
                    Residual(512, same_shape=False),
                    Residual(512) )
                
                b6 = nn.Sequential()                         # ------block 6
                b6.add(
                    nn.AvgPool2D(pool_size=3),
                    nn.Dense(num_classes)
                )
                # chain all blocks together
                self.net = nn.Sequential()
                self.net.add(b1, b2, b3, b4, b5, b6)

        def forward(self, x):
            out = x
            for i, b in enumerate(self.net):
                out = b(out)
                if self.verbose:
                    print('Block %d output: %s' % (i + 1, out.shape))
            return out
    net = ResNet(num_classes)
    return net




def do_exp():
    ctx = utils.try_gpu()       # pick a GPU if one is available, otherwise fall back to CPU

    
    # batch_size = 256     # load the data
    train_data, test_data = utils.load_data_fashion_mnist(batch_size=64, resize=224)

    # net = LeNet()
    # net = AlexNet()

    # architecture = ((2, 64), (2, 128), (2, 256), (2, 512), (2, 512))
    # net = VGGNet(architecture)

    # net = NiNNet()
    # net = GoogLeNet(10)
    net = ResNet(10)
    net.initialize(ctx=ctx, init=init.Xavier())
    print('initialize weight on', ctx)

    # training
    loss = gluon.loss.SoftmaxCrossEntropyLoss()
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
    utils.train(train_data, test_data, net, loss, trainer, ctx, num_epochs=1)


if __name__ == '__main__':
    do_exp()
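
The script above relies on MXNet Gluon and the tutorial helper module utils. A hedged PyTorch counterpart of do_exp(), assuming torchvision is installed and reusing the LeNet sketch from section 1 so the 28x28 Fashion-MNIST images need no resizing:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def do_exp_torch():
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # load the data
    tfm = transforms.ToTensor()
    train_set = datasets.FashionMNIST('./data', train=True, download=True, transform=tfm)
    test_set = datasets.FashionMNIST('./data', train=False, download=True, transform=tfm)
    train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=64)

    # LeNet-style network from the sketch in section 1
    net = nn.Sequential(
        nn.Conv2d(1, 20, 5), nn.ReLU(), nn.MaxPool2d(2, 2),
        nn.Conv2d(20, 50, 3), nn.ReLU(), nn.MaxPool2d(2, 2),
        nn.Flatten(), nn.Linear(50 * 5 * 5, 128), nn.ReLU(), nn.Linear(128, 10),
    ).to(device)

    # training
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
    for epoch in range(1):
        net.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(net(images), labels)
            loss.backward()
            optimizer.step()

        # evaluation on the test set
        net.eval()
        correct = total = 0
        with torch.no_grad():
            for images, labels in test_loader:
                images, labels = images.to(device), labels.to(device)
                correct += (net(images).argmax(dim=1) == labels).sum().item()
                total += labels.size(0)
        print('epoch %d, test acc %.3f' % (epoch + 1, correct / total))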
    

 

 

 

 

 
