Evolution of CNN-Based Image Classification Architectures + Keras Implementations

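Image classification takes an image as input and outputs its class (dog, cat, boat, bird), or more precisely, the class the image most probably belongs to.
A person may recognize an animal in a picture from features such as its edges and lines. A computer does much the same: a convolutional neural network extracts features at different levels from the image and builds up an abstract concept of the object.
As research has progressed, the network architectures used for image classification have become increasingly refined. The sections below walk through the main networks and their structures.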
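
To make "the class the image most probably belongs to" concrete, here is a minimal sketch in plain Python. The raw scores are made-up numbers for illustration only; in a real network they would come from the last fully connected layer.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for one image (illustrative values, not real model output)
labels = ["dog", "cat", "boat", "bird"]
scores = [2.0, 0.5, -1.0, 0.1]

probs = softmax(scores)
predicted = labels[probs.index(max(probs))]  # class with the highest probability
print(predicted, [round(p, 3) for p in probs])
```

The softmax output layer used by every model in this article plays the same role: it turns the final layer's scores into one probability per class.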

1. LeNet-5

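The widely circulated LeNet architecture is small but complete. Convolutional layers, pooling layers, and fully connected layers are all basic building blocks of modern CNNs. As the network gets deeper, the width and height of the feature maps shrink while the number of channels grows. The pattern of one or more convolutional layers followed by a pooling layer, with fully connected layers at the end, is still very common today.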
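
The shrinking width and height can be traced by hand. The sketch below is a plain-Python helper (a hypothetical name, not part of Keras) that applies the usual output-size rule for 'valid' and 'same' padding to the LeNet-5 layers, assuming a 32×32×1 input.

```python
def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution or pooling layer (Keras conventions)."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # 'valid': no padding

size, channels = 32, 1
trace = [(size, channels)]
# (kernel, stride, filters); filters is None for pooling, which keeps the channel count
for kernel, stride, filters in [(5, 1, 6), (2, 2, None), (5, 1, 16), (2, 2, None)]:
    size = conv_output_size(size, kernel, stride)
    channels = filters if filters is not None else channels
    trace.append((size, channels))

flattened = size * size * channels
print(trace)      # [(32, 1), (28, 6), (14, 6), (10, 16), (5, 16)]
print(flattened)  # 400
```

The 400 values produced by the Flatten layer are what feed the 120-unit fully connected layer.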



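# Imports shared by all the model builders in this article
from keras.models import Sequential, Model
from keras.layers import (Input, Conv2D, MaxPooling2D, AveragePooling2D,
                          GlobalAveragePooling2D, ZeroPadding2D, Flatten, Dense,
                          Dropout, Activation, BatchNormalization, add, concatenate)
from keras.optimizers import SGD
# from keras.utils import plot_model  # only needed for the commented-out plot_model calls

def LeNet_build(inputShape, classes):
    '''
    LeNet-5 network
    INPUT  -> input shape (32, 32, 1), number of classes (10)
    '''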
    model = Sequential()

    model.add(Conv2D(filters=6, kernel_size=(5, 5), activation='tanh',  padding='valid', input_shape=inputShape))        
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(filters=16, kernel_size=(5, 5), activation='tanh',  padding='valid'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten())
    model.add(Dense(120, activation='tanh'))
    model.add(Dense(84, activation='tanh'))
    model.add(Dense(classes, activation='softmax'))

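    # Compile the model
    # `lr` is spelled `learning_rate` in current Keras; the original `decay=1e-6`
    # argument is dropped because recent Keras optimizers no longer accept it
    sgd = SGD(learning_rate=0.05, momentum=0.9, nesterov=True)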
    model.compile(loss='categorical_crossentropy', 
                  optimizer=sgd, 
                  metrics=['accuracy'])
    # model.summary()
    # plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)
    return model

2. AlexNet

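AlexNet is a network of historic significance: before it, deep learning had been dormant for a long time. The turning point came in 2012, when AlexNet cut the top-5 error rate in that year's ImageNet image classification competition by about ten percentage points relative to the previous year's winner, and finished far ahead of that year's runner-up.
AlexNet ushered in the deep learning era. Many faster and more accurate convolutional networks have appeared since, but as the pioneer it still has much worth studying. Networks that later achieved better results on ImageNet, such as ZFNet and VGG, were built by modifying its design.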




The network has 5 convolutional layers, 3 pooling layers, and 3 fully connected layers (including the output layer).
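
As a rough sketch of where the weights live, the snippet below counts the parameters of these eight layers. It assumes a 227×227×3 input, 1000 classes, and the single-tower layer sizes used in the Keras code in this section; normalization layers are left out of the count, and the 6×6×256 flatten size follows from the 'valid' pooling in that code.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a square 2D convolution."""
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(in_units, out_units):
    """Weights plus biases of a fully connected layer."""
    return in_units * out_units + out_units

# (kernel, in_channels, out_channels) for the 5 convolutional layers
conv_layers = [(11, 3, 96), (5, 96, 256), (3, 256, 384), (3, 384, 384), (3, 384, 256)]
# (in_units, out_units) for the 3 fully connected layers
dense_layers = [(6 * 6 * 256, 4096), (4096, 4096), (4096, 1000)]

conv_total = sum(conv_params(*layer) for layer in conv_layers)
dense_total = sum(dense_params(*layer) for layer in dense_layers)
total = conv_total + dense_total
print(conv_total, dense_total, total)  # 3747200 58631144 62378344
print(round(dense_total / total, 2))   # share held by the fully connected layers
```

Under these assumptions the three fully connected layers hold well over 90% of the parameters, which is one reason Dropout is applied there.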

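Innovations
(1) Uses the ReLU activation function instead of the Tanh or Sigmoid activations used in earlier convolutional networks.
(2) Stacks three 3×3 convolutional layers back to back (conv3 to conv5), although the first two layers still use large 11×11 and 5×5 kernels.
(3) Introduces the LRN (local response normalization) layer to improve generalization.
(4) Adds Dropout after the first two fully connected layers to reduce overfitting.
(5) Uses data augmentation.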
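
One way to see why item (1) matters is to compare the derivatives of the two activations. This is a minimal sketch in plain Python with arbitrarily chosen input values: the tanh gradient shrinks quickly as the input grows, while the ReLU gradient stays at 1 for any positive input.

```python
import math

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# Illustrative inputs only
for x in [0.5, 2.0, 5.0]:
    print(x, round(tanh_grad(x), 6), relu_grad(x))
```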

def AlexNet_build(inputShape, classes):
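    '''
    AlexNet network
    INPUT  -> input shape (227, 227, 3), number of classes (1000)
    Note: BatchNormalization is used here in place of the original LRN layers.
    '''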
    model = Sequential()

    model.add(Conv2D(96, (11,11), strides=(4,4), input_shape=inputShape, activation='relu', padding='valid', kernel_initializer='uniform'))  
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))

    model.add(Conv2D(256, (5,5), strides=(1,1), activation='relu', padding='same', kernel_initializer='uniform'))  
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))

    model.add(Conv2D(384, (3,3), strides=(1,1), activation='relu', padding='same', kernel_initializer='uniform'))  
    model.add(Conv2D(384, (3,3), strides=(1,1), activation='relu', padding='same', kernel_initializer='uniform'))  
    model.add(Conv2D(256, (3,3), strides=(1,1), activation='relu', padding='same', kernel_initializer='uniform'))  
    model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))

    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))

    model.add(Dense(4096, activation='relu'))  
    model.add(Dropout(0.5))

    model.add(Dense(classes, activation='softmax'))

    # Compile the model
    model.compile(loss='categorical_crossentropy', 
                  optimizer='sgd',
                  metrics=['accuracy'])
    # model.summary()
    # plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)
    return model

3. ZFNet

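ZFNet was proposed in the paper "Visualizing and Understanding Convolutional Networks". It makes a few detailed changes to AlexNet, chiefly reducing the convolution kernel size, so there is no major breakthrough in network structure. The paper's main contribution is a visualization method, which showed that the learned features are far from uninterpretable: they are hierarchical, and the deeper the layer, the stronger the feature invariance and the class discrimination.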
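
The effect of the smaller first-layer kernel can be checked with a little arithmetic. The sketch below uses the first-layer settings from the two Keras implementations in this article (11×11 with stride 4 on a 227 input for AlexNet, 7×7 with stride 2 on a 224 input for ZFNet), both with 'valid' padding; the helper name is hypothetical.

```python
def valid_conv_size(size, kernel, stride):
    """Output width/height of a convolution with 'valid' padding."""
    return (size - kernel) // stride + 1

alexnet_conv1 = valid_conv_size(227, kernel=11, stride=4)
zfnet_conv1 = valid_conv_size(224, kernel=7, stride=2)
print(alexnet_conv1, zfnet_conv1)  # 55 109
```

The smaller kernel and stride leave a first feature map roughly twice as wide, so less fine detail is discarded in the first layer.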

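Innovation
Proposes a visualization method for neural networks that reveals what excites the individual feature maps at each layer, and that also makes it possible to watch features evolve during training and to diagnose potential problems with the model.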

def ZFNet_build(inputShape, classes):
    '''
    ZFNet network
    INPUT  -> input shape (224, 224, 3), number of classes (1000)
    '''
    model = Sequential()

    model.add(Conv2D(96,(7,7), strides=(2,2), input_shape=inputShape, padding='valid', activation='relu', kernel_initializer='uniform'))  
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))  

    model.add(Conv2D(256,(5,5), strides=(2,2), padding='same', activation='relu', kernel_initializer='uniform')) 
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(3,3),strides=(2,2)))

    model.add(Conv2D(384,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(Conv2D(384,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(Conv2D(256,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))
    
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))  
    model.add(Dropout(0.5))

    model.add(Dense(4096, activation='relu'))  
    model.add(Dropout(0.5))

    model.add(Dense(classes, activation='softmax'))

    # Compile the model
    model.compile(loss='categorical_crossentropy', 
                  optimizer='sgd',
                  metrics=['accuracy'])
    # model.summary()
    # plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)
    return model

4. VGGNet

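The VGG architecture first appeared in 2014 in Simonyan and Zisserman's paper "Very Deep Convolutional Networks for Large-Scale Image Recognition". Its main contribution was to show that network depth is a key factor in good performance.
VGGNet comes in several configurations, ranging from 11 to 19 layers deep; the most commonly used are VGGNet-16 and VGGNet-19.
VGG-16 has a very regular structure with few hyperparameters. It sticks to a simple design: a few convolutional layers followed by a pooling layer that shrinks the feature maps, using only small 3×3 convolution kernels and 2×2 max pooling throughout.
Because of its depth and the number of fully connected units, the VGG16 weights exceed 533MB and the VGG19 weights exceed 574MB, which makes deploying VGG cumbersome. VGG is still used in many deep learning image classification problems, but smaller architectures (such as SqueezeNet and GoogLeNet) are often preferred.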


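Innovation
Shows that increasing the depth of a convolutional network and using small convolution kernels both substantially improve final classification accuracy.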
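
Why small kernels work can be sketched with two formulas: stacked stride-1 3×3 convolutions cover the same receptive field as one larger kernel while using fewer weights. The channel count of 64 below is only an illustrative choice, and biases are ignored.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)

def stacked_weights(num_layers, kernel, channels):
    """Weight count (biases ignored) when every layer keeps `channels` channels."""
    return num_layers * kernel * kernel * channels * channels

channels = 64  # illustrative channel count
print(stacked_receptive_field(2), stacked_receptive_field(3))            # 5 7
print(stacked_weights(3, 3, channels), stacked_weights(1, 7, channels))  # 110592 200704
```

Three 3×3 layers see a 7×7 window with roughly half the weights of a single 7×7 layer, and they add two extra nonlinearities along the way.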

def VGG13_build(inputShape, classes):
    '''
    VGG13 network
    INPUT  -> input shape (224, 224, 3), number of classes (1000)
    '''
    model = Sequential()

    model.add(Conv2D(64,(3,3), strides=(1,1), input_shape=inputShape, padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(Conv2D(64,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(128,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(Conv2D(128,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(256,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(Conv2D(256,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))

    model.add(Dropout(0.5))  
    model.add(Dense(4096, activation='relu'))

    model.add(Dropout(0.5))  
    model.add(Dense(classes, activation='softmax'))  

    # Compile the model
    model.compile(loss='categorical_crossentropy', 
                  optimizer='sgd',
                  metrics=['accuracy'])
    # model.summary()
    # plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)
    return model

def VGG16_build(inputShape, classes):
    '''
    VGG16 network
    INPUT  -> input shape (224, 224, 3), number of classes (1000)
    '''
    model = Sequential()

    model.add(Conv2D(64,(3,3), strides=(1,1), input_shape=inputShape, padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(Conv2D(64,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))  
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(128,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Conv2D(128,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))

    # Blocks 3 to 5 each stack three 3x3 convolutions (13 conv layers + 3 dense layers = 16)
    model.add(Conv2D(256,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Conv2D(256,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Conv2D(256,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3), strides=(1,1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))

    model.add(Dropout(0.5))  
    model.add(Dense(4096, activation='relu'))

    model.add(Dropout(0.5))  
    model.add(Dense(classes, activation='softmax'))  

    # Compile the model
    model.compile(loss='categorical_crossentropy', 
                  optimizer='sgd',
                  metrics=['accuracy'])
    # model.summary()
    # plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)
    return model

5. GoogLeNet (Inception V1)

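In 2014, first place on ImageNet went to a network called GoogLeNet, the prototype of the later Inception_vN series. It introduced a new building block for deep networks, now known as the "Inception_Module".
The Inception module acts as a "multi-level feature extractor": within a single module, at the same network depth, it computes 1×1, 3×3, and 5×5 convolutions in parallel, and the outputs of these filters are stacked along the channel dimension before being fed to the next layer.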

Innovation
Proposes a multi-level feature extractor (Inception_Module) that increases network width and improves the network's adaptability to scale.
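
Because the branch outputs are stacked along the channel dimension, the output channel count of a module is simply the sum over its branches. The sketch below applies that to the simplified Inception block used in this article's Keras code, where the three convolution branches share one filter count and the pooling branch passes all input channels through; the helper names are hypothetical.

```python
def inception_output_channels(branch_channels):
    """Concatenation along the channel axis adds up the branch channel counts."""
    return sum(branch_channels)

def simplified_block_channels(in_channels, num_filter):
    """Simplified block: 1x1, 3x3 and 5x5 branches with num_filter filters each,
    plus a pooling branch that keeps all input channels (no 1x1 projection)."""
    return inception_output_channels([num_filter, num_filter, num_filter, in_channels])

channels = 192  # channels entering the first Inception block
history = []
for num_filter in [64, 120]:
    channels = simplified_block_channels(channels, num_filter)
    history.append(channels)
print(history)  # [384, 744]
```

Since the pooling branch is not projected down, the channel count grows with every block in this simplified variant.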

def GoogLeNet_build(inputShape, classes):
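    '''
    GoogLeNet network (simplified Inception blocks: every convolution branch uses the same filter count)
    INPUT  -> input shape (224, 224, 3), number of classes (1000)
    '''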
    def Inception_block(inputs, num_filter):
        branch1x1 = Conv2D(num_filter, kernel_size=(1, 1), activation='relu', strides=(1, 1), padding='same')(inputs)
        branch1x1 = BatchNormalization(axis=3)(branch1x1)

        branch3x3 = Conv2D(num_filter, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(inputs)
        branch3x3 = BatchNormalization(axis=3)(branch3x3)

        branch5x5 = Conv2D(num_filter, kernel_size=(5, 5), activation='relu', strides=(1, 1), padding='same')(inputs)
        branch5x5 = BatchNormalization(axis=3)(branch5x5) 

        branchpool = MaxPooling2D(pool_size=(3,3), strides=(1, 1), padding='same')(inputs)

        x = concatenate([branch1x1, branch3x3, branch5x5, branchpool], axis=3)
        return x

    inputs = Input(shape=inputShape)

    x = Conv2D(64, kernel_size=(7, 7), activation='relu', strides=(2, 2), padding='same')(inputs)
    x = BatchNormalization(axis=3)(x)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)

    x = Conv2D(192, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = BatchNormalization(axis=3)(x)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)

    x = Inception_block(x, 64)
    x = Inception_block(x, 120)
    x = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='same')(x)

    x = Inception_block(x, 128)
    x = Inception_block(x, 128)
    x = Inception_block(x, 128)
    x = Inception_block(x, 132)
    x = Inception_block(x, 208)
    x = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='same')(x)

    x = Inception_block(x, 208)
    x = Inception_block(x, 256)

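    x = AveragePooling2D(pool_size=(7,7), strides=(7,7), padding='same')(x)
    x = Flatten()(x)  # collapse the 1x1 spatial grid so the output is (batch, classes)
    x = Dropout(0.4)(x)
    # A single softmax layer sized by `classes` replaces the two hard-coded Dense(10) layers
    x = Dense(classes, activation='softmax')(x)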
    model = Model(inputs, x)
    model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
    model.summary()
    return model

6. ResNet

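The depth of a deep learning network strongly affects final classification and recognition performance, so the natural idea is to make networks as deep as possible. The main benefit of a very deep network is that it can represent very complex functions, and it can learn features at many levels of abstraction, from edges (in the lower layers) to very complex features (in the deeper layers). However, going deeper does not always help. One reason is that the deeper the network, the more pronounced the vanishing gradient problem becomes, and the worse the network trains.
In 2015, first place on ImageNet went to a new architecture, ResNet, which overcame this difficulty. At its core is a residual learning structure based on identity mappings (Residual_Block).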


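The block comes in two variants: the standard residual block, and a bottleneck structure optimized for networks of 50 layers or more.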
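
A quick weight count suggests why the optimized structure suits very deep networks. The sketch below compares two 3×3 convolutions on 256 channels with a 1×1, 3×3, 1×1 bottleneck using the filter counts 64, 64, 256 from the first stage of the ResNet50 code in this article. Biases and normalization layers are ignored, and applying the standard block to 256 channels is an assumption made only for the comparison.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Weight count of a square 2D convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels

# Standard block: two 3x3 convolutions that keep 256 channels
standard = 2 * conv_weights(3, 256, 256)

# Bottleneck: 1x1 reduces to 64, 3x3 works on 64, 1x1 restores 256
bottleneck = (conv_weights(1, 256, 64)
              + conv_weights(3, 64, 64)
              + conv_weights(1, 64, 256))

print(standard, bottleneck)  # 1179648 69632
```

The bottleneck needs far fewer weights for the same input and output width, which keeps networks with 50 or more layers affordable.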

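Innovation
Introduces direct (shortcut) connections through a residual learning structure (Residual_Block) that links earlier layers to later ones, which allows much deeper networks to be built.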
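
The effect of the shortcut on gradients can be illustrated with a deliberately crude toy model, which is an assumption for illustration and not how a real network behaves: each plain layer computes y = w * x, each residual layer computes y = x + w * x, and the weight w is small.

```python
def plain_gradient(w, depth):
    """d(output)/d(input) through `depth` stacked layers y = w * x."""
    return w ** depth

def residual_gradient(w, depth):
    """Same, but each layer adds an identity shortcut: y = x + w * x."""
    return (1.0 + w) ** depth

w, depth = 0.01, 20  # illustrative values
print(plain_gradient(w, depth))     # vanishes towards zero
print(residual_gradient(w, depth))  # stays close to 1
```

In this toy setting the identity path contributes a constant 1 to every layer's derivative, so the product no longer collapses as depth grows.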

def ResNet34_build(inputShape, classes):
    '''
    ResNet34 network
    INPUT  -> input shape (224, 224, 3), number of classes
    '''
    def Residual_Block(inputs, num_filter, kernel_size, strides=(1, 1), with_conv_short_cut=False):
        x = Conv2D(num_filter, kernel_size= kernel_size, activation='relu', strides=strides, padding='same')(inputs)
        x = BatchNormalization(axis= 3)(x)
        x = Conv2D(num_filter, kernel_size= kernel_size, activation='relu', padding='same')(x)
        x = BatchNormalization(axis= 3)(x)

        if with_conv_short_cut:
            shortcut = Conv2D(num_filter, kernel_size=kernel_size, activation='relu', strides=strides, padding='same')(inputs)
            shortcut = BatchNormalization(axis= 3)(shortcut)
            x = add([x, shortcut])
            return x
        else:
            x = add([x, inputs])
            return x

    inputs = Input(shape= inputShape)
    x = ZeroPadding2D((3, 3))(inputs)

    x = Conv2D(64, kernel_size=(7, 7), strides=(2, 2), padding='valid')(x)
    x = BatchNormalization(axis=3)(x)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)

    x = Residual_Block(x, num_filter=64, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=64, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=64, kernel_size=(3, 3))

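    # ResNet-34 uses 3, 4, 6, 3 blocks per stage, so this stage needs four blocks
    x = Residual_Block(x, num_filter=128, kernel_size=(3, 3), strides=(2, 2), with_conv_short_cut=True)
    x = Residual_Block(x, num_filter=128, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=128, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=128, kernel_size=(3, 3))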

    x = Residual_Block(x, num_filter=256, kernel_size=(3, 3), strides=(2, 2), with_conv_short_cut=True)
    x = Residual_Block(x, num_filter=256, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=256, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=256, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=256, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=256, kernel_size=(3, 3))

    x = Residual_Block(x, num_filter=512, kernel_size=(3, 3), strides=(2, 2), with_conv_short_cut=True)
    x = Residual_Block(x, num_filter=512, kernel_size=(3, 3))
    x = Residual_Block(x, num_filter=512, kernel_size=(3, 3))

    x = AveragePooling2D(pool_size=(7, 7))(x)
    x = Flatten()(x)
    x = Dense(classes, activation='softmax')(x)

    model = Model(inputs= inputs, outputs= x)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    # plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)
    return model

def ResNet50_build(inputShape, classes):
    '''
    ResNet50 network
    INPUT  -> input shape (224, 224, 3), number of classes
    '''
    def Residual_Block_V2(inputs, filters, strides=(1, 1), with_conv_short_cut=False):
        F1, F2, F3 = filters
        x = Conv2D(F1, kernel_size=(1, 1), activation='relu', strides=strides, padding='same')(inputs)
        x = BatchNormalization(axis= 3)(x)
        x = Conv2D(F2, kernel_size=(3, 3), activation='relu', padding='same')(x)
        x = BatchNormalization(axis= 3)(x)
        x = Conv2D(F3, kernel_size=(1, 1), activation='relu', padding='same')(x)
        x = BatchNormalization(axis= 3)(x)

        if with_conv_short_cut:
            shortcut = Conv2D(F3, kernel_size=(1, 1), activation='relu', strides=strides, padding='same')(inputs)
            shortcut = BatchNormalization(axis= 3)(shortcut)
            x = add([x, shortcut])
            return x
        else:
            x = add([x, inputs])
            return x

    inputs = Input(shape= inputShape)
    x = ZeroPadding2D((3, 3))(inputs)

    x = Conv2D(64, kernel_size=(7, 7), strides=(2, 2), padding='valid')(x)
    x = BatchNormalization(axis=3)(x)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)

    x = Residual_Block_V2(x, filters=[64,64,256], strides=(1, 1), with_conv_short_cut=True)
    x = Residual_Block_V2(x, filters=[64,64,256])
    x = Residual_Block_V2(x, filters=[64,64,256])

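    # ResNet-50 uses 3, 4, 6, 3 blocks per stage, so this stage needs four blocks
    x = Residual_Block_V2(x, filters=[128,128,512], strides=(2, 2), with_conv_short_cut=True)
    x = Residual_Block_V2(x, filters=[128,128,512])
    x = Residual_Block_V2(x, filters=[128,128,512])
    x = Residual_Block_V2(x, filters=[128,128,512])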

    x = Residual_Block_V2(x, filters=[256,256,1024], strides=(2, 2), with_conv_short_cut=True)
    x = Residual_Block_V2(x, filters=[256,256,1024])
    x = Residual_Block_V2(x, filters=[256,256,1024])
    x = Residual_Block_V2(x, filters=[256,256,1024])
    x = Residual_Block_V2(x, filters=[256,256,1024])
    x = Residual_Block_V2(x, filters=[256,256,1024])

    x = Residual_Block_V2(x, filters=[512,512,2048], strides=(2, 2), with_conv_short_cut=True)
    x = Residual_Block_V2(x, filters=[512,512,2048])
    x = Residual_Block_V2(x, filters=[512,512,2048])

    x = AveragePooling2D(pool_size=(7, 7))(x)
    x = Flatten()(x)
    x = Dense(classes, activation='softmax')(x)

    model = Model(inputs= inputs, outputs= x)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    # plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)
    return model

7. DenseNet

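The CVPR 2017 best paper proposed a new architecture, DenseNet. Its basic idea is the same as ResNet's, but it builds dense connections from all preceding layers to each later layer. Another hallmark of DenseNet is feature reuse, achieved by concatenating features along the channel dimension. Together these let DenseNet outperform ResNet with fewer parameters and lower computational cost.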




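Dense connections are used only within a single dense block; there are no dense connections between different dense blocks. In addition, DenseNet converges better than other networks of the same depth, at the cost of a very large memory footprint.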

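Innovation
Uses dense connections to alleviate the vanishing gradient problem, strengthen feature propagation, and encourage feature reuse, which greatly reduces the number of parameters.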
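
How concatenation drives the channel count can be sketched in a few lines. The initial 64 filters, the block sizes [6, 12, 24, 16], and the 0.5 compression come from the Keras code in this section; the growth rate of 32 is an assumption taken from the standard DenseNet-121 configuration, and the helper names are hypothetical.

```python
def dense_block_channels(in_channels, num_layers, growth_rate):
    """Each layer appends growth_rate feature maps to the running concatenation."""
    return in_channels + num_layers * growth_rate

def transition_channels(in_channels, compression=0.5):
    """The 1x1 convolution between dense blocks shrinks the channel count."""
    return int(in_channels * compression)

channels = 64            # filters of the initial convolution
growth_rate = 32         # assumed: standard DenseNet-121 growth rate
blocks = [6, 12, 24, 16]

trace = []
for index, num_layers in enumerate(blocks):
    channels = dense_block_channels(channels, num_layers, growth_rate)
    trace.append(channels)
    if index < len(blocks) - 1:  # no transition after the last dense block
        channels = transition_channels(channels)
        trace.append(channels)
print(trace)  # [256, 128, 512, 256, 1024, 512, 1024]
```

Each layer adds only 32 new feature maps, which is how the network stays small in parameters, while keeping every earlier feature map alive is what costs memory.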

def Densenet121_build(inputShape, classes, filter_growth=32, reduction=0.5):
    '''
    Densenet121 network
    INPUT  -> input shape (224, 224, 3), number of classes, filter growth per layer (growth rate, 32 in DenseNet-121), reduction (adjusts the channel count between dense blocks)
    '''
    num_filter = 64
    num_layers_list = [6,12,24,16]
    compression_rate = 1.0 - reduction

    def dense_block(inputs, num_filter, num_layers):
        # Densely connected block: each layer takes the concatenation of all earlier outputs

        concat_feature = inputs
        
        for i in range(num_layers):

            # 1×1 Convolution (Bottleneck layer)
            inter_channel = filter_growth * 4
            x = BatchNormalization(axis=3)(concat_feature)
            x = Activation('relu')(x)
            x = Conv2D(inter_channel, kernel_size=(1, 1), kernel_initializer='he_normal', padding='same', use_bias=False)(x)

            # 3×3 Convolution
            x = BatchNormalization(axis=3)(x)
            x = Activation('relu')(x)
            x = Conv2D(filter_growth, kernel_size=(3, 3), kernel_initializer='he_normal', padding='same', use_bias=False)(x)
            
            # Concatenate with the running feature stack, not just the block input
            concat_feature = concatenate([concat_feature, x], axis=3)

            # Track the channel count so the transition block can compress it
            num_filter += filter_growth

        return concat_feature, num_filter
    
    def transition_block(inputs, num_filter, compression_rate):
        # Transition layer that connects consecutive dense_blocks

        x = BatchNormalization(axis=3)(inputs)
        x = Activation('relu')(x)
        x = Conv2D(int(num_filter * compression_rate), kernel_size=(1, 1), use_bias=False)(x)
        x = AveragePooling2D(pool_size=(2, 2), strides=(2, 2))(x)

        return x

    inputs = Input(shape= inputShape)
    x = ZeroPadding2D((3, 3))(inputs)

    x = Conv2D(num_filter, kernel_size=(7, 7), kernel_initializer='he_normal', activation='relu', padding='same', strides=(2, 2), use_bias=False)(x)
    x = BatchNormalization(axis=3)(x)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)
    
    for block_index in range(3):
        x, num_filter = dense_block(x, num_filter, num_layers_list[block_index])
        x = transition_block(x, num_filter, compression_rate)
        num_filter = int(num_filter * compression_rate)

    # Final dense_block
    x, nb_filter = dense_block(x, num_filter, num_layers_list[-1])
    x = BatchNormalization(axis=3)(x)
    x = Activation('relu')(x)
    x = GlobalAveragePooling2D()(x)

    x = Dense(classes, activation='softmax')(x)

    model = Model(inputs= inputs, outputs= x)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    # plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)
    return model
