Deep Convolutional Networks for Large-Scale Image Recognition: VGG

The VGG Network

    • 1. Background
    • 2. Contributions of the Paper
    • 3. Building the Network

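1. Background

Background: the VGG network was introduced in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition" by the Visual Geometry Group at the University of Oxford, UK. It took second place in the image classification task of ILSVRC 2014.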

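2. Contributions of the Paper

Data preprocessing: the mean RGB value, computed over the training set, is subtracted from each channel of the input.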
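
As a rough illustration of that preprocessing step, here is a minimal sketch in plain Python. The two tiny "training images" and the helper names `channel_means` and `subtract_mean` are made up for this example; a real pipeline would compute the means over the whole training set with an array library.

```python
# Hypothetical toy data: an image is a list of rows, a row is a list of (R, G, B) pixels.
def channel_means(images):
    """Return the mean R, G and B values over every pixel of every image."""
    sums = [0.0, 0.0, 0.0]
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for c in range(3):
                    sums[c] += pixel[c]
                count += 1
    return [s / count for s in sums]


def subtract_mean(image, means):
    """Subtract the per-channel training-set mean from every pixel of one image."""
    return [[tuple(pixel[c] - means[c] for c in range(3)) for pixel in row]
            for row in image]


train_images = [
    [[(10, 20, 30), (30, 40, 50)]],   # a 1x2 image
    [[(50, 60, 70), (70, 80, 90)]],   # another 1x2 image
]
means = channel_means(train_images)               # computed once, on the training set only
centered = subtract_mean(train_images[0], means)  # the same means are reused for every image
print(means)      # [40.0, 50.0, 60.0]
print(centered)   # [[(-30.0, -30.0, -30.0), (-10.0, -10.0, -10.0)]]
```

The means come from the training set only and are then applied unchanged to every image, including validation and test images.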

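Network architecture: the configurations of main interest are VGG16 (D in the figure below) and VGG19 (E in the figure below):
[Image 1: VGG network configurations]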
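
Since the figure is not reproduced here, the two configurations can be summarized with a short sketch. The per-block convolution counts match the layer names used in the code in section 3; the names `CONFIGS`, `weight_layers` and `spatial_sizes` are only illustrative.

```python
# Number of 3x3 convolution layers in each of the five blocks.
CONFIGS = {
    "VGG16 (D)": [2, 2, 3, 3, 3],
    "VGG19 (E)": [2, 2, 4, 4, 4],
}
FC_LAYERS = 3  # two 4096-unit layers plus the final classifier


def weight_layers(convs_per_block):
    """Count the layers that carry weights: all conv layers plus the FC layers."""
    return sum(convs_per_block) + FC_LAYERS


def spatial_sizes(input_size, num_blocks=5):
    """Feature-map side length after each block; only the 2x2, stride-2 pooling halves it."""
    sizes = [input_size]
    for _ in range(num_blocks):
        sizes.append(sizes[-1] // 2)
    return sizes


for name, cfg in CONFIGS.items():
    print(name, weight_layers(cfg))   # VGG16 (D) 16, then VGG19 (E) 19
print(spatial_sizes(224))             # [224, 112, 56, 28, 14, 7]
```

This is where the "16" and "19" in the names come from: pooling layers have no weights and are not counted.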

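What the paper accomplished:

  • Tested small receptive fields combined with greater depth. Three stacked 3x3 convolutions replace one 7x7 convolution, and two stacked 3x3 convolutions replace one 5x5 convolution. The receptive field stays the same while the network becomes deeper, which improves results to some extent. Stride and padding are set so that convolution layers do not change the feature-map size; only max pooling reduces it. Together with increasing the depth to 16 or 19 layers, this improves classification accuracy.
  • Examined the LRN layer from AlexNet: it did not improve results, and it increased memory consumption and computation time.
  • Sped up convergence by initializing certain layers with parameters obtained from earlier training. A shallow network, such as network A in the figure above, can be initialized directly with random values. A deeper network initializes its first few convolutional layers and its final fully connected layers with the parameters of a shallower network that has already been trained.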
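
The receptive-field claim in the first bullet can be checked with a little arithmetic. The sketch below assumes stride-1 convolutions with the same number of input and output channels, and ignores biases. The channel width `C = 256` is an arbitrary value chosen for illustration. It also shows a side effect the bullet does not mention: the stacked 3x3 layers need fewer weights than the single large kernel.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    rf = 1
    for _ in range(num_layers):
        rf += kernel_size - 1   # each stride-1 layer widens the field by (k - 1)
    return rf


def conv_weights(kernel_size, channels, num_layers=1):
    """Weights (biases ignored) of conv layers with `channels` inputs and outputs."""
    return num_layers * kernel_size * kernel_size * channels * channels


C = 256  # hypothetical channel width, for illustration only
print(stacked_receptive_field(3, 2))      # 5, same as one 5x5 kernel
print(stacked_receptive_field(3, 3))      # 7, same as one 7x7 kernel
print(conv_weights(3, C, num_layers=3))   # 1769472  (27 * C^2)
print(conv_weights(7, C))                 # 3211264  (49 * C^2)
```

Each 3x3 layer is also followed by its own ReLU, so the stack applies three non-linearities where the single 7x7 layer applies one.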

3. Building the Network

VGG.py

"""VGG
# References:
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](
    https://arxiv.org/abs/1409.1556) (ICLR 2015)
"""

import tensorflow as tf  # needed at module level: VGG19 uses tf.keras.Input and tf.keras.Model
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D
from tensorflow.keras.models import Sequential

# Build VGG16 with Keras Sequential Model
def VGG16(input_shape=(224,224,3),classes=1000,include_top=True):

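    # check include_top and classes: this implementation only accepts the
    # 1000-class ImageNet head when the fully connected layers are included
    if include_top and classes!=1000:
        raise ValueError("If include_top is True, classes should be 1000.")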

    model = Sequential()

    # Block 1
    model.add(Conv2D(input_shape=input_shape,filters=64,kernel_size=3,strides=1,padding='same',activation='relu',name='block1_conv1'))
    model.add(Conv2D(64,3,strides=1,padding='same',activation='relu',name='block1_conv2'))
    model.add(MaxPooling2D(2,2,'same',name='block1_maxpool'))

    # Block 2
    model.add(Conv2D(128,3,strides=1,padding='same',activation='relu',name='block2_conv1'))
    model.add(Conv2D(128,3,strides=1,padding='same',activation='relu',name='block2_conv2'))
    model.add(MaxPooling2D(2,2,'same',name='block2_maxpool'))

    # Block 3
    model.add(Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv1'))
    model.add(Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv2'))
    model.add(Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv3'))
    model.add(MaxPooling2D(2,2,'same',name='block3_maxpool'))

    # Block 4
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv1'))
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv2'))
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv3'))
    model.add(MaxPooling2D(2,2,'same',name='block4_maxpool'))

    # Block 5
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv1'))
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv2'))
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv3'))
    model.add(MaxPooling2D(2,2,'same',name='block5_maxpool'))

    # include fc layer
    if include_top:
        model.add(Flatten(name='flatten'))
        model.add(Dense(4096,activation='relu',name='fc_layer1'))
        model.add(Dense(4096,activation='relu',name='fc_layer2'))
        model.add(Dense(classes,activation='softmax',name='predictions_layer'))

    return model

# Build VGG19 with Keras Functional API
def VGG19(input_shape=(224,224,3),classes=1000,include_top=True):

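    # check include_top and classes: this implementation only accepts the
    # 1000-class ImageNet head when the fully connected layers are included
    if include_top and classes!=1000:
        raise ValueError("If include_top is True, classes should be 1000.")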
    
    input_ = tf.keras.Input(shape=input_shape)
    
    # Block 1
    net = Conv2D(64,3,strides=1,padding='same',activation='relu',name='block1_conv1')(input_)
    net = Conv2D(64,3,strides=1,padding='same',activation='relu',name='block1_conv2')(net)
    net = MaxPooling2D(2,2,'same',name='block1_maxpool')(net)

    # Block 2
    net = Conv2D(128,3,strides=1,padding='same',activation='relu',name='block2_conv1')(net)
    net = Conv2D(128,3,strides=1,padding='same',activation='relu',name='block2_conv2')(net)
    net = MaxPooling2D(2,2,'same',name='block2_maxpool')(net)

    # Block 3
    net = Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv1')(net)
    net = Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv2')(net)
    net = Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv3')(net)
    net = Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv4')(net)
    net = MaxPooling2D(2,2,'same',name='block3_maxpool')(net)

    # Block 4
    net = Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv1')(net)
    net = Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv2')(net)
    net = Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv3')(net)
    net = Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv4')(net)
    net = MaxPooling2D(2,2,'same',name='block4_maxpool')(net)

    # Block 5
    net = Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv1')(net)
    net = Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv2')(net)
    net = Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv3')(net)
    net = Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv4')(net)
    net = MaxPooling2D(2,2,'same',name='block5_maxpool')(net)

    if include_top:
        net = Flatten(name='flatten')(net)
        net = Dense(4096, activation='relu', name='fc1')(net)
        net = Dense(4096, activation='relu', name='fc2')(net)
        net = Dense(classes, activation='softmax', name='predictions')(net)

    model = tf.keras.Model(input_, net, name='VGG19')

    return model

if __name__=='__main__':
    
    # set env: reduce TensorFlow log output and hide all GPUs (run on CPU only)
    import tensorflow as tf
    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
    os.environ['CUDA_VISIBLE_DEVICES']='-1'

    # enable memory growth on every visible GPU (the list is empty when GPUs are hidden)
    phy_gpus = tf.config.experimental.list_physical_devices('GPU')
    for gpu in phy_gpus:
        tf.config.experimental.set_memory_growth(gpu,True)

    # test model
    model = VGG16(input_shape=(224,224,3),include_top=True,classes=1000)
    model.summary()
    #   Total params: 138,357,544
    #   Trainable params: 138,357,544
    #   Non-trainable params: 0

    #model = VGG19(input_shape=(224,224,3),include_top=True,classes=1000)
    #model.summary()
    #   Total params: 143,667,240
    #   Trainable params: 143,667,240
    #   Non-trainable params: 0
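
The parameter totals quoted in the comments above can be reproduced without TensorFlow. The sketch below assumes 3x3 convolutions with biases, block widths of 64, 128, 256, 512 and 512, a 224x224 RGB input and the three fully connected layers. The helper names are hypothetical.

```python
def conv_params(in_ch, out_ch, k=3):
    """Weights plus biases of one k x k convolution layer."""
    return k * k * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    """Weights plus biases of one fully connected layer."""
    return in_units * out_units + out_units


def vgg_params(convs_per_block, classes=1000, input_size=224, in_ch=3):
    """Total trainable parameters of a VGG-style network with the FC head."""
    widths = [64, 128, 256, 512, 512]
    total = 0
    for width, n in zip(widths, convs_per_block):
        for _ in range(n):
            total += conv_params(in_ch, width)
            in_ch = width
    side = input_size // 2 ** len(widths)             # 224 -> 7 after five poolings
    total += dense_params(side * side * in_ch, 4096)  # flatten -> first FC layer
    total += dense_params(4096, 4096)
    total += dense_params(4096, classes)
    return total


print(vgg_params([2, 2, 3, 3, 3]))   # 138357544 (VGG16)
print(vgg_params([2, 2, 4, 4, 4]))   # 143667240 (VGG19)
```

Most of the parameters sit in the first fully connected layer (7 x 7 x 512 inputs to 4096 units), not in the convolutions.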
