Complete code: full VGGNet implementation
VGGNet is a deep convolutional neural network developed by the Visual Geometry Group at the University of Oxford together with researchers from Google DeepMind. It took second place in the ILSVRC 2014 classification task and first place in the localization task, with a top-5 error rate of 7.5%. Its main contribution was to demonstrate that network depth is a key ingredient of good performance. The authors also concluded that LRN (local response normalization) layers contribute little.
Paper: Very Deep Convolutional Networks for Large-Scale Image Recognition
The VGGNet architecture is shown in the figure below. VGGNet comes in several configurations, ranging from 11 to 19 weight layers deep; VGG16 and VGG19 are the most commonly used.
VGGNet parameter table:
Next we take VGG16, i.e. configuration D in the figure above, as an example to walk through VGGNet. The table below lists the input and output shapes of every layer; a short shape-check sketch follows the table.
Operation | Input shape | Output shape | Conv kernel | Conv stride | Pool size | Pool stride | Padding | Activation |
---|---|---|---|---|---|---|---|---|
Input | — | (224,224,3) | — | — | — | — | — | — |
Conv | (224,224,3) | (224,224,64) | (3,3,64) | 1 | — | — | same | relu |
Conv | (224,224,64) | (224,224,64) | (3,3,64) | 1 | — | — | same | relu |
Max pooling | (224,224,64) | (112,112,64) | — | — | (2,2) | 2 | valid | — |
Conv | (112,112,64) | (112,112,128) | (3,3,128) | 1 | — | — | same | relu |
Conv | (112,112,128) | (112,112,128) | (3,3,128) | 1 | — | — | same | relu |
Max pooling | (112,112,128) | (56,56,128) | — | — | (2,2) | 2 | valid | — |
Conv | (56,56,128) | (56,56,256) | (3,3,256) | 1 | — | — | same | relu |
Conv | (56,56,256) | (56,56,256) | (3,3,256) | 1 | — | — | same | relu |
Conv | (56,56,256) | (56,56,256) | (3,3,256) | 1 | — | — | same | relu |
Max pooling | (56,56,256) | (28,28,256) | — | — | (2,2) | 2 | valid | — |
Conv | (28,28,256) | (28,28,512) | (3,3,512) | 1 | — | — | same | relu |
Conv | (28,28,512) | (28,28,512) | (3,3,512) | 1 | — | — | same | relu |
Conv | (28,28,512) | (28,28,512) | (3,3,512) | 1 | — | — | same | relu |
Max pooling | (28,28,512) | (14,14,512) | — | — | (2,2) | 2 | valid | — |
Conv | (14,14,512) | (14,14,512) | (3,3,512) | 1 | — | — | same | relu |
Conv | (14,14,512) | (14,14,512) | (3,3,512) | 1 | — | — | same | relu |
Conv | (14,14,512) | (14,14,512) | (3,3,512) | 1 | — | — | same | relu |
Max pooling | (14,14,512) | (7,7,512) | — | — | (2,2) | 2 | valid | — |
Fully connected | (25088,) | (4096,) | — | — | — | — | — | relu |
Fully connected | (4096,) | (4096,) | — | — | — | — | — | relu |
Fully connected (output) | (4096,) | (1000,) | — | — | — | — | — | softmax |
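To make the shape bookkeeping in the table concrete, here is a minimal sanity-check sketch (not part of the original model code; it assumes the same Keras imports used below). A 3x3 convolution with stride 1 and 'same' padding preserves the spatial size, each 2x2 max pooling with stride 2 halves it, and flattening the final (7,7,512) feature map yields 7 * 7 * 512 = 25088 inputs for the first fully connected layer.

```python
# Minimal shape-check sketch (assumes Keras is installed, same import style as the code below).
from keras import layers, models

inputs = layers.Input(shape=(224, 224, 3))
x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(inputs)  # 'same' padding keeps 224x224
x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)                        # 2x2, stride-2 pooling halves to 112x112
print(models.Model(inputs, x).output_shape)  # (None, 112, 112, 64)
print(7 * 7 * 512)                           # 25088: length of the flattened (7, 7, 512) feature map
```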
```python
from keras import layers
from keras import models


def VGG16(input_shape=(224, 224, 3), classes=1000):
    """Builds VGG16 (configuration D); defaults follow the table above (224x224 RGB input, 1000 classes)."""
    img_input = layers.Input(shape=input_shape)
    # Block 1: two 3x3/64 convs, then 2x2 max pooling (224 -> 112)
    x = layers.Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    x = layers.Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
    # Block 2: two 3x3/128 convs, then 2x2 max pooling (112 -> 56)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
    # Block 3: three 3x3/256 convs, then 2x2 max pooling (56 -> 28)
    x = layers.Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = layers.Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = layers.Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
    # Block 4: three 3x3/512 convs, then 2x2 max pooling (28 -> 14)
    x = layers.Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = layers.Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = layers.Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
    # Block 5: three 3x3/512 convs, then 2x2 max pooling (14 -> 7)
    x = layers.Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = layers.Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = layers.Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
    # Classifier: flatten the (7, 7, 512) feature map, then three fully connected layers
    x = layers.Flatten(name='flatten')(x)
    x = layers.Dense(4096, activation='relu', name='fc1')(x)
    x = layers.Dense(4096, activation='relu', name='fc2')(x)
    x = layers.Dense(classes, activation='softmax', name='predictions')(x)
    # Create model.
    model = models.Model(img_input, x, name='vgg16')
    return model
```
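The function can then be used as in the sketch below. This usage example is not from the original post: the optimizer and loss are illustrative placeholders (the paper trained with SGD plus momentum, weight decay, and dropout, which are omitted here).

```python
# Usage sketch (illustrative): build the ImageNet-sized VGG16, print its summary,
# and compile it with placeholder training settings.
if __name__ == '__main__':
    model = VGG16(input_shape=(224, 224, 3), classes=1000)
    model.summary()  # roughly 138 million parameters for the 1000-class configuration
    model.compile(optimizer='sgd',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
```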