✨Blog homepage: 米开朗琪罗~
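The VGG network was designed jointly by the Visual Geometry Group at the University of Oxford and Google DeepMind.
At ILSVRC 2014 it took first place in the localization task and second place in the classification task.
With VGG, the authors demonstrated an important conclusion: network depth and the use of small convolution kernels have a large effect on image recognition and classification accuracy!
Paper: Very Deep Convolutional Networks for Large-Scale Image Recognition
The most commonly used configurations today are VGG-16 and VGG-19.
VGG comes in six configurations. Every one of them contains five groups of convolutional layers, each group followed by a max-pooling layer, with three fully connected layers at the end.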
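To make the five-group layout concrete, here is a minimal sketch that traces the side length of the feature map through the five pooling stages. It assumes a 224×224 input, 'same'-padded stride-1 convolutions (which leave height and width unchanged) and non-overlapping 2×2 max pooling. The helper name `trace_feature_map_sizes` is made up for illustration.

```python
def trace_feature_map_sizes(input_size=224, num_blocks=5, pool=2):
    """Return the feature-map side length at the input and after each pooling.

    Assumes 'same'-padded, stride-1 convolutions (no change in size) and
    non-overlapping max pooling with window == stride == `pool`.
    """
    sizes = [input_size]
    for _ in range(num_blocks):
        sizes.append(sizes[-1] // pool)  # each pooling halves height and width
    return sizes


sizes = trace_feature_map_sizes()
print(sizes)  # [224, 112, 56, 28, 14, 7]

# The last conv group has 512 channels, so this is the width of the
# flattened vector that feeds the first fully connected layer.
flattened = sizes[-1] * sizes[-1] * 512
print(flattened)  # 25088
```

The final 7×7×512 feature map explains why the first fully connected layer is by far the largest layer in the network.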
Parameter counts of the configurations compared:
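The figures can be reproduced with a little arithmetic. The sketch below counts weight layers and parameters for configurations A, B, C, D and E as laid out in the paper's configuration table. The sixth configuration, A-LRN, has the same learnable weights as A, because LRN adds none. It assumes a 224×224 RGB input and the paper's 4096-4096-1000 fully connected head. The names `block`, `CONFIGS`, `count_params` and `count_weight_layers` are illustrative only.

```python
def block(channels, n3x3, n1x1=0):
    """One conv group: (kernel, out_channels) tuples followed by a 2x2 max pool 'M'."""
    return [(3, channels)] * n3x3 + [(1, channels)] * n1x1 + ["M"]


CONFIGS = {
    "A": block(64, 1) + block(128, 1) + block(256, 2) + block(512, 2) + block(512, 2),
    "B": block(64, 2) + block(128, 2) + block(256, 2) + block(512, 2) + block(512, 2),
    "C": block(64, 2) + block(128, 2) + block(256, 2, 1) + block(512, 2, 1) + block(512, 2, 1),
    "D": block(64, 2) + block(128, 2) + block(256, 3) + block(512, 3) + block(512, 3),
    "E": block(64, 2) + block(128, 2) + block(256, 4) + block(512, 4) + block(512, 4),
}
FC_UNITS = [4096, 4096, 1000]  # fully connected head from the paper


def count_params(config, in_channels=3, input_size=224):
    """Total weights and biases of the conv layers plus the FC head."""
    total, channels, size = 0, in_channels, input_size
    for layer in config:
        if layer == "M":
            size //= 2  # pooling has no parameters, it only halves the map
        else:
            kernel, out_channels = layer
            total += kernel * kernel * channels * out_channels + out_channels
            channels = out_channels
    features = size * size * channels  # flattened input to the first FC layer
    for units in FC_UNITS:
        total += features * units + units
        features = units
    return total


def count_weight_layers(config):
    """Number of layers with learnable weights (conv + FC)."""
    return sum(1 for layer in config if layer != "M") + len(FC_UNITS)


for name, cfg in CONFIGS.items():
    print(f"{name}: {count_weight_layers(cfg)} weight layers, "
          f"{count_params(cfg) / 1e6:.0f}M parameters")
```

Under these assumptions the loop reports 11, 13, 16, 16 and 19 weight layers with roughly 133M, 133M, 134M, 138M and 144M parameters. The fully connected head dominates every total, so depth barely changes the overall count.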
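Deep network with small convolution kernels (all 3×3 or 1×1) and small pooling windows (all 2×2);
1×1 convolution kernels are used (in configuration C);
The authors point out that although LRN helps somewhat in AlexNet, it brings no real benefit in VGG and adds unnecessary computation, so VGG drops LRN;
VGG adds weight regularization and applies Dropout to the FC layers in order to reduce the risk of overfitting;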
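The case for small kernels in the list above can be checked with simple arithmetic. The sketch below compares a stack of stride-1 3×3 convolutions with a single larger kernel that covers the same receptive field. It assumes C input and C output channels for every layer and ignores biases. The helper names are hypothetical.

```python
def receptive_field(num_layers, kernel=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels, num_layers=1):
    """Weight count (no biases) for conv layers with `channels` in and out."""
    return num_layers * kernel * kernel * channels * channels


C = 512  # channel width assumed for the comparison
for n in (2, 3):
    rf = receptive_field(n)
    stacked = conv_weights(3, C, num_layers=n)
    single = conv_weights(rf, C)
    print(f"{n} stacked 3x3 convs see {rf}x{rf}: "
          f"{stacked:,} weights vs {single:,} for one {rf}x{rf} conv")

print(f"one 1x1 conv: {conv_weights(1, C):,} weights")
```

Three stacked 3×3 layers cover a 7×7 region with 27C² weights instead of 49C², and they apply three ReLU non-linearities instead of one. A 1×1 layer adds a further non-linearity without enlarging the receptive field.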
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
model = Sequential()
# Block 1: two 3x3 convolutions with 64 filters, then 2x2 max pooling (224 -> 112)
model.add(Conv2D(64, (3, 3), strides=(1, 1), input_shape=(224, 224, 3), padding='same', activation='relu', kernel_initializer='uniform'))
model.add(Conv2D(64, (3, 3), strides=(1, 1), padding='same', activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D((2, 2)))
# Block 2: two 3x3 convolutions with 128 filters, then pooling (112 -> 56)
model.add(Conv2D(128, (3, 3), strides=(1, 1), padding='same', activation='relu', kernel_initializer='uniform'))
model.add(Conv2D(128, (3, 3), strides=(1, 1), padding='same', activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D((2, 2)))
# Block 3: two 3x3 convolutions and one 1x1 convolution with 256 filters, then pooling (56 -> 28)
model.add(Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(Conv2D(256, (1, 1), strides=(1, 1), padding='same', activation='relu'))
model.add(MaxPooling2D((2, 2)))
# Block 4: two 3x3 convolutions and one 1x1 convolution with 512 filters, then pooling (28 -> 14)
model.add(Conv2D(512, (3, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(Conv2D(512, (3, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(Conv2D(512, (1, 1), strides=(1, 1), padding='same', activation='relu'))
model.add(MaxPooling2D((2, 2)))
# Block 5: same layout as block 4, then pooling (14 -> 7)
model.add(Conv2D(512, (3, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(Conv2D(512, (3, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(Conv2D(512, (1, 1), strides=(1, 1), padding='same', activation='relu'))
model.add(MaxPooling2D((2, 2)))
# Classifier head: flatten to 7 * 7 * 512 = 25088 features, then fully connected layers
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
# The paper ends with a single 1000-way softmax; this variant keeps a 1000-unit ReLU layer
# and appends a 10-class softmax output
model.add(Dense(1000, activation='relu'))
model.add(Dense(10, activation='softmax'))
# summary() prints the table itself; wrapping it in print() would also print "None"
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 224, 224, 64)      1792
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 224, 224, 64)      36928
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 112, 112, 64)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 112, 112, 128)     73856
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 112, 112, 128)     147584
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 56, 56, 128)       0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 56, 56, 256)       295168
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 56, 56, 256)       590080
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 56, 56, 256)       65792
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 28, 28, 256)       0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 28, 28, 512)       1180160
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 28, 28, 512)       2359808
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 28, 28, 512)       262656
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 512)       0
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 14, 14, 512)       2359808
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 14, 14, 512)       2359808
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 14, 14, 512)       262656
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 7, 7, 512)         0
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0
_________________________________________________________________
dense (Dense)                (None, 4096)              102764544
_________________________________________________________________
dense_1 (Dense)              (None, 4096)              16781312
_________________________________________________________________
dense_2 (Dense)              (None, 1000)              4097000
_________________________________________________________________
dense_3 (Dense)              (None, 10)                10010
=================================================================
Total params: 133,648,962
Trainable params: 133,648,962
Non-trainable params: 0
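Although VGG was proposed back in 2014, it is still widely used today!
Many frameworks let you load a pretrained VGG model directly through their API!
Many vision tasks also build a loss function on the output of an intermediate VGG layer (perceptual loss, for example).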
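As a rough illustration of the last two points, the sketch below loads VGG16 through `keras.applications` and uses one intermediate layer as a feature extractor for a perceptual-style loss. It assumes TensorFlow 2.x with its bundled Keras. The layer choice `block3_conv3`, the helper names and the mean-squared feature distance are illustrative assumptions, not something the original post prescribes. With `weights=None` the network is randomly initialized, which only demonstrates the wiring; pass `weights="imagenet"` to download the pretrained weights.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras


def build_feature_extractor(layer_name="block3_conv3", weights=None,
                            input_shape=(224, 224, 3)):
    """Return a frozen model that maps images to one VGG16 feature map."""
    base = keras.applications.VGG16(include_top=False, weights=weights,
                                    input_shape=input_shape)
    base.trainable = False  # the loss network itself is not trained
    return keras.Model(inputs=base.input,
                       outputs=base.get_layer(layer_name).output)


def perceptual_loss(extractor, img_a, img_b):
    """Mean squared distance between the VGG features of two image batches.

    Images are expected as RGB with pixel values in [0, 255].
    """
    preprocess = keras.applications.vgg16.preprocess_input
    feat_a = extractor(preprocess(tf.convert_to_tensor(img_a, dtype=tf.float32)))
    feat_b = extractor(preprocess(tf.convert_to_tensor(img_b, dtype=tf.float32)))
    return tf.reduce_mean(tf.square(feat_a - feat_b))


# Small input size to keep the demo light (hypothetical data).
extractor = build_feature_extractor(input_shape=(64, 64, 3))
x = np.random.rand(2, 64, 64, 3) * 255.0
print(float(perceptual_loss(extractor, x, x)))  # identical inputs give 0.0
```

Comparing images in feature space rather than pixel space is what makes the loss "perceptual". Which layer to tap is a design choice: earlier layers emphasize texture and edges, later ones emphasize content.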