GoogLeNet (Inception V1) first appeared in the ILSVRC 2014 competition and took first place by a clear margin, with a top-5 error rate of 6.67% versus 7.32% for VGGNet. GoogLeNet is 22 layers deep, deeper than AlexNet's 8 layers and VGGNet's 19 layers. Yet GoogLeNet has only about 5 million parameters, 1/12 as many as AlexNet (while far exceeding AlexNet's accuracy), and VGGNet has roughly 3 times as many parameters as AlexNet, so when memory or compute is limited, GoogLeNet is a good choice.
The basic Inception Module has four branches, each using 1×1 convolutions to perform low-cost cross-channel feature transforms. The Inception Module lets the network grow efficiently in both depth and width, improving accuracy without overfitting; a minimal sketch of the four-branch pattern follows.
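As a concrete illustration, here is a minimal sketch of such a four-branch module, assuming TensorFlow 1.x with tf.contrib.slim (the same toolkit used in the implementation later in this article); the branch widths are illustrative, not the exact configuration of any particular GoogLeNet layer:

```python
import tensorflow as tf
slim = tf.contrib.slim

def naive_inception_module(net, scope='Inception'):
    # Four parallel branches over the same input, concatenated along channels.
    with tf.variable_scope(scope):
        with tf.variable_scope('Branch_0'):  # plain 1x1 convolution
            branch_0 = slim.conv2d(net, 64, [1, 1])
        with tf.variable_scope('Branch_1'):  # 1x1 channel reduction, then 3x3
            branch_1 = slim.conv2d(net, 48, [1, 1])
            branch_1 = slim.conv2d(branch_1, 64, [3, 3])
        with tf.variable_scope('Branch_2'):  # 1x1 channel reduction, then 5x5
            branch_2 = slim.conv2d(net, 48, [1, 1])
            branch_2 = slim.conv2d(branch_2, 64, [5, 5])
        with tf.variable_scope('Branch_3'):  # 3x3 pooling, then 1x1 projection
            branch_3 = slim.max_pool2d(net, [3, 3], stride=1, padding='SAME')
            branch_3 = slim.conv2d(branch_3, 32, [1, 1])
        # 'SAME' padding keeps the spatial size; only the channel count grows.
        return tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
```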
The connections between neurons in the human brain are sparse, so researchers argued that the natural connectivity for large neural networks should also be sparse (for very large, very deep networks, sparsity mitigates overfitting and reduces computation). The sparse structure proposed in the paper is based on the Hebbian principle.

Hebbian principle: the persistence and repetition of neural activity lead to a lasting increase in the stability of connections between neurons; when two neurons A and B are close together and A repeatedly and persistently takes part in exciting B, some metabolic change makes A more effective as one of the cells exciting B. As Figure 2 shows, highly correlated nodes in the previous layer are clustered, and each resulting cluster is wired to a unit of the next layer.

A "good" sparse structure should therefore follow the Hebbian principle: highly correlated neurons should be connected together. In image data, nearby pixels are strongly correlated, which is why convolution connects pixels in a local neighborhood. With multiple convolution kernels, the outputs at the same spatial position but in different channels are also highly correlated, and a 1×1 convolution is exactly what connects these strongly correlated same-position, cross-channel features together.
As Figure 3 shows, GoogLeNet (Inception V1) is 22 layers deep, and besides the final layer, the intermediate layers also classify well. InceptionNet therefore adds auxiliary classifiers: the output of an intermediate layer is used for classification and added to the final result with a small weight (0.3). This amounts to a form of model fusion, injects extra back-propagated gradient signal into the network, and provides additional regularization, all of which benefit training; the loss weighting is sketched below.
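A hedged sketch of how the auxiliary head enters training, assuming `logits`, `aux_logits`, and `labels` are tensors already produced elsewhere:

```python
# Both heads use the same labels; the auxiliary loss enters with weight 0.3,
# so its gradients reach the lower layers without dominating training.
main_loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
aux_loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=aux_logits)
total_loss = main_loss + 0.3 * aux_loss
```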
Training used asynchronous SGD, with the learning rate lowered by 4% every 8 epochs. Inception V1 also used data-augmentation methods such as Multi-Scale and Multi-Crop, and fused 7 models trained on differently sampled data.
- Learning from VGGNet, Inception V2 replaces the large 5×5 convolution with two 3×3 convolutions (to cut the parameter count and reduce overfitting), and it also introduces the now-famous Batch Normalization (BN) method.
- BN is a very effective regularization method that can speed up the training of large convolutional networks many times over, while also substantially improving classification accuracy after convergence. Applied to a layer, BN standardizes the activations within each mini-batch so that the output follows an N(0, 1) normal distribution, reducing Internal Covariate Shift (the shifting distribution of internal activations).
- The BN paper points out that when training a traditional deep network, the input distribution of every layer keeps changing, which makes training difficult and forces a very small learning rate. With BN applied to every layer this problem is effectively solved: the learning rate can be increased many times over, and reaching the previous accuracy takes only 1/14 of the iterations, greatly shortening training time. (BN also acts as a regularizer in some sense, so Dropout can be reduced or removed, simplifying the network.)
- Using BN alone does not yield the full gain; several accompanying adjustments are also needed:
- Increase the learning rate and speed up learning-rate decay to suit the BN-normalized data;
- Remove Dropout and lighten L2 regularization (BN already provides regularization);
- Remove LRN;
- Shuffle the training samples more thoroughly;
- Reduce the photometric distortions applied during data augmentation (BN trains faster, so each sample is seen fewer times and more realistic samples help more);
With these measures, Inception V2 reached Inception V1's accuracy 14 times faster, and its accuracy ceiling at convergence was also higher. A minimal sketch of the transform BN itself applies follows.
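For reference, the per-channel transform BN applies to a mini-batch can be sketched as follows (assuming NHWC tensors and learned per-channel `gamma` and `beta`; at inference time moving averages replace the batch statistics, which is what the `batch_norm_params` dictionary in the implementation below configures):

```python
def batch_norm_sketch(x, gamma, beta, epsilon=0.001):
    # Statistics over the batch and spatial dimensions, computed per channel.
    mean, variance = tf.nn.moments(x, axes=[0, 1, 2])
    x_hat = (x - mean) / tf.sqrt(variance + epsilon)  # roughly N(0, 1) per channel
    return gamma * x_hat + beta  # learned scale/shift restores expressiveness
```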
Inception V3 reworks the design in two main ways:

- First, it introduces the idea of Factorization into small convolutions: a larger two-dimensional convolution is split into two smaller one-dimensional convolutions, for example a 7×7 convolution into a 1×7 and a 7×1 convolution, or a 3×3 convolution into a 1×3 and a 3×1 convolution, as shown in Figure 4.
Figure 4. Splitting a 3×3 convolution into a 1×3 and a 3×1 convolution

On one hand this saves a large number of parameters, speeding up computation and reducing overfitting (splitting a 7×7 convolution into 1×7 and 7×1 is even more economical than splitting it into three 3×3 convolutions); at the same time it inserts an extra nonlinearity, extending the model's expressive power. This asymmetric factorization gives clearly better results than a symmetric split into several identical smaller kernels: it can handle more, richer spatial features and increases feature diversity. The parameter arithmetic below makes the savings concrete.
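Here is the parameter arithmetic for a single layer with C input channels and C output channels (biases ignored; C = 192 is just an example width):

```python
C = 192
full_7x7 = 7 * 7 * C * C                 # 49*C^2 = 1806336
three_3x3 = 3 * (3 * 3 * C * C)          # 27*C^2 = 995328 (same 7x7 receptive field)
split_1x7_7x1 = (1 * 7 + 7 * 1) * C * C  # 14*C^2 = 516096
```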
- Second, the Inception Module itself is redesigned: Inception V3 has three kinds of Inception Module, for the 35×35, 17×17, and 8×8 feature-map sizes, as shown in Figure 5.

Figure 5. The three kinds of Inception Module in Inception V3

These Inception Modules appear only in the rear part of the network; the front part still consists of ordinary convolution layers. Besides the branches inside each Inception Module, Inception V3 also uses branches within branches (in the 8×8 modules), making it, so to speak, a Network In Network In Network.
The Inception V3 network structure is shown in the following table:
| Type | Kernel size / stride (or note) | Input size |
| --- | --- | --- |
| conv | 3×3 / 2 | 299×299×3 |
| conv | 3×3 / 1 | 149×149×32 |
| conv | 3×3 / 1 | 147×147×32 |
| pool | 3×3 / 2 | 147×147×64 |
| conv | 3×3 / 1 | 73×73×64 |
| conv | 3×3 / 2 | 71×71×80 |
| conv | 3×3 / 1 | 35×35×192 |
| Inception module group | 3 Inception Modules | 35×35×288 |
| Inception module group | 5 Inception Modules | 17×17×768 |
| Inception module group | 3 Inception Modules | 8×8×1280 |
| pool | 8×8 | 8×8×2048 |
| linear | logits | 1×1×2048 |
| Softmax | classifier output | 1×1×1000 |
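The spatial sizes in the table follow the usual VALID-padding formula out = (in − kernel) // stride + 1 (the third convolution uses SAME padding, as in the code below, so its size is unchanged). A quick sanity check:

```python
def out_size(in_size, kernel, stride):
    # output spatial size of a VALID-padded convolution or pooling layer
    return (in_size - kernel) // stride + 1

assert out_size(299, 3, 2) == 149  # first 3x3/2 convolution
assert out_size(149, 3, 1) == 147  # second 3x3/1 convolution
assert out_size(147, 3, 2) == 73   # 3x3/2 max pooling
assert out_size(71, 3, 2) == 35    # 3x3/2 convolution down to 35x35
```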
Compared with Inception V3, V4 mainly incorporates the ideas of Microsoft's ResNet (residual connections).
The code below implements Inception V3, with the network structure as in the table above. Because Inception V3 is fairly complex, tf.contrib.slim is used to assist in building the network: the helpers and components in contrib.slim greatly reduce the amount of code needed, so only a modest amount of code builds the full 42-layer-deep Inception V3.

Implementation code:
import tensorflow as tf
from datetime import datetime
import time
import math

slim = tf.contrib.slim
# trunc_normal: truncated-normal weight initializer with the given stddev
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)
num_batches = 100
'''
inception_v3_arg_scope: generates default arguments for the functions used
throughout the network, such as the convolution activation function, weight
initializer, and normalizer. The L2 weight_decay defaults to 0.00004, stddev
defaults to 0.1, and batch_norm_var_collection defaults to 'moving_vars'.
'''
def inception_v3_arg_scope(weight_decay=0.00004,
                           stddev=0.1,
                           batch_norm_var_collection='moving_vars'):
    # parameter dictionary for slim.batch_norm
    batch_norm_params = {
        'decay': 0.9997,  # decay rate for the moving averages
        'epsilon': 0.001,
        'updates_collections': tf.GraphKeys.UPDATE_OPS,
        'variables_collections': {
            'beta': None,
            'gamma': None,
            'moving_mean': [batch_norm_var_collection],
            'moving_variance': [batch_norm_var_collection],
        }
    }
    # slim.arg_scope automatically supplies default values for the listed
    # functions' arguments
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        weights_regularizer=slim.l2_regularizer(weight_decay)):
        with slim.arg_scope(
                [slim.conv2d],
                weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                activation_fn=tf.nn.relu,
                normalizer_fn=slim.batch_norm,
                normalizer_params=batch_norm_params) as sc:
            return sc
'''
inception_v3_base: builds the convolutional part of the Inception V3 network.
inputs is the tensor of input images; scope carries the function defaults.
The plain convolutional layers end at 35x35x192; after the three Inception
module groups the base returns an 8x8x2048 tensor.
'''
def inception_v3_base(inputs, scope=None):
    end_points = {}  # stores key nodes for later use
    with tf.variable_scope(scope, 'InceptionV3', [inputs]):
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                            stride=1, padding='VALID'):  # defaults for these ops
            net = slim.conv2d(inputs, 32, [3, 3], stride=2, scope='Conv2d_1a_3x3')
            net = slim.conv2d(net, 32, [3, 3], scope='Conv2d_2a_3x3')
            net = slim.conv2d(net, 64, [3, 3], padding='SAME', scope='Conv2d_2b_3x3')
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_3a_3x3')
            net = slim.conv2d(net, 80, [1, 1], scope='Conv2d_3b_1x1')
            net = slim.conv2d(net, 192, [3, 3], scope='Conv2d_4a_3x3')
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_5a_3x3')
        # 1st Inception module group: three modules
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                            stride=1, padding='SAME'):
            # 1st Inception Module
            with tf.variable_scope('Mixed_5b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
                # 64+64+96+32 = 256 output channels; with 'SAME' padding the
                # output is 35x35x256
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # 2nd Inception Module
            with tf.variable_scope('Mixed_5c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                # 64+64+96+64 = 288 output channels; the output is 35x35x288
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # 3rd Inception Module
            with tf.variable_scope('Mixed_5d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                # 64+64+96+64 = 288 output channels; the output is 35x35x288
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # 2nd Inception module group: five modules
            # 1st Inception Module
            with tf.variable_scope('Mixed_6a'):
                with tf.variable_scope('Branch_0'):
                    # shrinks the feature map to 17x17
                    branch_0 = slim.conv2d(net, 384, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_1a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_1a_1x1')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_1b_3x3')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_1c_3x3')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2,
                                               padding='VALID', scope='MaxPool_1a_3x3')
                # 384+96+288 = 768 output channels; the output is 17x17x768
                net = tf.concat([branch_0, branch_1, branch_2], 3)
            # the remaining 4 modules all use 'Factorization into small convolutions'
            # 2nd Inception Module
            with tf.variable_scope('Mixed_6b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_2a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_2a_1x1')
                    branch_1 = slim.conv2d(branch_1, 128, [1, 7], scope='Conv2d_2b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_2c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_2a_1x1')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_2b_7x1')
                    branch_2 = slim.conv2d(branch_2, 128, [1, 7], scope='Conv2d_2c_1x7')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_2d_7x1')
                    # the final 1x7 conv outputs 192 channels so the branches sum to 768
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_2e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_2a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_2b_1x1')
                # 192+192+192+192 = 768 output channels; the output is 17x17x768
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # 3rd Inception Module
            with tf.variable_scope('Mixed_6c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_2a_1x1')
                with tf.variable_scope('Branch_1'):
                    # the reduction layers keep 160 channels here instead of 128
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_2a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_2b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_2c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_2a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_2b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_2c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_2d_7x1')
                    # the final 1x7 conv outputs 192 channels so the branches sum to 768
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_2e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_2a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_2b_1x1')
                # 192+192+192+192 = 768 output channels; the output is 17x17x768
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # 4th Inception Module, identical in structure to the 3rd
            with tf.variable_scope('Mixed_6d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_2a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_2a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_2b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_2c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_2a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_2b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_2c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_2d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_2e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_2a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_2b_1x1')
                # 192+192+192+192 = 768 output channels; the output is 17x17x768
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # 5th Inception Module
            with tf.variable_scope('Mixed_6e'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_2a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_2a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_2b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_2c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_2a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_2b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_2c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_2d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_2e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_2a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_2b_1x1')
                # 192+192+192+192 = 768 output channels; the output is 17x17x768
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            end_points['Mixed_6e'] = net  # saved for the auxiliary classifier
            # 3rd Inception module group: three modules
            # 1st Inception Module
            with tf.variable_scope('Mixed_7a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_3a_1x1')
                    branch_0 = slim.conv2d(branch_0, 320, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_3b_3x3')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_3a_1x1')
                    branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_3b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_3c_7x1')
                    branch_1 = slim.conv2d(branch_1, 192, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_3d_3x3')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2,
                                               padding='VALID', scope='MaxPool_3a_3x3')
                # 320+192+768 = 1280 output channels; the output is 8x8x1280
                net = tf.concat([branch_0, branch_1, branch_2], 3)
            # 2nd Inception Module
            with tf.variable_scope('Mixed_7b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_3a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_3a_1x1')
                    # a branch within a branch: 1x3 and 3x1 outputs are concatenated
                    branch_1 = tf.concat([
                        slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_3b_1x3'),
                        slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_3c_3x1')
                    ], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_3a_1x1')
                    branch_2 = slim.conv2d(branch_2, 384, [3, 3], scope='Conv2d_3b_3x3')
                    branch_2 = tf.concat([
                        slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_3c_1x3'),
                        slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_3d_3x1')
                    ], 3)
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_3a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_3b_1x1')
                # 320+768+768+192 = 2048 output channels; the output is 8x8x2048
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # 3rd Inception Module, identical in structure to the 2nd
            with tf.variable_scope('Mixed_7c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_3a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_3a_1x1')
                    branch_1 = tf.concat([
                        slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_3b_1x3'),
                        slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_3c_3x1')
                    ], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_3a_1x1')
                    branch_2 = slim.conv2d(branch_2, 384, [3, 3], scope='Conv2d_3b_3x3')
                    branch_2 = tf.concat([
                        slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_3c_1x3'),
                        slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_3d_3x1')
                    ], 3)
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_3a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_3b_1x1')
                # 320+768+768+192 = 2048 output channels; the output is 8x8x2048
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
    return net, end_points
'''
inception_v3: adds global average pooling, Softmax and the Auxiliary Logits
on top of inception_v3_base.
num_classes: number of output classes
is_training: whether this is the training phase; Batch Normalization and
Dropout are only enabled during training
dropout_keep_prob: fraction of units Dropout keeps, default 0.8
prediction_fn: the function used to produce the final class predictions
spatial_squeeze: whether to squeeze the output, i.e. remove dimensions of
size 1 (e.g. 5x3x1 becomes 5x3)
reuse: whether the network and its Variables are reused
scope: carries the default-parameter environment
'''
def inception_v3(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.8,
                 prediction_fn=slim.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='InceptionV3'):
    with tf.variable_scope(scope, 'InceptionV3', [inputs, num_classes],
                           reuse=reuse) as scope:
        with slim.arg_scope([slim.batch_norm, slim.dropout],
                            is_training=is_training):
            net, end_points = inception_v3_base(inputs, scope=scope)
            # Auxiliary Logits: the auxiliary classification head, fed from
            # Mixed_6e; it helps the final prediction considerably
            with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                                stride=1, padding='SAME'):
                aux_logits = end_points['Mixed_6e']
                with tf.variable_scope('AuxLogits'):
                    aux_logits = slim.avg_pool2d(aux_logits, [5, 5], stride=3,
                                                 padding='VALID',
                                                 scope='AvgPool_1a_5x5')
                    aux_logits = slim.conv2d(aux_logits, 128, [1, 1],
                                             scope='Conv2d_1b_1x1')
                    aux_logits = slim.conv2d(aux_logits, 768, [5, 5],
                                             weights_initializer=trunc_normal(0.01),
                                             padding='VALID', scope='Conv2d_1c_5x5')
                    aux_logits = slim.conv2d(aux_logits, num_classes, [1, 1],
                                             activation_fn=None, normalizer_fn=None,
                                             weights_initializer=trunc_normal(0.001),
                                             scope='Conv2d_1d_1x1')
                    if spatial_squeeze:
                        aux_logits = tf.squeeze(aux_logits, [1, 2],
                                                name='SpatialSqueeze')
                    end_points['AuxLogits'] = aux_logits
                # Logits: the normal classification path
                with tf.variable_scope('Logits'):
                    net = slim.avg_pool2d(net, [8, 8], padding='VALID',
                                          scope='AvgPool_1a_8x8')
                    net = slim.dropout(net, keep_prob=dropout_keep_prob,
                                       scope='Dropout_1b')
                    end_points['PreLogits'] = net
                    logits = slim.conv2d(net, num_classes, [1, 1],
                                         activation_fn=None, normalizer_fn=None,
                                         scope='Conv2d_1c_1x1')
                    if spatial_squeeze:
                        logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
                end_points['Logits'] = logits
                end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
    return logits, end_points
'''
time_tensorflow_run: measures the computation time per batch
'''
def time_tensorflow_run(session, target, info_string):
    num_steps_burn_in = 10  # warm-up iterations, excluded from the statistics
    total_duration = 0.0
    total_duration_squared = 0.0
    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s: step %d, duration = %.3f' %
                      (datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration
            total_duration_squared += duration * duration
    mn = total_duration / num_batches
    vr = total_duration_squared / num_batches - mn * mn
    sd = math.sqrt(vr)
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %
          (datetime.now(), info_string, num_batches, mn, sd))
def main():
    batch_size = 32
    height, width = 299, 299
    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope(inception_v3_arg_scope()):
        logits, end_points = inception_v3(inputs, is_training=False)
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    # time the forward pass
    time_tensorflow_run(sess, logits, 'Forward')

if __name__ == '__main__':
    main()
Inception V3 is a very complex, finely tuned model that draws on much of the accumulated experience in designing large convolutional networks. As an extremely deep CNN with an intricate overall structure and many branches, it contains many CNN design ideas and tricks worth borrowing.