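Article list
1. Getting Started with Deep Learning in TensorFlow, Part 01: Fundamentals
2. Getting Started with Deep Learning in TensorFlow, Part 02: Fundamentals
3. Getting Started with Deep Learning in TensorFlow, Part 03: Softmax regression for MNIST classification
4. Getting Started with Deep Learning in TensorFlow, Part 04: Autoencoder (denoising MNIST images with added Gaussian white noise)
5. Getting Started with Deep Learning in TensorFlow, Part 05: Multilayer perceptron for MNIST classification
6. Getting Started with Deep Learning in TensorFlow, Part 06: The TensorBoard visualization tool
7. Getting Started with Deep Learning in TensorFlow, Part 07: Overview of convolutional neural networks
8. Getting Started with Deep Learning in TensorFlow, Part 08: AlexNet (MNIST classification)
9. Getting Started with Deep Learning in TensorFlow, Part 09: tf.contrib.slim usage in detail
10. Getting Started with Deep Learning in TensorFlow, Part 10: VGGNets16 (slim implementation)
11. Getting Started with Deep Learning in TensorFlow, Part 11: GoogLeNet (Inception V3, slim implementation)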
…
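tf.contrib.slim can greatly reduce the amount of code needed for complex networks, and it simplifies building, training, and evaluating models. In the following posts we will use it to implement VGGNet and the 42-layer Inception V3 model; here we introduce slim's features through a simplified CNN. Slim lives in the tensorflow.contrib package and is imported like this: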
import tensorflow.contrib.slim as slim
Note, however, that code in tensorflow.contrib is not officially supported: it may be changed or removed, or merged into core TensorFlow, so test it before relying on it.
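Because contrib code can disappear between releases (the whole tf.contrib package was removed in TensorFlow 2.x), one simple pre-flight test is to check whether the import works at all. The sketch below is only an illustration; `slim_available` is a hypothetical helper name, not part of TensorFlow.

```python
import importlib


def slim_available():
    """Return True if tensorflow.contrib.slim can be imported in this environment."""
    try:
        importlib.import_module("tensorflow.contrib.slim")
    except ImportError:
        # TensorFlow is not installed, or this release ships without contrib.
        return False
    return True


if __name__ == "__main__":
    print("tf.contrib.slim importable:", slim_available())
```

If this prints False, the slim listings in this post will not run as written in the current environment.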
We start by building a CNN with a conv-pool-conv-pool-fc-fc structure to recognize MNIST digits. The source code is as follows:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
x_image = tf.reshape(x, [-1, 28, 28, 1])
#conv1
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
#conv2
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
#fc1
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
#dropout
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
#fc2
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
#Train and Evaluate the Model
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_conv, labels=y_))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
mnist = input_data.read_data_sets('./../MNISTDat', one_hot=True)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
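As a side check on the hand-written network, the following sketch walks through the feature-map sizes and counts the trainable parameters in plain Python, which shows where the 7*7*64 input size of the first fully connected layer comes from. The helper names are made up for this illustration; the layer sizes are taken from the listing.

```python
import math


def same_pool_out(size, stride):
    # With padding='SAME', the output size is ceil(input / stride).
    return math.ceil(size / stride)


def conv_params(k, in_ch, out_ch):
    # One k x k kernel per (input channel, output channel) pair, plus one bias per output channel.
    return k * k * in_ch * out_ch + out_ch


def fc_params(n_in, n_out):
    # Weight matrix plus one bias per output unit.
    return n_in * n_out + n_out


side = 28                      # MNIST images are 28 x 28; the stride-1 'SAME' convs keep this size
side = same_pool_out(side, 2)  # after the first 2x2 max pool
side = same_pool_out(side, 2)  # after the second 2x2 max pool
flat = side * side * 64        # length of h_pool2_flat

params = {
    "conv1": conv_params(5, 1, 32),
    "conv2": conv_params(5, 32, 64),
    "fc1": fc_params(flat, 1024),
    "fc2": fc_params(1024, 10),
}
total = sum(params.values())

print("final feature map side:", side, "flattened length:", flat)
print(params, "total:", total)
```

The two pooling steps take the side length from 28 to 14 to 7, so the flattened vector has 7*7*64 = 3136 entries, and nearly all of the roughly 3.27 million parameters sit in the first fully connected layer.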
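In the hand-written program, the #conv1 block and the #conv2 block (each running from its comment line through the matching h_pool line) are almost identical. We could factor them into a function, but here we use slim instead: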
with slim.arg_scope([slim.conv2d],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
                    biases_initializer=tf.constant_initializer(0.1),
                    padding='SAME'):
    #conv1
    h_conv1 = slim.conv2d(x_image, 32, [5, 5])
    h_pool1 = slim.max_pool2d(h_conv1, [2, 2], padding='SAME')
    #conv2
    h_conv2 = slim.conv2d(h_pool1, 64, [5, 5])
    h_pool2 = slim.max_pool2d(h_conv2, [2, 2], padding='SAME')
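Putting slim.conv2d in square brackets after arg_scope says that we want to set default argument values for the slim.conv2d op; the defaults themselves follow the brackets. Here the activation function is ReLU (activation_fn=tf.nn.relu), the kernels are initialized from a truncated normal distribution with standard deviation 0.1 (weights_initializer=tf.truncated_normal_initializer(stddev=0.1)), and the biases are initialized to 0.1 (biases_initializer=tf.constant_initializer(0.1)). The 5x5 kernel size and the number of kernels, 32 (which is also the number of biases), are given at each call, and the kernel depth is determined by the number of channels of the input x_image. A single slim.conv2d call covers kernel and bias initialization, the convolution, the bias addition, and the nonlinear activation. Downsampling is now done with slim.max_pool2d, and its padding mode can likewise be given a default through arg_scope: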
with slim.arg_scope([slim.conv2d],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
                    biases_initializer=tf.constant_initializer(0.1),
                    padding='SAME'):
    with slim.arg_scope([slim.max_pool2d], padding='SAME'):
        #conv1
        h_conv1 = slim.conv2d(x_image, 32, [5, 5])
        h_pool1 = slim.max_pool2d(h_conv1, [2, 2])
        #conv2
        h_conv2 = slim.conv2d(h_pool1, 64, [5, 5])
        h_pool2 = slim.max_pool2d(h_conv2, [2, 2])
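The code above also shows how to set defaults for several ops with arg_scope. Going back to the original program, the fully connected layers (the #fc1 through #fc2 blocks) are highly repetitive as well: their weights are also drawn from a truncated normal distribution, their biases are also set to the constant 0.1, and the activation is again ReLU (the readout layer overrides it with activation_fn=None). The fully connected layers can therefore share the convolution layers' defaults, with one exception: the convolution defaults include padding='SAME', which a fully connected layer does not have. That item has to be removed from the shared defaults and passed at each slim.conv2d call instead, so the declaration becomes: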
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
                    biases_initializer=tf.constant_initializer(0.1)):
    with slim.arg_scope([slim.max_pool2d], padding='SAME'):
        #conv1
        h_conv1 = slim.conv2d(x_image, 32, [5, 5], padding='SAME')
        h_pool1 = slim.max_pool2d(h_conv1, [2, 2])
        #conv2
        h_conv2 = slim.conv2d(h_pool1, 64, [5, 5], padding='SAME')
        h_pool2 = slim.max_pool2d(h_conv2, [2, 2])
        #Densely Connected Layer
        h_pool2_flat = slim.flatten(h_pool2)
        h_fc1 = slim.fully_connected(h_pool2_flat, 1024)
        #dropout
        keep_prob = tf.placeholder(tf.float32)
        h_fc1_drop = slim.dropout(h_fc1, keep_prob)
        #Readout Layer
        y_conv = slim.fully_connected(h_fc1_drop, 10, activation_fn=None)
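Adding slim.fully_connected to the bracketed list after arg_scope makes slim.fully_connected share the defaults that follow with slim.conv2d. Compared with the separate scope used for slim.max_pool2d, this shows that ops with the same defaults can be declared together in a single arg_scope. In addition, slim.flatten is used to turn h_pool2 into a one-dimensional vector per example. The complete slim version of the program is: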
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.examples.tutorials.mnist import input_data
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
x_image = tf.reshape(x, [-1, 28, 28, 1])
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
                    biases_initializer=tf.constant_initializer(0.1)):
    with slim.arg_scope([slim.max_pool2d], padding='SAME'):
        #conv1
        h_conv1 = slim.conv2d(x_image, 32, [5, 5], padding='SAME')
        h_pool1 = slim.max_pool2d(h_conv1, [2, 2])
        #conv2
        h_conv2 = slim.conv2d(h_pool1, 64, [5, 5], padding='SAME')
        h_pool2 = slim.max_pool2d(h_conv2, [2, 2])
        #Densely Connected Layer
        h_pool2_flat = slim.flatten(h_pool2)
        h_fc1 = slim.fully_connected(h_pool2_flat, 1024)
        #dropout
        keep_prob = tf.placeholder(tf.float32)
        h_fc1_drop = slim.dropout(h_fc1, keep_prob)
        #Readout Layer
        y_conv = slim.fully_connected(h_fc1_drop, 10, activation_fn=None)
#Train and Evaluate the Model
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_conv, labels=y_))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
mnist = input_data.read_data_sets('./../MNISTDat', one_hot=True)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
One more note: the stride of slim.conv2d defaults to 1x1 and its padding defaults to 'SAME', as a quick experiment confirms:
import tensorflow as tf
import tensorflow.contrib.slim as slim
x1 = tf.random_normal(shape=[1, 64, 64, 3])
w = tf.fill([5, 5, 3, 64], 1.0)
y1 = tf.nn.relu(tf.nn.conv2d(x1, w, strides=[1, 1, 1, 1], padding='SAME'))
y2 = slim.conv2d(x1, 64, [5, 5], weights_initializer=tf.ones_initializer, padding='SAME')
y3 = slim.conv2d(x1, 64, [5, 5], stride=[3, 3], weights_initializer=tf.ones_initializer, padding='SAME')
y4 = slim.conv2d(x1, 64, [5, 5], stride=[3, 3], weights_initializer=tf.ones_initializer, padding='VALID')
y5 = slim.conv2d(x1, 64, [5, 5], stride=[3, 3], weights_initializer=tf.ones_initializer)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    y1_value, y2_value, y3_value, y4_value, y5_value = sess.run([y1, y2, y3, y4, y5])
    print("shapes are", y1_value.shape, y2_value.shape, y3_value.shape, y4_value.shape, y5_value.shape)
>>shapes are (1, 64, 64, 64) (1, 64, 64, 64) (1, 22, 22, 64) (1, 20, 20, 64) (1, 22, 22, 64)
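The shapes printed above follow from the usual TensorFlow output-size rules: ceil(input / stride) for 'SAME' and ceil((input - kernel + 1) / stride) for 'VALID'. The small plain-Python sketch below reproduces the spatial sizes from the experiment; `conv_output_size` is a helper written for this illustration only.

```python
import math


def conv_output_size(input_size, kernel_size, stride, padding):
    """Output size along one spatial dimension of a convolution or pooling layer."""
    if padding == "SAME":
        return math.ceil(input_size / stride)
    if padding == "VALID":
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


print(conv_output_size(64, 5, 1, "SAME"))   # 64, matches y1 and y2
print(conv_output_size(64, 5, 3, "SAME"))   # 22, matches y3 and y5
print(conv_output_size(64, 5, 3, "VALID"))  # 20, matches y4
```

Since y5 was built without a padding argument and still comes out at 22 rather than 20, the default padding must be 'SAME'.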
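Some readers may feel that the slim features used so far offer no big advantage, since default arguments on your own helper functions achieve the same thing. So let us look at the most attractive part of slim: the repeat and stack operations.
repeat handles the case where consecutive layers take identical arguments. Suppose there are three identical consecutive convolution layers: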
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
With slim's repeat operation the same thing takes less code:
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
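To see what the repeat call expands to, here is a toy version in plain Python that records each layer call instead of building graph ops. It is an illustration of the unrolling, not slim's code; `toy_repeat` and `record_layer` are hypothetical names. Per the slim documentation, the scopes generated for the call above are 'conv3/conv3_1', 'conv3/conv3_2', and 'conv3/conv3_3', and the sketch follows that naming.

```python
def toy_repeat(inputs, repetitions, layer, *args, scope, **kwargs):
    """Apply `layer` with the same arguments `repetitions` times, numbering the scopes."""
    net = inputs
    for i in range(1, repetitions + 1):
        net = layer(net, *args, scope="%s/%s_%d" % (scope, scope, i), **kwargs)
    return net


def record_layer(net, num_outputs, kernel_size, scope):
    # Stand-in for slim.conv2d: append a description of the call to a list.
    return net + [(scope, num_outputs, tuple(kernel_size))]


calls = toy_repeat([], 3, record_layer, 256, [3, 3], scope="conv3")
for call in calls:
    print(call)
```

Each iteration feeds the previous output into the next call, which is exactly what the three hand-written lines do.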
stack handles the case where the kernels or output sizes differ from layer to layer. Suppose there are three consecutive FC layers:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')
With stack this becomes:
x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')
Now suppose three consecutive convolutions with different kernels:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
With stack this becomes:
x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3])], scope='core')
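The two stack calls can be unrolled the same way. The toy version below, again plain Python and not slim's implementation, shows how each entry of the argument list becomes the positional arguments of one layer call, with scopes numbered as in the hand-written versions. `toy_stack` and `record_layer` are hypothetical names.

```python
def toy_stack(inputs, layer, stack_args, scope, **kwargs):
    """Call `layer` once per entry of `stack_args`, chaining outputs and numbering scopes."""
    net = inputs
    for i, layer_args in enumerate(stack_args, start=1):
        if not isinstance(layer_args, (list, tuple)):
            layer_args = [layer_args]  # a bare value such as 32 is a single argument
        net = layer(net, *layer_args, scope="%s/%s_%d" % (scope, scope, i), **kwargs)
    return net


def record_layer(net, *layer_args, scope):
    # Stand-in for a slim layer: append a description of the call to a list.
    return net + [(scope,) + tuple(layer_args)]


fc_calls = toy_stack([], record_layer, [32, 64, 128], scope="fc")
conv_calls = toy_stack([], record_layer,
                       [(32, [3, 3]), (32, [1, 1]), (64, [3, 3])], scope="core")
print(fc_calls)
print(conv_calls)
```

The only difference from repeat is that the per-layer arguments come from a list instead of being fixed.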
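Convenient, isn't it? Of course, if you would still rather write your own functions or classes for this, that works too. In later posts we use slim to implement VGGNets 16 and Inception V3, and the code is remarkably concise.
TensorFlow implementation: https://pan.baidu.com/s/14A91inmZmSC55dgDv8ZZ3Q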