Disclaimer: This article summarizes the relevant TensorFlow 1.12 APIs through a CNN-based MNIST example. The code comes from the book Learning TensorFlow, and the APIs were looked up in the official TensorFlow API documentation.
This article summarizes the APIs used in the classic introductory deep-learning example, MNIST handwritten digit recognition. The goal is to become familiar with the relevant TensorFlow APIs and with basic TensorFlow usage while getting started. To get straight to the point, the article skips a detailed discussion of CNNs: it first gives the complete TensorFlow implementation of MNIST handwritten digit recognition, and then lists the APIs it uses one by one, with explanations and some extensions.
Keywords: TensorFlow; API r1.12; MNIST
"""
CNN handwritten digit recognition
"""
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# First define the helper functions we will need
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

def conv_layer(input, shape):
    W = weight_variable(shape)
    b = bias_variable([shape[3]])
    return tf.nn.relu(conv2d(input, W) + b)

def full_layer(input, size):
    in_size = int(input.get_shape()[1])
    W = weight_variable([in_size, size])
    b = bias_variable([size])
    return tf.matmul(input, W) + b
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
x_image = tf.reshape(x, [-1, 28, 28, 1])
conv1 = conv_layer(x_image, shape=[5, 5, 1, 32])
conv1_pool = max_pool_2x2(conv1)
conv2 = conv_layer(conv1_pool, shape=[5, 5, 32, 64])
conv2_pool = max_pool_2x2(conv2)
conv2_flat = tf.reshape(conv2_pool, [-1, 7*7*64])
full_1 = tf.nn.relu(full_layer(conv2_flat, 1024))
keep_prob = tf.placeholder(tf.float32)
full1_drop = tf.nn.dropout(full_1, keep_prob=keep_prob)
y_conv = full_layer(full1_drop, 10)
sess = tf.Session()
tf.summary.FileWriter('log/', sess.graph)
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_conv, labels=y_))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
STEPS = 1001
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(STEPS):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            train_accuracy = sess.run(accuracy, feed_dict={
                x: batch[0],
                y_: batch[1],
                keep_prob: 1.0})
            print("step {}, training accuracy {}".format(i, train_accuracy))
        sess.run(train_step, feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 0.5})

    X = mnist.test.images.reshape(10, 1000, 784)
    Y = mnist.test.labels.reshape(10, 1000, 10)
    test_accuracy = np.mean([sess.run(accuracy, feed_dict={
        x: X[i], y_: Y[i], keep_prob: 1.0})
        for i in range(10)])
    print("test accuracy: {}".format(test_accuracy))
tf.random.truncated_normal(
shape, # A 1-D integer Tensor or Python array. The shape of the output tensor.
mean=0.0, # A 0-D Tensor or Python value of type dtype. The mean of the truncated normal distribution.
stddev=1.0, # A 0-D Tensor or Python value of type dtype. The standard deviation of the normal distribution, before truncation.
dtype=tf.float32, # The type of the output.
seed=None, # A Python integer. Used to create a random seed for the distribution. See tf.set_random_seed for behavior.
name=None # A name for the operation (optional).
)
The truncated normal distribution is commonly used to generate initial values for trainable parameters such as weights and biases.
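Because tf.truncated_normal re-draws any sample that falls more than two standard deviations from the mean, the resulting initial weights stay small and bounded. A minimal sketch checking this (the sample count and stddev are arbitrary choices for illustration):

import tensorflow as tf

# Draw samples from a truncated normal and confirm that every value lies
# within two standard deviations of the mean (values outside are re-drawn).
samples = tf.truncated_normal([10000], mean=0.0, stddev=0.1)
with tf.Session() as sess:
    vals = sess.run(samples)
    print(vals.min(), vals.max())  # both fall inside [-0.2, 0.2]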
A Variable is a special kind of Tensor whose value can be a tensor of any type and shape. Unlike other tensors, a variable exists outside the context of a single session.run call; in other words, a variable stores a persistent tensor, and when training a model, variables are used to store and update the parameters. In addition, all variables must be explicitly initialized before any op that uses them is run. Variable is actually a Python class, and its constructor takes quite a few parameters:
__init__(
initial_value=None,
trainable=True,
collections=None,
validate_shape=True,
caching_device=None,
name=None,
variable_def=None,
dtype=None,
expected_shape=None,
import_scope=None,
constraint=None,
use_resource=None,
synchronization=tf.VariableSynchronization.AUTO,
aggregation=tf.VariableAggregation.NONE
)
Here, initial_value is the initial value to assign to the variable. The official documentation describes it as follows:
initial_value: A Tensor, or Python object convertible to a Tensor, which is the initial value for the Variable. The initial value must have a shape specified unless validate_shape is set to False. Can also be a callable with no argument that returns the initial value when called. In that case, dtype must be specified. (Note that initializer functions from init_ops.py must first be bound to a shape before being used here.)
Variables are created with tf.Variable(). Before they can be used, memory must be allocated for their initial values; this is done by passing tf.global_variables_initializer() to sess.run():
init = tf.global_variables_initializer()
sess.run(init)
Like other tensor objects, variables are only computed when the model is run. To reuse the same variable (for example, for weight sharing or efficiency), we can use tf.get_variable().
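To illustrate variable sharing with tf.get_variable, here is a minimal sketch; the scope name my_scope, the variable name w and the shape are arbitrary choices for the example:

import tensorflow as tf

# The first call creates the variable; the second call, made with reuse=True,
# returns the existing variable instead of creating a new one.
with tf.variable_scope('my_scope'):
    w1 = tf.get_variable('w', shape=[2, 2], initializer=tf.zeros_initializer())
with tf.variable_scope('my_scope', reuse=True):
    w2 = tf.get_variable('w')
print(w1.name, w2.name)  # both are 'my_scope/w:0': the same variable is shared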
For more details, see the official API documentation.
tf.constant(
value, # A constant value (or list) of output type dtype.
dtype=None, # The type of the elements of the resulting tensor.
shape=None, # Optional dimensions of resulting tensor.
name='Const', # Optional name for the tensor.
verify_shape=False # Boolean that enables verification of a shape of values.
)
From the parameter list, value is the only required argument.
Comparison with tf.fill():
tf.constant differs from tf.fill in a few ways:
- tf.constant supports arbitrary constants, not just uniform scalar Tensors like tf.fill.
- tf.constant creates a Const node in the computation graph with the exact value at graph construction time. On the other hand, tf.fill creates an Op in the graph that is expanded at runtime.
- Because tf.constant only embeds constant values in the graph, it does not support dynamic shapes based on other runtime Tensors, whereas tf.fill does.
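A small sketch contrasting the two (the shapes and values are illustrative; tf.stack is used only to build the runtime-dependent shape):

import tensorflow as tf

c = tf.constant([[1, 2], [3, 4]])       # arbitrary values, embedded as a Const node
f = tf.fill([2, 3], 7)                  # uniform value, expanded at run time
n = tf.placeholder(tf.int32, shape=[])  # a dimension only known at run time
g = tf.fill(tf.stack([n, 2]), 0.5)      # dynamic shape: only tf.fill supports this
with tf.Session() as sess:
    print(sess.run(c))                        # [[1 2] [3 4]]
    print(sess.run(f))                        # [[7 7 7] [7 7 7]]
    print(sess.run(g, feed_dict={n: 3}))      # a 3x2 matrix filled with 0.5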
TensorFlow provides built-in structures for feeding input values, called placeholders. A placeholder can be thought of as an empty variable that will be filled with data later: we first use placeholders to build the graph, and supply the input data only when the graph is executed.
tf.placeholder(
    dtype,      # data type of the placeholder
    shape=None, # shape of the input; a dimension given as None may take any size. For example, in x = tf.placeholder(tf.float32, shape=[None, 784]) the None dimension can be of any size and usually represents the number of samples.
    name=None   # name of the operation
)
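A minimal sketch of the placeholder / feed_dict workflow (the computation and shapes are arbitrary):

import tensorflow as tf
import numpy as np

inp = tf.placeholder(tf.float32, shape=[None, 784])  # None: any batch size
row_sum = tf.reduce_sum(inp, axis=1)                 # some computation on inp
with tf.Session() as sess:
    # The placeholder is only filled with data at run time, via feed_dict.
    out = sess.run(row_sum, feed_dict={inp: np.ones((3, 784), dtype=np.float32)})
    print(out.shape)  # (3,): one sum per fed sample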
tf.reshape(   # reshape the given tensor into the specified shape
    tensor,
    shape,
    name=None
)
Note that one component of shape may be -1, meaning that dimension is inferred automatically: it is computed as the total number of elements of the tensor divided by the product of the other specified dimensions. For example:
a = tf.placeholder(tf.float32, shape=[1, 24])
print(a.get_shape())
b = tf.reshape(a, [-1, 3, 4])
print(b.get_shape())
# out =
# (1, 24)
# (2, 3, 4)
The first dimension of b is set to -1 and inferred automatically (an error is raised if the division is not exact): after the reshape it is (1 × 24) / (3 × 4) = 2. As a special case, shape=[-1] flattens the tensor into one dimension.
tf.nn.conv2d(
    input,   # input tensor; with the default data_format it is [batch, in_height, in_width, in_channels], and its dtype must be half, bfloat16, float32 or float64
    filter,  # filter (kernel): [filter_height, filter_width, in_channels, out_channels]
    strides, # stride of the sliding window in each dimension: [batch, height, width, channels]
    padding, # zero-padding at the edges; with 'SAME' (and stride 1) the output feature map has the same spatial size as the input, 'VALID' applies no padding
    use_cudnn_on_gpu=True,  # bool, whether to use cuDNN acceleration; defaults to True
    data_format='NHWC',     # input/output data format; defaults to [batch, height, width, channels]
    dilations=[1, 1, 1, 1], # one value per data_format dimension; if set to k > 1, there are k-1 skipped cells between filter elements in that dimension, giving a dilated ("atrous") filter: a larger receptive field with the same number of parameters
    name=None               # name of the operation
)
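To make the padding and dilations arguments concrete, here is a small shape-check sketch (the 28×28 input and 5×5 kernel mirror the MNIST code above; the numbers themselves are illustrative):

import tensorflow as tf

img = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
kernel = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
same = tf.nn.conv2d(img, kernel, strides=[1, 1, 1, 1], padding='SAME')
valid = tf.nn.conv2d(img, kernel, strides=[1, 1, 1, 1], padding='VALID')
dilated = tf.nn.conv2d(img, kernel, strides=[1, 1, 1, 1], padding='SAME',
                       dilations=[1, 2, 2, 1])
print(same.get_shape())     # (?, 28, 28, 32): SAME keeps 28x28
print(valid.get_shape())    # (?, 24, 24, 32): VALID shrinks by kernel size - 1
print(dilated.get_shape())  # (?, 28, 28, 32): same size, larger receptive field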
tf.nn.max_pool(
    value,   # 4-D tensor in data_format layout, usually a feature map produced by a convolution
    ksize,   # size of the pooling window, a 4-element list, typically [1, height, width, 1], because we do not want to pool over the batch or channel dimensions
    strides, # as with convolution, the stride of the window in each dimension, typically [1, stride, stride, 1]
    padding, # as with convolution, either 'VALID' or 'SAME'
    data_format='NHWC',
    name=None
)
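As used in max_pool_2x2 above, a 2×2 window with stride 2 halves each spatial dimension; a quick shape-check sketch:

import tensorflow as tf

fmap = tf.placeholder(tf.float32, shape=[None, 28, 28, 32])
pooled = tf.nn.max_pool(fmap, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                        padding='SAME')
print(pooled.get_shape())  # (?, 14, 14, 32): height and width are halved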
tf.nn.relu(
    features, # input tensor
    name=None # name of the operation
)
tf.nn.dropout(
    x,         # a floating-point tensor
    keep_prob, # float, the probability that each element is kept. It is usually defined as a placeholder, keep_prob = tf.placeholder(tf.float32), and given a concrete value at run time, e.g. keep_prob: 0.5 for training and 1.0 for evaluation
    noise_shape=None, # a 1-D int32 tensor describing the shape of the randomly generated keep/drop flags
    seed=None, # integer random seed
    name=None
)
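A minimal sketch of how keep_prob changes the behavior between training (0.5) and evaluation (1.0); note that tf.nn.dropout scales the surviving elements by 1/keep_prob so the expected sum is preserved:

import tensorflow as tf

keep_prob = tf.placeholder(tf.float32)
v = tf.ones([10])
v_drop = tf.nn.dropout(v, keep_prob=keep_prob)
with tf.Session() as sess:
    # Training: roughly half the elements are zeroed, survivors are scaled to 1/0.5 = 2.0.
    print(sess.run(v_drop, feed_dict={keep_prob: 0.5}))
    # Evaluation: keep_prob = 1.0 leaves the input unchanged.
    print(sess.run(v_drop, feed_dict={keep_prob: 1.0}))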
tf.matmul(a, b)         # matrix multiplication of a and b
tf.Session()            # launches the graph in a session
tf.summary.FileWriter() # writes summary protocol buffers to event files (e.g. for TensorBoard)
tf.reduce_mean()        # by default reduce_mean(x) averages over all elements; use the axis argument to reduce along specific dimensions
tf.equal()              # element-wise equality comparison of tensors
tf.argmax()             # returns the index of the largest value along the given axis
tf.cast()               # casts a tensor to a new type
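Putting the last four of these together, here is a small sketch of the accuracy computation used in the training code, with hand-written logits and labels as illustrative inputs:

import tensorflow as tf

logits = tf.constant([[2.0, 0.1, 0.3],
                      [0.2, 0.5, 3.0]])  # predicted scores for two samples
labels = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])  # one-hot ground truth
correct = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))  # [True, False]
acc = tf.reduce_mean(tf.cast(correct, tf.float32))
with tf.Session() as sess:
    print(sess.run(acc))  # 0.5: one of the two predictions is correct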