【TensorFlow】MNIST Dataset - RNN

I. Recognizing Handwritten Digits with an RNN

1. What is an RNN?

An RNN (Recurrent Neural Network) is a class of neural networks for processing sequence data. First we need to be clear about what sequence data is. Quoting the Baidu Baike entry: time-series data is data collected at different points in time, reflecting how some object or phenomenon changes in state or degree over time. That is the definition of time-series data, but the sequence does not have to be over time; text sequences are another example. Either way, sequence data has one defining property: later items depend on earlier ones.

 

[Figure 1]

The MNIST dataset is not a time series, and the dependence between successive rows is not that strong; the point of this post is simply to practice with and understand RNNs.
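
To make the recurrence concrete, here is a minimal sketch of the vanilla RNN update h_t = tanh(x_t·W_xh + h_{t-1}·W_hh + b) in plain NumPy, with randomly initialized placeholder weights (the names W_xh and W_hh are illustrative, not from any library). Each step mixes the current input with the previous hidden state, which is exactly how later outputs come to depend on earlier inputs:

import numpy as np

input_size, hidden = 28, 256
W_xh = np.random.randn(input_size, hidden) * 0.01  # input-to-hidden weights
W_hh = np.random.randn(hidden, hidden) * 0.01      # hidden-to-hidden weights
b = np.zeros(hidden)

h = np.zeros(hidden)                          # initial state: all zeros
for x_t in np.random.randn(28, input_size):   # 28 time steps (28 image rows)
    h = np.tanh(x_t @ W_xh + h @ W_hh + b)    # h_t depends on x_t and h_{t-1}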

2. Downloading the dataset

# Download the dataset
import tensorflow as tf
import numpy as np
from tensorflow.contrib import rnn
from tensorflow.examples.tutorials.mnist import input_data

sess = tf.Session()
mnist = input_data.read_data_sets('data', one_hot=True)
print(mnist.train.images.shape)

Output:

Extracting data\train-images-idx3-ubyte.gz
Extracting data\train-labels-idx1-ubyte.gz
Extracting data\t10k-images-idx3-ubyte.gz
Extracting data\t10k-labels-idx1-ubyte.gz
(55000, 784)
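
As an aside: the tensorflow.examples.tutorials module was removed in TensorFlow 2.x, so the import above only works on 1.x. On a newer version, the same data can be loaded through tf.keras (a sketch; the arrays come back as uint8 and would still need reshaping and one-hot encoding to match the code below):

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print(x_train.shape)  # (60000, 28, 28), uint8 pixel values in [0, 255]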

3. Setting the parameters

# Parameters
lr = 1e-3
# The image is fed in row by row, one row per time step
input_size = 28     # 28 pixels per row: each input x0, x1, ..., x27 is a 28-dimensional vector
timestep_size = 28  # 28 rows in total, treated as a sequence of 28 steps: x0, x1, ..., x27
hidden_size = 256   # number of hidden units in each LSTM cell
layer_num = 2       # number of stacked LSTM layers
class_num = 10      # 10-way classification

_X = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, class_num])

batch_size = tf.placeholder(tf.int32, [])   # scalar placeholder, so the batch size can vary between runs
keep_prob = tf.placeholder(tf.float32, [])  # dropout keep probability

4. Defining the network

Two LSTM layers are stacked; the output at the final time step is then fed into a fully connected layer for multi-class classification.

[Figure 2]

X = tf.reshape(_X, [-1, 28, 28])

def lstm_cell():
    cell = rnn.LSTMCell(hidden_size, reuse=tf.get_variable_scope().reuse)  # reuse variables within the enclosing scope
    return rnn.DropoutWrapper(cell, output_keep_prob=keep_prob)

# Stack 2 LSTM cells. tf.contrib.rnn.MultiRNNCell and tf.nn.rnn_cell.MultiRNNCell are aliases for the same class: an RNN cell composed of several simpler cells, used to build multi-layer recurrent networks.
mlstm_cell = tf.contrib.rnn.MultiRNNCell([lstm_cell() for _ in range(layer_num)], state_is_tuple=True)

# Initialize the state to all zeros
init_state = mlstm_cell.zero_state(batch_size, dtype=tf.float32)

# Collect the output at every time step; only the final one is used downstream
outputs = list()
state = init_state
with tf.variable_scope('RNN'):
    for timestep in range(timestep_size):
        if timestep > 0:
            tf.get_variable_scope().reuse_variables()
        # X[:, timestep, :] has shape [batch_size, input_size]: row `timestep` of every image in the batch
        (cell_output, state) = mlstm_cell(X[:, timestep, :], state)
        outputs.append(cell_output)
# output at the final time step
h_state = outputs[-1]
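
The explicit Python loop above makes the per-step computation easy to inspect. As a sketch of an equivalent (and more common) formulation, tf.nn.dynamic_rnn can unroll the same cell internally; this would replace the loop rather than run alongside it (the names outputs_d and h_state_d are illustrative):

# Equivalent sketch: let dynamic_rnn do the unrolling instead of the manual loop.
# outputs_d has shape [batch_size, timestep_size, hidden_size].
outputs_d, state_d = tf.nn.dynamic_rnn(mlstm_cell, X, initial_state=init_state, time_major=False)
h_state_d = outputs_d[:, -1, :]  # hidden state at the final time step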

Note: in a jupyter notebook, the cell containing "with tf.variable_scope('RNN'):" can only be executed once; running it a second time raises the following error:

ValueError: Variable RNN/multi_rnn_cell/cell_0/lstm_cell/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

  File "c:\python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()
  File "c:\python36\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
    op_def=op_def)
  File "c:\python36\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)

To run it a second time, the shared variables under the "RNN" scope must be released, i.e., restart the notebook kernel.

For a more detailed discussion of variable sharing in TensorFlow, see: https://blog.csdn.net/duanlianvip/article/details/96279660
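
Alternatively, as the error message itself suggests, the scope can be opened with reuse=tf.AUTO_REUSE, which creates the variables on the first run and silently reuses them afterwards, making the cell re-runnable (a sketch of the modified loop; note that re-running still appends to the `outputs` list):

with tf.variable_scope('RNN', reuse=tf.AUTO_REUSE):
    for timestep in range(timestep_size):
        # AUTO_REUSE makes the explicit reuse_variables() call unnecessary
        (cell_output, state) = mlstm_cell(X[:, timestep, :], state)
        outputs.append(cell_output)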

5. Training

# Training
# Softmax layer parameters
W = tf.Variable(tf.truncated_normal([hidden_size, class_num], stddev=0.1), dtype=tf.float32)
bias = tf.Variable(tf.constant(0.1, shape=[class_num]), dtype=tf.float32)
y_pre = tf.nn.softmax(tf.matmul(h_state, W) + bias)

# Loss and evaluation
cross_entropy = -tf.reduce_mean(y * tf.log(y_pre))  # note: averages over both batch and classes, i.e. the cross-entropy scaled by 1/class_num
train_op = tf.train.AdamOptimizer(lr).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

sess.run(tf.global_variables_initializer())
for i in range(2000):
    _batch_size = 128
    batch = mnist.train.next_batch(_batch_size)
    if (i+1) % 200 == 0:
        # Every 200 steps, evaluate on the current batch and print the training accuracy
        train_accuracy = sess.run(accuracy, feed_dict={_X: batch[0], y: batch[1], keep_prob:1.0, batch_size:_batch_size})
        # mnist.train.epochs_completed: number of epochs completed so far
        print("Iter%d, step %d, training accuracy %g" % (mnist.train.epochs_completed, (i+1), train_accuracy))
    # Run one training step (train_op runs every iteration, with dropout enabled)
    sess.run(train_op, feed_dict={_X: batch[0], y: batch[1], keep_prob:0.5, batch_size:_batch_size})
    
# Accuracy on the test set
print("test accuracy %g" % sess.run(accuracy, feed_dict={_X: mnist.test.images, y: mnist.test.labels, keep_prob:1.0, batch_size:mnist.test.images.shape[0]}))

Output:

Iter0, step 200, training accuracy 0.90625
Iter0, step 400, training accuracy 0.9375
Iter1, step 600, training accuracy 0.992188
Iter1, step 800, training accuracy 0.960938
Iter2, step 1000, training accuracy 0.984375
Iter2, step 1200, training accuracy 0.992188
Iter3, step 1400, training accuracy 0.992188
Iter3, step 1600, training accuracy 0.992188
Iter4, step 1800, training accuracy 0.976562
Iter4, step 2000, training accuracy 0.953125
test accuracy 0.9806
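
One caveat about the loss used above: -tf.reduce_mean(y * tf.log(y_pre)) averages over both the batch and the 10 classes, so it is the true cross-entropy scaled by 1/class_num, and taking the log of a softmax output can underflow to -inf. A sketch of the more conventional, numerically stable formulation (logits, cross_entropy2, and train_op2 are illustrative names):

# Work from the raw logits and let TensorFlow fuse the softmax and log,
# which avoids log(0) and the implicit 1/class_num scaling.
logits = tf.matmul(h_state, W) + bias
cross_entropy2 = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))
train_op2 = tf.train.AdamOptimizer(lr).minimize(cross_entropy2)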

II. Visualizing the RNN's per-step outputs for individual images

1. Output dimensions for 5 images passed through the RNN

# Per-step RNN outputs for individual images
_batch_size = 5
X_batch, y_batch = mnist.test.next_batch(_batch_size)
print(X_batch.shape, y_batch.shape)
# Each image is split into 28 rows, giving 28 outputs; 5 images; 256 features each
_outputs, _state = sess.run([outputs, state], feed_dict={_X: X_batch, y: y_batch, keep_prob:1.0, batch_size:_batch_size})
print('_outputs.shape =', np.asarray(_outputs).shape)

Output:

(5, 784) (5, 10)
_outputs.shape = (28, 5, 256)
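
Note the layout: because outputs was built as a Python list over time steps, _outputs is time-major, (time, batch, features). To read off one image's entire 28-step trajectory, it can be transposed to batch-major (batch_major is an illustrative name):

batch_major = np.transpose(np.asarray(_outputs), (1, 0, 2))
print(batch_major.shape)  # (5, 28, 256): batch_major[i] is the full sequence for image i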

2. Displaying one of the images

import matplotlib.pyplot as plt
print(mnist.train.labels[4])

X3 = mnist.train.images[4]
img3 = X3.reshape([28, 28])
plt.imshow(img3, cmap='gray')
plt.show()

[Figure 3]

3. Output dimensions for a single image passed through the RNN

X3.shape = [-1, 784]  # in-place reshape of the flat image to [1, 784]
y_batch = mnist.train.labels[0]  # note: this is the label of image 0, not of X3 (image 4); it only fills the placeholder and does not affect `outputs`
print(y_batch)
y_batch.shape = [-1, class_num]  # in-place reshape to [1, 10]

X3_outputs = np.array(sess.run(outputs, feed_dict={_X: X3, y: y_batch, keep_prob:1.0, batch_size:1}))
print(X3_outputs.shape)
X3_outputs.shape = [28, hidden_size]  # drop the batch dimension: (28, 1, 256) -> (28, 256)
print(X3_outputs.shape)

Output:

[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
(28, 1, 256)
(28, 256)

4. Visualizing the per-step class probabilities as bar charts

h_W = sess.run(W)        # W and bias are ordinary variables; no feed_dict is needed to read them
h_bias = sess.run(bias)
h_bias.shape = [-1, 10]  # in-place reshape to [1, 10]

bar_index = range(class_num)
for i in range(X3_outputs.shape[0]):
    # Split the figure into a 7x4 grid of 28 subplots
    plt.subplot(7, 4, i+1)
    X3_h_state = X3_outputs[i, :].reshape([-1, hidden_size])
    pro = sess.run(tf.nn.softmax(tf.matmul(X3_h_state, h_W) + h_bias))
    # Bar chart of the 10 class probabilities; pro[0] gives the bar heights (plt.bar defaults: color="blue", width=0.8)
    plt.bar(bar_index, pro[0], width=0.2, align='center')
    # Hide the axes
    plt.axis('off')
plt.show()

Output:

[Figure 4]

As the figure shows, at the final step (the last subplot) the fourth bar is the tallest, i.e., the predicted class is 3.
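
The same conclusion can be verified numerically from the values already fetched above, by pushing the final-step output through the softmax weights and taking the argmax (pred is an illustrative name):

# Predicted class at the final time step, reusing h_W and h_bias from above.
logits_last = np.matmul(X3_outputs[-1].reshape(1, -1), h_W) + h_bias
pred = np.argmax(logits_last, axis=1)
print(pred)  # expected: [3], matching the tallest bar in the last subplot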
