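1. Differences between RNN and CNN
A CNN shares its weights across space, whereas an RNN shares its weights across time. An RNN works on sequential data, where the inputs have a definite order.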
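The contrast between the two kinds of sharing can be sketched in a few lines of NumPy. This is only an illustration under assumed sizes (a 10-point signal, a 3-tap kernel, 7 time steps); the names `signal`, `kernel`, `sequence`, `Wx`, `Wh` and `b` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# CNN-style sharing: one small kernel slides over every spatial position.
signal = rng.normal(size=10)   # a 1-D "image" with 10 positions
kernel = rng.normal(size=3)    # 3 shared weights, reused at each position
conv_out = np.array([signal[i:i + 3] @ kernel for i in range(len(signal) - 2)])

# RNN-style sharing: one set of weights is reused at every time step,
# and the steps are processed in order because each depends on the previous one.
n_inputs, n_units, n_steps = 3, 5, 7
Wx = rng.normal(size=(n_inputs, n_units))
Wh = rng.normal(size=(n_units, n_units))
b = np.zeros(n_units)
sequence = rng.normal(size=(n_steps, n_inputs))

h = np.zeros(n_units)
for x_t in sequence:           # same Wx, Wh, b at every step
    h = np.tanh(x_t @ Wx + h @ Wh + b)

# Neither parameter count depends on the signal length or the sequence length.
n_cnn_params = kernel.size
n_rnn_params = Wx.size + Wh.size + b.size
print(conv_out.shape, h.shape, n_cnn_params, n_rnn_params)
```

Reordering the positions of `signal` only reorders the convolution windows, while reordering `sequence` generally changes the final state `h`, which is the sense in which the RNN input is ordered.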
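2. The RNN model
Unlike a traditional feedforward neural network, an RNN has a memory effect. This memory lets the network take context into account, and it explicitly represents the network's dependence on time.
This dependence on time can give rise to chaotic behavior.
Mathematically, the RNN model is written as: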
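As a sketch, one form that matches the manually unrolled code later in this article is shown below. The symbols $W_x$, $W_h$ and $b$ follow that code's variable names `Wx`, `Wh` and `b`; the zero initial state is an assumption taken from the way `h0` is computed there.

```latex
h_t = \tanh\left( x_t W_x + h_{t-1} W_h + b \right), \qquad h_{-1} = 0
```

Here $x_t$ has shape [batch, n_inputs], $W_x$ has shape [n_inputs, n_units], $W_h$ has shape [n_units, n_units], and $b$ has shape [n_units], so $h_t$ has shape [batch, n_units].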
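Together, the above forms a simple RNN layer, and the whole function can be abbreviated as:
For a multi-layer RNN, the output y is then fed onward as the input to the next layer: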
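The stacking can be sketched in NumPy as follows. It assumes a basic tanh cell whose output y_t equals its hidden state h_t, written compactly as y_t = f(x_t, h_{t-1}); the helper `rnn_layer` and all sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def rnn_layer(inputs, Wx, Wh, b):
    """Run one tanh RNN layer over a sequence.

    inputs:  array of shape [n_steps, batch, n_in]
    returns: array of shape [n_steps, batch, n_units], one output per step
    """
    n_units = Wh.shape[0]
    h = np.zeros((inputs.shape[1], n_units))  # zero initial state (assumption)
    outputs = []
    for x_t in inputs:
        h = np.tanh(x_t @ Wx + h @ Wh + b)
        outputs.append(h)
    return np.stack(outputs)

rng = np.random.default_rng(0)
n_steps, batch, n_inputs, n_units = 2, 4, 3, 5
x = rng.normal(size=(n_steps, batch, n_inputs))

# Layer 1 reads the raw inputs.
y1 = rnn_layer(x,
               rng.normal(size=(n_inputs, n_units)),
               rng.normal(size=(n_units, n_units)),
               np.zeros(n_units))

# Layer 2 reads layer 1's outputs y1 as its own input sequence.
y2 = rnn_layer(y1,
               rng.normal(size=(n_units, n_units)),
               rng.normal(size=(n_units, n_units)),
               np.zeros(n_units))

print(y1.shape, y2.shape)
```

Each layer has its own weights, and only the first layer's input width depends on the data; every later layer takes `n_units` features per step.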
3. Implementing a simple RNN with TensorFlow
# Requires TensorFlow 1.x (graph mode with placeholders and sessions).
import numpy as np
import tensorflow as tf

n_inputs = 3    # features per time step
n_units = 5     # hidden units in the RNN cell
batch_size = 4  # must match the number of rows fed below

# One placeholder per time step.
x0 = tf.placeholder(tf.float32, [None, n_inputs])
x1 = tf.placeholder(tf.float32, [None, n_inputs])

cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_units)
init_state = cell.zero_state(batch_size, dtype=tf.float32)
# static_rnn unrolls the cell over the list of inputs, one entry per time step.
outputs, states = tf.nn.static_rnn(cell, [x0, x1], initial_state=init_state)
h0, h1 = outputs

init = tf.global_variables_initializer()

x0_batch = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 0, 1]])  # t = 0
x1_batch = np.array([[9, 8, 7], [0, 0, 0], [6, 5, 4], [3, 2, 1]])  # t = 1

with tf.Session() as sess:
    sess.run(init)
    h0_val, h1_val = sess.run([h0, h1], feed_dict={x0: x0_batch, x1: x1_batch})
    print("the output of h0_val are")
    print(h0_val)
    print("the output of h1_val are")
    print(h1_val)
For comparison, the same two-step RNN unrolled by hand:
# Requires TensorFlow 1.x (graph mode with placeholders and sessions).
import numpy as np
import tensorflow as tf

# unroll RNN manually
n_inputs = 3
n_units = 5

x0 = tf.placeholder(tf.float32, [None, n_inputs])
x1 = tf.placeholder(tf.float32, [None, n_inputs])

# The same weights and bias are shared by both time steps.
Wx = tf.Variable(tf.random_normal([n_inputs, n_units], dtype=tf.float32), name="Wx")
Wh = tf.Variable(tf.random_normal([n_units, n_units], dtype=tf.float32), name="Wh")
b = tf.Variable(tf.zeros([n_units,], dtype=tf.float32), name="b")

# t = 0: the initial state is zero, so the Wh term drops out.
h0 = tf.tanh(tf.matmul(x0, Wx) + b)
# t = 1: the previous state h0 is fed back in through Wh.
h1 = tf.tanh(tf.matmul(x1, Wx) + tf.matmul(h0, Wh) + b)

init = tf.global_variables_initializer()

x0_batch = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 0, 1]])  # t = 0
x1_batch = np.array([[9, 8, 7], [0, 0, 0], [6, 5, 4], [3, 2, 1]])  # t = 1

with tf.Session() as sess:
    sess.run(init)
    h0_val, h1_val = sess.run([h0, h1], feed_dict={x0: x0_batch, x1: x1_batch})
    print("the output of h0_val are")
    print(h0_val)
    print("the output of h1_val are")
    print(h1_val)
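The two scripts will not print identical numbers, because each one draws its own random initial weights. What they share is the computation. As far as I know, `BasicRNNCell` in TensorFlow 1.x stores a single kernel applied to the concatenation of the input and the previous state; treat that as an assumption here. The NumPy sketch below checks that such a fused form gives the same result as the split `Wx`/`Wh` form above when the kernel is `Wx` stacked on `Wh`. The names `kernel`, `h_init`, `h0_fused` and `h1_fused` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_units = 3, 5
Wx = rng.normal(size=(n_inputs, n_units))
Wh = rng.normal(size=(n_units, n_units))
b = np.zeros(n_units)

x0_batch = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 0, 1]], dtype=np.float64)  # t = 0
x1_batch = np.array([[9, 8, 7], [0, 0, 0], [6, 5, 4], [3, 2, 1]], dtype=np.float64)  # t = 1

# Split form, as in the manually unrolled code.
h0 = np.tanh(x0_batch @ Wx + b)
h1 = np.tanh(x1_batch @ Wx + h0 @ Wh + b)

# Fused form: concatenate [x_t, h_{t-1}] and multiply by one stacked kernel.
kernel = np.vstack([Wx, Wh])                     # shape [n_inputs + n_units, n_units]
h_init = np.zeros((x0_batch.shape[0], n_units))  # zero initial state
h0_fused = np.tanh(np.concatenate([x0_batch, h_init], axis=1) @ kernel + b)
h1_fused = np.tanh(np.concatenate([x1_batch, h0_fused], axis=1) @ kernel + b)

print(np.allclose(h0, h0_fused), np.allclose(h1, h1_fused))
```

To compare the two TensorFlow scripts number for number, one would have to copy the cell's trained or initialized weights into `Wx`, `Wh` and `b`, which the original code does not do.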