class tf.contrib.rnn.BasicRNNCell(num_units, activation=None, reuse=None, name=None)
Input parameters:
- num_units: int, the number of units in the RNN cell.
- activation: Nonlinearity to use. Default: tanh.
- reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
- name: String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.
Output:
- a basic RNN cell (an instantiated cell) whose hidden layer has num_units units
Common attributes:
- state_size: size(s) of state(s) used by this cell, equal to the number of hidden units
- output_size: size of outputs produced by this cell
- Note: for this cell, state_size always equals output_size
Common methods:
- call(inputs, state): returns two identical hidden-state values (the output and the new state are the same tensor)
- zero_state(batch_size, dtype): returns an all-zero tensor of shape [batch_size, state_size]
import tensorflow as tf
cell = tf.contrib.rnn.BasicRNNCell(num_units=128)
print(cell.state_size) # 128
inputs = tf.placeholder(shape=[32, 100], dtype=tf.float32)  # 32 is the batch_size, 100 is the input depth
h0 = cell.zero_state(batch_size=32, dtype=tf.float32)  # all-zero initial state of shape (batch_size, state_size)
output, h1 = cell.call(inputs=inputs, state=h0)  # call advances one step along the time axis
print(h1.shape)  # (32, 128)
output == h1  # True: for BasicRNNCell the output and the new state are the same tensor
class tf.contrib.rnn.BasicLSTMCell(num_units, forget_bias=1.0, state_is_tuple=True, activation=None, reuse=None, name=None)
Input parameters:
- num_units: int, the number of units in the RNN cell.
- forget_bias: float, the bias added to forget gates. Must be set to 0.0 manually when restoring from CudnnLSTM-trained checkpoints.
- state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state.
- activation: Nonlinearity to use. Default: tanh.
- reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
- name: String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.
Output:
- a basic LSTM cell (an instantiated lstm_cell) whose hidden layer has num_units units
Common attributes:
- state_size: size(s) of state(s) used by this cell; with state_is_tuple=True it is LSTMStateTuple(c=num_units, h=num_units)
- output_size: size of outputs produced by this cell, equal to num_units
- Note: unlike BasicRNNCell, here state_size is not a single integer equal to output_size, but a tuple holding both the cell state c and the hidden state h
Common methods:
- call(inputs, state): returns new_h and new_state (an LSTMStateTuple containing c and h)
- zero_state(batch_size, dtype): returns an all-zero state of shape [batch_size, state_size]; note that here state_size is LSTMStateTuple(c=num_units, h=num_units)
The returned new_state combines new_c and new_h, while the output is just new_h alone. To obtain the final classification probabilities, a separate Softmax layer has to be added on top of new_h (a minimal sketch is given after the example below). From the cell's source code:
new_c = c * sigmoid(f + self._forget_bias) + sigmoid(i) * self._activation(j)
new_h = self._activation(new_c) * sigmoid(o)
if self._state_is_tuple:
  new_state = LSTMStateTuple(new_c, new_h)
else:
  new_state = array_ops.concat([new_c, new_h], 1)
return new_h, new_state
import tensorflow as tf
lstm_cell = tf.contrib.rnn.BasicLSTMCell(num_units=128)
print(lstm_cell.output_size) # 128
print(lstm_cell.state_size) # LSTMStateTuple(c=128, h=128)
inputs = tf.placeholder(shape=[32, 100], dtype=tf.float32)  # 32 is the batch_size, 100 is the input depth
h0 = lstm_cell.zero_state(batch_size=32, dtype=tf.float32)
print(h0)
# LSTMStateTuple(c=<tf.Tensor ... shape=(32, 128) dtype=float32>, h=<tf.Tensor ... shape=(32, 128) dtype=float32>)
new_h, new_state = lstm_cell.call(inputs=inputs, state=h0)  # call advances one step along the time axis
print(new_h.shape) # (32, 128)
print(new_state.h) # Tensor("mul_2:0", shape=(32, 128), dtype=float32)
print(new_state.c) # Tensor("add_1:0", shape=(32, 128), dtype=float32)
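As noted above, new_h by itself is not a probability distribution over classes. Below is a minimal sketch of the extra classification layer, continuing the example above; num_classes and the use of tf.layers.dense are illustrative assumptions, not part of the cell API:

num_classes = 10  # illustrative assumption
logits = tf.layers.dense(new_h, units=num_classes)  # linear projection to [32, num_classes]
probs = tf.nn.softmax(logits)  # per-example classification probabilities
print(probs.shape)  # (32, 10)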
Purpose: to overcome the limitation that the basic RNNCell can only advance one step in time per call.
Function: TF provides tf.nn.dynamic_rnn; using it is equivalent to calling the call function n times, i.e. it goes from (h0, x1, x2, ..., xn) directly to (h1, h2, ..., hn).
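For intuition, here is a minimal sketch of the manual stepping that tf.nn.dynamic_rnn automates, in the style of the examples above (the two placeholders x1 and x2 are illustrative):

cell = tf.contrib.rnn.BasicRNNCell(num_units=128)
x1 = tf.placeholder(shape=[32, 100], dtype=tf.float32)  # input at step 1
x2 = tf.placeholder(shape=[32, 100], dtype=tf.float32)  # input at step 2
h0 = cell.zero_state(batch_size=32, dtype=tf.float32)
out1, h1 = cell(x1, h0)  # advance one step
out2, h2 = cell(x2, h1)  # advance another step; the same weights are reused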
1. RNN
tf.nn.dynamic_rnn(cell, inputs, sequence_length=None, initial_state=None, dtype=None, parallel_iterations=None, swap_memory=False, time_major=False, scope=None)
Input parameters:
- cell: an RNNCell instance
- inputs: the RNN input sequence
- initial_state: the initial state of the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
- sequence_length: a vector of shape [batch_size], where each value is that example's sequence length (i.e. time_steps), e.g. sequence_length=tf.fill([batch_size], time_steps)
- time_major: defaults to False, in which case the input and output tensors have shape [batch_size, max_time, depth]; when True, it avoids transposes at the beginning and end of the RNN calculation, and the input and output tensors have shape [max_time, batch_size, depth]
- scope: VariableScope for the created subgraph; defaults to "rnn".
Output (outputs, state):
- outputs: the outputs of all time_steps, with shape [batch_size, max_time, cell.output_size]
- state: the hidden state of the last step, with shape [batch_size, cell.state_size]
(Figure: visualization of the transpose ops in the computation graph when time_major=False)
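A minimal usage sketch tying the arguments above together; the concrete sizes (batch_size=32, time_steps=10, depth=100, num_units=128) are illustrative assumptions:

import tensorflow as tf
batch_size, time_steps, depth = 32, 10, 100
cell = tf.contrib.rnn.BasicLSTMCell(num_units=128)
inputs = tf.placeholder(shape=[batch_size, time_steps, depth], dtype=tf.float32)
seq_len = tf.fill([batch_size], time_steps)  # every example uses the full length
h0 = cell.zero_state(batch_size=batch_size, dtype=tf.float32)
outputs, state = tf.nn.dynamic_rnn(cell, inputs, sequence_length=seq_len, initial_state=h0)
print(outputs.shape)  # (32, 10, 128) -> [batch_size, max_time, cell.output_size]
print(state.h.shape)  # (32, 128) -> hidden state of the last step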
2. BLSTM
tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, inputs, sequence_length=None, initial_state_fw=None, initial_state_bw=None, dtype=None, parallel_iterations=None, swap_memory=False, time_major=False, scope=None)
Input parameters:
- compared with 1 above, the only additions are a backward LSTMCell instance and a backward initial state; the inputs are the same, only the information flows in both directions
Output (outputs, output_states):
- outputs:
  - the outputs of all time_steps, as a tuple (output_fw, output_bw) containing the forward and backward results; each has shape [batch_size, max_time, cell_fw.output_size]
  - it returns a tuple instead of a single concatenated Tensor. If the concatenated one is preferred, the forward and backward outputs can be concatenated as tf.concat(outputs, 2)
- output_states: a tuple (output_state_fw, output_state_bw) containing the final states of the forward and backward passes
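A minimal sketch under the same illustrative sizes as above, showing the tuple of outputs and the optional concatenation:

cell_fw = tf.contrib.rnn.BasicLSTMCell(num_units=128)
cell_bw = tf.contrib.rnn.BasicLSTMCell(num_units=128)
inputs = tf.placeholder(shape=[32, 10, 100], dtype=tf.float32)
outputs, output_states = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, inputs, dtype=tf.float32)
output_fw, output_bw = outputs  # each has shape [batch_size, max_time, 128]
merged = tf.concat(outputs, 2)  # [batch_size, max_time, 256]
output_state_fw, output_state_bw = output_states  # final state of each direction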
In many cases a single-layer RNN has limited capacity and a multi-layer RNN is needed. In TensorFlow, RNNCells can be stacked with tf.nn.rnn_cell.MultiRNNCell.
import tensorflow as tf

# create 2 LSTMCells whose hidden layers have 128 and 256 units respectively
rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [128, 256]]
# create an RNN cell composed sequentially of a number of RNNCells
multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)
# data: the input sequence, shape [batch_size, max_time, depth] (depth 100 here is illustrative)
data = tf.placeholder(shape=[None, None, 100], dtype=tf.float32)
# 'outputs' is a tensor of shape [batch_size, max_time, 256]
# 'state' is an N-tuple where N is the number of LSTMCells, containing a
# tf.contrib.rnn.LSTMStateTuple for each cell
outputs, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell,
                                   inputs=data,
                                   dtype=tf.float32)