[LSTM Study Notes 8] How to Develop Bidirectional LSTMs

一. Structure

1. Overview
A Bidirectional RNN (BRNN) runs over the sequence in both the forward and backward directions to make predictions, which requires that the complete input sequence be available before the output is predicted. The structure is shown in the figure below; see the paper "Bidirectional Recurrent Neural Networks" for details.
[Figure 1: BRNN structure]
2. Implementation
(1) The go_backwards argument of the LSTM layer specifies the direction in which the input sequence is processed:

model = Sequential()
model.add(LSTM(..., input_shape=(...), go_backwards=True))
...

(2) The Bidirectional wrapper:

model = Sequential()
model.add(Bidirectional(LSTM(...), input_shape=(...)))
...

3. Merging the forward and backward LSTM outputs
The outputs of the forward and backward LSTMs must be combined before being passed to subsequent layers as input. Keras defines the following merge modes:

  • 'sum': element-wise sum
  • 'mul': element-wise multiplication
  • 'concat': concatenation (the default); the output passed to the next layer is twice the size of a single unidirectional LSTM's output
  • 'ave': element-wise average
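As a rough numpy sketch (not Keras itself), the four merge modes combine the per-timestep forward output `fw` and backward output `bw`, each of shape (timesteps, units), as follows; the shapes and variable names are illustrative assumptions:

```python
import numpy as np

timesteps, units = 10, 50
# stand-ins for the per-timestep outputs of the forward and backward LSTMs
fw = np.random.rand(timesteps, units)
bw = np.random.rand(timesteps, units)

merged_sum = fw + bw                             # 'sum'    -> (10, 50)
merged_mul = fw * bw                             # 'mul'    -> (10, 50)
merged_cat = np.concatenate([fw, bw], axis=-1)   # 'concat' -> (10, 100), the default
merged_ave = (fw + bw) / 2.0                     # 'ave'    -> (10, 50)

print(merged_cat.shape)  # (10, 100): twice the width of a single LSTM's output
```

This is why, with 'concat', the layer that follows a Bidirectional LSTM sees twice as many features per timestep.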

二. Demonstrating a BRNN on the cumulative sum problem

1. Problem description:
Given a sequence of n_timesteps random numbers in [0, 1], define the threshold as one quarter of the input sequence length. The output at each step is 1 when the cumulative sum of the inputs so far exceeds the threshold, and 0 otherwise. For example, for the input sequence:
0.63144003 0.29414551 0.91587952 0.95189228 0.32195638 0.60742236 0.83895793 0.18023048 0.84762691 0.29165514
the output should be 0001111111. The output is produced only after the entire input sequence has been fed into the network.
[Figure 2: the cumulative sum problem]
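The example above can be checked directly with numpy: with n_timesteps = 10 the threshold is 10/4 = 2.5, and thresholding the cumulative sum reproduces the expected output:

```python
import numpy as np

x = np.array([0.63144003, 0.29414551, 0.91587952, 0.95189228, 0.32195638,
              0.60742236, 0.83895793, 0.18023048, 0.84762691, 0.29165514])
limit = len(x) / 4.0                      # threshold: a quarter of the length = 2.5
y = (np.cumsum(x) >= limit).astype(int)   # 1 once the running sum reaches the threshold
print(''.join(map(str, y)))               # -> 0001111111
```

The running sum first reaches 2.5 at the fourth step (0.631 + 0.294 + 0.916 + 0.952 ≈ 2.79), so the first three outputs are 0 and the rest are 1.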
2. Code implementation:

from random import random
from numpy import array
from numpy import cumsum
from numpy import array_equal
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import TimeDistributed
from keras.layers import Bidirectional

# create a cumulative sum sequence
def get_sequence(n_timesteps):
    # create a sequence of random numbers in [0,1]
    X = array([random() for _ in range(n_timesteps)])
    # calculate cut-off value to change class values
    limit = n_timesteps/4.0
    # determine the class outcome for each item in cumulative sequence
    y = array([0 if x < limit else 1 for x in cumsum(X)])
    return X, y

# create multiple samples of cumulative sum sequences
def get_sequences(n_sequences, n_timesteps):
    seqX, seqY = list(), list()
    # create and store sequences
    for _ in range(n_sequences):
        X, y = get_sequence(n_timesteps)
        seqX.append(X)
        seqY.append(y)
    # reshape input and output for lstm
    seqX = array(seqX).reshape(n_sequences, n_timesteps, 1)
    seqY = array(seqY).reshape(n_sequences, n_timesteps, 1)
    return seqX, seqY
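To illustrate the reshape at the end of get_sequences, a quick numpy check (the sample counts here are arbitrary) confirms the [samples, timesteps, features] layout that Keras LSTMs expect:

```python
import numpy as np
from random import random

n_sequences, n_timesteps = 3, 10
# build the same nested-list structure get_sequences produces before reshaping
seqX = [[random() for _ in range(n_timesteps)] for _ in range(n_sequences)]
X = np.array(seqX).reshape(n_sequences, n_timesteps, 1)
print(X.shape)  # -> (3, 10, 1): [samples, timesteps, features]
```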

# define problem
n_timesteps = 10

# define LSTM
model = Sequential()
model.add(Bidirectional(LSTM(50, return_sequences=True), input_shape=(n_timesteps, 1)))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
print(model.summary())

# train LSTM
X, y = get_sequences(50000, n_timesteps)
model.fit(X, y, epochs=1, batch_size=10)

# evaluate LSTM
X, y = get_sequences(100, n_timesteps)
loss, acc = model.evaluate(X, y, verbose=0)
print('Loss: %f, Accuracy: %f' % (loss, acc*100))

# make predictions
for _ in range(10):
    X, y = get_sequences(1, n_timesteps)
    # predict_classes belongs to the older Keras Sequential API; in TF2-era Keras
    # use (model.predict(X) > 0.5).astype(int) instead
    yhat = model.predict_classes(X, verbose=0)
    exp, pred = y.reshape(n_timesteps), yhat.reshape(n_timesteps)
    print('y=%s, yhat=%s, correct=%s' % (exp, pred, array_equal(exp, pred)))
