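MLPs and CNNs can only classify based on the current input; for time-series problems you need an RNN or LSTM model. This post uses an LSTM to analyze and predict sentiment on the IMDb dataset. For prediction with an MLP, see 【Keras-MLP】IMDb; for classification with an RNN, see 【Keras-RNN】IMDb.
LSTM (long short-term memory) is also a recurrent neural network, designed specifically to address the long-term dependency problem of RNNs: a traditional RNN struggles to learn connections between pieces of information that are far apart in the sequence. Put simply, an RNN has short-term memory but no long-term memory.
Without further ado, let's get started.
Reference / reposted
LSTM stands for Long Short-Term Memory. It can process sequential data and is widely used in natural language processing, speech recognition, and related fields. Compared with the original recurrent neural network (RNN), LSTM mitigates the RNN's vanishing-gradient problem and can handle long sequences, which has made it one of the most popular RNN variants.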
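Compared with a plain neuron, the single input becomes four inputs: the cell input plus the control signals for three gates. Each gate signal passes through a sigmoid, so it lies between 0 and 1.
The formulas are derived as follows.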
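For reference, one common formulation of an LSTM cell can be written as below. The notation is an assumption of this sketch and may differ from the original derivation: $x_t$ is the input, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the sigmoid, $\odot$ element-wise multiplication, and $W$, $U$, $b$ the input weights, recurrent weights, and biases.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell input)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The four weight sets ($f$, $i$, $o$, $c$) correspond to the four inputs mentioned above.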
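Q: At the input stage and the output stage, why use tanh rather than some other function?
A: Other functions can be used, for example relu, and they may even work better. The main reason for tanh is that its output lies between -1 and 1, which gives the LSTM a wider range of values to remember than a 0-to-1 activation would.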
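To make the value ranges concrete, here is a minimal sketch of one LSTM step for a single hidden unit with scalar input, written in plain Python. The function name `lstm_step_scalar`, the `weights` dictionary, and the numbers in it are hypothetical and chosen only for illustration; this is not how Keras implements the layer.

```python
import math

def sigmoid(x):
    """Logistic function; its output always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step_scalar(x, h_prev, c_prev, w):
    """One LSTM step for one hidden unit.

    w maps each of 'f', 'i', 'o', 'g' to a tuple (w_x, w_h, b):
    input weight, recurrent weight, and bias (illustrative layout).
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))      # forget gate, in (0, 1)
    i = sigmoid(pre("i"))      # input gate, in (0, 1)
    o = sigmoid(pre("o"))      # output gate, in (0, 1)
    g = math.tanh(pre("g"))    # candidate cell input, in (-1, 1)

    c = f * c_prev + i * g     # new cell state
    h = o * math.tanh(c)       # new hidden state
    return h, c, {"f": f, "i": i, "o": o, "g": g}

# Hypothetical weights: the same (w_x, w_h, b) for every weight set
weights = {name: (0.5, 0.5, 0.0) for name in ("f", "i", "o", "g")}
h, c, gates = lstm_step_scalar(x=1.0, h_prev=0.0, c_prev=0.0, w=weights)
print(gates, c, h)
```

The three gate values stay between 0 and 1 and act as soft switches, while the tanh outputs can be negative, so the cell state can move in both directions.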
import urllib.request
import os
import tarfile

# Download the dataset
url = "http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
filepath = "data/aclImdb_v1.tar.gz"
# Create the data directory first so urlretrieve has somewhere to write
os.makedirs("data", exist_ok=True)
if not os.path.isfile(filepath):
    result = urllib.request.urlretrieve(url, filepath)
    print('downloaded:', result)

# Extract the archive
if not os.path.exists("data/aclImdb"):
    with tarfile.open(filepath, 'r:gz') as tfile:
        tfile.extractall('data/')
Same as in 【Keras-MLP】IMDb.
from keras.datasets import imdb
from keras.preprocessing import sequence
from keras.preprocessing.text import Tokenizer
import os
import re

# Matches HTML tags such as <br />
re_tag = re.compile(r'<[^>]+>')

def rm_tags(text):
    """Strip HTML tags from a review."""
    return re_tag.sub('', text)

def read_files(filetype):
    """Read all reviews under data/aclImdb/<filetype>/ and return (labels, texts)."""
    path = "data/aclImdb/"
    file_list = []

    positive_path = path + filetype + "/pos/"
    for f in os.listdir(positive_path):
        file_list += [positive_path + f]

    negative_path = path + filetype + "/neg/"
    for f in os.listdir(negative_path):
        file_list += [negative_path + f]

    print('read', filetype, 'files:', len(file_list))

    # Each split holds 12,500 positive reviews followed by 12,500 negative ones
    all_labels = ([1] * 12500 + [0] * 12500)

    all_texts = []
    for fi in file_list:
        with open(fi, encoding='utf8') as file_input:
            all_texts += [rm_tags(" ".join(file_input.readlines()))]

    return all_labels, all_texts
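As a quick check of the tag-stripping step, the sketch below repeats the `rm_tags` helper and applies it to a made-up review string (the `sample` text is hypothetical, not taken from the dataset).

```python
import re

# Same pattern as in the post: anything between '<' and the next '>'
re_tag = re.compile(r'<[^>]+>')

def rm_tags(text):
    """Strip HTML tags from a review."""
    return re_tag.sub('', text)

# Hypothetical sample with the kind of markup found in raw reviews
sample = "A great film.<br /><br />Highly <b>recommended</b>."
cleaned = rm_tags(sample)
print(cleaned)  # A great film.Highly recommended.
```

Note that the tags are removed without inserting a space, so the text on either side of a tag is joined directly.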
Start processing
# Read the files
y_train, train_text = read_files("train")
y_test, test_text = read_files("test")
# Build the word-to-integer dictionary
token = Tokenizer(num_words=3800)
token.fit_on_texts(train_text)
# Map the words of each review to integers
x_train_seq = token.texts_to_sequences(train_text)
x_test_seq = token.texts_to_sequences(test_text)
# Pad or truncate every review to exactly 380 integers
x_train = sequence.pad_sequences(x_train_seq, maxlen=380)
x_test = sequence.pad_sequences(x_test_seq, maxlen=380)
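With its default arguments, Keras's `pad_sequences` pads short sequences with zeros at the front and truncates long ones from the front, keeping the last `maxlen` items. The following plain-Python sketch imitates that default behavior for a single sequence; `pad_pre` is a hypothetical helper written for illustration, not part of Keras, and it assumes `maxlen` is positive.

```python
def pad_pre(seq, maxlen, value=0):
    """Imitate pad_sequences defaults (padding='pre', truncating='pre').

    Assumes maxlen > 0.
    """
    seq = list(seq)[-maxlen:]                    # keep the last maxlen items
    return [value] * (maxlen - len(seq)) + seq   # left-pad with the fill value

short_review = [12, 7, 45]            # hypothetical word indices
long_review = [1, 2, 3, 4, 5, 6]

print(pad_pre(short_review, 5))   # [0, 0, 12, 7, 45]
print(pad_pre(long_review, 4))    # [3, 4, 5, 6]
```

For the model in this post, this means a review longer than 380 tokens keeps only its final 380 tokens.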
output
read train files: 25000
read test files: 25000
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation,Flatten
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
model = Sequential()
model.add(Embedding(output_dim=32,
                    input_dim=3800,
                    input_length=380))
model.add(Dropout(0.2))
# Add the LSTM layer
model.add(LSTM(32))
model.add(Dense(units=256,
                activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(units=1,
                activation='sigmoid'))
model.summary()
output
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 380, 32) 121600
_________________________________________________________________
dropout_1 (Dropout) (None, 380, 32) 0
_________________________________________________________________
lstm_1 (LSTM) (None, 32) 8320
_________________________________________________________________
dense_1 (Dense) (None, 256) 8448
_________________________________________________________________
dropout_2 (Dropout) (None, 256) 0
_________________________________________________________________
dense_2 (Dense) (None, 1) 257
=================================================================
Total params: 138,625
Trainable params: 138,625
Non-trainable params: 0
_________________________________________________________________
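Parameter counts
Embedding: 3800 * 32 = 121600
LSTM parameter count (see the referenced calculation): four weight sets, each with 32 input weights, 32 recurrent weights, and 1 bias per unit
LSTM: 4 * 32 * (32 + 32 + 1) = 8320
First Dense layer: 32 * 256 + 256 = 8448
Output Dense layer: 256 * 1 + 1 = 257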
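The same arithmetic can be checked with a few lines of plain Python. The helper names below are hypothetical; the formulas are the ones listed above, and the layer sizes come from the model summary.

```python
def embedding_params(vocab_size, embed_dim):
    """One embed_dim-sized vector per vocabulary entry."""
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    """Four weight sets (three gates plus the cell input), each with
    input weights, recurrent weights, and one bias per unit."""
    return 4 * units * (input_dim + units + 1)

def dense_params(input_dim, units):
    """Weight matrix plus one bias per unit."""
    return input_dim * units + units

counts = {
    "embedding_1": embedding_params(3800, 32),
    "lstm_1": lstm_params(32, 32),
    "dense_1": dense_params(32, 256),
    "dense_2": dense_params(256, 1),
}
total = sum(counts.values())
print(counts)
print("total:", total)  # total: 138625
```

The total matches the "Total params: 138,625" line in the summary; the Dropout layers contribute no parameters.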
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
train_history = model.fit(x_train, y_train, batch_size=100,
                          epochs=10, verbose=2,
                          validation_split=0.2)
For an explanation of the parameters, see
【Keras-MLP】MNIST or the Keras Chinese documentation
output
Train on 20000 samples, validate on 5000 samples
Epoch 1/10
- 74s - loss: 0.4944 - acc: 0.7517 - val_loss: 0.4402 - val_acc: 0.7676
Epoch 2/10
- 70s - loss: 0.2842 - acc: 0.8844 - val_loss: 0.3079 - val_acc: 0.8626
Epoch 3/10
- 69s - loss: 0.2373 - acc: 0.9067 - val_loss: 0.5062 - val_acc: 0.7886
Epoch 4/10
- 69s - loss: 0.2094 - acc: 0.9186 - val_loss: 0.3603 - val_acc: 0.8424
Epoch 5/10
- 71s - loss: 0.1953 - acc: 0.9253 - val_loss: 0.4827 - val_acc: 0.7898
Epoch 6/10
- 70s - loss: 0.1873 - acc: 0.9267 - val_loss: 0.3809 - val_acc: 0.8552
Epoch 7/10
- 71s - loss: 0.1794 - acc: 0.9326 - val_loss: 0.4820 - val_acc: 0.8036
Epoch 8/10
- 72s - loss: 0.1582 - acc: 0.9418 - val_loss: 0.6176 - val_acc: 0.7926
Epoch 9/10
- 70s - loss: 0.1442 - acc: 0.9479 - val_loss: 0.4762 - val_acc: 0.8340
Epoch 10/10
- 71s - loss: 0.1375 - acc: 0.9489 - val_loss: 0.5651 - val_acc: 0.8022
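In the log above, training accuracy keeps rising while validation accuracy fluctuates, which may indicate overfitting after the first few epochs. A minimal sketch for locating the best epoch, using values copied by hand from the log (the variable names are illustrative):

```python
# Validation metrics per epoch, copied from the training log above
val_acc = [0.7676, 0.8626, 0.7886, 0.8424, 0.7898,
           0.8552, 0.8036, 0.7926, 0.8340, 0.8022]
val_loss = [0.4402, 0.3079, 0.5062, 0.3603, 0.4827,
            0.3809, 0.4820, 0.6176, 0.4762, 0.5651]

# Epochs are numbered from 1 in the log, so add 1 to the list index
best_acc_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i]) + 1
best_loss_epoch = min(range(len(val_loss)), key=lambda i: val_loss[i]) + 1

print("best val_acc:", val_acc[best_acc_epoch - 1], "at epoch", best_acc_epoch)
print("best val_loss:", val_loss[best_loss_epoch - 1], "at epoch", best_loss_epoch)
```

Both metrics are best at epoch 2 in this particular run, so training for fewer epochs might give a similar or better validation result.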
Visualizing the results
# Jupyter notebook magic; omit it when running as a plain script
%pylab inline
import matplotlib.pyplot as plt

def show_train_history(train_history, train, validation):
    """Plot a training metric and its validation counterpart per epoch."""
    plt.plot(train_history.history[train])
    plt.plot(train_history.history[validation])
    plt.title('Train History')
    plt.ylabel(train)
    plt.xlabel('Epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()
Plot how accuracy changes
show_train_history(train_history, 'acc', 'val_acc')
Plot how loss changes
show_train_history(train_history, 'loss', 'val_loss')
scores = model.evaluate(x_test, y_test, verbose=1)
scores[1]
output
25000/25000 [==============================] - 42s 2ms/step
0.85292
Note: scores[0] is the loss.
View the predicted probabilities
probability = model.predict(x_test)
probability[:10]
output
array([[0.99887246],
[0.9926156 ],
[0.9984492 ],
[0.8551011 ],
[0.99635917],
[0.9980083 ],
[0.9960477 ],
[0.9684047 ],
[0.87130046],
[0.26892343]], dtype=float32)
View the predicted classes: probabilities above 0.5 become 1, and those below 0.5 become 0.
predict=model.predict_classes(x_test)
predict[:10]
output
array([[1],
[1],
[1],
[1],
[1],
[1],
[1],
[1],
[1],
[0]], dtype=int32)
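The mapping from probability to class is a simple threshold. The sketch below applies it in plain Python to the ten probabilities printed earlier; `to_class` is a hypothetical helper for illustration and does not call Keras.

```python
# First ten probabilities, copied from the model.predict output above
probabilities = [0.99887246, 0.9926156, 0.9984492, 0.8551011, 0.99635917,
                 0.9980083, 0.9960477, 0.9684047, 0.87130046, 0.26892343]

def to_class(p, threshold=0.5):
    """Return 1 (positive) if p exceeds the threshold, else 0 (negative)."""
    return 1 if p > threshold else 0

classes = [to_class(p) for p in probabilities]
print(classes)  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
```

The result agrees with the ten predicted classes shown above.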
SentimentDict = {1: 'positive', 0: 'negative'}

def display_test_Sentiment(i):
    """Print test review i with its true label and the predicted label."""
    print(test_text[i])
    print('label:', SentimentDict[y_test[i]],
          'prediction:', SentimentDict[predict[i][0]])
Call it
display_test_Sentiment(2)
output
BLACK WATER is a thriller that manages to completely transcend it’s limitations (it’s an indie flick) by continually subverting expectations to emerge as an intense experience.In the tradition of all good animal centered thrillers ie Jaws, The Edge, the original Cat People, the directors know that restraint and what isn’t shown are the best ways to pack a punch. The performances are real and gripping, the crocdodile is extremely well done, indeed if the Black Water website is to be believed that’s because they used real crocs and the swamp location is fabulous.If you are after a B-grade gore fest croc romp forget Black Water but if you want a clever, suspenseful ride that will have you fearing the water and wondering what the hell would I do if i was up that tree then it’s a must see.
label: positive prediction: positive
http://www.imdb.com/title/tt2771200
def predict_review(input_text):
    # Convert the review into a list of integers
    input_seq = token.texts_to_sequences([input_text])
    # Pad or truncate the list so every input has length 380
    pad_input_seq = sequence.pad_sequences(input_seq, maxlen=380)
    # Predict the class
    predict_result = model.predict_classes(pad_input_seq)
    # Print the result
    print(SentimentDict[predict_result[0][0]])
Call it
predict_review('''
It’s hard to believe that the same talented director who made the influential cult action classic The Road Warrior had anything to do with this disaster.
Road Warrior was raw, gritty, violent and uncompromising, and this movie is the exact opposite. It’s like Road Warrior for kids who need constant action in their movies.
This is the movie. The good guys get into a fight with the bad guys, outrun them, they break down in their vehicle and fix it. Rinse and repeat. The second half of the movie is the first half again just done faster.
The Road Warrior may have been a simple premise but it made you feel something, even with it’s opening narration before any action was even shown. And the supporting characters were given just enough time for each of them to be likable or relatable.
In this movie there is absolutely nothing and no one to care about. We’re supposed to care about the characters because… well we should. George Miller just wants us to, and in one of the most cringe worthy moments Charlize Theron’s character breaks down while dramatic music plays to try desperately to make us care.
Tom Hardy is pathetic as Max. One of the dullest leading men I’ve seen in a long time. There’s not one single moment throughout the entire movie where he comes anywhere near reaching the same level of charisma Mel Gibson did in the role. Gibson made more of an impression just eating a tin of dog food. I’m still confused as to what accent Hardy was even trying to do.
I was amazed that Max has now become a cartoon character as well. Gibson’s Max was a semi-realistic tough guy who hurt, bled, and nearly died several times. Now he survives car crashes and tornadoes with ease?
In the previous movies, fuel and guns and bullets were rare. Not anymore. It doesn’t even seem Post-Apocalyptic. There’s no sense of desperation anymore and everything is too glossy looking. And the main villain’s super model looking wives with their perfect skin are about as convincing as apocalyptic survivors as Hardy’s Australian accent is. They’re so boring and one-dimensional, George Miller could have combined them all into one character and you wouldn’t miss anyone.
Some of the green screen is very obvious and fake looking, and the CGI sandstorm is laughably bad. It wouldn’t look out of place in a Pixar movie.
There’s no tension, no real struggle, or any real dirt and grit that Road Warrior had. Everything George Miller got right with that masterpiece he gets completely wrong here.
''')
output
negative
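import os

# Make sure the output directory exists before writing
os.makedirs("SaveModel", exist_ok=True)
model_json = model.to_json()
with open("SaveModel/Imdb_RNN_model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("SaveModel/Imdb_RNN_model.h5")
print("Saved model to disk")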
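Disclaimer: the code comes from 《TensorFlow+Keras深度学习人工智能实践应用》 by 林大贵. Please credit the source when quoting or reposting. If you are interested in the book, consider buying a copy.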