This article analyzes and compares the characteristics of an LSTM model and a 1D convolutional model.
Before comparing them, we need to load the two models trained earlier.
from keras.models import load_model
from keras.datasets import imdb
from keras.preprocessing.sequence import pad_sequences
model_conv = load_model("my_model_conv1d.h5")
model_lstm = load_model("my_model_lstm.h5")
Below is a helper function that encodes a string into a sequence of integer word indices.
import numpy as np

# Encode a string into a sequence of integer word indices
def encode_word(word):
    # Fetch the built-in word-to-index mapping of the IMDB dataset
    word_index = imdb.get_word_index()
    # Strip punctuation; apostrophes are kept, so contractions such as
    # "it's" stay intact and are looked up as single tokens
    word = word.replace(",", " ").replace(".", " ").replace("?", " ").replace("!", " ")
    word = word.replace('"', "").replace(":", "").strip()
    word = word.lower()
    word_list = word.split()  # split() without arguments also collapses repeated spaces
    # Indices are shifted by 3; a word missing from the mapping becomes -3 + 3 = 0
    return [(word_index.get(i, -3) + 3) for i in word_list]
Let's try encoding the following sentence:
print(encode_word("I think it's a great movie. People want to see it again."))
The output is:
[13, 104, 45, 6, 87, 20, 84, 181, 8, 67, 12, 174]
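The `+3` offset in `encode_word` reflects the index convention of Keras's IMDB loader: index 0 is reserved for padding, 1 for the start-of-sequence marker and 2 for unknown tokens, and real word indices are shifted up by 3 (the `index_from=3` default of `imdb.load_data`). A minimal sketch of this convention with a toy vocabulary (the raw indices below are made up for illustration):

```python
# Sketch of the index convention assumed by encode_word, using a toy
# vocabulary instead of the real imdb.get_word_index() mapping.
toy_index = {"great": 84, "movie": 17}  # hypothetical raw indices

def encode(tokens):
    # A word missing from the vocabulary gets -3 + 3 = 0, i.e. it is
    # silently turned into a padding token
    return [toy_index.get(t, -3) + 3 for t in tokens]

print(encode(["great", "movie", "zzz"]))  # [87, 20, 0]
```

Mapping unknown words to 0 is a quiet design choice: they become indistinguishable from padding, which both models learn to ignore.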
The next two functions run predictions with the 1D convolutional model and with the LSTM model.
maxlen = 500  # assumed here; must match the sequence length the models were trained with

# Prediction with the 1D convolutional model
def conv1d_predict(word):
    word = encode_word(word)
    word = np.array(word)
    word[word > 10000] = 0  # map out-of-vocabulary indices to 0
    word = np.expand_dims(word, 0)
    word = pad_sequences(word, maxlen=maxlen)
    imdb_score = model_conv.predict(word)[0][0]
    if imdb_score > 0.5:
        return "the score is:{} good".format(imdb_score)
    return "the score is:{} bad".format(imdb_score)
# Prediction with the LSTM model
def lstm_predict(word):
    word = encode_word(word)
    word = np.array(word)
    word[word > 10000] = 0  # map out-of-vocabulary indices to 0
    word = np.expand_dims(word, 0)
    word = pad_sequences(word, maxlen=maxlen)
    imdb_score = model_lstm.predict(word)[0][0]
    if imdb_score > 0.5:
        return "the score is:{} good".format(imdb_score)
    return "the score is:{} bad".format(imdb_score)
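Both functions rely on `pad_sequences` to force every input to the same length. With its defaults (`padding='pre'`, `truncating='pre'`), shorter sequences are zero-padded on the left and longer ones lose their earliest tokens. A rough pure-Python sketch of that behaviour for a single sequence:

```python
# Minimal sketch of pad_sequences' default behaviour for one sequence:
# pre-truncate to maxlen, then pre-pad with zeros.
def pad_left(seq, maxlen):
    seq = list(seq)[-maxlen:]               # keep only the last maxlen tokens
    return [0] * (maxlen - len(seq)) + seq  # fill the front with zeros

print(pad_left([13, 104, 45], 5))       # [0, 0, 13, 104, 45]
print(pad_left([1, 2, 3, 4, 5, 6], 5))  # [2, 3, 4, 5, 6]
```

One consequence worth noting: for reviews longer than `maxlen`, it is the beginning of the review that gets cut off, not the end.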
With the preparation done, let's start the analysis.
First, consider this review:
It's such a great movie. You can't imagine the shock when I saw it for the first time. My God! It's the best movie I've ever seen since I was born. There's nothing better. I believe that most people will keep the same view as me.
The 1D convolutional model predicts:
print(conv1d_predict("It's such a great movie. You can't imagine the shock when I saw it for the first time. My God! It's the best movie I've ever seen since I was born. There's nothing better. I believe that most people will keep the same view as me."))
The result:
the score is:0.9934569001197815 good
The model predicts a positive review, with a positive probability of 0.9934569001197815.
Now the LSTM model's prediction:
print(lstm_predict("It's such a great movie. You can't imagine the shock when I saw it for the first time. My God! It's the best movie I've ever seen since I was born. There's nothing better. I believe that most people will keep the same view as me."))
The result:
the score is:0.8504339456558228 good
The model predicts a positive review, with a positive probability of 0.8504339456558228.
Now look at the second review:
It's too bad. I can't watch it for a second. This movie is just making trouble. I don't understand what the director thinks. This kind of story can also be made. It's for children to watch.
The 1D convolutional model predicts:
print(conv1d_predict("It's too bad. I can't watch it for a second. This movie is just making trouble. I don't understand what the director thinks. This kind of story can also be made. It's for children to watch."))
The result:
the score is:0.02947079949080944 bad
The model predicts a negative review, with a positive probability of 0.02947079949080944.
Now the LSTM model's prediction:
print(lstm_predict("It's too bad. I can't watch it for a second. This movie is just making trouble. I don't understand what the director thinks. This kind of story can also be made. It's for children to watch."))
The result:
the score is:0.20686979591846466 bad
The model predicts a negative review, with a positive probability of 0.20686979591846466.
Judging from these two results, the 1D convolutional model seems to perform better: its probabilities are more decisive, closer to 1 for the positive review and closer to 0 for the negative one.
Let's analyze the first review:
It’s such a great movie. You can’t imagine the shock when I saw it for the first time. My God! It’s the best movie I’ve ever seen since I was born. There’s nothing better. I believe that most people will keep the same view as me.
The features captured by the 1D convolutional model carry no order information. It appears to have latched onto phrases such as "great movie", "best movie" and "shock", and therefore output a high probability. The LSTM model does not single out words this way: it reads the review in order and can capture the cause-and-effect structure of the text.
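This order-blindness is easy to demonstrate with a toy model (random weights and a window of size 1 for simplicity, not the article's trained network): a convolution followed by global max pooling produces exactly the same feature vector whether or not the sequence is shuffled.

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 8))  # toy embedding table: 50 words, dim 8
w = rng.normal(size=(8, 4))     # width-1 convolution = one linear map per position

def conv_maxpool(seq):
    feats = emb[seq] @ w        # (len, 4): one feature vector per position
    return feats.max(axis=0)    # global max pooling throws position away

seq = np.array([3, 17, 8, 42, 5])
print(np.allclose(conv_maxpool(seq), conv_maxpool(seq[::-1])))  # True
```

With wider windows a real Conv1D does see local word order inside each window, but after global pooling the position of an n-gram in the review is still discarded.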
Now the second review again:
It’s too bad. I can’t watch it for a second. This movie is just making trouble. I don’t understand what the director thinks. This kind of story can also be made. It’s for children to watch.
Similarly, the 1D convolutional model seems to have captured negative words such as "bad" and "making trouble", and so assigned a low probability of the review being positive. The LSTM is not "clever" in this way; it can only read the text in order and then make its prediction.
Judging by these results, both predictions are correct, but the 1D convolutional model seems to do better. It looks smarter, even: it judged the sentiment just by picking out a few emotion-bearing words. Should we then conclude that the 1D convolutional model is simply superior?
Talk is cheap, so let's keep going.
Next we feed both models a review of the movie adaptation of Love Apartment (爱情公寓). (I'm only borrowing someone else's review here and express no opinion of my own on the film.)
Here is the third review:
To be honest, the TV series love apartment is very good! But the movie is really, the plot is terrible. I have no idea what it's talking about. One moment is reality, one moment is science fiction (the aura of the protagonist), one moment is reality. Just no focus. Too often. The ending is too bad and too fake.
The 1D convolutional model predicts:
print(conv1d_predict("To be honest, the TV series love apartment is very good! But the movie is really, the plot is terrible. I have no idea what it's talking about. One moment is reality, one moment is science fiction (the aura of the protagonist), one moment is reality. Just no focus. Too often. The ending is too bad and too fake."))
The result:
the score is:0.7587981224060059 good
The model predicts a positive review, with a positive probability of 0.7587981224060059.
Now the LSTM model's prediction:
print(lstm_predict("To be honest, the TV series love apartment is very good! But the movie is really, the plot is terrible. I have no idea what it's talking about. One moment is reality, one moment is science fiction (the aura of the protagonist), one moment is reality. Just no focus. Too often. The ending is too bad and too fake."))
The result:
the score is:0.21752914786338806 bad
The model predicts a negative review, with a positive probability of 0.21752914786338806.
Here the difference finally shows.
The 1D convolutional model gets it wrong: it predicts a positive review, with a positive probability of 0.7587981224060059. The LSTM model gets it right: it predicts a negative review, with a positive probability of only 0.21752914786338806. Why does this happen, when the convolutional model just seemed the more accurate of the two? The answer lies in the different characteristics of the models.
Look at the review again:
To be honest, the TV series love apartment is very good! But the movie is really, the plot is terrible. I have no idea what it’s talking about. One moment is reality, one moment is science fiction (the aura of the protagonist), one moment is reality. Just no focus. Too often. The ending is too bad and too fake.
Here the 1D convolutional model seems to have captured both "very good!" and "is terrible". To a human reader the review clearly says the TV series is good but the movie is bad; the convolutional model is not that clever. It focuses only on those words, and although their sentiments are opposite, it judged the positive signal slightly stronger and output a probability of 0.7587981224060059. You can even see some hesitation in that number: unlike the first two predictions, it is far from certain.
The LSTM model, by contrast, reads the review in order. It registers the turn at "But", gives more weight to the judgments that follow, understands the logic of the sentence, and so predicts the review as negative.
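The contrast can be illustrated with a toy recurrence (random weights, not the trained LSTM): because the hidden state is updated token by token, reversing the input order changes the output, exactly the sensitivity that a convolution-plus-pooling stack lacks.

```python
import numpy as np

rng = np.random.default_rng(1)
emb = rng.normal(size=(50, 8))  # toy embedding table
Wx = rng.normal(size=(8, 4))    # input-to-hidden weights
Wh = rng.normal(size=(4, 4))    # hidden-to-hidden weights

def rnn_state(seq):
    h = np.zeros(4)
    for t in seq:  # the state depends on every earlier step, so order matters
        h = np.tanh(emb[t] @ Wx + h @ Wh)
    return h

seq = np.array([3, 17, 8, 42, 5])
print(np.allclose(rnn_state(seq), rnn_state(seq[::-1])))  # False
```

A real LSTM adds gates on top of this recurrence, but the order dependence comes from the same mechanism: each step's state is a function of the previous state.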
By now the difference between the two models should be clear: they look at the problem in entirely different ways, and this follows directly from their architectures.
I hope this article helps you understand the difference between the two models.