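IMDB is a text sentiment analysis dataset: the task is to decide from a review whether the viewer rated the film positively or negatively.
The network structure is fairly simple: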
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #
================================================================================
embedding_1 (Embedding)             (None, 100, 128)                2560000
________________________________________________________________________________
bidirectional_1 (Bidirectional)     (None, 128)                     98816
________________________________________________________________________________
dropout_1 (Dropout)                 (None, 128)                     0
________________________________________________________________________________
dense_1 (Dense)                     (None, 1)                       129
================================================================================
Total params: 2,658,945
Trainable params: 2,658,945
Non-trainable params: 0
________________________________________________________________________________
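The parameter counts in this summary can be reproduced by hand. The sketch below assumes a vocabulary of 20000 words and 64 LSTM units per direction; neither figure is stated in the post, both are inferred from the output shapes and counts. The helper function names are made up for illustration.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry.
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # Four gates (input, forget, cell, output), each with an input kernel,
    # a recurrent kernel and a bias vector.
    return 4 * (input_dim * units + units * units + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

embedding = embedding_params(20000, 128)   # assumed vocabulary size: 20000
bidirectional = 2 * lstm_params(128, 64)   # forward + backward pass, assumed 64 units each
dense = dense_params(128, 1)
total = embedding + bidirectional + dense
print(embedding, bidirectional, dense, total)  # 2560000 98816 129 2658945
```

Dropout contributes no parameters, which is why its row shows 0.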
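A quick look at the IMDB dataset.
The x_train returned by load_data stores each review as a sequence of integer word ids, which is hard to read directly. The following code decodes the ids back into words: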
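from keras.datasets import imdb

# 20000 is an assumption: it matches the Embedding size above (2560000 / 128).
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=20000)

# Shift every word index by 3 to make room for the reserved ids below.
word_index = imdb.get_word_index()
word_index = {k: (v + 3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2  # unknown
word_index["<UNUSED>"] = 3
reverse_word_index = {value: key for key, value in word_index.items()}

def decode_review(text):
    # Map each id back to its word; '?' marks ids missing from the dictionary.
    return ' '.join([reverse_word_index.get(i, '?') for i in text])

for i in range(10):
    print(decode_review(x_train[i]))
    print(y_train[i])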
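This prints the actual text of each review. y_train prints as 0 or 1, meaning a negative or a positive review respectively.
Once the reviews have been padded to a length of 100, the shapes of x_train and y_train are: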
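To see why the indices are shifted by 3, here is a minimal round-trip sketch on a toy vocabulary. The three-word dictionary stands in for imdb.get_word_index(), and the marker names `<PAD>`, `<START>`, `<UNK>` and `<UNUSED>` as well as the encode_review helper are labels chosen for this sketch, not something the dataset itself provides.

```python
# Toy vocabulary standing in for imdb.get_word_index(); ranks start at 1.
toy_word_index = {"the": 1, "movie": 2, "great": 3}

# Shift by 3 so that ids 0..3 stay free for the reserved markers.
word_index = {word: rank + 3 for word, rank in toy_word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2
word_index["<UNUSED>"] = 3
reverse_word_index = {idx: word for word, idx in word_index.items()}

def encode_review(text):
    # Prepend the start marker and map unseen words to <UNK>.
    unk = word_index["<UNK>"]
    return [word_index["<START>"]] + [word_index.get(w, unk) for w in text.split()]

def decode_review(ids):
    return " ".join(reverse_word_index.get(i, "?") for i in ids)

encoded = encode_review("the movie was great")
print(encoded)                 # [1, 4, 5, 2, 6]
print(decode_review(encoded))  # <START> the movie <UNK> great
```

Without the shift, the most frequent words would collide with the ids reserved for padding, sequence start and unknown words.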
(25000, 100)
(25000,)
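load_data itself returns reviews of varying length, so a fixed second dimension of 100 presumably comes from padding and truncating each review, as Keras pad_sequences does. The pure-Python sketch below mimics that behaviour under its default settings (pad and truncate at the front); pad_to_length is a hypothetical helper and the id sequences are made up.

```python
def pad_to_length(seq, maxlen, value=0):
    # Keep only the last maxlen ids, then left-pad with `value`,
    # mirroring the 'pre' padding and 'pre' truncating defaults.
    seq = list(seq)[-maxlen:]
    return [value] * (maxlen - len(seq)) + seq

reviews = [[1, 14, 22], list(range(1, 151))]  # one short and one long made-up review
padded = [pad_to_length(r, 100) for r in reviews]
print([len(p) for p in padded])  # [100, 100]
```

The short review ends up as 97 zeros followed by its three ids, and the long one keeps only its last 100 ids.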
——————————————————————————————————
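Rather than start a separate post, here are the structures of a few other networks, kept for reference:
The network structure of imdb_cnn_lstm.py is as follows: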
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #
================================================================================
embedding_1 (Embedding)             (None, 100, 128)                2560000
________________________________________________________________________________
dropout_1 (Dropout)                 (None, 100, 128)                0
________________________________________________________________________________
conv1d_1 (Conv1D)                   (None, 96, 64)                  41024
________________________________________________________________________________
max_pooling1d_1 (MaxPooling1D)      (None, 24, 64)                  0
________________________________________________________________________________
lstm_1 (LSTM)                       (None, 70)                      37800
________________________________________________________________________________
dense_1 (Dense)                     (None, 1)                       71
________________________________________________________________________________
activation_1 (Activation)           (None, 1)                       0
================================================================================
Total params: 2,638,895
Trainable params: 2,638,895
Non-trainable params: 0
________________________________________________________________________________
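The sequence lengths 96 and 24 in this summary follow from the convolution and pooling arithmetic. The sketch below assumes a kernel size of 5, a pool size of 4 and 'valid' padding; these settings are inferred from the shapes rather than quoted from the script, and the function names are hypothetical.

```python
def conv1d_output_length(input_length, kernel_size, stride=1):
    # 'valid' padding: the kernel must fit entirely inside the sequence.
    return (input_length - kernel_size) // stride + 1

def pool1d_output_length(input_length, pool_size):
    # Non-overlapping windows (stride equals pool_size), 'valid' padding.
    return (input_length - pool_size) // pool_size + 1

def conv1d_params(kernel_size, in_channels, filters):
    # One kernel_size x in_channels kernel per filter, plus one bias per filter.
    return kernel_size * in_channels * filters + filters

conv_len = conv1d_output_length(100, 5)      # assumed kernel size: 5
pool_len = pool1d_output_length(conv_len, 4)  # assumed pool size: 4
conv_weights = conv1d_params(5, 128, 64)
print(conv_len, pool_len, conv_weights)  # 96 24 41024
```

The LSTM therefore sees only 24 time steps of 64 features each instead of the original 100 steps.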
The network structure of imdb_cnn.py is as follows:
____________________________________________________________________________________________________
Layer (type)                                 Output Shape                            Param #
====================================================================================================
embedding_1 (Embedding)                      (None, 400, 50)                         250000
____________________________________________________________________________________________________
dropout_1 (Dropout)                          (None, 400, 50)                         0
____________________________________________________________________________________________________
conv1d_1 (Conv1D)                            (None, 398, 250)                        37750
____________________________________________________________________________________________________
global_max_pooling1d_1 (GlobalMaxPooling1D)  (None, 250)                             0
____________________________________________________________________________________________________
dense_1 (Dense)                              (None, 250)                             62750
____________________________________________________________________________________________________
dropout_2 (Dropout)                          (None, 250)                             0
____________________________________________________________________________________________________
activation_1 (Activation)                    (None, 250)                             0
____________________________________________________________________________________________________
dense_2 (Dense)                              (None, 1)                               251
____________________________________________________________________________________________________
activation_2 (Activation)                    (None, 1)                               0
====================================================================================================
Total params: 350,751
Trainable params: 350,751
Non-trainable params: 0
____________________________________________________________________________________________________
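In this network the GlobalMaxPooling1D layer is what collapses the (398, 250) feature map into a single vector of 250 values: for every channel it keeps the maximum over all time steps. A tiny pure-Python sketch of that reduction, with a hypothetical helper name and made-up numbers (3 time steps, 2 channels):

```python
def global_max_pool_1d(feature_map):
    # feature_map: list of time steps, each a list of per-channel values.
    # zip(*...) regroups the values by channel; max() then reduces over time.
    return [max(channel) for channel in zip(*feature_map)]

feature_map = [
    [0.1, 0.9],
    [0.7, 0.2],
    [0.3, 0.4],
]
pooled = global_max_pool_1d(feature_map)
print(pooled)  # [0.7, 0.9]
```

Because only a maximum is taken, the layer has no trainable parameters, matching the 0 in its row.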
The network structure of imdb_lstm.py is:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, None, 128)         2560000
_________________________________________________________________
lstm_1 (LSTM)                (None, 128)               131584
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 129
=================================================================
Total params: 2,691,713
Trainable params: 2,691,713
Non-trainable params: 0
_________________________________________________________________