Beginner question: in text classification, I give the same text different representations by using different embeddings. The code is below:
# Imports used below (assumed; embedding_layer, embedding_layer_pos, MAX_TEXT_LENGTH
# and the custom metrics f1, getRecall, getPrecision are defined elsewhere in the script)
from keras.layers import Input, Dense, GRU, Bidirectional, Lambda
from keras.models import Model
from keras.utils import plot_model
from keras import backend as K
from keras import optimizers

# First embedding (the POS branch)
sentence_input_pos = Input(shape=(MAX_TEXT_LENGTH,), dtype='int32')
embedded_sequences_pos = embedding_layer_pos(sentence_input_pos)
l_lstm_pos = Bidirectional(GRU(50, return_sequences=False))(embedded_sequences_pos)
# Second embedding (the word branch)
sentence_input = Input(shape=(MAX_TEXT_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sentence_input)
l_lstm = Bidirectional(GRU(50, return_sequences=False))(embedded_sequences)
# Concatenate the two sentence representations
concatenate = Lambda(lambda x: K.concatenate(x, -1))([l_lstm, l_lstm_pos])
preds = Dense(5, activation='softmax')(concatenate)
modelll = Model(inputs=[sentence_input_pos, sentence_input], outputs=preds)
print(modelll.summary())
# Raw string so the backslashes in the Windows path are not treated as escapes
plot_model(modelll, to_file=r'C:\Users\ycl\Desktop\Flatten.png', show_shapes=True)
sgd = optimizers.SGD(momentum=0.9)
modelll.compile(loss='categorical_crossentropy',
                optimizer=sgd,
                metrics=["accuracy", f1, getRecall, getPrecision])
print("modelll fitting - Hierarchical attention network")
modelll.fit([x_train, x_train], y_train, validation_data=(x_val, y_val), epochs=10, batch_size=64)
Running this produces the following error:
Traceback (most recent call last):
File "C:\Users\ycl\Desktop\sentimentanalysisLSTM+POS+Word2vec11.py", line 357, in
modelll.fit([x_train,x_train], y_train, validation_data=(x_val, y_val),epochs=10, batch_size=64)
File "D:\software\Anaconda\envs\tensorflow-gpu\lib\site-packages\keras\engine\training.py", line 972, in fit
batch_size=batch_size)
File "D:\software\Anaconda\envs\tensorflow-gpu\lib\site-packages\keras\engine\training.py", line 751, in _standardize_user_data
exception_prefix='input')
File "D:\software\Anaconda\envs\tensorflow-gpu\lib\site-packages\keras\engine\training_utils.py", line 102, in standardize_input_data
str(len(data)) + ' arrays: ' + str(data)[:200] + '...')
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[144, 38, 145, 146, 39, 40, 147, 41, 4, 39, 148, 6, 149,
150, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, ...
Solution:
Change
modelll.fit([x_train, x_train], y_train, validation_data=(x_val, y_val), epochs=10, batch_size=64)
to:
modelll.fit([x_train, x_train], y_train, validation_data=([x_val, x_val], y_val), epochs=10, batch_size=64)
The problem is the validation set: the model has two inputs, so validation_data must also be given a list of two arrays, exactly like the training data.
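For reference, here is a minimal, self-contained sketch of the same two-input pattern with dummy data (the vocabulary size, embedding dimension and MAX_TEXT_LENGTH below are made up for the demo, and keras.layers.concatenate is used in place of the Lambda wrapper above). The point it illustrates is that every call that feeds the model, both fit's x argument and validation_data, must pass one array per Input layer.

import numpy as np
from keras.layers import Input, Dense, Embedding, GRU, Bidirectional, concatenate
from keras.models import Model

MAX_TEXT_LENGTH = 50  # hypothetical length, just for the demo

# Two inputs, analogous to the POS branch and the word branch above
in_a = Input(shape=(MAX_TEXT_LENGTH,), dtype='int32')
in_b = Input(shape=(MAX_TEXT_LENGTH,), dtype='int32')
branch_a = Bidirectional(GRU(8))(Embedding(1000, 16)(in_a))
branch_b = Bidirectional(GRU(8))(Embedding(1000, 16)(in_b))
out = Dense(5, activation='softmax')(concatenate([branch_a, branch_b]))
model = Model(inputs=[in_a, in_b], outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='sgd')

# Dummy data: 32 training samples, 8 validation samples, 5 one-hot classes
x_train = np.random.randint(0, 1000, size=(32, MAX_TEXT_LENGTH))
y_train = np.eye(5)[np.random.randint(0, 5, size=32)]
x_val = np.random.randint(0, 1000, size=(8, MAX_TEXT_LENGTH))
y_val = np.eye(5)[np.random.randint(0, 5, size=8)]

# One array per Input, for training and for validation:
# ([x_val, x_val], y_val), not (x_val, y_val)
model.fit([x_train, x_train], y_train,
          validation_data=([x_val, x_val], y_val),
          epochs=1, batch_size=16)

If the two branches are meant to consume different data (for example word IDs and POS-tag IDs), the two lists would of course contain different arrays rather than the same one twice.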