This post does not cover how to train word vectors or how to organize text data into the format Keras expects. Its goal is to implement the CNN classification model from the classic paper, shown in the figure below:
As the figure shows, the model convolves the input with several filter sizes in parallel (3, 4, and 5), whereas the Keras CNN text-classification examples circulating online use a single fixed filter size. This post therefore shows how to build a CNN text-classification model with varying filter sizes.
A typical Keras CNN model keeps the filter size fixed, as in the code below:
from keras import models, layers, optimizers

def build_model(dropout):
    embedding_dim = maxlen
    model = models.Sequential()
    # Embedding layer initialized with pretrained word vectors, frozen
    model.add(layers.Embedding(embedding_matrix.shape[0],
                               embedding_dim,
                               weights=[embedding_matrix],
                               input_length=maxlen,
                               trainable=False))
    model.add(layers.Conv1D(filters=128, kernel_size=3, activation='relu'))
    model.add(layers.MaxPool1D(pool_size=2, strides=1))
    model.add(layers.Flatten())
    model.add(layers.Dropout(dropout))
    model.add(layers.Dense(5, activation='softmax'))
    rmsprop = optimizers.RMSprop(lr=0.002, rho=0.9, epsilon=1e-06)
    model.compile(loss='categorical_crossentropy',
                  optimizer=rmsprop,
                  metrics=['accuracy'])
    return model
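As a quick sanity check on the model above: with the default 'valid' padding, each Conv1D and MaxPool1D step shrinks the sequence length as L_out = (L_in - window) // stride + 1. A minimal pure-Python sketch (the value of maxlen is illustrative, no Keras needed) traces the shapes:

```python
# Trace the sequence length through the fixed-filter model above.
# 'valid' padding: L_out = (L_in - window) // stride + 1

def conv1d_out_len(l_in, window, stride=1):
    """Output length of a 'valid' 1-D convolution or pooling step."""
    return (l_in - window) // stride + 1

maxlen = 100                      # illustrative input length
l = conv1d_out_len(maxlen, 3)     # Conv1D(kernel_size=3)            -> 98
l = conv1d_out_len(l, 2)          # MaxPool1D(pool_size=2, strides=1) -> 97
flat = l * 128                    # Flatten: 97 timesteps x 128 filters
print(l, flat)                    # 97 12416
```

This is the size the Dense layer sees after Flatten, which is why the dense part of the network grows with maxlen.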
To vary the filter size within one model, we need Keras's functional API.
import keras.backend as K
from keras.layers import Input, Embedding, Conv1D, MaxPooling1D, Flatten, Dense
from keras.layers.core import Lambda
from keras.models import Model

# Input
inp = Input(shape=(maxlen,))
# Embedding layer with frozen pretrained vectors
emb = Embedding(max_features, output_dim=300, weights=[embedding_matrix], trainable=False)(inp)
# One convolution branch per filter size
convs = []
filters = 128                    # number of feature maps per branch
filter_sizes = [2, 3, 4]
for fsz in filter_sizes:
    conv1 = Conv1D(filters, kernel_size=fsz, activation='relu')(emb)
    # pool1 = layers.GlobalMaxPool1D()(conv1)
    pool1 = MaxPooling1D(maxlen - fsz + 1)(conv1)   # pool over the whole sequence
    pool1 = Flatten()(pool1)
    convs.append(pool1)
# Concatenate the pooled branches; this must go through a Lambda layer,
# otherwise Keras raises an error
merger = Lambda(lambda x: K.concatenate(x, axis=1))
merge = merger(convs)
# out = Dropout(0.5)(merge)
# output = Dense(32, activation='relu')(out)
# output = Dropout(0.5)(merge)
output = Dense(units=Y_train.shape[1], activation='softmax')(merge)  # softmax pairs with categorical_crossentropy
model = Model([inp], output)
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
When concatenating the pooled layers for each filter size, calling K.concatenate(convs, axis=1) directly raises AttributeError: 'NoneType' object has no attribute '_inbound_nodes', because a raw backend operation is not a Keras layer and cannot be wired into the graph. Wrapping the call in a Lambda layer fixes this; see the Keras documentation for Lambda.
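To see why MaxPooling1D(maxlen - fsz + 1) works, note that each branch's pool window exactly equals its convolution output length, so every branch collapses to a single vector of length filters before concatenation. A pure-Python sketch with illustrative values (no Keras needed):

```python
# Feature size of each branch in the multi-filter model above,
# using the same 'valid'-padding length formula as Conv1D/MaxPooling1D.

maxlen, filters = 100, 128
filter_sizes = [2, 3, 4]

branch_dims = []
for fsz in filter_sizes:
    conv_len = maxlen - fsz + 1                        # Conv1D output length
    pool_size = conv_len                               # MaxPooling1D(maxlen - fsz + 1)
    pooled_len = (conv_len - pool_size) // pool_size + 1   # -> 1
    branch_dims.append(pooled_len * filters)           # Flatten: 1 x filters

print(branch_dims, sum(branch_dims))                   # [128, 128, 128] 384
```

So the concatenated vector fed to the final Dense layer has len(filter_sizes) * filters features, independent of maxlen.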
The same functional API makes other architectures just as easy to write, for example an LSTM variant:
from keras import layers
from keras.layers import Input, Embedding, Dense, Dropout
from keras.models import Model

inp = Input(shape=(maxlen,))
x = Embedding(max_features, output_dim=300, weights=[embedding_matrix], trainable=False)(inp)
x = layers.LSTM(64, return_sequences=True)(x)
x = layers.GlobalMaxPool1D()(x)
x = Dense(16, activation="relu")(x)
x = Dropout(0.1)(x)
x = Dense(5, activation="sigmoid")(x)
model = Model(inputs=inp, outputs=x)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
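GlobalMaxPool1D here keeps, for each of the 64 LSTM output features, its maximum over all timesteps, turning the sequence into a fixed-size vector. A tiny pure-Python illustration with toy numbers (no Keras needed):

```python
# GlobalMaxPool1D takes the maximum over the time axis, per feature.
# Toy 3-timestep, 4-feature sequence:
seq = [
    [0.1, 0.9, 0.3, 0.0],   # t = 0
    [0.5, 0.2, 0.8, 0.4],   # t = 1
    [0.7, 0.1, 0.6, 0.2],   # t = 2
]
pooled = [max(col) for col in zip(*seq)]   # one max per feature column
print(pooled)                              # [0.7, 0.9, 0.8, 0.4]
```

This is also why the earlier CNN branches could use GlobalMaxPool1D (the commented-out line) instead of MaxPooling1D plus Flatten: both reduce each branch to one vector per feature map.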