Keras FAQ

Introduction

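This post collects questions and tricks from the FAQ in the Keras documentation that I have actually used at work. It is adapted from the official Keras FAQ.
It mainly covers: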

  • Training on multiple GPUs
  • Getting the output of an intermediate layer
  • Freezing layers

Running on multiple GPUs

There are two ways to run a single model on multiple GPUs: data parallelism and device parallelism.

Data parallelism

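Data parallelism places one replica of the model on each GPU. The replicas process different parts of each batch at the same time, which speeds up training.
Keras has a built-in utility, keras.utils.multi_gpu_model, that can produce a data-parallel version of any model and achieves quasi-linear speedup on multiple GPUs.
See the multi_gpu_model documentation for more details.
Here is an example: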

from keras.utils import multi_gpu_model
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# This `fit` call will be distributed on 8 GPUs.
parallel_model.fit(x, y, epochs=20, batch_size=256) # batch size: 256, each GPU will process 32 samples.
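To make the batch arithmetic in the comment above concrete, here is a minimal pure-Python sketch of the data-parallel pattern: split a batch into sub-batches, run the same function on each, and concatenate the results. The helpers `split_batch`, `data_parallel_apply`, and `double` are hypothetical names used only for illustration and are not part of Keras. The sketch also assumes the batch size divides evenly by the number of replicas, and it loops sequentially where a real setup would run each replica on its own GPU.

```python
def split_batch(batch, n_replicas):
    """Split a list of samples into n_replicas contiguous, equal-sized sub-batches."""
    if len(batch) % n_replicas != 0:
        raise ValueError("batch size must be divisible by the number of replicas")
    size = len(batch) // n_replicas
    return [batch[i * size:(i + 1) * size] for i in range(n_replicas)]


def data_parallel_apply(fn, batch, n_replicas):
    """Apply fn to each sub-batch (sequentially here) and concatenate the outputs."""
    outputs = []
    for sub_batch in split_batch(batch, n_replicas):
        outputs.extend(fn(sub_batch))
    return outputs


def double(samples):
    # Stand-in for a model's forward pass on one replica.
    return [2 * s for s in samples]


batch = list(range(256))
sub_batches = split_batch(batch, 8)   # 8 sub-batches of 32 samples each
result = data_parallel_apply(double, batch, 8)
```

With a batch of 256 and 8 replicas, each sub-batch holds 32 samples, which matches the comment on the `fit` call.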

Device parallelism

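Device parallelism runs different branches of the same model on different GPUs. It works best for models with a parallel architecture; AlexNet, for example, split its convolutions across multiple GPUs. Here is an example: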

import tensorflow as tf
import keras

# Model where a shared LSTM is used to encode two different sequences in parallel
input_a = keras.Input(shape=(140, 256))
input_b = keras.Input(shape=(140, 256))
shared_lstm = keras.layers.LSTM(64)
# Process the first sequence on one GPU
with tf.device('/gpu:0'):
    encoded_a = shared_lstm(input_a)
# Process the next sequence on another GPU
with tf.device('/gpu:1'):
    encoded_b = shared_lstm(input_b)
# Concatenate results on CPU
with tf.device('/cpu:0'):
    merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

How to get the output of an intermediate layer

Getting the output from a Model

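Create a new Model that shares the original model's input and outputs the layer you are interested in, then call predict on it: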

from keras.models import Model
model = ...  # create the original model
layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)
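The pattern above leaves `model` and `data` unspecified, so here is a small self-contained sketch of the same idea. It assumes Keras (with a working backend) and NumPy are installed; the layer sizes, the random input, and the two-layer architecture are arbitrary choices for illustration.

```python
import numpy as np
from keras import Input, Model
from keras.layers import Dense

# A tiny functional model with a named hidden layer.
inputs = Input(shape=(4,))
hidden = Dense(3, activation='relu', name='my_layer')(inputs)
outputs = Dense(2, activation='softmax')(hidden)
model = Model(inputs=inputs, outputs=outputs)

# A second model that reuses the same layers but stops at 'my_layer'.
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer('my_layer').output)

data = np.random.rand(5, 4).astype('float32')
intermediate_output = intermediate_layer_model.predict(data, verbose=0)
```

The result has one row per input sample and one column per unit of `my_layer`. Because both models share layer instances, no weights are copied.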

Using a Keras function

from keras import backend as K
# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[3].output])
layer_output = get_3rd_layer_output([x])[0]

If the model has Dropout or BatchNormalization

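If the model contains layers such as Dropout or BatchNormalization, which behave differently in training and in testing, you need to pass the learning phase flag to the function: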

get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()], [model.layers[3].output])
# output in test mode = 0
layer_output = get_3rd_layer_output([x, 0])[0]
# output in train mode = 1
layer_output = get_3rd_layer_output([x, 1])[0]
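To illustrate why the flag matters, here is a minimal pure-Python sketch of inverted dropout, where a `training` argument plays the role of the learning phase. The function `dropout` and its `rng` parameter are hypothetical and written only for illustration; this is a simplified model of the behavior, not Keras's implementation.

```python
import random


def dropout(values, rate, training, rng=random):
    """Inverted dropout: zero out units and rescale the rest, but only when training."""
    if not training:
        # Test mode: the layer is the identity.
        return list(values)
    keep = 1.0 - rate
    # Train mode: keep each value with probability `keep`, scaled by 1 / keep.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


x = [1.0, 2.0, 3.0, 4.0]
test_out = dropout(x, rate=0.5, training=False)
train_out = dropout(x, rate=0.5, training=True, rng=random.Random(0))
```

In test mode the input passes through unchanged. In train mode each element is either dropped to 0.0 or scaled up, so the same input can produce different intermediate outputs depending on the flag.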

How to freeze layers

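Freezing a layer means that its parameters are not updated during training. This is mostly used when fine-tuning a model.
To freeze a layer, pass trainable=False when creating it: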
frozen_layer = Dense(32, trainable=False)
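Alternatively, set the trainable attribute after the layer has been created. For the change to take effect, call compile() on the model after modifying trainable: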

from keras.layers import Input, Dense
from keras.models import Model

x = Input(shape=(32,))
layer = Dense(32)
layer.trainable = False
y = layer(x)

frozen_model = Model(x, y)
# in the model below, the weights of `layer` will not be updated during training
frozen_model.compile(optimizer='rmsprop', loss='mse')

layer.trainable = True
trainable_model = Model(x, y)
# with this model the weights of the layer will be updated during training
# (which will also affect the above model since it uses the same layer instance)
trainable_model.compile(optimizer='rmsprop', loss='mse')

frozen_model.fit(data, labels)  # this does NOT update the weights of `layer`
trainable_model.fit(data, labels)  # this updates the weights of `layer`
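One way to check that freezing works is to compare a layer's weights before and after training. The sketch below assumes Keras (with a working backend) and NumPy are installed; the layer names `frozen` and `head`, the shapes, and the random data are arbitrary choices for illustration.

```python
import numpy as np
from keras import Input, Model
from keras.layers import Dense

# One frozen layer followed by one trainable layer.
inputs = Input(shape=(8,))
frozen = Dense(4, trainable=False, name='frozen')
head = Dense(1, name='head')
outputs = head(frozen(inputs))
model = Model(inputs, outputs)
model.compile(optimizer='rmsprop', loss='mse')

data = np.random.rand(32, 8).astype('float32')
labels = np.random.rand(32, 1).astype('float32')

# Snapshot the frozen layer's kernel and bias, train, then compare.
before = [w.copy() for w in frozen.get_weights()]
model.fit(data, labels, epochs=1, batch_size=8, verbose=0)
after = frozen.get_weights()

frozen_unchanged = all(np.array_equal(b, a) for b, a in zip(before, after))
```

Here `model.trainable_weights` contains only the kernel and bias of `head`, while the kernel and bias of `frozen` appear in `model.non_trainable_weights`.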
