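A Keras model consists of multiple components: its architecture (configuration), its weight values, its compilation information, and its optimizer state.
You can use the Keras API to save all of these pieces to disk at once, or to save only some of them selectively.
The sections below look at each option in turn: when should you choose it, and how does it work?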
model = ... # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')
Loading the model back:
from tensorflow import keras
model = keras.models.load_model('path/to/location')
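You can save an entire model to a single artifact. It will include:
The model's architecture/configuration
The model's weight values (learned during training)
The model's compilation information (if compile() was called)
The optimizer and its state, if any (which lets you resume training where you left off)
APIs for saving: model.save() or tf.keras.models.save_model()
API for loading: tf.keras.models.load_model()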
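You can save an entire model to disk in two formats: the TensorFlow SavedModel format and the older Keras H5 format. SavedModel is the recommended format and is the default when you call model.save().
You can switch to the H5 format by either:
Passing save_format='h5' to save().
Passing a filename that ends in .h5 or .keras to save() (in the TensorFlow 2.x releases this guide was written for).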
import numpy as np
import tensorflow as tf
from tensorflow import keras


def get_model():
    # Create a simple model.
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model

model = get_model()
# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)
# Calling `save('my_model')` creates a SavedModel folder `my_model`.
model.save("my_model")
# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_model")
# Let's check:
np.testing.assert_allclose(
model.predict(test_input), reconstructed_model.predict(test_input)
)
# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)
4/4 [==============================] - 0s 1ms/step - loss: 0.3874
INFO:tensorflow:Assets written to: my_model/assets
4/4 [==============================] - 0s 1ms/step - loss: 0.3378
<tensorflow.python.keras.callbacks.History at 0x7f2cec175588>
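Contents of the model folder:
Calling model.save('my_model') creates a folder named my_model, containing the following:
assets saved_model.pb variables
The model architecture and training configuration (including the optimizer, losses, and metrics) are stored in saved_model.pb.
The weights are saved in the variables/ directory.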
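Using the SavedModel format
A SavedModel contains a complete TensorFlow program: not only the weight values but also the computation. It does not need the original model-building code to run, which makes it useful for sharing and deployment (with TFLite, TensorFlow.js, TensorFlow Serving, or TensorFlow Hub).
For details on the low-level tf.saved_model API, see the TensorFlow SavedModel guide.
When saving a model and its layers, the SavedModel format stores the class name, call function, losses, and weights (and the config, if implemented). The call function defines the computation graph of the model/layer.
In the absence of a model/layer config, the call function is used to create a model that behaves like the original and can be trained, evaluated, and used for inference.
Nevertheless, it is always good practice to define the get_config and from_config methods when writing a custom model or layer class. This lets you easily update the computation later if needed. See the discussion of custom objects later in this document for details.
The example below shows what happens when a custom class is loaded from the SavedModel format without overriding the config methods.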
class CustomModel(keras.Model):
    def __init__(self, hidden_units):
        super(CustomModel, self).__init__()
        self.dense_layers = [keras.layers.Dense(u) for u in hidden_units]

    def call(self, inputs):
        x = inputs
        for layer in self.dense_layers:
            x = layer(x)
        return x

model = CustomModel([16, 16, 10])
# Build the model by calling it
input_arr = tf.random.uniform((1, 5))
outputs = model(input_arr)
model.save("my_model")
# Delete the custom-defined model class to ensure that the loader does not have
# access to it.
del CustomModel
loaded = keras.models.load_model("my_model")
np.testing.assert_allclose(loaded(input_arr), outputs)
print("Original model:", model)
print("Loaded model:", loaded)
INFO:tensorflow:Assets written to: my_model/assets
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
Original model: <__main__.CustomModel object at 0x7f2cec175fd0>
Loaded model: <tensorflow.python.keras.saving.saved_model.load.CustomModel object at 0x7f2d6ce80a58>
As the example above shows, the loader dynamically creates a new model class that behaves like the original model.
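For contrast, here is a minimal sketch of a subclassed model that does define get_config and from_config, so that it can be rebuilt from its configuration in memory. The class name ConfigurableModel and its hidden_units argument are illustrative assumptions, not part of the original guide.

```python
import tensorflow as tf
from tensorflow import keras


class ConfigurableModel(keras.Model):
    """Hypothetical subclassed model that exposes its constructor arguments."""

    def __init__(self, hidden_units, **kwargs):
        super().__init__(**kwargs)
        self.hidden_units = list(hidden_units)
        self.dense_layers = [keras.layers.Dense(u) for u in self.hidden_units]

    def call(self, inputs):
        x = inputs
        for layer in self.dense_layers:
            x = layer(x)
        return x

    def get_config(self):
        # Return only what __init__ needs to rebuild the architecture.
        return {"hidden_units": list(self.hidden_units)}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


model = ConfigurableModel([16, 16, 10])
config = model.get_config()
rebuilt = ConfigurableModel.from_config(config)

# The rebuilt model has the same architecture but freshly initialized weights.
rebuilt_output = rebuilt(tf.random.uniform((1, 5)))
print(config, rebuilt_output.shape)
```

Only the architecture travels through the config here; weight values would still have to be saved or copied separately.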
Keras also supports saving a single HDF5 file containing the model's architecture, weight values, and compile() information. It is a lightweight alternative to SavedModel.
model = get_model()
# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)
# Calling `save('my_h5_model.h5')` creates an h5 file `my_h5_model.h5`.
model.save("my_h5_model.h5")
# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_h5_model.h5")
# Let's check:
np.testing.assert_allclose(
model.predict(test_input), reconstructed_model.predict(test_input)
)
# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)
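Compared with the SavedModel format, an H5 file omits two things:
External losses and metrics added via model.add_loss() and model.add_metric() are not saved (unlike SavedModel). If your model has such losses and metrics and you want to resume training, you need to add them back yourself after loading the model. Note that this does not apply to losses/metrics created inside layers via self.add_loss() and self.add_metric(). As long as the layer is loaded, these losses and metrics are kept, because they are part of the layer's call method.
The computation graph of custom objects, such as custom layers, is not included in the saved file. At loading time, Keras needs access to the Python classes/functions of these objects in order to reconstruct the model. See the discussion of custom objects.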
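A model's configuration (or architecture) specifies which layers the model contains and how these layers are connected*. If you have the configuration of a model, you can create the model with freshly initialized weights and no compilation information.
*Note that this applies only to models defined with the Functional or Sequential APIs, not to subclassed models.
These types of models are explicit graphs of layers: their configuration is always available in a structured form.
APIs:
get_config() and from_config()
model.to_json() and tf.keras.models.model_from_json()
Calling config = model.get_config() returns a Python dictionary containing the model's configuration. The same model can then be rebuilt via Sequential.from_config(config) (for a Sequential model) or Model.from_config(config) (for a Functional API model).
The same workflow also works for any serializable layer.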
layer = keras.layers.Dense(3, activation="relu")
layer_config = layer.get_config()
new_layer = keras.layers.Dense.from_config(layer_config)
model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
config = model.get_config()
new_model = keras.Sequential.from_config(config)
inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(config)
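The config round trip can be inspected a little more closely. The sketch below, which reuses the small Sequential model from the examples above, looks at the config dictionary and checks that the rebuilt model has matching weight shapes; the variable names are illustrative. It assumes the config exposes a "layers" list with a "class_name" per entry, which is what Sequential models produce.

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
config = model.get_config()

# The config is a plain dictionary describing each layer.
layer_classes = [layer_cfg["class_name"] for layer_cfg in config["layers"]]
print(layer_classes)

new_model = keras.Sequential.from_config(config)

# Same architecture: the weight shapes match one to one.
old_shapes = [w.shape for w in model.get_weights()]
new_shapes = [w.shape for w in new_model.get_weights()]

# Weight values are not part of the config; copy them explicitly if needed.
new_model.set_weights(model.get_weights())
weights_match = all(
    np.allclose(a, b) for a, b in zip(model.get_weights(), new_model.get_weights())
)
```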
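model.to_json() and keras.models.model_from_json() work like get_config / from_config, except that the model is converted to a JSON string, which can later be loaded without the original model class. This approach is specific to models and is not meant for layers.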
model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()
new_model = keras.models.model_from_json(json_config)
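Because the result of to_json() is an ordinary string, it can be stored like any other text. The following is a minimal sketch that writes the JSON to a temporary file and rebuilds the architecture from it; the file name architecture.json is an arbitrary choice for illustration.

```python
import json
import os
import tempfile

from tensorflow import keras

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()

# Write the architecture to disk (hypothetical file name).
json_path = os.path.join(tempfile.mkdtemp(), "architecture.json")
with open(json_path, "w") as f:
    f.write(json_config)

# Read it back and rebuild the model; only the architecture is restored.
with open(json_path) as f:
    restored = keras.models.model_from_json(f.read())

parsed = json.loads(json_config)
restored_shapes = [w.shape for w in restored.get_weights()]
print(parsed["class_name"], restored_shapes)
```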
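The architecture of subclassed models and layers is defined in the __init__ and call methods. These methods are considered Python bytecode, which cannot be serialized into a JSON-compatible config. You could try serializing the bytecode (for example via pickle), but doing so is unsafe and means the model cannot be loaded on a different system.
To save/load a model with custom layers, or a subclassed model, you should override the get_config method and, optionally, the from_config method. In addition, you should register the custom object so that Keras is aware of it.
Custom functions (such as an activation, loss, or initializer) do not need a get_config method. Registering the function name as a custom object is enough for loading.
You can load the TensorFlow graph generated by Keras. Doing so does not require providing any custom_objects. You can do so like this: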
model.save("my_model")
tensorflow_graph = tf.saved_model.load("my_model")
x = np.random.uniform(size=(4, 32)).astype(np.float32)
predicted = tensorflow_graph(x).numpy()
INFO:tensorflow:Assets written to: my_model/assets
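Note that this method has several drawbacks. For example, the object returned by tf.saved_model.load is not a Keras model, so Keras methods such as .fit() and .predict() are not available on it.
Although its use is discouraged, this method can help when you are in a tight spot, for example if you have lost the code of your custom objects or have trouble loading the model with tf.keras.models.load_model().
For more information, see the page about tf.saved_model.load.
Specifications for the config methods: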
class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        super().__init__()
        self.var = tf.Variable(a, name="var_a")

    def call(self, inputs, training=False):
        if training:
            return inputs * self.var
        else:
            return inputs

    def get_config(self):
        return {"a": self.var.numpy()}

    # There's actually no need to define `from_config` here, since returning
    # `cls(**config)` is the default behavior.
    @classmethod
    def from_config(cls, config):
        return cls(**config)

layer = CustomLayer(5)
layer.var.assign(2)
serialized_layer = keras.layers.serialize(layer)
new_layer = keras.layers.deserialize(
serialized_layer, custom_objects={
"CustomLayer": CustomLayer}
)
Keras keeps a note of which class generated the config. In the example above, tf.keras.layers.serialize generates a serialized form of the custom layer:
{
'class_name': 'CustomLayer', 'config': {
'a': 2} }
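Keras keeps a master list of all built-in layer, model, optimizer, and metric classes, which is used to find the correct class on which to call from_config. If the class cannot be found, an error is raised (ValueError: Unknown layer). Custom classes can be made known to Keras in several ways, for example by passing custom_objects to the loading function or by using custom_object_scope, as in the following example: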
class CustomLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(CustomLayer, self).get_config()
        config.update({"units": self.units})
        return config


def custom_activation(x):
    return tf.nn.tanh(x) ** 2

# Make a model with the CustomLayer and custom_activation
inputs = keras.Input((32,))
x = CustomLayer(32)(inputs)
outputs = keras.layers.Activation(custom_activation)(x)
model = keras.Model(inputs, outputs)
# Retrieve the config
config = model.get_config()
# At loading time, register the custom objects with a `custom_object_scope`:
custom_objects = {"CustomLayer": CustomLayer, "custom_activation": custom_activation}
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.Model.from_config(config)
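Another registration route is the keras.utils.register_keras_serializable decorator, which adds a class to the lookup list when the class is defined, so no custom_objects argument is needed later. The sketch below assumes a hypothetical ScaleLayer and an arbitrary package name "Example"; neither comes from the original guide.

```python
from tensorflow import keras


@keras.utils.register_keras_serializable(package="Example")
class ScaleLayer(keras.layers.Layer):
    """Hypothetical layer that multiplies its input by a constant factor."""

    def __init__(self, factor=1.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        # Only the constructor argument is needed to rebuild this layer.
        return {"factor": self.factor}


layer = ScaleLayer(factor=3.0)
serialized = keras.layers.serialize(layer)

# No custom_objects argument is passed: the class is found via the registry.
restored = keras.layers.deserialize(serialized)
print(type(restored).__name__, restored.factor)
```

The decorator has to run before deserialization, so in practice the module defining the class must be imported in the loading program.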
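You can also clone a model in memory via tf.keras.models.clone_model(). This is equivalent to getting the config and then recreating the model from it, so it preserves neither the compilation information nor the layer weight values.
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.models.clone_model(model)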
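The claim that cloning does not carry weight values over can be checked directly. This is a small sketch with a built-in Dense layer (so no custom objects are involved); the variable names are illustrative.

```python
import numpy as np
from tensorflow import keras

inputs = keras.Input((32,))
outputs = keras.layers.Dense(4)(inputs)
model = keras.Model(inputs, outputs)

clone = keras.models.clone_model(model)

original_kernel = model.get_weights()[0]
clone_kernel = clone.get_weights()[0]
# Same shape, but the clone was re-initialized rather than copied.
same_values_before = np.allclose(original_kernel, clone_kernel)

# Copy the values over explicitly when an exact duplicate is wanted.
clone.set_weights(model.get_weights())
same_values_after = np.allclose(model.get_weights()[0], clone.get_weights()[0])
print(same_values_before, same_values_after)
```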
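You can choose to save and load only a model's weights. This can be useful, for example, when you only need the model for inference or when you are doing transfer learning from a pretrained model.
You can use get_weights and set_weights to copy weights between different objects:
def create_layer():
    layer = keras.layers.Dense(64, activation="relu", name="dense_2")
    layer.build((None, 784))
    return layer


layer_1 = create_layer()
layer_2 = create_layer()

# Copy weights from layer 1 to layer 2
layer_2.set_weights(layer_1.get_weights())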
# Create a simple functional model
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")
# Define a subclassed model with the same architecture
class SubclassedModel(keras.Model):
    def __init__(self, output_dim, name=None):
        super(SubclassedModel, self).__init__(name=name)
        self.output_dim = output_dim
        self.dense_1 = keras.layers.Dense(64, activation="relu", name="dense_1")
        self.dense_2 = keras.layers.Dense(64, activation="relu", name="dense_2")
        self.dense_3 = keras.layers.Dense(output_dim, name="predictions")

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.dense_2(x)
        x = self.dense_3(x)
        return x

    def get_config(self):
        return {"output_dim": self.output_dim, "name": self.name}

subclassed_model = SubclassedModel(10)
# Call the subclassed model once to create the weights.
subclassed_model(tf.ones((1, 784)))
# Copy weights from functional_model to subclassed_model.
subclassed_model.set_weights(functional_model.get_weights())
assert len(functional_model.weights) == len(subclassed_model.weights)
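for a, b in zip(functional_model.weights, subclassed_model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())
Because stateless layers do not change the order or number of weights, models can have compatible architectures even if there are extra or missing stateless layers.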
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
# Add a dropout layer, which does not contain any weights.
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model_with_dropout = keras.Model(
inputs=inputs, outputs=outputs, name="3_layer_mlp"
)
functional_model_with_dropout.set_weights(functional_model.get_weights())
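The transfer above can be verified with a short, self-contained check. The helper build_mlp below is a hypothetical, shortened version of the models in this section; it builds the same network with or without a Dropout layer and confirms that the weight lists line up.

```python
import numpy as np
from tensorflow import keras


def build_mlp(with_dropout):
    inputs = keras.Input(shape=(784,), name="digits")
    x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
    if with_dropout:
        # Dropout holds no weights, so it does not affect the weight list.
        x = keras.layers.Dropout(0.5)(x)
    outputs = keras.layers.Dense(10, name="predictions")(x)
    return keras.Model(inputs=inputs, outputs=outputs)


plain = build_mlp(with_dropout=False)
regularized = build_mlp(with_dropout=True)
regularized.set_weights(plain.get_weights())

# Both models expose the same number of weight arrays with equal values.
weight_counts = (len(plain.get_weights()), len(regularized.get_weights()))
for a, b in zip(plain.get_weights(), regularized.get_weights()):
    np.testing.assert_allclose(a, b)
print(weight_counts)
```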
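Weights can be saved to disk by calling model.save_weights in one of the following formats:
TensorFlow Checkpoint
HDF5
The default format for model.save_weights is TensorFlow Checkpoint. There are two ways to specify the save format: pass the save_format argument ('tf' or 'h5'), or use a path that ends in .h5 or .hdf5.
You can also retrieve the weights as in-memory NumPy arrays. Each API has its pros and cons, which are detailed below.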
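As a quick illustration of the in-memory route, the sketch below shows what get_weights() returns for a small model and how modified arrays can be written back with set_weights(). The two-layer model is a shortened, illustrative variant of the ones used in this section.

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential(
    [
        keras.Input(shape=(784,), name="digits"),
        keras.layers.Dense(64, activation="relu", name="dense_1"),
        keras.layers.Dense(10, name="predictions"),
    ]
)

# get_weights() returns a flat list of NumPy arrays: kernel, bias, kernel, bias.
weights = model.get_weights()
shapes = [w.shape for w in weights]
print(shapes)

# The arrays can be modified in memory and written back with set_weights().
zeroed = [np.zeros_like(w) for w in weights]
model.set_weights(zeroed)
all_zero = all(not w.any() for w in model.get_weights())
```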
# Runnable example
sequential_model = keras.Sequential(
[
keras.Input(shape=(784,), name="digits"),
keras.layers.Dense(64, activation="relu", name="dense_1"),
keras.layers.Dense(64, activation="relu", name="dense_2"),
keras.layers.Dense(10, name="predictions"),
]
)
sequential_model.save_weights("ckpt")
load_status = sequential_model.load_weights("ckpt")
# `assert_consumed` can be used as validation that all variable values have been
# restored from the checkpoint. See `tf.train.Checkpoint.restore` for other
# methods in the Status object.
load_status.assert_consumed()
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f2d6cb709e8>
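Format details
The TensorFlow Checkpoint format saves and restores weights using object attribute names. Take the tf.keras.layers.Dense layer as an example. The layer contains two weights: dense.kernel and dense.bias. When the layer is saved in the tf format, the resulting checkpoint contains the keys "kernel" and "bias" and their corresponding weight values. For more information, see "Loading mechanics" in the TF Checkpoint guide.
Note that attributes/graph edges are named after the name used in the parent object, not the name of the variable. Consider the CustomLayer in the example below. The variable CustomLayer.var is saved with "var" as part of its key, not "var_a".
class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        super().__init__()
        self.var = tf.Variable(a, name="var_a")
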
layer = CustomLayer(5)
layer_ckpt = tf.train.Checkpoint(layer=layer).save("custom_layer")
ckpt_reader = tf.train.load_checkpoint(layer_ckpt)
ckpt_reader.get_variable_to_dtype_map()
{
'save_counter/.ATTRIBUTES/VARIABLE_VALUE': tf.int64,
'_CHECKPOINTABLE_OBJECT_GRAPH': tf.string,
'layer/var/.ATTRIBUTES/VARIABLE_VALUE': tf.int32}
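The same naming rule can be observed without Keras at all. The sketch below swaps the layer for a plain tf.Module (a stand-in chosen here for illustration; Holder is a hypothetical class) and lists the checkpoint keys with tf.train.list_variables.

```python
import os
import tempfile

import tensorflow as tf


class Holder(tf.Module):
    """Hypothetical container: the attribute is `var`, the variable is named `var_a`."""

    def __init__(self, a):
        super().__init__()
        self.var = tf.Variable(a, name="var_a")


holder = Holder(5)
prefix = os.path.join(tempfile.mkdtemp(), "holder_ckpt")
ckpt_path = tf.train.Checkpoint(holder=holder).save(prefix)

# list_variables returns (key, shape) pairs for everything in the checkpoint.
keys = [name for name, _ in tf.train.list_variables(ckpt_path)]
print(keys)
```

The key is built from the keyword passed to tf.train.Checkpoint (holder) and the attribute name (var); the variable's own name does not appear.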
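Essentially, as long as two models have the same architecture, they can share the same checkpoint.
The two examples below show two ways of doing this.
Example 1: Functional model.
After building model 1, save its checkpoint. Then build model 2 so that it contains model 1's structure; model 1's weights can then be loaded directly into model 2 with load_weights().
Example 2: Sequential model.
Add the previously built model to a Sequential model, then load the weights through the original model.
Key code:
model = keras.Sequential([pretrained_model, keras.layers.Dense(5, name="predictions")])
pretrained_model.load_weights("pretrained_ckpt")
Source code:
# -------- Example 1
# ---- Step 1
# First, create a model.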
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")
# Take the model's inputs and the input to its last layer (that is, the
# output of the second-to-last layer).
# Extract a portion of the functional model defined in the Setup section.
# The following lines produce a new model that excludes the final output
# layer of the functional model.
pretrained = keras.Model(
    functional_model.inputs, functional_model.layers[-1].input, name="pretrained_model"
)
# Randomly assign "trained" weights.
for w in pretrained.weights:
    w.assign(tf.random.normal(w.shape))
pretrained.save_weights("pretrained_ckpt")
pretrained.summary()
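# ---- Step 2
# Suppose this is a different program. "Same architecture" here means that the
# pretrained model saved three layers (the 784-dimensional input layer and two
# Dense(64) layers) and this model contains the same three layers, followed by
# an extra Dense(5). Because the shared part matches, the pretrained weights
# can be loaded; the checkpoint only covers the layers up to the second Dense(64).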
# Assume this is a separate program where only 'pretrained_ckpt' exists.
# Create a new functional model with a different output dimension.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(5, name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="new_model")
# Load the weights from pretrained_ckpt into model.
model.load_weights("pretrained_ckpt")
# Check that all of the pretrained weights have been loaded.
# Verify that the pretrained weights match the weights of the newly built model.
for a, b in zip(pretrained.weights, model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())
print("\n", "-" * 50)
model.summary()
#--------------------------------------
# Example 2: Sequential model
# Recreate the pretrained model, and load the saved weights.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
pretrained_model = keras.Model(inputs=inputs, outputs=x, name="pretrained")
# Sequential example:
model = keras.Sequential([pretrained_model, keras.layers.Dense(5, name="predictions")])
model.summary()
pretrained_model.load_weights("pretrained_ckpt")
# Warning! Calling `model.load_weights('pretrained_ckpt')` won't throw an error,
# but will *not* work as expected. If you inspect the weights, you'll see that
# none of the weights will have loaded. `pretrained_model.load_weights()` is the
# correct method to call.
Output:
Model: "pretrained_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
digits (InputLayer) [(None, 784)] 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 50240
_________________________________________________________________
dense_2 (Dense) (None, 64) 4160
=================================================================
Total params: 54,400
Trainable params: 54,400
Non-trainable params: 0
_________________________________________________________________
--------------------------------------------------
Model: "new_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
digits (InputLayer) [(None, 784)] 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 50240
_________________________________________________________________
dense_2 (Dense) (None, 64) 4160
_________________________________________________________________
predictions (Dense) (None, 5) 325
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
pretrained (Functional) (None, 64) 54400
_________________________________________________________________
predictions (Dense) (None, 5) 325
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f2d6cb47080>
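It is generally recommended to stick to the same API for building models. If you switch between Sequential and Functional, or between Functional and subclassed models, always rebuild the pretrained model and load the pretrained weights into that model.
The next question is: how can weights be saved and loaded into different models if the model architectures are quite different? The solution is to use tf.train.Checkpoint to save and restore the exact layers/variables.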
# Create a subclassed model that essentially uses functional_model's first
# and last layers.
# First, save the weights of functional_model's first and last dense layers.
# Use tf.train.Checkpoint to save some of functional_model's layers and variables.
first_dense = functional_model.layers[1]
last_dense = functional_model.layers[-1]
ckpt_path = tf.train.Checkpoint(
dense=first_dense, kernel=last_dense.kernel, bias=last_dense.bias
).save("ckpt")
# Define the subclassed model.
class ContrivedModel(keras.Model):
    def __init__(self):
        super(ContrivedModel, self).__init__()
        self.first_dense = keras.layers.Dense(64)
        # add_weight replaces the deprecated add_variable.
        self.kernel = self.add_weight(name="kernel", shape=(64, 10))
        self.bias = self.add_weight(name="bias", shape=(10,))

    def call(self, inputs):
        x = self.first_dense(inputs)
        return tf.matmul(x, self.kernel) + self.bias


# Instantiate the contrived model.
model = ContrivedModel()
# Call model on inputs to create the variables of the dense layer.
_ = model(tf.ones((1, 784)))
# Create a Checkpoint with the same structure as before, and load the weights.
tf.train.Checkpoint(
    dense=model.first_dense, kernel=model.kernel, bias=model.bias
).restore(ckpt_path).assert_consumed()
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f2d6cb53940>
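The matching-by-name mechanism can also be seen in isolation with plain variables. In this sketch the source and destination variables are hypothetical stand-ins for a layer's kernel and bias; what links them is only the keyword names given to tf.train.Checkpoint.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# "Source" variables standing in for a trained layer's kernel and bias.
src_kernel = tf.Variable(tf.random.normal((64, 10)))
src_bias = tf.Variable(tf.random.normal((10,)))

prefix = os.path.join(tempfile.mkdtemp(), "partial_ckpt")
ckpt_path = tf.train.Checkpoint(kernel=src_kernel, bias=src_bias).save(prefix)

# "Destination" variables in an unrelated program, matched by keyword name.
dst_kernel = tf.Variable(tf.zeros((64, 10)))
dst_bias = tf.Variable(tf.zeros((10,)))
status = tf.train.Checkpoint(kernel=dst_kernel, bias=dst_bias).restore(ckpt_path)
# Check that every object passed to the new Checkpoint found a saved value.
status.assert_existing_objects_matched()

np.testing.assert_allclose(dst_kernel.numpy(), src_kernel.numpy())
np.testing.assert_allclose(dst_bias.numpy(), src_bias.numpy())
```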
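The HDF5 format contains weights grouped by layer name. The weights are lists ordered by concatenating the list of trainable weights to the list of non-trainable weights (the same as layer.weights).
Thus, a model can use an HDF5 checkpoint if it has the same layers and trainable statuses as those saved in the checkpoint.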
# Runnable example
sequential_model = keras.Sequential(
[
keras.Input(shape=(784,), name="digits"),
keras.layers.Dense(64, activation="relu", name="dense_1"),
keras.layers.Dense(64, activation="relu", name="dense_2"),
keras.layers.Dense(10, name="predictions"),
]
)
sequential_model.save_weights("weights.h5")
sequential_model.load_weights("weights.h5")
Note that changing layer.trainable may result in a different layer.weights ordering when the model contains nested layers. The variables of layers with trainable=False are placed at the end of the weight list.
class NestedDenseLayer(keras.layers.Layer):
    def __init__(self, units, name=None):
        super(NestedDenseLayer, self).__init__(name=name)
        self.dense_1 = keras.layers.Dense(units, name="dense_1")
        self.dense_2 = keras.layers.Dense(units, name="dense_2")

    def call(self, inputs):
        return self.dense_2(self.dense_1(inputs))

nested_model = keras.Sequential([keras.Input((784,)), NestedDenseLayer(10, "nested")])
variable_names = [v.name for v in nested_model.weights]
print("variables: {}".format(variable_names))
#-------------
print("\nChanging trainable status of one of the nested layers...")
nested_model.get_layer("nested").dense_1.trainable = False
variable_names_2 = [v.name for v in nested_model.weights]
print("\nvariables: {}".format(variable_names_2))
print("variable ordering changed:", variable_names != variable_names_2)
variables: ['nested/dense_1/kernel:0', 'nested/dense_1/bias:0', 'nested/dense_2/kernel:0', 'nested/dense_2/bias:0']
Changing trainable status of one of the nested layers...
variables: ['nested/dense_2/kernel:0', 'nested/dense_2/bias:0', 'nested/dense_1/kernel:0', 'nested/dense_1/bias:0']
variable ordering changed: True
When loading pretrained weights from HDF5, it is recommended to load the weights into the original checkpointed model, and then extract the desired weights/layers into a new model.
def create_functional_model():
    inputs = keras.Input(shape=(784,), name="digits")
    x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = keras.layers.Dense(10, name="predictions")(x)
    return keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

functional_model = create_functional_model()
functional_model.save_weights("pretrained_weights.h5")
# In a separate program:
pretrained_model = create_functional_model()
pretrained_model.load_weights("pretrained_weights.h5")
# Create a new model by extracting layers from the original model:
extracted_layers = pretrained_model.layers[:-1]
extracted_layers.append(keras.layers.Dense(5, name="dense_3"))
model = keras.Sequential(extracted_layers)
model.summary()
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 64) 50240
_________________________________________________________________
dense_2 (Dense) (None, 64) 4160
_________________________________________________________________
dense_3 (Dense) (None, 5) 325
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________
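1. To save an entire Keras model, use model.save() or tf.keras.models.save_model(), and load it back with keras.models.load_model(). The default is the SavedModel format.
2. To get a model's configuration and create a model with freshly initialized weights and no compilation information, use:
get_config() and from_config()
model.to_json() and tf.keras.models.model_from_json()
3. The APIs for transferring weights in memory are get_weights() and set_weights(). To save weights to disk instead, use model.save_weights("ckpt") and model.load_weights("ckpt").
Models with the same architecture can share weights; for models with different architectures, use tf.train.Checkpoint to save and restore individual layers/variables.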