一个keras模型包含多个部件:
一种体系结构或配置,它指定模型包含哪些层以及如何连接它们。
一系列权重值(“模型状态”)。
优化器(通过编译模型定义)。
一组losses和metrics(通过编译模型或调用add_loss()或add_metric()定义)。
Keras API使得可以将这些一次保存到磁盘,或者仅选择性地保存其中一些:
将所有内容以TensorFlow SavedModel格式(或更旧的Keras H5格式)保存到单个存档中。 这是标准做法。
仅保存架构/配置,通常保存为JSON文件。
仅保存权重值。 通常在训练模型时使用。
让我们看一下这些选项中的每一个:什么时候使用其中一个? 它们如何工作?
如果您只有10秒钟的时间阅读本指南,则需要了解以下内容。
model = ... # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')
from tensorflow import keras
model = keras.models.load_model('path/to/location')
import numpy as np
import tensorflow as tf
from tensorflow import keras
您可以将整个模型保存到单个工件中。 它将包括:
def get_model():
# Create a simple model.
inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mean_squared_error")
return model
model = get_model()
# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)
# Calling `save('my_model')` creates a SavedModel folder `my_model`.
model.save("my_model")
# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_model")
# Let's check:
np.testing.assert_allclose(
model.predict(test_input), reconstructed_model.predict(test_input)
)
# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)
保存模型及其层时,SavedModel格式存储类名称,调用函数,损失和权重(以及配置(如果已实现))。 调用函数定义模型/层的计算图。
以下是从SavedModel格式加载自定义图层而不覆盖config方法时发生的情况的示例。
class CustomModel(keras.Model):
def __init__(self, hidden_units):
super(CustomModel, self).__init__()
self.dense_layers = [keras.layers.Dense(u) for u in hidden_units]
def call(self, inputs):
x = inputs
for layer in self.dense_layers:
x = layer(x)
return x
model = CustomModel([16, 16, 10])
# Build the model by calling it
input_arr = tf.random.uniform((1, 5))
outputs = model(input_arr)
model.save("my_model")
# Delete the custom-defined model class to ensure that the loader does not have
# access to it.
del CustomModel
loaded = keras.models.load_model("my_model")
np.testing.assert_allclose(loaded(input_arr), outputs)
print("Original model:", model)
print("Loaded model:", loaded)
Keras还支持保存包含模型的架构,权重值和compile()信息的单个HDF5文件。 它是SavedModel的轻量替代方案。
model = get_model()
# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)
# Calling `save('my_model.h5')` creates a h5 file `my_model.h5`.
model.save("my_h5_model.h5")
# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_h5_model.h5")
# Let's check:
np.testing.assert_allclose(
model.predict(test_input), reconstructed_model.predict(test_input)
)
# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)
模型的配置(或体系结构)指定了模型包含的层以及如何连接这些层*。 如果您具有模型的配置,则可以使用权重的新初始化状态创建模型,而无需编译信息。
这些类型的模型是显式的层图:它们的配置始终以结构化形式提供。
APIs
layer = keras.layers.Dense(3, activation="relu")
layer_config = layer.get_config()
new_layer = keras.layers.Dense.from_config(layer_config)
model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
config = model.get_config()
new_model = keras.Sequential.from_config(config)
inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(config)
子类化模型和层的体系结构在__init__和call方法中定义。 它们被认为是Python字节码,无法序列化为与JSON兼容的配置-您可以尝试序列化字节码(例如通过pickle),但这是完全不安全的,这意味着您的模型无法加载到其他系统上。
为了保存/加载具有自定义图层的模型或子类模型,您应该覆盖get_config和可选的from_config方法。 另外,您应该使用注册自定义对象,以便Keras知道它。
自定义函数(例如,激活丢失或初始化)不需要get_config方法。 只要将功能名称注册为自定义对象,该功能名称就足以加载。
可以加载Keras生成的TensorFlow图。 如果这样做,则无需提供任何custom_object。 您可以这样做:
model.save("my_model")
tensorflow_graph = tf.saved_model.load("my_model")
x = np.random.uniform(size=(4, 32)).astype(np.float32)
predicted = tensorflow_graph(x).numpy()
请注意,此方法有几个缺点:*由于可追溯性,您应该始终有权访问所使用的自定义对象。 您不想将无法重新创建的模型投入生产。 * tf.saved_model.load返回的对象不是Keras模型。 因此,它不是那么容易使用。 例如,您将无权访问.predict()或.fit()
即使不鼓励使用它,它也可以为您提供帮助,例如,如果您处于困境中,例如,丢失了自定义对象的代码或在使用tf.keras.models.load_model()加载模型时遇到问题。
Specifications:
class CustomLayer(keras.layers.Layer):
def __init__(self, a):
self.var = tf.Variable(a, name="var_a")
def call(self, inputs, training=False):
if training:
return inputs * self.var
else:
return inputs
def get_config(self):
return {"a": self.var.numpy()}
# There's actually no need to define `from_config` here, since returning
# `cls(**config)` is the default behavior.
@classmethod
def from_config(cls, config):
return cls(**config)
layer = CustomLayer(5)
layer.var.assign(2)
serialized_layer = keras.layers.serialize(layer)
new_layer = keras.layers.deserialize(
serialized_layer, custom_objects={"CustomLayer": CustomLayer}
)
Keras保留所有内置层,模型,优化器和度量标准类的主列表,该列表用于查找正确的类以调用from_config。 如果找不到该类,则会引发错误(值错误:未知图层)。 有几种方法可以将自定义类注册到此列表中:
class CustomLayer(keras.layers.Layer):
def __init__(self, units=32, **kwargs):
super(CustomLayer, self).__init__(**kwargs)
self.units = units
def build(self, input_shape):
self.w = self.add_weight(
shape=(input_shape[-1], self.units),
initializer="random_normal",
trainable=True,
)
self.b = self.add_weight(
shape=(self.units,), initializer="random_normal", trainable=True
)
def call(self, inputs):
return tf.matmul(inputs, self.w) + self.b
def get_config(self):
config = super(CustomLayer, self).get_config()
config.update({"units": self.units})
return config
def custom_activation(x):
return tf.nn.tanh(x) ** 2
# Make a model with the CustomLayer and custom_activation
inputs = keras.Input((32,))
x = CustomLayer(32)(inputs)
outputs = keras.layers.Activation(custom_activation)(x)
model = keras.Model(inputs, outputs)
# Retrieve the config
config = model.get_config()
# At loading time, register the custom objects with a `custom_object_scope`:
custom_objects = {"CustomLayer": CustomLayer, "custom_activation": custom_activation}
with keras.utils.custom_object_scope(custom_objects):
new_model = keras.Model.from_config(config)
您可以选择仅保存和加载模型的权重。 在以下情况下这可能很有用:
可以使用get_weights和set_weights在不同对象之间复制权重:
def create_layer():
layer = keras.layers.Dense(64, activation="relu", name="dense_2")
layer.build((None, 784))
return layer
layer_1 = create_layer()
layer_2 = create_layer()
# Copy weights from layer 2 to layer 1
layer_2.set_weights(layer_1.get_weights())
在内存中将权重从一个模型转移到具有兼容架构的另一个模型
# Create a simple functional model
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")
# Define a subclassed model with the same architecture
class SubclassedModel(keras.Model):
def __init__(self, output_dim, name=None):
super(SubclassedModel, self).__init__(name=name)
self.output_dim = output_dim
self.dense_1 = keras.layers.Dense(64, activation="relu", name="dense_1")
self.dense_2 = keras.layers.Dense(64, activation="relu", name="dense_2")
self.dense_3 = keras.layers.Dense(output_dim, name="predictions")
def call(self, inputs):
x = self.dense_1(inputs)
x = self.dense_2(x)
x = self.dense_3(x)
return x
def get_config(self):
return {"output_dim": self.output_dim, "name": self.name}
subclassed_model = SubclassedModel(10)
# Call the subclassed model once to create the weights.
subclassed_model(tf.ones((1, 784)))
# Copy weights from functional_model to subclassed_model.
subclassed_model.set_weights(functional_model.get_weights())
assert len(functional_model.weights) == len(subclassed_model.weights)
for a, b in zip(functional_model.weights, subclassed_model.weights):
np.testing.assert_allclose(a.numpy(), b.numpy())
由于无状态层不会更改权重的顺序或数量,因此即使存在额外的/缺少的无状态层,模型也可以具有兼容的体系结构。
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
# Add a dropout layer, which does not contain any weights.
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model_with_dropout = keras.Model(
inputs=inputs, outputs=outputs, name="3_layer_mlp"
)
functional_model_with_dropout.set_weights(functional_model.get_weights())
可以通过以下格式调用model.save_weights将权重保存到磁盘:
案例
# Runnable example
sequential_model = keras.Sequential(
[
keras.Input(shape=(784,), name="digits"),
keras.layers.Dense(64, activation="relu", name="dense_1"),
keras.layers.Dense(64, activation="relu", name="dense_2"),
keras.layers.Dense(10, name="predictions"),
]
)
sequential_model.save_weights("ckpt")
load_status = sequential_model.load_weights("ckpt")
# `assert_consumed` can be used as validation that all variable values have been
# restored from the checkpoint. See `tf.train.Checkpoint.restore` for other
# methods in the Status object.
load_status.assert_consumed()
TensorFlow Checkpoint格式使用对象属性名称保存和恢复权重。 例如,考虑tf.keras.layers.Dense层。 该层包含两个权重:densed.kernel和densed.bias。 将图层保存为tf格式后,生成的检查点将包含键“内核”和“偏差”及其对应的权重值。 有关更多信息,请参见TF Checkpoint指南中的“加载机制”。
请注意,属性/图形边缘是以父对象中使用的名称而不是变量的名称命名的。 在下面的示例中考虑CustomLayer。 变量CustomLayer.var与“ var”一起作为键的一部分保存,而不是“ var_a”。
class CustomLayer(keras.layers.Layer):
def __init__(self, a):
self.var = tf.Variable(a, name="var_a")
layer = CustomLayer(5)
layer_ckpt = tf.train.Checkpoint(layer=layer).save("custom_layer")
ckpt_reader = tf.train.load_checkpoint(layer_ckpt)
ckpt_reader.get_variable_to_dtype_map()
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")
# Extract a portion of the functional model defined in the Setup section.
# The following lines produce a new model that excludes the final output
# layer of the functional model.
pretrained = keras.Model(
functional_model.inputs, functional_model.layers[-1].input, name="pretrained_model"
)
# Randomly assign "trained" weights.
for w in pretrained.weights:
w.assign(tf.random.normal(w.shape))
pretrained.save_weights("pretrained_ckpt")
pretrained.summary()
# Assume this is a separate program where only 'pretrained_ckpt' exists.
# Create a new functional model with a different output dimension.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(5, name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="new_model")
# Load the weights from pretrained_ckpt into model.
model.load_weights("pretrained_ckpt")
# Check that all of the pretrained weights have been loaded.
for a, b in zip(pretrained.weights, model.weights):
np.testing.assert_allclose(a.numpy(), b.numpy())
print("\n", "-" * 50)
model.summary()
# Example 2: Sequential model
# Recreate the pretrained model, and load the saved weights.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
pretrained_model = keras.Model(inputs=inputs, outputs=x, name="pretrained")
# Sequential example:
model = keras.Sequential([pretrained_model, keras.layers.Dense(5, name="predictions")])
model.summary()
pretrained_model.load_weights("pretrained_ckpt")
# Warning! Calling `model.load_weights('pretrained_ckpt')` won't throw an error,
# but will *not* work as expected. If you inspect the weights, you'll see that
# none of the weights will have loaded. `pretrained_model.load_weights()` is the
# correct method to call.
通常建议对构建模型使用相同的API。 如果在“顺序”和“功能”或“功能和子类”等之间切换,则始终重建预训练模型并将预训练权重加载到该模型。
下一个问题是,如果模型架构完全不同,如何将权重保存并加载到不同的模型中? 解决方案是使用tf.train.Checkpoint保存和还原确切的图层/变量。
案例:
# Create a subclassed model that essentially uses functional_model's first
# and last layers.
# First, save the weights of functional_model's first and last dense layers.
first_dense = functional_model.layers[1]
last_dense = functional_model.layers[-1]
ckpt_path = tf.train.Checkpoint(
dense=first_dense, kernel=last_dense.kernel, bias=last_dense.bias
).save("ckpt")
# Define the subclassed model.
class ContrivedModel(keras.Model):
def __init__(self):
super(ContrivedModel, self).__init__()
self.first_dense = keras.layers.Dense(64)
self.kernel = self.add_variable("kernel", shape=(64, 10))
self.bias = self.add_variable("bias", shape=(10,))
def call(self, inputs):
x = self.first_dense(inputs)
return tf.matmul(x, self.kernel) + self.bias
model = ContrivedModel()
# Call model on inputs to create the variables of the dense layer.
_ = model(tf.ones((1, 784)))
# Create a Checkpoint with the same structure as before, and load the weights.
tf.train.Checkpoint(
dense=model.first_dense, kernel=model.kernel, bias=model.bias
).restore(ckpt_path).assert_consumed()
HDF5格式包含按图层名称分组的权重。 权重是通过将可训练权重列表与不可训练权重列表(与layer.weights相同)连接而排序的列表。 因此,如果模型具有与保存在检查点中相同的图层和可训练状态,则可以使用hdf5检查点。
案例
# Runnable example
sequential_model = keras.Sequential(
[
keras.Input(shape=(784,), name="digits"),
keras.layers.Dense(64, activation="relu", name="dense_1"),
keras.layers.Dense(64, activation="relu", name="dense_2"),
keras.layers.Dense(10, name="predictions"),
]
)
sequential_model.save_weights("weights.h5")
sequential_model.load_weights("weights.h5")
请注意,当模型包含嵌套图层时,更改layer.trainable可能导致不同的layer.weights排序。
class NestedDenseLayer(keras.layers.Layer):
def __init__(self, units, name=None):
super(NestedDenseLayer, self).__init__(name=name)
self.dense_1 = keras.layers.Dense(units, name="dense_1")
self.dense_2 = keras.layers.Dense(units, name="dense_2")
def call(self, inputs):
return self.dense_2(self.dense_1(inputs))
nested_model = keras.Sequential([keras.Input((784,)), NestedDenseLayer(10, "nested")])
variable_names = [v.name for v in nested_model.weights]
print("variables: {}".format(variable_names))
print("\nChanging trainable status of one of the nested layers...")
nested_model.get_layer("nested").dense_1.trainable = False
variable_names_2 = [v.name for v in nested_model.weights]
print("\nvariables: {}".format(variable_names_2))
print("variable ordering changed:", variable_names != variable_names_2)
从HDF5加载预训练的权重时,建议将权重加载到原始检查点模型中,然后将所需的权重/图层提取到新模型中
def create_functional_model():
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
return keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")
functional_model = create_functional_model()
functional_model.save_weights("pretrained_weights.h5")
# In a separate program:
pretrained_model = create_functional_model()
pretrained_model.load_weights("pretrained_weights.h5")
# Create a new model by extracting layers from the original model:
extracted_layers = pretrained_model.layers[:-1]
extracted_layers.append(keras.layers.Dense(5, name="dense_3"))
model = keras.Sequential(extracted_layers)
model.summary()
Serialization and saving