These are my study notes on the official Keras documentation on saving and loading models.
Introduction
A Keras model consists of the following components:
- The architecture/configuration, which defines the model's structure and how its parts are connected
- The weights
- The optimizer
- The losses and metrics
The Keras APIs let you save all of these pieces together as a single model, or save each piece separately:
- Save everything as a TensorFlow SavedModel (or in the Keras H5 format)
- Save only the architecture (usually as JSON)
- Save only the weights
TL;DR
The simplest save & load:

```python
model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')

from tensorflow import keras
model = keras.models.load_model('path/to/location')
```

The rest of this article covers the details.
Saving the whole model
APIs
- model.save() or tf.keras.models.save_model()
- tf.keras.models.load_model()
A model can be saved in two formats:
- TensorFlow SavedModel (recommended)
- H5 (the older Keras format)
SavedModel is the default for model.save(). To use H5 instead, either pass save_format='h5' to save(), or use a filename that ends in .h5.
Example:

```python
import numpy as np
from tensorflow import keras


def get_model():
    # Create a simple model.
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_model')` creates a SavedModel folder `my_model`.
model.save("my_model")

# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_model")

# Let's check:
np.testing.assert_allclose(
    model.predict(test_input), reconstructed_model.predict(test_input)
)

# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)
```
What does a SavedModel contain?
Calling model.save('my_model') creates a folder named my_model containing the following three items:
- assets
- saved_model.pb : the model architecture, optimizer, metrics, and so on
- variables : a folder containing the weights
For more details, see the SavedModel guide (The SavedModel format on disk).
How SavedModel handles custom objects
When saving a model and its layers, the SavedModel format stores each class name, the call function, the losses, and the weights (and the config, if get_config is implemented). The call function defines the computation graph of the model/layer.
When no model/layer config is available, the call function is used to create a model that behaves like the original one; it can be trained, evaluated, and used for prediction.
Nevertheless, it is always good practice to define get_config and from_config when writing a custom model or layer class, so the computation can easily be updated later when needed.
Here is what happens when a custom layer is loaded from the SavedModel format without overriding the config methods:
```python
import numpy as np
import tensorflow as tf
from tensorflow import keras


class CustomModel(keras.Model):
    def __init__(self, hidden_units):
        super(CustomModel, self).__init__()
        self.dense_layers = [keras.layers.Dense(u) for u in hidden_units]

    def call(self, inputs):
        x = inputs
        for layer in self.dense_layers:
            x = layer(x)
        return x


model = CustomModel([16, 16, 10])
# Build the model by calling it
input_arr = tf.random.uniform((1, 5))
outputs = model(input_arr)
model.save("my_model")

# Delete the custom-defined model class to ensure that the loader does not have
# access to it.
del CustomModel

loaded = keras.models.load_model("my_model")
np.testing.assert_allclose(loaded(input_arr), outputs)

print("Original model:", model)
print("Loaded model:", loaded)
```
```
INFO:tensorflow:Assets written to: my_model/assets
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
Original model: <__main__.CustomModel object at 0x7f0fbc3f8208>
Loaded model:
```
The Keras H5 format
Keras H5 packs all of the information into a single HDF5 file (unrelated to HDFS from the big-data world).
```python
model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_h5_model.h5')` creates an h5 file `my_h5_model.h5`.
model.save("my_h5_model.h5")

# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_h5_model.h5")

# Let's check:
np.testing.assert_allclose(
    model.predict(test_input), reconstructed_model.predict(test_input)
)

# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)
```
Limitations of the H5 format:
- External losses & metrics: anything added via model.add_loss() and model.add_metric() is lost and must be re-added after loading. However, if the losses/metrics were added inside a custom Model subclass via self.add_loss() and self.add_metric(), they are preserved in the H5 file, because they are part of the model itself.
- Custom objects in the computation graph (DAG): custom layers, for example, are not saved in the H5 file, so the developer must supply them again when loading.
Saving the architecture
This section only applies to models built with the Sequential or Functional API; it does not cover subclassed models.
APIs:
- get_config() and from_config()
- model.to_json() and tf.keras.models.model_from_json()
get_config() and from_config()
Calling config = model.get_config() returns a Python dict describing the network's configuration.
Layer:

```python
layer = keras.layers.Dense(3, activation="relu")
layer_config = layer.get_config()
new_layer = keras.layers.Dense.from_config(layer_config)
```
Sequential model:

```python
model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
config = model.get_config()
new_model = keras.Sequential.from_config(config)
```
Functional model:

```python
inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(config)
```
to_json() and tf.keras.models.model_from_json()
Similar to get_config / from_config, but serializes to and from a JSON string.

```python
model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()
new_model = keras.models.model_from_json(json_config)
```
Custom objects
Models and layers
The architecture of a subclassed model or layer is defined in its __init__ and call methods. That code exists in memory as Python bytecode, which cannot be serialized to JSON (pickle can serialize it, but pickling is risky and not portable across platforms). A good solution is to implement the get_config and from_config methods on the custom object and register it with Keras.
Custom functions
Custom functions do not need to implement get_config; registering them with Keras is enough.
Loading only the TensorFlow graph

```python
model.save("my_model")
tensorflow_graph = tf.saved_model.load("my_model")
x = np.random.uniform(size=(4, 32)).astype(np.float32)
predicted = tensorflow_graph(x).numpy()
```
Drawbacks of this approach:
- Custom objects are not traceable and cannot be reconstructed
- The result is not a Keras Model object, which makes it harder to use; for example, fit and predict are unavailable
Despite these drawbacks, the approach has its merits: it is simple, and when the custom classes cannot be rebuilt it can serve as a fallback.
Defining the config methods
Conventions:
- get_config : should return a JSON-serializable Python dictionary
- from_config(config) (a classmethod) : should return a new layer or model object created from the config
```python
class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        super(CustomLayer, self).__init__()
        self.var = tf.Variable(a, name="var_a")

    def call(self, inputs, training=False):
        if training:
            return inputs * self.var
        else:
            return inputs

    def get_config(self):
        return {"a": self.var.numpy()}

    # There's actually no need to define `from_config` here, since returning
    # `cls(**config)` is the default behavior.
    @classmethod
    def from_config(cls, config):
        return cls(**config)


layer = CustomLayer(5)
layer.var.assign(2)

serialized_layer = keras.layers.serialize(layer)
new_layer = keras.layers.deserialize(
    serialized_layer, custom_objects={"CustomLayer": CustomLayer}
)
```
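The get_config / from_config contract can also be demonstrated without TensorFlow. The ScaleConfigDemo class below is a hypothetical stand-in, not a Keras API; it only shows that the config must survive a JSON round trip and that from_config rebuilds the object from it:

```python
import json


class ScaleConfigDemo:
    """Hypothetical stand-in for a custom layer, not a Keras class."""

    def __init__(self, factor):
        self.factor = factor

    def get_config(self):
        # Must be a JSON-serializable dict of constructor arguments.
        return {"factor": self.factor}

    @classmethod
    def from_config(cls, config):
        # Keras' default from_config behaves like cls(**config).
        return cls(**config)


obj = ScaleConfigDemo(3)
blob = json.dumps(obj.get_config())               # architecture as a JSON string
rebuilt = ScaleConfigDemo.from_config(json.loads(blob))
```

Anything in the config that json.dumps cannot handle (e.g. a tensor) would break this round trip, which is why get_config should return plain Python values.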
Registering custom objects
Keras keeps a note of which class generated each config, and it maintains a master list of all the built-in layer, model, optimizer, and metric classes. During from_config, Keras looks up this list; if it encounters an object whose name is not on the list, it raises an error. There are several ways to register custom objects and avoid the error:
- Pass the custom objects directly to the loading function (via the custom_objects argument)
- tf.keras.utils.custom_object_scope or tf.keras.utils.CustomObjectScope
- tf.keras.utils.register_keras_serializable
```python
class CustomLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(CustomLayer, self).get_config()
        config.update({"units": self.units})
        return config


def custom_activation(x):
    return tf.nn.tanh(x) ** 2


# Make a model with the CustomLayer and custom_activation
inputs = keras.Input((32,))
x = CustomLayer(32)(inputs)
outputs = keras.layers.Activation(custom_activation)(x)
model = keras.Model(inputs, outputs)

# Retrieve the config
config = model.get_config()

# At loading time, register the custom objects with a `custom_object_scope`:
custom_objects = {"CustomLayer": CustomLayer, "custom_activation": custom_activation}
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.Model.from_config(config)
```
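To see why the registration step is needed, the name lookup that from_config conceptually performs can be sketched in plain Python. BUILTIN_CLASSES and lookup_class below are illustrative stand-ins, not Keras internals:

```python
# Illustrative stand-in for Keras' table of built-in classes.
BUILTIN_CLASSES = {"Dense": "built-in Dense", "Dropout": "built-in Dropout"}


def lookup_class(name, custom_objects=None):
    """Resolve a class name: user-registered custom objects are checked
    first, then the built-in table; unknown names raise an error."""
    if custom_objects and name in custom_objects:
        return custom_objects[name]
    if name in BUILTIN_CLASSES:
        return BUILTIN_CLASSES[name]
    raise ValueError(f"Unknown layer: {name}")


lookup_class("Dense")                                     # found in the built-in table
lookup_class("CustomLayer", {"CustomLayer": "my class"})  # found via registration
# lookup_class("CustomLayer") with no registration would raise ValueError
```

custom_object_scope and register_keras_serializable are two different ways of getting your class into the custom_objects side of this lookup.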
In-memory model cloning
Note that cloning only goes through get_config / from_config, so only the architecture is cloned; the weights and the compilation information are not copied.

```python
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.models.clone_model(model)
```
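Why cloning copies only the architecture can be shown with a plain-Python sketch (ArchOnlyDemo is a hypothetical class, not a Keras API): the config round trip carries the constructor arguments, not the trained state.

```python
class ArchOnlyDemo:
    """Hypothetical class: the config captures architecture, not state."""

    def __init__(self, units):
        self.units = units       # architecture
        self.weights = None      # state, filled in by "training"

    def get_config(self):
        return {"units": self.units}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


original = ArchOnlyDemo(8)
original.weights = [0.5, -1.2]                          # pretend it was trained
clone = ArchOnlyDemo.from_config(original.get_config())
# The clone has the same architecture (units == 8) but fresh, empty state.
```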
Saving & loading only the weights
Saving only the weights can be advantageous when:
- The saved model is only needed for inference
- You are doing transfer learning
APIs
- tf.keras.layers.Layer.get_weights() : returns a list of numpy arrays
- tf.keras.layers.Layer.set_weights() : sets the layer's weights from a list of numpy arrays
Transferring weights from one layer to another
```python
def create_layer():
    layer = keras.layers.Dense(64, activation="relu", name="dense_2")
    layer.build((None, 784))
    return layer


layer_1 = create_layer()
layer_2 = create_layer()

# Copy weights from layer 1 to layer 2
layer_2.set_weights(layer_1.get_weights())
```
Transferring weights from one model to another
```python
# Create a simple functional model
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")


# Define a subclassed model with the same architecture
class SubclassedModel(keras.Model):
    def __init__(self, output_dim, name=None):
        super(SubclassedModel, self).__init__(name=name)
        self.output_dim = output_dim
        self.dense_1 = keras.layers.Dense(64, activation="relu", name="dense_1")
        self.dense_2 = keras.layers.Dense(64, activation="relu", name="dense_2")
        self.dense_3 = keras.layers.Dense(output_dim, name="predictions")

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.dense_2(x)
        x = self.dense_3(x)
        return x

    def get_config(self):
        return {"output_dim": self.output_dim, "name": self.name}


subclassed_model = SubclassedModel(10)
# Call the subclassed model once to create the weights.
subclassed_model(tf.ones((1, 784)))

# Copy weights from functional_model to subclassed_model.
subclassed_model.set_weights(functional_model.get_weights())

assert len(functional_model.weights) == len(subclassed_model.weights)
for a, b in zip(functional_model.weights, subclassed_model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())
```
Stateless layers
Stateless layers carry no weights, so they change neither the number nor the order of the weights in a model. Adding or removing a stateless layer therefore does not affect weight restoration.
An example of a stateless layer is Dropout.
```python
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)

# Add a dropout layer, which does not contain any weights.
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model_with_dropout = keras.Model(
    inputs=inputs, outputs=outputs, name="3_layer_mlp"
)

functional_model_with_dropout.set_weights(functional_model.get_weights())
```
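The reason this works can be sketched without TensorFlow: weights are matched up positionally, and a stateless layer contributes nothing to the flat weight list (flat_weights below is an illustrative helper, not a Keras function):

```python
def flat_weights(layers):
    """Flatten per-layer weight lists in order, mimicking how weights
    are matched up positionally when restoring."""
    return [w for layer in layers for w in layer]


dense_1 = ["dense_1/kernel", "dense_1/bias"]
dense_2 = ["dense_2/kernel", "dense_2/bias"]
dropout = []  # stateless: contributes no weights

without_dropout = flat_weights([dense_1, dense_2])
with_dropout = flat_weights([dense_1, dropout, dense_2])
# The flat list is identical either way, so the weights line up.
```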
APIs for writing weights to / reading weights from disk
Weights can be saved to disk with model.save_weights in the following two formats:
- TensorFlow Checkpoint
- HDF5
TensorFlow Checkpoint is the default; there are two ways to switch the format:
- The save_format argument: set save_format="tf" or save_format="h5"
- The file path: if the filename ends in .h5 or .hdf5, the HDF5 format is used
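The selection rule can be summarized as a small function. This is a simplified sketch of the behavior described above, not the actual Keras implementation:

```python
def weights_save_format(filepath, save_format=None):
    """Simplified sketch of the format-selection rule: an explicit
    save_format wins; otherwise a .h5/.hdf5 extension selects HDF5;
    otherwise TF Checkpoint is the default."""
    if save_format is not None:
        return save_format                      # "tf" or "h5"
    if filepath.endswith((".h5", ".hdf5")):
        return "h5"
    return "tf"


weights_save_format("ckpt")                     # TF Checkpoint by default
weights_save_format("weights.h5")               # extension selects HDF5
weights_save_format("ckpt", save_format="h5")   # explicit argument wins
```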
TF Checkpoint format

```python
# Runnable example
sequential_model = keras.Sequential(
    [
        keras.Input(shape=(784,), name="digits"),
        keras.layers.Dense(64, activation="relu", name="dense_1"),
        keras.layers.Dense(64, activation="relu", name="dense_2"),
        keras.layers.Dense(10, name="predictions"),
    ]
)
sequential_model.save_weights("ckpt")
load_status = sequential_model.load_weights("ckpt")

# `assert_consumed` can be used as validation that all variable values have been
# restored from the checkpoint. See `tf.train.Checkpoint.restore` for other
# methods in the Status object.
load_status.assert_consumed()
```
How checkpoints work
The TensorFlow Checkpoint format saves and restores weights using object attribute names. Take the tf.keras.layers.Dense layer, for example: it contains two weights, dense.kernel and dense.bias. After saving the layer in the tf format, the resulting checkpoint contains the keys "kernel" and "bias" along with their weight values.
Note that the attribute/graph edge is named after the name used in the parent object, not the name of the variable. Consider the CustomLayer in the example below: the variable CustomLayer.var is saved under the key "var", not "var_a".
For details, see "Loading mechanics" in the TF Checkpoint guide.
```python
class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        super(CustomLayer, self).__init__()
        self.var = tf.Variable(a, name="var_a")


layer = CustomLayer(5)
layer_ckpt = tf.train.Checkpoint(layer=layer).save("custom_layer")

ckpt_reader = tf.train.load_checkpoint(layer_ckpt)
ckpt_reader.get_variable_to_dtype_map()
```
```
{'save_counter/.ATTRIBUTES/VARIABLE_VALUE': tf.int64,
 '_CHECKPOINTABLE_OBJECT_GRAPH': tf.string,
 'layer/var/.ATTRIBUTES/VARIABLE_VALUE': tf.int32}
```
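The key layer/var/.ATTRIBUTES/VARIABLE_VALUE in that output can be reproduced with a small sketch (checkpoint_key is an illustrative helper, not a TensorFlow API): keys come from the attribute path in the object graph, not from the variables' own names.

```python
def checkpoint_key(*attr_path):
    """Sketch of how a checkpoint key is built from the attribute path
    in the object graph (attribute names, not variable names)."""
    return "/".join(attr_path) + "/.ATTRIBUTES/VARIABLE_VALUE"


# The variable is reachable as `layer.var`, so the attribute path is
# ("layer", "var"), even though the variable itself is named "var_a".
checkpoint_key("layer", "var")
```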
Transfer learning
In principle, as long as two models share the same topology, they can share a checkpoint.
```python
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

# Extract a portion of the functional model defined in the Setup section.
# The following lines produce a new model that excludes the final output
# layer of the functional model.
pretrained = keras.Model(
    functional_model.inputs, functional_model.layers[-1].input, name="pretrained_model"
)

# Randomly assign "trained" weights.
for w in pretrained.weights:
    w.assign(tf.random.normal(w.shape))
pretrained.save_weights("pretrained_ckpt")
pretrained.summary()

# Assume this is a separate program where only 'pretrained_ckpt' exists.
# Create a new functional model with a different output dimension.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(5, name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="new_model")

# Load the weights from pretrained_ckpt into model.
model.load_weights("pretrained_ckpt")

# Check that all of the pretrained weights have been loaded.
for a, b in zip(pretrained.weights, model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())

print("\n", "-" * 50)
model.summary()

# Example 2: Sequential model
# Recreate the pretrained model, and load the saved weights.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
pretrained_model = keras.Model(inputs=inputs, outputs=x, name="pretrained")

# Sequential example:
model = keras.Sequential([pretrained_model, keras.layers.Dense(5, name="predictions")])
model.summary()

pretrained_model.load_weights("pretrained_ckpt")

# Warning! Calling `model.load_weights('pretrained_ckpt')` won't throw an error,
# but will *not* work as expected. If you inspect the weights, you'll see that
# none of the weights will have loaded. `pretrained_model.load_weights()` is the
# correct method to call.
```
How can a checkpoint be shared when the network architectures differ? Here is an example:
```python
# Create a subclassed model that essentially uses functional_model's first
# and last layers.
# First, save the weights of functional_model's first and last dense layers.
first_dense = functional_model.layers[1]
last_dense = functional_model.layers[-1]
ckpt_path = tf.train.Checkpoint(
    dense=first_dense, kernel=last_dense.kernel, bias=last_dense.bias
).save("ckpt")


# Define the subclassed model.
class ContrivedModel(keras.Model):
    def __init__(self):
        super(ContrivedModel, self).__init__()
        self.first_dense = keras.layers.Dense(64)
        self.kernel = self.add_weight("kernel", shape=(64, 10))
        self.bias = self.add_weight("bias", shape=(10,))

    def call(self, inputs):
        x = self.first_dense(inputs)
        return tf.matmul(x, self.kernel) + self.bias


model = ContrivedModel()
# Call model on inputs to create the variables of the dense layer.
_ = model(tf.ones((1, 784)))

# Create a Checkpoint with the same structure as before, and load the weights.
tf.train.Checkpoint(
    dense=model.first_dense, kernel=model.kernel, bias=model.bias
).restore(ckpt_path).assert_consumed()
```
Storing weights in the HDF5 format
The HDF5 format stores weights grouped by layer name. The weights are an ordered list formed by concatenating the list of trainable weights with the list of non-trainable weights (the same order as layer.weights). A model can therefore reuse an HDF5 checkpoint as long as it has the same layers and trainable statuses as when the checkpoint was saved.
```python
# Runnable example
sequential_model = keras.Sequential(
    [
        keras.Input(shape=(784,), name="digits"),
        keras.layers.Dense(64, activation="relu", name="dense_1"),
        keras.layers.Dense(64, activation="relu", name="dense_2"),
        keras.layers.Dense(10, name="predictions"),
    ]
)
sequential_model.save_weights("weights.h5")
sequential_model.load_weights("weights.h5")
```
Note that when a model contains nested layers, changing layer.trainable can produce a different ordering of layer.weights.
```python
class NestedDenseLayer(keras.layers.Layer):
    def __init__(self, units, name=None):
        super(NestedDenseLayer, self).__init__(name=name)
        self.dense_1 = keras.layers.Dense(units, name="dense_1")
        self.dense_2 = keras.layers.Dense(units, name="dense_2")

    def call(self, inputs):
        return self.dense_2(self.dense_1(inputs))


nested_model = keras.Sequential([keras.Input((784,)), NestedDenseLayer(10, "nested")])
variable_names = [v.name for v in nested_model.weights]
print("variables: {}".format(variable_names))

print("\nChanging trainable status of one of the nested layers...")
nested_model.get_layer("nested").dense_1.trainable = False

variable_names_2 = [v.name for v in nested_model.weights]
print("\nvariables: {}".format(variable_names_2))
print("variable ordering changed:", variable_names != variable_names_2)
```
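The reordering can be sketched without TensorFlow: a model lists its trainable weights first and its non-trainable weights afterwards, so freezing a nested layer moves its weights to the end of the list (model_weight_order below is an illustrative helper, not a Keras function):

```python
def model_weight_order(layer_weights):
    """Sketch of the weight ordering: all trainable weights first,
    then all non-trainable ones, so toggling `trainable` reorders
    the flat list."""
    trainable = [name for name, is_trainable in layer_weights if is_trainable]
    frozen = [name for name, is_trainable in layer_weights if not is_trainable]
    return trainable + frozen


before = model_weight_order([("dense_1/kernel", True), ("dense_2/kernel", True)])
after = model_weight_order([("dense_1/kernel", False), ("dense_2/kernel", True)])
# `before` has dense_1 first; in `after`, freezing dense_1 moved it to the end.
```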
Transfer learning
When loading pretrained weights from HDF5, it is recommended to load the weights into the original model from which the checkpoint was created, and then extract the desired weights/layers into a new model.
```python
def create_functional_model():
    inputs = keras.Input(shape=(784,), name="digits")
    x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = keras.layers.Dense(10, name="predictions")(x)
    return keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")


functional_model = create_functional_model()
functional_model.save_weights("pretrained_weights.h5")

# In a separate program:
pretrained_model = create_functional_model()
pretrained_model.load_weights("pretrained_weights.h5")

# Create a new model by extracting layers from the original model:
extracted_layers = pretrained_model.layers[:-1]
extracted_layers.append(keras.layers.Dense(5, name="dense_3"))
model = keras.Sequential(extracted_layers)
model.summary()
```