[Translation] How to Diagnose Overfitting and Underfitting of LSTM Models

Original article link

It can be difficult to determine whether your Long Short-Term Memory model is performing well on your sequence prediction problem.

You may be getting a good model skill score, but it is important to know whether your model is a good fit for your data or if it is underfit or overfit and could do better with a different configuration.

In this tutorial, you will discover how you can diagnose the fit of your LSTM model on your sequence prediction problem.

After completing this tutorial, you will know:

  • How to gather and plot training history of LSTM models.
  • How to diagnose an underfit, good fit, and overfit model.
  • How to develop more robust diagnostics by averaging multiple model runs.

Kick-start your project with my new book Long Short-Term Memory Networks With Python, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

Update Jan/2020: Updated API for Keras 2.3 and TensorFlow 2.0.
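
The code listings below import from the standalone keras package. Under TensorFlow 2.0, the same examples should also run against the bundled tf.keras by swapping the imports; a minimal sketch (my note, not part of the original article):

# alternative imports using TensorFlow 2.0's bundled Keras
# (an equivalent, optional substitution for the `keras` imports used below)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM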

Tutorial Overview

This tutorial is divided into 6 parts; they are:

  1. Training History in Keras
  2. Diagnostic Plots
  3. Underfit Example
  4. Good Fit Example
  5. Overfit Example
  6. Multiple Runs Example

1. Training History in Keras

You can learn a lot about the behavior of your model by reviewing its performance over time.

LSTM models are trained by calling the fit() function. This function returns a variable called history that contains a trace of the loss and any other metrics specified during the compilation of the model. These scores are recorded at the end of each epoch.

# model definition and compilation omitted
history = model.fit(...)

For example, if your model was compiled to optimize the log loss (binary_crossentropy) and measure accuracy each epoch, then the log loss and accuracy will be calculated and recorded in the history trace for each training epoch.

Each score is accessed by a key in the history object returned from calling fit(). By default, the loss optimized when fitting the model is keyed “loss”; as of Keras 2.3, accuracy is keyed “accuracy” (it was “acc” in earlier versions).

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X, Y, epochs=100)
print(history.history['loss'])
print(history.history['accuracy'])

Keras also allows you to specify a separate validation dataset while fitting your model that can also be evaluated using the same loss and metrics.

This can be done by setting the validation_split argument on fit() to use a portion of the training data as a validation dataset.

history = model.fit(X, Y, epochs=100, validation_split=0.33)

This can also be done by setting the validation_data argument and passing a tuple of X and y datasets.

history = model.fit(X, Y, epochs=100, validation_data=(valX, valY))

The metrics evaluated on the validation dataset are keyed using the same names, with a “val_” prefix.

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X, Y, epochs=100, validation_split=0.33)
print(history.history['loss'])
print(history.history['accuracy'])
print(history.history['val_loss'])
print(history.history['val_accuracy'])

2. Diagnostic Plots

The training history of your LSTM models can be used to diagnose the behavior of your model.

You can plot the performance of your model using the Matplotlib library. For example, you can plot training loss vs. validation loss as follows:

from matplotlib import pyplot
# model definition and compilation omitted...
history = model.fit(X, Y, epochs=100, validation_data=(valX, valY))
pyplot.plot(history.history['loss'])
pyplot.plot(history.history['val_loss'])
pyplot.title('model train vs validation loss')
pyplot.ylabel('loss')
pyplot.xlabel('epoch')
pyplot.legend(['train', 'validation'], loc='upper right')
pyplot.show()

Creating and reviewing these plots can help to inform you about possible new configurations to try in order to get better performance from your model.

Next, we will look at some examples. We will consider model skill on the train and validation sets in terms of the loss being minimized. You can use any metric that is meaningful for your problem.

3. Underfit Example

An underfit model is one that has not learned the training data sufficiently well, so its performance can still be improved, for example with more training or more capacity.

This can be diagnosed from a plot where the training loss is lower than the validation loss, and the validation loss has a trend that suggests further improvements are possible.

A small contrived example of an underfit LSTM model is provided below.

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from matplotlib import pyplot
from numpy import array

# return training data
def get_train():
	seq = [[0.0, 0.1], [0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((len(X), 1, 1))
	return X, y

# return validation data
def get_val():
	seq = [[0.5, 0.6], [0.6, 0.7], [0.7, 0.8], [0.8, 0.9], [0.9, 1.0]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((len(X), 1, 1))
	return X, y

# define model
model = Sequential()
model.add(LSTM(10, input_shape=(1,1)))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(loss='mse', optimizer='adam')
# fit model
X,y = get_train()
valX, valY = get_val()
history = model.fit(X, y, epochs=100, validation_data=(valX, valY), shuffle=False)
# plot train and validation loss
pyplot.plot(history.history['loss'])
pyplot.plot(history.history['val_loss'])
pyplot.title('model train vs validation loss')
pyplot.ylabel('loss')
pyplot.xlabel('epoch')
pyplot.legend(['train', 'validation'], loc='upper right')
pyplot.show()

Running this example produces a plot of train and validation loss showing the characteristic of an underfit model.

In this case, performance may be improved by increasing the number of training epochs.

[Figure: train and validation loss for the underfit model, with validation loss still trending downward]

Alternately, a model may be underfit if performance on the training set is better than on the validation set and performance has leveled off. Below is an example of an underfit model with insufficient memory cells.

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from matplotlib import pyplot
from numpy import array

# return training data
def get_train():
	seq = [[0.0, 0.1], [0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((5, 1, 1))
	return X, y

# return validation data
def get_val():
	seq = [[0.5, 0.6], [0.6, 0.7], [0.7, 0.8], [0.8, 0.9], [0.9, 1.0]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((len(X), 1, 1))
	return X, y

# define model
model = Sequential()
model.add(LSTM(1, input_shape=(1,1)))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(loss='mae', optimizer='sgd')
# fit model
X,y = get_train()
valX, valY = get_val()
history = model.fit(X, y, epochs=300, validation_data=(valX, valY), shuffle=False)
# plot train and validation loss
pyplot.plot(history.history['loss'])
pyplot.plot(history.history['val_loss'])
pyplot.title('model train vs validation loss')
pyplot.ylabel('loss')
pyplot.xlabel('epoch')
pyplot.legend(['train', 'validation'], loc='upper right')
pyplot.show()

Running this example shows the characteristic of an underfit model that appears under-provisioned.

In this case, performance may be improved by increasing the capacity of the model, such as the number of memory cells in a hidden layer or number of hidden layers.

[Figure: train and validation loss for the under-provisioned model, leveling off at a plateau]
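
As an illustration of what added capacity could look like (my sketch, not the article's code), the model below uses more memory cells and a second LSTM layer; the helper functions and training loop from the example above are assumed unchanged:

# a higher-capacity variant (a sketch): 10 memory cells instead of 1,
# plus a second LSTM layer; return_sequences=True makes the first layer
# emit a full sequence so it can feed the second LSTM layer
model = Sequential()
model.add(LSTM(10, return_sequences=True, input_shape=(1, 1)))
model.add(LSTM(10))
model.add(Dense(1, activation='linear'))
model.compile(loss='mae', optimizer='sgd')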

4. Good Fit Example

A good fit is a case where the performance of the model is good on both the train and validation sets.

This can be diagnosed from a plot where the train and validation loss decrease and stabilize around the same point.

The small example below demonstrates an LSTM model with a good fit.

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from matplotlib import pyplot
from numpy import array

# return training data
def get_train():
	seq = [[0.0, 0.1], [0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((5, 1, 1))
	return X, y

# return validation data
def get_val():
	seq = [[0.5, 0.6], [0.6, 0.7], [0.7, 0.8], [0.8, 0.9], [0.9, 1.0]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((len(X), 1, 1))
	return X, y

# define model
model = Sequential()
model.add(LSTM(10, input_shape=(1,1)))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(loss='mse', optimizer='adam')
# fit model
X,y = get_train()
valX, valY = get_val()
history = model.fit(X, y, epochs=800, validation_data=(valX, valY), shuffle=False)
# plot train and validation loss
pyplot.plot(history.history['loss'])
pyplot.plot(history.history['val_loss'])
pyplot.title('model train vs validation loss')
pyplot.ylabel('loss')
pyplot.xlabel('epoch')
pyplot.legend(['train', 'validation'], loc='upper right')
pyplot.show()

Running the example creates a line plot showing the train and validation loss converging.

Ideally, we would like to see model performance like this if possible, although this may not be possible on challenging problems with a lot of data.

[Figure: train and validation loss converging around the same point for the good-fit model]

5. Overfit Example

An overfit model is one where performance on the train set is good and continues to improve, whereas performance on the validation set improves to a point and then begins to degrade.

This can be diagnosed from a plot where the train loss slopes down and the validation loss slopes down, hits an inflection point, and starts to slope up again.

The example below demonstrates an overfit LSTM model.

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from matplotlib import pyplot
from numpy import array

# return training data
def get_train():
	seq = [[0.0, 0.1], [0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((5, 1, 1))
	return X, y

# return validation data
def get_val():
	seq = [[0.5, 0.6], [0.6, 0.7], [0.7, 0.8], [0.8, 0.9], [0.9, 1.0]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((len(X), 1, 1))
	return X, y

# define model
model = Sequential()
model.add(LSTM(10, input_shape=(1,1)))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(loss='mse', optimizer='adam')
# fit model
X,y = get_train()
valX, valY = get_val()
history = model.fit(X, y, epochs=1200, validation_data=(valX, valY), shuffle=False)
# plot train and validation loss, skipping the first 500 epochs so the inflection is visible
pyplot.plot(history.history['loss'][500:])
pyplot.plot(history.history['val_loss'][500:])
pyplot.title('model train vs validation loss')
pyplot.ylabel('loss')
pyplot.xlabel('epoch')
pyplot.legend(['train', 'validation'], loc='upper right')
pyplot.show()

Running this example creates a plot showing the characteristic inflection point in validation loss of an overfit model.

This may be a sign of too many training epochs.

In this case, the model training could be stopped at the inflection point. Alternately, the number of training examples could be increased.

[Figure: train and validation loss for the overfit model, with validation loss rising again after an inflection point]
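
One way to stop training near the inflection point automatically (a suggestion of mine, not part of the original example) is Keras's EarlyStopping callback. A minimal sketch, assuming the same model and data as above; the patience of 50 epochs is an arbitrary choice:

from keras.callbacks import EarlyStopping

# stop when validation loss has not improved for 50 epochs (assumed value)
# and roll back to the best weights seen during training
es = EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True)
history = model.fit(X, y, epochs=1200, validation_data=(valX, valY), shuffle=False, callbacks=[es])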

6. Multiple Runs Example

LSTMs are stochastic, meaning that you will get a different diagnostic plot each run.
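
As an aside (not from the original tutorial), much of this run-to-run variation comes from random weight initialization. Seeding the random number generators can make a single run more repeatable, although results may still vary across hardware and backends; a sketch, assuming TensorFlow 2.0:

# optional: seed the RNGs before building the model for more repeatable runs
# (the tutorial instead embraces the variation by repeating runs)
from numpy.random import seed
import tensorflow as tf
seed(1)
tf.random.set_seed(1)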

It can be useful to repeat the diagnostic run multiple times (e.g. 5, 10, or 30). The train and validation traces from each run can then be plotted to give a more robust idea of the behavior of the model over time.

The example below runs the same experiment a number of times before plotting the trace of train and validation loss for each run.

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from matplotlib import pyplot
from numpy import array
from pandas import DataFrame

# return training data
def get_train():
	seq = [[0.0, 0.1], [0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((5, 1, 1))
	return X, y

# return validation data
def get_val():
	seq = [[0.5, 0.6], [0.6, 0.7], [0.7, 0.8], [0.8, 0.9], [0.9, 1.0]]
	seq = array(seq)
	X, y = seq[:, 0], seq[:, 1]
	X = X.reshape((len(X), 1, 1))
	return X, y

# collect data across multiple repeats
train = DataFrame()
val = DataFrame()
for i in range(5):
	# define model
	model = Sequential()
	model.add(LSTM(10, input_shape=(1,1)))
	model.add(Dense(1, activation='linear'))
	# compile model
	model.compile(loss='mse', optimizer='adam')
	X,y = get_train()
	valX, valY = get_val()
	# fit model
	history = model.fit(X, y, epochs=300, validation_data=(valX, valY), shuffle=False)
	# store history
	train[str(i)] = history.history['loss']
	val[str(i)] = history.history['val_loss']

# plot train and validation loss across multiple runs
pyplot.plot(train, color='blue', label='train')
pyplot.plot(val, color='orange', label='validation')
pyplot.title('model train vs validation loss')
pyplot.ylabel('loss')
pyplot.xlabel('epoch')
pyplot.show()

In the resulting plot, we can see that the general trend of underfitting holds across all 5 runs, which makes a stronger case for increasing the number of training epochs.

[Figure: train and validation loss traces across 5 runs, all showing the underfit trend]
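
To condense the repeated traces into a single, more robust diagnostic, you could also plot the average loss per epoch across runs. A small extension of the script above (my addition), using the train and val DataFrames it builds:

# plot the mean train and validation loss across the repeated runs
pyplot.plot(train.mean(axis=1), color='blue', label='train (mean)')
pyplot.plot(val.mean(axis=1), color='orange', label='validation (mean)')
pyplot.title('mean model train vs validation loss across runs')
pyplot.ylabel('loss')
pyplot.xlabel('epoch')
pyplot.legend()
pyplot.show()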

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

  • History Callback Keras API
  • Learning Curve in Machine Learning on Wikipedia
  • Overfitting on Wikipedia

Summary

In this tutorial, you discovered how to diagnose the fit of your LSTM model on your sequence prediction problem.

Specifically, you learned:

  • How to gather and plot training history of LSTM models.
  • How to diagnose an underfit, good fit, and overfit model.
  • How to develop more robust diagnostics by averaging multiple model runs.

Do you have any questions? Ask your questions in the comments below and I will do my best to answer.

P.S. These are the author's words, not mine; if you have questions, follow the original article link above and ask the author.
