LSTM in one sentence: it is an upgraded RNN. If a plain RNN's limit is understanding a sentence, an LSTM's limit is understanding a paragraph, because its gated cell state retains information over much longer sequences.
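For reference, this is the standard LSTM cell in textbook notation (general background, not code from this post): the forget gate $f_t$ and input gate $i_t$ control what the cell state $C_t$ discards and absorbs, and it is this cell state that carries the long-range context:

$$
\begin{aligned}
f_t &= \sigma(W_f[h_{t-1}, x_t] + b_f), \qquad
i_t = \sigma(W_i[h_{t-1}, x_t] + b_i), \qquad
o_t = \sigma(W_o[h_{t-1}, x_t] + b_o) \\
\tilde{C}_t &= \tanh(W_C[h_{t-1}, x_t] + b_C), \qquad
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \qquad
h_t = o_t \odot \tanh(C_t)
\end{aligned}
$$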
Task: the dataset records how fire temperature (Tem1), carbon monoxide concentration (CO 1), and soot concentration (Soot 1) change over time. Based on this data, we want to predict the fire temperature at a future time step.
1. Importing the Data
We read the data with the pandas library.
A reference list of common pandas functions: pandas常用函数 - 简书 (jianshu.com)
import tensorflow as tf
import pandas as pd
import numpy as np

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # allocate GPU memory on demand
    tf.config.set_visible_devices([gpus[0]], "GPU")           # use only the first GPU
print(gpus)

df_1 = pd.read_csv("D:/DeepLearning/woodpine2.csv")
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
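If the machine has more than one GPU, it is usually safer to enable memory growth on all of them before TensorFlow initializes any device. A small variant of the snippet above (optional; not needed for a single-GPU setup):

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
for gpu in gpus:
    # must run before the GPU is first used, otherwise TensorFlow raises a RuntimeError
    tf.config.experimental.set_memory_growth(gpu, True)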
2. Data Visualization
Seaborn is an extension built on top of matplotlib that focuses on statistical plotting.
import matplotlib.pyplot as plt
import seaborn as sns

plt.rcParams['savefig.dpi'] = 500  # resolution of saved figures
plt.rcParams['figure.dpi'] = 500   # display resolution

fig, ax = plt.subplots(1, 3, constrained_layout=True, figsize=(14, 3))

sns.lineplot(data=df_1["Tem1"], ax=ax[0])
sns.lineplot(data=df_1["CO 1"], ax=ax[1])
sns.lineplot(data=df_1["Soot 1"], ax=ax[2])
plt.show()
3. Building the Dataset
dataFrame = df_1.iloc[:, 1:]  # drop the Time column; keep Tem1, CO 1, Soot 1
dataFrame
|      | Tem1  | CO 1     | Soot 1   |
|------|-------|----------|----------|
| 0    | 25.0  | 0.000000 | 0.000000 |
| 1    | 25.0  | 0.000000 | 0.000000 |
| 2    | 25.0  | 0.000000 | 0.000000 |
| 3    | 25.0  | 0.000000 | 0.000000 |
| 4    | 25.0  | 0.000000 | 0.000000 |
| ...  | ...   | ...      | ...      |
| 5943 | 295.0 | 0.000077 | 0.000496 |
| 5944 | 294.0 | 0.000077 | 0.000494 |
| 5945 | 292.0 | 0.000077 | 0.000491 |
| 5946 | 291.0 | 0.000076 | 0.000489 |
| 5947 | 290.0 | 0.000076 | 0.000487 |

5948 rows × 3 columns
Setting up X and y

width_X = 8
width_y = 1

Take Tem1, CO 1, and Soot 1 over the first 8 time steps as X, and Tem1 at the 9th time step as y.
X = []
y = []

in_start = 0
# slide a window of width_X steps over the frame; the step that follows it becomes the target
for _ in range(len(dataFrame)):
    in_end = in_start + width_X
    out_end = in_end + width_y
    if out_end < len(dataFrame):
        # all 3 features over 8 steps, flattened into a vector of length 24
        X_ = np.array(dataFrame.iloc[in_start:in_end, ])
        X_ = X_.reshape((len(X_) * 3))
        # Tem1 of the step right after the window
        y_ = np.array(dataFrame.iloc[in_end:out_end, 0])
        X.append(X_)
        y.append(y_)
    in_start += 1

X = np.array(X)
y = np.array(y)

X.shape, y.shape
((5939, 24), (5939, 1))
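As an aside, the same windows can be built without a Python loop using NumPy's sliding_window_view (available since NumPy 1.20). This is an optional, equivalent sketch, not part of the original pipeline:

from numpy.lib.stride_tricks import sliding_window_view

values = dataFrame.to_numpy()                            # shape (5948, 3)
windows = sliding_window_view(values, width_X, axis=0)   # shape (5941, 3, 8)
n = len(values) - width_X - width_y                      # 5939, same count as the loop above
X_vec = windows[:n].transpose(0, 2, 1).reshape(n, width_X * 3)
y_vec = values[width_X : width_X + n, 0:1]               # Tem1 one step after each window
# sanity checks: np.allclose(X_vec, X) and np.allclose(y_vec, y) should both hold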
Normalization
from sklearn.preprocessing import MinMaxScaler

# scale the features to the range [0, 1]
sc = MinMaxScaler(feature_range=(0, 1))
X_scaled = sc.fit_transform(X)
X_scaled.shape
(5939, 24)
X_scaled = X_scaled.reshape(len(X_scaled), width_X, 3)  # (samples, time steps, features)
X_scaled.shape
(5939, 8, 3)
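One caveat worth flagging: fit_transform above fits the scaler on all 5939 samples, including those that later become the validation set, which leaks a little information about the "future" into training. A stricter variant (an optional sketch, not what this post does) fits the statistics on the training portion only:

from sklearn.preprocessing import MinMaxScaler

sc = MinMaxScaler(feature_range=(0, 1))
X_train_scaled = sc.fit_transform(X[:5000])  # fit min/max on the training rows only
X_test_scaled = sc.transform(X[5000:])       # reuse those statistics for the held-out rows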
Splitting the dataset

Use the first 5000 samples as the training set and the rest as the validation set. Since this is time-series data, we slice in temporal order instead of shuffling.
X_train = np.array(X_scaled[:5000]).astype('float64')
y_train = np.array(y[:5000]).astype('float64')
X_test = np.array(X_scaled[5000:]).astype('float64')
y_test = np.array(y[5000:]).astype('float64')
X_train.shape
(5000, 8, 3)
4. Building the Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM

# stacked two-layer LSTM; the first layer returns the full sequence
# so that the second layer also receives one vector per time step
model_lstm = Sequential()
model_lstm.add(LSTM(units=64, activation='relu', return_sequences=True,
                    input_shape=(X_train.shape[1], 3)))
model_lstm.add(LSTM(units=64, activation='relu'))
model_lstm.add(Dense(width_y))
WARNING:tensorflow:Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.
WARNING:tensorflow:Layer lstm_1 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.
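This warning appears because the cuDNN fast path for LSTM requires, among other conditions, the default activation='tanh'; using activation='relu' forces the slower generic kernel. If training speed matters more than the ReLU choice, a cuDNN-eligible variant would look like this (a sketch only; the rest of this post keeps ReLU):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM

model_fast = Sequential()
model_fast.add(LSTM(units=64, return_sequences=True,     # default tanh activation,
                    input_shape=(X_train.shape[1], 3)))  # eligible for cuDNN kernels
model_fast.add(LSTM(units=64))
model_fast.add(Dense(width_y))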
5. Compiling and Training the Model

We only monitor the loss value, not accuracy (this is a regression task), so the metrics option is omitted.
model_lstm.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                   loss='mean_squared_error')  # mean squared error as the loss
X_train.shape, y_train.shape
((5000, 8, 3), (5000, 1))
history_lstm = model_lstm.fit(X_train, y_train,
                              batch_size=64,
                              epochs=40,
                              validation_data=(X_test, y_test),
                              validation_freq=1)
Epoch 1/40
79/79 [==============================] - 5s 23ms/step - loss: 10543.9570 - val_loss: 6043.4136
Epoch 2/40
79/79 [==============================] - 2s 21ms/step - loss: 129.8679 - val_loss: 709.7237
Epoch 3/40
79/79 [==============================] - 1s 19ms/step - loss: 13.4282 - val_loss: 282.2633
...
Epoch 38/40
79/79 [==============================] - 2s 20ms/step - loss: 8.0675 - val_loss: 99.9099
Epoch 39/40
79/79 [==============================] - 2s 19ms/step - loss: 6.3221 - val_loss: 83.2262
Epoch 40/40
79/79 [==============================] - 2s 19ms/step - loss: 6.7495 - val_loss: 133.3407
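The validation loss is noisy, and its best value (epoch 39 here) does not come from the final epoch, so it can pay to keep the best weights automatically. A sketch using standard Keras callbacks ('best_lstm.h5' is just a placeholder filename):

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # stop once val_loss has not improved for 10 epochs, rolling back to the best weights
    EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
    # also keep the best model on disk
    ModelCheckpoint('best_lstm.h5', monitor='val_loss', save_best_only=True),
]

history_lstm = model_lstm.fit(X_train, y_train,
                              batch_size=64,
                              epochs=40,
                              validation_data=(X_test, y_test),
                              callbacks=callbacks)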
6. Loss Curves
# font settings so that Chinese labels render correctly
plt.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese characters properly
plt.rcParams['axes.unicode_minus'] = False    # display the minus sign properly

plt.figure(figsize=(5, 3), dpi=120)
plt.plot(history_lstm.history['loss'], label='LSTM Training Loss')
plt.plot(history_lstm.history['val_loss'], label='LSTM Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
Prediction
predicted_y_lstm = model_lstm.predict(X_test)  # run the validation set through the model

y_test_one = [i[0] for i in y_test]
predicted_y_lstm_one = [i[0] for i in predicted_y_lstm]

plt.figure(figsize=(5, 3), dpi=120)
# plot the predictions against the ground truth
plt.plot(y_test_one[:1000], color='red', label='Ground truth')
plt.plot(predicted_y_lstm_one[:1000], color='blue', label='Prediction')
plt.title('Predicted vs. True Fire Temperature')
plt.xlabel('Time step')
plt.ylabel('Tem1')
plt.legend()
plt.show()
30/30 [==============================] - 0s 5ms/step
from sklearn import metrics

"""
RMSE : root mean squared error -----> the square root of the MSE
R2   : coefficient of determination, a common measure of how well the model fits
"""
# scikit-learn metrics expect (y_true, y_pred); the order matters for R2
RMSE_lstm = metrics.mean_squared_error(y_test, predicted_y_lstm) ** 0.5
R2_lstm = metrics.r2_score(y_test, predicted_y_lstm)

print('RMSE: %.5f' % RMSE_lstm)
print('R2: %.5f' % R2_lstm)
RMSE: 11.54733
R2: 0.73235
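Depending on your scikit-learn version, RMSE can also be obtained directly instead of taking the square root by hand: mean_squared_error(..., squared=False) in releases before 1.4, or root_mean_squared_error from 1.4 onward. A small equivalent sketch:

from sklearn import metrics

# same quantity as RMSE_lstm above, computed by the library
rmse = metrics.mean_squared_error(y_test, predicted_y_lstm, squared=False)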