This week I read the paper "Multivariate Air Quality Forecasting With Nested Long Short Term Memory Neural Network", which proposes a deep learning framework that combines multiple nested long short-term memory networks (MTMC-NLSTM) for accurate AQI forecasting. I also studied the long short-term memory (LSTM) network, focusing on implementing LSTM forward propagation in code.
Title: Multivariate Air Quality Forecasting With Nested Long Short Term Memory Neural Network
Authors: Jin, N (Jin, Ning); Zeng, YK (Zeng, Yongkang); Yan, K (Yan, Ke); Ji, ZW (Ji, Zhiwei)
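2. A multi-task multi-channel NLSTM neural network based on federated learning. Rather than training and testing the six components of the AQI separately, the paper proposes an MTMC algorithm, inspired by federated learning, that predicts all six components together. By taking the internal correlations between the AQI components into account, the proposed MTMC learning structure improves data-driven forecasting performance.
The NLSTM memory cell nests another LSTM memory cell inside the original LSTM cell. The outer memory cell can freely and selectively read and write relevant long-term information in the inner cell. Overall, this structure makes the original LSTM architecture more robust and better able to remember and process long-term historical information. In the LSTM, the output gate follows the principle that information irrelevant to the current time step is still worth remembering. Following this logic, the NLSTM has an advantage when forecasting time series with fluctuating changes.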
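An NLSTM cell is formed by replacing the equation that computes $c_t$ in the ordinary LSTM model with another LSTM cell. The outer LSTM is called the outer memory and the inner LSTM is called the inner memory. In an ordinary LSTM cell, the cell state $c_t$ is updated as follows:
$c_t = \tilde{h}_{t-1} + \tilde{x}_t$, with $\tilde{h}_{t-1} = f_t \odot c_{t-1}$ and $\tilde{x}_t = i_t \odot g_t$
In the NLSTM cell this addition is replaced by an inner LSTM cell, where $\tilde{x}_t$ and $\tilde{h}_{t-1}$ are the short-term and long-term memory inputs, respectively. The inner LSTM cell has the same structure as an ordinary cell, as follows: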
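The nesting can be sketched as a single time step in PyTorch. This is a minimal illustration, not the paper's code: it assumes the usual Nested LSTM formulation in which the inner cell receives f_t * c_{t-1} as its long-term input and i_t * g_t as its short-term input, and the inner hidden state becomes the outer cell state. The function names `lstm_gates` and `nlstm_step`, the weight layout, and the random weights are all hypothetical.

```python
import torch


def lstm_gates(x, h, w_x, w_h, b):
    """Compute the four LSTM gates (i, f, g, o) for one time step.

    w_x: [4*h_size, in_size], w_h: [4*h_size, h_size], b: [4*h_size].
    The gates are assumed to be stacked row-wise in the order i, f, g, o.
    """
    z = x @ w_x.T + h @ w_h.T + b
    i, f, g, o = z.chunk(4, dim=-1)
    return torch.sigmoid(i), torch.sigmoid(f), torch.tanh(g), torch.sigmoid(o)


def nlstm_step(x, h_prev, c_prev, c_inner_prev, outer, inner):
    """One NLSTM step: the outer cell-state update is delegated to an inner LSTM."""
    i, f, g, o = lstm_gates(x, h_prev, *outer)
    h_tilde = f * c_prev  # long-term memory input of the inner cell
    x_tilde = i * g       # short-term memory input of the inner cell
    # A plain LSTM would stop here with c_t = h_tilde + x_tilde.
    i_in, f_in, g_in, o_in = lstm_gates(x_tilde, h_tilde, *inner)
    c_inner = f_in * c_inner_prev + i_in * g_in
    c = o_in * torch.tanh(c_inner)  # inner hidden state becomes the outer cell state
    h = o * torch.tanh(c)
    return h, c, c_inner


torch.manual_seed(0)
bs, i_size, h_size = 2, 4, 5
# Hypothetical random weights: (input weights, recurrent weights, bias)
outer = (torch.randn(4 * h_size, i_size), torch.randn(4 * h_size, h_size), torch.zeros(4 * h_size))
inner = (torch.randn(4 * h_size, h_size), torch.randn(4 * h_size, h_size), torch.zeros(4 * h_size))
x = torch.randn(bs, i_size)
h = c = c_inner = torch.zeros(bs, h_size)
h, c, c_inner = nlstm_step(x, h, c, c_inner, outer, inner)
print(h.shape, c.shape, c_inner.shape)
```

The only difference from a plain LSTM step is the middle of `nlstm_step`: instead of adding `h_tilde` and `x_tilde`, the two quantities are passed to a second set of gates that keeps its own cell state `c_inner`.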
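Step 2: The data processed in Step 1 is split into a training set and a test set and used as input to the MTMC neural network. The network has 24 inputs, corresponding to 24 sub-signals, and 6 outputs, corresponding to the predicted values of the six AQI components. It is trained by fitting the actual values of the six AQI components from the 24 sub-signals in the training set. During training, 5% of the training set is split off as a validation set. Once the network has been trained, the AQI predictions are generated on the test set.
Step 3: The predictions obtained in Step 2 are denormalized to their original magnitude. The forecasts are then evaluated, checking predictive performance in terms of both the size of the errors and the goodness of fit.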
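The validation split in Step 2 and the denormalization in Step 3 can be sketched in plain Python. Two assumptions are made here that the text does not state: the validation set is taken as the last 5% of the training samples (a chronological hold-out), and the normalization is min-max scaling. The helper names and the sample series are made up for illustration.

```python
def split_train_val(samples, val_fraction=0.05):
    """Hold out the last `val_fraction` of the training samples for validation."""
    n_val = max(1, round(len(samples) * val_fraction))
    return samples[:-n_val], samples[-n_val:]


def minmax_normalize(values):
    """Scale values to [0, 1] and return the statistics needed to undo the scaling."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


def minmax_denormalize(scaled, stats):
    """Map scaled predictions back to the original magnitude."""
    lo, hi = stats
    return [s * (hi - lo) + lo for s in scaled]


series = [float(v) for v in range(50, 150)]  # 100 made-up readings
scaled, stats = minmax_normalize(series)
train, val = split_train_val(scaled)
print(len(train), len(val))  # 95 5

# Stand-in for model predictions: reuse the scaled validation targets.
restored = minmax_denormalize(val, stats)
```

The minimum and maximum must come from the same data that was used for normalization, otherwise the restored predictions end up on a different scale from the measured values.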
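Four evaluation metrics, namely mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and R-squared (R²), are used to assess the accuracy of the forecasts and to verify the effectiveness of the proposed method.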
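For reference, the four metrics can be written out in a few lines of plain Python using their standard textbook definitions. The sample values below are made up, and MAPE is reported in percent and assumes that no true value is zero.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (true values must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [50.0, 60.0, 80.0, 100.0]  # made-up measurements
y_pred = [55.0, 57.0, 80.0, 90.0]   # made-up predictions
print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred), r2(y_true, y_pred))
```

MAE, RMSE, and MAPE measure the size of the errors (lower is better), while R² measures goodness of fit (closer to 1 is better).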
Parameters of CLASS torch.nn.LSTM(*args, **kwargs):
input_size – The number of expected features in the input x (the feature size of the input sequence)
hidden_size – The number of features in the hidden state h (the size of h)
num_layers – Number of recurrent layers. (the number of stacked layers)
bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True (controls whether b_ih and b_hh are dropped)
batch_first – If True, then the input and output tensors are provided as (batch, seq, feature) instead of (seq, batch, feature). Note that this does not apply to hidden or cell states. See the Inputs/Outputs sections below for details. Default: False (by default the batch is the middle dimension)
dropout – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout. Default: 0
bidirectional – If True, becomes a bidirectional LSTM. Default: False (bidirectional)
proj_size – If > 0, will use LSTM with projections of corresponding size. Default: 0 (LSTMP, which reduces the parameter count and computation of the LSTM)
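Inputs: input, (h_0, c_0)
input: the input sequence, of shape (L, N, H_in) by default, or (N, L, H_in) when batch_first=True
Outputs: output, (h_n, c_n)
output: the output sequence of the model, of shape (L, N, D * H_out) by default, or (N, L, D * H_out) when batch_first=True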
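A quick shape check makes these parameters concrete. This is a small sketch with arbitrarily chosen sizes. In the shapes, L is the sequence length, N the batch size, D is 2 for a bidirectional LSTM (otherwise 1), and H_out equals proj_size when proj_size > 0 (otherwise hidden_size). The proj_size argument needs a PyTorch version that supports it.

```python
import torch
import torch.nn as nn

L, N, H_in, H_cell, H_proj = 3, 2, 4, 5, 3  # arbitrary sizes for illustration
num_layers, D = 2, 2                        # D = 2 because bidirectional=True

lstm = nn.LSTM(H_in, H_cell, num_layers=num_layers, bidirectional=True, proj_size=H_proj)
x = torch.randn(L, N, H_in)   # batch_first=False, so the layout is (L, N, H_in)
output, (h_n, c_n) = lstm(x)  # initial states default to zeros when omitted

print(output.shape)  # torch.Size([3, 2, 6])  -> (L, N, D * H_out)
print(h_n.shape)     # torch.Size([4, 2, 3])  -> (D * num_layers, N, H_out)
print(c_n.shape)     # torch.Size([4, 2, 5])  -> (D * num_layers, N, H_cell)
```

Only the hidden state is projected, so with proj_size set the last dimension of h_n shrinks to proj_size while c_n keeps hidden_size.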
import torch
import torch.nn as nn

# Define constants
bs, T, i_size, h_size = 2, 3, 4, 5
# proj_size = 3
input = torch.randn(bs, T, i_size)  # input sequence
c0 = torch.randn(bs, h_size)  # initial cell state, not trained
h0 = torch.randn(bs, h_size)  # initial hidden state, not trained
# h0 = torch.randn(bs, proj_size)

# Call the official LSTM API
lstm_layer = nn.LSTM(i_size, h_size, batch_first=True)
# lstm_layer = nn.LSTM(i_size, h_size, batch_first=True, proj_size=proj_size)
output, (h_final, c_final) = lstm_layer(input, (h0.unsqueeze(0), c0.unsqueeze(0)))
print(output.shape, h_final.shape, c_final.shape)
tensor([[[ 0.0646,  0.2626, -0.4449,  0.1008, -0.3899],
         [-0.0540,  0.4603, -0.3255,  0.1206, -0.1900],
         [ 0.0450,  0.1745, -0.0695,  0.0960, -0.2173]],

        [[-0.1420, -0.0881,  0.3784,  0.0850, -0.1860],
         [-0.0081,  0.0408,  0.0051,  0.4927, -0.1601],
         [-0.0079,  0.0765,  0.1276,  0.1734, -0.2499]]],
       grad_fn=<...>)
def lstm_forward(input, initial_states, w_ih, w_hh, b_ih, b_hh):
    h0, c0 = initial_states  # initial states, each [bs, h_size]
    bs, T, i_size = input.shape
    h_size = w_ih.shape[0] // 4

    prev_h = h0
    prev_c = c0
    # Copy the weights along the batch dimension so torch.bmm can be used
    batch_w_ih = w_ih.unsqueeze(0).tile(bs, 1, 1)  # [bs, 4*h_size, i_size]
    batch_w_hh = w_hh.unsqueeze(0).tile(bs, 1, 1)  # [bs, 4*h_size, h_size]

    output_size = h_size
    output = torch.zeros(bs, T, output_size)  # output sequence

    for t in range(T):
        x = input[:, t, :]  # input vector at the current time step, [bs, i_size]
        w_times_x = torch.bmm(batch_w_ih, x.unsqueeze(-1))  # [bs, 4*h_size, 1]
        w_times_x = w_times_x.squeeze(-1)  # [bs, 4*h_size]

        w_times_h_prev = torch.bmm(batch_w_hh, prev_h.unsqueeze(-1))  # [bs, 4*h_size, 1]
        w_times_h_prev = w_times_h_prev.squeeze(-1)  # [bs, 4*h_size]

        # Compute the input gate (i), forget gate (f), cell gate (g), and output gate (o)
        i_t = torch.sigmoid(w_times_x[:, :h_size] + w_times_h_prev[:, :h_size] + b_ih[:h_size] + b_hh[:h_size])
        f_t = torch.sigmoid(w_times_x[:, h_size:2 * h_size] + w_times_h_prev[:, h_size:2 * h_size]
                            + b_ih[h_size:2 * h_size] + b_hh[h_size:2 * h_size])
        g_t = torch.tanh(w_times_x[:, 2 * h_size:3 * h_size] + w_times_h_prev[:, 2 * h_size:3 * h_size]
                         + b_ih[2 * h_size:3 * h_size] + b_hh[2 * h_size:3 * h_size])
        o_t = torch.sigmoid(w_times_x[:, 3 * h_size:4 * h_size] + w_times_h_prev[:, 3 * h_size:4 * h_size]
                            + b_ih[3 * h_size:4 * h_size] + b_hh[3 * h_size:4 * h_size])

        prev_c = f_t * prev_c + i_t * g_t  # update the cell state
        prev_h = o_t * torch.tanh(prev_c)  # update the hidden state
        output[:, t, :] = prev_h

    return output, (prev_h, prev_c)
output_custom, (h_final_custom, c_final_custom) = lstm_forward(input, (h0, c0), lstm_layer.weight_ih_l0,
                                                               lstm_layer.weight_hh_l0, lstm_layer.bias_ih_l0,
                                                               lstm_layer.bias_hh_l0)
print(output_custom)
tensor([[[ 0.0646,  0.2626, -0.4449,  0.1008, -0.3899],
         [-0.0540,  0.4603, -0.3255,  0.1206, -0.1900],
         [ 0.0450,  0.1745, -0.0695,  0.0960, -0.2173]],

        [[-0.1420, -0.0881,  0.3784,  0.0850, -0.1860],
         [-0.0081,  0.0408,  0.0051,  0.4927, -0.1601],
         [-0.0079,  0.0765,  0.1276,  0.1734, -0.2499]]], grad_fn=<...>)
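Rather than comparing the two printed tensors by eye, the match can be checked numerically with torch.allclose. The sketch below is a compact, self-contained variant of the forward pass, not the code above verbatim. It uses plain matrix products instead of torch.bmm and takes an optional `w_hr` argument to cover the proj_size (LSTMP) case hinted at in the commented-out lines. The name `lstm_forward_compact` and the tolerance are choices made for this illustration.

```python
import torch
import torch.nn as nn


def lstm_forward_compact(x, initial_states, w_ih, w_hh, b_ih, b_hh, w_hr=None):
    """Batch-first single-layer LSTM forward pass; w_hr is the optional LSTMP projection."""
    h, c = initial_states
    outputs = []
    for t in range(x.shape[1]):
        z = x[:, t, :] @ w_ih.T + h @ w_hh.T + b_ih + b_hh  # [bs, 4*h_size]
        i, f, g, o = z.chunk(4, dim=-1)                     # PyTorch gate order: i, f, g, o
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        if w_hr is not None:
            h = h @ w_hr.T                                  # project h down to proj_size
        outputs.append(h)
    return torch.stack(outputs, dim=1), (h, c)


torch.manual_seed(0)
bs, T, i_size, h_size, proj_size = 2, 3, 4, 5, 3
x = torch.randn(bs, T, i_size)
c0 = torch.randn(bs, h_size)

# Plain LSTM: compare against nn.LSTM using the layer's own weights
layer = nn.LSTM(i_size, h_size, batch_first=True)
h0 = torch.randn(bs, h_size)
ref, _ = layer(x, (h0.unsqueeze(0), c0.unsqueeze(0)))
out, _ = lstm_forward_compact(x, (h0, c0), layer.weight_ih_l0, layer.weight_hh_l0,
                              layer.bias_ih_l0, layer.bias_hh_l0)
plain_ok = torch.allclose(ref, out, atol=1e-5)

# LSTM with projection: h has size proj_size, c keeps size h_size
layer_p = nn.LSTM(i_size, h_size, batch_first=True, proj_size=proj_size)
h0_p = torch.randn(bs, proj_size)
ref_p, _ = layer_p(x, (h0_p.unsqueeze(0), c0.unsqueeze(0)))
out_p, _ = lstm_forward_compact(x, (h0_p, c0), layer_p.weight_ih_l0, layer_p.weight_hh_l0,
                                layer_p.bias_ih_l0, layer_p.bias_hh_l0, layer_p.weight_hr_l0)
proj_ok = torch.allclose(ref_p, out_p, atol=1e-5)
print(plain_ok, proj_ok)
```

With projection enabled, weight_hh_l0 has shape [4*h_size, proj_size] because the recurrent input is the projected hidden state, which is why the same matrix product works in both cases.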