Main parameters for initializing the nn.RNN class: input_size, hidden_size, num_layers (default 1), nonlinearity (default 'tanh'), bias, batch_first (default False), dropout, and bidirectional (default False).
Main arguments when calling an nn.RNN instance: input of shape (seq_len, batch_size, input_size), plus an optional initial hidden state h0 of shape (num_layers*num_directions, batch_size, hidden_size).
Note: num_directions defaults to 1.
import torch
import torch.nn as nn

# input_size=4, hidden_size=6, num_layers=1
rnn = nn.RNN(4, 6, 1)
# seq_len=2, batch_size=3, input_size=4
input = torch.randn(2, 3, 4)
# num_layers*num_directions=1, batch_size=3, hidden_size=6
h0 = torch.randn(1, 3, 6)
output, hn = rnn(input, h0)
print("out:", output.shape, output)
print("hn:", hn.shape, hn)
# seq_len=2, batch_size=3, num_directions*hidden_size=6
out: torch.Size([2, 3, 6])
tensor([[[-0.6265, -0.0157, 0.9049, -0.9148, -0.1023, 0.8824],
[ 0.1760, 0.1963, 0.3808, -0.9247, -0.2264, 0.6422],
[ 0.3331, -0.9721, -0.7927, -0.3843, 0.8845, 0.3520]],
[[ 0.3291, -0.3104, -0.7785, -0.5462, 0.8294, 0.9277],
[ 0.7093, -0.7809, -0.6781, -0.5684, 0.9314, 0.8526],
[ 0.5716, -0.8184, -0.2193, -0.6427, 0.7650, 0.5599]]],
grad_fn=<StackBackward>)
# num_layers*num_directions=1, batch_size=3, hidden_size=6
hn: torch.Size([1, 3, 6])
tensor([[[ 0.8884, 0.8530, -0.7011, -0.6993, -0.5694, -0.3473],
[ 0.3698, 0.2642, 0.3087, -0.0876, -0.5907, -0.2327],
[ 0.5826, 0.7660, 0.3343, -0.5449, -0.7647, -0.0619]]],
grad_fn=<StackBackward>)
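The leading dimension of h0 and hn is num_layers*num_directions, so stacking layers grows it. A minimal sketch (variable names are illustrative, not from the original) with num_layers=2:

# input_size=4, hidden_size=6, num_layers=2
rnn2 = nn.RNN(4, 6, 2)
# h0 now needs shape (num_layers*num_directions=2, batch_size=3, hidden_size=6)
h0_2 = torch.randn(2, 3, 6)
out2, hn2 = rnn2(torch.randn(2, 3, 4), h0_2)
# output comes only from the top layer, so its shape is unchanged;
# hn stacks the final hidden state of every layer
print(out2.shape)  # torch.Size([2, 3, 6])
print(hn2.shape)   # torch.Size([2, 3, 6])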
Usually we set batch_first=True when initializing the RNN. This changes the shapes of input and output to (batch_size, seq_len, ...), but it does not affect the shapes of h0 and hn.
For a bidirectional RNN, set bidirectional=True at initialization, i.e. num_directions=2; this affects the shapes of output, h0, and hn, as the sketch below shows.
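A minimal sketch (again with illustrative variable names) combining both options:

# batch_first=True and bidirectional=True, i.e. num_directions=2
rnn_bi = nn.RNN(4, 6, 1, batch_first=True, bidirectional=True)
# with batch_first, input is (batch_size=3, seq_len=2, input_size=4)
x = torch.randn(3, 2, 4)
# h0 keeps the layout (num_layers*num_directions=2, batch_size=3, hidden_size=6)
h0_bi = torch.randn(2, 3, 6)
out_bi, hn_bi = rnn_bi(x, h0_bi)
print(out_bi.shape)  # torch.Size([3, 2, 12]): (batch_size, seq_len, num_directions*hidden_size)
print(hn_bi.shape)   # torch.Size([2, 3, 6]): not affected by batch_first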
Main parameters for initializing the nn.LSTM class: input_size, hidden_size, num_layers (default 1), bias, batch_first (default False), dropout, and bidirectional (default False).
Main arguments when calling an nn.LSTM instance: input of shape (seq_len, batch_size, input_size), plus an optional tuple (h0, c0) of initial hidden and cell states, each of shape (num_layers*num_directions, batch_size, hidden_size).
Note: num_directions defaults to 1.
# input_size=4, hidden_size=6, num_layers=1
lstm = nn.LSTM(4, 6, 1)
# seq_len=2, batch_size=3, input_size=4
input = torch.randn(2, 3, 4)
# num_layers*num_directions=1, batch_size=3, hidden_size=6
h0 = torch.randn(1, 3, 6)
c0 = torch.randn(1, 3, 6)
output, (hn, cn) = lstm(input, (h0, c0))
print("out", output.shape, output)
print("hn", hn.shape, hn)
print("cn", cn.shape, cn)
# seq_len=2, batch_size=3, num_directions*hidden_size=6
out torch.Size([2, 3, 6])
tensor([[[-0.0414, 0.5901, -0.3420, 0.0888, 0.1882, -0.2603],
[ 0.1543, -0.0668, 0.0877, -0.1536, 0.1953, 0.3458],
[-0.1405, -0.0209, 0.1439, 0.0977, 0.0071, -0.1940]],
[[ 0.0318, 0.2772, -0.0229, 0.0464, 0.1286, -0.0865],
[ 0.1021, 0.2384, 0.0941, -0.1227, 0.1751, 0.3902],
[-0.0881, 0.1334, 0.0564, -0.0522, 0.0354, -0.0247]]],
grad_fn=<StackBackward>)
# num_layers*num_directions=1, batch_size=3, hidden_size=6
hn torch.Size([1, 3, 6])
tensor([[[ 0.0318, 0.2772, -0.0229, 0.0464, 0.1286, -0.0865],
[ 0.1021, 0.2384, 0.0941, -0.1227, 0.1751, 0.3902],
[-0.0881, 0.1334, 0.0564, -0.0522, 0.0354, -0.0247]]],
grad_fn=<StackBackward>)
# num_layers*num_directions=1, batch_size=3, hidden_size=6
cn torch.Size([1, 3, 6])
tensor([[[ 0.0832, 0.4552, -0.0624, 0.1234, 0.3947, -0.1668],
[ 0.2845, 0.4464, 0.3017, -0.3624, 0.2983, 0.7948],
[-0.2298, 0.1902, 0.1637, -0.1332, 0.0778, -0.0428]]],
grad_fn=<StackBackward>)
For LSTM, the batch_first and bidirectional settings work the same way as for RNN; note that bidirectional=True changes the shapes of c0 and cn as well, as the sketch below shows.
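A minimal sketch (illustrative variable names) confirming that the cell state follows the same shape rule as the hidden state:

# bidirectional LSTM, i.e. num_directions=2
lstm_bi = nn.LSTM(4, 6, 1, bidirectional=True)
# h0 and c0 are both (num_layers*num_directions=2, batch_size=3, hidden_size=6)
h0_bi = torch.randn(2, 3, 6)
c0_bi = torch.randn(2, 3, 6)
out_bi, (hn_bi, cn_bi) = lstm_bi(torch.randn(2, 3, 4), (h0_bi, c0_bi))
print(out_bi.shape)  # torch.Size([2, 3, 12])
print(hn_bi.shape)   # torch.Size([2, 3, 6])
print(cn_bi.shape)   # torch.Size([2, 3, 6])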
Main parameters for initializing the nn.GRU class: the same as nn.RNN except that there is no nonlinearity parameter (input_size, hidden_size, num_layers, bias, batch_first, dropout, bidirectional).
Main arguments when calling an nn.GRU instance: input and h0, with the same shapes as for nn.RNN.
GRU's tensor shapes are exactly the same as RNN's.
# input_size=4, hidden_size=6, num_layers=1
gru = nn.GRU(4, 6, 1)
# seq_len=2, batch_size=3, input_size=4
input = torch.randn(2, 3, 4)
# num_layers*num_directions=1, batch_size=3, hidden_size=6
h0 = torch.randn(1, 3, 6)
output, hn = gru(input, h0)
print("out", output.shape, output)
print("hn", hn.shape, hn)
# seq_len=2, batch_size=3, num_directions*hidden_size=6
out torch.Size([2, 3, 6])
tensor([[[-0.9060, -0.7757, 0.7011, -0.4514, -1.0205, -0.6123],
[ 0.6067, -0.1415, -1.3128, -0.2117, -0.4429, -0.2052],
[ 0.0051, 0.0630, -0.0658, 0.1197, -0.4444, -0.2348]],
[[-0.0020, -0.3685, 0.2763, -0.3061, -0.7251, -0.5263],
[ 0.3088, -0.2424, -0.9513, -0.0241, -0.4825, 0.0095],
[ 0.2136, 0.2759, 0.2112, 0.3923, -0.2075, -0.3016]]],
grad_fn=<StackBackward>)
# num_layers*num_directions=1, batch_size=3, hidden_size=6
hn torch.Size([1, 3, 6])
tensor([[[-0.0020, -0.3685, 0.2763, -0.3061, -0.7251, -0.5263],
[ 0.3088, -0.2424, -0.9513, -0.0241, -0.4825, 0.0095],
[ 0.2136, 0.2759, 0.2112, 0.3923, -0.2075, -0.3016]]],
grad_fn=<StackBackward>)
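Notice that in the GRU output above, hn is exactly the last time step of output; this holds for any single-layer, unidirectional RNN/GRU. A quick check, sketched using the variables from the GRU example above:

# for a single-layer, unidirectional network, hn[0] is output's last time step
print(torch.equal(output[-1], hn[0]))  # True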