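import torch
import torch.nn as nn

# nn.LSTM(input_size, hidden_size, num_layers)
lstm = nn.LSTM(10, 20, 2)
x = torch.randn(5, 3, 10)    # (seq_len, batch, input_size)
h0 = torch.randn(2, 3, 20)   # (num_layers, batch, hidden_size)
c0 = torch.randn(2, 3, 20)   # (num_layers, batch, hidden_size)
output, (hn, cn) = lstm(x, (h0, c0))

print(output.shape)  # torch.Size([5, 3, 20])
print(hn.shape)      # torch.Size([2, 3, 20])
print(cn.shape)      # torch.Size([2, 3, 20])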
lstm = nn.LSTM(input_size, hidden_size, num_layers)
x: (seq_len, batch, input_size)
h0: (num_layers × num_directions, batch, hidden_size)
c0: (num_layers × num_directions, batch, hidden_size)
output: (seq_len, batch, num_directions × hidden_size)
hn: (num_layers × num_directions, batch, hidden_size)
cn: (num_layers × num_directions, batch, hidden_size)
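The shape rules listed above can be written down as a small pure-Python helper. This is only a sketch: `lstm_shapes` is a hypothetical function, not part of PyTorch, and it assumes the default `batch_first=False` and no `proj_size`. It takes `num_directions` to be 2 for a bidirectional LSTM and 1 otherwise.

```python
def lstm_shapes(seq_len, batch, input_size, hidden_size, num_layers,
                bidirectional=False):
    """Expected tensor shapes for nn.LSTM (hypothetical helper).

    Assumes batch_first=False and no proj_size.
    """
    num_directions = 2 if bidirectional else 1
    # h0, c0, hn and cn all share the same layout.
    state = (num_layers * num_directions, batch, hidden_size)
    return {
        "x": (seq_len, batch, input_size),
        "h0": state,
        "c0": state,
        "output": (seq_len, batch, num_directions * hidden_size),
        "hn": state,
        "cn": state,
    }


# Same sizes as nn.LSTM(10, 20, 2) with x of shape (5, 3, 10).
shapes = lstm_shapes(seq_len=5, batch=3, input_size=10,
                     hidden_size=20, num_layers=2)
print(shapes["output"])  # (5, 3, 20)
print(shapes["hn"])      # (2, 3, 20)

# With a bidirectional LSTM, num_directions becomes 2.
bi = lstm_shapes(5, 3, 10, 20, 2, bidirectional=True)
print(bi["output"])  # (5, 3, 40)
print(bi["hn"])      # (4, 3, 20)
```

Checking shapes this way before building the real tensors can help catch a mismatched `h0` or `c0` early.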
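For example:
Apply an LSTM to a batch of sentences.
Suppose there are 100 sentences (sequences), each containing 7 words, with batch_size = 64 and embedding_size = 300.
The parameters are then:
input_size = embedding_size = 300
batch = batch_size = 64
seq_len = 7
The total of 100 sentences does not appear in any tensor shape; only the batch of 64 passed to the LSTM in one call does.
In addition, set hidden_size = 100 and num_layers = 1.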
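import torch
import torch.nn as nn

# input_size=300, hidden_size=100, num_layers=1
lstm = nn.LSTM(300, 100, 1)
x = torch.randn(7, 64, 300)    # (seq_len, batch, input_size)
h0 = torch.randn(1, 64, 100)   # (num_layers, batch, hidden_size)
c0 = torch.randn(1, 64, 100)   # (num_layers, batch, hidden_size)
output, (hn, cn) = lstm(x, (h0, c0))

print(output.shape)  # torch.Size([7, 64, 100])
print(hn.shape)      # torch.Size([1, 64, 100])
print(cn.shape)      # torch.Size([1, 64, 100])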
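The example above uses the default layout, where the sequence dimension comes first. As a variation, the sketch below runs the same sizes with `batch_first=True`. That setting is an assumption added here for illustration and is not part of the original example. Only `x` and `output` switch to a batch-first layout; the hidden and cell states keep the `(num_layers, batch, hidden_size)` layout.

```python
import torch
import torch.nn as nn

# Same sizes as the sentence example, but with batch_first=True
# (an assumption for illustration; the original uses batch_first=False).
lstm = nn.LSTM(input_size=300, hidden_size=100, num_layers=1, batch_first=True)

x = torch.randn(64, 7, 300)    # (batch, seq_len, input_size)
h0 = torch.zeros(1, 64, 100)   # states are NOT batch-first
c0 = torch.zeros(1, 64, 100)
output, (hn, cn) = lstm(x, (h0, c0))

print(output.shape)  # torch.Size([64, 7, 100])
print(hn.shape)      # torch.Size([1, 64, 100])
print(cn.shape)      # torch.Size([1, 64, 100])

# For a unidirectional LSTM, the last time step of `output` should match
# the final hidden state of the top layer.
last_step_matches = torch.allclose(output[:, -1, :], hn[-1])
print(last_step_matches)
```

This also shows how `output` and `hn` relate: `output` holds the top layer's hidden state at every time step, while `hn` holds each layer's hidden state at the final step only.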