[PyTorch] The LSTM model in PyTorch

Formulas

PyTorch's LSTM is defined by the following equations:

$$ i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) $$

$$ f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) $$

$$ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) $$

$$ o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) $$

$$ c_t = f_t \odot c_{t-1} + i_t \odot g_t $$

$$ h_t = o_t \odot \tanh(c_t) $$

where $i_t$ is the input gate, $f_t$ the forget gate, $g_t$ the cell gate (candidate) activation, $o_t$ the output gate, $c_t$ the cell state at time $t$, and $h_t$ the hidden state at time $t$; $\odot$ denotes elementwise multiplication.
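To make the gate equations concrete, here is a minimal sketch that computes a single step by hand from the weights of torch.nn.LSTMCell and checks the result against the cell itself (the sizes 4 and 3 are arbitrary; PyTorch stacks the four gates' weights in the order i, f, g, o):

import torch

torch.manual_seed(0)
input_size, hidden_size = 4, 3
cell = torch.nn.LSTMCell(input_size, hidden_size)

x = torch.randn(1, input_size)
h0 = torch.zeros(1, hidden_size)
c0 = torch.zeros(1, hidden_size)

# weight_ih: (4*hidden_size, input_size), weight_hh: (4*hidden_size, hidden_size)
gates = x @ cell.weight_ih.t() + cell.bias_ih + h0 @ cell.weight_hh.t() + cell.bias_hh
i, f, g, o = gates.chunk(4, dim=1)  # split into the four gate pre-activations
i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
g = torch.tanh(g)
c1 = f * c0 + i * g          # new cell state
h1 = o * torch.tanh(c1)      # new hidden state

h_ref, c_ref = cell(x, (h0, c0))
print(torch.allclose(h1, h_ref), torch.allclose(c1, c_ref))  # True True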

Definition

In PyTorch the LSTM class is declared as:

class torch.nn.LSTM(*args, **kwargs)

Parameter list

  • input_size: the feature dimension of the input x
  • hidden_size: the feature dimension of the hidden state
  • num_layers: number of stacked LSTM layers; default 1
  • bias: if False, the biases $b_{ih}$ and $b_{hh}$ are set to 0; default True
  • batch_first: if True, input and output tensors have the shape (batch, seq, feature)
  • dropout: if non-zero, applies dropout to the output of every layer except the last; default 0
  • bidirectional: if True, becomes a bidirectional LSTM; default False
  • Inputs: input, ($h_0$, $c_0$)
  • Outputs: output, ($h_n$, $c_n$)

Input shapes:
input: (seq_len, batch, input_size)
h_0: (num_layers * num_directions, batch, hidden_size)
c_0: (num_layers * num_directions, batch, hidden_size)

Output shapes:
output: (seq_len, batch, hidden_size * num_directions)
h_n: (num_layers * num_directions, batch, hidden_size)
c_n: (num_layers * num_directions, batch, hidden_size)
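A quick way to verify these shapes is to push a random tensor through a small LSTM; the concrete sizes below are arbitrary choices for illustration:

import torch

seq_len, batch, input_size, hidden_size, num_layers = 5, 2, 10, 20, 2
num_directions = 2  # because bidirectional=True below
lstm = torch.nn.LSTM(input_size, hidden_size, num_layers, bidirectional=True)

x = torch.randn(seq_len, batch, input_size)
h0 = torch.zeros(num_layers * num_directions, batch, hidden_size)
c0 = torch.zeros(num_layers * num_directions, batch, hidden_size)

output, (hn, cn) = lstm(x, (h0, c0))
print(output.shape)  # torch.Size([5, 2, 40]) = (seq_len, batch, hidden_size * num_directions)
print(hn.shape)      # torch.Size([4, 2, 20]) = (num_layers * num_directions, batch, hidden_size)
print(cn.shape)      # same shape as hn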

Example: an LSTM-based part-of-speech tagger

The toy model below embeds each word, runs the embeddings through a single-layer LSTM, and maps each hidden state to a tag score; gensim is used only to build the word and tag vocabularies.

import torch
import gensim

torch.manual_seed(2)

# Toy corpus: each item is (sentence, POS tags), with tokens separated by spaces
datas = [('你 叫 什么 名字 ?', 'n v n n f'), ('今天 天气 怎么样 ?', 'n n adj f')]
words = [data[0].split() for data in datas]
tags = [data[1].split() for data in datas]

# gensim Dictionary maps id -> token; token2id gives the reverse mapping
id2word = gensim.corpora.Dictionary(words)
word2id = id2word.token2id

id2tag = gensim.corpora.Dictionary(tags)
tag2id = id2tag.token2id

def sen2id(inputs):
    return [word2id[word] for word in inputs]

def tags2id(inputs):
    return [tag2id[tag] for tag in inputs]

def format_input(inputs):
    return torch.LongTensor(sen2id(inputs))

def format_tag(inputs):
    return torch.LongTensor(tags2id(inputs))

class LSTMTagger(torch.nn.Module):
    def __init__(self, embedding_dim, hidden_dim, vocab_size, target_size):
        super(LSTMTagger, self).__init__()
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim
        self.embedding = torch.nn.Embedding(vocab_size, embedding_dim)
        self.lstm = torch.nn.LSTM(embedding_dim, hidden_dim)
        self.out2tag = torch.nn.Linear(hidden_dim, target_size)
        self.log_softmax = torch.nn.LogSoftmax(dim=1)

    def forward(self, inputs):
        # (seq_len,) -> (seq_len, embedding_dim) -> (seq_len, batch=1, embedding_dim)
        embeds = self.embedding(inputs)
        # The hidden and cell states default to zeros when not supplied
        out, _ = self.lstm(embeds.view(-1, 1, self.embedding_dim))
        # (seq_len, 1, hidden_dim) -> (seq_len, target_size) log-probabilities
        return self.log_softmax(self.out2tag(out.view(-1, self.hidden_dim)))

model = LSTMTagger(3, 3, len(word2id), len(tag2id))
loss_function = torch.nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):
    optimizer.zero_grad()
    sentence = format_input('你 叫 什么 名字'.split())
    targets = format_tag('n v n n'.split())  # tags for this sentence from the corpus
    scores = model(sentence)
    loss = loss_function(scores, targets)
    loss.backward()
    optimizer.step()
    print(loss.item())

# Tag a shorter, unseen word sequence
sentence = format_input('你 叫 什么'.split())
scores = model(sentence)
pred = torch.max(scores, 1)[1]
print([id2tag[pred[i].item()] for i in range(pred.size(0))])
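The prediction above still records operations for autograd; at inference time it is cleaner to switch the model to eval mode and disable gradient tracking. A minimal variant using the same model and helpers:

# Tag the second training sentence without building the autograd graph
model.eval()
with torch.no_grad():
    scores = model(format_input('今天 天气 怎么样'.split()))
print([id2tag[i] for i in scores.argmax(dim=1).tolist()])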
