PyTorch Deep Learning Practice - P12: Recurrent Neural Networks

A basic RNN is a type of neural network suited to transforming one sequence into another, as with weather, stock-market, or natural-language data. Using it requires understanding (1) sequential data and (2) the weight-sharing mechanism of the recurrent process.

  • DNN: a dense (fully connected), deep network with many linear layers
  • x1, x2, x3, x4, ... are the different features of a data sample
  • Fully connected layers are computationally expensive; convolutions, by contrast, are far cheaper thanks to weight sharing (see the quick comparison below)
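As a rough illustration of the gap (the layer sizes below are arbitrary, chosen only for this comparison):

import torch

# A fully connected layer connects every input to every output
fc = torch.nn.Linear(4096, 1024)
# A convolution shares one small kernel across the whole input
conv = torch.nn.Conv2d(3, 64, kernel_size=3)

print(sum(p.numel() for p in fc.parameters()))    # 4195328
print(sum(p.numel() for p in conv.parameters()))  # 1792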

Aside: Python's join() method and the end parameter of print()

Python's join() method concatenates the elements of a sequence with a specified separator string, producing a new string.

Syntax:

str.join(sequence)

Example:

str = "-"

seq = ("a", "b", "c"); # 字符串序列 print str.join( seq );

Output:

a-b-c

In Python, print(..., end=' ') means: do not end the line with a newline; append a space instead.
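For example:

print('a', end=' ')
print('b')
# prints: a b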

RNNs:

  • h2 is influenced not only by x2 but also by x1, so the two must be fused (see the update formula below)
  • Inside the RNN cell the usual activation is tanh, because its values lie in (-1, 1)
  • An RNN can be viewed as one linear layer reused at every time step
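Concretely, the fusion is the standard RNN cell update, which is also the formula torch.nn.RNNCell implements (tanh is its default nonlinearity):

$$h_t = \tanh(W_{ih} x_t + b_{ih} + W_{hh} h_{t-1} + b_{hh})$$

where x_t is the input at step t, h_{t-1} is the previous hidden state, and the W matrices are shared across all time steps.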

[Figure 1]

Code:

# The two most important constructor arguments are input_size and hidden_size
import torch
# Instantiate the RNN cell
cell = torch.nn.RNNCell(input_size=input_size, hidden_size=hidden_size)
# Call it: input is the input at the current time step; combined with the
# current hidden state it yields the next hidden state
hidden = cell(input, hidden)

[Figure 2]

[Figure 3]

  • When using an RNN, pay close attention to the hyperparameters and the tensor shapes they determine
  • [Figure 4]

Code: calling RNNCell requires a loop, whose length is the sequence length.

[Figure 5]

import torch

# Set the parameters first
batch_size = 1
seq_len = 3
input_size = 4
hidden_size = 2

# Construct the RNN cell
cell = torch.nn.RNNCell(input_size=input_size, hidden_size=hidden_size)

# Dummy data of shape (seq, batch, features)
dataset = torch.randn(seq_len, batch_size, input_size)
# Initialize the hidden state to all zeros
hidden = torch.zeros(batch_size, hidden_size)

# Loop over the sequence
for idx, input in enumerate(dataset):
    print('=' * 20, idx, '=' * 20)
    print('Input size:', input.shape)
    # The hidden state has shape (batchSize, hiddenSize)
    hidden = cell(input, hidden)
    print('Output size:', hidden.shape)
    print(hidden)

  • Get the dimensions straight: the inputs now carry an extra sequence dimension
# num_layers specifies how many RNN layers are stacked
cell = torch.nn.RNN(input_size=input_size, hidden_size=hidden_size,
                    num_layers=num_layers)
# inputs is the whole input sequence x1...xN and hidden is h0; the first
# returned tensor out contains h1...hN, the second, hidden, is hN
out, hidden = cell(inputs, hidden)
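Worth knowing (it is used in Example 3 below): constructing the module with batch_first=True switches the layout of the input and output tensors to (batchSize, seqLen, inputSize):

cell = torch.nn.RNN(input_size=input_size, hidden_size=hidden_size,
                    num_layers=num_layers, batch_first=True)
# inputs must then be given as (batchSize, seqLen, inputSize)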

[Figure 6]

  • With torch.nn.RNN you don't write the loop yourself; the module returns every hidden output along the sequence plus the final hidden state
  • [Figure 7]
  • [Figure 8]

[Figure 9]

  • num_layers sets up a multi-layer RNN; RNN cells drawn in the same color are the same linear layer (three layers here), and each layer produces its own h1, h2, ...
  • The bottom-left is the input; the top-right is the output (output together with hN)

With torch.nn.RNN there is no loop to write!

import torch
# Basic parameters
batch_size = 1
seq_len = 3
input_size = 4
hidden_size = 2
num_layers = 1
# Construct the RNN: specify the input size, hidden size, and number of layers
cell = torch.nn.RNN(input_size=input_size, hidden_size=hidden_size,
                    num_layers=num_layers)
# (seqLen, batchSize, inputSize)
inputs = torch.randn(seq_len, batch_size, input_size)
hidden = torch.zeros(num_layers, batch_size, hidden_size)
# out holds the output at every step; hidden is the final hidden state
out, hidden = cell(inputs, hidden)
print('Output size:', out.shape)
print('Output:', out)
print('Hidden size: ', hidden.shape)
print('Hidden: ', hidden)

Output:

Output size: torch.Size([3, 1, 2])
Output: tensor([[[ 0.8032, -0.6970]],

        [[ 0.9277, -0.9576]],

        [[ 0.5647,  0.3562]]], grad_fn=<StackBackward>)
Hidden size:  torch.Size([1, 1, 2])

Hidden:  tensor([[[0.5647, 0.3562]]], grad_fn=<StackBackward>)
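Note that the final hidden state, (0.5647, 0.3562), is exactly the last row of the output: for a single-layer RNN, hN is the output at the last time step.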

[Figure 10]

[Figure 11]

Example: a sequence-to-sequence transformation (learn "hello" → "ohlol")

1. Using RNNCell

[Figure 12]

  • First, vectorize the characters. At the character level in NLP, build a dictionary from the characters that occur (at the word level, build a dictionary recording which words appear); assign every entry an index, map each character to its index according to the dictionary, and then turn the index into a vector. If 'h' has index 1, every element of its vector is 0 except position 1 (one-hot encoding), and each vector has as many elements as the dictionary
  • [Figure 13]
  • So input_size = 4
  • Each output is a four-dimensional vector over the four classes h/e/l/o, so this is a classification problem: feed it through softmax + cross-entropy to obtain a distribution
  • [Figure 14]
  • [Figure 15]

Code:

import torch

# Parameters
input_size = 4
hidden_size = 4
batch_size = 1

# Prepare the data
# Character dictionary
idx2char = ['e', 'h', 'l', 'o']
x_data = [1, 0, 2, 2, 3]   # "hello"
y_data = [3, 1, 2, 3, 2]   # "ohlol"
# One-hot lookup table
one_hot_lookup = [[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1]]
# One-hot vectors, shape seqLen × inputSize
x_one_hot = [one_hot_lookup[x] for x in x_data]
# Reshape the inputs to (seqLen, batchSize, inputSize); -1 lets PyTorch infer that dimension
inputs = torch.Tensor(x_one_hot).view(-1, batch_size, input_size)
# Reshape the labels to (seqLen, 1)
labels = torch.LongTensor(y_data).view(-1, 1)

[Figure 16]
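The model itself is defined on the slide above; a minimal sketch consistent with the calls used in the training loop below (net(input, hidden) and net.init_hidden()) would be:

class Model(torch.nn.Module):
    def __init__(self, input_size, hidden_size, batch_size):
        super(Model, self).__init__()
        self.batch_size = batch_size
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.rnncell = torch.nn.RNNCell(input_size=self.input_size,
                                        hidden_size=self.hidden_size)

    def forward(self, input, hidden):
        # One step of the recurrence: (batchSize, inputSize) -> (batchSize, hiddenSize)
        return self.rnncell(input, hidden)

    def init_hidden(self):
        # h0: all zeros
        return torch.zeros(self.batch_size, self.hidden_size)

net = Model(input_size, hidden_size, batch_size)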

# Loss and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.1)
# Training cycle
for epoch in range(15):
    loss = 0
    # Zero the gradients
    optimizer.zero_grad()
    # h0
    hidden = net.init_hidden()
    print('Predicted string: ', end='')
    # inputs: (seqLen, batchSize, inputSize); input: (batchSize, inputSize)
    # labels: (seqLen, 1); label: (1)
    for input, label in zip(inputs, labels):
        hidden = net(input, hidden)
        # Accumulate the loss over every step of the sequence
        loss += criterion(hidden, label)
        # Output prediction: the class with the largest logit
        _, idx = hidden.max(dim=1)
        print(idx2char[idx.item()], end='')
    loss.backward()
    optimizer.step()
    print(', Epoch [%d/15] loss=%.4f' % (epoch+1, loss.item()))

Result:

[Figure 17]

2. Using torch.nn.RNN

What changes:

# Loss and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.05)

# Training cycle
for epoch in range(15):
    optimizer.zero_grad()
    outputs = net(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

    _, idx = outputs.max(dim=1)
    idx = idx.data.numpy()
    print('Predicted: ', ''.join([idx2char[x] for x in idx]), end='')
    print(', Epoch [%d/15] loss = %.3f' % (epoch + 1, loss.item()))

import torch
import torch.nn as nn
batch_size = 1
input_size = 4
hidden_size = 4
num_layers = 1
seq_len = 5

idx2char = ['e', 'h', 'l', 'o']
x_data = [1, 0, 2, 2, 3]  # hello
y_data = [3, 1, 2, 3, 2]  # ohlol
one_hot_lookup = [[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]]
x_one_hot = [one_hot_lookup[x] for x in x_data]

inputs = torch.Tensor(x_one_hot).view(seq_len, batch_size, input_size)
labels = torch.LongTensor(y_data)  # shape (seqLen × batchSize,)
print(inputs.shape, labels.shape)


class Model(torch.nn.Module):
    def __init__(self, input_size, hidden_size, batch_size, num_layers):
        super(Model, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.batch_size = batch_size
        self.num_layers = num_layers
        self.rnn = torch.nn.RNN(self.input_size, self.hidden_size, self.num_layers)

    def forward(self, input):
        hidden = torch.zeros(self.num_layers, self.batch_size, self.hidden_size)
        out, _ = self.rnn(input, hidden)
        return out.view(-1, self.hidden_size)  # reshape out to (seqLen × batchSize, hiddenSize)


net = Model(input_size, hidden_size, batch_size, num_layers)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.1)

for epoch in range(15):
    optimizer.zero_grad()
    outputs = net(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

    _, idx = outputs.max(dim=1)
    idx = idx.data.numpy()  # convert to a numpy array
    print('Predicted:', ''.join([idx2char[x] for x in idx]), end='')  # end='' suppresses the newline; ''.join joins the characters into one string
    print(',Epoch [%d/15] loss = %.3f' % (epoch + 1, loss.item()))

torch.Size([5, 1, 4]) torch.Size([5])
Predicted: elhhl,Epoch [1/15] loss = 1.518
Predicted: hllll,Epoch [2/15] loss = 1.329
Predicted: holll,Epoch [3/15] loss = 1.150
Predicted: oolll,Epoch [4/15] loss = 1.017
Predicted: ohlll,Epoch [5/15] loss = 0.933
Predicted: oholl,Epoch [6/15] loss = 0.863
Predicted: oholl,Epoch [7/15] loss = 0.798
Predicted: oholl,Epoch [8/15] loss = 0.735
Predicted: oholl,Epoch [9/15] loss = 0.688
Predicted: oholl,Epoch [10/15] loss = 0.657
Predicted: oholl,Epoch [11/15] loss = 0.634
Predicted: oholl,Epoch [12/15] loss = 0.615
Predicted: oholl,Epoch [13/15] loss = 0.597
Predicted: oholl,Epoch [14/15] loss = 0.582
Predicted: oholl,Epoch [15/15] loss = 0.571
 

  • Everything above used one-hot vectors. Their drawbacks: the dimension is too high, the vectors are too sparse, and the encoding is hard-coded
  • An embedding layer is low-dimensional, dense, and learned: it maps high-dimensional sparse inputs into a low-dimensional dense space, i.e., it performs dimensionality reduction
  • [Figure 18]
  • Suppose the one-hot input is 4-dimensional and the embedding is 5-dimensional. Going from 4 to 5 dimensions builds a 4×5 matrix, and looking up index 2 simply reads out that row as the output vector; in backpropagation the derivative amounts to the transposed 5×4 matrix right-multiplied by the one-hot vector (0, 0, 1, 0) (see the lookup sketch below)
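A quick sketch of the lookup, matching the 4-to-5 example above (the variable names are just for illustration):

import torch

emb = torch.nn.Embedding(4, 5)   # dictionary size 4, embedding dimension 5
idx = torch.LongTensor([2])      # index 2, i.e. the one-hot vector (0, 0, 1, 0)
vec = emb(idx)                   # reads out row 2 of the 4×5 weight matrix
print(vec.shape)                 # torch.Size([1, 5])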

Example 3: using an embedding layer and a linear layer

[Figure 19]

[Figure 20]

Code:

import torch
# parameters
num_class = 4
input_size = 4
hidden_size = 8
embedding_size = 10
num_layers = 2
batch_size = 1
seq_len = 5

idx2char = ['e', 'h', 'l', 'o']
x_data = [[1, 0, 2, 2, 3]] # (batch, seq_len)
y_data = [3, 1, 2, 3, 2] # (batch * seq_len)
inputs = torch.LongTensor(x_data)
labels = torch.LongTensor(y_data)



class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Embedding layer
        self.emb = torch.nn.Embedding(input_size, embedding_size)
        # RNN; batch_first=True means inputs arrive as (batch, seqLen, features)
        self.rnn = torch.nn.RNN(input_size=embedding_size,
                                hidden_size=hidden_size,
                                num_layers=num_layers,
                                batch_first=True)
        # The fully connected layer maps hidden_size to num_class
        self.fc = torch.nn.Linear(hidden_size, num_class)

    def forward(self, x):
        hidden = torch.zeros(num_layers, x.size(0), hidden_size)
        # Embedding lookup
        x = self.emb(x)  # (batch, seqLen, embeddingSize)
        x, _ = self.rnn(x, hidden)
        x = self.fc(x)
        # Flatten to (batch × seqLen, numClass) so the loss can process it
        return x.view(-1, num_class)

net = Model()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.05)

for epoch in range(15):
    optimizer.zero_grad()
    outputs = net(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    _, idx = outputs.max(dim=1)
    idx = idx.data.numpy()
    print('Predicted: ', ''.join([idx2char[x] for x in idx]), end='')
    print(', Epoch [%d/15] loss = %.3f' % (epoch + 1, loss.item()))

Result:

Predicted:  oeooe, Epoch [1/15] loss = 1.397
Predicted:  ooooo, Epoch [2/15] loss = 1.145
Predicted:  ooooo, Epoch [3/15] loss = 0.929
Predicted:  ohlol, Epoch [4/15] loss = 0.708
Predicted:  ohlol, Epoch [5/15] loss = 0.539
Predicted:  ohlol, Epoch [6/15] loss = 0.415
Predicted:  ohlol, Epoch [7/15] loss = 0.312
Predicted:  ohlol, Epoch [8/15] loss = 0.231
Predicted:  ohlol, Epoch [9/15] loss = 0.169
Predicted:  ohlol, Epoch [10/15] loss = 0.123
Predicted:  ohlol, Epoch [11/15] loss = 0.091
Predicted:  ohlol, Epoch [12/15] loss = 0.068
Predicted:  ohlol, Epoch [13/15] loss = 0.052
Predicted:  ohlol, Epoch [14/15] loss = 0.040
Predicted:  ohlol, Epoch [15/15] loss = 0.030

Process finished with exit code 0

  • By the fourth epoch the model already outputs ohlol, so this version learns noticeably faster
  • Homework: use an LSTM. The forget gate sounds impressive, but underneath it is still linear layers; the time complexity is higher and the computation heavier, yet the results are much better than a plain RNN's (see the sketch after this list)
  • [Figure 21]
  • [Figure 22]
  • Homework: use a GRU. It performs better than an RNN without the long running time of an LSTM
  • [Figure 23]
  • w denotes the corresponding weight matrices
  • [Figure 24]
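A minimal sketch of the swap, assuming the same setup as Example 3 (all sizes below are illustrative):

import torch

# GRU: a drop-in replacement for torch.nn.RNN with the same constructor arguments
gru = torch.nn.GRU(input_size=10, hidden_size=8, num_layers=2, batch_first=True)
# LSTM: same arguments, but it also carries a cell state
lstm = torch.nn.LSTM(input_size=10, hidden_size=8, num_layers=2, batch_first=True)

x = torch.randn(1, 5, 10)   # (batch, seqLen, embeddingSize)
h0 = torch.zeros(2, 1, 8)   # (numLayers, batch, hiddenSize)
c0 = torch.zeros(2, 1, 8)   # initial cell state, LSTM only

out, hn = gru(x, h0)               # same calling convention as RNN
out, (hn, cn) = lstm(x, (h0, c0))  # LSTM's hidden and cell states travel as a tuple
print(out.shape)                   # torch.Size([1, 5, 8])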
