This post uses a plain (vanilla) RNN.
A recommended introduction to RNNs: https://zhuanlan.zhihu.com/p/28054589
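Each image is 28*28, so every input sequence has length 28 (seq_len) and every element of the sequence has 28 dimensions (input_size). In other words, one image is split into 28 rows and treated as a sequence of length 28 whose elements each hold 28 values (the pixels of that row). The code feeds tensors of shape (batch, 28, 28), so the sequence axis is the image height.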
Note that when batch_first is set to True, the output has shape out: (batch, seq_len, hidden_size).
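The shape bookkeeping above can be checked on a random batch. The following is a minimal sketch, assuming a dummy batch of 64 random "images" and hidden_size=128 (both values are only illustrative); it exercises `nn.RNN` without training anything.

```python
import torch
import torch.nn as nn

# Dummy MNIST-like batch: (batch, channels, height, width). 64 is an arbitrary batch size.
images = torch.randn(64, 1, 28, 28)

# Drop the channel dimension: dim 1 is now the image height, so each step is one row.
seq = images.squeeze(1)  # (64, 28, 28) = (batch, seq_len, input_size)

rnn = nn.RNN(input_size=28, hidden_size=128, num_layers=1, batch_first=True)
output, hn = rnn(seq)  # the initial hidden state defaults to zeros

print(output.shape)  # (batch, seq_len, hidden_size)
print(hn.shape)      # (num_layers * num_directions, batch, hidden_size)
```

Note that batch_first only changes the layout of the input and of output; hn keeps the layer dimension first.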
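Approaches:
RNN1: use a fairly large output dimension (hidden_size), for example 128, take the output of the last time step (the top-right output in the usual unrolled RNN diagram), and map it to the 10 classes with a linear layer.
RNN2: set the output dimension (hidden_size) directly to 10, take the output of the last time step as the class scores, and compute the loss on it.
A plain RNN has trouble remembering inputs from many steps back. To strengthen its memory, the next three variants feed the outputs of all time steps, not only the last one, into a fully connected layer.
RNN3: hidden_size=1; collect the 28 one-dimensional outputs (28 time steps, 1 dimension each), then linear -> 10.
RNN4: hidden_size=128; first map the 128 dimensions of each time step with linear -> 1, then map the resulting 28 values with linear -> 10.
RNN5: hidden_size=128; concatenate the 28 outputs of 128 dimensions with torch.cat into a single tensor of 128*28 dimensions, then apply a 128*28 linear -> 10.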
Results:
All models were trained for 5 epochs.
RNN1: last time step, 128 linear -> 10
Test ACC: 0.9581
RNN2: last time step, 10 dimensions used directly
Test ACC: 0.4454
Variants that pass the outputs of all time steps through a fully connected layer:
RNN3: hidden_size=1, the 28 one-dimensional outputs linear -> 10
Test ACC: 0.7156
RNN4: hidden_size=128, each time step 128 linear -> 1, then the 28 values linear -> 10
Test ACC: 0.6606
RNN5: hidden_size=128, the 28 outputs concatenated with torch.cat, then 128*28 linear -> 10
Test ACC: 0.9789
RNN5 is the most accurate, and also the slowest.
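How an RNN and a CNN compare on MNIST depends on the RNN's output dimension, on how the RNN is designed, and on the size of the CNN. For the same amount of computation (with both designed sensibly), a CNN should reach higher accuracy. The RNN treats the image as a sequence, so it does not model how each pixel relates to its neighbours above, below, left and right; strictly speaking, the relationship along one image axis is carried only by the recurrent hidden state instead of being modelled as a local neighbourhood. A CNN convolves over small patches of pixels, so it takes the relationships inside each patch (in all four directions) into account directly.
(The tensors described above also have a batch dimension, which is omitted for brevity.)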
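Analysis:
A larger hidden_size lets the RNN extract more features and gives more accurate results, which is why RNN3 scores low. I initially expected RNN4 to do well, but its accuracy turned out to be low. My explanation was that mapping 128 -> 1 throws away a lot of information. Another likely cause lies in the RNN4 code that produced this number: the per-step results were collected through .data, which cuts them off from the autograd graph, and the 28 linear layers were kept in a plain Python list, so they were never registered as parameters. Under those conditions only the final linear layer is trained, and the RNN4 figure should be read with that in mind.
RNN2 has the same problem of extracting too few features (hidden_size is too small).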
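To put rough numbers on how much capacity each variant has, the parameter counts can be worked out by hand. The sketch below assumes the standard single-layer `nn.RNN` parameterisation (input-to-hidden weights, hidden-to-hidden weights and two bias vectors) and `nn.Linear` with bias; the helper names `rnn_params` and `linear_params` are made up for this illustration.

```python
def rnn_params(input_size, hidden_size):
    # W_ih (hidden x input) + W_hh (hidden x hidden) + b_ih + b_hh
    return hidden_size * input_size + hidden_size * hidden_size + 2 * hidden_size


def linear_params(in_features, out_features):
    # weight matrix plus bias vector
    return in_features * out_features + out_features


counts = {
    "RNN1": rnn_params(28, 128) + linear_params(128, 10),
    "RNN2": rnn_params(28, 10),
    "RNN3": rnn_params(28, 1) + linear_params(28, 10),
    "RNN4": rnn_params(28, 128) + 28 * linear_params(128, 1) + linear_params(28, 10),
    "RNN5": rnn_params(28, 128) + linear_params(128 * 28, 10),
}
for name, n in counts.items():
    print(name, n)
```

RNN2 and RNN3 end up with only a few hundred parameters, against tens of thousands for the hidden_size=128 variants.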
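The popular approach today is therefore RNN1: take the last output of the sequence (the top-right output), set a fairly large output dimension (hidden_size), and map it to the number of classes with a single linear layer. Its speed is acceptable. Having read that RNNs are weak at long-term memory, I tried applying a linear layer to the outputs of all time steps, building the memory by hand, as in RNN5 (leaving RNN4 aside, since its first linear step loses a lot of information). RNN5 is indeed more accurate than RNN1, but it is much, much slower.
This shows that RNNs are quite flexible, so don't be afraid to experiment!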
Code
RNN1:
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms
from matplotlib import pyplot as plt

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class RNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(
            input_size=28,
            hidden_size=128,
            num_layers=1,
            batch_first=True,
        )
        self.Out2Class = nn.Linear(128, 10)

    def forward(self, input):
        output, hn = self.rnn(input, None)
        # print('hn,shape:{}'.format(hn.shape))
        # output[:, -1, :] takes the last element of the output sequence;
        # hn[0, :, :] or hn.squeeze(0) could be used instead.
        # Why hn[0, :, :] and not hn: the first dimension of hn is
        # num_layers * num_directions (1 here), i.e. hn is (1, x, x), so the 1 must be dropped.
        # This maps the 128 dimensions of the last time step to the 10 classes.
        tmp = self.Out2Class(output[:, -1, :])
        return tmp


model = RNN()
model = model.to(device)
print(model)
model = model.train()

# MNIST images have a single channel, so mean and std take one value each.
img_transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize(mean=[0.5], std=[0.5])])
dataset_train = datasets.MNIST(root='./data', transform=img_transform, train=True, download=True)
dataset_test = datasets.MNIST(root='./data', transform=img_transform, train=False, download=True)
train_loader = torch.utils.data.DataLoader(dataset=dataset_train, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=dataset_test, batch_size=64, shuffle=False)

# images,label = next(iter(train_loader))
# print(images.shape)
# print(label.shape)
# images_example = torchvision.utils.make_grid(images)
# images_example = images_example.numpy().transpose(1,2,0)
# mean = [0.5,0.5,0.5]
# std = [0.5,0.5,0.5]
# images_example = images_example*std + mean
# plt.imshow(images_example)
# plt.show()


def Get_ACC():
    model.eval()
    correct = 0
    total_num = len(dataset_test)
    with torch.no_grad():  # no gradients are needed for evaluation
        for item in test_loader:
            batch_imgs, batch_labels = item
            batch_imgs = batch_imgs.squeeze(1)  # (batch, 1, 28, 28) -> (batch, 28, 28)
            batch_imgs = batch_imgs.to(device)
            batch_labels = batch_labels.to(device)
            out = model(batch_imgs)
            _, pred = torch.max(out, 1)
            correct += torch.sum(pred == batch_labels).item()
            # print(pred)
            # print(batch_labels)
    acc = correct / total_num
    print('correct={},Test ACC:{:.5}'.format(correct, acc))
    model.train()


optimizer = torch.optim.Adam(model.parameters())
loss_f = nn.CrossEntropyLoss()
Get_ACC()  # accuracy before training, as a baseline
for epoch in range(5):  # 5 epochs, as in the reported results
    print('epoch:{}'.format(epoch))
    cnt = 0
    for item in train_loader:
        batch_imgs, batch_labels = item
        batch_imgs = batch_imgs.squeeze(1)
        # print(batch_imgs.shape)
        batch_imgs = batch_imgs.to(device)
        batch_labels = batch_labels.to(device)
        out = model(batch_imgs)
        # print(out.shape)
        loss = loss_f(out, batch_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if cnt % 100 == 0:
            print_loss = loss.item()
            print('epoch:{},cnt:{},loss:{}'.format(epoch, cnt, print_loss))
        cnt += 1
    Get_ACC()
torch.save(model, 'model')
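The RNN1 forward pass notes that output[:, -1, :] and hn[0] can be used interchangeably. A minimal sketch checking this for a single-layer, unidirectional `nn.RNN`, assuming a small random batch of 4 sequences in place of real images:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=28, hidden_size=128, num_layers=1, batch_first=True)
x = torch.randn(4, 28, 28)  # (batch, seq_len, input_size), random stand-in for images

output, hn = rnn(x)

last_step = output[:, -1, :]  # output at the last time step, (4, 128)
final_hidden = hn[0]          # final hidden state of the only layer, (4, 128)

same = torch.allclose(last_step, final_hidden)
print(same)
```

With more than one layer or a bidirectional RNN the two are no longer the same tensor, so the indexing would have to be adapted.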
RNN2:
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms
from matplotlib import pyplot as plt

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class RNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(
            input_size=28,
            hidden_size=10,
            num_layers=1,
            batch_first=True,
        )

    def forward(self, input):
        output, hn = self.rnn(input, None)
        # hn is (num_layers * num_directions, batch, hidden_size); drop the leading 1.
        hn = hn[0, :, :]
        # print(hn.shape)
        # last = output[0,-1,:]
        # print('outlast:{}'.format(last))
        # tmp_hn = hn[0,:]
        # print('hn:{}'.format(tmp_hn))
        # print(hn.shape)
        # hn = hn.squeeze(0)
        # print('hn,shape:{}'.format(hn.shape))
        return hn  # the 10-dimensional hidden state is used directly as the class scores


model = RNN()
model = model.to(device)
print(model)
model = model.train()

# MNIST images have a single channel, so mean and std take one value each.
img_transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize(mean=[0.5], std=[0.5])])
dataset_train = datasets.MNIST(root='./data', transform=img_transform, train=True, download=True)
dataset_test = datasets.MNIST(root='./data', transform=img_transform, train=False, download=True)
train_loader = torch.utils.data.DataLoader(dataset=dataset_train, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=dataset_test, batch_size=64, shuffle=False)

# images,label = next(iter(train_loader))
# print(images.shape)
# print(label.shape)
# images_example = torchvision.utils.make_grid(images)
# images_example = images_example.numpy().transpose(1,2,0)
# mean = [0.5,0.5,0.5]
# std = [0.5,0.5,0.5]
# images_example = images_example*std + mean
# plt.imshow(images_example)
# plt.show()


def Get_ACC():
    model.eval()
    correct = 0
    total_num = len(dataset_test)
    with torch.no_grad():  # no gradients are needed for evaluation
        for item in test_loader:
            batch_imgs, batch_labels = item
            batch_imgs = batch_imgs.squeeze(1)  # (batch, 1, 28, 28) -> (batch, 28, 28)
            batch_imgs = batch_imgs.to(device)
            batch_labels = batch_labels.to(device)
            out = model(batch_imgs)
            _, pred = torch.max(out, 1)
            correct += torch.sum(pred == batch_labels).item()
            # print(pred)
            # print(batch_labels)
    acc = correct / total_num
    print('correct={},Test ACC:{:.5}'.format(correct, acc))
    model.train()


optimizer = torch.optim.Adam(model.parameters())
loss_f = nn.CrossEntropyLoss()
Get_ACC()  # accuracy before training, as a baseline
for epoch in range(5):  # 5 epochs, as in the reported results
    print('epoch:{}'.format(epoch))
    cnt = 0
    for item in train_loader:
        batch_imgs, batch_labels = item
        batch_imgs = batch_imgs.squeeze(1)
        # print(batch_imgs.shape)
        batch_imgs = batch_imgs.to(device)
        batch_labels = batch_labels.to(device)
        out = model(batch_imgs)
        # print(out.shape)
        loss = loss_f(out, batch_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if cnt % 100 == 0:
            print_loss = loss.item()
            print('epoch:{},cnt:{},loss:{}'.format(epoch, cnt, print_loss))
        cnt += 1
    Get_ACC()
torch.save(model, 'model')
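Besides the small hidden_size, one more factor may hold RNN2 back; this is an observation added here, not something the original experiment tested. `nn.RNN` uses tanh by default, so the hidden state that RNN2 hands to `CrossEntropyLoss` as class scores is confined to (-1, 1). The sketch below computes the upper bound on the softmax probability of the correct class under that constraint, and the corresponding lower bound on the per-sample loss.

```python
import math

num_classes = 10

# Best case under the tanh bound: correct class at +1, every other class at -1.
best = math.exp(1.0)
rest = (num_classes - 1) * math.exp(-1.0)

max_prob = best / (best + rest)   # bound on the softmax probability of the correct class
min_loss = -math.log(max_prob)    # bound on the cross-entropy loss per sample

print(round(max_prob, 4), round(min_loss, 4))
```

This does not cap accuracy directly, since the prediction only depends on which score is largest, but it shows that the scores are squeezed into a narrow range. RNN1 avoids this because its linear layer produces unbounded scores.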
RNN3:
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms
from matplotlib import pyplot as plt

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class RNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(
            input_size=28,
            hidden_size=1,
            num_layers=1,
            batch_first=True,
        )
        self.Out2Class = nn.Linear(28, 10)

    def forward(self, input):
        output, hn = self.rnn(input, None)
        # print('hn,shape:{}'.format(hn.shape))
        # output is (batch, 28, 1); drop the last dimension to get (batch, 28).
        outreshape = output[:, :, 0]
        # print(outreshape.shape)
        tmp = self.Out2Class(outreshape)
        # print(tmp.shape)
        return tmp


model = RNN()
model = model.to(device)
print(model)
model = model.train()

# MNIST images have a single channel, so mean and std take one value each.
img_transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize(mean=[0.5], std=[0.5])])
dataset_train = datasets.MNIST(root='./data', transform=img_transform, train=True, download=True)
dataset_test = datasets.MNIST(root='./data', transform=img_transform, train=False, download=True)
train_loader = torch.utils.data.DataLoader(dataset=dataset_train, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=dataset_test, batch_size=64, shuffle=False)

# images,label = next(iter(train_loader))
# print(images.shape)
# print(label.shape)
# images_example = torchvision.utils.make_grid(images)
# images_example = images_example.numpy().transpose(1,2,0)
# mean = [0.5,0.5,0.5]
# std = [0.5,0.5,0.5]
# images_example = images_example*std + mean
# plt.imshow(images_example)
# plt.show()


def Get_ACC():
    model.eval()
    correct = 0
    total_num = len(dataset_test)
    with torch.no_grad():  # no gradients are needed for evaluation
        for item in test_loader:
            batch_imgs, batch_labels = item
            batch_imgs = batch_imgs.squeeze(1)  # (batch, 1, 28, 28) -> (batch, 28, 28)
            batch_imgs = batch_imgs.to(device)
            batch_labels = batch_labels.to(device)
            out = model(batch_imgs)
            _, pred = torch.max(out, 1)
            correct += torch.sum(pred == batch_labels).item()
            # print(pred)
            # print(batch_labels)
    acc = correct / total_num
    print('correct={},Test ACC:{:.5}'.format(correct, acc))
    model.train()


optimizer = torch.optim.Adam(model.parameters())
loss_f = nn.CrossEntropyLoss()
Get_ACC()  # accuracy before training, as a baseline
for epoch in range(5):
    print('epoch:{}'.format(epoch))
    cnt = 0
    for item in train_loader:
        batch_imgs, batch_labels = item
        batch_imgs = batch_imgs.squeeze(1)
        # print(batch_imgs.shape)
        batch_imgs = batch_imgs.to(device)
        batch_labels = batch_labels.to(device)
        out = model(batch_imgs)
        # print(out.shape)
        loss = loss_f(out, batch_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if cnt % 100 == 0:
            print_loss = loss.item()
            print('epoch:{},cnt:{},loss:{}'.format(epoch, cnt, print_loss))
        cnt += 1
    Get_ACC()
torch.save(model, 'model')
RNN4:
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms
from matplotlib import pyplot as plt

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class RNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(
            input_size=28,
            hidden_size=128,
            num_layers=1,
            batch_first=True,
        )
        # nn.ModuleList registers the 28 layers as sub-modules, so the optimizer sees their
        # parameters and model.to(device) moves them; a plain Python list does neither.
        self.hidden2one_list = nn.ModuleList([nn.Linear(128, 1) for _ in range(28)])
        self.Out2Class = nn.Linear(28, 10)

    def forward(self, input):
        output, hn = self.rnn(input, None)
        hidden2one_res = []
        for i in range(28):
            tmp_res = self.hidden2one_list[i](output[:, i, :])
            # print(tmp_res.shape)
            # Append the tensor itself, not tmp_res.data, so gradients can flow back to the RNN.
            hidden2one_res.append(tmp_res)
        # Alternatively, squeeze(1) each element first and use torch.stack(..., dim=1).
        hidden2one_res = torch.cat(hidden2one_res, dim=1)
        # print(hidden2one_res.shape) #torch.Size([64, 28])
        res = self.Out2Class(hidden2one_res)
        return res


model = RNN()
model = model.to(device)
print(model)
model = model.train()

# MNIST images have a single channel, so mean and std take one value each.
img_transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize(mean=[0.5], std=[0.5])])
dataset_train = datasets.MNIST(root='./data', transform=img_transform, train=True, download=True)
dataset_test = datasets.MNIST(root='./data', transform=img_transform, train=False, download=True)
train_loader = torch.utils.data.DataLoader(dataset=dataset_train, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=dataset_test, batch_size=64, shuffle=False)

# images,label = next(iter(train_loader))
# print(images.shape)
# print(label.shape)
# images_example = torchvision.utils.make_grid(images)
# images_example = images_example.numpy().transpose(1,2,0)
# mean = [0.5,0.5,0.5]
# std = [0.5,0.5,0.5]
# images_example = images_example*std + mean
# plt.imshow(images_example)
# plt.show()


def Get_ACC():
    model.eval()
    correct = 0
    total_num = len(dataset_test)
    with torch.no_grad():  # no gradients are needed for evaluation
        for item in test_loader:
            batch_imgs, batch_labels = item
            batch_imgs = batch_imgs.squeeze(1)  # (batch, 1, 28, 28) -> (batch, 28, 28)
            batch_imgs = batch_imgs.to(device)
            batch_labels = batch_labels.to(device)
            out = model(batch_imgs)
            _, pred = torch.max(out, 1)
            correct += torch.sum(pred == batch_labels).item()
            # print(pred)
            # print(batch_labels)
    acc = correct / total_num
    print('correct={},Test ACC:{:.5}'.format(correct, acc))
    model.train()


optimizer = torch.optim.Adam(model.parameters())
loss_f = nn.CrossEntropyLoss()
Get_ACC()  # accuracy before training, as a baseline
for epoch in range(5):
    print('epoch:{}'.format(epoch))
    cnt = 0
    for item in train_loader:
        batch_imgs, batch_labels = item
        batch_imgs = batch_imgs.squeeze(1)
        # print(batch_imgs.shape)
        batch_imgs = batch_imgs.to(device)
        batch_labels = batch_labels.to(device)
        out = model(batch_imgs)
        # print(out.shape)
        loss = loss_f(out, batch_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if cnt % 100 == 0:
            print_loss = loss.item()
            print('epoch:{},cnt:{},loss:{}'.format(epoch, cnt, print_loss))
        cnt += 1
    Get_ACC()
torch.save(model, 'model')
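Two PyTorch details matter for a design like RNN4, where many small layers are built in a loop and their results are collected in a list. The sketch below illustrates both in isolation; the class names `PlainList` and `Registered` are hypothetical and exist only for this comparison.

```python
import torch
import torch.nn as nn


class PlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers held in a plain Python list are invisible to parameters() and to .to(device).
        self.layers = [nn.Linear(128, 1) for _ in range(28)]


class Registered(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every layer as a sub-module.
        self.layers = nn.ModuleList([nn.Linear(128, 1) for _ in range(28)])


plain_count = sum(p.numel() for p in PlainList().parameters())
registered_count = sum(p.numel() for p in Registered().parameters())
print(plain_count, registered_count)

# .data returns a tensor that is cut off from the autograd graph.
x = torch.randn(4, 128, requires_grad=True)
layer = nn.Linear(128, 1)
detached = layer(x).data
attached = layer(x)
print(detached.requires_grad, attached.requires_grad)
```

If either issue is present, the optimizer can only update the layers that come after the break in the graph, which for RNN4 would be the last linear layer alone.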
RNN5:
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms
from matplotlib import pyplot as plt

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class RNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(
            input_size=28,
            hidden_size=128,
            num_layers=1,
            batch_first=True,
        )
        self.Out2Class = nn.Linear(128 * 28, 10)

    def forward(self, input):
        output, hn = self.rnn(input, None)
        # Collect the 128-dimensional output of each of the 28 time steps
        # and concatenate them into one (batch, 128 * 28) tensor.
        hidden2one_res = []
        for i in range(28):
            hidden2one_res.append(output[:, i, :])
        hidden2one_res = torch.cat(hidden2one_res, dim=1)
        # print(hidden2one_res.shape)
        res = self.Out2Class(hidden2one_res)
        return res


model = RNN()
model = model.to(device)
print(model)
model = model.train()

# MNIST images have a single channel, so mean and std take one value each.
img_transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize(mean=[0.5], std=[0.5])])
dataset_train = datasets.MNIST(root='./data', transform=img_transform, train=True, download=True)
dataset_test = datasets.MNIST(root='./data', transform=img_transform, train=False, download=True)
train_loader = torch.utils.data.DataLoader(dataset=dataset_train, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=dataset_test, batch_size=64, shuffle=False)

# images,label = next(iter(train_loader))
# print(images.shape)
# print(label.shape)
# images_example = torchvision.utils.make_grid(images)
# images_example = images_example.numpy().transpose(1,2,0)
# mean = [0.5,0.5,0.5]
# std = [0.5,0.5,0.5]
# images_example = images_example*std + mean
# plt.imshow(images_example)
# plt.show()


def Get_ACC():
    model.eval()
    correct = 0
    total_num = len(dataset_test)
    with torch.no_grad():  # no gradients are needed for evaluation
        for item in test_loader:
            batch_imgs, batch_labels = item
            batch_imgs = batch_imgs.squeeze(1)  # (batch, 1, 28, 28) -> (batch, 28, 28)
            batch_imgs = batch_imgs.to(device)
            batch_labels = batch_labels.to(device)
            out = model(batch_imgs)
            _, pred = torch.max(out, 1)
            correct += torch.sum(pred == batch_labels).item()
            # print(pred)
            # print(batch_labels)
    acc = correct / total_num
    print('correct={},Test ACC:{:.5}'.format(correct, acc))
    model.train()


optimizer = torch.optim.Adam(model.parameters())
loss_f = nn.CrossEntropyLoss()
Get_ACC()  # accuracy before training, as a baseline
for epoch in range(5):
    print('epoch:{}'.format(epoch))
    cnt = 0
    for item in train_loader:
        batch_imgs, batch_labels = item
        batch_imgs = batch_imgs.squeeze(1)
        # print(batch_imgs.shape)
        batch_imgs = batch_imgs.to(device)
        batch_labels = batch_labels.to(device)
        out = model(batch_imgs)
        # print(out.shape)
        loss = loss_f(out, batch_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if cnt % 100 == 0:
            print_loss = loss.item()
            print('epoch:{},cnt:{},loss:{}'.format(epoch, cnt, print_loss))
        cnt += 1
    Get_ACC()
torch.save(model, 'model')
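The loop plus torch.cat in the RNN5 forward pass can also be written as a single reshape. A minimal sketch, assuming a random tensor in place of the real RNN output, that checks the two forms give the same result:

```python
import torch

output = torch.randn(64, 28, 128)  # stand-in for the RNN output: (batch, seq_len, hidden_size)

# Loop and concatenate, as in the RNN5 forward pass.
pieces = [output[:, i, :] for i in range(28)]
concatenated = torch.cat(pieces, dim=1)  # (64, 28 * 128)

# One reshape that keeps the batch dimension and flattens the rest.
flattened = output.reshape(output.size(0), -1)

identical = torch.equal(concatenated, flattened)
print(concatenated.shape, identical)
```

Whether this makes a measurable difference to the training speed has not been tested here; it mainly shortens the code.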