In the previous post we built a BP neural network with PyTorch; this time we build LeNet, a classic CNN, again on the MNIST dataset. How to obtain the data is not covered in detail here — this post focuses only on building the model and training it. LeNet was proposed by Yann LeCun, one of the three giants of deep learning and the father of convolutional neural networks (CNN, Convolutional Neural Networks). It was first applied to handwritten digit recognition, where it works well. The network consists mainly of convolutional, pooling, and fully connected layers; the concepts themselves are not explained here (they may be covered in later posts), only how to assemble the model.
(Figure: the LeNet architecture, cropped from the original paper.)
From the network architecture you can see that the model takes a 32*32 input and produces a 1*10 output. Layer by layer:
| Layer | Input | Kernel size | Kernels | Output |
| --- | --- | --- | --- | --- |
| Input | 32*32*1 | / | / | / |
| C1 (convolution) | 32*32*1 | 5*5 | 6 | 28*28*6 |
| S2 (pooling) | 28*28*6 | 2*2 | / | 14*14*6 |
| C3 (convolution) | 14*14*6 | 5*5 | 16 | 10*10*16 |
| S4 (pooling) | 10*10*16 | 2*2 | / | 5*5*16 |
| C5 (fully connected) | 5*5*16 | / | / | 1*120 |
| C6 (fully connected) | 1*120 | / | / | 1*84 |
| Output | 1*84 | / | / | 1*10 |
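Every entry in the output column follows the standard convolution/pooling size formula: output = (input - kernel + 2 * padding) / stride + 1, where the pooling layers use a stride equal to their kernel size. Here is a quick sketch to verify the table (the helper `out_size` is mine, for illustration only):

```python
# Output-size formula for convolution and pooling layers.
def out_size(n, kernel, stride=1, padding=0):
    return (n - kernel + 2 * padding) // stride + 1

print(out_size(32, 5))            # C1: (32 - 5) + 1 = 28
print(out_size(28, 2, stride=2))  # S2: 28 / 2 = 14
print(out_size(14, 5))            # C3: 10
print(out_size(10, 2, stride=2))  # S4: 5
```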
Building this structure with PyTorch:
```python
import torch
from torch import nn


class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        # C1: 1 input channel -> 6 feature maps, 5x5 kernels: 32x32 -> 28x28
        self.c1 = nn.Sequential(
            nn.Conv2d(1, 6, (5, 5), stride=1, padding=0),
            nn.ReLU()
        )
        # S2: 2x2 max pooling (stride defaults to kernel size): 28x28 -> 14x14
        self.s2 = nn.MaxPool2d((2, 2), padding=0)
        # C3: 6 -> 16 feature maps, 5x5 kernels: 14x14 -> 10x10
        self.c3 = nn.Sequential(
            nn.Conv2d(6, 16, (5, 5), stride=1, padding=0),
            nn.ReLU()
        )
        # S4: 2x2 max pooling: 10x10 -> 5x5
        self.s4 = nn.MaxPool2d((2, 2), padding=0)
        # C5: fully connected, 5*5*16 = 400 -> 120
        self.c5 = nn.Sequential(
            nn.Linear(5 * 5 * 16, 120),
            nn.ReLU()
        )
        # C6: fully connected, 120 -> 84
        self.c6 = nn.Sequential(
            nn.Linear(120, 84),
            nn.ReLU()
        )
        # Output layer: 84 -> 10 classes
        self.out = nn.Sequential(
            nn.Linear(84, 10),
            nn.Sigmoid()
        )

    def forward(self, x):
        x = self.c1(x)
        print(x.shape)
        x = self.s2(x)
        print(x.shape)
        x = self.c3(x)
        print(x.shape)
        x = self.s4(x)
        print(x.shape)
        # Flatten the 16 5x5 feature maps into a 400-dim vector for the FC layers.
        x = x.view(-1, 5 * 5 * 16)
        x = self.c5(x)
        print(x.shape)
        x = self.c6(x)
        print(x.shape)
        x = self.out(x)
        print(x.shape)
        return x


inp = torch.randn(1, 1, 32, 32)
le = LeNet()
out = le(inp)
```
Running this prints the shape after each layer:
```text
torch.Size([1, 6, 28, 28])
torch.Size([1, 6, 14, 14])
torch.Size([1, 16, 10, 10])
torch.Size([1, 16, 5, 5])
torch.Size([1, 120])
torch.Size([1, 84])
torch.Size([1, 10])
```
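As an extra sanity check (not in the original post), the trainable parameter count can be read off with standard `nn.Module` calls; for this architecture it works out to 61,706:

```python
# Sum the sizes of all trainable tensors registered on the module.
total = sum(p.numel() for p in le.parameters() if p.requires_grad)
print(total)  # 61706 = 156 (C1) + 2416 (C3) + 48120 (C5) + 10164 (C6) + 850 (out)
```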
Next we train the model on the dataset. Apart from the network itself, everything follows the previous section, so I won't repeat the details; here is the complete code.
```python
import os

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Fall back to CPU when no GPU is available.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')


class Config:
    batch_size = 128
    epoch = 10
    alpha = 1e-3          # learning rate
    print_per_step = 100  # how often to log progress


class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.c1 = nn.Sequential(
            nn.Conv2d(1, 6, (5, 5), stride=1, padding=0),
            nn.ReLU()
        )
        self.s2 = nn.MaxPool2d((2, 2), padding=0)
        self.c3 = nn.Sequential(
            nn.Conv2d(6, 16, (5, 5), stride=1, padding=0),
            nn.ReLU()
        )
        self.s4 = nn.MaxPool2d((2, 2), padding=0)
        self.c5 = nn.Sequential(
            nn.Linear(5 * 5 * 16, 120),
            nn.ReLU()
        )
        self.c6 = nn.Sequential(
            nn.Linear(120, 84),
            nn.ReLU()
        )
        # Sigmoid bounds the outputs to (0, 1); combined with CrossEntropyLoss
        # this floors the loss near 1.46, which is why the training log below
        # plateaus around that value.
        self.out = nn.Sequential(
            nn.Linear(84, 10),
            nn.Sigmoid()
        )

    def forward(self, x):
        x = self.c1(x)
        x = self.s2(x)
        x = self.c3(x)
        x = self.s4(x)
        x = x.view(-1, 5 * 5 * 16)
        x = self.c5(x)
        x = self.c6(x)
        x = self.out(x)
        return x


class TrainProcess:
    def __init__(self):
        self.train, self.test = self.load_data()
        self.net = LeNet().to(device)
        self.criterion = nn.CrossEntropyLoss()  # loss function
        self.optimizer = optim.Adam(self.net.parameters(), lr=Config.alpha)

    @staticmethod
    def load_data():
        # MNIST images are 28x28; resize to the 32x32 input LeNet expects.
        transform = transforms.Compose([
            transforms.Resize((32, 32)),
            transforms.ToTensor()
        ])
        train_data = datasets.MNIST(root='./data/',
                                    train=True,
                                    transform=transform,
                                    download=True)
        test_data = datasets.MNIST(root='./data/',
                                   train=False,
                                   transform=transform,
                                   download=True)
        # DataLoader yields mini-batches; shuffle controls whether the
        # sample order is randomized each epoch.
        train_loader = torch.utils.data.DataLoader(dataset=train_data,
                                                   batch_size=Config.batch_size,
                                                   shuffle=True)
        test_loader = torch.utils.data.DataLoader(dataset=test_data,
                                                  batch_size=Config.batch_size,
                                                  shuffle=False)
        return train_loader, test_loader

    def train_step(self):
        print("Training & Evaluating based on LeNet......")
        os.makedirs('./result', exist_ok=True)
        fp = open('./result/train_mnist.txt', 'w', encoding='utf-8')
        fp.write('epoch\tbatch\tloss\taccuracy\n')
        for epoch in range(Config.epoch):
            print("Epoch {:3}.".format(epoch + 1))
            for batch_idx, (data, label) in enumerate(self.train):
                data, label = data.to(device), label.to(device)
                self.optimizer.zero_grad()
                outputs = self.net(data)
                loss = self.criterion(outputs, label)
                loss.backward()
                self.optimizer.step()
                # Log every print_per_step batches.
                if batch_idx % Config.print_per_step == 0:
                    _, predicted = torch.max(outputs, 1)
                    correct = (predicted == label).sum().item()
                    # Divide by the actual batch size: the last batch may be smaller.
                    accuracy = correct / label.size(0)
                    msg = "Batch: {:5}, Loss: {:6.2f}, Accuracy: {:8.2%}."
                    print(msg.format(batch_idx, loss.item(), accuracy))
                    fp.write('{}\t{}\t{}\t{}\n'.format(epoch, batch_idx, loss.item(), accuracy))
        fp.close()

        # Evaluate on the test set without tracking gradients.
        test_loss = 0.
        test_correct = 0
        self.net.eval()
        with torch.no_grad():
            for data, label in self.test:
                data, label = data.to(device), label.to(device)
                outputs = self.net(data)
                loss = self.criterion(outputs, label)
                test_loss += loss.item() * label.size(0)
                _, predicted = torch.max(outputs, 1)
                test_correct += (predicted == label).sum().item()
        accuracy = test_correct / len(self.test.dataset)
        loss = test_loss / len(self.test.dataset)
        print("Test Loss: {:5.2f}, Accuracy: {:6.2%}".format(loss, accuracy))

        torch.save(self.net.state_dict(), './result/raw_train_mnist_model.pth')


if __name__ == "__main__":
    p = TrainProcess()
    p.train_step()
```
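After training, the saved weights can be reloaded for inference. Here is a minimal sketch (not in the original post), assuming the `LeNet` class above is in scope and the paths match the training script:

```python
import torch
from torchvision import datasets, transforms

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = LeNet().to(device)
model.load_state_dict(torch.load('./result/raw_train_mnist_model.pth', map_location=device))
model.eval()  # switch off training-specific behavior

tf = transforms.Compose([transforms.Resize((32, 32)), transforms.ToTensor()])
test_data = datasets.MNIST(root='./data/', train=False, transform=tf)
img, label = test_data[0]
with torch.no_grad():
    # Add a batch dimension, run the network, take the most likely class.
    pred = model(img.unsqueeze(0).to(device)).argmax(dim=1).item()
print('predicted:', pred, 'true label:', label)
```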
One point worth emphasizing: MNIST images are 28*28, while this LeNet expects 32*32 inputs, so the images have to be resized when the data is loaded:

```python
transforms.Compose([transforms.Resize((32, 32)), transforms.ToTensor()])
```
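To see what the composed transform produces, here is a small check (illustrative; it uses the same dataset path as the training script). Each sample should come out as a 1*32*32 float tensor:

```python
from torchvision import datasets, transforms

tf = transforms.Compose([transforms.Resize((32, 32)), transforms.ToTensor()])
ds = datasets.MNIST(root='./data/', train=False, transform=tf, download=True)
img, lbl = ds[0]
print(img.shape)  # torch.Size([1, 32, 32])
```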
This step resizes the images and converts them to Tensors. The training part can be treated as a fixed template or workflow that carries over to most routine problems of this kind. The training output:

```text
Training & Evaluating based on LeNet......
Epoch 1.
Batch: 0, Loss: 2.30, Accuracy: 8.59%.
Batch: 100, Loss: 1.62, Accuracy: 80.47%.
Batch: 200, Loss: 1.57, Accuracy: 89.06%.
Batch: 300, Loss: 1.56, Accuracy: 90.62%.
Batch: 400, Loss: 1.54, Accuracy: 92.97%.
Epoch 2.
Batch: 0, Loss: 1.52, Accuracy: 92.97%.
Batch: 100, Loss: 1.50, Accuracy: 96.88%.
Batch: 200, Loss: 1.52, Accuracy: 89.06%.
Batch: 300, Loss: 1.50, Accuracy: 92.97%.
Batch: 400, Loss: 1.49, Accuracy: 95.31%.
Epoch 3.
Batch: 0, Loss: 1.50, Accuracy: 94.53%.
Batch: 100, Loss: 1.50, Accuracy: 97.66%.
Batch: 200, Loss: 1.48, Accuracy: 96.88%.
Batch: 300, Loss: 1.50, Accuracy: 95.31%.
Batch: 400, Loss: 1.48, Accuracy: 97.66%.
Epoch 4.
Batch: 0, Loss: 1.47, Accuracy: 96.88%.
Batch: 100, Loss: 1.48, Accuracy: 96.88%.
Batch: 200, Loss: 1.49, Accuracy: 97.66%.
Batch: 300, Loss: 1.48, Accuracy: 98.44%.
Batch: 400, Loss: 1.47, Accuracy: 96.88%.
Epoch 5.
Batch: 0, Loss: 1.47, Accuracy: 100.00%.
Batch: 100, Loss: 1.49, Accuracy: 96.88%.
Batch: 200, Loss: 1.48, Accuracy: 97.66%.
Batch: 300, Loss: 1.48, Accuracy: 96.88%.
Batch: 400, Loss: 1.48, Accuracy: 96.88%.
Epoch 6.
Batch: 0, Loss: 1.47, Accuracy: 98.44%.
Batch: 100, Loss: 1.47, Accuracy: 98.44%.
Batch: 200, Loss: 1.47, Accuracy: 99.22%.
Batch: 300, Loss: 1.47, Accuracy: 99.22%.
Batch: 400, Loss: 1.49, Accuracy: 98.44%.
Epoch 7.
Batch: 0, Loss: 1.48, Accuracy: 97.66%.
Batch: 100, Loss: 1.49, Accuracy: 97.66%.
Batch: 200, Loss: 1.47, Accuracy: 99.22%.
Batch: 300, Loss: 1.48, Accuracy: 98.44%.
Batch: 400, Loss: 1.48, Accuracy: 97.66%.
Epoch 8.
Batch: 0, Loss: 1.48, Accuracy: 98.44%.
Batch: 100, Loss: 1.48, Accuracy: 98.44%.
Batch: 200, Loss: 1.47, Accuracy: 100.00%.
Batch: 300, Loss: 1.48, Accuracy: 98.44%.
Batch: 400, Loss: 1.47, Accuracy: 99.22%.
Epoch 9.
Batch: 0, Loss: 1.46, Accuracy: 99.22%.
Batch: 100, Loss: 1.46, Accuracy: 100.00%.
Batch: 200, Loss: 1.49, Accuracy: 96.09%.
Batch: 300, Loss: 1.47, Accuracy: 97.66%.
Batch: 400, Loss: 1.46, Accuracy: 100.00%.
Epoch 10.
Batch: 0, Loss: 1.47, Accuracy: 98.44%.
Batch: 100, Loss: 1.47, Accuracy: 99.22%.
Batch: 200, Loss: 1.47, Accuracy: 98.44%.
Batch: 300, Loss: 1.47, Accuracy: 99.22%.
Batch: 400, Loss: 1.47, Accuracy: 100.00%.
Test Loss: 1.49, Accuracy: 98.39%
```

An improvement over the BP network from the previous post.