PyTorch is a rising deep learning framework whose adoption keeps climbing. Compared with TensorFlow it is easier to pick up: it supports dynamic graph definition and converts conveniently between tensors and NumPy arrays, which makes networks with unusual structures easier to define. On the downside, its support for things like distributed training is relatively weak, and it has no built-in visualization tool comparable to TensorBoard. TensorFlow, for its part, can be paired with a framework like Keras to greatly simplify the work of building networks.
PyTorch has an excellent set of official tutorials at https://pytorch.org/tutorials/, covering everything from basic operations to image classification, semantic analysis, reinforcement learning, and this year's wildly popular GANs, all explained very clearly. This post mainly follows the CIFAR-10 tutorial at https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html, with a few improvements to the network structure, as an exercise in using PyTorch.
Following the official steps, the first thing is to load the CIFAR-10 dataset through the torchvision library:
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
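As a side note on the transform above: Normalize maps each channel as (x - mean) / std, so with mean = std = 0.5 the [0, 1] range produced by ToTensor becomes [-1, 1]. A quick sanity check of that arithmetic (plain Python, no dataset download needed):

```python
def normalize(x, mean=0.5, std=0.5):
    # same per-channel formula that transforms.Normalize applies
    return (x - mean) / std

print(normalize(0.0))  # -1.0: the darkest pixel value
print(normalize(1.0))  #  1.0: the brightest pixel value
print(normalize(0.5))  #  0.0: mid-gray lands at zero
```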
Operations like viewing sample images are covered in the official tutorial, so they are skipped here.
The second step is to define the convolutional neural network. The network the official tutorial uses as an example is:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
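The 16 * 5 * 5 fed into fc1 comes from tracing the spatial size through the layers: each unpadded 5×5 convolution shrinks a side by 4, and each 2×2 max pool halves it, so 32 → 28 → 14 → 10 → 5. A small sketch of that arithmetic:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # standard output-size formula shared by Conv2d and MaxPool2d
    return (size + 2 * padding - kernel) // stride + 1

s = 32                        # CIFAR-10 images are 32x32
s = conv_out(s, 5)            # conv1: 28
s = conv_out(s, 2, stride=2)  # pool:  14
s = conv_out(s, 5)            # conv2: 10
s = conv_out(s, 2, stride=2)  # pool:   5
print(16 * s * s)             # 400 = 16 * 5 * 5 features entering fc1
```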
This network is convolutions followed by fully connected layers. A structure like this doesn't actually perform that well: fully connected layers pass information inefficiently and can disturb the local features the convolutional layers extract, and the example uses neither BatchNorm nor Dropout to guard against overfitting. Popular architectures today are mostly fully convolutional; the structure below works much better:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, 3, padding=1)
        self.conv2 = nn.Conv2d(64, 64, 3, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)
        self.conv4 = nn.Conv2d(128, 128, 3, padding=1)
        self.conv5 = nn.Conv2d(128, 256, 3, padding=1)
        self.conv6 = nn.Conv2d(256, 256, 3, padding=1)
        self.maxpool = nn.MaxPool2d(2, 2)
        self.avgpool = nn.AvgPool2d(2, 2)
        self.globalavgpool = nn.AvgPool2d(8, 8)
        self.bn1 = nn.BatchNorm2d(64)
        self.bn2 = nn.BatchNorm2d(128)
        self.bn3 = nn.BatchNorm2d(256)
        self.dropout50 = nn.Dropout(0.5)
        self.dropout10 = nn.Dropout(0.1)
        self.fc = nn.Linear(256, 10)

    def forward(self, x):
        x = self.bn1(F.relu(self.conv1(x)))
        x = self.bn1(F.relu(self.conv2(x)))
        x = self.maxpool(x)
        x = self.dropout10(x)
        x = self.bn2(F.relu(self.conv3(x)))
        x = self.bn2(F.relu(self.conv4(x)))
        x = self.avgpool(x)
        x = self.dropout10(x)
        x = self.bn3(F.relu(self.conv5(x)))
        x = self.bn3(F.relu(self.conv6(x)))
        x = self.globalavgpool(x)
        x = self.dropout50(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

net = Net()
PyTorch can also define sequential networks very simply with nn.Sequential, much like Keras's Sequential, except that PyTorch needs the input and output sizes of every layer spelled out, so it isn't quite as effortless as Keras. Unlike Keras, PyTorch has no layer literally named GlobalAveragePooling (though nn.AdaptiveAvgPool2d can serve the same purpose), so I built one by hand above with AvgPool2d for fear of forgetting how. Honestly the result might be better without it, since it amounts to forcibly compressing the feature data right before classification.
Here is another basic example of nn.Sequential:
import torch.nn as nn

channel_1 = 32
channel_2 = 16

class Flatten(nn.Module):
    # flattens (N, C, H, W) into (N, C*H*W); newer PyTorch versions ship nn.Flatten for this
    def forward(self, x):
        return x.view(x.size(0), -1)

model = nn.Sequential(
    nn.Conv2d(3, channel_1, 5, padding=2),
    nn.ReLU(),
    nn.Conv2d(channel_1, channel_2, 3, padding=1),
    nn.ReLU(),
    Flatten(),
    nn.Linear(channel_2 * 32 * 32, 10),
)
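The channel_2 * 32 * 32 in the final Linear works because both convolutions above are padded to preserve the 32×32 spatial size. Checking that with the usual Conv2d output-size formula:

```python
# Conv2d output size: (size + 2*padding - kernel) // stride + 1
print((32 + 2 * 2 - 5) // 1 + 1)  # 32: the 5x5 conv with padding=2 keeps 32x32
print((32 + 2 * 1 - 3) // 1 + 1)  # 32: the 3x3 conv with padding=1 keeps 32x32
print(16 * 32 * 32)               # 16384 features into the Linear when channel_2 = 16
```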
The third step is to define the loss function and the optimizer. The official tutorial uses SGD with momentum here, but personally I find Adam optimizes complicated functions better than SGD, so I use Adam instead:
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)
Next, the fourth step is training the network. You can first use the following statements to decide automatically whether to compute on the GPU or the CPU; generally speaking, on this kind of workload a GPU can easily be dozens of times faster than a CPU of a comparable tier...
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device)
Now training can begin. Note that the training data also has to be moved with .to(device):
for epoch in range(10):
    batch_size = 100
    for i, data in enumerate(
            torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                        shuffle=True, num_workers=2), 0):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        print('[%d, %5d] loss: %.4f' %
              (epoch + 1, (i + 1) * batch_size, loss.item()))

print('Finished Training')
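The loop above is the canonical PyTorch training cycle: zero_grad, forward pass, backward, step. The same pattern on a toy least-squares problem (learning y = 2x + 1 with a single Linear layer, as a self-contained sketch):

```python
import torch
import torch.nn as nn
import torch.optim as optim

# toy data: y = 2x + 1
x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 2 * x + 1

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.1)

first_loss = None
for step in range(200):
    optimizer.zero_grad()           # clear gradients left over from the last step
    loss = criterion(model(x), y)   # forward pass
    loss.backward()                 # backpropagate
    optimizer.step()                # update the parameters
    if first_loss is None:
        first_loss = loss.item()

print(loss.item() < first_loss)     # the loss has dropped from its starting value
```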
Granted, PyTorch isn't as convenient as Keras's one-line .fit, but it isn't much trouble either. Since computation doesn't have to run inside a session, it is far more flexible than TensorFlow or the tightly sealed Keras.
Afterwards, the trained model can be saved or loaded with the following statements:
torch.save(net, 'cifar10.pkl')
net = torch.load('cifar10.pkl')
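torch.save(net, ...) pickles the entire module, which ties the saved file to this exact class definition. The usually recommended alternative is to save only the state_dict and rebuild the architecture before loading. A minimal sketch of that pattern, using a small stand-in model and a hypothetical filename rather than the CIFAR-10 net itself:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                              # stand-in for the real network

torch.save(model.state_dict(), 'cifar10_state.pkl')  # save weights only

restored = nn.Linear(4, 2)                           # rebuild the architecture first
restored.load_state_dict(torch.load('cifar10_state.pkl'))

# the restored copy computes exactly the same outputs
x = torch.randn(3, 4)
print(torch.equal(model(x), restored(x)))            # True
```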
Once training is done, the test set shows how well the network learned:
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images, labels = images.to(device), labels.to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
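The `_, predicted = torch.max(outputs, 1)` line works because torch.max with a dim argument returns both the maximum values and their indices; the indices along the class dimension are the predicted labels. A tiny illustration with made-up logits:

```python
import torch

logits = torch.tensor([[0.1, 2.0, -1.0],
                       [1.5, 0.3,  0.9]])
values, predicted = torch.max(logits, 1)  # max over the class dimension

print(values)     # tensor([2.0000, 1.5000]): largest logit per row
print(predicted)  # tensor([1, 0]): argmax per row, i.e. the predicted classes
```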
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images, labels = images.to(device), labels.to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):  # testloader was created with batch_size=4
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
Trained for 10 epochs with batch_size=100, this model reaches roughly 80% accuracy on the test set, which is respectable for CIFAR-10.
The source code is on GitHub, feel free to grab it: https://github.com/PolarisShi/cifar10