王世豪,男,西安工程大学电子信息学院,2020级硕士研究生,张宏伟人工智能课题组。
研究方向:机器视觉与人工智能。
电子邮件:[email protected]
CIFAR-10数据集由10个类的60000个32x32彩色图像组成,每个类有6000个图像。有50000个训练图像和10000个测试图像。可点击下载或者从百度云下载。
链接:https://pan.baidu.com/s/1GG0c91T5E92WbJS4bueOzg
提取码:2021
50000个训练图像和10000个测试图像分别分为5个训练批次和1个测试批次,每个批次有10000个图像。
数据集中的10类分别是airplane(飞机),automobile(汽车),bird(鸟),cat(猫),deer(鹿),dog(狗),frog(青蛙),horse(马),ship(船)和truck(卡车),其中没有任何的重叠情况,即airplane只包括飞机,automobile只包括小型汽车,也不会在同一张照片中出现两类事物。以下是来自每个类的10个随机图像:
可更改epoch_num以及其他参数进行测试,运行前注意修改数据集路径为自己下载的数据集路径。
import torchvision #可用来加载数据集
import torch
import torchvision.transforms as transforms #实现图片变换处理
from torch import optim
from torch.autograd import Variable
import time
# 检验GPU是否可用
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("训练所用设备", device)
# 定义超参数
epoch_num = 10 # 训练循环次数
batch_size = 100
LR = 0.001 # 学习率
#使用torchvision加载并预处理CIFAR10数据集
transform = transforms.Compose([transforms.ToTensor(),transforms.Normalize(mean = (0.5,0.5,0.5),std = (0.5,0.5,0.5))])#把数据变为tensor并且归一化range [0, 255] -> [0.0,1.0]
trainset = torchvision.datasets.CIFAR10(root='D:/Resource/Datasets/cifar-10-python/',train = True,download=True,transform=transform)
trainloader = torch.utils.data.DataLoader(trainset,batch_size=batch_size,shuffle=True,num_workers=0)
testset = torchvision.datasets.CIFAR10('D:/Resource/Datasets/cifar-10-python/',train=False,download=True,transform=transform)
testloader = torch.utils.data.DataLoader(testset,batch_size=batch_size,shuffle=True,num_workers=0)
classes = ('plane','car','bird','cat','deer','dog','frog','horse','ship','truck')
#定义网络
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
def __init__(self):
super(Net,self).__init__()
self.conv1 = nn.Conv2d(3,6,5)
self.conv2 = nn.Conv2d(6,16,5)
self.fc1 = nn.Linear(16*5*5,120)
self.fc2 = nn.Linear(120,84)
self.fc3 = nn.Linear(84,10)
def forward(self,x):
x = F.max_pool2d(F.relu(self.conv1(x)),(2,2))
x = F.max_pool2d(F.relu(self.conv2(x)),2)
x = x.view(x.size()[0],-1)
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
net = Net().to(device)
# print(net)
#定义损失函数和优化器
criterion = nn.CrossEntropyLoss()#定义交叉熵损失函数
optimizer = optim.SGD(net.parameters(),lr = LR,momentum=0.9)
#训练网络
t0 = time.time()
for epoch in range(epoch_num):
running_loss = 0.0
for i, data in enumerate(trainloader, 0):#enumerate将其组成一个索引序列,利用它可以同时获得索引和值,enumerate还可以接收第二个参数,用于指定索引起始值
train_num = 0
inputs, labels = data
inputs, labels = Variable(inputs).to(device), Variable(labels).to(device)
optimizer.zero_grad()
outputs = net(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
running_loss += loss.item()
train_num += i
train_loss = running_loss / train_num
print('epoch {} train_loss : {}'.format(epoch+1, train_loss))
print("----------finished training---------")
# 打印训练所花费的时间
t1=time.time()
T=t1-t0
print('The training time took %.2f'%(T)+' s.')
# 对比实际标签与预测标签
dataiter = iter(testloader)
images, labels = dataiter.next()
images, labels = images.to(device), labels.to(device)
print('实际的label: ',' '.join('%08s'%classes[labels[j]] for j in range(4)))
outputs = net(Variable(images))
_, predicted = torch.max(outputs.data,1)#返回最大值和其索引
print('预测结果:',' '.join('%5s'%classes[predicted[j]] for j in range(4)))
# 测试集准确率
correct = 0
total = 0
for data in testloader:
images, labels = data
images, labels = images.to(device), labels.to(device)
outputs = net(Variable(images))
_, predicted = torch.max(outputs.data, 1)
total +=labels.size(0)
correct +=(predicted == labels).sum()
print('10000张测试集中的准确率为: %d %%'%(100*correct/total))
在PyCharm或者Windows的终端下定位到分类程序文件所在文件夹,运行该程序即可进行分类。
分类结果示意如下
显然只有两层卷积三层全连接的网络学习能力较差,实现数据集分类比较勉强。
CIFAR-10数据集说明
https://www.cnblogs.com/Jerry-Dong/p/8109938.html
利用卷积神经网络处理CIFAR图像分类
https://zhuanlan.zhihu.com/p/28035475
Pytorch的nn.Conv2d()详解
https://blog.csdn.net/qq_42079689/article/details/102642610
[PyTorch] 基于Python和PyTorch的cifar-10分类
https://blog.csdn.net/qq_41683065/article/details/91368288
PYTHON 中的" %S"%用法
https://www.cnblogs.com/wh-ff-ly520/p/9390855.html