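References:
1. Building and training a network in PyTorch and saving the model (pytorch下搭建网络训练并保存模型) - sjtuxx_lee's blog - CSDN Blog
2. A detailed look at the MNIST dataset (详解 MNIST 数据集) - 闲汉 - cnblogs
3. image - Matplotlib 3.0.3 documentation
4. Introduction to the torchvision library (torchvision库简介) - DrHW - cnblogs
5. Testing PyTorch methods: loss functions (CrossEntropyLoss) (pytorch方法测试——损失函数(CrossEntropyLoss)) - tmk_01's blog - CSDN Blog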
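Loading and saving the image data
First, download the MNIST dataset. The official site may be blocked, so here is a net-disk link (extraction code: r2k7).
Create the folder and its subfolders as specified.
Modify the code in two places:
1. Change root_path; mine is root_path = 'E:/data_JH/pytorch/mnist/'
Also add the location of mnist.npz so that it can be loaded locally from now on:
path = 'E:/data_JH/pytorch/mnist/mnist.npz'
2. Switch to loading from the local file. The changes are as follows: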
def LoadData(root_path, base_path, training_path, test_path):
    ############## local-file version ##############
    f = np.load(path)  # path is the mnist.npz location defined above
    x_train, y_train = f['x_train'], f['y_train']
    x_test, y_test = f['x_test'], f['y_test']
    f.close()
    ################################################
    # (x_train, y_train), (x_test, y_test) = mnist.load_data()  # online-download version
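As a minimal sketch of the local-loading step, the snippet below reads the four arrays from an .npz archive using a with block, so the file is closed even if a key is missing. The helper name load_mnist_npz and the tiny demo archive are made up for illustration; it only assumes the archive uses the same four keys as above.

```python
import os
import tempfile

import numpy as np


def load_mnist_npz(npz_path):
    """Read the train/test arrays from a local .npz archive (hypothetical helper)."""
    # np.load on an .npz file returns an NpzFile, which works as a context manager.
    with np.load(npz_path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)


# Build a tiny stand-in archive so the sketch runs without the real mnist.npz.
demo_path = os.path.join(tempfile.mkdtemp(), 'mnist_demo.npz')
np.savez(demo_path,
         x_train=np.zeros((4, 28, 28), dtype=np.uint8),
         y_train=np.arange(4, dtype=np.uint8),
         x_test=np.zeros((2, 28, 28), dtype=np.uint8),
         y_test=np.arange(2, dtype=np.uint8))

(x_train, y_train), (x_test, y_test) = load_mnist_npz(demo_path)
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
```

With the real file, the call would simply be load_mnist_npz(path), using the path set earlier.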
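Problems encountered and how I solved them:
1. matplotlib has no image module (module 'matplotlib' has no attribute 'image')
I checked the official Matplotlib documentation [3] and found that matplotlib.image.imsave does exist, but that documentation is for version 3.0.3 while my installation is version 2.0.2. My guess was that the package had not been installed completely, so I opened Anaconda Navigator, searched for matplotlib, and clicked Apply on all the related packages.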
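Separately from the installed version, one common cause of this error message is that a bare import matplotlib does not load the image submodule, so matplotlib.image is not yet an attribute of the package. This may or may not have been the cause in the case above. The sketch below imports the submodule explicitly before calling imsave; the helper name save_digit and the output file name are made up for illustration.

```python
import os
import tempfile

import numpy as np
import matplotlib.image as mpimg  # explicit submodule import makes imsave available


def save_digit(img_array, out_path):
    """Write one grayscale image array to disk (hypothetical helper)."""
    mpimg.imsave(out_path, img_array, cmap='gray')
    return out_path


# A synthetic 28x28 gradient stands in for a real MNIST digit.
demo_img = (np.arange(28 * 28) % 256).astype(np.uint8).reshape(28, 28)
saved = save_digit(demo_img, os.path.join(tempfile.mkdtemp(), 'digit_demo.png'))
print(os.path.exists(saved))
```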
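Defining your own Dataset
PyTorch expects training data to be wrapped in a Dataset class, which makes it easy to iterate over (for example with a DataLoader). Here the data saved in the previous step is wrapped in a Dataset subclass. Subclassing Dataset means writing three methods:
1. an initializer (__init__)
2. a method that returns the sample at a given index (__getitem__)
3. a method that returns the number of samples (__len__)
Pay particular attention to converting the labels to LongTensor.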
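The three methods can be sketched as follows. This is not the DataProcessingMnist class from reference 1 (which reads image files listed in text files). It is a simplified in-memory stand-in with the made-up name ArrayDataset, assuming the images are already available as a uint8 NumPy array. The labels become int64 (LongTensor) because nn.CrossEntropyLoss expects class indices of that type.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):
    """Hypothetical in-memory dataset: images (N, H, W) uint8, labels (N,)."""

    def __init__(self, images, labels):
        # Scale pixels to [0, 1] and add a channel dimension: (N, 1, H, W).
        self.images = torch.from_numpy(images).float().div(255.0).unsqueeze(1)
        # int64 NumPy arrays become LongTensor, the type class-index targets need.
        self.labels = torch.from_numpy(labels.astype(np.int64))

    def __getitem__(self, index):
        return self.images[index], self.labels[index]

    def __len__(self):
        return len(self.labels)


demo_images = np.zeros((6, 28, 28), dtype=np.uint8)
demo_labels = np.array([0, 1, 2, 3, 4, 5], dtype=np.uint8)
dataset = ArrayDataset(demo_images, demo_labels)
loader = DataLoader(dataset, batch_size=4, shuffle=False)
for x_batch, y_batch in loader:
    print(x_batch.shape, y_batch.dtype)
```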
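Training the network on MNIST
Preparation:
1. Name the files containing the DataProcessingMnist class and the BuildAlexNet class DataProcessing and BuildModel respectively; otherwise the imports will fail.
2. Change root_path and model_path to the locations of your own files.
3. Replace .data[0] in the code with .item().
Training...
Below is the modified training code together with my own notes; the original version is in reference 1.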
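A small sketch of why step 3 is needed: since PyTorch 0.4 the loss returned by a criterion is a 0-dimensional tensor, and recent releases raise an error when such a tensor is indexed with [0]. Calling .item() returns the value as a plain Python number instead. The logits and targets below are made-up numbers for illustration.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
# Two samples, three classes; values chosen arbitrarily.
logits = torch.tensor([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]], requires_grad=True)
targets = torch.tensor([0, 1])

loss = criterion(logits, targets)
# loss has no dimensions, so there is no element [0] to index.
loss_value = loss.item()  # plain Python float, safe to accumulate or print
print(loss.dim(), type(loss_value).__name__)
```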
import torch
import os
from torchvision import transforms  # image transformation package, see [4]
import torch.optim as optim  # optimization algorithms
from torch.autograd import Variable  # a no-op wrapper since PyTorch 0.4, kept from the original
from torch.utils.data import DataLoader
import DataProcessing as DP
import BuildModel as BM
import torch.nn as nn

if __name__ == '__main__':
    os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'  # environment variable: make GPUs 0 and 1 visible to CUDA
    root_path = 'E:/data_JH/pytorch/mnist/'
    training_path = 'trainingset/'
    test_path = 'testset/'
    model_path = 'E:/data_JH/pytorch/mnist/model/'
    training_imgfile = training_path + 'trainingset_img.txt'
    training_labelfile = training_path + 'trainingset_label.txt'
    training_imgdata = training_path + 'img/'
    test_imgfile = test_path + 'testset_img.txt'
    test_labelfile = test_path + 'testset_label.txt'
    test_imgdata = test_path + 'img/'

    # parameters
    batch_size = 128
    epochs = 20
    model_type = 'pre'
    nclasses = 10  # number of output classes
    lr = 0.01  # learning rate
    use_gpu = torch.cuda.is_available()

    transformations = transforms.Compose(  # chain several transforms together
        [transforms.Resize(256),  # named Scale in older torchvision releases
         transforms.CenterCrop(224),
         transforms.ToTensor(),
         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
         ])

    dataset_train = DP.DataProcessingMnist(root_path, training_imgfile, training_labelfile, training_imgdata, transformations)
    dataset_test = DP.DataProcessingMnist(root_path, test_imgfile, test_labelfile, test_imgdata, transformations)
    num_train, num_test = len(dataset_train), len(dataset_test)
    train_loader = DataLoader(dataset_train, batch_size=batch_size, shuffle=True, num_workers=0)
    test_loader = DataLoader(dataset_test, batch_size=batch_size, shuffle=False, num_workers=0)

    # build model
    model = BM.BuildAlexNet(model_type, nclasses)
    if use_gpu:
        model = model.cuda()  # move the model to the GPU once, before training starts
    optimizer = optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()  # loss function, see [5]

    for epoch in range(epochs):
        epoch_loss = 0
        correct_num = 0
        for i, traindata in enumerate(train_loader):
            x_train, y_train = traindata
            if use_gpu:
                x_train, y_train = Variable(x_train.cuda()), Variable(y_train.cuda())
            else:
                x_train, y_train = Variable(x_train), Variable(y_train)
            y_pre = model(x_train)
            _, label_pre = torch.max(y_pre.data, 1)  # index of the highest score = predicted class
            model.zero_grad()  # clear gradients left over from the previous batch
            loss = criterion(y_pre, y_train)
            loss.backward()  # backpropagate
            optimizer.step()  # update the parameters
            epoch_loss += loss.item()
            correct_num += torch.sum(label_pre == y_train.data)
            acc = (torch.sum(label_pre == y_train.data).float() / len(y_train))
            print('batch loss: {} batch acc: {}'.format(loss.item(), acc.item()))
        print('epoch: {} training loss: {}, training acc: {}'.format(epoch, epoch_loss, correct_num.float() / num_train))

        if (epoch + 1) % 5 == 0:  # evaluate on the test set every 5 epochs
            test_loss = 0
            test_acc_num = 0
            for j, testdata in enumerate(test_loader):
                x_test, y_test = testdata
                if use_gpu:
                    x_test, y_test = Variable(x_test.cuda()), Variable(y_test.cuda())
                else:
                    x_test, y_test = Variable(x_test), Variable(y_test)
                y_pre = model(x_test)
                _, label_pre = torch.max(y_pre.data, 1)
                loss = criterion(y_pre, y_test)
                test_loss += loss.item()
                test_acc_num += torch.sum(label_pre == y_test.data)
            print('epoch: {} test loss: {} test acc: {}'.format(epoch, test_loss, test_acc_num.float() / num_test))

    torch.save(model.state_dict(), model_path + 'AlexNet_params.pkl')  # save only the parameters
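Two follow-ups to the code above, as a hedged sketch rather than part of the original script. The test loop never switches the model to evaluation mode, so any dropout layers (which the standard AlexNet has, and which I assume BuildAlexNet keeps) stay active during testing. Also, parameters saved with state_dict() have to be loaded back into a model of the same structure. The tiny model, the evaluate helper, and the file name below are made up for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    """Tiny stand-in for the real network (hypothetical, not AlexNet)."""
    return nn.Sequential(nn.Flatten(), nn.Dropout(0.5), nn.Linear(28 * 28, 10))


def evaluate(model, batches, criterion):
    """Return (summed loss, accuracy) over an iterable of (x, y) batches."""
    model.eval()  # disable dropout, use running statistics in batch norm
    total_loss, correct, seen = 0.0, 0, 0
    with torch.no_grad():  # no gradient bookkeeping needed for testing
        for x, y in batches:
            out = model(x)
            total_loss += criterion(out, y).item()
            correct += (out.argmax(dim=1) == y).sum().item()
            seen += y.size(0)
    model.train()  # switch back so a following training epoch behaves normally
    return total_loss, correct / seen


torch.manual_seed(0)
model = make_model()
params_path = os.path.join(tempfile.mkdtemp(), 'demo_params.pkl')
torch.save(model.state_dict(), params_path)

# Reload: build the same structure first, then fill in the saved parameters.
restored = make_model()
restored.load_state_dict(torch.load(params_path))

demo_batches = [(torch.randn(4, 1, 28, 28), torch.tensor([0, 1, 2, 3]))]
print(evaluate(model, demo_batches, nn.CrossEntropyLoss()))
print(evaluate(restored, demo_batches, nn.CrossEntropyLoss()))
```

In evaluation mode the original and the restored model should report the same loss, since dropout is off and the parameters are identical.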
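PS: Hmph, I have now learned how to format code properly inside a post. I will use it in the next one! (How to add and display code, Baidu Jingyan)
To do: I still do not fully understand how the training works. Remember to work through reference 5 thoroughly when there is time.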
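As a starting point for that to-do item, here is a small check of what nn.CrossEntropyLoss computes with its default settings: it applies log-softmax to the raw scores and then averages the negative log-probability of the correct class over the batch. The numbers are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two samples, three classes; raw scores (logits), not probabilities.
logits = torch.tensor([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
targets = torch.tensor([0, 1])

builtin = nn.CrossEntropyLoss()(logits, targets)

# Manual version: log-softmax per row, pick the entry of the true class, average.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(targets)), targets].mean()

print(builtin.item(), manual.item())
```

This is also why the model's output is passed to the criterion directly, without a softmax layer in front.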