Complete steps for training on your own data with PyTorch -- a PyTorch training template

Training your own task with PyTorch follows a reusable template. Below is an outline of the code that appears in essentially every training script; each step builds on the previous one.

Contents

    • 1 Import the packages
    • 2 Define hyperparameters and constants for easy modification later
    • 3 Define the dataset
    • 4 Instantiate the dataset class
    • 5 Build the network
    • 6 Instantiate the network
    • 7 Define the optimizer
    • 8 Loss function
    • 9 Load a model
    • 10 Start training
    • 11 Validate and save the model
    • 12 Visualization
      • 12.1 If you use TensorBoard, write it like this:

1 Import the packages

import torch
import csv
import torch.nn as nn
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from PIL import Image, ImageFile
import torchvision.transforms as transforms
from torchvision import models
import torch.utils.model_zoo as model_zoo
import torch.nn.functional as F
import torch.optim as optim
import os

2 Define hyperparameters and constants for easy modification later

label_file_train = r'./train.csv'
data_dir_train = r'./dataset/train/'

label_file_val = r'./test.csv'
data_dir_val = r'./dataset/test/'

# tensors and the network all run on the GPU
device = torch.device('cuda')
classNum = 2
batch_size = 2
acc_best = 0.0
CKPT_PATH = '*best.pkl'  # optional
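
If the script may also run on a machine without a GPU, a common fallback (my addition, not in the original) is:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # fall back to CPU when no GPU is present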

3 Define the dataset

class MyDataset(Dataset):
    def __init__(self, data_dir, label_file, transform=None):  # store the data paths and transforms here
        pass

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        return image, label
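
To make the skeleton concrete, here is one possible implementation. It assumes (my assumption, not stated in the original) that the label CSV lists one "filename,label" pair per row with no header, and that the images sit directly under data_dir:

class CsvImageDataset(Dataset):
    def __init__(self, data_dir, label_file, transform=None):
        self.data_dir = data_dir
        self.transform = transform
        with open(label_file) as f:
            # each row: image file name, integer class label (assumed CSV format)
            self.labels = [(row[0], int(row[1])) for row in csv.reader(f)]

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        name, label = self.labels[index]
        image = Image.open(os.path.join(self.data_dir, name)).convert('RGB')
        if self.transform is not None:
            image = self.transform(image)
        return image, label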

4 Instantiate the dataset class

This step instantiates the dataset class and wraps it in a loader that reads the data out in batches, ready to be fed straight into training.

train_dataset = MyDataset(data_dir=data_dir_train, label_file=label_file_train, transform=transforms.ToTensor())
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
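
The validation loader used in step 11 can be built the same way; a minimal sketch using the val constants from step 2:

val_dataset = MyDataset(data_dir=data_dir_val, label_file=label_file_val, transform=transforms.ToTensor())
val_loader = DataLoader(dataset=val_dataset, batch_size=batch_size, shuffle=False)  # no shuffling needed for evaluation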

5 Build the network

class MyNet(nn.Module):
    def __init__(self, num_class):
        super(MyNet, self).__init__()
        pass

    def forward(self, x):
        return x
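
As a concrete example (my own sketch, not the original author's network), MyNet could wrap a pretrained torchvision ResNet-18 and replace its final layer with a num_class-way classification head:

class ResNet18Classifier(nn.Module):
    def __init__(self, num_class):
        super(ResNet18Classifier, self).__init__()
        self.backbone = models.resnet18(pretrained=True)  # ImageNet-pretrained backbone
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_class)  # new classification head

    def forward(self, x):
        return self.backbone(x)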

6 Instantiate the network

model = MyNet(classNum).to(device)

7 Define the optimizer

optimizer = optim.Adam(model.parameters(), lr=1e-3)
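
If the learning rate should decay during training, a scheduler can be attached to the same optimizer; for example (my choice of schedule, not from the original):

scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)  # halve the lr every 5 epochs
# call scheduler.step() once per epoch, after that epoch's optimizer.step() calls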

8 Loss function

criterion = nn.CrossEntropyLoss().to(device)

You can also use a loss function of your own design:

class NewLoss(nn.Module):
    def __init__(self):
        super(NewLoss, self).__init__()

    def forward(self, outputs, targets):
        pass

criterion = NewLoss().to(device)
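
As a concrete example of the skeleton above (my own sketch), here is a hand-written cross-entropy that behaves like nn.CrossEntropyLoss:

class MyCrossEntropy(nn.Module):
    def __init__(self):
        super(MyCrossEntropy, self).__init__()

    def forward(self, outputs, targets):
        log_probs = F.log_softmax(outputs, dim=1)                  # (N, classNum) log-probabilities
        return -log_probs.gather(1, targets.unsqueeze(1)).mean()   # mean negative log-likelihood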

9 Load a model

If you already have a trained model, it needs to be loaded:

checkpoints = torch.load(CKPT_PATH)  # a dict that also stores the epoch count, etc.
checkpoint = checkpoints['state_dict']
step = checkpoints['epoch']   # the epoch at which the checkpoint was saved
model.load_state_dict(checkpoint)
print("=> loaded checkpoint: %s" % CKPT_PATH)

10 Start training

for epoch in range(10):        
    model.train()   # this line is required
    for batchidx, (x, label) in enumerate(train_loader):
        x, label = x.to(device), label.to(device)
        logits = model(x)
        loss = criterion(logits, label)
        # backprop
        optimizer.zero_grad()  # clear the gradients
        loss.backward()        # backpropagate
        optimizer.step()       # update the parameters
    print(epoch, 'loss:', loss.item())
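
The loop above only prints the loss of the last batch in each epoch; a small variant (names such as running_loss are my own) reports the epoch average instead:

for epoch in range(10):
    model.train()
    running_loss = 0.0
    for batchidx, (x, label) in enumerate(train_loader):
        x, label = x.to(device), label.to(device)
        loss = criterion(model(x), label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(epoch, 'avg loss:', running_loss / len(train_loader))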

11 Validate and save the model

The block below stays indented because it runs inside the same epoch loop as step 10, right after that epoch's training batches finish.

    model.eval()    # this line is also required
    with torch.no_grad():
        total_correct = 0
        total_num = 0
        for x, label in val_loader:
            x, label = x.to(device), label.to(device)
            logits = model(x)
            pred = logits.argmax(dim=1)
            correct = torch.eq(pred, label).float().sum().item()
            total_correct += correct
            total_num += x.size(0)
            print(correct)
        acc = total_correct / total_num
        print(epoch, 'test acc:', acc)

        if acc > acc_best:
            acc_best = acc 
            torch.save({'state_dict': model.state_dict(), 'epoch': epoch},'MyNet_'+str(epoch) + '_best.pkl')
            print('Save best statistics done!')

12 Visualization

import visdom

viz = visdom.Visdom()
viz.line([0], [-1], win='loss', opts=dict(title='loss'))        # initialize the windows
viz.line([0], [-1], win='val_acc', opts=dict(title='val_acc'))

optimizer.zero_grad()
loss.backward()
optimizer.step()
viz.line([loss.item()], [global_step], win='loss', update='append')  # append the loss value here

The validation accuracy is plotted the same way:

viz.line([val_acc], [global_step], win='val_acc', update='append')
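
global_step is not defined in the snippets above; a common pattern (my assumption) is a counter that goes up by one per training batch, for example:

global_step = 0
for epoch in range(10):
    model.train()
    for x, label in train_loader:
        x, label = x.to(device), label.to(device)
        loss = criterion(model(x), label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        global_step += 1
        viz.line([loss.item()], [global_step], win='loss', update='append')  # one point per batch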

12.1 If you use TensorBoard, write it like this:

from tensorboardX import SummaryWriter
writer = SummaryWriter('mytest')
writer.add_scalar('Loss1', loss, epoch)
writer.close()
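
The logged scalars can then be viewed by starting TensorBoard on the same directory, e.g. with the command tensorboard --logdir mytest, and opening the printed URL in a browser.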

Note: the TensorBoard log directory must not contain a path with Chinese characters, otherwise nothing will show up.

Personally, I still recommend Visdom.
