
PyTorch: Main Components and Hands-On Practice

Main components reference: http://t.csdn.cn/UHGIj (that write-up is very thorough, so it is used directly as a reference here)

Basic PyTorch Model Training Workflow

Basic Parameter Setup

import os
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
import torch.optim as optim
# initialize hyperparameters
batch_size = 16   # batch size
lr = 1e-4         # initial learning rate
max_epochs = 100  # maximum number of training epochs
# GPU setup
# Option 1: use os.environ; nothing else needs to be set when using the GPU afterwards
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
# Option 2: use "device"; afterwards, call .to(device) on anything that should run on the GPU
device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")
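As a minimal sketch of option two (not part of the original snippet), anything that should run on the GPU is moved explicitly with .to(device); the tensor and layer below are illustrative only:

# move data and modules onto the chosen device before use
x = torch.randn(2, 3).to(device)      # the tensor now lives on cuda:1, or on the CPU as a fallback
layer = nn.Linear(3, 4).to(device)    # module parameters are moved in place
y = layer(x)                          # inputs and parameters must be on the same device
print(y.device)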

Reading In Data

Custom Dataset Class

A custom Dataset class must inherit from PyTorch's own Dataset class. It mainly contains three functions:

  • __init__: receives external arguments and defines the sample set
  • __getitem__: reads one sample at a time from the set, optionally applies transforms, and returns the data needed for training/validation
  • __len__: returns the number of samples in the dataset

For example:

import pandas as pd
from PIL import Image

class MyDataset(Dataset):
    def __init__(self, data_dir, info_csv, image_list, transform=None):
        """
        Args:
            data_dir: path to the image directory.
            info_csv: path to the csv file containing image indexes
                with corresponding labels.
            image_list: path to the txt file containing the image names of the training/validation set.
            transform: optional transform to be applied on a sample.
        """
        label_info = pd.read_csv(info_csv)
        image_file = open(image_list).readlines()
        self.data_dir = data_dir
        self.image_file = image_file
        self.label_info = label_info
        self.transform = transform

    def __getitem__(self, index):
        """
        Args:
            index: the index of item
        Returns:
            image and its labels
        """
        image_name = self.image_file[index].strip('\n')
        raw_label = self.label_info.loc[self.label_info['Image_index'] == image_name]
        label = raw_label.iloc[:,0]
        image_name = os.path.join(self.data_dir, image_name)
        image = Image.open(image_name).convert('RGB')
        if self.transform is not None:
            image = self.transform(image)
        return image, label

    def __len__(self):
        return len(self.image_file)

Reading Data from Local Files

from torchvision import datasets
# train_path = ''  # path to the training set
# val_path = ''    # path to the validation/test set
train_data = datasets.ImageFolder(train_path, transform=data_transform)
val_data = datasets.ImageFolder(val_path, transform=data_transform)
# or, using the custom Dataset class defined above (arguments depend on its __init__ signature)
train_data = MyDataset(train_path, transform=data_transform)
val_data = MyDataset(val_path, transform=data_transform)

Loading Data in Batches

train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, num_workers=4, shuffle=True, drop_last=True)
val_loader = torch.utils.data.DataLoader(val_data, batch_size=batch_size, num_workers=4, shuffle=False)
# batch_size: number of samples read per batch
# num_workers: number of worker processes used to load data
# shuffle: whether to shuffle the loaded data
# drop_last: drop the last incomplete batch so that it does not take part in training

Inspecting Image Data

import matplotlib.pyplot as plt
images, labels = next(iter(val_loader))
print(images.shape)
plt.imshow(images[0].permute(1, 2, 0))  # reorder (C, H, W) -> (H, W, C) for matplotlib
plt.show()

Model Construction

Deep neural networks are built mainly with the torch.nn module, which consists of the following parts:

Parameters
  • Parameter: a model parameter (a Tensor)
  • UninitializedParameter: a parameter created without initialization
  • UninitializedBuffer: a Tensor buffer created without initialization

Basic units (Containers)
  • Module: the basic building block for constructing neural networks
  • Sequential: connects modules in order to form a network model
  • ModuleList / ModuleDict: a List/Dict of Modules, with no implied connection order
  • ParameterList / ParameterDict: a List/Dict of parameters

Basic layers
  • Convolution Layers
    • nn.Conv1d, nn.Conv2d, nn.Conv3d: 1-, 2-, 3-D convolution
    • nn.ConvTranspose1d, nn.ConvTranspose2d, nn.ConvTranspose3d: 1-, 2-, 3-D transposed convolution
    • nn.LazyConv1d, nn.LazyConv2d, nn.LazyConv3d: 1-, 2-, 3-D convolution whose parameters are initialized from the first input
    • nn.LazyConvTranspose1d, nn.LazyConvTranspose2d, nn.LazyConvTranspose3d: 1-, 2-, 3-D transposed convolution whose parameters are initialized from the first input
    • nn.Unfold: extracts elements from sliding windows
    • nn.Fold: restores elements from sliding windows back into a tensor
  • Pooling Layers
    • nn.MaxPool1d, nn.MaxPool2d, nn.MaxPool3d: 1-, 2-, 3-D max pooling
    • nn.MaxUnpool1d, nn.MaxUnpool2d, nn.MaxUnpool3d: 1-, 2-, 3-D partial inverse of max pooling (unpooled positions filled with zeros)
    • nn.AvgPool1d, nn.AvgPool2d, nn.AvgPool3d: 1-, 2-, 3-D average pooling
    • nn.FractionalMaxPool2d, nn.FractionalMaxPool3d: 2-, 3-D fractional max pooling
    • nn.LPPool1d, nn.LPPool2d: 1-, 2-D power-average pooling
    • nn.AdaptiveMaxPool1d/2d/3d, nn.AdaptiveAvgPool1d/2d/3d: 1-, 2-, 3-D adaptive max/average pooling
  • Padding Layers
    • nn.ReflectionPad1d, nn.ReflectionPad2d, nn.ReflectionPad3d: pad the input tensor with the reflection of its boundary
    • nn.ReplicationPad1d, nn.ReplicationPad2d, nn.ReplicationPad3d: pad the input tensor by replicating its boundary elements
    • nn.ZeroPad2d: pads the input tensor with zeros
    • nn.ConstantPad1d, nn.ConstantPad2d, nn.ConstantPad3d: pad the input tensor with a given constant
  • Non-linear Activations: nn.Softmax, nn.Sigmoid, nn.ReLU, nn.Tanh, etc.; see the official docs for details
  • Linear Layers: nn.Identity, nn.Linear, nn.Bilinear, nn.LazyLinear: linear transformations
  • Normalization Layers
    • nn.BatchNorm1d/2d/3d, nn.LazyBatchNorm1d/2d/3d: normalization within a data batch; see the paper for details
    • nn.InstanceNorm1d/2d/3d, nn.LazyInstanceNorm1d/2d/3d: normalization within each channel (instance)
    • nn.LayerNorm: normalization over a layer
    • nn.GroupNorm: normalization over groups of channels
    • nn.SyncBatchNorm: batch normalization synchronized across devices/processes
    • nn.LocalResponseNorm: local response normalization over a neighborhood
  • Recurrent Layers: nn.RNNBase, nn.RNN, nn.LSTM, nn.GRU, nn.RNNCell, nn.LSTMCell, nn.GRUCell: recurrent neural network layers
  • Transformer Layers
    • nn.Transformer: the Transformer model
    • nn.TransformerEncoder, nn.TransformerDecoder: encoder (decoder) made of stacked encoder (decoder) layers
    • nn.TransformerEncoderLayer: a self-attention network plus a feed-forward network
    • nn.TransformerDecoderLayer: self-attention, multi-head attention over the encoder output, and a feed-forward network
  • Dropout Layers
    • nn.Dropout, nn.Dropout2d, nn.Dropout3d: during training, randomly zero elements with probability p (Bernoulli) to prevent overfitting
    • nn.AlphaDropout, nn.FeatureAlphaDropout: dropout that keeps the mean and standard deviation unchanged
  • Sparse Layers
    • nn.Embedding: embedding vectors
    • nn.EmbeddingBag: computes sums or means over bags of embeddings

Functions
  • Distance functions: nn.CosineSimilarity (cosine similarity), nn.PairwiseDistance (pairwise p-norm distance)
  • Loss functions: nn.L1Loss, nn.MSELoss, nn.CrossEntropyLoss, nn.KLDivLoss, etc.; see the official docs for details

Others
  • Vision Layers: nn.PixelShuffle, nn.PixelUnshuffle (pixel shuffle and its inverse); nn.Upsample, nn.UpsamplingNearest2d, nn.UpsamplingBilinear2d (upsampling)
  • Shuffle Layers: nn.ChannelShuffle (shuffles channel data)
  • DataParallel Layers: nn.DataParallel, nn.parallel.DistributedDataParallel (multi-GPU parallel computation)
  • Utilities: from torch.nn.utils import ...; see the official docs for details
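As a small illustrative sketch (not from the reference above) of how the container classes differ, nn.Sequential applies its layers in order, while nn.ModuleList only registers sub-modules and leaves the forward order to you; the layer sizes are arbitrary:

import torch
import torch.nn as nn

# Sequential: modules are applied in the order they are given
seq_net = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10)
)

# ModuleList: registers sub-modules but defines no forward order by itself
class ListNet(nn.Module):
    def __init__(self):
        super(ListNet, self).__init__()
        self.layers = nn.ModuleList([nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)])

    def forward(self, x):
        for layer in self.layers:  # the forward order is written out explicitly
            x = layer(x)
        return x

x = torch.rand(2, 784)
print(seq_net(x).shape, ListNet()(x).shape)  # both: torch.Size([2, 10])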

Building a Network with Module

Taking a multilayer perceptron (MLP) as an example:

import torch
from torch import nn
# define an MLP class
class MLP(nn.Module):
  # declare the layers that hold model parameters; here, two fully connected layers
  def __init__(self, **kwargs):
    super(MLP, self).__init__(**kwargs) # call the parent class constructor to perform the necessary initialization
    self.hidden = nn.Linear(784, 256)
    self.act = nn.ReLU()
    self.output = nn.Linear(256,10)

  # define the forward computation: how the model output is computed from the input x
  def forward(self, x):
    o = self.act(self.hidden(x))
    return self.output(o)
# instantiate
X = torch.rand(2,784)
net = MLP()
net(X)

Building Your Own Layers

  • A layer without model parameters
class MyLayer(nn.Module):
    def __init__(self, **kwargs):
        super(MyLayer, self).__init__(**kwargs)
    def forward(self, x):
        return x - x.mean()
  • A layer with model parameters
class MyListDense(nn.Module):
    def __init__(self):
        super(MyListDense, self).__init__()
        self.params = nn.ParameterList([nn.Parameter(torch.randn(4, 4)) for i in range(3)])
        self.params.append(nn.Parameter(torch.randn(4, 1)))

    def forward(self, x):
        for i in range(len(self.params)):
            x = torch.mm(x, self.params[i])
        return x


class MyDictDense(nn.Module):
    def __init__(self):
        super(MyDictDense, self).__init__()
        self.params = nn.ParameterDict({
                'linear1': nn.Parameter(torch.randn(4, 4)),
                'linear2': nn.Parameter(torch.randn(4, 1))
        })
        self.params.update({'linear3': nn.Parameter(torch.randn(4, 2))}) # add a new parameter

    def forward(self, x, choice='linear1'):
        return torch.mm(x, self.params[choice])
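A brief usage sketch for the custom layers above (input sizes chosen only to match the parameter shapes):

layer = MyLayer()
print(layer(torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0])))  # the mean is subtracted, so the output has zero mean

x = torch.rand(2, 4)
net_list = MyListDense()
net_dict = MyDictDense()
print(net_list(x).shape)                    # torch.Size([2, 1]): three 4x4 matmuls, then 4x1
print(net_dict(x, choice='linear2').shape)  # torch.Size([2, 1]): multiplies by the 'linear2' parameter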
  • 2-D convolution layer
def corr2d(X, K):  # 2-D cross-correlation of input X with kernel K
    h, w = K.shape
    X, K = X.float(), K.float()
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i: i + h, j: j + w] * K).sum()
    return Y

# 2-D convolutional layer
class Conv2D(nn.Module):
    def __init__(self, kernel_size):
        super(Conv2D, self).__init__()
        # random initialization
        self.weight = nn.Parameter(torch.randn(kernel_size))
        self.bias = nn.Parameter(torch.randn(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias
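A quick sketch of the two definitions above on a toy input; the 1x2 edge-detection kernel is just an illustrative choice:

X = torch.ones(6, 8)
X[:, 2:6] = 0                      # a block of zeros creates two vertical edges
K = torch.tensor([[1.0, -1.0]])    # responds to horizontal changes in value
print(corr2d(X, K))                # non-zero columns mark where the edges are

conv = Conv2D(kernel_size=(1, 2))  # the custom layer, with randomly initialized weight and bias
print(conv(X).shape)               # torch.Size([6, 7])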
  • Pooling layer
def pool2d(X, pool_size, mode='max'):
    p_h, p_w = pool_size
    Y = torch.zeros((X.shape[0] - p_h + 1, X.shape[1] - p_w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            if mode == 'max':
                Y[i, j] = X[i: i + p_h, j: j + p_w].max()
            elif mode == 'avg':
                Y[i, j] = X[i: i + p_h, j: j + p_w].mean()
    return Y
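Usage sketch for pool2d on a small matrix:

X = torch.tensor([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]])
print(pool2d(X, (2, 2)))               # max pooling: [[4., 5.], [7., 8.]]
print(pool2d(X, (2, 2), mode='avg'))   # average pooling: [[2., 3.], [5., 6.]]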

Building Models

  • LeNet
import torch.nn.functional as F

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # input channels: 1; output channels: 6; 5x5 convolution kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # 2x2 Max pooling
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # if the pooling window is square, a single number is enough
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features
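A quick shape check for the LeNet-style Net above; a single-channel 32x32 input keeps the flattened size at 16*5*5, matching fc1:

net = Net()
x = torch.rand(1, 1, 32, 32)  # batch of 1, one channel, 32x32
print(net(x).shape)           # torch.Size([1, 10])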
  • AlexNet
class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 96, 11, 4), # in_channels, out_channels, kernel_size, stride, padding
            nn.ReLU(),
            nn.MaxPool2d(3, 2), # kernel_size, stride
            # smaller conv window; padding of 2 keeps the input and output height/width equal, and the number of output channels grows
            nn.Conv2d(96, 256, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2d(3, 2),
            # three consecutive conv layers with even smaller windows; except for the last conv layer, the number of output channels keeps growing
            # no pooling after the first two of these conv layers, so height and width are not reduced
            nn.Conv2d(256, 384, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(384, 384, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(384, 256, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(3, 2)
        )
        # the fully connected part has many more outputs than LeNet; dropout layers mitigate overfitting
        self.fc = nn.Sequential(
            nn.Linear(256*5*5, 4096),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(),
            nn.Dropout(0.5),
            # output layer; Fashion-MNIST is used here, so the number of classes is 10 rather than the paper's 1000
            nn.Linear(4096, 10),
        )

    def forward(self, img):
        feature = self.conv(img)
        output = self.fc(feature.view(img.shape[0], -1))
        return output
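A similar shape check for the AlexNet variant above; with a single-channel 224x224 input the flattened feature size is 256*5*5, which matches the first fully connected layer:

net = AlexNet()
x = torch.rand(1, 1, 224, 224)
print(net(x).shape)  # torch.Size([1, 10])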

Model Initialization

Different structures call for different initialization methods. PyTorch's initialization functions live in torch.nn.init; see the official docs for the specific methods.
Iterate over all modules of the model and initialize their parameters:

def initialize_weights(self):
    for m in self.modules():
        # check whether the module is a Conv2d
        if isinstance(m, nn.Conv2d):
            torch.nn.init.xavier_normal_(m.weight.data)
            # check whether it has a bias
            if m.bias is not None:
                torch.nn.init.constant_(m.bias.data, 0.3)
        elif isinstance(m, nn.Linear):
            torch.nn.init.normal_(m.weight.data, 0.1)
            if m.bias is not None:
                torch.nn.init.zeros_(m.bias.data)
        elif isinstance(m, nn.BatchNorm2d):
            m.weight.data.fill_(1)
            m.bias.data.zero_()
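initialize_weights is written as a method, so it is meant to be attached to a model class and called after construction. As a sketch under that assumption, the same per-module rules can also be applied to an existing model with Module.apply and a standalone function (init_fn is an illustrative name, and Net is the LeNet-style model defined earlier):

def init_fn(m):
    # per-module rule mirroring initialize_weights above
    if isinstance(m, nn.Conv2d):
        torch.nn.init.xavier_normal_(m.weight.data)
        if m.bias is not None:
            torch.nn.init.constant_(m.bias.data, 0.3)
    elif isinstance(m, nn.Linear):
        torch.nn.init.normal_(m.weight.data, 0.1)
        if m.bias is not None:
            torch.nn.init.zeros_(m.bias.data)

model = Net()
model.apply(init_fn)  # apply() visits every sub-module recursively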

Common Loss Functions

Each entry below gives the name, the constructor, the formula, and a typical application:

  • Binary cross-entropy loss: torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean'); $\ell(x, y) = \operatorname{mean}(L)$ when reduction='mean', $\operatorname{sum}(L)$ when reduction='sum'; cross-entropy for binary classification tasks.
  • Cross-entropy loss: torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean'); $\operatorname{loss}(x, \text{class}) = -\log\left(\frac{\exp(x[\text{class}])}{\sum_{j} \exp(x[j])}\right) = -x[\text{class}] + \log\left(\sum_{j} \exp(x[j])\right)$; multi-class classification.
  • L1 loss: torch.nn.L1Loss(size_average=None, reduce=None, reduction='mean'); $L_{n} = |x_{n} - y_{n}|$; regression, returns the absolute error.
  • MSE loss: torch.nn.MSELoss(size_average=None, reduce=None, reduction='mean'); $l_{n} = (x_{n} - y_{n})^{2}$; regression, returns the squared error.
  • Smooth L1 loss: torch.nn.SmoothL1Loss(size_average=None, reduce=None, reduction='mean', beta=1.0); $\operatorname{loss}(x, y) = \frac{1}{n}\sum_{i=1}^{n} z_{i}$, where $z_{i} = 0.5(x_{i}-y_{i})^{2}$ if $|x_{i}-y_{i}| < 1$ and $|x_{i}-y_{i}| - 0.5$ otherwise; a smoothed L1 that reduces the influence of outliers.
  • Poisson negative log-likelihood loss: torch.nn.PoissonNLLLoss(log_input=True, full=False, size_average=None, eps=1e-08, reduce=None, reduction='mean'); $\operatorname{loss}(x, y) = e^{x_{n}} - x_{n} y_{n}$ when log_input=True, $x_{n} - y_{n}\log(x_{n} + \text{eps})$ when log_input=False; negative log-likelihood for Poisson-distributed targets.
  • KL divergence: torch.nn.KLDivLoss(size_average=None, reduce=None, reduction='mean', log_target=False); $D_{\mathrm{KL}}(P, Q) = \sum_{i=1}^{n} P(x_{i})\left(\log P(x_{i}) - \log Q(x_{i})\right)$; a distance measure between continuous distributions, often useful for regression over a continuous output space.
  • MarginRankingLoss: torch.nn.MarginRankingLoss(margin=0.0, size_average=None, reduce=None, reduction='mean'); $\operatorname{loss}(x_{1}, x_{2}, y) = \max(0, -y(x_{1} - x_{2}) + \text{margin})$; ranking tasks.
  • Multi-label margin loss: torch.nn.MultiLabelMarginLoss(size_average=None, reduce=None, reduction='mean'); $\operatorname{loss}(x, y) = \sum_{ij} \frac{\max(0, 1 - (x[y[j]] - x[i]))}{x.\operatorname{size}(0)}$; multi-label classification.
  • Soft margin (two-class) loss: torch.nn.SoftMarginLoss(size_average=None, reduce=None, reduction='mean'); $\operatorname{loss}(x, y) = \sum_{i} \frac{\log(1 + \exp(-y[i] \cdot x[i]))}{x.\operatorname{nelement}()}$; two-class logistic loss.
  • Multi-class hinge loss: torch.nn.MultiMarginLoss(p=1, margin=1.0, weight=None, size_average=None, reduce=None, reduction='mean'); $\operatorname{loss}(x, y) = \frac{\sum_{i} \max(0, \text{margin} - x[y] + x[i])^{p}}{x.\operatorname{size}(0)}$; multi-class classification.
  • Triplet loss: torch.nn.TripletMarginLoss(margin=1.0, p=2.0, eps=1e-06, swap=False, size_average=None, reduce=None, reduction='mean'); $L(a, p, n) = \max\{d(a_{i}, p_{i}) - d(a_{i}, n_{i}) + \text{margin},\ 0\}$; triplet similarity.
  • HingeEmbeddingLoss: torch.nn.HingeEmbeddingLoss(margin=1.0, size_average=None, reduce=None, reduction='mean'); $\ell_{n} = x_{n}$ if $y_{n} = 1$, $\max\{0, \Delta - x_{n}\}$ if $y_{n} = -1$, where $x$ is the absolute difference between the two inputs; measures the similarity of two inputs.
  • Cosine embedding loss: torch.nn.CosineEmbeddingLoss(margin=0.0, size_average=None, reduce=None, reduction='mean'); $\ell_{n} = 1 - \cos(x_{1}, x_{2})$ if $y_{n} = 1$, $\max\{0, \cos(x_{1}, x_{2}) - \text{margin}\}$ if $y_{n} = -1$; cosine similarity between two vectors.
  • CTC loss: torch.nn.CTCLoss(blank=0, reduction='mean', zero_infinity=False); no closed-form formula listed; sequence (temporal) classification.
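As a small concrete example (dummy values), nn.CrossEntropyLoss takes raw logits and integer class indices:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, 0.5, 0.1], [0.2, 2.5, 0.3]])  # raw scores, no softmax applied
target = torch.tensor([0, 1])                              # class indices starting from 0
loss = criterion(logits, target)
print(loss.item())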

Model Training, Validation, and Testing

During training, backpropagation updates the model parameters; during validation/testing, the parameters stay fixed.

Training

  • Workflow
    • set the model to training mode
    • read the data (it can be moved to the GPU batch by batch)
    • zero the optimizer's gradients
    • run the forward pass through the model
    • compute the loss
    • backpropagate the loss through the network
    • update the model parameters with the optimizer
  • Code
def train(epoch):
    model.train()
    train_loss = 0
    for data, label in train_loader:
        data, label = data.cuda(), label.cuda() # move to the GPU
        optimizer.zero_grad() # zero the optimizer's gradients
        output = model(data) # forward pass through the model
        loss = criterion(output, label) # compute the loss
        loss.backward() # backpropagate the loss through the network
        optimizer.step() # update the model parameters with the optimizer
        train_loss += loss.item()*data.size(0)
    train_loss = train_loss/len(train_loader.dataset)
    print('Epoch: {} \tTraining Loss: {:.6f}'.format(epoch, train_loss))

Validation/Testing

  • Workflow
    • set the model to evaluation mode
    • read the data (it can be moved to the GPU batch by batch)
    • run the forward pass (no parameter updates)
    • compute the loss
  • Code
def val(epoch):
    model.eval()
    val_loss = 0
    running_accu = 0  # accumulated number of correct predictions
    with torch.no_grad():
        for data, label in val_loader:
            data, label = data.cuda(), label.cuda()
            output = model(data)
            preds = torch.argmax(output, 1)
            loss = criterion(output, label)
            val_loss += loss.item()*data.size(0)
            running_accu += torch.sum(preds == label.data)
    val_loss = val_loss/len(val_loader.dataset)
    print('Epoch: {} \tValidation Loss: {:.6f}'.format(epoch, val_loss))

Optimizers

  • PyTorch module: torch.optim
Optimizer: description
LBFGS: quasi-Newton method
SGD: stochastic gradient descent
ASGD: averaged stochastic gradient descent
Adagrad: adaptive learning rates via an accumulated second-order moment
Adadelta: an extension of Adagrad that removes the dependence on a global learning rate
Rprop: resilient backpropagation
RMSprop: a special case of Adadelta; works well for RNNs
Adam: combines first-order and second-order momentum
Adamax: a variant of Adam with a simpler bound on the learning rate
NAdam: Adam with a Nesterov momentum term
SparseAdam: Adam for sparse tensors
RAdam: rectifies the variance automatically, removing the manual warmup tuning otherwise needed during training
AdamW: Adam with decoupled weight decay (L2-style regularization)
  • Usage
from torch import optim
from torchvision.models import resnet18
# model
net = resnet18()
# different optimizer settings for different layers
optimizer = optim.SGD([{'params':net.fc.parameters()},
    {'params':net.layer4[0].conv1.parameters(),'lr':1e-2}],
    lr=1e-5)
for epoch in range(EPOCH):
    ...
    optimizer.zero_grad()  # zero the gradients
    loss = ...             # compute the loss
    loss.backward()        # backpropagation
    optimizer.step()       # update the parameters

The rest of the post is a hands-on example: classifying FashionMNIST with a small CNN.

Imports

import os
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

Environment and Hyperparameters

os.environ['CUDA_VISIBLE_DEVICES'] = '0'
batch_size = 256
num_workers = 4   # on Windows this should be set to 0, otherwise multiprocessing errors occur
lr = 1e-4
epochs = 20

Data Reading and Loading
Two approaches are shown here:

  • downloading and using one of PyTorch's built-in datasets
  • downloading csv-formatted data from a website, reading it in, and converting it to the expected format
    • the first approach only works for common public datasets,
    • the second requires building your own Dataset, which is essential when applying PyTorch to your own work
# first define the data transforms
from torchvision import transforms

image_size = 28
data_transform = transforms.Compose([
    transforms.ToPILImage(),   # depends on how the data is read later; not needed when using the built-in dataset
    transforms.Resize(image_size),
    transforms.ToTensor()
])
## Approach 1: use the dataset shipped with torchvision; downloading may take a while
from torchvision import datasets

train_data = datasets.FashionMNIST(root='./', train=True, download=True, transform=data_transform)
test_data = datasets.FashionMNIST(root='./', train=False, download=True, transform=data_transform)
/data1/ljq/anaconda3/envs/smp/lib/python3.8/site-packages/torchvision/datasets/mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448234945/work/torch/csrc/utils/tensor_numpy.cpp:180.)
  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
## Approach 2: read csv-formatted data and build the Dataset class yourself
# csv data download link: https://www.kaggle.com/zalando-research/fashionmnist
class FMDataset(Dataset):
    def __init__(self, df, transform=None):
        self.df = df
        self.transform = transform
        self.images = df.iloc[:,1:].values.astype(np.uint8)
        self.labels = df.iloc[:, 0].values
        
    def __len__(self):
        return len(self.images)
    
    def __getitem__(self, idx):
        image = self.images[idx].reshape(28,28,1)
        label = int(self.labels[idx])
        if self.transform is not None:
            image = self.transform(image)
        else:
            image = torch.tensor(image/255., dtype=torch.float)
        label = torch.tensor(label, dtype=torch.long)
        return image, label

train_df = pd.read_csv("./FashionMNIST/fashion-mnist_train.csv")
test_df = pd.read_csv("./FashionMNIST/fashion-mnist_test.csv")
train_data = FMDataset(train_df, data_transform)
test_data = FMDataset(test_df, data_transform)

Once the training and test datasets are built, define DataLoaders so the data can be loaded in batches during training and testing.

train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=num_workers, drop_last=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False, num_workers=num_workers)

After loading, a quick visualization helps verify that the data was read correctly.

import matplotlib.pyplot as plt
image, label = next(iter(train_loader))
print(image.shape, label.shape)
plt.imshow(image[0][0], cmap="gray")
torch.Size([256, 1, 28, 28]) torch.Size([256])

[Figure: the first image of the batch, displayed in grayscale]

Model Design
Since the task is fairly simple, we hand-build a small CNN here instead of using the complex structures of current models. Once the model is built, it is moved to the GPU for training.

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 5),
            nn.ReLU(),
            nn.MaxPool2d(2, stride=2),
            nn.Dropout(0.3),
            nn.Conv2d(32, 64, 5),
            nn.ReLU(),
            nn.MaxPool2d(2, stride=2),
            nn.Dropout(0.3)
        )
        self.fc = nn.Sequential(
            nn.Linear(64*4*4, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )
        
    def forward(self, x):
        x = self.conv(x)
        x = x.view(-1, 64*4*4)
        x = self.fc(x)
        # x = nn.functional.normalize(x)
        return x

model = Net()
model = model.cuda()
# model = nn.DataParallel(model).cuda()   # multi-GPU training version; covered further in a later lesson

Setting the Loss Function
Use the CrossEntropy loss provided by torch.nn.
PyTorch automatically converts integer labels to one-hot form when computing the CE loss.
Make sure the labels start from 0 and that the model has no softmax layer (the loss works on logits); this shows that the parts of a PyTorch training pipeline are not independent and must be considered together.

criterion = nn.CrossEntropyLoss()
# criterion = nn.CrossEntropyLoss(weight=[1,1,1,1,3,1,1,1,1,1])
?nn.CrossEntropyLoss # handy way to inspect options such as class weighting

Setting the Optimizer
Here we use the Adam optimizer.

optimizer = optim.Adam(model.parameters(), lr=0.001)

Training and Testing (Validation)
Each is wrapped in its own function for easy reuse later.
Note the main differences between the two:

  • model state setting (train vs. eval)
  • whether the optimizer's gradients need to be zeroed
  • whether the loss is backpropagated through the network
  • whether the optimizer is stepped at every iteration

In addition, classification accuracy can be computed during testing/validation.

def train(epoch):
    model.train()
    train_loss = 0
    for data, label in train_loader:
        data, label = data.cuda(), label.cuda()
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, label)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()*data.size(0)
    train_loss = train_loss/len(train_loader.dataset)
    print('Epoch: {} \tTraining Loss: {:.6f}'.format(epoch, train_loss))
def val(epoch):       
    model.eval()
    val_loss = 0
    gt_labels = []
    pred_labels = []
    with torch.no_grad():
        for data, label in test_loader:
            data, label = data.cuda(), label.cuda()
            output = model(data)
            preds = torch.argmax(output, 1)
            gt_labels.append(label.cpu().data.numpy())
            pred_labels.append(preds.cpu().data.numpy())
            loss = criterion(output, label)
            val_loss += loss.item()*data.size(0)
    val_loss = val_loss/len(test_loader.dataset)
    gt_labels, pred_labels = np.concatenate(gt_labels), np.concatenate(pred_labels)
    acc = np.sum(gt_labels==pred_labels)/len(pred_labels)
    print('Epoch: {} \tValidation Loss: {:.6f}, Accuracy: {:6f}'.format(epoch, val_loss, acc))
for epoch in range(1, epochs+1):
    train(epoch)
    val(epoch)
/data1/ljq/anaconda3/envs/smp/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448234945/work/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)


Epoch: 1 	Training Loss: 0.659050
Epoch: 1 	Validation Loss: 0.420328, Accuracy: 0.852000
Epoch: 2 	Training Loss: 0.403703
Epoch: 2 	Validation Loss: 0.350373, Accuracy: 0.872300
Epoch: 3 	Training Loss: 0.350197
Epoch: 3 	Validation Loss: 0.293053, Accuracy: 0.893200
Epoch: 4 	Training Loss: 0.322463
Epoch: 4 	Validation Loss: 0.283335, Accuracy: 0.892300
Epoch: 5 	Training Loss: 0.300117
Epoch: 5 	Validation Loss: 0.268653, Accuracy: 0.903500
Epoch: 6 	Training Loss: 0.282179
Epoch: 6 	Validation Loss: 0.247219, Accuracy: 0.907200
Epoch: 7 	Training Loss: 0.268283
Epoch: 7 	Validation Loss: 0.242937, Accuracy: 0.907800
Epoch: 8 	Training Loss: 0.257615
Epoch: 8 	Validation Loss: 0.234324, Accuracy: 0.912200
Epoch: 9 	Training Loss: 0.245795
Epoch: 9 	Validation Loss: 0.231515, Accuracy: 0.914100
Epoch: 10 	Training Loss: 0.238739
Epoch: 10 	Validation Loss: 0.229616, Accuracy: 0.914400
Epoch: 11 	Training Loss: 0.230499
Epoch: 11 	Validation Loss: 0.228124, Accuracy: 0.915200
Epoch: 12 	Training Loss: 0.221574
Epoch: 12 	Validation Loss: 0.211928, Accuracy: 0.921200
Epoch: 13 	Training Loss: 0.217924
Epoch: 13 	Validation Loss: 0.209744, Accuracy: 0.921700
Epoch: 14 	Training Loss: 0.206033
Epoch: 14 	Validation Loss: 0.215477, Accuracy: 0.921400
Epoch: 15 	Training Loss: 0.203349
Epoch: 15 	Validation Loss: 0.215550, Accuracy: 0.919400
Epoch: 16 	Training Loss: 0.196319
Epoch: 16 	Validation Loss: 0.210800, Accuracy: 0.923700
Epoch: 17 	Training Loss: 0.191969
Epoch: 17 	Validation Loss: 0.207266, Accuracy: 0.923700
Epoch: 18 	Training Loss: 0.185466
Epoch: 18 	Validation Loss: 0.207138, Accuracy: 0.924200
Epoch: 19 	Training Loss: 0.178241
Epoch: 19 	Validation Loss: 0.204093, Accuracy: 0.924900
Epoch: 20 	Training Loss: 0.176674
Epoch: 20 	Validation Loss: 0.197495, Accuracy: 0.928300

Saving the Model
After training, torch.save can save either the model parameters or the whole model; the model can also be saved during training.

save_path = "./FahionModel.pkl"
torch.save(model, save_path)
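torch.save(model, save_path) serializes the whole model with pickle. A common alternative, sketched below with an illustrative filename, is to save only the state_dict and load it back into a freshly constructed model of the same architecture:

# save only the parameters
torch.save(model.state_dict(), "./FashionModel_state_dict.pkl")

# load them back into a new instance of the same Net class
new_model = Net()
new_model.load_state_dict(torch.load("./FashionModel_state_dict.pkl"))
new_model = new_model.cuda()
new_model.eval()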
