Reference book: Dive into Deep Learning (PyTorch edition), available at: https://tangshusen.me/Dive-into-DL-PyTorch/#/
Please also support this very useful resource.
Most of the content below comes from the book; some parts are my own experiments and additions. If anything infringes copyright, it will be removed.
I. Basic Concepts
- Regression estimates a continuous value
- Classification predicts a discrete category
Reference for the basic concepts: https://tangshusen.me/Dive-into-DL-PyTorch/#/chapter03_DL-basics/3.4_softmax-regression
In this section we will use the torchvision package, which serves the PyTorch deep learning framework and is mainly used to build computer vision models. torchvision consists mainly of the following parts (a short sketch touching each part follows the list):
1. torchvision.datasets: functions for loading data and interfaces to common datasets;
2. torchvision.models: common model architectures (including pretrained models), such as AlexNet, VGG, ResNet, etc.;
3. torchvision.transforms: common image transformations, such as cropping and rotation;
4. torchvision.utils: other useful utilities.
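The snippet below is a minimal sketch (my addition, not from the book) that touches each of the four submodules just to show where they live; details such as constructor arguments may differ slightly between torchvision versions.
import torch
import torchvision
from torchvision import datasets, models, transforms, utils
# torchvision.transforms: compose a couple of common image transforms
trans = transforms.Compose([transforms.Resize(32), transforms.ToTensor()])
# torchvision.datasets: a common dataset interface (download=True fetches the data on first use)
fmnist = datasets.FashionMNIST(root="../data", train=True, transform=trans, download=True)
# torchvision.models: a common architecture (randomly initialized here)
resnet = models.resnet18()
# torchvision.utils: e.g. tile a batch of images into a single image grid
imgs = torch.stack([fmnist[i][0] for i in range(8)])
grid = utils.make_grid(imgs, nrow=4)
print(grid.shape)  # a 3 x H x W tensor containing the tiled images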
Key points:
1. The dataset is downloaded through torchvision.datasets. The first call automatically fetches the data from the internet. The train parameter specifies whether to get the training data set or the testing data set. The testing data set, also called the testing set, is only used to evaluate the model's performance and is never used for training.
2. The argument transform = transforms.ToTensor() converts all data into Tensors; without it, PIL images are returned. transforms.ToTensor() converts a PIL image of size (H x W x C) with values in [0, 255], or a NumPy array of dtype np.uint8, into a Tensor of size (C x H x W) with dtype torch.float32 and values in [0.0, 1.0].
Note: since pixel values are integers from 0 to 255, they exactly fill the range that uint8 can represent. Some image-related functions, including transforms.ToTensor(), assume the input is of type uint8; if it is not, they may not raise an error but may silently produce the wrong result. So if you represent image data with pixel values (integers in 0-255), always set its type to uint8 to avoid unnecessary bugs (a small check of this conversion follows these notes).
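As a quick check of the conversion described in point 2 (a small sketch I added), build a fake uint8 image and inspect the result of ToTensor():
import numpy as np
from torchvision import transforms
img = np.random.randint(0, 256, size=(28, 28, 3), dtype=np.uint8)  # fake (H x W x C) image, uint8 in [0, 255]
tensor = transforms.ToTensor()(img)
print(tensor.shape, tensor.dtype, tensor.min().item(), tensor.max().item())
# expected: torch.Size([3, 28, 28]) torch.float32, with values inside [0.0, 1.0]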
II. Getting to Know the Image Classification Dataset
demo1:
import sys
import time
import torch
import torchvision
import matplotlib.pyplot as plt
from torchvision import transforms
# Read the dataset
# transforms.ToTensor() converts a PIL image of size (H x W x C) with values in [0, 255],
# or a NumPy array of dtype np.uint8, into a (C x H x W) Tensor of dtype torch.float32 with values in [0.0, 1.0]
trans = transforms.ToTensor()
mnist_train = torchvision.datasets.FashionMNIST(root="../data", train=True,
                                                transform=trans,
                                                download=True)
mnist_test = torchvision.datasets.FashionMNIST(root="../data", train=False,
                                               transform=trans, download=True)
print(type(mnist_train))
print(len(mnist_train), len(mnist_test))
feature, label = mnist_train[0]  # any sample can be accessed by index
print(feature.shape, label)  # Channel x Height x Width

def get_fashion_mnist_labels(labels):
    """Return the text labels of the Fashion-MNIST dataset."""
    text_labels = [
        't-shirt', 'trouser', 'pullover', 'dress', 'coat', 'sandal', 'shirt',
        'sneaker', 'bag', 'ankle boot']
    return [text_labels[int(i)] for i in labels]

# A helper function to visualize these samples
def show_fashion_mnist(images, labels):
    # the _ denotes a variable we ignore (do not use)
    _, figs = plt.subplots(1, len(images), figsize=(12, 12))
    for f, img, lbl in zip(figs, images, labels):
        f.imshow(img.view((28, 28)).numpy())
        f.set_title(lbl)
        f.axes.get_xaxis().set_visible(False)  # hide the x-axis
        f.axes.get_yaxis().set_visible(False)  # hide the y-axis
    plt.show()

# Look at the images and text labels of the first 10 training samples
X, y = [], []
for i in range(10):
    X.append(mnist_train[i][0])
    y.append(mnist_train[i][1])
show_fashion_mnist(X, get_fashion_mnist_labels(y))

# Read mini-batches
batch_size = 256
if sys.platform.startswith('win'):
    num_workers = 0  # 0 means no extra worker processes are used to speed up data loading
else:
    num_workers = 4
train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=num_workers)
test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=num_workers)  # the test set is not shuffled
# Time one full pass over the training data
start = time.time()
for X, y in train_iter:
    continue
print('%.2f sec' % (time.time() - start))
out1:
<class 'torchvision.datasets.mnist.FashionMNIST'>
60000 10000
torch.Size([1, 28, 28]) 9
4.68 sec
III. Implementation from Scratch
demo2:
import torch
import torchvision
import numpy as np
import sys
sys.path.append("..")  # so that d2lzh_pytorch in the parent directory can be imported
import d2lzh_pytorch as d2l
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
num_inputs = 784
num_outputs = 10
W = torch.tensor(np.random.normal(0, 0.01, (num_inputs, num_outputs)), dtype=torch.float)
b = torch.zeros(num_outputs, dtype=torch.float)
W.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)
def softmax(X):
    X_exp = X.exp()
    partition = X_exp.sum(dim=1, keepdim=True)
    return X_exp / partition  # broadcasting is applied here

def net(X):
    # view reshapes each original image into a vector of length num_inputs
    return softmax(torch.mm(X.view((-1, num_inputs)), W) + b)

def cross_entropy(y_hat, y):
    return - torch.log(y_hat.gather(1, y.view(-1, 1)))

def accuracy(y_hat, y):
    return (y_hat.argmax(dim=1) == y).float().mean().item()

# This function is saved in the d2lzh_pytorch package for later use. It will be improved step by step:
# its full implementation is described in the "Image Augmentation" section
def evaluate_accuracy(data_iter, net):
    acc_sum, n = 0.0, 0
    for X, y in data_iter:
        acc_sum += (net(X).argmax(dim=1) == y).float().sum().item()
        n += y.shape[0]
    return acc_sum / n

num_epochs, lr = 5, 0.1

# This function is saved in the d2lzh_pytorch package for later use
def train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
              params=None, lr=None, optimizer=None):
    for epoch in range(num_epochs):
        train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
        for X, y in train_iter:
            y_hat = net(X)
            l = loss(y_hat, y).sum()
            # zero the gradients
            if optimizer is not None:
                optimizer.zero_grad()
            elif params is not None and params[0].grad is not None:
                for param in params:
                    param.grad.data.zero_()
            l.backward()
            if optimizer is None:
                d2l.sgd(params, lr, batch_size)
            else:
                optimizer.step()  # used in the "Concise Implementation of Softmax Regression" section
            train_l_sum += l.item()
            train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
            n += y.shape[0]
        test_acc = evaluate_accuracy(test_iter, net)
        print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
              % (epoch + 1, train_l_sum / n, train_acc_sum / n, test_acc))

train_ch3(net, train_iter, test_iter, cross_entropy, num_epochs, batch_size, [W, b], lr)
out2:
epoch 1, loss 0.7863, train acc 0.749, test acc 0.789
epoch 2, loss 0.5710, train acc 0.814, test acc 0.809
epoch 3, loss 0.5258, train acc 0.826, test acc 0.815
epoch 4, loss 0.5005, train acc 0.833, test acc 0.827
epoch 5, loss 0.4857, train acc 0.838, test acc 0.825
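As an optional follow-up (my addition, not part of the original demo), the trained W and b can be used to look at a few test-set predictions; this assumes the d2lzh_pytorch package provides the get_fashion_mnist_labels and show_fashion_mnist helpers shown in demo1:
X, y = next(iter(test_iter))
true_labels = d2l.get_fashion_mnist_labels(y.numpy())
pred_labels = d2l.get_fashion_mnist_labels(net(X).argmax(dim=1).numpy())
titles = [t + '\n' + p for t, p in zip(true_labels, pred_labels)]  # true label above predicted label
d2l.show_fashion_mnist(X[0:9], titles[0:9])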
Related function already packaged in d2lzh_pytorch:
load_data_fashion_mnist
import sys
import torch
import torchvision
from torch.utils import data
from torchvision import transforms
def load_data_fashion_mnist(batch_size, root='~/Datasets/FashionMNIST'):
    """Download the Fashion-MNIST dataset and then load it into memory."""
    transform = transforms.ToTensor()
    mnist_train = torchvision.datasets.FashionMNIST(root=root, train=True, download=True, transform=transform)
    mnist_test = torchvision.datasets.FashionMNIST(root=root, train=False, download=True, transform=transform)
    if sys.platform.startswith('win'):
        num_workers = 0  # 0 means no extra worker processes are used to speed up data loading
    else:
        num_workers = 4
    train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=num_workers)
    test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=num_workers)
    return train_iter, test_iter

# Try out this function
train_iter, test_iter = load_data_fashion_mnist(32)
for X, y in train_iter:
    print(X.shape, X.dtype, y.shape, y.dtype)
    break
# Output: torch.Size([32, 1, 28, 28]) torch.float32 torch.Size([32]) torch.int64
IV. Concise Implementation
demo3:
import torch
from torch import nn
import d2lzh_pytorch as d2l
import matplotlib.pyplot as plt
import myutils  # import my adapted train_ch3() function and plotting function
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
# Initialize model parameters
# PyTorch does not implicitly reshape the inputs. Therefore,
# we define a flatten layer before the linear layer to adjust the shape of the network input
net = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))  # the fully connected layer is defined by the Linear class

def init_weights(m):
    if type(m) == nn.Linear:
        # init.normal_ initializes every element of the weight; the mean keyword is omitted,
        # so its default value (0.0) is used
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)  # initialize the weights

# Define the loss function
# nn.CrossEntropyLoss combines log-softmax and cross-entropy (which is why the network has no softmax layer)
# and averages over the batch by default, so the loss values printed below are much smaller than in demo2
loss = nn.CrossEntropyLoss()

# Optimization algorithm
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

# Training
num_epochs = 10
loss_list, train_acc_list, test_acc_list = myutils.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer)

# Plot the results
epochs_list = list(range(1, num_epochs + 1))  # x-axis: the training epochs
myutils.show_img(epochs_list, loss_list, train_acc_list, test_acc_list)
out3:
epoch 1, loss 0.0031, train acc 0.747, test acc 0.755
epoch 2, loss 0.0022, train acc 0.813, test acc 0.808
epoch 3, loss 0.0021, train acc 0.825, test acc 0.812
epoch 4, loss 0.0020, train acc 0.832, test acc 0.825
epoch 5, loss 0.0019, train acc 0.836, test acc 0.822
epoch 6, loss 0.0019, train acc 0.840, test acc 0.823
epoch 7, loss 0.0018, train acc 0.843, test acc 0.827
epoch 8, loss 0.0018, train acc 0.845, test acc 0.827
epoch 9, loss 0.0018, train acc 0.847, test acc 0.829
epoch 10, loss 0.0018, train acc 0.848, test acc 0.829
Contents of the myutils package:
import matplotlib.pyplot as plt
# Define the optimization algorithm
def sgd(params, lr, batch_size):
    for param in params:
        param.data -= lr * param.grad / batch_size  # note that param.data is used when updating param

# Compute classification accuracy
def evaluate_accuracy(data_iter, net):
    acc_sum, n = 0.0, 0
    for X, y in data_iter:
        acc_sum += (net(X).argmax(dim=1) == y).float().sum().item()
        n += y.shape[0]
    return acc_sum / n

# Train the model
def train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
              params=None, lr=None, optimizer=None):
    loss_list, train_acc_list, test_acc_list = [], [], []
    for epoch in range(num_epochs):
        train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
        for X, y in train_iter:
            y_hat = net(X)
            l = loss(y_hat, y).sum()
            # zero the gradients
            if optimizer is not None:
                optimizer.zero_grad()
            elif params is not None and params[0].grad is not None:
                for param in params:
                    param.grad.data.zero_()
            l.backward()
            if optimizer is None:
                sgd(params, lr, batch_size)
            else:
                optimizer.step()  # used in the "Concise Implementation of Softmax Regression" section
            train_l_sum += l.item()
            train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
            n += y.shape[0]
        test_acc = evaluate_accuracy(test_iter, net)
        print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
              % (epoch + 1, train_l_sum / n, train_acc_sum / n, test_acc))
        loss_list.append(train_l_sum / n)
        train_acc_list.append(train_acc_sum / n)
        test_acc_list.append(test_acc)
    return loss_list, train_acc_list, test_acc_list

# Plotting helper
def show_img(epochs_list, loss_list, train_acc_list, test_acc_list):
    plt.subplot(211)
    plt.plot(epochs_list, train_acc_list, label='train acc', color='r')
    plt.plot(epochs_list, test_acc_list, label='test acc', color='g')
    plt.title('Train acc and test acc')
    plt.xlabel('Epochs')
    plt.ylabel('Acc')
    plt.tight_layout()
    plt.legend()
    plt.subplot(212)
    plt.plot(epochs_list, loss_list, label='train loss', color='b')
    plt.title('Train loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.tight_layout()
    plt.legend()
    plt.show()
V. Additional Function Notes
1. Python's startswith() method
Example program:
'''
s.startswith(prefix, beg=0, end=len(s))
prefix -- the string (or tuple of strings) to look for at the start
beg    -- optional, the position where the check starts
end    -- optional, the position where the check ends
'''
s = "I am your father"  # named s so the built-in str is not shadowed
print(s.startswith('I'))
print(s.startswith('am', 2, 4))
print(s.startswith('I', 2, 4))
Output:
True
True
False
2. The gather function
First, the definition of torch.gather:
torch.gather(input, dim, index, out=None) → Tensor
Function: for an input tensor input, along the specified dimension dim, pick values out of input one by one according to the indices given in index along that dimension.
Example program:
import torch
a = torch.Tensor([[1,2,3],[4,5,6]])
print(a)
index_1 = torch.LongTensor([[0,1],[2,0]])
index_2 = torch.LongTensor([[0,1,1],[0,0,0]])
b = torch.gather(a, dim=1, index=index_1)
print(b)
c = torch.gather(a, dim=0, index=index_2)
print(c)
# another way to write the same thing
d = a.gather(1, index_1)
print(b == d)
Output:
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[1., 2.],
        [6., 4.]])
tensor([[1., 5., 6.],
        [1., 2., 3.]])
tensor([[True, True],
        [True, True]])
Note: it is easy to see that dim in torch.gather(input, dim, index, out=None) is simply the dimension number. In this two-dimensional example, dim=0 means the lookup runs along the first dimension (the rows), and dim=1 means it runs along the second dimension (the columns). index must have the same number of dimensions as input (along the other dimensions its size must not exceed input's, while along dim it can be any size, as index_1 above shows), and its entries give the positions to pick along dimension dim. Concretely, for a 2-D tensor: with dim=0, out[i][j] = input[index[i][j]][j]; with dim=1, out[i][j] = input[i][index[i][j]]. For example, b[1][0] = a[1][index_1[1][0]] = a[1][2] = 6.
Note in particular that index must be of type LongTensor (torch.int64).
References: https://blog.csdn.net/edogawachia/article/details/80515038
https://blog.csdn.net/weixin_43301333/article/details/110929471
https://blog.csdn.net/Lucky_Rocks/article/details/79676095
A method similar to the gather function (integer array indexing):
Example program:
import torch
y = torch.tensor([0, 2])
y_hat = torch.tensor([[0.1, 0.3, 0.6], [0.3, 0.2, 0.5]])
print(y_hat[[0, 1], y])
# [0, 1] selects the first and the second sample; the values in y give the column positions to look up in y_hat
Output:
tensor([0.1000, 0.5000])
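This indexing trick can also be used to rewrite the cross_entropy function from demo2 without gather; a minimal sketch (my rewrite, not from the book), noting that it returns a 1-D tensor rather than the (N, 1) shape the gather version gives:
import torch
def cross_entropy(y_hat, y):
    # for each row i, pick the predicted probability of the true class y[i]
    return -torch.log(y_hat[range(len(y_hat)), y])
y = torch.tensor([0, 2])
y_hat = torch.tensor([[0.1, 0.3, 0.6], [0.3, 0.2, 0.5]])
print(cross_entropy(y_hat, y))  # tensor([2.3026, 0.6931])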
3. The isinstance() function
The isinstance() function checks whether an object is an instance of a given type, similar to type().
Usage: isinstance(object, classinfo)
object -- the instance to check.
classinfo -- a class name (direct or indirect), a built-in type, or a tuple made up of these.
Example program:
a = 1
print(isinstance(a, int))
Output:
True
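Two further points worth a quick check (my addition): classinfo may be a tuple of types, and unlike a plain type() comparison, isinstance() also accepts instances of subclasses:
a = 1
print(isinstance(a, (str, int, float)))  # True: int is one of the types in the tuple
print(isinstance(True, int))             # True: bool is a subclass of int
print(type(True) == int)                 # False: type() does not consider subclasses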