PyTorch has two primitives for working with data: torch.utils.data.DataLoader and torch.utils.data.Dataset. A Dataset stores the samples and their corresponding labels, and a DataLoader wraps an iterable around the Dataset.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
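To make the Dataset abstraction concrete, here is a minimal sketch of a custom map-style Dataset. The class name and the in-memory tensors are hypothetical and used only for illustration; the FashionMNIST dataset used below already provides a ready-made Dataset.

import torch
from torch.utils.data import Dataset

class RandomTensorDataset(Dataset):
    """Hypothetical Dataset holding random samples and integer labels in memory."""
    def __init__(self, num_samples=100):
        self.samples = torch.randn(num_samples, 1, 28, 28)   # fake image tensors
        self.labels = torch.randint(0, 10, (num_samples,))   # fake class labels

    def __len__(self):
        # Number of samples in the dataset
        return len(self.labels)

    def __getitem__(self, idx):
        # Return one (sample, label) pair
        return self.samples[idx], self.labels[idx]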
PyTorch offers domain-specific libraries such as TorchText, TorchVision, and TorchAudio, all of which include datasets. In this tutorial, we will use a TorchVision dataset.
The torchvision.datasets module contains Dataset objects for many open datasets, such as CIFAR and COCO (full list here). In this tutorial, we use the FashionMNIST dataset. Every TorchVision Dataset includes two arguments: transform and target_transform, which modify the samples and the labels respectively.
# Download the training dataset
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download the test dataset
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz
100%|##########| 26421880/26421880 [00:01<00:00, 16141263.69it/s]
Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
100%|##########| 29515/29515 [00:00<00:00, 267648.41it/s]
Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
100%|##########| 4422102/4422102 [00:00<00:00, 5042743.74it/s]
Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
100%|##########| 5148/5148 [00:00<00:00, 21527693.91it/s]
Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw
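The download code above only uses transform. As a hedged sketch of how target_transform could modify the labels as well, one might one-hot encode them with a Lambda transform; this is an illustration only and is not part of the quickstart.

import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

# Illustration only: one-hot encode the integer label via target_transform.
onehot_training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
    target_transform=Lambda(
        lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1)
    ),
)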
We pass the Dataset as an argument to DataLoader. The DataLoader wraps an iterable over the Dataset and supports automatic batching, sampling, shuffling, and multi-process data loading. Here we define a batch size of 64, i.e. each iteration of the dataloader returns a batch of 64 samples and their corresponding labels.
# Set the batch size to 64
batch_size = 64

# Create data loaders.
# Training data loader
train_dataloader = DataLoader(training_data, batch_size=batch_size)
# Test data loader
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break
Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
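The loaders above only use automatic batching; the shuffling and multi-process loading mentioned earlier are enabled through extra DataLoader arguments. A short sketch, where the worker count is an arbitrary example value:

# Illustration: shuffle the training data each epoch and load batches in
# parallel with 2 worker processes (num_workers=2 is an arbitrary example).
shuffled_train_dataloader = DataLoader(
    training_data,
    batch_size=batch_size,
    shuffle=True,
    num_workers=2,
)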
Read more about loading data in PyTorch.
To define a neural network in PyTorch, we create a class that inherits from nn.Module. We define the layers of the network in the __init__ function and specify how data passes through the network in the forward function. To accelerate operations in the neural network, we move it to the GPU if one is available.
# Get cpu or gpu device for training.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

# Define the NeuralNetwork model, which subclasses nn.Module
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    # Override the forward pass
    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

# Move the model to the GPU (if available)
model = NeuralNetwork().to(device)
print(model)
Using cuda device
NeuralNetwork(
(flatten): Flatten(start_dim=1, end_dim=-1)
(linear_relu_stack): Sequential(
(0): Linear(in_features=784, out_features=512, bias=True)
(1): ReLU()
(2): Linear(in_features=512, out_features=512, bias=True)
(3): ReLU()
(4): Linear(in_features=512, out_features=10, bias=True)
)
)
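As a quick sanity check on the printed architecture, the three linear layers hold 784*512+512, 512*512+512, and 512*10+10 weights and biases, about 670K parameters in total. A small sketch, assuming the model object defined above, to verify this:

# Count the trainable parameters of the model defined above.
total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {total_params}")  # expected: 669,706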
Read more about building neural networks in PyTorch.
To train a model, we need a loss function and an optimizer.
# Loss function
loss_fn = nn.CrossEntropyLoss()
# Optimizer: stochastic gradient descent (SGD) with learning rate lr=1e-3
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
In a single training loop, the model makes predictions on the training dataset (fed to it in batches) and backpropagates the prediction error to adjust the model's parameters.
# Training function
def train(dataloader, model, loss_fn, optimizer):
    # Total number of samples in the dataset
    size = len(dataloader.dataset)
    # Put the model in training mode.
    # model.train() enables the training behaviour of layers such as batch
    # normalization and dropout: BN layers use the mean and variance of each
    # batch, and dropout randomly drops a subset of connections while the
    # parameters are being updated.
    model.train()
    # Feed the data to the model batch by batch
    for batch, (X, y) in enumerate(dataloader):
        # Move the data to the GPU
        X, y = X.to(device), y.to(device)
        # Compute the prediction
        pred = model(X)
        # Compute the loss
        loss = loss_fn(pred, y)
        # Zero the gradients
        optimizer.zero_grad()
        # Backpropagation
        loss.backward()
        # Update the weights
        optimizer.step()
        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
We also check the model's performance against the test dataset to make sure it is learning.
# Evaluation on the test set
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    # Put the model in evaluation mode.
    # model.eval() disables the training behaviour of batch normalization and
    # dropout; without it, batch-norm statistics would still be updated by any
    # input passed through the model, even when we are not training.
    model.eval()
    test_loss, correct = 0, 0
    # torch.no_grad() disables gradient tracking during evaluation
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
The training process runs over several iterations (epochs). During each epoch, the model learns parameters to make better predictions. We print the model's accuracy and loss at each epoch; we would like to see the accuracy increase and the loss decrease with every epoch.
# Number of training epochs (default: 5)
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    # Train
    train(train_dataloader, model, loss_fn, optimizer)
    # Evaluate
    test(test_dataloader, model, loss_fn)
print("Done!")
Epoch 1
-------------------------------
loss: 2.314893 [ 0/60000]
loss: 2.295206 [ 6400/60000]
loss: 2.278248 [12800/60000]
loss: 2.261804 [19200/60000]
loss: 2.259621 [25600/60000]
loss: 2.220173 [32000/60000]
loss: 2.232810 [38400/60000]
loss: 2.199674 [44800/60000]
loss: 2.190488 [51200/60000]
loss: 2.160208 [57600/60000]
Test Error:
Accuracy: 34.4%, Avg loss: 2.153365
Epoch 2
-------------------------------
loss: 2.172394 [ 0/60000]
loss: 2.158403 [ 6400/60000]
loss: 2.104490 [12800/60000]
loss: 2.118272 [19200/60000]
loss: 2.084654 [25600/60000]
loss: 2.008146 [32000/60000]
loss: 2.046550 [38400/60000]
loss: 1.967219 [44800/60000]
loss: 1.970731 [51200/60000]
loss: 1.904694 [57600/60000]
Test Error:
Accuracy: 56.2%, Avg loss: 1.898913
Epoch 3
-------------------------------
loss: 1.931435 [ 0/60000]
loss: 1.901530 [ 6400/60000]
loss: 1.791456 [12800/60000]
loss: 1.836366 [19200/60000]
loss: 1.731514 [25600/60000]
loss: 1.669595 [32000/60000]
loss: 1.696603 [38400/60000]
loss: 1.593150 [44800/60000]
loss: 1.619061 [51200/60000]
loss: 1.514459 [57600/60000]
Test Error:
Accuracy: 61.3%, Avg loss: 1.526715
Epoch 4
-------------------------------
loss: 1.593652 [ 0/60000]
loss: 1.553579 [ 6400/60000]
loss: 1.409360 [12800/60000]
loss: 1.483904 [19200/60000]
loss: 1.365681 [25600/60000]
loss: 1.352317 [32000/60000]
loss: 1.362342 [38400/60000]
loss: 1.286407 [44800/60000]
loss: 1.323951 [51200/60000]
loss: 1.222849 [57600/60000]
Test Error:
Accuracy: 63.8%, Avg loss: 1.248915
Epoch 5
-------------------------------
loss: 1.327686 [ 0/60000]
loss: 1.306169 [ 6400/60000]
loss: 1.145820 [12800/60000]
loss: 1.253144 [19200/60000]
loss: 1.131874 [25600/60000]
loss: 1.150164 [32000/60000]
loss: 1.162306 [38400/60000]
loss: 1.102028 [44800/60000]
loss: 1.143301 [51200/60000]
loss: 1.062239 [57600/60000]
Test Error:
Accuracy: 65.4%, Avg loss: 1.083091
Done!
Read more about training models.
A common way to save a model is to serialize the internal state dictionary (containing the model parameters).
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")
Saved PyTorch Model State to model.pth
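An alternative, shown here only as a hedged sketch, is to serialize the entire model object rather than just its state dictionary; this ties the saved file to the exact class definition and pickle format, which is why the state-dict approach above is generally preferred.

# Illustration: save and reload the whole model object (requires the
# NeuralNetwork class definition to be importable when loading).
torch.save(model, "model_full.pth")
model_full = torch.load("model_full.pth")
# (newer PyTorch releases may additionally require weights_only=False here)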
The process for loading a model includes re-creating the model structure and loading the state dictionary into it.
# Re-create the model structure
model = NeuralNetwork()
# Load the saved weights
model.load_state_dict(torch.load("model.pth"))
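Because the state dictionary above was saved from a model living on the GPU, loading it on a CPU-only machine may require remapping the tensors; a short sketch of that case:

# Illustration: load the saved weights on a machine without a GPU by
# remapping all tensors to the CPU.
cpu_model = NeuralNetwork()
cpu_model.load_state_dict(torch.load("model.pth", map_location=torch.device("cpu")))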
This model can now be used to make predictions.
# Class labels for prediction
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

# Put the model in evaluation mode
model.eval()
# Pick a sample to predict
x, y = test_data[0][0], test_data[0][1]
# Disable gradient tracking for inference
with torch.no_grad():
    # Run the model to get a prediction
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')
Predicted: "Ankle boot", Actual: "Ankle boot"
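The model outputs raw logits; to read them as class probabilities, they can be passed through a softmax. A short sketch, reusing the pred and predicted values from the prediction above:

# Illustration: convert the logits from the prediction above into probabilities.
probs = nn.Softmax(dim=1)(pred)
print(f'P("{predicted}") = {probs[0].max().item():.3f}')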