1. Download the dataset and convert it to tensors
import torchvision
from torchvision import transforms

train_set = torchvision.datasets.CIFAR10(root=data_path,
                                         train=True,
                                         download=True,
                                         transform=transforms.Compose([
                                             transforms.ToTensor()
                                         ]))
val_set = torchvision.datasets.CIFAR10(root=data_path,
                                       train=False,
                                       download=True,
                                       transform=transforms.Compose([
                                           transforms.ToTensor()
                                       ]))
2. Dataset sizes
len(train_set) shows that the training set has 50000 samples; likewise, len(val_set) shows that the test set has 10000 samples.
3. Type of each sample
type(train_set[0]) shows that each sample is a tuple, containing Img (the image data) and Label (the label data).
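The tuple structure can be sketched without downloading anything, using a random tensor of the same shape ([3, 32, 32]) standing in for a real CIFAR10 image:

```python
import torch

# A CIFAR10 sample is a tuple: (image tensor, integer class label).
# Here a random tensor fakes the image so no download is needed.
sample = (torch.rand(3, 32, 32), 6)

img, label = sample
print(type(sample))   # <class 'tuple'>
print(img.shape)      # torch.Size([3, 32, 32])
print(label)          # 6
```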
Reading a dataset involves three steps: Extract, Transform, Load.
Training requires setting up several components: model, optimizer, data (train, val), loss, learning rate, and epochs.
def training_loop(epochs, model, train_loader, loss_fn, optimizer, learning_rate):
    for epoch in range(epochs):
        for imgs, labels in train_loader:
            # Flatten [Batch, C, H, W] to [Batch, C*H*W] before the linear layer
            inputs = imgs.view(imgs.shape[0], -1)
            outputs = model(inputs)
            loss = loss_fn(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print("Epoch: {}, Loss: {}".format(epoch, loss.item()))
The DataLoader yields batches as tensors of shape
[Batch, Channels, Height, Width]
Before feeding a batch to the model, the tensor must be reshaped to
[Batch, Channels·Height·Width]
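The reshape can be sketched with view, using a random tensor standing in for a DataLoader batch:

```python
import torch

# A fake batch shaped like what the DataLoader yields: [Batch, Channels, Height, Width]
imgs = torch.rand(10, 3, 32, 32)

# Flatten everything except the batch dimension: [10, 3*32*32] = [10, 3072]
inputs = imgs.view(imgs.shape[0], -1)
print(inputs.shape)  # torch.Size([10, 3072])
```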
model
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3072, 128),
    nn.Tanh(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1))  # dim=1: normalize over the 10 classes, not over the batch
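With LogSoftmax taken over the class dimension (dim=1), each output row holds log-probabilities, so exponentiating a row sums to 1. A quick sanity check on random input:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3072, 128),
    nn.Tanh(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1))  # dim=1: log-softmax over the 10 classes, per sample

outputs = model(torch.rand(4, 3072))  # a fake batch of 4 flattened images
print(outputs.shape)                  # torch.Size([4, 10])
print(outputs.exp().sum(dim=1))       # each entry is (numerically) 1.0
```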
optimizer
optimizer = optim.SGD(model.parameters(), lr=lr)
data
train_loader = torch.utils.data.DataLoader(train_set, batch_size=10)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=10)
loss
loss_fn = nn.NLLLoss()
epochs & learning rate
epochs = 10
lr = 1e-2
Training
training_loop(epochs=epochs, model=model, train_loader=train_loader,
              loss_fn=loss_fn, optimizer=optimizer, learning_rate=lr)
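To check the loop end to end without downloading CIFAR10, it can be run on a small synthetic dataset (random "images" and random labels; same loop body as above, with the unused learning_rate argument omitted for brevity):

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def training_loop(epochs, model, train_loader, loss_fn, optimizer):
    for epoch in range(epochs):
        for imgs, labels in train_loader:
            inputs = imgs.view(imgs.shape[0], -1)  # flatten to [Batch, 3072]
            outputs = model(inputs)
            loss = loss_fn(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print("Epoch: {}, Loss: {}".format(epoch, loss.item()))

# Synthetic stand-in for CIFAR10: 64 random images and random labels in 0..9
fake_set = TensorDataset(torch.rand(64, 3, 32, 32), torch.randint(0, 10, (64,)))
loader = DataLoader(fake_set, batch_size=10)

model = nn.Sequential(nn.Linear(3072, 128), nn.Tanh(),
                      nn.Linear(128, 10), nn.LogSoftmax(dim=1))
optimizer = optim.SGD(model.parameters(), lr=1e-2)

training_loop(epochs=2, model=model, train_loader=loader,
              loss_fn=nn.NLLLoss(), optimizer=optimizer)
```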
Inspecting the parameters after training
model.parameters() >> a generator object (printing it only shows its address)
len(list(model.parameters())) >> 4
for name, params in model.named_parameters():
    print(name, params.shape)

which prints:
0.weight torch.Size([128, 3072])
0.bias torch.Size([128])
2.weight torch.Size([10, 128])
2.bias torch.Size([10])
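These four shapes account for every learnable parameter: 128·3072 + 128 + 10·128 + 10 = 394634, which matches summing numel() over model.parameters():

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3072, 128),
    nn.Tanh(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1))

# Total learnable parameters across the two Linear layers
total = sum(p.numel() for p in model.parameters())
print(total)  # 394634  (= 128*3072 + 128 + 10*128 + 10)
```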
How the trained parameters perform on the training set: accuracy
correct = 0
total = 0
with torch.no_grad():
    for imgs, labels in train_loader:
        outputs = model(imgs.view(imgs.shape[0], -1))
        # outputs.shape >> [batch, num_classes]
        # labels.shape  >> [batch]
        _, predict_index = torch.max(outputs, dim=1)
        total += labels.shape[0]
        correct += int((labels == predict_index).sum())
accuracy = float(correct) / float(total)
print(accuracy)
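The torch.max call is what turns the model output into class predictions; on a small hand-made batch the mechanics look like this (values chosen so the expected predictions are obvious):

```python
import torch

# Fake "outputs" for a batch of 3 samples over 4 classes
outputs = torch.tensor([[0.1, 0.7, 0.1, 0.1],
                        [0.9, 0.0, 0.0, 0.1],
                        [0.2, 0.2, 0.5, 0.1]])
labels = torch.tensor([1, 0, 3])

_, predict_index = torch.max(outputs, dim=1)  # index of the max value per row
print(predict_index)                          # tensor([1, 0, 2])

correct = int((labels == predict_index).sum())  # first two rows match the labels
print(correct / labels.shape[0])                # 2/3 accuracy
```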
After passing a batch through the model, the output of shape [batch, num_classes], e.g. [10, 10], holds the log-probabilities of the 10 classes for each of the ten images in the batch; the index of the largest value is the predicted class. For example,
outputs[0]
>>> tensor([-2.1677, -4.5608, -1.9927, -1.4298, -1.6841, -1.7170, -1.2153, -3.7270, -7.6631, -6.9029])
has its largest value, -1.2153, at index 6, and index 6 corresponds to frog among the labels.
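The index-to-name mapping follows the dataset's fixed class order (exposed by torchvision as train_set.classes); listed explicitly:

```python
# CIFAR10 class order, as exposed by torchvision's dataset via train_set.classes
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']

print(classes[6])  # frog
```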