Now we'll look at how to use pretrained networks to solve challenging computer vision problems. You'll use a network trained on ImageNet, available from torchvision.

ImageNet is a massive dataset of over 1 million labeled images in 1000 categories. It's used to train deep neural networks built from convolutional layers. I won't go into the details of convolutional networks here, but you can watch this video to learn about them.

Once trained, these models work astonishingly well as feature detectors for images they weren't trained on. Using a pretrained network on images not in the training set is called transfer learning. Here we'll use transfer learning to train a network to classify photos of cats and dogs with very high accuracy.
With torchvision.models you can download these pretrained networks and use them in your applications. We'll import models now.
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms, models
import helper
Most of the pretrained models require the input to be 224x224 images. We also need to match the normalization used when the models were trained: each color channel was normalized separately, with means [0.485, 0.456, 0.406] and standard deviations [0.229, 0.224, 0.225].
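To make the channel-wise normalization concrete, here is a small standalone sketch of what transforms.Normalize computes (the random tensor just stands in for an image; it is not part of the notebook's pipeline):

```python
import torch

# transforms.Normalize applies (input - mean) / std per channel.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

image = torch.rand(3, 224, 224)        # stand-in image tensor with values in [0, 1]
normalized = (image - mean) / std      # what Normalize produces

# Undoing the normalization recovers the original values,
# which is what helper functions do before displaying an image.
recovered = normalized * std + mean
print(torch.allclose(recovered, image, atol=1e-5))  # True
```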
data_dir = 'Cat_Dog_data'
# TODO: Define transforms for the training data and testing data
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.485, 0.456, 0.406],
                                                            [0.229, 0.224, 0.225])])

test_transforms = transforms.Compose([transforms.Resize(255),
                                      transforms.CenterCrop(224),
                                      transforms.ToTensor(),
                                      transforms.Normalize([0.485, 0.456, 0.406],
                                                           [0.229, 0.224, 0.225])])
# Pass transforms in here, then run the next cell to see how the transforms look
train_data = datasets.ImageFolder(data_dir + '/train', transform=train_transforms)
test_data = datasets.ImageFolder(data_dir + '/test', transform=test_transforms)
trainloader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(test_data, batch_size=64)
data_iter = iter(testloader)
images, labels = next(data_iter)
fig, axes = plt.subplots(figsize=(10,4), ncols=4)
for ii in range(4):
    ax = axes[ii]
    helper.imshow(images[ii], ax=ax, normalize=False)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
We can load a model such as DenseNet. Let's print out the model architecture so we can see what's going on under the hood.
model = models.densenet121(pretrained=True)
#model
This model is built out of two main parts: the features and the classifier. The features part is a stack of convolutional layers that together work as a feature detector feeding into the classifier. The classifier part is a single fully-connected layer, (classifier): Linear(in_features=1024, out_features=1000). This layer was trained on the ImageNet dataset, so it won't work for our problem. That means we need to replace the classifier, but the features will work perfectly on their own. In general, think of pretrained networks as amazingly good feature detectors that can be used as the input for simple feed-forward classifiers.
# Freeze parameters so we don't backprop through them
for param in model.parameters():
    param.requires_grad = False
from collections import OrderedDict
classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(1024, 500)),
    ('relu', nn.ReLU()),
    ('fc2', nn.Linear(500, 2)),
    ('output', nn.LogSoftmax(dim=1))
]))
model.classifier = classifier
model
DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu0): ReLU(inplace=True)
(pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(denseblock1): _DenseBlock(
(denselayer1): _DenseLayer(
(norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
...
...
...
(denselayer16): _DenseLayer(
(norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(classifier): Sequential(
(fc1): Linear(in_features=1024, out_features=500, bias=True)
(relu): ReLU()
(fc2): Linear(in_features=500, out_features=2, bias=True)
(output): LogSoftmax()
)
)
With our model built, we need to train the classifier. However, we're now using a really deep neural network. If you try to train this on a CPU as usual, it will take a very long time. Instead, we'll use the GPU for the calculations. On a GPU, the linear algebra computations run in parallel, leading to roughly 100x faster training. It's also possible to train on multiple GPUs, further decreasing training time.

PyTorch, like the other deep learning frameworks, uses CUDA to efficiently compute the forward and backward passes on the GPU. In PyTorch, you move your model parameters and other tensors to GPU memory with model.to('cuda'). You can move them back from the GPU with model.to('cpu'), which you'll commonly do when you need to operate on the network output outside of PyTorch. As a demonstration of the speed difference, I'll compare how long a forward and backward pass takes with and without a GPU.
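As a quick aside, the .to / .cpu round trip looks like this (a minimal sketch; the final conversion assumes NumPy is installed alongside PyTorch):

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(2, 3).to(device)   # move the tensor to the chosen device
y = (x * 2).cpu().numpy()          # move back to the CPU before leaving PyTorch

print(type(y).__name__, y.shape)   # a NumPy array with the same shape
```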
import time
for device in ['cpu', 'cpu']:  # no GPU on this machine; use ['cpu', 'cuda'] to compare

    criterion = nn.NLLLoss()
    # Only train the classifier parameters, feature parameters are frozen
    optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)

    model.to(device)

    for ii, (inputs, labels) in enumerate(trainloader):
        # Move input and label tensors to the GPU
        inputs, labels = inputs.to(device), labels.to(device)

        start = time.time()

        outputs = model.forward(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if ii==3:
            break

    print(f"Device = {device}; Time per batch: {(time.time() - start)/3:.3f} seconds")
Device = cpu; Time per batch: 1.038 seconds
Device = cpu; Time per batch: 1.030 seconds
You can write device-agnostic code that will automatically use CUDA if it's enabled, by asking whether a GPU device is available first:
# at beginning of the script
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

…

# then whenever you get a new Tensor or Module
# this won't copy if they are already on the desired device
input = data.to(device)
model = MyModule(…).to(device)
From here, it's up to you to finish training the model. The process is the same as before, but now the model is much more powerful, so you should easily be able to reach better than 95% accuracy.

**Exercise:** Train the pretrained model to classify the cat and dog images. You can continue with the DenseNet model or try ResNet; both are good to try. Remember that you only need to train the classifier; the parameters of the features part should stay frozen.
## TODO: Use a pretrained model to classify the cat and dog images
# at beginning of the script
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model2 = models.densenet121(pretrained=True)
for param in model2.parameters():
    param.requires_grad = False

model2.classifier = nn.Sequential(nn.Linear(1024, 256),
                                  nn.ReLU(),
                                  nn.Dropout(0.2),
                                  nn.Linear(256, 2),
                                  nn.LogSoftmax(dim=1))
criterion = nn.NLLLoss()
optimizer = optim.Adam(model2.classifier.parameters(),lr=0.003)
model2.to(device)
DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu0): ReLU(inplace=True)
...
...
...
(denselayer16): _DenseLayer(
(norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(classifier): Sequential(
(0): Linear(in_features=1024, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=256, out_features=2, bias=True)
(4): LogSoftmax()
)
)
epochs = 1
steps = 0
running_loss = 0
print_every = 5
for epoch in range(epochs):
    for inputs, labels in trainloader:
        steps += 1
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()

        logps = model2.forward(inputs)
        loss = criterion(logps, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        if steps % print_every == 0:
            test_loss = 0
            accuracy = 0
            model2.eval()
            with torch.no_grad():
                for inputs, labels in testloader:
                    inputs, labels = inputs.to(device), labels.to(device)
                    logps = model2.forward(inputs)
                    batch_loss = criterion(logps, labels)
                    test_loss += batch_loss.item()

                    # Calculate accuracy
                    ps = torch.exp(logps)
                    top_p, top_class = ps.topk(1, dim=1)
                    equals = top_class == labels.view(*top_class.shape)
                    accuracy += torch.mean(equals.type(torch.FloatTensor)).item()

            print(f"Epoch {epoch+1}/{epochs}.. "
                  f"step: {steps}..."
                  f"Train loss: {running_loss/print_every:.3f}.. "
                  f"Test loss: {test_loss/len(testloader):.3f}.. "
                  f"Test accuracy: {accuracy/len(testloader):.3f}")

            if accuracy/len(testloader) > 0.98:
                break

            running_loss = 0
            model2.train()
Epoch 1/1.. step: 5...Train loss: 0.857.. Test loss: 0.748.. Test accuracy: 0.568
...
...
...
Epoch 1/1.. step: 50...Train loss: 0.159.. Test loss: 0.052.. Test accuracy: 0.982
def test_model(my_model, my_device, my_epochs, my_trainloader, my_testloader):
    steps = 0
    running_loss = 0
    print_every = 5

    my_criterion = nn.NLLLoss()
    my_optimizer = optim.Adam(my_model.classifier.parameters(), lr=0.003)
    my_model.to(my_device)

    for epoch in range(my_epochs):
        for inputs, labels in my_trainloader:
            steps += 1
            inputs, labels = inputs.to(my_device), labels.to(my_device)

            my_optimizer.zero_grad()

            logps = my_model.forward(inputs)
            loss = my_criterion(logps, labels)
            loss.backward()
            my_optimizer.step()

            running_loss += loss.item()

            if steps % print_every == 0:
                test_loss = 0
                accuracy = 0
                my_model.eval()
                with torch.no_grad():
                    for inputs, labels in my_testloader:
                        inputs, labels = inputs.to(my_device), labels.to(my_device)
                        logps = my_model.forward(inputs)
                        batch_loss = my_criterion(logps, labels)
                        test_loss += batch_loss.item()

                        # Calculate accuracy
                        ps = torch.exp(logps)
                        top_p, top_class = ps.topk(1, dim=1)
                        equals = top_class == labels.view(*top_class.shape)
                        accuracy += torch.mean(equals.type(torch.FloatTensor)).item()

                print(f"Epoch {epoch+1}/{my_epochs}.. "
                      f"step: {steps}..."
                      f"Train loss: {running_loss/print_every:.3f}.. "
                      f"Test loss: {test_loss/len(my_testloader):.3f}.. "
                      f"Test accuracy: {accuracy/len(my_testloader):.3f}")

                running_loss = 0
                if accuracy/len(my_testloader) > 0.92:
                    break
                my_model.train()
import time
# at beginning of the script
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model_type = models.densenet121(pretrained=True)
model_type
DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
...
...
...
(denselayer16): _DenseLayer(
(norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(classifier): Linear(in_features=1024, out_features=1000, bias=True)
)
for param in model_type.parameters():
    param.requires_grad = False

model_type.classifier = nn.Sequential(nn.Linear(1024, 256),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(256, 2),
                                      nn.LogSoftmax(dim=1))
model_type
DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu0): ReLU(inplace=True)
(pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
...
...
...
(denselayer16): _DenseLayer(
(norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(classifier): Sequential(
(0): Linear(in_features=1024, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=256, out_features=2, bias=True)
(4): LogSoftmax()
)
)
EPOCHS = 1
start = time.time()
test_model(model_type,device,EPOCHS,trainloader,testloader)
print(f"Device = {device}; model_type='densenet121',Time per batch: {(time.time() - start):.3f} seconds")
Epoch 1/1.. step: 5...Train loss: 0.727.. Test loss: 0.233.. Test accuracy: 0.957
Epoch 1/1.. step: 10...Train loss: 0.370.. Test loss: 0.175.. Test accuracy: 0.940
Epoch 1/1.. step: 15...Train loss: 0.305.. Test loss: 0.109.. Test accuracy: 0.962
Epoch 1/1.. step: 20...Train loss: 0.214.. Test loss: 0.090.. Test accuracy: 0.970
Epoch 1/1.. step: 25...Train loss: 0.192.. Test loss: 0.145.. Test accuracy: 0.943
Epoch 1/1.. step: 30...Train loss: 0.235.. Test loss: 0.073.. Test accuracy: 0.973
Device = cpu; model_type='densenet121',Time per batch: 1363.256 seconds
model_type = models.alexnet(pretrained=True)
model_type
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
(4): Linear(in_features=4096, out_features=4096, bias=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)
for param in model_type.parameters():
    param.requires_grad = False

model_type.classifier = nn.Sequential(nn.Linear(9216, 4096),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(4096, 256),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(256, 2),
                                      nn.LogSoftmax(dim=1))
model_type
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Linear(in_features=9216, out_features=4096, bias=True)
(1): ReLU()
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=4096, out_features=256, bias=True)
(4): ReLU()
(5): Dropout(p=0.2, inplace=False)
(6): Linear(in_features=256, out_features=2, bias=True)
(7): LogSoftmax()
)
)
EPOCHS = 1
start = time.time()
test_model(model_type,device,EPOCHS,trainloader,testloader)
print(f"Device = {device}; model_type='alexnet',Time per batch: {(time.time() - start)/1:.3f} seconds")
Epoch 1/1.. step: 5...Train loss: 24.561.. Test loss: 1.131.. Test accuracy: 0.490
Epoch 1/1.. step: 10...Train loss: 0.979.. Test loss: 0.261.. Test accuracy: 0.896
Epoch 1/1.. step: 15...Train loss: 0.499.. Test loss: 0.432.. Test accuracy: 0.791
Epoch 1/1.. step: 20...Train loss: 0.503.. Test loss: 0.277.. Test accuracy: 0.896
Epoch 1/1.. step: 25...Train loss: 0.395.. Test loss: 0.188.. Test accuracy: 0.922
Device = cpu; model_type='alexnet',Time per batch: 157.419 seconds
model_type = models.vgg16(pretrained=True)
model_type
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /home/leon/.cache/torch/checkpoints/vgg16-397923af.pth
100.0%
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
...
(29): ReLU(inplace=True)
(30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.5, inplace=False)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.5, inplace=False)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)
for param in model_type.parameters():
    param.requires_grad = False

model_type.classifier = nn.Sequential(nn.Linear(25088, 4096),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(4096, 256),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(256, 2),
                                      nn.LogSoftmax(dim=1))
model_type
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace=True)
(16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): ReLU(inplace=True)
(19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace=True)
(23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(25): ReLU(inplace=True)
(26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(27): ReLU(inplace=True)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace=True)
(30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU()
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=4096, out_features=256, bias=True)
(4): ReLU()
(5): Dropout(p=0.2, inplace=False)
(6): Linear(in_features=256, out_features=2, bias=True)
(7): LogSoftmax()
)
)
EPOCHS = 1
start = time.time()
test_model(model_type,device,EPOCHS,trainloader,testloader)
print(f"Device = {device}; model_type='vgg16',Time per batch: {(time.time() - start)/1:.3f} seconds")
Epoch 1/1.. step: 5...Train loss: 45.882.. Test loss: 3.707.. Test accuracy: 0.518
Epoch 1/1.. step: 10...Train loss: 1.980.. Test loss: 0.121.. Test accuracy: 0.957
Device = cpu; model_type='vgg16',Time per batch: 672.757 seconds
You'll want to check the shapes of the tensors going through your model and other code; use the .shape method while debugging and developing.

If your network isn't training well, check a few things:

Make sure you're clearing the gradients in the training loop with optimizer.zero_grad(). If you're doing a validation loop, be sure to set the network to evaluation mode with model.eval(), then back to training mode with model.train().
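The checklist above can be sketched as a minimal train/validate loop body (a toy linear model and random data stand in for the real network and loaders):

```python
import torch
from torch import nn, optim

# Toy stand-ins for the real model and data
model = nn.Linear(4, 2)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(8, 4)
labels = torch.randint(0, 2, (8,))

model.train()                       # training mode (dropout/batchnorm active)
optimizer.zero_grad()               # clear gradients from the previous step
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()

model.eval()                        # evaluation mode for the validation pass
with torch.no_grad():               # no gradient tracking during validation
    val_loss = criterion(model(inputs), labels)
model.train()                       # back to training mode afterwards
```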
CUDA errors

Sometimes you'll see this error:

RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #1 ‘mat1’

The second type, torch.cuda.FloatTensor, means the tensor has been moved to the GPU. The operation is expecting a torch.FloatTensor, with no .cuda in there, so that tensor should be on the CPU. PyTorch can only perform operations on tensors that are on the same device, so everything must be on the CPU or everything on the GPU. If you're trying to run your network on the GPU, make sure to move the model and all the necessary tensors there with .to(device), where device is either "cuda" or "cpu".
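A minimal sketch of the fix, with a toy nn.Linear standing in for the network (on a CPU-only machine both objects end up on the CPU, so the call succeeds either way):

```python
import torch
from torch import nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 2).to(device)  # model parameters live on `device`
x = torch.randn(1, 10)               # new tensors start out on the CPU

# Passing the CPU tensor directly would raise the device-mismatch error
# when `device` is a GPU; .to(device) makes the devices agree
# (and is a no-op if the tensor is already there).
out = model(x.to(device))
print(out.shape)  # torch.Size([1, 2])
```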