This article uses the vgg16 model from torchvision, which classifies images into 1000 categories. Building on AlexNet, VGG stacks small 3×3 convolution kernels to increase network depth, which gives it strong generalization ability.
First, download the vgg16 model. The Python code is as follows:
import torchvision
# Download path: C:\Users\win10\.cache\torch\hub\checkpoints
vgg16_false = torchvision.models.vgg16(pretrained=False)
vgg16_true = torchvision.models.vgg16(pretrained=True)
print("ok")
Download output:
G:\Anaconda3\envs\pytorch\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
G:\Anaconda3\envs\pytorch\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=None`.
warnings.warn(msg)
G:\Anaconda3\envs\pytorch\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
ok
Inspect the internal structure of the pretrained and non-pretrained models:
import torchvision
vgg16_false = torchvision.models.vgg16(pretrained=False)
vgg16_true = torchvision.models.vgg16(pretrained=True)
print(vgg16_true)
print(vgg16_false)
The pretrained and non-pretrained models share the same overall structure, but the parameters (weights and biases) of the internal layers differ.
The output is as follows:
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
The last layer of (classifier) has out_features=1000, which means the model supports 1000-class image classification.
Transfer learning is a subfield of machine learning in which a model already trained on one task is reused on a different but related task. The model carries over the knowledge learned on the original task, so it can solve the new task faster and more accurately.
This article uses the CIFAR10 dataset, which contains images of 10 classes, and adapts the vgg16 model to classify them. To feed this dataset into vgg16, the model has to be modified. Its current classifier ends with:
(classifier): Sequential(
  ... ...
  (6): Linear(in_features=4096, out_features=1000, bias=True)
)
Add a module with the add_module() function. The classifier's last linear layer maps 4096 features to 1000 outputs, so append an extra linear layer that maps those 1000 outputs down to 10.
import torchvision
from torch import nn
vgg16_true = torchvision.models.vgg16(pretrained=True)
print(vgg16_true)
vgg16_true.classifier.add_module('add_linear', nn.Linear(1000, 10))
print(vgg16_true)
Output (partial):
(classifier): Sequential(
  (0): Linear(in_features=25088, out_features=4096, bias=True)
  (1): ReLU(inplace=True)
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=4096, out_features=4096, bias=True)
  (4): ReLU(inplace=True)
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=4096, out_features=1000, bias=True)
  (add_linear): Linear(in_features=1000, out_features=10, bias=True)
)
Alternatively, modify layer 6 of the classifier in place. The last linear layer maps 4096 features to 1000 outputs; change it so that it maps 4096 features to 10 outputs instead.
import torchvision
from torch import nn
vgg16_false = torchvision.models.vgg16(pretrained=False)
print(vgg16_false)
vgg16_false.classifier[6] = nn.Linear(4096, 10)
print(vgg16_false)
Output (partial):
(classifier): Sequential(
  (0): Linear(in_features=25088, out_features=4096, bias=True)
  (1): ReLU(inplace=True)
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=4096, out_features=4096, bias=True)
  (4): ReLU(inplace=True)
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=4096, out_features=10, bias=True)
)
There are two ways to save a model. The first saves the model structure together with its parameters; the second saves only the parameters, stored as a dictionary (the state_dict).
The Python code is as follows:
import torch
import torchvision
from torch import nn
from torch.nn import Linear, Conv2d, MaxPool2d, Flatten, Sequential, CrossEntropyLoss
vgg16_false = torchvision.models.vgg16(pretrained=False)  # untrained model
# Save method 1: model structure + parameters
torch.save(vgg16_false, "G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\vgg16_method1.pth")
# Save method 2: parameters only (officially recommended)
torch.save(vgg16_false.state_dict(), "G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\vgg16_method2.pth")
Saving a modified model, or a model you wrote yourself:
# The MYNN class definition must be available both when saving and when loading the model
class MYNN(nn.Module):
    def __init__(self):
        super(MYNN, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2, stride=1),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2, stride=1),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2, stride=1),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )
    def forward(self, x):
        x = self.model1(x)
        return x
mynn = MYNN()
torch.save(mynn, "G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\mynn_method1.pth")
There are two ways to load saved model data. With the first, the loaded model can be used directly; with the second, the parameter dictionary has to be loaded into a freshly built network.
import torch
import torchvision
from torch import nn
from torch.nn import Linear, Conv2d, MaxPool2d, Flatten, Sequential, CrossEntropyLoss
# Method 1: load the whole model
vgg16_import = torch.load("G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\vgg16_method1.pth")
print(vgg16_import)
# Method 2: load the parameters (dictionary)
vgg16_import2 = torch.load("G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\vgg16_method2.pth")
vgg16_new = torchvision.models.vgg16(pretrained=False)  # rebuild the model structure
vgg16_new.load_state_dict(vgg16_import2)  # fill the model with the loaded parameters
print(vgg16_import2)
print(vgg16_new)
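One caveat on newer PyTorch versions (2.6+): `torch.load` defaults to `weights_only=True`, which happily loads a state_dict but refuses a whole pickled model (method 1). A small round-trip sketch with a stand-in `nn.Linear` model and hypothetical local file names:

```python
import torch
from torch import nn

model = nn.Linear(4, 2)

# Method 2 (state_dict) works with the safe default weights_only=True
torch.save(model.state_dict(), "demo_method2.pth")
state = torch.load("demo_method2.pth", weights_only=True)
new_model = nn.Linear(4, 2)
new_model.load_state_dict(state)

# Method 1 (whole pickled model) needs weights_only=False on those versions,
# which should only be used for files you trust
torch.save(model, "demo_method1.pth")
model2 = torch.load("demo_method1.pth", weights_only=False)
```

This is another reason the state_dict route is the officially recommended one: it is both safer to load and more robust across code refactors.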
To load your own saved model, the Python code is as follows:
# The definition of your own network class is required
class MYNN(nn.Module):
    def __init__(self):
        super(MYNN, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2, stride=1),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2, stride=1),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2, stride=1),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )
    def forward(self, x):
        x = self.model1(x)
        return x
model = torch.load("G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\mynn_method1.pth")
print(model)
Output when loading the custom model:
MYNN(
  (model1): Sequential(
    (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Flatten(start_dim=1, end_dim=-1)
    (7): Linear(in_features=1024, out_features=64, bias=True)
    (8): Linear(in_features=64, out_features=10, bias=True)
  )
)
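The custom model above was saved with method 1; method 2 (state_dict) works just as well, provided the class definition is available when loading. A sketch with a hypothetical local file name:

```python
import torch
from torch import nn
from torch.nn import Linear, Conv2d, MaxPool2d, Flatten, Sequential

class MYNN(nn.Module):
    def __init__(self):
        super(MYNN, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2, stride=1),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2, stride=1),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2, stride=1),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )
    def forward(self, x):
        return self.model1(x)

mynn = MYNN()
torch.save(mynn.state_dict(), "mynn_method2.pth")  # save parameters only

restored = MYNN()                                  # rebuild the structure...
restored.load_state_dict(torch.load("mynn_method2.pth"))  # ...and fill in the parameters
```

Compared to pickling the whole object, the state_dict file does not embed the module path of MYNN, so renaming or moving the class later does not break loading.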
Steps for training a neural network:
1. Load the dataset
2. Wrap it in a DataLoader
3. Build the network model
4. Instantiate the model
5. Create the loss function
6. Set up the optimizer
7. Set the training parameters
8. Add TensorBoard
9. Training loop:
(1) Train on the training set
a. Feed the training images into the network
b. Compute the loss
c. Zero the gradients
d. Backpropagate
e. Update the model parameters with the optimizer
f. Log the loss value
(2) Evaluate on the test set
a. Feed the test images into the network
b. Compute the loss
c. Count the correctly predicted targets
d. Log the loss and the accuracy
(3) Save the network after each training epoch
The Python code is as follows:
import torchvision
from torch.utils.data import DataLoader
from torch import nn
import torch
from torch.utils.tensorboard import SummaryWriter
# Load the datasets
train_data = torchvision.datasets.CIFAR10(root="G:\\Anaconda\\pycharm_pytorch\\learning_project\\dataset_CIFAR10",
                                          train=True,
                                          transform=torchvision.transforms.ToTensor(),
                                          download=False)
test_data = torchvision.datasets.CIFAR10(root="G:\\Anaconda\\pycharm_pytorch\\learning_project\\dataset_CIFAR10",
                                         train=False,
                                         transform=torchvision.transforms.ToTensor(),
                                         download=False)
train_data_size = len(train_data)  # length of the training set
test_data_size = len(test_data)  # length of the test set
print("Length of the training set: {}".format(train_data_size))
print("Length of the test set: {}".format(test_data_size))
# Load the datasets with DataLoader
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
# Build the network
class MYNN(nn.Module):
    def __init__(self):
        super(MYNN, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        x = self.model(x)
        return x
# Instantiate the network
mynn = MYNN()
# Loss function
loss_fcn = nn.CrossEntropyLoss()
# Optimizer
learning_rate = 0.01
optimizer = torch.optim.SGD(mynn.parameters(), lr=learning_rate)
# Training parameters
## counter of training steps
total_train_step = 0
## counter of test steps
total_test_step = 0
## number of epochs
epoch = 10
# Add TensorBoard
writer = SummaryWriter("logs_mynn_train")
for i in range(epoch):
    print("\r\n-------------Epoch {} starts---------------".format(i+1))
    # Training phase
    #mynn.train()  # put the network into training mode
    for data in train_dataloader:
        imgs, targets = data
        outputs = mynn(imgs)
        loss = loss_fcn(outputs, targets)
        # Optimize the model
        optimizer.zero_grad()  # zero the gradients
        loss.backward()  # backpropagate to compute the gradients
        optimizer.step()  # update the network parameters
        # Log the loss
        total_train_step = total_train_step + 1
        output_steps = 25
        if total_train_step % output_steps == 0:
            print("Training step: {}, Loss: {:.4f}".format(total_train_step, loss.item()))
            writer.add_scalar("train_loss", loss.item(), total_train_step)
    # Test phase
    #mynn.eval()  # put the network into evaluation mode
    total_test_loss = 0
    total_test_accuracy = 0  # total number of correct predictions
    with torch.no_grad():  # no gradient computation
        for data in test_dataloader:
            imgs, targets = data
            outputs = mynn(imgs)
            loss = loss_fcn(outputs, targets)
            total_test_loss = total_test_loss + loss
            accuracy = (outputs.argmax(1) == targets).sum()  # count the correct predictions
            total_test_accuracy = total_test_accuracy + accuracy
    print("Loss on the whole test set: {:.4f}".format(total_test_loss))
    print("Accuracy on the whole test set: {:.4f}".format(total_test_accuracy/test_data_size))
    writer.add_scalar("test_loss", total_test_loss.item(), total_test_step)
    writer.add_scalar("test_accuracy", total_test_accuracy/test_data_size, total_test_step)
    total_test_step = total_test_step + 1
    # Save the network after each epoch
    torch.save(mynn, "G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\mynn_AutoSave\\mynn_{}.pth".format(i))
    print("Model saved.")
writer.close()
The code includes calls to train() and eval() (commented out above); these functions switch the network's working mode.
Reference link:
https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module
train()
Sets the module to training mode.
This only has an effect on certain modules. See the documentation of the specific modules for details of their behavior in training/evaluation mode, e.g. Dropout, BatchNorm, etc.
eval()
Sets the module to evaluation mode.
This only has an effect on certain modules. See the documentation of the specific modules for details of their behavior in training/evaluation mode, e.g. Dropout, BatchNorm, etc.
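The difference is easy to observe with a `Dropout` layer on its own: in training mode it zeroes elements at random and scales the survivors by 1/(1-p), while in evaluation mode it is the identity:

```python
import torch
from torch import nn

dropout = nn.Dropout(p=0.5)
x = torch.ones(8)

dropout.train()     # training mode: roughly half the elements become 0,
y_train = dropout(x)  # the rest are scaled by 1/(1-0.5) = 2
print(y_train)      # a mix of 0.0 and 2.0 values (random per run)

dropout.eval()      # evaluation mode: Dropout passes the input through unchanged
y_eval = dropout(x)
print(y_eval)       # tensor of ones, identical to x
```

This is why forgetting `model.eval()` at inference time makes predictions noisy and non-deterministic for models containing Dropout (and gives batch-dependent results for BatchNorm).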
Training on the CPU is very slow, so it is not demonstrated further here.
Running the code on the GPU requires an NVIDIA GPU with CUDA support.
Move the network model, the loss function, and the data (images and labels) to the GPU with .cuda().
The modified Python code is as follows:
import time
import torchvision
from torch.utils.data import DataLoader
from torch import nn
import torch
from torch.utils.tensorboard import SummaryWriter
# Load the datasets
train_data = torchvision.datasets.CIFAR10(root="G:\\Anaconda\\pycharm_pytorch\\learning_project\\dataset_CIFAR10",
                                          train=True,
                                          transform=torchvision.transforms.ToTensor(),
                                          download=False)
test_data = torchvision.datasets.CIFAR10(root="G:\\Anaconda\\pycharm_pytorch\\learning_project\\dataset_CIFAR10",
                                         train=False,
                                         transform=torchvision.transforms.ToTensor(),
                                         download=False)
train_data_size = len(train_data)  # length of the training set
test_data_size = len(test_data)  # length of the test set
print("Length of the training set: {}".format(train_data_size))
print("Length of the test set: {}".format(test_data_size))
# Load the datasets with DataLoader
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
# Build the network
class MYNN(nn.Module):
    def __init__(self):
        super(MYNN, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        x = self.model(x)
        return x
# Instantiate the network
mynn = MYNN()
if torch.cuda.is_available():
    cuda_flag = 1
else:
    cuda_flag = 0
if cuda_flag == 1:
    mynn = mynn.cuda()  # move to the GPU
# Loss function
loss_fcn = nn.CrossEntropyLoss()
if cuda_flag == 1:
    loss_fcn = loss_fcn.cuda()  # move to the GPU
# Optimizer
learning_rate = 0.01
optimizer = torch.optim.SGD(mynn.parameters(), lr=learning_rate)
# Training parameters
## counter of training steps
total_train_step = 0
## counter of test steps
total_test_step = 0
## number of epochs
epoch = 10
# Add TensorBoard
writer = SummaryWriter("logs_mynn_train")
start_time = time.time()
for i in range(epoch):
    print("\r\n-------------Epoch {} starts---------------".format(i+1))
    # Training phase
    #mynn.train()  # put the network into training mode
    for data in train_dataloader:
        imgs, targets = data
        if cuda_flag == 1:
            imgs = imgs.cuda()
            targets = targets.cuda()
        outputs = mynn(imgs)
        loss = loss_fcn(outputs, targets)
        # Optimize the model
        optimizer.zero_grad()  # zero the gradients
        loss.backward()  # backpropagate to compute the gradients
        optimizer.step()  # update the network parameters
        # Log the loss
        total_train_step = total_train_step + 1
        output_steps = 25
        if total_train_step % output_steps == 0:
            print("Training step: {}, Loss: {:.4f}".format(total_train_step, loss.item()))
            writer.add_scalar("train_loss", loss.item(), total_train_step)
    # Test phase
    #mynn.eval()  # put the network into evaluation mode
    total_test_loss = 0
    total_test_accuracy = 0  # total number of correct predictions
    with torch.no_grad():  # no gradient computation
        for data in test_dataloader:
            imgs, targets = data
            if cuda_flag == 1:
                imgs = imgs.cuda()
                targets = targets.cuda()
            outputs = mynn(imgs)
            loss = loss_fcn(outputs, targets)
            total_test_loss = total_test_loss + loss
            accuracy = (outputs.argmax(1) == targets).sum()  # count the correct predictions
            total_test_accuracy = total_test_accuracy + accuracy
    print("Loss on the whole test set: {:.4f}".format(total_test_loss))
    print("Accuracy on the whole test set: {:.4f}".format(total_test_accuracy/test_data_size))
    writer.add_scalar("test_loss", total_test_loss.item(), total_test_step)
    writer.add_scalar("test_accuracy", total_test_accuracy/test_data_size, total_test_step)
    total_test_step = total_test_step + 1
    # Save the network after each epoch
    torch.save(mynn, "G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\mynn_AutoSave\\mynn_{}.pth".format(i))
    print("Model saved.")
end_time = time.time()
print("Time used: {:.2f} s".format(end_time-start_time))
writer.close()
Program output (partial):
Length of the training set: 50000
Length of the test set: 10000
-------------Epoch 1 starts---------------
Training step: 25, Loss: 2.2943
Training step: 50, Loss: 2.2956
Training step: 75, Loss: 2.3037
......
Loss on the whole test set: 314.6040
Accuracy on the whole test set: 0.2840
Model saved.
......
-------------Epoch 10 starts---------------
Training step: 7050, Loss: 1.3038
Training step: 7075, Loss: 1.1709
Training step: 7100, Loss: 1.3245
......
Training step: 7775, Loss: 0.9295
Training step: 7800, Loss: 1.2519
Loss on the whole test set: 199.4948
Accuracy on the whole test set: 0.5497
Model saved.
Time used: 95.01 s
The number of training epochs was set to 10.
In the terminal, run: tensorboard --logdir=logs_mynn_train
Open the web page; it shows the following:
(1) train_loss: how the training loss decreases as the network fits the data.
(2) test_loss: the total loss over all test images in one test pass; it decreases steadily across epochs.
(3) test_accuracy: the test accuracy keeps improving, from roughly 30% to roughly 55%.
Some signs of overfitting may be starting to appear.
The following pattern selects the training device; it makes switching devices easier and also allows choosing among several devices:
device = torch.device("cpu")
# torch.device("cuda")
# torch.device("cuda:0")
# torch.device("cuda:1")
mynn.to(device)  # move the network to the device
loss_fcn.to(device)  # move the loss function to the device
imgs = imgs.to(device)  # move the data (images and labels) to the device
targets = targets.to(device)
import time
import torchvision
from torch.utils.data import DataLoader
from torch import nn
import torch
from torch.utils.tensorboard import SummaryWriter
# Define the training device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # conditional expression
print(device)
# Load the datasets
train_data = torchvision.datasets.CIFAR10(root="G:\\Anaconda\\pycharm_pytorch\\learning_project\\dataset_CIFAR10",
                                          train=True,
                                          transform=torchvision.transforms.ToTensor(),
                                          download=False)
test_data = torchvision.datasets.CIFAR10(root="G:\\Anaconda\\pycharm_pytorch\\learning_project\\dataset_CIFAR10",
                                         train=False,
                                         transform=torchvision.transforms.ToTensor(),
                                         download=False)
train_data_size = len(train_data)  # length of the training set
test_data_size = len(test_data)  # length of the test set
print("Length of the training set: {}".format(train_data_size))
print("Length of the test set: {}".format(test_data_size))
# Load the datasets with DataLoader
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
# Build the network
class MYNN(nn.Module):
    def __init__(self):
        super(MYNN, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        x = self.model(x)
        return x
# Instantiate the network
mynn = MYNN()
mynn.to(device)  # move to the device
# Loss function
loss_fcn = nn.CrossEntropyLoss()
loss_fcn.to(device)  # move to the device
# Optimizer
learning_rate = 0.01
optimizer = torch.optim.SGD(mynn.parameters(), lr=learning_rate)
# Training parameters
## counter of training steps
total_train_step = 0
## counter of test steps
total_test_step = 0
## number of epochs
epoch = 20
# Add TensorBoard
writer = SummaryWriter("logs_mynn_train_2")
start_time = time.time()
for i in range(epoch):
    print("\r\n-------------Epoch {} starts---------------".format(i+1))
    # Training phase
    #mynn.train()  # put the network into training mode
    for data in train_dataloader:
        imgs, targets = data
        imgs = imgs.to(device)
        targets = targets.to(device)
        outputs = mynn(imgs)
        loss = loss_fcn(outputs, targets)
        # Optimize the model
        optimizer.zero_grad()  # zero the gradients
        loss.backward()  # backpropagate to compute the gradients
        optimizer.step()  # update the network parameters
        # Log the loss
        total_train_step = total_train_step + 1
        output_steps = 25
        if total_train_step % output_steps == 0:
            print("Training step: {}, Loss: {:.4f}".format(total_train_step, loss.item()))
            writer.add_scalar("train_loss", loss.item(), total_train_step)
    # Test phase
    #mynn.eval()  # put the network into evaluation mode
    total_test_loss = 0
    total_test_accuracy = 0  # total number of correct predictions
    with torch.no_grad():  # no gradient computation
        for data in test_dataloader:
            imgs, targets = data
            imgs = imgs.to(device)
            targets = targets.to(device)
            outputs = mynn(imgs)
            loss = loss_fcn(outputs, targets)
            total_test_loss = total_test_loss + loss
            accuracy = (outputs.argmax(1) == targets).sum()  # count the correct predictions
            total_test_accuracy = total_test_accuracy + accuracy
    print("Loss on the whole test set: {:.4f}".format(total_test_loss))
    print("Accuracy on the whole test set: {:.4f}".format(total_test_accuracy/test_data_size))
    writer.add_scalar("test_loss", total_test_loss.item(), total_test_step)
    writer.add_scalar("test_accuracy", total_test_accuracy/test_data_size, total_test_step)
    total_test_step = total_test_step + 1
    # Save the network after each epoch
    torch.save(mynn, "G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\mynn_AutoSave_2\\mynn_{}.pth".format(i))
    print("Model saved.")
end_time = time.time()
print("Time used: {:.2f} s".format(end_time-start_time))
writer.close()
Program output (partial):
......
Loss on the whole test set: 174.1695
Accuracy on the whole test set: 0.6208
Model saved.
Time used: 191.84 s
A complete model validation (test/demo) routine uses an already trained model: feed it an input image and inspect the result.
from PIL import Image
import torchvision
import torch
from torch import nn
image_path = "dog.png"
# image_path = "airplane.png"
image = Image.open(image_path)
image = image.convert("RGB")
print(image)
transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize((32, 32)),
    torchvision.transforms.ToTensor()
])
image_resize = transform(image)
print(image_resize.shape)
# Network model (the same class used during training)
class MYNN(nn.Module):
    def __init__(self):
        super(MYNN, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        x = self.model(x)
        return x
# Load the saved network parameters
model = torch.load("G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\mynn_AutoSave_2\\mynn_1.pth",
                   map_location=torch.device("cpu"))
print(model)
image_resize = torch.reshape(image_resize, (1, 3, 32, 32))  # .cuda()
model.eval()
with torch.no_grad():
    output = model(image_resize)
print(output)
print(output.argmax(1))
If the model was trained on the GPU but is to be validated on the CPU, load it with:
model = torch.load("mynn_1.pth", map_location=torch.device("cpu"))
If the model was trained on the GPU and is to be validated on the GPU, move the input tensor to the GPU instead:
image_resize = torch.reshape(image_resize, (1, 3, 32, 32)).cuda()
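`output.argmax(1)` only yields a class index; to get a readable label, map it through CIFAR10's class list. The order below is the one used by `torchvision.datasets.CIFAR10`; the logits here are made up purely for illustration:

```python
import torch

# CIFAR10 class order as used by torchvision.datasets.CIFAR10
classes = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]

# Stand-in for the model output: a (1, 10) tensor of fabricated logits
output = torch.tensor([[0.1, 0.0, 0.2, 0.1, 0.0, 2.5, 0.0, 0.1, 0.0, 0.0]])

idx = output.argmax(1).item()
print(classes[idx])  # -> dog
```

With a real model, the same two lines after the forward pass turn `output.argmax(1)` into a human-readable prediction; the list can also be read from `dataset.classes` on a loaded CIFAR10 dataset.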
Reference link:
https://blog.51cto.com/u_12419595/5937387
Python code:
import time
import torchvision
from torch.utils.data import DataLoader
from torch import nn
import torch
from torch.utils.tensorboard import SummaryWriter
# Define the training device
device = torch.device("cuda")  # use the GPU
print(device)
# Load the datasets
train_data = torchvision.datasets.CIFAR10(root="G:\\Anaconda\\pycharm_pytorch\\learning_project\\dataset_CIFAR10",
                                          train=True,
                                          transform=torchvision.transforms.ToTensor(),
                                          download=False)
test_data = torchvision.datasets.CIFAR10(root="G:\\Anaconda\\pycharm_pytorch\\learning_project\\dataset_CIFAR10",
                                         train=False,
                                         transform=torchvision.transforms.ToTensor(),
                                         download=False)
train_data_size = len(train_data)  # length of the training set
test_data_size = len(test_data)  # length of the test set
print("Length of the training set: {}".format(train_data_size))
print("Length of the test set: {}".format(test_data_size))
# Load the datasets with DataLoader
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
# Network
class MYNN(nn.Module):
    def __init__(self):
        super(MYNN, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
            nn.Conv2d(64, 128, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
            nn.Conv2d(128, 256, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
            nn.Conv2d(256, 512, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
        )
        # Preload the parameter values from the original models.vgg16(pretrained=True).
        if True:
            pretrained_model = torchvision.models.vgg16(pretrained=True)  # load the pretrained VGG16 parameters
            pretrained_params = pretrained_model.state_dict()
            keys = list(pretrained_params.keys())
            new_dict = {}
            for index, key in enumerate(self.state_dict().keys()):
                new_dict[key] = pretrained_params[keys[index]]
            self.load_state_dict(new_dict)
        # The fully connected layers, however, have to be defined to fit the task at hand.
        self.classifier = nn.Sequential(  # define our own classification layers
            # The original vgg16 takes 224 x 224 input images, while our model
            # is fed 32 x 32 images, 7 times smaller along each side, so the
            # flattened feature size here is 512 * 1 * 1 instead of 512 * 7 * 7.
            nn.Linear(in_features=512 * 1 * 1, out_features=256),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5, inplace=False),
            nn.Linear(in_features=256, out_features=256),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5, inplace=False),
            nn.Linear(in_features=256, out_features=10),
        )
    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)  # flatten the feature maps to shape (batch, 512)
        x = self.classifier(x)
        return x
# vgg16_true = torchvision.models.vgg16(pretrained=True)  # pretrained model
# print(vgg16_true)
# vgg16_true.classifier.insert()
# vgg16_true.add_module('7', nn.Linear(1000, 10))
# print(vgg16_true)
mynn = MYNN()
# torch.save(mynn, "G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\vgg16_AutoSave\\vgg16_origin.pth")
mynn.to(device)  # move to the cuda device
# Loss function
loss_fcn = nn.CrossEntropyLoss()
loss_fcn.to(device)  # move to the cuda device
# Optimizer
learning_rate = 1e-2
optimizer = torch.optim.SGD(mynn.parameters(), lr=learning_rate)
# Training parameters
## counter of training steps
total_train_step = 0
## counter of test steps
total_test_step = 0
## number of epochs
epoch = 10
# Add TensorBoard
writer = SummaryWriter("logs_vgg16_train")
start_time = time.time()
for i in range(epoch):
    print("\r\n-------------Epoch {} starts---------------".format(i+1))
    # Training phase
    #mynn.train()  # put the network into training mode
    for data in train_dataloader:
        imgs, targets = data
        imgs = imgs.to(device)
        targets = targets.to(device)
        outputs = mynn(imgs)
        loss = loss_fcn(outputs, targets)
        # Optimize the model
        optimizer.zero_grad()  # zero the gradients
        loss.backward()  # backpropagate to compute the gradients
        optimizer.step()  # update the network parameters
        # Log the loss
        total_train_step = total_train_step + 1
        output_steps = 25
        if total_train_step % output_steps == 0:
            print("Training step: {}, Loss: {:.4f}".format(total_train_step, loss.item()))
            writer.add_scalar("train_loss", loss.item(), total_train_step)
    # Test phase
    #mynn.eval()  # put the network into evaluation mode
    total_test_loss = 0
    total_test_accuracy = 0  # total number of correct predictions
    with torch.no_grad():  # no gradient computation
        for data in test_dataloader:
            imgs, targets = data
            imgs = imgs.to(device)
            targets = targets.to(device)
            outputs = mynn(imgs)
            loss = loss_fcn(outputs, targets)
            total_test_loss = total_test_loss + loss
            accuracy = (outputs.argmax(1) == targets).sum()  # count the correct predictions
            total_test_accuracy = total_test_accuracy + accuracy
    print("Loss on the whole test set: {:.4f}".format(total_test_loss))
    print("Accuracy on the whole test set: {:.4f}".format(total_test_accuracy/test_data_size))
    writer.add_scalar("test_loss", total_test_loss.item(), total_test_step)
    writer.add_scalar("test_accuracy", total_test_accuracy/test_data_size, total_test_step)
    total_test_step = total_test_step + 1
    # Save the network after each epoch
    torch.save(mynn, "G:\\Anaconda\\pycharm_pytorch\\learning_project\\model\\vgg16_AutoSave\\vgg16_{}.pth".format(i))
    print("Model saved.")
end_time = time.time()
print("Time used: {:.2f} s".format(end_time-start_time))
writer.close()
Analysis of part of the code:
if True:
    pretrained_model = torchvision.models.vgg16(pretrained=True)
    pretrained_params = pretrained_model.state_dict()
    keys = list(pretrained_params.keys())
    new_dict = {}
    for index, key in enumerate(self.state_dict().keys()):
        new_dict[key] = pretrained_params[keys[index]]
    self.load_state_dict(new_dict)
This snippet loads the parameters of the pretrained VGG16 model into the custom model.
First, torchvision.models.vgg16(pretrained=True) creates a pretrained VGG16 model object pretrained_model; pretrained=True means the parameters were pretrained on a large-scale image dataset.
Then pretrained_model.state_dict() retrieves the parameter dictionary pretrained_params, which contains the model's weights and biases.
Next, self.state_dict().keys() lists the parameter names of the custom model, which are iterated over with enumerate.
During the iteration, the value at the same position in the pretrained parameter dictionary is copied into the new dictionary new_dict; keys[index] looks up the pretrained parameter by its index.
Finally, self.load_state_dict(new_dict) loads the new parameter dictionary into the custom model, completing the replacement.
Note that this positional matching only works because, at this point in __init__, self.classifier has not yet been defined: self.state_dict() therefore contains only the features parameters, whose shapes line up with the leading entries of the pretrained dictionary.
The goal of this snippet is to transfer the pretrained parameters into the custom model, reusing the features learned on large-scale data to speed up training and improve performance.
Output of the run:
-------------Epoch 1 starts---------------
Training step: 25, Loss: 2.3008
Training step: 50, Loss: 2.2906
Training step: 75, Loss: 2.2432
......
Training step: 725, Loss: 0.8643
Training step: 750, Loss: 0.6412
Training step: 775, Loss: 0.7951
Loss on the whole test set: 166.9573
Accuracy on the whole test set: 0.6587
Model saved.
......
-------------Epoch 10 starts---------------
Training step: 7050, Loss: 0.2929
Training step: 7075, Loss: 0.0241
Training step: 7100, Loss: 0.0476
......
Training step: 7750, Loss: 0.0684
Training step: 7775, Loss: 0.3135
Training step: 7800, Loss: 0.0954
Loss on the whole test set: 97.4364
Accuracy on the whole test set: 0.8584
Model saved.