Week 4 - P4: Monkeypox Recognition

>- **This article is a learning-log post from the [365天深度学习训练营](https://mp.weixin.qq.com/s/Nb93582M_5usednAKp_Jtw)**
>- **Reference: [PyTorch in Practice | Week P4: Monkeypox Recognition](https://www.heywhale.com/mw/project/6347b0065565973b87564268)**
>- **Original author: [K同学啊 | tutoring and custom projects available](https://mtyjkh.blog.csdn.net/)**
>- **Source: [K同学's study circle](https://www.yuque.com/mingtian-fkmxf/zxwb45)**

Requirements:

  1. Save the best-performing model weights during training.
  2. Load the best weights and classify a single local image.
  3. Adjust the network structure so that test-set accuracy reaches 88% (key goal). (Done)

Stretch goals (optional):

  1. Tune the model's parameters and observe how the test-set accuracy changes.
  2. Try a dynamic learning rate.
  3. Reach 90% test-set accuracy.

I. Preliminary Preparation

1. Set up the GPU

# Load the required packages
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms, datasets

import os, PIL, pathlib, random

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

device

2. Load the data

# Step 1: use pathlib.Path() to convert the path string into a pathlib.Path object
# Step 2: use glob() to collect the sub-paths under data_dir
# Step 3: split each path string to extract the class name each folder belongs to

import os, PIL, random, pathlib

data_dir = r"F:\P4_data"
data_dir = pathlib.Path(data_dir)

data_paths = list(data_dir.glob("*"))

classNames = [str(path).split("\\")[2] for path in data_paths]
data_paths
classNames
[WindowsPath('F:/P4_data/Monkeypox'), WindowsPath('F:/P4_data/Others')]
['Monkeypox', 'Others']
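Hard-coding index 2 in the split only works because the data sits exactly two levels below the drive root. A more portable sketch uses pathlib's own attributes and takes the last path component instead:

# More portable alternative: the final path component is the class name
classNames = [path.name for path in data_paths]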

3. Visualize the images (monkeypox photos can be unsettling, so they are not shown here)

import matplotlib.pyplot as plt
from PIL import Image

# Folder containing the images
image_folder = r"F:\P4_data\Monkeypox"

# Keep only files with an image extension
image_files = [f for f in os.listdir(image_folder) if f.endswith((".jpg", ".png", ".jpeg"))]

# Create a 3 x 8 grid of Matplotlib subplots
fig, axes = plt.subplots(3, 8, figsize=(16, 6))

for ax, img_file in zip(axes.flat, image_files):
    img_path = os.path.join(image_folder, img_file)
    img = Image.open(img_path)
    ax.imshow(img)
    ax.axis("off")

4. Transform the data

total_datadir = r"F:\P4_data"

train_transforms = transforms.Compose([
    transforms.Resize([224, 224]),   # resize every image to 224x224
    transforms.ToTensor(),           # convert to a tensor with values in [0, 1]
    transforms.Normalize(            # normalize with the standard ImageNet mean/std
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])
])

total_data = datasets.ImageFolder(total_datadir,transform=train_transforms)
total_data
Dataset ImageFolder
    Number of datapoints: 2142
    Root location: F:\P4_data
    StandardTransform
Transform: Compose(
               Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=warn)
               ToTensor()
               Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
           )

total_data.class_to_idx

{'Monkeypox': 0, 'Others': 1}
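For single-image prediction later on, it helps to invert this mapping so a predicted index can be turned back into a class name; a one-line sketch:

# Invert class_to_idx: predicted index -> class name
idx_to_class = {v: k for k, v in total_data.class_to_idx.items()}
# idx_to_class == {0: 'Monkeypox', 1: 'Others'}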

5. Split into training and test sets

train_size = int(0.8 * len(total_data))
test_size  = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data,[train_size,test_size])
train_dataset, test_dataset
train_size, test_size
(1713, 429)
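Note that random_split draws a fresh permutation on every run, so the train/test split changes between runs. For reproducible experiments you can pass a seeded generator (a sketch; the original code does not fix a seed):

# Sketch: make the split reproducible with a fixed random seed
train_dataset, test_dataset = torch.utils.data.random_split(
    total_data,
    [train_size, test_size],
    generator=torch.Generator().manual_seed(42))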

6. Create the data loaders

batch_size = 32
train_dl = torch.utils.data.DataLoader(train_dataset,
                                       batch_size = batch_size,
                                       shuffle=True,
                                       num_workers=1)
test_dl = torch.utils.data.DataLoader(test_dataset,
                                      batch_size=batch_size,
                                      shuffle=False,
                                      num_workers=1)
for X,y in test_dl:
    print("Shape of X [N,C,H,W]:",X.shape)
    print("Shape of y:",y.shape,y.dtype)
    break
Shape of X [N,C,H,W]: torch.Size([32, 3, 224, 224])
Shape of y: torch.Size([32]) torch.int64
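One caution: with num_workers > 0 on Windows, DataLoader spawns worker processes, so when this code runs as a plain .py script (rather than in a notebook) the iteration has to sit under a main guard:

# Only needed when running as a script on Windows with num_workers > 0
if __name__ == "__main__":
    for X, y in test_dl:
        print("Shape of X [N,C,H,W]:", X.shape)
        break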

II. Build a Simple CNN

The network stacks three convolution-convolution-pool blocks, with batch normalization and ReLU after every convolution. With 5x5 kernels and no padding, the 224x224 input shrinks to 220 -> 216 -> (pool) 108 -> 104 -> 100 -> (pool) 50 -> 46 -> 42 -> (pool) 21, which is why the fully connected layer expects 48*21*21 input features.

import torch.nn.functional as F

class Network_bn(nn.Module):
    def __init__(self):
        super(Network_bn, self).__init__()

        # Block 1: 3 -> 12 channels
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=12, kernel_size=5, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(12)
        self.conv2 = nn.Conv2d(in_channels=12, out_channels=12, kernel_size=5, stride=1, padding=0)
        self.bn2 = nn.BatchNorm2d(12)
        self.pool = nn.MaxPool2d(2, 2)

        # Block 2: 12 -> 24 channels
        self.conv4 = nn.Conv2d(in_channels=12, out_channels=24, kernel_size=5, stride=1, padding=0)
        self.bn4 = nn.BatchNorm2d(24)
        self.conv5 = nn.Conv2d(in_channels=24, out_channels=24, kernel_size=5, stride=1, padding=0)
        self.bn5 = nn.BatchNorm2d(24)

        # Block 3: 24 -> 48 channels
        self.conv7 = nn.Conv2d(in_channels=24, out_channels=48, kernel_size=5, stride=1, padding=0)
        self.bn7 = nn.BatchNorm2d(48)
        self.conv8 = nn.Conv2d(in_channels=48, out_channels=48, kernel_size=5, stride=1, padding=0)
        self.bn8 = nn.BatchNorm2d(48)

        # After three pools the 224x224 input is 21x21, hence 48*21*21 features
        self.fc1 = nn.Linear(48*21*21, len(classNames))

    def forward(self, x):
        x = F.relu(self.bn1(self.conv1(x)))
        x = F.relu(self.bn2(self.conv2(x)))
        x = self.pool(x)
        x = F.relu(self.bn4(self.conv4(x)))
        x = F.relu(self.bn5(self.conv5(x)))
        x = self.pool(x)
        x = F.relu(self.bn7(self.conv7(x)))
        x = F.relu(self.bn8(self.conv8(x)))
        x = self.pool(x)
        x = x.view(-1, 48*21*21)   # flatten
        x = self.fc1(x)

        return x
    
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))

model = Network_bn().to(device)
model
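A quick sanity check (a sketch, not part of the original code) confirms the 48*21*21 flatten size by pushing a dummy batch through the network:

# Forward one dummy 224x224 RGB image; expect a [1, 2] logit tensor
model.eval()                   # use running BN statistics, avoid updating them
with torch.no_grad():
    dummy = torch.randn(1, 3, 224, 224).to(device)
    print(model(dummy).shape)  # torch.Size([1, 2])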

III. Train the Model

1. Set hyperparameters

loss_fn    = nn.CrossEntropyLoss()
learn_rate = 0.001
opt        = torch.optim.SGD(model.parameters(), lr=learn_rate)
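For the optional dynamic learning-rate task, a scheduler from torch.optim.lr_scheduler plugs straight into this setup. A minimal sketch with StepLR (the step_size and gamma values here are untuned assumptions):

from torch.optim.lr_scheduler import StepLR

# Halve the learning rate every 10 epochs
scheduler = StepLR(opt, step_size=10, gamma=0.5)
# Inside the epoch loop, call scheduler.step() once per epoch, after the train() call.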

2. Write the training function

# Training loop
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)

    train_loss, train_acc = 0, 0  # initialize training loss and accuracy

    for X, y in dataloader:
        X, y = X.to(device), y.to(device)

        # Compute the prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()  # reset the gradients
        loss.backward()
        optimizer.step()

        # Accumulate accuracy and loss
        train_acc  += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()

    train_acc  /= size
    train_loss /= num_batches

    return train_acc, train_loss

3. Write the test function

# Test function

def test(dataloader, model, loss_fn):
    size                = len(dataloader.dataset)
    num_batches         = len(dataloader)
    test_loss, test_acc = 0, 0

    # Gradients are not needed during evaluation; disabling them saves memory and compute:
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)

            # Compute the loss
            target_pred  = model(imgs)
            loss         = loss_fn(target_pred, target)

            test_loss   += loss.item()
            test_acc    += (target_pred.argmax(1) == target).type(torch.float).sum().item()

    test_acc  /= size
    test_loss /= num_batches

    return test_acc, test_loss

4. Run the training

epochs = 50
train_loss = []
train_acc = []
test_loss = []
test_acc = []

for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, opt)

    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)

    template = 'Epoch:{}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}'
    print(template.format(epoch + 1, epoch_train_acc * 100, epoch_train_loss, epoch_test_acc * 100, epoch_test_loss))
print("Done")
Epoch:1, Train_acc:61.1%, Train_loss:0.891, Test_acc:69.0%, Test_loss:0.569
Epoch:2, Train_acc:70.0%, Train_loss:0.616, Test_acc:56.6%, Test_loss:0.704
Epoch:3, Train_acc:73.4%, Train_loss:0.566, Test_acc:72.3%, Test_loss:0.617
Epoch:4, Train_acc:81.1%, Train_loss:0.421, Test_acc:79.7%, Test_loss:0.428
Epoch:5, Train_acc:86.2%, Train_loss:0.342, Test_acc:79.5%, Test_loss:0.432
Epoch:6, Train_acc:88.4%, Train_loss:0.300, Test_acc:76.9%, Test_loss:0.523
Epoch:7, Train_acc:88.4%, Train_loss:0.288, Test_acc:73.2%, Test_loss:0.620
Epoch:8, Train_acc:90.4%, Train_loss:0.254, Test_acc:85.3%, Test_loss:0.352
Epoch:9, Train_acc:92.1%, Train_loss:0.226, Test_acc:85.1%, Test_loss:0.339
Epoch:10, Train_acc:92.1%, Train_loss:0.211, Test_acc:87.6%, Test_loss:0.313
Epoch:11, Train_acc:94.5%, Train_loss:0.178, Test_acc:86.7%, Test_loss:0.352
Epoch:12, Train_acc:94.3%, Train_loss:0.166, Test_acc:84.6%, Test_loss:0.324
Epoch:13, Train_acc:94.6%, Train_loss:0.156, Test_acc:87.4%, Test_loss:0.307
Epoch:14, Train_acc:96.0%, Train_loss:0.143, Test_acc:86.7%, Test_loss:0.279
Epoch:15, Train_acc:96.2%, Train_loss:0.134, Test_acc:83.4%, Test_loss:0.347
Epoch:16, Train_acc:96.7%, Train_loss:0.129, Test_acc:88.3%, Test_loss:0.305
Epoch:17, Train_acc:97.3%, Train_loss:0.116, Test_acc:86.7%, Test_loss:0.284
Epoch:18, Train_acc:97.5%, Train_loss:0.108, Test_acc:87.4%, Test_loss:0.298
Epoch:19, Train_acc:97.5%, Train_loss:0.099, Test_acc:86.2%, Test_loss:0.314
Epoch:20, Train_acc:97.2%, Train_loss:0.099, Test_acc:88.8%, Test_loss:0.274
Epoch:21, Train_acc:97.7%, Train_loss:0.094, Test_acc:87.9%, Test_loss:0.273
Epoch:22, Train_acc:98.6%, Train_loss:0.080, Test_acc:87.4%, Test_loss:0.278
Epoch:23, Train_acc:98.0%, Train_loss:0.091, Test_acc:88.1%, Test_loss:0.282
Epoch:24, Train_acc:98.7%, Train_loss:0.069, Test_acc:87.6%, Test_loss:0.309
Epoch:25, Train_acc:98.7%, Train_loss:0.073, Test_acc:88.3%, Test_loss:0.315
Epoch:26, Train_acc:98.8%, Train_loss:0.068, Test_acc:88.8%, Test_loss:0.271
Epoch:27, Train_acc:98.9%, Train_loss:0.060, Test_acc:89.3%, Test_loss:0.301
Epoch:28, Train_acc:98.8%, Train_loss:0.069, Test_acc:86.9%, Test_loss:0.288
Epoch:29, Train_acc:99.2%, Train_loss:0.054, Test_acc:88.6%, Test_loss:0.295
Epoch:30, Train_acc:99.3%, Train_loss:0.051, Test_acc:87.6%, Test_loss:0.372
Epoch:31, Train_acc:99.4%, Train_loss:0.052, Test_acc:88.1%, Test_loss:0.293
Epoch:32, Train_acc:99.5%, Train_loss:0.046, Test_acc:89.5%, Test_loss:0.282
Epoch:33, Train_acc:99.6%, Train_loss:0.043, Test_acc:89.0%, Test_loss:0.265
Epoch:34, Train_acc:99.2%, Train_loss:0.047, Test_acc:89.5%, Test_loss:0.274
Epoch:35, Train_acc:99.6%, Train_loss:0.039, Test_acc:88.3%, Test_loss:0.289
Epoch:36, Train_acc:99.6%, Train_loss:0.040, Test_acc:89.3%, Test_loss:0.304
Epoch:37, Train_acc:99.6%, Train_loss:0.035, Test_acc:89.7%, Test_loss:0.284
Epoch:38, Train_acc:99.4%, Train_loss:0.041, Test_acc:88.8%, Test_loss:0.277
Epoch:39, Train_acc:99.6%, Train_loss:0.037, Test_acc:88.8%, Test_loss:0.262
Epoch:40, Train_acc:99.7%, Train_loss:0.037, Test_acc:89.0%, Test_loss:0.284
Epoch:41, Train_acc:99.7%, Train_loss:0.033, Test_acc:88.8%, Test_loss:0.283
Epoch:42, Train_acc:99.5%, Train_loss:0.036, Test_acc:87.6%, Test_loss:0.296
Epoch:43, Train_acc:99.9%, Train_loss:0.027, Test_acc:89.7%, Test_loss:0.266
Epoch:44, Train_acc:99.8%, Train_loss:0.028, Test_acc:89.3%, Test_loss:0.285
Epoch:45, Train_acc:99.7%, Train_loss:0.029, Test_acc:88.3%, Test_loss:0.280
Epoch:46, Train_acc:99.8%, Train_loss:0.027, Test_acc:89.7%, Test_loss:0.271
Epoch:47, Train_acc:99.9%, Train_loss:0.024, Test_acc:89.0%, Test_loss:0.295
Epoch:48, Train_acc:100.0%, Train_loss:0.024, Test_acc:89.7%, Test_loss:0.279
Epoch:49, Train_acc:99.7%, Train_loss:0.026, Test_acc:88.8%, Test_loss:0.294
Epoch:50, Train_acc:99.9%, Train_loss:0.023, Test_acc:89.5%, Test_loss:0.323
Done

5. Save the best model parameters

The validation set's role is to monitor whether training is overfitting; typically the validation loss first falls and then begins to rise. (In this notebook the test split doubles as the validation set.)

Saving the model from the iteration with the lowest validation loss should give the best generalization.
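The loop in section 4 does not implement this yet. A minimal sketch (the checkpoint file name best_model.pth is an assumption) that tracks the best test accuracy and saves the weights whenever it improves:

best_acc = 0.0

for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, opt)

    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    # Checkpoint whenever the test accuracy improves
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        torch.save(model.state_dict(), "best_model.pth")

And, for requirement 2, a sketch that reloads the best weights and classifies a single local image (the image path below is a placeholder):

from PIL import Image

# Reload the best checkpoint
model.load_state_dict(torch.load("best_model.pth", map_location=device))
model.eval()

# Open a local image, apply the same transforms as in training, add a batch dimension
img = Image.open(r"F:\P4_data\Monkeypox\example.jpg").convert("RGB")
x = train_transforms(img).unsqueeze(0).to(device)

with torch.no_grad():
    pred = model(x).argmax(1).item()
print("Predicted class:", classNames[pred])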
