Transfer Learning: Cat vs. Dog Classification (PyTorch: Transferring VGG16)


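    • 3.2 Transferring VGG16
      • 3.2.1 Downloading the model automatically in code and calling it directly
      • 3.2.2 Adjusting the fully connected layers of the transferred model
      • 3.2.3 Model training and results
      • 3.2.4 A worked example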

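For the earlier post, which introduces transfer learning and the custom-model approach, see: Transfer Learning: Cat vs. Dog Classification (PyTorch: Custom VGGNet).
This post draws on part 7 of Tang Jinmin's 《深度学习之PyTorch实战计算机视觉》 and on the code here.

Another transfer-learning approach: Transfer Learning: Cat vs. Dog Classification (PyTorch: Transferring ResNet50)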

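3.2 Transferring VGG16

3.2.1 Downloading the model automatically in code and calling it directly

First we need to download a model that already has well-trained parameters. This replaces the model = Models() line used previously: instead of building and defining the model ourselves, we download it automatically in code and call it directly. For completeness, the dataset preprocessing from the earlier post is repeated here: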

'''Dataset preparation'''
import os
import time

import matplotlib.pyplot as plt
import torch
import torchvision
from torch.autograd import Variable  # deprecated since PyTorch 0.4: tensors no longer need wrapping
from torchvision import datasets, models, transforms  # models provides the pretrained vgg16

model_path = 'transferVGG16/model_name.pth'
model_params_path = 'transferVGG16/params_name.pth'


data_dir = "C:/Users/xinyu/Desktop/data/DogsVSCats/"

data_transform = {
    x: transforms.Compose(
        [
            transforms.Resize([224, 224]),  # Resize (formerly Scale) rescales every image to 224×224
            transforms.ToTensor(),
            transforms.Normalize(
                mean=[0.5, 0.5, 0.5],
                std=[0.5, 0.5, 0.5]
            )
        ]
    )
    for x in ["train", "valid"]
}


image_datasets = {
    x: datasets.ImageFolder(
        root=os.path.join(data_dir, x),  # join the two names into one complete path
        transform=data_transform[x]
    )
    for x in ["train", "valid"]
}


dataloader = {
    # Note: the 0/1 labels are generated automatically from the sorted subdirectory names
    # e.g. {'cat': 0, 'dog': 1}  # {'狗dog': 0, '猫cat': 1}
    # e.g. ['cat', 'dog']  # ['狗dog', '猫cat']
    x: torch.utils.data.DataLoader(
        dataset=image_datasets[x],
        batch_size=16,
        shuffle=True
    )
    for x in ["train", "valid"]
}


X_example, y_example = next(iter(dataloader["train"]))
example_classes = image_datasets["train"].classes     # ['cat', 'dog']  # ['狗dog', '猫cat']
index_classes = image_datasets["train"].class_to_idx  # {'cat': 0, 'dog': 1}  # {'狗dog': 0, '猫cat': 1}


Use_gpu = torch.cuda.is_available()
# Downloads the ImageNet weights on first use; newer torchvision versions prefer the weights= argument
model = models.vgg16(pretrained=True)
print(model)

The printed model is:

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
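
The in_features=25088 of the first classifier layer can be read off the printout: the five MaxPool2d layers each halve the height and width, and the last convolution outputs 512 channels. Below is a minimal sketch of that arithmetic, assuming the 224×224 input size used in the transform; the helper name is made up for illustration.

```python
def flattened_features(input_size=224, num_pools=5, channels=512):
    """Size of the flattened feature map that feeds the first Linear layer."""
    side = input_size
    for _ in range(num_pools):
        side //= 2  # MaxPool2d(kernel_size=2, stride=2) halves height and width
    return channels * side * side


print(flattened_features())  # 224 -> 112 -> 56 -> 28 -> 14 -> 7, then 512 * 7 * 7
```

Because the printed model also contains AdaptiveAvgPool2d(output_size=(7, 7)), the classifier still receives a 7×7 map when the input size differs from 224.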

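3.2.2 Adjusting the fully connected layers of the transferred model

Transfer learning works best when the source and target problems are closely related, but each problem still places its own requirements on the final output. In a convolutional neural network the classification output is produced by the fully connected layers, so these are the layers adjusted most often during transfer learning.

The basic idea is to freeze every layer that precedes the fully connected layers, so that their parameters receive no gradient updates during training; only the parameters of the unfrozen fully connected layers are optimized.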
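
As a minimal sketch of this idea on a toy network (the two-part TinyNet below is a hypothetical stand-in, not VGG16), the snippet freezes all existing parameters, attaches a fresh head, and lists which parameters would still be updated:

```python
import torch


# Hypothetical stand-in for a pretrained network: a "features" part and a "classifier" part.
class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.features = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU())
        self.classifier = torch.nn.Linear(4, 10)

    def forward(self, x):
        return self.classifier(self.features(x))


net = TinyNet()

# Freeze every existing parameter.
for param in net.parameters():
    param.requires_grad = False

# A newly created layer has requires_grad=True by default, so the new head is trainable.
net.classifier = torch.nn.Linear(4, 2)

trainable = [name for name, p in net.named_parameters() if p.requires_grad]
frozen = [name for name, p in net.named_parameters() if not p.requires_grad]
print("trainable:", trainable)
print("frozen:", frozen)

# One backward pass: only the new head accumulates gradients.
loss = net(torch.randn(3, 8)).sum()
loss.backward()
print("features grad:", net.features[0].weight.grad)
print("classifier grad shape:", net.classifier.weight.grad.shape)
```

Checking requires_grad by name like this is a cheap way to confirm that the freeze did what was intended before starting a long training run.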

First, the transferred VGG16 model produces 1000 outputs, whereas our problem needs only two, so the fully connected layers must be adjusted:

for param in model.parameters():
    param.requires_grad = False  # freeze the original parameters: no gradient updates

'''Define new fully connected layers and assign them to model.classifier to redesign the classifier.
Parameters of the newly created layers have requires_grad = True by default.'''
model.classifier = torch.nn.Sequential(torch.nn.Linear(25088, 4096),
                                       torch.nn.ReLU(),
                                       torch.nn.Dropout(p=0.5),
                                       torch.nn.Linear(4096, 4096),
                                       torch.nn.ReLU(),
                                       torch.nn.Dropout(p=0.5),
                                       torch.nn.Linear(4096, 2))
print(model)

if Use_gpu:
    model = model.cuda()

loss_fn = torch.nn.CrossEntropyLoss()  # cross-entropy loss
# Only the fully connected layers are optimized, i.e. model.classifier.parameters()
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-5)

The new model is:

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU()
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=2, bias=True)
  )
)

As the printout shows, the main difference is the final fully connected part of the model.
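
To get a feel for how much of the network is actually trained, the layer shapes in the printout can be turned into parameter counts. This is a back-of-the-envelope sketch in plain Python; the helper names are illustrative, and the layer sizes are read off the printed model.

```python
# (in_channels, out_channels) of the 13 Conv2d layers in model.features, all with 3x3 kernels
CONV_SHAPES = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# (in_features, out_features) of the three Linear layers in the new classifier
LINEAR_SHAPES = [(25088, 4096), (4096, 4096), (4096, 2)]


def conv_params(c_in, c_out, kernel=3):
    return c_in * c_out * kernel * kernel + c_out  # weights + biases


def linear_params(n_in, n_out):
    return n_in * n_out + n_out  # weights + biases


frozen_count = sum(conv_params(i, o) for i, o in CONV_SHAPES)
trainable_count = sum(linear_params(i, o) for i, o in LINEAR_SHAPES)
print("frozen (features):", frozen_count)
print("trainable (classifier):", trainable_count)
```

By this count the convolutional part holds about 14.7 million frozen parameters and the new classifier about 119.6 million trainable ones.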

3.2.3 Model training and results

epoch_n = 2

time_open = time.time()
for epoch in range(epoch_n):
    print("Epoch {}/{}".format(epoch, epoch_n - 1))
    print("-" * 10)

    for phase in ["train", "valid"]:
        if phase == "train":
            print("Training...")
            model.train(True)   # training mode: dropout active
        else:
            print("Validing...")
            model.train(False)  # evaluation mode: dropout disabled

        running_loss = 0.0
        running_corrects = 0
        for batch, data in enumerate(dataloader[phase], 1):
            X, y = data
            if Use_gpu:
                X, y = X.cuda(), y.cuda()

            y_pred = model(X)
            _, pred = torch.max(y_pred, 1)  # index of the largest logit = predicted class

            optimizer.zero_grad()

            loss = loss_fn(y_pred, y)  # loss_fn is the CrossEntropyLoss defined in 3.2.2

            if phase == "train":
                loss.backward()
                optimizer.step()

            running_loss += loss.item()
            running_corrects += torch.sum(pred == y).item()

            if batch % 500 == 0 and phase == "train":
                print("Batch {}, Train Loss:{:.4f},Train ACC:{:.4f}%".format(
                        batch, running_loss / batch, 100.0 * running_corrects / (16 * batch)
                        )
                )

        epoch_loss = running_loss * 16 / len(image_datasets[phase])
        epoch_acc = 100.0 * running_corrects / len(image_datasets[phase])

        print("{} Loss:{:.4f} Acc:{:.4f}%".format(phase, epoch_loss, epoch_acc))

time_end = time.time() - time_open  # elapsed time in seconds
print("Total run time: {} minutes...".format(int(time_end / 60)))

The results are as follows:

Epoch 0/1
----------
Training...
Batch 500, Train Loss:0.1611,Train ACC:94.2250%
Batch 1000, Train Loss:0.1295,Train ACC:95.0750%
train Loss:0.1238 Acc:95.3100%
Validing...
valid Loss:0.1017 Acc:96.2600%
Epoch 1/1
----------
Training...
Batch 500, Train Loss:0.0585,Train ACC:97.7750%
Batch 1000, Train Loss:0.0563,Train ACC:97.8562%
train Loss:0.0571 Acc:97.8150%
Validing...
valid Loss:0.0895 Acc:96.6000%
Total run time: 149 minutes...
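
The loop above computes the epoch loss as running_loss*16/len(image_datasets[phase]), which treats every batch as if it held exactly 16 images. When the dataset size is not a multiple of the batch size, the last batch is smaller and this estimate is slightly off. A possible alternative, sketched here in plain Python with made-up per-batch numbers, weights each batch's mean loss by its actual size (CrossEntropyLoss returns the batch mean by default):

```python
def epoch_metrics(batches):
    """batches: list of (mean_loss, num_correct, batch_size) tuples, one per batch."""
    total_loss = 0.0
    total_correct = 0
    total_seen = 0
    for mean_loss, num_correct, batch_size in batches:
        total_loss += mean_loss * batch_size  # undo the per-batch averaging
        total_correct += num_correct
        total_seen += batch_size
    return total_loss / total_seen, 100.0 * total_correct / total_seen


# Hypothetical values: two full batches of 16 and a final partial batch of 8.
example = [(0.20, 15, 16), (0.10, 16, 16), (0.40, 6, 8)]
loss, acc = epoch_metrics(example)
print("loss: {:.4f}  acc: {:.2f}%".format(loss, acc))
```

With these numbers the size-weighted loss is about 0.20, whereas the fixed-16 formula would give 0.28. In the training loop the same effect could be obtained by accumulating loss.item() * X.size(0) and dividing by the dataset length.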

3.2.4 A worked example

X_example, Y_example = next(iter(dataloader['train']))
# X_example: torch.Size([16, 3, 224, 224]); Y_example: torch.Size([16])

if Use_gpu:
    X_example, Y_example = X_example.cuda(), Y_example.cuda()

y_pred = model(X_example)

index_classes = image_datasets['train'].class_to_idx  # class name -> integer label, e.g. {'cat': 0, 'dog': 1}
example_classes = image_datasets['train'].classes     # class names in label order, e.g. ['cat', 'dog']

img = torchvision.utils.make_grid(X_example)
img = img.cpu().numpy().transpose([1, 2, 0])  # CHW -> HWC for matplotlib
print("Actual:", [example_classes[i] for i in Y_example])
_, y_pred = torch.max(y_pred, 1)
print("Predicted:", [example_classes[i] for i in y_pred])

plt.imshow(img)
plt.show()

The result is:

Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Actual: ['cat', 'dog', 'cat', 'dog', 'cat', 'cat', 'cat', 'dog', 'cat', 'dog', 'dog', 'dog', 'dog', 'dog', 'cat', 'dog']
Predicted: ['cat', 'cat', 'dog', 'cat', 'cat', 'cat', 'dog', 'cat', 'cat', 'dog', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat']

[Figure 1: grid of the 16 example images]
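
The clipping warning appears because Normalize with mean 0.5 and std 0.5 maps pixel values from [0, 1] to [-1, 1], while imshow expects floats in [0, 1]. One way to avoid it, sketched below, is to invert the normalization before plotting. The helper name is hypothetical; it is shown on plain floats, and the same arithmetic applies element-wise to the NumPy array img.

```python
def denormalize(value, mean=0.5, std=0.5):
    """Invert Normalize, which computes normalized = (original - mean) / std."""
    return value * std + mean


for normalized in (-1.0, 0.0, 1.0):
    print(normalized, "->", denormalize(normalized))
```

In the example above this would amount to calling img = denormalize(img) just before plt.imshow(img).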
