A quick note on the VGG network: PyTorch + VGGNet + CIFAR10

Contents

Dataset:

Net

train

Summary


Dataset:

The dataset is CIFAR10: 50,000 training images and 10,000 test images, all 32 x 32 x 3 color PNGs, spread over 10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck). It is a standard benchmark for comparing network performance: a network that clearly outperforms another on CIFAR10 is generally the stronger model. At the time of writing, the best results on this dataset are around 95% test accuracy.

import numpy as np
import torch
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader

batch_size = 64  # not defined in the original snippet; any reasonable value works


## define the data transform
def data_tf(x):
    x = np.array(x, dtype='float32') / 255
    x = (x - 0.5) / 0.5  # normalize from [0, 1] to [-1, 1]
    x = x.transpose((2, 0, 1))  # move channels to the first axis, as PyTorch expects
    x = torch.from_numpy(x)
    return x

print("Downloading...")
## download the dataset
train_set = CIFAR10('./data', train=True, transform=data_tf, download=True)
test_set = CIFAR10('./data', train=False, transform=data_tf, download=True)

print("Download finished")
train_data = DataLoader(train_set, batch_size=batch_size, shuffle=True)
test_data = DataLoader(test_set, batch_size=batch_size, shuffle=False)
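Before moving on, it helps to sanity-check one batch from the loader; a minimal sketch, assuming the code above has already run:

## peek at one batch to confirm the layout produced by data_tf
images, labels = next(iter(train_data))
print(images.shape)  # expected: torch.Size([batch_size, 3, 32, 32])
print(labels.shape)  # expected: torch.Size([batch_size])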

But downloading this way can be very slow, or even hang completely, so you can grab the archive with another downloader instead:

https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz

Once downloaded, extract the archive into the ./data directory; torchvision will then detect the existing files instead of re-downloading.
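If torchvision still tries to download, check the folder layout; assuming the standard archive, the extracted folder is named cifar-10-batches-py:

import os

## torchvision expects the standard folder name under ./data
print(os.listdir('./data/cifar-10-batches-py'))
## should contain: batches.meta, data_batch_1 through data_batch_5, test_batch, readme.html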

One catch: CIFAR10 images are 32 x 32, while the VGG model expects 224 x 224 inputs, so the size mismatch raises an error during training. The dataset images therefore need to be resized to match the model.

The following code does the trick:

from torchvision import transforms

tran = transforms.Resize((224, 224))  # resize to the 224 x 224 input VGG expects
image_data = tran(image_data)
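If you'd rather not rely on torchvision transforms at all (see the server error later in this post), torch.nn.functional.interpolate resizes tensors directly; a minimal sketch, assuming image_data is a batched (N, 3, 32, 32) float tensor:

import torch.nn.functional as F

## interpolate works on (N, C, H, W) tensors, so no PIL image is needed
image_data = F.interpolate(image_data, size=(224, 224), mode='bilinear', align_corners=False)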

Net

I gave a brief overview of the architecture in another post; take a look if you're interested:

简单记录一下,几个经典的网络结构_子根的博客-CSDN博客

The code:

import torch
import torch.nn as nn
from torchvision import transforms
class VGG(nn.Module):
    def __init__(self):
        super(VGG, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),      #224
            nn.ReLU(True),

            nn.Conv2d(64, 64, 3, 1, 1),     #224
            nn.ReLU(True),
            nn.MaxPool2d(2,2),              ##112

            nn.Conv2d(64, 128, 3, 1, 1),    #112
            nn.ReLU(True),

            nn.Conv2d(128, 128, 3, 1, 1),   #112
            nn.ReLU(True),
            nn.MaxPool2d(2, 2),             ##56

            nn.Conv2d(128, 256, 3, 1, 1),  # 56
            nn.ReLU(True),

            nn.Conv2d(256, 256, 3, 1, 1),  # 56
            nn.ReLU(True),

            nn.Conv2d(256, 256, 3, 1, 1),  # 56
            nn.ReLU(True),
            nn.MaxPool2d(2, 2),  ##28

            nn.Conv2d(256, 512, 3, 1, 1),  # 28
            nn.ReLU(True),

            nn.Conv2d(512, 512, 3, 1, 1),  # 28
            nn.ReLU(True),

            nn.Conv2d(512, 512, 3, 1, 1),  # 28
            nn.ReLU(True),
            nn.MaxPool2d(2,2),              # 14

            nn.Conv2d(512, 512, 3, 1, 1),  # 14
            nn.ReLU(True),

            nn.Conv2d(512, 512, 3, 1, 1),  # 14
            nn.ReLU(True),

            nn.Conv2d(512, 512, 3, 1, 1),  # 14
            nn.ReLU(True),
            nn.MaxPool2d(2, 2),            # 7
        )
        self.classifier = nn.Sequential(
            nn.Linear(512*7*7, 4096),
            nn.ReLU(True),
            nn.Dropout(),

            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(),

            nn.Linear(4096, 10)
        )

    def forward(self, x):
        tran = transforms.Resize((224, 224))  # resize input to the 224 x 224 VGG expects
        x = tran(x)
        x = self.features(x)
        x = torch.reshape(x, (x.shape[0], -1))
        x = self.classifier(x)

        return x
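A quick shape check of the model (a minimal sketch; the Resize call inside forward() only accepts tensors on reasonably recent torchvision versions, see the error discussed below):

model = VGG()
dummy = torch.randn(1, 3, 32, 32)  # one fake CIFAR10-sized image
out = model(dummy)
print(out.shape)  # expected: torch.Size([1, 10])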

train

import torch
import tqdm
import time
import numpy as np
import torch.nn as nn

from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader
from torch import optim

import net  # net.py contains the VGG class defined above

## define the initial variables
batch_size = 2
epoches = 1
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

## define the data transform
def data_tf(x):
    x = np.array(x, dtype='float32') / 255
    x = (x - 0.5) / 0.5  # normalize from [0, 1] to [-1, 1]
    x = x.transpose((2, 0, 1))  # move channels to the first axis, as PyTorch expects
    x = torch.from_numpy(x)
    return x

## load the dataset (already on disk, so download=False)
train_set = CIFAR10('./data', train=True, transform=data_tf, download=False)
test_set = CIFAR10('./data', train=False,  transform=data_tf, download=False)

train_data = DataLoader(train_set, batch_size=batch_size, shuffle=True)
test_data = DataLoader(test_set, batch_size=batch_size, shuffle=False)

model = net.VGG().to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-1, momentum=0.8)

# 开始训练
losses_men = []
acces_men = []

start = time.time()
for epoche in range(epoches):
    train_loss = 0
    train_acc = 0
    time1 = time.time()
    print()
    print(f'Starting epoch {epoche + 1}:')
    for image_data, image_label in tqdm.tqdm(train_data):
        image_data = image_data.to(device)  # Variable is deprecated; plain tensors track gradients
        image_label = image_label.to(device)

        ## forward pass
        out = model(image_data)
        loss = criterion(out, image_label)

        ## backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        ## accumulate the loss
        train_loss += loss.item()

        ## track accuracy
        _, pred_label = out.max(1)
        num_correct = (pred_label == image_label).sum().item()  # correct predictions in this batch
        acc = num_correct / out.shape[0]
        train_acc += acc

    losses_men.append(train_loss / len(train_data) )
    acces_men.append(train_acc / len(train_data))
    time2 = time.time()

    print('Epoch_time : ', time2 - time1)
    print('train_loss : ', losses_men)
    print('train_acc : ', acces_men)

print('All time : ', time.time() - start)
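The script builds test_data but never touches it; a minimal evaluation sketch in the same style, assuming the model trained above:

## evaluate on the test set; no gradients needed
model.eval()
test_acc = 0
with torch.no_grad():
    for image_data, image_label in test_data:
        image_data = image_data.to(device)
        image_label = image_label.to(device)
        out = model(image_data)
        _, pred_label = out.max(1)
        test_acc += (pred_label == image_label).sum().item() / out.shape[0]

print('test_acc : ', test_acc / len(test_data))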

#############################################################################

##                          The code above has a problem

##                          Not a fatal one: it runs fine on my machine

##                          But run it on a server and it breaks

##                          So whether it works comes down to luck

#############################################################################

The error is:

TypeError: img should be PIL Image. Got <class 'torch.Tensor'>

This happens because the server's older torchvision only accepts PIL images in transforms.Resize, while data_tf hands it a torch tensor (newer versions accept tensors too, which is why it ran locally).

Here is how to change the code for the server:

Change the data-transform function to:

import cv2
import numpy as np
import torch

def data_tf(x):
    x = np.array(x, dtype='float32') / 255
    x = (x - 0.5) / 0.5  # normalize from [0, 1] to [-1, 1]
    x = cv2.resize(x, (224, 224))  # resize with OpenCV instead of torchvision
    x = x.transpose((2, 0, 1))  # move channels to the first axis, as PyTorch expects
    x = torch.from_numpy(x)
    return x

Then, in the Net file, delete the two lines tran = transforms.Resize((224, 224)) and x = tran(x) from forward().

In other words, do the image resizing inside data_tf with OpenCV and skip torchvision transforms altogether; I never figured transforms out, and debugging that error went nowhere.
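After that change, forward() is just the network itself (this matches the Net code above, with the two resize lines removed):

def forward(self, x):
    ## the 224 x 224 resize now happens in data_tf, so forward only runs the layers
    x = self.features(x)
    x = torch.reshape(x, (x.shape[0], -1))
    x = self.classifier(x)
    return x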

Summary

After 20 epochs of training, the VGG network above produced dismal numbers. Granted, it's not going to overfit, but it should at least converge.

After 20 epochs the accuracy couldn't even reach 0.1, which is just absurd.

In the end, adding BatchNorm1d to the network finally made it converge; a sketch of one placement follows.
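One way to add it (a sketch only, the placement may differ from what I actually ran): BatchNorm1d after each hidden Linear layer in the classifier, using the layer names from the Net code above:

self.classifier = nn.Sequential(
    nn.Linear(512 * 7 * 7, 4096),
    nn.BatchNorm1d(4096),  # assumed placement; normalizes the 4096 activations
    nn.ReLU(True),
    nn.Dropout(),

    nn.Linear(4096, 4096),
    nn.BatchNorm1d(4096),  # assumed placement
    nn.ReLU(True),
    nn.Dropout(),

    nn.Linear(4096, 10)
)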

Without BatchNorm1d:

train_loss :  [2.303093245267258, 2.303139756707584, 2.303160000640108,
2.3031104309174717, 2.303226224296843, 2.303206982210164,
2.303128406214897, 2.3031576993825187, 2.3029416113558328,
2.3030955550615744, 2.30320418673708, 2.303175717058694,
2.3031793052278213, 2.3031632662429224, 2.3030178775567838,
2.303055908064098, 2.303198173222944, 2.303126417462478]

train_acc :  [0.10016384271099744, 0.10088315217391304, 0.09906489769820973,
0.09826566496163683, 0.09802589514066497, 0.09950447570332481,
0.09802589514066497, 0.09906489769820973, 0.10094309462915602,
0.10068334398976982, 0.09936460997442455, 0.09844549232736573,
0.09744645140664962, 0.09458919437340153, 0.10124280690537084,
0.10086317135549872, 0.09736652813299233, 0.09780610613810742]

With BatchNorm1d:

train_loss :  [5.834198918641376, 2.6215473267123524, 2.1093473187492937,
1.7783519804020367, 1.5179658337017459, 1.3456324152934276,
1.2027138713223244, 1.1024912583553577, 1.0052912438769475,
0.9160828074378431, 0.8292376793863828, 0.7431533735274048,
0.659939240845268, 0.582230176081133, 0.5101518849735065,
0.45334468538041617, 0.3936309843417019, 0.3504536931433946,
0.3098974650549462, 0.2667675401939227]

train_acc :  [0.18532209079283887, 0.29411764705882354, 0.3796954923273657,
0.4420756074168798, 0.501678388746803, 0.5523897058823529,
0.5899136828644501, 0.6228021099744245, 0.6544916879795396,
0.68130594629156, 0.7115768861892583, 0.7400695332480819,
0.7686421035805626, 0.7947170716112532, 0.8206122122762148,
0.8402533567774936, 0.8641903772378516, 0.8779571611253197,
0.8934622762148338, 0.907608695652174]

Finally, my wife closes out the post.

If you liked it, give my wife a like ☺
