Convolutional Neural Networks (CNN) -- A PyTorch Implementation

This post reflects my own understanding after a few days of study; if anything is inaccurate, corrections are welcome.

(1) A Brief Look at the Theory

  In my view, a convolutional neural network processes an image roughly as follows:
① Read the image into the program to obtain the value of every color channel at every pixel. This post uses the MNIST dataset, whose images are 28 × 28 with a single color channel. The images are then converted into tensors of a uniform size.
② Convolve the image with a set of kernels. This requires choosing the padding that keeps the image size unchanged after convolution, the number of filters, the stride of the sliding window, and the numbers of input and output channels (the padding rule is illustrated in the sketch after this list).
③ Apply a non-linear activation function (usually ReLU) for an initial feature extraction.
④ Use a pooling layer to extract the dominant features; pooling usually takes the maximum or the average over each window.
⑤ Use a fully connected layer to combine the features from all parts of the image.
⑥ Compute the loss; to prevent overfitting, a regularization penalty can be added to the loss function. The parameters are then adjusted through backpropagation, which is essentially a process of taking partial derivatives: it measures how each parameter affects the final result, so if a partial derivative is positive and the output is too large, that parameter should be decreased.
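
To make the padding rule in step ② and the regularization penalty in step ⑥ concrete, here is a minimal sketch (the conv_output_size helper and the weight_decay value are illustrative choices of mine, not part of the code in section (3)):

def conv_output_size(w, k, p, s=1):
    # standard convolution size formula: output = (W - k + 2p) / s + 1
    return (w - k + 2 * p) // s + 1

print(conv_output_size(28, k=5, p=2))   # 28: a 5x5 kernel with padding 2 keeps 28x28 (k = 2p + 1)
print(conv_output_size(28, k=3, p=1))   # 28: same for a 3x3 kernel with padding 1

# In PyTorch, an L2 penalty is usually added through the optimizer's weight_decay
# argument rather than by modifying the loss by hand, e.g.:
# optimizer = torch.optim.Adam(cnn.parameters(), lr=0.001, weight_decay=1e-4)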

(2) The MNIST Dataset

  Downloading the dataset directly from the code may fail or be very slow because of network issues. You can download it manually first and then follow the approach described here: https://blog.csdn.net/AugustMe/article/details/90638342
Or use the copy I downloaded and place it in the same folder as the code.
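
To check whether a manually placed copy is actually picked up, a small sketch like the following can help (with download=False, torchvision uses the local files and raises a RuntimeError if it cannot find them):

import torchvision.datasets as dsets
import torchvision.transforms as transforms

try:
    dsets.MNIST(root='./mnist/', train=True, transform=transforms.ToTensor(), download=False)
    print('Local MNIST copy found.')
except RuntimeError as err:
    print('Local MNIST copy not found, it will have to be downloaded:', err)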

(3) Code

import os    # used to check whether the MNIST folder exists locally
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torch.autograd import Variable   # deprecated in modern PyTorch; kept to match the original code
from PIL import Image
import matplotlib.pyplot as plt

# Hyper parameters
num_epochs = 5
batch_size = 100
learning_rate = 0.001

DOWNLOAD_MNIST = False
if not(os.path.exists('./mnist/')) or not os.listdir('./mnist/'):
    # download if the mnist directory does not exist or is empty
    DOWNLOAD_MNIST = True
    
#MNIST Dataset
train_dataset =  dsets.MNIST(root='./mnist/', train=True, transform=transforms.ToTensor(), download=DOWNLOAD_MNIST)
test_dataset = dsets.MNIST(root='./mnist/', train=False, transform=transforms.ToTensor())

#Data Loader(Input Pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=1, shuffle=False)

#CNN Model (2 conv layers)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2),   # with stride=1, k = 2p + 1 keeps the spatial size unchanged
            nn.BatchNorm2d(16),             # normalize the 16 feature maps (guards against vanishing/exploding gradients); the argument is the conv layer's number of output channels
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2))
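        # Each 2x2 max-pool halves the spatial size: 28x28 -> 14x14 -> 7x7,
        # and layer2 outputs 32 channels, so the flattened feature vector
        # fed to the fully connected layer has 7*7*32 elements.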
        self.fc = nn.Linear(7*7*32, 10)
        
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out
cnn = CNN()
cnn.cuda()   # the script assumes a CUDA-capable GPU; move the model onto it

#Loss and Optimizer
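# CrossEntropyLoss combines LogSoftmax and NLLLoss, so the network outputs raw logits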
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cnn.parameters(), lr=learning_rate)

# Train the Model
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = Variable(images).cuda()
        labels = Variable(labels).cuda()
        
        #Forward + Backward + Optimize
        optimizer.zero_grad()    # clear the gradients from the previous step
        outputs = cnn(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        if (i+1) % 100 == 0:
            print('Epoch [%d/%d], Iter[%d/%d] Loss: %.4f' %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.item()))
        
# Test the model on data from the test set
cnn.eval()  #Change model to 'eval' mode (BN uses moving mean/var).
correct = 0
total = 0
for images, labels in test_loader:
    images = Variable(images).cuda()
    outputs = cnn(images)
    _, predicted = torch.max(outputs.data, 1)  # max along dim 1: returns the largest value in each row and its index (the predicted class)
    total += labels.size(0)         # labels.size(0) is the test batch size (1 here, since test_loader uses batch_size=1)
    correct += (predicted.cpu() == labels).sum()  # count the correct predictions in this batch and accumulate them
       
print('Test accuracy of the model on the 10000 test images: %d %%' %(100 * correct.item() / total))

# Test the model on a handwritten image of my own
img_path = "66.png"
##PIL
img = Image.open(img_path).convert('L') # load the image and convert it to grayscale
transform1 = transforms.Compose([
    transforms.Resize(28),   # Scale() in older torchvision versions was renamed to Resize()
    transforms.CenterCrop((28, 28)),
    transforms.ToTensor(), # range [0, 255] -> [0.0,1.0]
    ]
)
img2 = transform1(img) # pixel values are scaled to [0.0, 1.0]
img2 = img2.unsqueeze(0) # add a batch dimension; the resulting shape is [1, C, H, W]
image = Variable(img2).cuda()

# print(image.shape)
# mode = transforms.ToPILImage()(img2)
# plt.imshow(mode)
# plt.show()

# Run the trained model on the image
outputs = cnn(image)
print(outputs.data)
_, predicted = torch.max(outputs.data, 1) 
print("Predicted %d" %predicted)
      
# Save the Trained Model
torch.save(cnn.state_dict(), 'cnn.pkl')
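
After saving, the weights can be loaded back for inference later. A minimal sketch (assuming the same CNN class definition and imports are available in the loading script):

cnn2 = CNN()
cnn2.load_state_dict(torch.load('cnn.pkl'))   # load the saved parameters into a fresh model
cnn2.cuda()
cnn2.eval()   # put BatchNorm into evaluation mode before predicting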

Testing my own image

[Figure: the handwritten test image (66.png)]

Output

Epoch [1/5], Iter[100/600] Loss: 0.1901
Epoch [1/5], Iter[200/600] Loss: 0.1366
Epoch [1/5], Iter[300/600] Loss: 0.0650
Epoch [1/5], Iter[400/600] Loss: 0.0861
Epoch [1/5], Iter[500/600] Loss: 0.1619
Epoch [1/5], Iter[600/600] Loss: 0.1493
Epoch [2/5], Iter[100/600] Loss: 0.0475
Epoch [2/5], Iter[200/600] Loss: 0.1708
Epoch [2/5], Iter[300/600] Loss: 0.0233
Epoch [2/5], Iter[400/600] Loss: 0.0138
Epoch [2/5], Iter[500/600] Loss: 0.1100
Epoch [2/5], Iter[600/600] Loss: 0.0075
Epoch [3/5], Iter[100/600] Loss: 0.0124
Epoch [3/5], Iter[200/600] Loss: 0.0068
Epoch [3/5], Iter[300/600] Loss: 0.0298
Epoch [3/5], Iter[400/600] Loss: 0.0111
Epoch [3/5], Iter[500/600] Loss: 0.0751
Epoch [3/5], Iter[600/600] Loss: 0.0155
Epoch [4/5], Iter[100/600] Loss: 0.0971
Epoch [4/5], Iter[200/600] Loss: 0.0173
Epoch [4/5], Iter[300/600] Loss: 0.0099
Epoch [4/5], Iter[400/600] Loss: 0.0049
Epoch [4/5], Iter[500/600] Loss: 0.0199
Epoch [4/5], Iter[600/600] Loss: 0.0042
Epoch [5/5], Iter[100/600] Loss: 0.0482
Epoch [5/5], Iter[200/600] Loss: 0.0423
Epoch [5/5], Iter[300/600] Loss: 0.0208
Epoch [5/5], Iter[400/600] Loss: 0.0360
Epoch [5/5], Iter[500/600] Loss: 0.0153
Epoch [5/5], Iter[600/600] Loss: 0.0326
Test accuracy of the model on the 10000 test images: 98 %
tensor([[-0.1304, -0.4410, -0.8985, -0.6545, -1.9931, 0.2359, 1.9494, -2.1930,
-1.5755, -1.3205]], device='cuda:0')
Predicted 6

