This post only reflects my own understanding after a few days of study. If anything is inaccurate, corrections are welcome.
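As I understand it, a convolutional neural network processes an image as follows:
① Read the image into the program to get the value of every color channel of every pixel. This post uses the MNIST dataset, whose images are 28 × 28 with a single color channel. The image data is then converted into tensors of a uniform size.
② Convolve the image with a set of convolution kernels (filters). The settings to choose are the padding (here chosen so that the output keeps the input size), the number of filters, the stride (the step by which the sliding window moves), and the numbers of input and output channels.
③ Apply a nonlinear activation function (usually ReLU) for an initial round of feature extraction.
④ Use a pooling layer to keep the dominant features. Pooling usually takes the maximum or the average.
⑤ Use a fully connected layer to combine the features from all parts of the image.
⑥ Compute the loss. To prevent overfitting, a regularization penalty can be added to the loss function. The parameters are then adjusted through backpropagation, which amounts to computing partial derivatives, i.e. how much each parameter influences the final result. If a parameter's partial derivative is positive and the result is too large, that parameter should be decreased.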
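Steps ② to ④ can be sketched on a tiny example without any framework. The following is only an illustration in plain Python, not the code used later in this post: the function names `conv2d_same`, `relu` and `max_pool2`, the 4 × 4 image and the edge-detecting kernel are all made up for the example. Like PyTorch's `nn.Conv2d`, the "convolution" here does not flip the kernel.

```python
def conv2d_same(image, kernel):
    """Slide a square, odd-sized kernel over a 2-D list with stride 1.

    Zero padding p = (k - 1) // 2 is applied implicitly, so the output
    has the same height and width as the input.
    """
    k = len(kernel)
    p = (k - 1) // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - p, c + j - p
                    # Positions outside the image count as zero (padding).
                    if 0 <= rr < h and 0 <= cc < w:
                        total += image[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


def relu(fmap):
    """Replace negative responses by zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2(fmap):
    """2 x 2 max pooling with stride 2 (halves height and width)."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]


# Hypothetical 4 x 4 image: dark left half, bright right half.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
# Hypothetical 3 x 3 kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

conv = conv2d_same(image, kernel)   # 4 x 4, same size as the input
activated = relu(conv)              # negative responses removed
pooled = max_pool2(activated)       # 2 x 2 summary of the strongest responses
print(pooled)                       # [[3.0, 3.0], [3.0, 3.0]]
```

The kernel fires where the image goes from dark to bright, ReLU drops the negative response at the right border, and pooling keeps only the strongest response in each 2 × 2 block.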
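Downloading the dataset directly from code may fail, for example because of a slow connection. In that case, download it manually first and then load it as described here: https://blog.csdn.net/AugustMe/article/details/90638342
Alternatively, use the data I have already downloaded and put it in the same folder as the code.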
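If you download the files by hand, it can be worth checking that they are intact before training. The original MNIST files use the IDX format, which starts with big-endian 32-bit integers: a magic number (2051 for image files, 2049 for label files), the item count and, for images, the row and column counts. The helper below is a minimal sketch of such a check. The name `parse_idx_header` is my own, and the example runs on a synthetic header so that it needs no download.

```python
import struct

IDX_IMAGE_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions
IDX_LABEL_MAGIC = 2049  # 0x00000801: unsigned bytes, 1 dimension


def parse_idx_header(data):
    """Return (kind, shape) for the first bytes of an uncompressed IDX file."""
    magic, count = struct.unpack(">II", data[:8])
    if magic == IDX_IMAGE_MAGIC:
        rows, cols = struct.unpack(">II", data[8:16])
        return "images", (count, rows, cols)
    if magic == IDX_LABEL_MAGIC:
        return "labels", (count,)
    raise ValueError("not an MNIST IDX file: magic=%d" % magic)


# Synthetic header that mimics the MNIST training image file.
fake_header = struct.pack(">IIII", IDX_IMAGE_MAGIC, 60000, 28, 28)
kind, shape = parse_idx_header(fake_header)
print(kind, shape)  # images (60000, 28, 28)
```

For a real file, read the first 16 bytes (through `gzip.open(path, 'rb')` if the file is still compressed) and pass them to the helper. Where the files sit under `./mnist/` depends on the torchvision version, so the path is left out here.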
import os  # used to check whether the dataset folder exists
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torch.autograd import Variable  # a no-op wrapper since PyTorch 0.4
from PIL import Image
import matplotlib.pyplot as plt
# Hyperparameters
num_epochs = 5
batch_size = 100
learning_rate = 0.001
DOWNLOAD_MNIST = False
if not os.path.exists('./mnist/') or not os.listdir('./mnist/'):
    # the mnist directory is missing or empty
    DOWNLOAD_MNIST = True
# MNIST dataset
train_dataset = dsets.MNIST(root='./mnist/', train=True, transform=transforms.ToTensor(), download=DOWNLOAD_MNIST)
test_dataset = dsets.MNIST(root='./mnist/', train=False, transform=transforms.ToTensor())
# Data loader (input pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=1, shuffle=False)
# CNN model (2 conv layers)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            # with stride=1, k = 2p + 1 keeps the output the same size as the input
            nn.Conv2d(1, 16, kernel_size=5, padding=2),
            # normalizes the 16 output channels (helps against vanishing or
            # exploding gradients); the argument is the conv layer's output channel count
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.fc = nn.Linear(7*7*32, 10)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)  # flatten to [batch, 7*7*32]
        out = self.fc(out)
        return out

cnn = CNN()
cnn.cuda()  # requires a CUDA-capable GPU
# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cnn.parameters(), lr=learning_rate)
# Train the model
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = Variable(images).cuda()
        labels = Variable(labels).cuda()
        # Forward + backward + optimize
        optimizer.zero_grad()  # clear the gradients from the previous step
        outputs = cnn(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        if (i+1) % 100 == 0:
            print('Epoch [%d/%d], Iter[%d/%d] Loss: %.4f' % (epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.item()))
# Test the model on the MNIST test set
cnn.eval()  # switch to eval mode (BN uses moving mean/var)
correct = 0
total = 0
with torch.no_grad():  # no gradients are needed for evaluation
    for images, labels in test_loader:
        images = Variable(images).cuda()
        outputs = cnn(images)
        # torch.max along dim 1 returns (max value, index) for each row;
        # the index is the predicted class
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)  # 1 here, because test_loader uses batch_size=1
        correct += (predicted.cpu() == labels).sum().item()  # correct predictions in this batch
print('Test accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total))
# Test the model with a picture of my own handwriting
img_path = "66.png"
# PIL
img = Image.open(img_path).convert('L')  # read the image as single-channel grayscale
transform1 = transforms.Compose([
    transforms.Resize(28),  # transforms.Scale was renamed to Resize
    transforms.CenterCrop((28, 28)),
    transforms.ToTensor(),  # range [0, 255] -> [0.0, 1.0]
])
img2 = transform1(img)  # values scaled to [0.0, 1.0]
img2 = img2.unsqueeze(0)  # add a batch dimension; the shape becomes [1, C, H, W]
image = Variable(img2).cuda()
# print(image.shape)
# mode = transforms.ToPILImage()(img2)
# plt.imshow(mode)
# plt.show()
# Run the picture through the model
outputs = cnn(image)
print(outputs.data)
_, predicted = torch.max(outputs.data, 1)
print("Predicted %d" % predicted.item())
# Save the Trained Model
torch.save(cnn.state_dict(), 'cnn.pkl')
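Two details of the model above are easy to check by hand: the comment that `k = 2p + 1` keeps the image size unchanged, and the `7*7*32` input size of the fully connected layer. Both follow from the usual output-size formula `floor((n + 2p - k) / s) + 1`. The sketch below traces the spatial size through the two layers in plain Python. `conv_out` is a helper name I made up, and the layer settings are copied from the script.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Output length along one spatial dimension of a conv or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28          # MNIST images are 28 x 28
trace = [size]
for _ in range(2):                               # layer1, then layer2
    size = conv_out(size, kernel=5, padding=2)   # Conv2d(kernel_size=5, padding=2), stride 1
    trace.append(size)
    size = conv_out(size, kernel=2, stride=2)    # MaxPool2d(2): stride defaults to the kernel size
    trace.append(size)

flat_features = size * size * 32                 # 32 output channels after layer2
print(trace)          # [28, 28, 14, 14, 7]
print(flat_features)  # 1568
```

Each convolution leaves the size unchanged because 5 = 2·2 + 1, each pooling step halves it, and the final 7 × 7 × 32 feature map gives the 1568 inputs of `nn.Linear(7*7*32, 10)`.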
Running the script printed the following:
Epoch [1/5], Iter[100/600] Loss: 0.1901
Epoch [1/5], Iter[200/600] Loss: 0.1366
Epoch [1/5], Iter[300/600] Loss: 0.0650
Epoch [1/5], Iter[400/600] Loss: 0.0861
Epoch [1/5], Iter[500/600] Loss: 0.1619
Epoch [1/5], Iter[600/600] Loss: 0.1493
Epoch [2/5], Iter[100/600] Loss: 0.0475
Epoch [2/5], Iter[200/600] Loss: 0.1708
Epoch [2/5], Iter[300/600] Loss: 0.0233
Epoch [2/5], Iter[400/600] Loss: 0.0138
Epoch [2/5], Iter[500/600] Loss: 0.1100
Epoch [2/5], Iter[600/600] Loss: 0.0075
Epoch [3/5], Iter[100/600] Loss: 0.0124
Epoch [3/5], Iter[200/600] Loss: 0.0068
Epoch [3/5], Iter[300/600] Loss: 0.0298
Epoch [3/5], Iter[400/600] Loss: 0.0111
Epoch [3/5], Iter[500/600] Loss: 0.0751
Epoch [3/5], Iter[600/600] Loss: 0.0155
Epoch [4/5], Iter[100/600] Loss: 0.0971
Epoch [4/5], Iter[200/600] Loss: 0.0173
Epoch [4/5], Iter[300/600] Loss: 0.0099
Epoch [4/5], Iter[400/600] Loss: 0.0049
Epoch [4/5], Iter[500/600] Loss: 0.0199
Epoch [4/5], Iter[600/600] Loss: 0.0042
Epoch [5/5], Iter[100/600] Loss: 0.0482
Epoch [5/5], Iter[200/600] Loss: 0.0423
Epoch [5/5], Iter[300/600] Loss: 0.0208
Epoch [5/5], Iter[400/600] Loss: 0.0360
Epoch [5/5], Iter[500/600] Loss: 0.0153
Epoch [5/5], Iter[600/600] Loss: 0.0326
Test accuracy of the model on the 10000 test images: 98 %
tensor([[-0.1304, -0.4410, -0.8985, -0.6545, -1.9931,  0.2359,  1.9494, -2.1930,
         -1.5755, -1.3205]], device='cuda:0')
Predicted 6
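The ten numbers printed for the handwritten picture are raw scores (logits), not probabilities, because `nn.CrossEntropyLoss` applies the softmax internally during training. As a small illustration, the sketch below converts the printed logits into probabilities with a softmax and picks the largest one. It uses plain Python with the values copied from the output above, and `softmax` is a helper written just for this example.

```python
import math

# Logits copied from the printed model output for the handwritten picture.
logits = [-0.1304, -0.4410, -0.8985, -0.6545, -1.9931,
          0.2359, 1.9494, -2.1930, -1.5755, -1.3205]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(values)                              # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
print(predicted)  # 6
for digit, p in enumerate(probs):
    print(digit, round(p, 4))
```

Since softmax preserves the ordering of the scores, taking the index of the largest logit, as `torch.max(outputs.data, 1)` does in the script, gives the same prediction.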