[!] Adapted from my report for an elective course on Python and Raspberry Pi; this is original content.
The task calls for a model with high accuracy and a short inference time.
Image classification models from AlexNet at ILSVRC 2012 up to ViT, the Transformer variant published in 2021, can all serve as references for our design. After a preliminary screening, the candidates were AlexNet, NIN, VGG, GoogLeNet, ResNet, DenseNet, and ViT.
Since short inference time is required, NIN and ViT can be excluded; and the AlexNet, VGG, and GoogLeNet families clearly fail the requirement of a small parameter count. That leaves ResNet and DenseNet.
The residual mechanism in ResNet mainly addresses the degradation problem of very deep networks, so its benefit is quite limited when the network has few layers; DenseNet's feature reuse, on the other hand, makes it effective across a wider range of settings.
In summary, the model we build should be designed with reference to DenseNet.
We can obtain the classifier's best function approximation $\mathbf{y}=f(\mathbf{x};\theta)$ by minimizing the KL divergence between the data distribution $p_{data}$ and the model distribution $p_{model}(\mathbf{y}|\mathbf{x};\theta)$, where $\theta$ denotes the model parameters and $\mathbf{x}$, $\mathbf{y}$ are a sample point in the sample space and its supervision label, respectively.
In general, most modern neural networks are trained by maximum likelihood estimation. Under maximum likelihood, minimizing the KL divergence between the data distribution and the model distribution is equivalent to minimizing the loss $$J(\theta) = -\mathbb{E}_{\mathbf{x},\mathbf{y}\sim\hat{p}_{data}}\log p_{model}(\mathbf{y}|\mathbf{x};\theta)$$ i.e., the negative log-likelihood under the model distribution, where $\mathbb{E}$ is the expectation operator and $\hat{p}_{data}$ is the empirical distribution formed by the training samples.
Since the task is binary classification, the model distribution reduces to a Bernoulli distribution $P(y=1|\mathbf{x})$. However, the induced local field of a neural network's output unit ranges over all of $\mathbb{R}$, so an activation function is needed to map it to a valid probability $p\in[0,1]$. The most common output-unit activation is the logistic sigmoid $\sigma(x)=\frac{1}{1+\exp(-x)}$.
The classifier's output unit is therefore $$\hat{y} = \sigma(\mathbf{w}^T\mathbf{h}+b)$$ where $b$ is the unit's bias and $\mathbf{h}$ is the feature produced by the feature-extraction module. Substituting the Bernoulli likelihood gives $$J(\theta) = -[y\log\hat{y}+(1-y)\log(1-\hat{y})]$$ i.e., the Binary Cross Entropy loss (BCELoss).
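As a quick numerical sanity check (a standalone sketch, not part of the original derivation), PyTorch's binary_cross_entropy_with_logits matches this negative log-likelihood exactly:

import torch
import torch.nn.functional as F

z = torch.tensor([1.2,-0.7,0.3])   # hypothetical logits w^T h + b
y = torch.tensor([1.0,0.0,1.0])    # binary labels

p = torch.sigmoid(z)               # \hat{y}
nll = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()
bce = F.binary_cross_entropy_with_logits(z,y)
print(torch.allclose(nll,bce))     # True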
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,SubsetRandomSampler
import torch.nn as nn
import torch.optim as optim
import torchvision
import matplotlib.pyplot as plt
import cv2
import random as ra
import os
import numpy as np
from tqdm import tqdm
PyTorch fetches each batch of data through a data loader, and the DataLoader needs a dataset instance as an argument, so we first build the dataset class PreprocessDataset.
PreprocessDataset reads and caches the dataset, and returns individual samples by index as the data loader requires.
It needs three methods: the constructor __init__, the length overload __len__, and the indexing overload __getitem__.
class PreprocessDataset(Dataset):
    def __init__(self,path,imgSize = 224):
        self.path = path
        self.posPath = os.path.join(path,'ok_front')
        self.negPath = os.path.join(path,'def_front')
        self.imgSize = imgSize
        self.datas = list() # Pos: 1 Neg: 0
        for root,_,files in os.walk(self.posPath):
            for file in files:
                self.datas.append((os.path.join(root,file),1.0))
        for root,_,files in os.walk(self.negPath):
            for file in files:
                self.datas.append((os.path.join(root,file),0.0))
        print("[INFO] Successfully loaded the dataset with %d samples!" % len(self.datas))

    def __len__(self):
        return len(self.datas)

    def __getitem__(self,index):
        imgPath,label = self.datas[index]
        img = cv2.imread(imgPath,0)                   # read as a grayscale image
        resizeSize = int(self.imgSize * 1.05)         # resize 5% larger to leave room for the crop
        img = cv2.resize(img,(resizeSize,resizeSize))
        img = self._randomCrop(img)
        img = torch.tensor(img / 255.0).float()       # scale pixel values to [0,1]
        img = self._normalization(img)
        img = img.unsqueeze(0)                        # add the channel dimension: (1,H,W)
        return img,label
We use OpenCV to read the images and preprocess them. To improve the model's generalization, we add two private methods to the dataset class, one for random cropping and one for normalization (with mean = std = 0.5, pixel values in [0,1] are mapped to [-1,1]).
    # ...continued from above
    def _randomCrop(self,img):
        height,width = img.shape[:2]
        cropHeight = ra.randint(0,height - self.imgSize - 1)
        cropWidth = ra.randint(0,width - self.imgSize - 1)
        img = img[cropHeight:cropHeight + self.imgSize,cropWidth:cropWidth + self.imgSize]
        return img

    def _normalization(self,img,std = 0.5,mean = 0.5):
        img = (img - mean) / std
        return img
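As a quick usage check (a sketch assuming the dataset path used later in this article), one sample can be inspected directly:

dataset = PreprocessDataset('./casting_data/train')
img,label = dataset[0]
print(img.shape,img.min().item(),img.max().item(),label)
# expected: torch.Size([1, 224, 224]), values roughly in [-1, 1], label 1.0 or 0.0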
The dataset class is now complete, and the data loaders can be built with the utilities PyTorch provides. To make our evaluation more convincing, we split the training set into training and validation subsets at a 7:3 ratio.
batch = 32
epochs = 150
imgSize = 224
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
path = './casting_data/'
trainDataPath = os.path.join(path,'train')
testDataPath = os.path.join(path,'test')
trainDataset = PreprocessDataset(trainDataPath,imgSize)
testDataset = PreprocessDataset(testDataPath,imgSize)
# Get validation Dataset
length = len(trainDataset)
indices = list(range(length))
ra.shuffle(indices)
trainSampler = SubsetRandomSampler(indices[:int(0.7 * length)])
valSampler = SubsetRandomSampler(indices[int(0.7 * length):])
trainData = DataLoader(trainDataset,batch_size = batch,sampler = trainSampler)
valData = DataLoader(trainDataset,batch_size = batch,sampler = valSampler)
testData = DataLoader(testDataset,batch_size = batch)
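Before moving on, it is worth sanity-checking one batch from the loader (a quick check, not part of the original script):

imgs,labels = next(iter(trainData))
print(imgs.shape)                  # torch.Size([32, 1, 224, 224])
print(labels.shape,labels.dtype)   # torch.Size([32]) torch.float64 (python-float labels collate as double, hence the .float() casts below)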
Following the modeling approach above, we design the model in PyTorch.
class Block(nn.Sequential):
    def __init__(self,inChannals,outChannals):
        """The non-linear composite function used inside a DenseBlock"""
        super(Block,self).__init__(
            nn.BatchNorm2d(inChannals),
            nn.ReLU(inplace = True),
            nn.Conv2d(inChannals,outChannals,kernel_size = 1,stride = 1,bias = False),
            nn.BatchNorm2d(outChannals),
            nn.ReLU(inplace = True),
            nn.Conv2d(outChannals,outChannals,kernel_size = 3,padding = 1,stride = 1,bias = False,groups = outChannals)
        )
class DenseBlock(nn.Module):
    def __init__(self,inChannals,blockNum,k = 24):
        """Densely connected module; k is the number of output channels of each Block"""
        super(DenseBlock,self).__init__()
        self.blocks = nn.ModuleList([Block(inChannals + k * i,k) for i in range(blockNum)])

    def forward(self,input):
        outputs = [input]
        outputs.append(self.blocks[0](outputs[-1]))
        for i in range(1,len(self.blocks)):
            temp = torch.cat(outputs,dim = 1)   # each Block sees all preceding feature maps
            outputs.append(self.blocks[i](temp))
        output = torch.cat(outputs,dim = 1)
        return output
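Since each Block contributes k new channels that are concatenated with its inputs, a DenseBlock maps C input channels to C + k * blockNum output channels. A quick standalone check:

block = DenseBlock(64,blockNum = 4,k = 24)
x = torch.randn(2,64,56,56)
print(block(x).shape)  # torch.Size([2, 160, 56, 56]), since 64 + 4 * 24 = 160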
class Transition(nn.Sequential):
    def __init__(self,inChannals,outChannals):
        """Downsampling module"""
        super(Transition,self).__init__(
            nn.BatchNorm2d(inChannals),
            nn.ReLU(inplace = True),
            nn.Conv2d(inChannals,outChannals,kernel_size = 1,bias = False),
            nn.AvgPool2d(2)
        )
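A Transition compresses the channel count (by compressionRate in the model below) and halves the spatial resolution via average pooling, e.g.:

trans = Transition(160,80)
print(trans(torch.randn(2,160,55,55)).shape)  # torch.Size([2, 80, 27, 27])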
class Model(nn.Module):
    def __init__(self,channalNum = 64,compressionRate = 0.5,k = 24):
        super(Model,self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1,channalNum,kernel_size = 7,stride = 2,bias = False),
            nn.BatchNorm2d(channalNum),
            nn.ReLU(inplace = True),
            nn.MaxPool2d(kernel_size = 3,stride = 2,padding = 1)
        )
        self.blockConfig = [4]
        self.blocks = list()
        for blockNum in self.blockConfig:
            self.blocks.append(DenseBlock(channalNum,blockNum,k))
            channalNum = channalNum + blockNum * k
            self.blocks.append(Transition(channalNum,int(channalNum * compressionRate)))
            channalNum = int(channalNum * compressionRate)
        self.blocks = nn.ModuleList(self.blocks)
        self.classifier = nn.Sequential(
            nn.BatchNorm2d(channalNum),
            nn.AdaptiveAvgPool2d((1,1)),
            nn.Flatten(),
            nn.Dropout(),
            nn.Linear(channalNum,1)
        )

    def forward(self,input):
        x = self.features(input)
        for block in self.blocks:
            x = block(x)
        x = self.classifier(x)
        return x
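A dummy forward pass verifies that the architecture wires up correctly and gives the parameter count (a quick check sketch):

model = Model()
print(model(torch.randn(1,1,224,224)).shape)       # torch.Size([1, 1]): one logit per image
print(sum(p.numel() for p in model.parameters()))  # total number of parameters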
We can now build the network by simply instantiating the class, together with the loss function and the optimizer. For the optimizer we use Adam.
net = Model().to(device)
# This loss function includes a built-in Sigmoid, so the network's output layer does not need one
lossF = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(net.parameters(),lr=1e-5)
First, we build an accuracy function to monitor training.
def accuracy(outputs,labels):
    # The network outputs logits, so thresholding at 0 is equivalent to sigmoid(output) > 0.5
    predictions = torch.where(outputs > 0,torch.ones_like(outputs),torch.zeros_like(outputs))
    acc = torch.sum(predictions == labels) / labels.shape[0]
    return acc * 100
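For instance, logits of 2.0 and -1.0 with labels 1 and 0 are both classified correctly:

print(accuracy(torch.tensor([[2.0],[-1.0]]),torch.tensor([[1.0],[0.0]])))  # tensor(100.)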
Next, we build the training and validation functions.
def train(epoch):
    net.train(True)
    totalAcc,totalLoss = 0.0,0.0
    processBar = tqdm(trainData,ncols = 100)
    for step,(imgs,labels) in enumerate(processBar,1):
        imgs = imgs.to(device)
        labels = labels.to(device).float().view([-1,1])  # cast to float32 to match the logits
        net.zero_grad()
        outputs = net(imgs)
        loss = lossF(outputs,labels)
        loss.backward()
        acc = accuracy(outputs,labels)
        optimizer.step()
        totalAcc += acc.item()
        totalLoss += loss.item()
        processBar.set_description("[%d/%d] Loss-M: %.4f Acc-M: %.2f" % (epoch,epochs,totalLoss/step,totalAcc/step))
    processBar.close()
    return totalLoss/step,totalAcc/step
def validation(epoch):
    net.train(False)
    totalAcc,totalLoss = 0.0,0.0
    with torch.no_grad():  # no gradients needed during evaluation
        for step,(imgs,labels) in enumerate(valData,1):
            imgs = imgs.to(device)
            labels = labels.to(device).float().view([-1,1])
            outputs = net(imgs)
            loss = lossF(outputs,labels)
            acc = accuracy(outputs,labels)
            totalAcc += acc.item()
            totalLoss += loss.item()
    print("[%d/%d] Val Loss: %.4f Val Acc: %.2f" % (epoch,epochs,totalLoss/step,totalAcc/step))
    return totalLoss/step,totalAcc/step
In the same way, we build the test function.
def test():
    net.train(False)
    totalAcc,totalLoss = 0.0,0.0
    with torch.no_grad():
        for step,(imgs,labels) in enumerate(testData,1):
            imgs = imgs.to(device)
            labels = labels.to(device).float().view([-1,1])
            outputs = net(imgs)
            loss = lossF(outputs,labels)
            acc = accuracy(outputs,labels)
            totalAcc += acc.item()
            totalLoss += loss.item()
    print("Test Loss: %.4f Test Acc: %.2f" % (totalLoss/step,totalAcc/step))
    return totalLoss/step,totalAcc/step
Finally, we assemble the overall training loop.
history = {
    'trainLoss': list(),
    'trainAcc': list(),
    'valLoss': list(),
    'valAcc': list()
}
os.makedirs('./checkpoints',exist_ok = True)  # make sure the checkpoint directory exists
for epoch in range(epochs):
    trainLoss,trainAcc = train(epoch)
    valLoss,valAcc = validation(epoch)
    # Save the best model
    if epoch == 0 or valAcc > max(history['valAcc']):
        print("[INFO] Successfully saved the Neural Network (Validation Accuracy %.2f)" % (valAcc))
        saveDict = {
            'net': net.state_dict(),
            'optimizer': optimizer.state_dict(),
            'epoch': epoch
        }
        torch.save(saveDict,'./checkpoints/Faster_LDN_%d_Acc%.2f.pth' % (epoch,valAcc))
    history['trainAcc'].append(trainAcc)
    history['trainLoss'].append(trainLoss)
    history['valAcc'].append(valAcc)
    history['valLoss'].append(valLoss)
test()
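Once training completes (sample output below), the curves collected in history can be visualized with the matplotlib import from earlier (a minimal plotting sketch):

plt.figure(figsize = (10,4))
plt.subplot(1,2,1)
plt.plot(history['trainLoss'],label = 'Train')
plt.plot(history['valLoss'],label = 'Validation')
plt.xlabel('Epoch'); plt.ylabel('Loss'); plt.legend()
plt.subplot(1,2,2)
plt.plot(history['trainAcc'],label = 'Train')
plt.plot(history['valAcc'],label = 'Validation')
plt.xlabel('Epoch'); plt.ylabel('Accuracy (%)'); plt.legend()
plt.show()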
[143/150] Loss-M: 0.0402 Acc-M: 99.25: 100%|██████████████████████| 146/146 [00:05<00:00, 24.42it/s]
[143/150] Val Loss: 0.0274 Val Acc: 99.60
[144/150] Loss-M: 0.0373 Acc-M: 99.51: 100%|██████████████████████| 146/146 [00:05<00:00, 24.54it/s]
[144/150] Val Loss: 0.0271 Val Acc: 99.80
[145/150] Loss-M: 0.0381 Acc-M: 99.38: 100%|██████████████████████| 146/146 [00:05<00:00, 24.75it/s]
[145/150] Val Loss: 0.0227 Val Acc: 99.80
[146/150] Loss-M: 0.0394 Acc-M: 99.34: 100%|██████████████████████| 146/146 [00:05<00:00, 24.49it/s]
[146/150] Val Loss: 0.0261 Val Acc: 99.75
[147/150] Loss-M: 0.0376 Acc-M: 99.34: 100%|██████████████████████| 146/146 [00:05<00:00, 24.37it/s]
[147/150] Val Loss: 0.0290 Val Acc: 99.80
[148/150] Loss-M: 0.0415 Acc-M: 99.15: 100%|██████████████████████| 146/146 [00:05<00:00, 24.62it/s]
[148/150] Val Loss: 0.0268 Val Acc: 99.65
[149/150] Loss-M: 0.0451 Acc-M: 99.11: 100%|██████████████████████| 146/146 [00:05<00:00, 24.74it/s]
[149/150] Val Loss: 0.0324 Val Acc: 99.55
Test Loss: 0.0300 Test Acc: 99.73
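Since the design goal includes short inference time, a rough latency measurement can be run after reloading the best checkpoint (a sketch; the checkpoint filename shown is illustrative, following the save pattern above):

import time

checkpoint = torch.load('./checkpoints/Faster_LDN_144_Acc99.80.pth',map_location = device)
net.load_state_dict(checkpoint['net'])
net.train(False)

dummy = torch.randn(1,1,224,224).to(device)
with torch.no_grad():
    for _ in range(10):               # warm-up passes
        net(dummy)
    if device.type == 'cuda':
        torch.cuda.synchronize()      # flush queued GPU work before timing
    start = time.perf_counter()
    for _ in range(100):
        net(dummy)
    if device.type == 'cuda':
        torch.cuda.synchronize()
print("Mean latency: %.2f ms" % ((time.perf_counter() - start) / 100 * 1000))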
The dataset comes from Kaggle: https://www.kaggle.com/ravirajsinh45/real-life-industrial-dataset-of-casting-product ↩︎
A calculation tool found on CSDN ↩︎