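(ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012)
This paper trains a large, deep convolutional neural network for image classification. The network has 60 million parameters and 650,000 neurons, and consists of 5 convolutional layers and 3 fully connected layers. To speed up training, the model was trained on GPUs. The paper also applies the then recently proposed "dropout" regularization method and shows that it is effective against overfitting. The model won the ILSVRC-2012 competition with a top-5 test error of 15.3%, which is 10.9 percentage points lower than the second-place entry (26.2%).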
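As a rough sanity check on the "60 million parameters" figure, the sketch below tallies weights and biases layer by layer. The layer shapes used here (kernel sizes, channel counts, and the two-GPU split in which each kernel of conv2, conv4, and conv5 sees only half of the input channels) follow the paper's description and are assumptions of this sketch; the helper names `conv_params` and `fc_params` are illustrative only.

```python
# Parameter tally for the original two-GPU AlexNet layout (assumed shapes).

def conv_params(out_channels, in_channels_per_kernel, kernel_size):
    """Weights plus one bias per output channel for a square convolution."""
    weights = out_channels * in_channels_per_kernel * kernel_size * kernel_size
    return weights + out_channels


def fc_params(n_in, n_out):
    """Weights plus one bias per output unit for a fully connected layer."""
    return n_in * n_out + n_out


LAYERS = {
    "conv1": conv_params(96, 3, 11),
    "conv2": conv_params(256, 48, 5),    # each kernel sees 48 of 96 channels
    "conv3": conv_params(384, 256, 3),   # connected across both GPUs
    "conv4": conv_params(384, 192, 3),   # each kernel sees 192 of 384 channels
    "conv5": conv_params(256, 192, 3),
    "fc6": fc_params(6 * 6 * 256, 4096),
    "fc7": fc_params(4096, 4096),
    "fc8": fc_params(4096, 1000),
}

TOTAL_PARAMS = sum(LAYERS.values())

if __name__ == "__main__":
    for name, count in LAYERS.items():
        print(f"{name}: {count:,}")
    print(f"total: {TOTAL_PARAMS:,}")
```

Under these assumptions the total is 60,965,224, and the three fully connected layers account for most of it.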
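This figure shows the overall architecture of the AlexNet model.
This figure illustrates the AlexNet model during training.
Input: an image of size $224 \times 224 \times 3$
Output: the predicted class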
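To see how an input image shrinks to the feature map that feeds the fully connected layers, the sketch below applies the standard output-size formula, floor((size + 2 * padding - kernel) / stride) + 1, to each convolution and pooling layer. The kernel sizes, strides, and paddings listed are assumptions based on the paper and on common re-implementations. Note that with an unpadded 11 x 11, stride-4 first convolution, the 55 x 55 map shown in the paper's architecture figure is only reached from a 227 input (or from 224 with extra padding), so the sketch traces 227.

```python
# Spatial-size trace through the AlexNet feature extractor (assumed settings).

def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding)
ALEXNET_SPATIAL = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),   # overlapping pooling: 3x3 window, stride 2
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def trace_sizes(input_size, layers=ALEXNET_SPATIAL):
    """Return [(layer name, output side length), ...] for a square input."""
    sizes = []
    size = input_size
    for name, kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


if __name__ == "__main__":
    for name, size in trace_sizes(227):
        print(f"{name}: {size} x {size}")
```

With a 227 input the final map is 6 x 6, which, multiplied by 256 channels, gives the 9216 inputs of the first fully connected layer.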
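An intuitive view of convolution and pooling (see the related reference)
Convolution: multiply element-wise, then sum
Pooling: removes redundancy (see the related reference)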
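The two one-line descriptions above can be made concrete with a minimal pure-Python sketch. `conv2d_valid` slides a kernel over the image and, at each position, multiplies overlapping entries and sums them (strictly a cross-correlation, which is what deep learning libraries usually call convolution). `max_pool2d` keeps only the largest value in each window. Both function names and the toy data are illustrative, not taken from any library.

```python
# Toy convolution ("multiply, then sum") and max pooling on nested lists.

def conv2d_valid(image, kernel):
    """Slide kernel over image without padding; multiply and sum each window."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


def max_pool2d(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window."""
    output = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        output.append(row)
    return output


IMAGE = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
KERNEL = [
    [1, 0],
    [0, -1],
]

if __name__ == "__main__":
    print(conv2d_valid(IMAGE, KERNEL))  # 3 x 3 result
    print(max_pool2d(IMAGE))            # 2 x 2 result
```

Pooling the 4 x 4 toy image with a 2 x 2 window and stride 2 leaves a 2 x 2 map, a quarter of the original values.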
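- "Dropout": keeps the network sparse by randomly disabling units during training, which reduces overfitting and improves the model's generalization ability. (see the related reference)
- "Data Augmentation": randomly altering the training samples reduces the model's dependence on particular attributes, which improves its generalization ability.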
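Both techniques in the list above are simple to sketch in pure Python. The dropout function below uses the "inverted" variant common in current libraries, which rescales the surviving activations during training; the paper instead halves the outputs at test time, so treat this as an equivalent-in-spirit assumption rather than the paper's exact procedure. The augmentation function takes a random crop and flips it horizontally half of the time, the two image transformations the paper describes. All names here are illustrative.

```python
import random

# Minimal sketches of dropout and crop/flip augmentation on Python lists.

def dropout(activations, p=0.5, training=True, rng=random):
    """Zero each activation with probability p; rescale the rest by 1/(1-p)."""
    assert 0.0 <= p < 1.0, "p must be in [0, 1)"
    if not training or p == 0.0:
        return list(activations)  # no units are dropped at evaluation time
    scale = 1.0 / (1.0 - p)
    return [a * scale if rng.random() >= p else 0.0 for a in activations]


def random_crop_and_flip(image, crop, rng=random):
    """Cut a random crop x crop patch and mirror it with probability 0.5."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop)    # randint is inclusive on both ends
    left = rng.randint(0, width - crop)
    patch = [row[left:left + crop] for row in image[top:top + crop]]
    if rng.random() < 0.5:
        patch = [row[::-1] for row in patch]
    return patch


if __name__ == "__main__":
    rng = random.Random(0)
    print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5, rng=rng))
    toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(random_crop_and_flip(toy_image, crop=3, rng=rng))
```

Passing a seeded `random.Random` instance makes the behavior reproducible, which helps when debugging a training pipeline.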
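CIFAR-10 dataset: a general-purpose object recognition dataset collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It contains 60,000 color images of size $32 \times 32$ in 10 classes, split into 50,000 training images and 10,000 test images. (CIFAR-10 dataset download)
This figure shows samples from the CIFAR-10 dataset.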
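For readers who want to inspect the raw data, here is a minimal sketch of decoding the binary distribution of CIFAR-10, assuming the record layout documented on the dataset page: 1 label byte followed by 3072 pixel bytes (1024 red, then 1024 green, then 1024 blue, each in row-major order). The function names are illustrative, and the file path passed to `read_batch` is whatever local path the download was extracted to.

```python
# Decode CIFAR-10 binary records (assumed layout: 1 label byte + 3072 pixels).

IMAGE_SIZE = 32
CHANNEL_BYTES = IMAGE_SIZE * IMAGE_SIZE   # 1024 bytes per color plane
RECORD_BYTES = 1 + 3 * CHANNEL_BYTES      # 3073 bytes per image


def decode_record(record):
    """Return (label, image) where image[y][x] is an (r, g, b) tuple."""
    if len(record) != RECORD_BYTES:
        raise ValueError(f"expected {RECORD_BYTES} bytes, got {len(record)}")
    label = record[0]
    pixels = record[1:]
    image = []
    for y in range(IMAGE_SIZE):
        row = []
        for x in range(IMAGE_SIZE):
            idx = y * IMAGE_SIZE + x
            row.append((pixels[idx],
                        pixels[CHANNEL_BYTES + idx],
                        pixels[2 * CHANNEL_BYTES + idx]))
        image.append(row)
    return label, image


def read_batch(path):
    """Read every complete record from one binary batch file."""
    with open(path, "rb") as handle:
        data = handle.read()
    return [decode_record(data[i:i + RECORD_BYTES])
            for i in range(0, len(data) - RECORD_BYTES + 1, RECORD_BYTES)]


if __name__ == "__main__":
    # Synthetic record: label 7, uniform color (10, 20, 30).
    fake = bytes([7]) + bytes([10]) * 1024 + bytes([20]) * 1024 + bytes([30]) * 1024
    label, image = decode_record(fake)
    print(label, image[0][0])
```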
// PyTorch implementation
## model.py
import torch.nn as nn
import torch.utils.model_zoo as model_zoo
from IPython import embed
from collections import OrderedDict
from utee import misc  # logging helper that ships with the original code base, not with PyTorch

# Route print() through the project logger.
print = misc.logger.info

model_urls = {
    'cifar10': 'http://ml.cs.tsinghua.edu.cn/~chenxi/pytorch-models/cifar10-d875770b.pth',
}


class CIFAR(nn.Module):
    def __init__(self, features, n_channel, num_classes):
        super(CIFAR, self).__init__()
        assert isinstance(features, nn.Sequential), type(features)
        self.features = features
        # n_channel is the number of features left after flattening.
        self.classifier = nn.Sequential(
            nn.Linear(n_channel, num_classes)
        )
        print(self.features)
        print(self.classifier)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, features)
        x = self.classifier(x)
        return x


def make_layers(cfg, batch_norm=False):
    # cfg entries: 'M' is a 2x2 max pool, an int is the output channel count
    # of a 3x3 conv with padding 1, and a tuple is (output channels, padding).
    layers = []
    in_channels = 3
    for i, v in enumerate(cfg):
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            padding = v[1] if isinstance(v, tuple) else 1
            out_channels = v[0] if isinstance(v, tuple) else v
            conv2d = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=padding)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(out_channels, affine=False), nn.ReLU()]
            else:
                layers += [conv2d, nn.ReLU()]
            in_channels = out_channels
    return nn.Sequential(*layers)


def cifar10(n_channel, pretrained=None):
    cfg = [n_channel, n_channel, 'M', 2*n_channel, 2*n_channel, 'M', 4*n_channel, 4*n_channel, 'M', (8*n_channel, 0), 'M']
    layers = make_layers(cfg, batch_norm=True)
    model = CIFAR(layers, n_channel=8*n_channel, num_classes=10)
    # Any non-None value of `pretrained` triggers a download from model_urls;
    # the value itself is not read as a path.
    if pretrained is not None:
        m = model_zoo.load_url(model_urls['cifar10'])
        state_dict = m.state_dict() if isinstance(m, nn.Module) else m
        assert isinstance(state_dict, (dict, OrderedDict)), type(state_dict)
        model.load_state_dict(state_dict)
    return model


if __name__ == '__main__':
    model = cifar10(128, pretrained='log/cifar10/best-135.pth')
    embed()  # drop into an interactive IPython shell to inspect the model
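The classifier in the code above takes `8*n_channel` input features, which only works if the convolutional stack reduces a 32 x 32 CIFAR-10 image to a 1 x 1 map. The sketch below checks this without PyTorch by replaying the same configuration list with the output-size arithmetic for a 3 x 3 convolution and a 2 x 2, stride-2 max pool. `trace_features` and `cifar10_cfg` are hypothetical helpers written for this check; they mirror the `cfg` convention of `make_layers` but are not part of the original code.

```python
# Replay the layer configuration to track (channels, spatial size).

def cifar10_cfg(n_channel):
    """Same configuration list that cifar10() builds."""
    return [n_channel, n_channel, 'M',
            2 * n_channel, 2 * n_channel, 'M',
            4 * n_channel, 4 * n_channel, 'M',
            (8 * n_channel, 0), 'M']


def trace_features(cfg, input_size=32, in_channels=3):
    """Return [(cfg entry, channels, side length), ...] after each entry."""
    size, channels = input_size, in_channels
    trace = []
    for v in cfg:
        if v == 'M':
            size = (size - 2) // 2 + 1          # 2x2 max pool, stride 2
        else:
            out_channels, padding = v if isinstance(v, tuple) else (v, 1)
            size = size + 2 * padding - 3 + 1   # 3x3 conv, stride 1
            channels = out_channels
        trace.append((v, channels, size))
    return trace


if __name__ == "__main__":
    for entry, channels, size in trace_features(cifar10_cfg(128)):
        print(f"{entry!s:>10} -> {channels} x {size} x {size}")
```

With `n_channel=128` the last entry leaves 1024 channels at 1 x 1, so the flattened vector has 1024 = 8 * 128 features, matching the `nn.Linear` input size. The unpadded final convolution (4 to 2) followed by the last pool (2 to 1) is what makes this work.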
Full PyTorch code: PyTorch
Full TensorFlow code: TensorFlow
PaddlePaddle code: PaddlePaddle
Keras code: Keras
The model is still training; results will be posted later.
References
[1] https://zh.d2l.ai
[2] https://www.cs.toronto.edu/~kriz/cifar.html
[3] https://blog.csdn.net/zeuseign/article/details/72773342
[4] https://www.cnblogs.com/vipyoumay/p/7686230.html
[5] https://zhuanlan.zhihu.com/p/27381582