[Deep Learning Paper Notes][Image Classification] A Guide to the Image Classification Papers

[ImageNet]
• Over 15M labeled high-resolution images.
• Roughly 22k categories.
• Collected from the web and labeled via Amazon Mechanical Turk.


[ILSVRC (ImageNet Large-Scale Visual Recognition Challenge)]
• Annual competition of image classification at large scale.
• 1.2M training images, 50k validation images, and 150k testing images.
• 1k categories.
• Resolution of each image varies.
• Classification: the model makes 5 guesses per image; a prediction counts as correct if the true label is among them (top-5 error).
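The top-5 criterion above can be made concrete with a small sketch. The function name `top_k_error` and the toy scores are illustrative assumptions, not part of any ILSVRC tooling: an image counts as a miss only when its true label is absent from the k highest-scoring classes.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is not among the k top-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the k best.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data (made up): 3 images, 6 classes.
scores = [
    [0.10, 0.30, 0.05, 0.20, 0.15, 0.20],  # true class 3: in the top 5, but not top 1
    [0.50, 0.10, 0.10, 0.10, 0.10, 0.10],  # true class 0: the top guess
    [0.05, 0.30, 0.10, 0.20, 0.15, 0.20],  # true class 0: lowest score, missed
]
labels = [3, 0, 0]

top5 = top_k_error(scores, labels, k=5)  # 1 miss out of 3
top1 = top_k_error(scores, labels, k=1)  # 2 misses out of 3
print(f"top-5 error: {top5:.3f}, top-1 error: {top1:.3f}")
```

Top-5 error is never larger than top-1 error on the same predictions, since every top-1 hit is also a top-5 hit.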


[Architectures] See Tab. 1.

AlexNet
• Deeper, bigger than LeNet.
• Stacked conv layers directly on top of each other (previously it was common for every conv layer to be immediately followed by a pool layer).
• Popularized ReLU in large-scale CNNs.
• Heavy data augmentation.
• Dropout.
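Two of the ingredients listed above, ReLU and dropout, are simple enough to sketch in a few lines. This is a minimal illustration on plain Python lists, not AlexNet's implementation; in particular it assumes the "inverted" dropout variant (survivors are rescaled at training time), which is common today but differs from scaling activations at test time.

```python
import random


def relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def dropout(x, p_drop=0.5, training=True, rng=random):
    """Inverted dropout (assumed variant): zero each activation with
    probability p_drop and divide the survivors by the keep probability,
    so the expected activation is unchanged and test time is a no-op."""
    if not training or p_drop == 0.0:
        return list(x)
    keep = 1.0 - p_drop
    return [v / keep if rng.random() < keep else 0.0 for v in x]


activations = relu([-1.5, 0.0, 2.0, 3.0])
print(activations)                           # [0.0, 0.0, 2.0, 3.0]
print(dropout(activations, training=False))  # [0.0, 0.0, 2.0, 3.0]

rng = random.Random(0)  # seeded so the training-time mask is reproducible
print(dropout(activations, p_drop=0.5, rng=rng))
```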


ZFNet
• Improves on AlexNet by tweaking the architecture hyperparameters.
• conv1: change from (11 × 11, s4) to (7 × 7, s2).
• conv3,4,5: use 512, 1024, 512 filters instead of 384, 384, 256.
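The effect of the conv1 change on spatial resolution can be checked with the standard output-size formula, floor((W + 2P - K) / S) + 1. The input sizes and padding below (227 with no padding for AlexNet, 224 with padding 1 for ZFNet) are assumptions chosen for illustration; the notes above only give the kernel size and stride.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Spatial output size of a conv layer: floor((W + 2P - K) / S) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Assumed input sizes and padding; only kernel and stride come from the notes.
alexnet_conv1 = conv_output_size(227, kernel_size=11, stride=4)
zfnet_conv1 = conv_output_size(224, kernel_size=7, stride=2, padding=1)

print(alexnet_conv1)  # 55
print(zfnet_conv1)    # 110
```

Under these assumptions the smaller kernel and stride keep a conv1 feature map with twice the side length, so less fine detail is discarded in the first layer.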


GoogLeNet

• Inception module that dramatically reduces the number of parameters in the network (about 4M, compared with about 60M for AlexNet).
• Uses global average pooling instead of fc layers at the top of the network.
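A rough parameter count shows why both points above save so much. The shapes used here (a 7 × 7 × 1024 final feature map, 1000 classes, and a 192 → 16 → 32 channel branch) are illustrative assumptions, and the helper names are hypothetical; the sketch only does the arithmetic.

```python
def fc_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


def conv_weights(kernel, in_ch, out_ch):
    """Weight count of one conv layer (biases ignored)."""
    return kernel * kernel * in_ch * out_ch


def global_average_pool(feature_map):
    """feature_map[c][i][j] -> one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]


# Assumed final feature map: 7 x 7 spatial, 1024 channels, 1000 classes.
h, w, c, classes = 7, 7, 1024, 1000
flatten_then_fc = fc_params(h * w * c, classes)  # classifier on the flattened map
gap_then_fc = fc_params(c, classes)              # classifier after average pooling
print(flatten_then_fc, gap_then_fc)              # 50177000 1025000

# Assumed Inception-style branch: a 5x5 conv from 192 to 32 channels,
# with and without a 1x1 conv that first reduces 192 channels to 16.
direct_5x5 = conv_weights(5, 192, 32)
reduced_5x5 = conv_weights(1, 192, 16) + conv_weights(5, 16, 32)
print(direct_5x5, reduced_5x5)                   # 153600 15872

# Pooling itself has no parameters: it just averages each channel.
print(global_average_pool([[[1.0, 3.0], [5.0, 7.0]],
                           [[2.0, 2.0], [2.0, 2.0]]]))  # [4.0, 2.0]
```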


VGGNet

• Showed that network depth is a critical component for good performance.
• 3 × 3 conv and 2 × 2 pool only.
• Many more parameters than AlexNet or GoogLeNet (about 138M).
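One way to see why a 3 × 3-only design is attractive: a stack of stride-1 3 × 3 convs covers the same receptive field as a single larger kernel with fewer weights. The sketch below assumes stride 1, equal input and output channel counts, and ignores biases; the channel count 256 is an arbitrary example value.

```python
def stacked_receptive_field(num_layers, kernel_size=3):
    """Receptive field of num_layers stride-1 convs stacked on top of each other.
    Each layer widens the field by kernel_size - 1."""
    return 1 + num_layers * (kernel_size - 1)


def conv_stack_weights(kernel_size, channels, num_layers=1):
    """Weights of num_layers convs with `channels` inputs and outputs (no biases)."""
    return num_layers * kernel_size * kernel_size * channels * channels


channels = 256  # example value, not taken from the notes

print(stacked_receptive_field(2))  # 5, same as one 5x5 conv
print(stacked_receptive_field(3))  # 7, same as one 7x7 conv

three_3x3 = conv_stack_weights(3, channels, num_layers=3)
one_7x7 = conv_stack_weights(7, channels)
print(three_3x3, one_7x7)          # 1769472 3211264
```

The stack also applies a nonlinearity after every layer, so it is more expressive than the single large kernel despite using 27C² instead of 49C² weights.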


ResNet

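• Skip connections: each block outputs x + F(x).
• Heavy use of batch normalization (BN).
• Xavier/2 initialization (He initialization: weight variance 2/n for fan-in n).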
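The skip connection and the initialization listed above can both be sketched on plain lists. This is a toy illustration, not the ResNet code: `residual_block` takes an arbitrary hypothetical function for the residual branch, and "Xavier/2" is read here as drawing weights with variance 2 / fan_in (twice the variance of the simplified fan-in-only Xavier rule).

```python
import math
import random


def residual_block(x, residual_fn):
    """y = x + F(x): the input is added back to the branch output."""
    fx = residual_fn(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def xavier_std(fan_in):
    """Simplified Xavier rule using fan-in only: Var(w) = 1 / fan_in."""
    return math.sqrt(1.0 / fan_in)


def he_std(fan_in):
    """'Xavier/2': fan_in is halved, so Var(w) = 2 / fan_in."""
    return math.sqrt(2.0 / fan_in)


def he_init(fan_in, fan_out, rng=random):
    """fan_out x fan_in weight matrix drawn from N(0, 2 / fan_in)."""
    std = he_std(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


# If the residual branch outputs zeros, the block is exactly the identity.
x = [1.0, -2.0, 0.5]
print(residual_block(x, lambda v: [0.0] * len(v)))  # [1.0, -2.0, 0.5]

# A non-trivial branch only has to model the difference from the input.
print(residual_block(x, lambda v: [0.5 * t for t in v]))  # [1.5, -3.0, 0.75]

weights = he_init(fan_in=256, fan_out=64, rng=random.Random(0))
print(len(weights), len(weights[0]))  # 64 256
```

The factor of 2 compensates for ReLU zeroing roughly half of the pre-activations, which would otherwise shrink the signal variance layer by layer.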

[Image 1 from the original post (not reproduced)]
