【NasNet】《Learning Transferable Architectures for Scalable Image Recognition》

[Figure 1]
CVPR-2018


Contents

  • 1 Background and Motivation
  • 2 Advantages / Contributions
  • 3 Method
  • 4 Experiments
    • 4.1 Datasets
    • 4.2 CIFAR-10
    • 4.3 ImageNet
    • 4.4 COCO
    • 4.5 Efficiency of architecture search methods
  • 5 Conclusion (own)


1 Background and Motivation

Classification models often require significant architecture engineering.

The authors propose to directly learn the model architectures on the dataset of interest.

However, this is very resource-hungry.

So the authors first search for an architecture block on a small dataset (CIFAR-10), then transfer it to a large dataset (ImageNet).

2 Advantages / Contributions

  • Moves from hand-designing network architectures (human-invented models / engineered architectures / human-designed architectures) to hand-designing the method that searches for architectures

  • Beats the state of the art on CIFAR-10, ImageNet, and COCO (search on CIFAR-10, transfer to ImageNet)

  • Outperforms the lightweight networks MobileNet and ShuffleNet (though MobileNet V2 and ShuffleNet V2 later fired back; see 【MobileNet V2】《MobileNetV2: Inverted Residuals and Linear Bottlenecks》 and 【ShuffleNet V2】《ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design》)

NASNet search space

Search for the best cell rather than the best architecture:

  • faster
  • easier to generalize to other problems

3 Method

The design of our search space took much inspiration from LSTMs and the Neural Architecture Search (NAS) cell.

The structure of NAS is as follows:
[Figure 2]
1) The authors' improvements over NAS are as follows:
[Figure 3]

2) The method mainly searches for two types of convolutional cells:

  • Normal Cell
  • Reduction Cell (feature map halved in size, channels doubled; same structure as the Normal Cell, except that the initial operation applied to the cell's inputs uses stride 2)
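The shape bookkeeping for the two cell types can be sketched as follows (illustrative only; the real cells contain the searched operations, not just a shape rule):

```python
def cell_output_shape(h, w, c, reduction=False):
    """Illustrative shape rule for NASNet cells (not the real implementation).

    A Normal Cell keeps the spatial size; a Reduction Cell applies
    stride 2 to the cell inputs, halving H/W and doubling the channels."""
    if reduction:
        return h // 2, w // 2, c * 2
    return h, w, c
```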

Here is the overall architecture used for CIFAR-10 and ImageNet:
[Figure 4]
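A rough sketch of that stacking pattern, assuming the CIFAR-10 layout of N Normal Cells per stage with a Reduction Cell between stages (the stage count of 3 is taken from the figure; this is bookkeeping only, not a working network):

```python
def cifar_cell_stack(n=6):
    """Hypothetical cell sequence for the CIFAR-10 backbone:
    three stages of n Normal Cells, with a Reduction Cell between stages."""
    layers = []
    for stage in range(3):
        layers += ["normal"] * n
        if stage < 2:                 # no reduction after the last stage
            layers.append("reduction")
    return layers
```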
3) Schematic of the search process:
[Figure 5]
Note that each newly generated feature map is also added to the hidden state set.

4)controller RNN
[Figure 6]
[Figure 7]
Steps 1–5 are each decided by one of 5 softmax classifiers.

Each cell repeats steps 1–5 B times; experiments found B = 5 works best.

Candidate operations for steps 3 and 4:
[Figure 8]
Candidate operations for step 5:

  • element-wise addition
  • concatenation
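Putting steps 1–5 together, the controller's sampling loop can be sketched like this. This is a toy stand-in: the op names are a subset of the paper's candidate set, and the real controller is an RNN whose softmax classifiers make each decision, not a uniform sampler.

```python
import random

# Subset of the paper's candidate operations, for illustration only
OPS = ["identity", "3x3 sep conv", "5x5 sep conv", "3x3 avg pool", "3x3 max pool"]
COMBINE = ["add", "concat"]

def sample_cell(num_blocks=5, seed=0):
    """Toy stand-in for the controller: repeat steps 1-5 B times (B = num_blocks).

    hidden_states starts with the two cell inputs h_{i-1} and h_i; each new
    block output is appended, so later blocks can select it as an input."""
    rng = random.Random(seed)
    hidden_states = ["h_prev", "h_cur"]
    blocks = []
    for b in range(num_blocks):
        in1 = rng.choice(hidden_states)   # step 1: first input state
        in2 = rng.choice(hidden_states)   # step 2: second input state
        op1 = rng.choice(OPS)             # step 3: op applied to first input
        op2 = rng.choice(OPS)             # step 4: op applied to second input
        comb = rng.choice(COMBINE)        # step 5: add or concat the results
        new_state = f"block{b}"
        blocks.append((op1, in1, op2, in2, comb, new_state))
        hidden_states.append(new_state)   # new feature map joins the set
    return blocks, hidden_states
```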

4 Experiments

Proximal Policy Optimization (PPO) is used to train the controller RNN; the search takes 500 NVIDIA P100s for 4 days on CIFAR-10.

NASNet-A
[Figure 9]

4.1 Datasets

  • CIFAR-10
  • ImageNet
  • COCO

4.2 CIFAR-10

[Figure 10]
With cutout data augmentation; N = 7 (see Figure 2) works best.
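Cutout itself is simple; here is a minimal numpy sketch (the patch size is a hyperparameter, assumed 16 here):

```python
import numpy as np

def cutout(img, size=16, rng=None):
    """Zero out a random square patch of an HWC image (cutout augmentation)."""
    rng = rng or np.random.default_rng(0)
    h, w = img.shape[:2]
    cy, cx = int(rng.integers(h)), int(rng.integers(w))
    # Clip the patch so it stays inside the image
    y1, y2 = max(0, cy - size // 2), min(h, cy + size // 2)
    x1, x2 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = img.copy()
    out[y1:y2, x1:x2] = 0.0
    return out
```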

4.3 ImageNet

No residual connections are used.

[Figure 11]
Fewer parameters and less computation, yet higher accuracy.
[Figure 12]
Under a constrained computation budget, accuracy beats MobileNet and ShuffleNet, indicating more efficient use of parameters.
[Figure 13]

4.4 COCO

NASNet + Faster RCNN pipeline
[Figure 14]
These results provide further evidence that NASNet provides superior, generic image features that may be transferred across other computer vision tasks.

[Figure 15]
It also achieves more accurate localization.

4.5 Efficiency of architecture search methods

[Figure 16]

Reinforcement learning vs. random search, i.e.,
sampling the decisions from the softmax classifiers vs. sampling them from a uniform distribution.
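The difference between the two strategies is just which distribution each decision is drawn from; a minimal sketch:

```python
import numpy as np

def decision_probs(logits, random_search=False):
    """Distribution a search decision is sampled from (illustrative sketch).

    RL search: softmax over the controller's logits.
    Random search: uniform over the candidates, ignoring the logits."""
    logits = np.asarray(logits, dtype=float)
    if random_search:
        return np.full(logits.shape, 1.0 / logits.size)
    e = np.exp(logits - logits.max())  # shift for numerical stability
    return e / e.sum()
```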

brute-force random search

Here are the structures of NASNet-B and NASNet-C for CIFAR-10 and ImageNet (NASNet-A is the best):
[Figure 17] [Figure 18]
[Figure 19]
[Figure 20] [Figure 21]
[Figure 22]

5 Conclusion (own)

  • In the tables, what does the number in "N @ number" mean? Is it the total number of states?
  • The RL-based search strategy in the appendix is worth studying: it is not the main focus of the paper's exposition, but it is the core method!
  • When actually coding this, how do the multiple hidden states h get implemented?

The post 论文笔记-NASNet explains the number as follows:
[Figure 23]

That post also gives a detailed introduction to the RL details!
