Cihang Xie1,2∗, Mingxing Tan1, Boqing Gong1, Jiang Wang1, Alan Yuille2, Quoc V. Le1 (1: Google; 2: Johns Hopkins University)
A new research paper from Quoc Le, a founding member of Google Brain and one of the creators of AutoML; published 2019/11/21.
Adversarial examples are commonly viewed as a threat to ConvNets. Here we present an opposite perspective: adversarial examples can be used to improve image recognition models if harnessed in the right manner. We propose AdvProp, ... (quote)
For the latest model, EfficientNet-B7, the classification results are: ImageNet (+0.7%), ImageNet-C (+6.5%), ImageNet-A (+7.0%), Stylized-ImageNet (+4.8%).
(3.5B Instagram images (∼3000× more than ImageNet) and ∼9.4× more parameters.)
Models are available at https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet.
2 Introduction
The existence of adversarial examples not only reveals the limited generalization ability of ConvNets, but also poses security threats to the real-world deployment of these models. (quote)
Recent works [15, 13, 31] also suggest that training with adversarial examples on large datasets, e.g., ImageNet [23], with supervised learning results in performance degradation on clean images. (contrast)
We observe all previous methods jointly train over clean images and adversarial examples without distinction even though they should be drawn from different underlying distributions.
In this paper, we propose AdvProp, short for Adversarial Propagation,
3 Related Work
Adversarial Training. Adversarial training, which trains networks with adversarial examples, constitutes the current foundation of state-of-the-arts for defending against adversarial attacks [5, 15, 19, 31].
This paper focuses on standard supervised learning without extra data. Although using similar adversarial training techniques, we stand on an opposite perspective to previous works: we aim at using adversarial examples to improve clean image recognition accuracy.
Benefits of Learning Adversarial Features. Adversarial examples make network representations align better with salient data characteristics and human perception [30]. Moreover, such trained models are much more robust to high-frequency noise [32].
Zhang et al. [35] further suggest these adversarially learned feature representations are less sensitive to texture distortions and focus more on shape information.
Our proposed AdvProp can be characterized as a training paradigm which fully exploits the complementarity between clean images and their corresponding adversarial examples.
Data augmentation. E.g., horizontal flipping and random cropping, masking out [3] or adding Gaussian noise [18] to regions in images, or mixing up pairs of images and their labels in a convex manner [33].
Our work can be regarded as one type of data augmentation: creating additional training samples by injecting noise.
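The results above provide an encouraging signal: if harnessed properly, adversarial examples can benefit model performance. Nonetheless, we note that this approach does not generally improve performance. (quote continues)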
5.1 Adversarial Training
Vanilla training: trains on clean images only.
Madry’s adversarial training: trains on adversarial images only.
We train networks with a mixture of adversarial examples and clean images: mixed training.
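The three training regimes listed above can be written as optimization objectives. The notation below is an assumption made for these notes: θ denotes the network parameters, D the data distribution, L the loss function, ε an adversarial perturbation, and S the set of allowed perturbations.

```latex
% Vanilla training: clean images only
\arg\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathbb{D}} \big[ L(\theta, x, y) \big]

% Madry's adversarial training: adversarial images only
\arg\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathbb{D}} \Big[ \max_{\epsilon \in \mathbb{S}} L(\theta, x + \epsilon, y) \Big]

% Mixed training: clean images and their adversarial counterparts together
\arg\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathbb{D}} \Big[ L(\theta, x, y) + \max_{\epsilon \in \mathbb{S}} L(\theta, x + \epsilon, y) \Big]
```

The only difference between the three is which terms appear inside the expectation: the clean loss, the worst-case perturbed loss, or their sum.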
5.2 Disentangled Learning via An Auxiliary BN
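The heading above refers to keeping separate batch-norm statistics for clean and adversarial inputs, since the two are assumed to come from different distributions. The following is a minimal sketch of that idea for a single scalar feature, not the paper's implementation. The class names `SimpleBatchNorm` and `DualBatchNorm` and the `adversarial` flag are hypothetical; the momentum of 0.99 follows the setup quoted in these notes, and the epsilon value is an assumption.

```python
import math


class SimpleBatchNorm:
    """Batch norm over a list of scalars, with its own running statistics."""

    def __init__(self, momentum=0.99, eps=1e-3):
        self.momentum = momentum  # weight kept on the old running statistic
        self.eps = eps            # assumed value, for numerical stability
        self.running_mean = 0.0
        self.running_var = 1.0
        self.gamma = 1.0          # learnable scale (fixed in this sketch)
        self.beta = 0.0           # learnable shift (fixed in this sketch)

    def __call__(self, batch, training):
        if training:
            mean = sum(batch) / len(batch)
            var = sum((v - mean) ** 2 for v in batch) / len(batch)
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            mean, var = self.running_mean, self.running_var
        scale = self.gamma / math.sqrt(var + self.eps)
        return [scale * (v - mean) + self.beta for v in batch]


class DualBatchNorm:
    """Routes clean batches to a main BN and adversarial batches to an auxiliary BN."""

    def __init__(self):
        self.main = SimpleBatchNorm()
        self.aux = SimpleBatchNorm()

    def __call__(self, batch, training, adversarial=False):
        # The auxiliary BN is used only while training on adversarial inputs;
        # inference always goes through the main BN.
        bn = self.aux if (training and adversarial) else self.main
        return bn(batch, training)


layer = DualBatchNorm()
layer([0.0, 1.0, 2.0], training=True)                        # updates main statistics
layer([10.0, 11.0, 12.0], training=True, adversarial=True)   # updates auxiliary statistics
```

All other layers would be shared between the two paths; only the normalization statistics and affine parameters are duplicated, so the auxiliary branch can be dropped at test time.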
5.3 AdvProp
6.1 Experiments Setup
Architectures. The networks range from the light-weight EfficientNet-B0 to the large EfficientNet-B7. We follow the settings in [28] to train these networks: RMSProp optimizer with decay 0.9 and momentum 0.9; batch norm momentum 0.99; weight decay 1e-5; initial learning rate 0.256 that decays by 0.97 every 2.4 epochs; a fixed AutoAugment policy [1] is applied to augment training images.
Adversarial Attackers. We choose Projected Gradient Descent (PGD) [19] under L∞ norm as the default attacker for generating adversarial examples on-the-fly (i.e., dynamically during training).
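The learning-rate schedule quoted above (initial rate 0.256, multiplied by 0.97 every 2.4 epochs) can be sketched as a small function. The function name `learning_rate` and the `staircase` flag are hypothetical, and whether the decay is applied in discrete steps or continuously is an assumption here; the notes do not say.

```python
import math


def learning_rate(epoch, initial_lr=0.256, decay_rate=0.97,
                  decay_epochs=2.4, staircase=True):
    """Exponentially decayed learning rate at a (possibly fractional) epoch."""
    num_decays = epoch / decay_epochs
    if staircase:
        # Assumption: the rate drops in discrete steps, once per 2.4 epochs.
        num_decays = math.floor(num_decays)
    return initial_lr * decay_rate ** num_decays


for epoch in (0, 2.4, 24, 240):
    print(epoch, learning_rate(epoch))
```

With `staircase=False` the same formula gives a smooth curve that passes through the same values at multiples of 2.4 epochs.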
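To make the attacker concrete, here is a minimal sketch of PGD under the L∞ norm. It assumes a toy binary logistic-regression model so that the input gradient can be written by hand; the real setting attacks a ConvNet through automatic differentiation. The names `loss_and_grad_x` and `pgd_linf`, and the parameters `epsilon`, `step_size`, and `num_steps`, are hypothetical, and their values below are illustrative only.

```python
import math
import random


def loss_and_grad_x(w, b, x, y):
    """Logistic loss for label y in {0, 1} and its gradient with respect to x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    loss = -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
    grad = [(p - y) * wi for wi in w]
    return loss, grad


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(w, b, x, y, epsilon, step_size, num_steps, seed=0):
    """Return a perturbed copy of x that stays within epsilon of x in L-infinity norm."""
    rng = random.Random(seed)
    # Random start inside the epsilon-ball around the clean input.
    x_adv = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
    for _ in range(num_steps):
        _, grad = loss_and_grad_x(w, b, x_adv, y)
        # Gradient ascent on the loss, using only the sign of the gradient.
        x_adv = [xa + step_size * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back onto the epsilon-ball around the clean input.
        x_adv = [min(max(xa, xi - epsilon), xi + epsilon)
                 for xa, xi in zip(x_adv, x)]
    return x_adv


w, b = [1.0, -2.0, 0.5], 0.1
x, y = [0.5, -0.2, 0.1], 1
x_adv = pgd_linf(w, b, x, y, epsilon=0.1, step_size=0.025, num_steps=10)
print(loss_and_grad_x(w, b, x, y)[0], loss_and_grad_x(w, b, x_adv, y)[0])
```

Generating examples on the fly means running a loop like this on every training batch against the current weights, rather than precomputing a fixed adversarial dataset. For images, a final clip to the valid pixel range would normally follow the projection step; it is omitted here for brevity.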
Datasets. In addition to reporting performance on the original ImageNet validation set, we go beyond by testing the models on the following test sets:
ImageNet-C. Image corruptions.
ImageNet-A. Challenging scenarios (e.g., occlusion and fog scenes).
Stylized-ImageNet. Removes local texture cues while retaining global shape information.
6.2 ImageNet Results and Beyond
Generalization on Distorted ImageNet Datasets.
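The results above show that AdvProp significantly improves model generalization by allowing the model to learn richer internal representations than vanilla training does. These richer representations not only supply the global shape information needed to classify the Stylized-ImageNet dataset better, but also make the model more robust to common image corruptions.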
Ablation on Adversarial Attacker Strength.
6.3 Comparisons to Adversarial Training