CS231n Study Notes -- 16. Adversarial Examples and Adversarial Training

Overview

  • What are adversarial examples?
  • Why do they happen?
  • How can they be used to compromise machine learning
    systems?
  • What are the defenses?
  • How to use adversarial examples to improve machine
    learning, even when there is no adversary

1. Adversarial Examples

Fooling a neural net: from panda to gibbon

[Figure 1]
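
The panda-to-gibbon image is the classic fast gradient sign method (FGSM) example: the perturbation is ε · sign(∇ₓ J(θ, x, y)), one step of size ε per pixel in the direction that increases the loss. A minimal PyTorch sketch, assuming a differentiable classifier `model` and inputs scaled to [0, 1] (both hypothetical here):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.007):
    """Fast gradient sign method: one step of size epsilon per pixel
    along the sign of the input gradient of the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()  # stay in the valid pixel range
```

The panda figure uses ε = 0.007, small enough that the perturbed image looks unchanged to a human while the classifier's prediction flips with high confidence.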

Turning Objects into “Airplanes”

[Figure 2]

Attacking a Linear Model

The digits inside the yellow boxes are misclassified by the network!

[Figure 3]

The following types of classifiers suffer from the same problem:

  • Linear models: logistic regression, softmax regression, SVMs
  • Decision trees
  • Nearest neighbors
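
Attacking a linear model needs no gradient computation at all: for a score wᵀx + b, the L∞-bounded perturbation that shifts the score the most is simply ε · sign(w). A NumPy sketch with stand-in weights (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 784                        # e.g. a flattened 28x28 image
w = rng.normal(size=n)         # stand-in logistic-regression weights
x = rng.uniform(size=n)        # a clean input
eps = 0.1

# The worst-case perturbation under an L-infinity budget of eps:
x_adv = x + eps * np.sign(w)

print(w @ x)                   # clean score
print(w @ x_adv)               # clean score + eps * ||w||_1
print(eps * np.abs(w).sum())   # guaranteed size of the score shift
```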

2. Why Do They Happen?

One hypothesis: Adversarial Examples from Overfitting

[Figure 4]

The explanation the lecture favors instead: Adversarial Examples from Excessive Linearity

[Figure 5]

Modern deep nets are very piecewise linear

[Figure 6]
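
A direct way to see this piecewise linearity is to trace the logits along a line x + t·η through input space, which is how the lecture's plot is made: for a ReLU network, each logit is piecewise linear in t. A sketch, assuming some trained PyTorch model `model`, an input `x`, and a direction `eta` such as the FGSM sign direction (all hypothetical):

```python
import torch

@torch.no_grad()
def logit_trace(model, x, eta, ts):
    """Evaluate the logits along the line x + t * eta; for ReLU nets
    each logit is a piecewise-linear function of t."""
    batch = torch.stack([x + t * eta for t in ts])
    return model(batch)            # shape: (len(ts), num_classes)

ts = torch.linspace(-30.0, 30.0, steps=121)
# logits = logit_trace(model, x, eta, ts)  # plot each column against ts
```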

Small inter-class distances

[Figure 7]

High-Dimensional Linear Models

[Figure 8]
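
This is where high dimensionality bites: under an L∞ budget of ε per pixel, the worst-case change in a linear score scales with the number of input dimensions n:

$$
x_{adv} = x + \epsilon \,\mathrm{sign}(w)
\quad\Longrightarrow\quad
w^{\top} x_{adv} = w^{\top} x + \epsilon \,\lVert w \rVert_1 \approx w^{\top} x + \epsilon \, n \, \mathbb{E}\lvert w_i \rvert
$$

So a perturbation that is imperceptible per pixel can still move the score by an amount that grows linearly with n, which is why tiny L∞ noise is so effective on high-dimensional images.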

Linear Models of ImageNet

[Figure 9]

3. Compromising Machine Learning Systems

Cross-model, cross-dataset generalization

Different models trained on the same data learn nearly identical weights, which helps explain why adversarial examples transfer between models!

[Figure 10]

Cross-technique transferability

[Figure 11]

Transferability Attack

[Figure 12]
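
The transferability attack needs only query access to the victim: label your own inputs by querying the target, train a local substitute on those labels, attack the substitute white-box, and send the results back to the target. A toy end-to-end sketch using scikit-learn (model choices and ε are illustrative, not from the lecture):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X = X / 16.0                                    # pixels in [0, 1]

# The black-box target: we only use its predictions, never its gradients.
target = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                       random_state=0).fit(X, y)

# Substitute trained on labels obtained by querying the target.
substitute = LogisticRegression(max_iter=1000).fit(X, target.predict(X))

# White-box attack on the substitute: step away from its predicted class.
W = substitute.coef_                            # (n_classes, n_features)
pred = substitute.predict(X)
eps = 0.25
X_adv = np.clip(X - eps * np.sign(W[pred]), 0.0, 1.0)

print("target accuracy, clean:      ", target.score(X, y))
print("target accuracy, adversarial:", target.score(X_adv, y))
```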

Cross-Training Data Transferability

[Figure 13]

Adversarial Examples in the Human Brain

[Figure 14]

Practical Attacks

  • Fool real classifiers trained by remotely hosted APIs (MetaMind, Amazon, Google)

  • Fool malware detector networks

  • Display adversarial examples in the physical world and fool machine learning systems that perceive them through a camera

Failed defenses

None of the following defenses solve the problem:

[Figure 15]

4. Using Adversarial Examples

Training on Adversarial Examples

[Figure 16]
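
The training objective here mixes the clean loss with the loss on FGSM examples generated on the fly: J̃(θ, x, y) = α J(θ, x, y) + (1 − α) J(θ, x + ε·sign(∇ₓ J(θ, x, y)), y). A minimal training-step sketch reusing the `fgsm_attack` helper from Section 1 (α = 0.5 is an illustrative choice):

```python
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y,
                              epsilon=0.007, alpha=0.5):
    """One optimizer step on a weighted mix of the clean loss and
    the loss on freshly generated FGSM examples."""
    x_adv = fgsm_attack(model, x, y, epsilon)   # new attack every step
    optimizer.zero_grad()                       # clear grads from the attack
    loss = (alpha * F.cross_entropy(model(x), y)
            + (1.0 - alpha) * F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()
```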

Adversarial Training of Other Models

  • Linear models (SVM / linear regression): they cannot learn a step
    function, so adversarial training is less useful and acts much like
    weight decay
  • k-NN: adversarial training is prone to overfitting
  • Takeaway: neural nets can actually become more secure than other
    models. Adversarially trained neural nets have the best empirical
    success rate on adversarial examples of any machine learning model.

Adversarial Training

[Figure 17]

Virtual Adversarial Training

[Figure 18]
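
Virtual adversarial training replaces the label with the model's own prediction: the perturbation is chosen to maximally change the output distribution, measured by KL divergence, so the loss can be computed on unlabeled data. A simplified sketch with a single power-iteration step (the full method of Miyato et al. is more careful; `xi` and `epsilon` are illustrative):

```python
import torch
import torch.nn.functional as F

def _l2_normalize(d):
    # Normalize each example in the batch to unit L2 norm.
    flat_norm = d.flatten(1).norm(dim=1) + 1e-12
    return d / flat_norm.view(-1, *([1] * (d.dim() - 1)))

def vat_loss(model, x, xi=1e-6, epsilon=2.0):
    """Virtual adversarial loss: KL(p(y|x) || p(y|x + r_adv)), where
    r_adv approximately maximizes that KL. No labels are needed, so
    this loss can be computed on unlabeled data."""
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)        # the model's own prediction
    # One power-iteration step to estimate the most sensitive direction.
    d = (xi * _l2_normalize(torch.randn_like(x))).requires_grad_(True)
    kl = F.kl_div(F.log_softmax(model(x + d), dim=1), p,
                  reduction="batchmean")
    r = torch.autograd.grad(kl, d)[0]
    r_adv = epsilon * _l2_normalize(r.detach())
    return F.kl_div(F.log_softmax(model(x + r_adv), dim=1), p,
                    reduction="batchmean")
```

Adding this term, computed on unlabeled data, to the supervised loss on labeled data gives the semi-supervised training used in the text-classification results below.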

Text Classification with VAT

[Figure 19]

[Figure 20]

Conclusion

  • Attacking is easy

  • Defending is difficult

  • Adversarial training provides regularization and semi-supervised learning

  • The out-of-domain input problem is a bottleneck for model-based optimization generally
