Adversarially Learned Anomaly Detection
IEEE ICDM 2018
paper
code
1. Developing effective methods for complex, high-dimensional data remains a challenge; such data is hard to handle.
2. The need to solve an optimization problem for every test example makes the earlier GAN-based method (AnoGAN) impractical on large datasets or for real-time applications.
Advantage of ALAD: it is not only effective, but also efficient at test time.
loss
$$V(D_{xz}, D_{xx}, D_{zz}, E, G) = V(D_{xz}, E, G) + V(D_{xx}, E, G) + V(D_{zz}, E, G)$$
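For reference, the three terms can be written out as below. This is a sketch that assumes each term takes the standard GAN value-function form (E and G minimize, the discriminators maximize); the notes themselves do not spell the terms out, and the distribution symbols $p_X$ (data) and $p_Z$ (latent prior) are notational assumptions.

$$
\begin{aligned}
V(D_{xz}, E, G) &= \mathbb{E}_{x \sim p_X}\left[\log D_{xz}(x, E(x))\right] + \mathbb{E}_{z \sim p_Z}\left[\log\left(1 - D_{xz}(G(z), z)\right)\right] \\
V(D_{xx}, E, G) &= \mathbb{E}_{x \sim p_X}\left[\log D_{xx}(x, x)\right] + \mathbb{E}_{x \sim p_X}\left[\log\left(1 - D_{xx}(x, G(E(x)))\right)\right] \\
V(D_{zz}, E, G) &= \mathbb{E}_{z \sim p_Z}\left[\log D_{zz}(z, z)\right] + \mathbb{E}_{z \sim p_Z}\left[\log\left(1 - D_{zz}(z, E(G(z)))\right)\right]
\end{aligned}
$$

Read this way, $D_{xz}$ matches the joint distributions of (data, code) pairs, while $D_{xx}$ and $D_{zz}$ push the cycles $x \to G(E(x))$ and $z \to E(G(z))$ to be consistent.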
Anomaly Score
$$A(x) = \left\| f_{xx}(x, x) - f_{xx}(x, G(E(x))) \right\|_{1}$$
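A(x) reflects the confidence of the discriminator D_xx that a sample is well encoded by the encoder and well reconstructed by the generator. Here f_xx denotes the activations of the layer before the logits in D_xx. The larger the value, the more anomalous the sample.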
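The score above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: `encode`, `generate`, and `features_xx` are hypothetical callables standing in for the trained E, G, and the feature layer of D_xx, and the toy linear versions below exist only to make the example runnable.

```python
import numpy as np

def anomaly_score(x, encode, generate, features_xx):
    """Per-sample L1 distance between D_xx features of (x, x) and (x, G(E(x)))."""
    x_rec = generate(encode(x))          # reconstruction G(E(x))
    f_real = features_xx(x, x)           # features of the pair (x, x)
    f_rec = features_xx(x, x_rec)        # features of the pair (x, G(E(x)))
    diff = np.abs(f_real - f_rec).reshape(len(x), -1)
    return diff.sum(axis=1)              # L1 norm per sample

# Toy stand-ins (assumptions for illustration only, not trained networks).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))              # hypothetical linear "encoder" weights
W_pinv = np.linalg.pinv(W)               # hypothetical linear "generator" weights

def toy_encode(x):
    return x @ W

def toy_generate(z):
    return z @ W_pinv

def toy_features_xx(a, b):
    # Stand-in for the pre-logit layer of D_xx: just concatenate the pair.
    return np.concatenate([a, b], axis=1)

# Samples the toy generator can reproduce exactly vs. arbitrary samples.
x_normal = rng.normal(size=(5, 2)) @ W_pinv
x_odd = rng.normal(size=(5, 4))

scores_normal = anomaly_score(x_normal, toy_encode, toy_generate, toy_features_xx)
scores_odd = anomaly_score(x_odd, toy_encode, toy_generate, toy_features_xx)
print(scores_normal.max(), scores_odd.min())
```

In the toy setup, samples that the encoder/generator pair reconstructs well get scores near zero, while samples off that subspace get larger scores, which is the behaviour the anomaly score relies on. Note that scoring needs only forward passes, with no per-sample optimization.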
Datasets:
Parameter settings:
KDDCup99: 20% anomalies
Arrhythmia: 15% anomalies
We use 80% of the whole official dataset for training and keep the remaining 20% as our test set.
We further remove 25% from the training set for a validation set and discard anomalous samples from both training and validation sets (thus setting up a novelty detection task).
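The split described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `novelty_split`, the random shuffling, and the label convention (1 = anomalous, 0 = normal) are not from the paper, and the synthetic data is only there to make the example runnable.

```python
import numpy as np

def novelty_split(X, y, seed=0):
    """80/20 train/test split, then 25% of train as validation.
    Anomalous samples (y == 1) are discarded from train and validation only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))

    n_test = int(round(0.2 * len(X)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    n_val = int(round(0.25 * len(train_idx)))
    val_idx, train_idx = train_idx[:n_val], train_idx[n_val:]

    # Novelty-detection setup: training and validation see normal data only.
    train_idx = train_idx[y[train_idx] == 0]
    val_idx = val_idx[y[val_idx] == 0]

    return ((X[train_idx], y[train_idx]),
            (X[val_idx], y[val_idx]),
            (X[test_idx], y[test_idx]))

# Synthetic example: 1000 samples, every fifth one labelled anomalous.
X = np.arange(1000, dtype=float).reshape(-1, 1)
y = (np.arange(1000) % 5 == 0).astype(int)

(X_tr, y_tr), (X_va, y_va), (X_te, y_te) = novelty_split(X, y)
print(len(X_tr), len(X_va), len(X_te))
```

The test set keeps its anomalies, since those are what the detector is evaluated on.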
Evaluation metrics:
Precision, Recall, F1 score
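A minimal sketch of how these metrics could be computed from anomaly scores. It assumes the anomaly ratio listed for each dataset is used as the fraction of test samples flagged as anomalous (highest scores first), with anomalies as the positive class; the function name `evaluate_top_fraction` and the example numbers are hypothetical.

```python
import numpy as np

def evaluate_top_fraction(scores, y_true, anomaly_fraction):
    """Flag the top `anomaly_fraction` of samples by score, then compute
    precision, recall, and F1 with anomalies (label 1) as the positive class."""
    n_flag = int(round(anomaly_fraction * len(scores)))
    y_pred = np.zeros(len(scores), dtype=int)
    if n_flag > 0:
        y_pred[np.argsort(scores)[-n_flag:]] = 1   # highest scores are flagged

    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Made-up scores and labels, flagging the top 20%.
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1])

precision, recall, f1 = evaluate_top_fraction(scores, y_true, 0.2)
print(precision, recall, f1)
```

When the flagged fraction equals the true anomaly ratio, the number of false positives equals the number of false negatives, so precision, recall, and F1 coincide.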
baselines:
One Class Support Vector Machines (OC-SVM)
Support vector method for novelty detection 1999
Isolation Forests (IF)
Isolation forest 2008
Deep Structured Energy Based Models (DSEBM)
Deep structured energy based models for anomaly detection 2016
Deep Autoencoding Gaussian Mixture Model (DAGMM)
Deep autoencoding gaussian mixture model for unsupervised anomaly detection 2018
AnoGAN
Unsupervised anomaly detection with generative adversarial networks to guide marker discovery 2017
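We propose ALAD, a GAN-based anomaly detection method that learns an encoder from the data space to the latent space during training, which makes it more efficient at test time than the only published GAN-based method. In addition, we use additional discriminators to improve the encoder, together with spectral normalization, which has been found to stabilize GAN training.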