ALAD

Adversarially Learned Anomaly Detection

IEEE ICDM 2018
paper
code

Motivation (main problems addressed)

1. Developing effective methods for complex, high-dimensional data remains a challenge.

2. Earlier GAN-based detectors such as AnoGAN need to solve an optimization problem for every test example, which makes them impractical on large datasets or for real-time applications.

Advantage: ALAD is effective and also efficient at test time.

Framework

[Figure 1: ALAD framework]

Loss & Anomaly Score

Loss
$$V(D_{xz}, D_{xx}, D_{zz}, E, G) = V(D_{xz}, E, G) + V(D_{xx}, E, G) + V(D_{zz}, E, G)$$
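To make the decomposition concrete, here is a minimal sketch that sums three value terms. It assumes each term takes the standard GAN form, mean log D(real pair) plus mean log(1 - D(fake pair)), and that the discriminator outputs are already available as probabilities. The helper names `gan_value` and `alad_value` and the dictionary layout are hypothetical and are not taken from the paper's code.

```python
import math
from statistics import mean

def gan_value(d_real, d_fake):
    """Assumed standard GAN value term: E[log D(real)] + E[log(1 - D(fake))]."""
    return mean(math.log(p) for p in d_real) + mean(math.log(1.0 - p) for p in d_fake)

def alad_value(disc_outputs):
    """Sum the value terms of D_xz, D_xx and D_zz.

    disc_outputs maps a discriminator name to a pair
    (outputs on real pairs, outputs on fake pairs),
    each a list of probabilities strictly between 0 and 1.
    """
    return sum(gan_value(real, fake) for real, fake in disc_outputs.values())

# Hypothetical discriminator outputs, for illustration only.
outputs = {
    "D_xz": ([0.5, 0.5], [0.5, 0.5]),
    "D_xx": ([0.5, 0.5], [0.5, 0.5]),
    "D_zz": ([0.5, 0.5], [0.5, 0.5]),
}
total = alad_value(outputs)
```

In this toy case every discriminator outputs 0.5 (it cannot tell real from fake pairs), so each of the three terms contributes 2 log 0.5 and the total is 6 log 0.5.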

Anomaly Score
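$$A(x) = \left\| f_{xx}(x, x) - f_{xx}(x, G(E(x))) \right\|_{1}$$

Here f_xx(·,·) denotes the activations of the layer before the logits in the D_xx network. A(x) reflects the discriminator's confidence that a sample is well encoded by the encoder and well reconstructed by the generator. The larger the value, the more anomalous the sample.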
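The score can be sketched in a few lines once an encoder, a generator and a feature extractor are available. The version below is only an illustration: `toy_encoder`, `toy_generator` and `toy_features` are hypothetical stand-ins for the trained E, G and the D_xx feature layer, chosen so the arithmetic is easy to follow.

```python
def anomaly_score(x, encoder, generator, features):
    """A(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_1."""
    x_rec = generator(encoder(x))      # reconstruction G(E(x))
    f_same = features(x, x)            # f_xx(x, x)
    f_rec = features(x, x_rec)         # f_xx(x, G(E(x)))
    return sum(abs(a - b) for a, b in zip(f_same, f_rec))

# Toy stand-ins (assumptions, not trained networks).
def toy_encoder(x):
    return [sum(x) / len(x)]           # 1-D latent code: the mean of x

def toy_generator(z, dim=3):
    return [z[0]] * dim                # repeat the latent value dim times

def toy_features(x, x_prime):
    return list(x) + list(x_prime)     # concatenate the input pair

normal_score = anomaly_score([1.0, 1.0, 1.0], toy_encoder, toy_generator, toy_features)
outlier_score = anomaly_score([1.0, 1.0, 4.0], toy_encoder, toy_generator, toy_features)
```

With these stand-ins a constant input is reconstructed exactly and scores 0, while the input with one deviating component is reconstructed poorly and receives a higher score.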

Experiments

Datasets:

  1. KDDCup99
  2. Arrhythmia

Settings:

KDDCup99: 20% anomalies

Arrhythmia: 15% anomalies

We use 80% of the whole official dataset for training and keep the remaining 20% as our test set.

We further remove 25% from the training set for a validation set and discard anomalous samples from both training and validation sets (thus setting up a novelty detection task).
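The split described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' preprocessing code: the function name `split_for_novelty_detection`, the 0/1 label encoding (1 = anomalous) and the fixed seed are all hypothetical choices.

```python
import random

def split_for_novelty_detection(samples, labels, seed=0):
    """Return (train, val, test) lists of (sample, label) pairs.

    labels: 1 = anomalous, 0 = normal (an assumed encoding).
    """
    pairs = list(zip(samples, labels))
    random.Random(seed).shuffle(pairs)

    n_test = int(round(0.2 * len(pairs)))    # 20% held out as the test set
    test, rest = pairs[:n_test], pairs[n_test:]

    n_val = int(round(0.25 * len(rest)))     # 25% of the training part
    val, train = rest[:n_val], rest[n_val:]

    # Discard anomalous samples from training and validation only,
    # so the model sees normal data alone (novelty detection).
    train = [p for p in train if p[1] == 0]
    val = [p for p in val if p[1] == 0]
    return train, val, test

# Synthetic data for illustration: 100 samples, every fifth one anomalous.
samples = list(range(100))
labels = [1 if i % 5 == 0 else 0 for i in samples]
train, val, test = split_for_novelty_detection(samples, labels)
```

The test set keeps its anomalies, since they are needed to measure detection quality.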

Evaluation metrics:

Precision, Recall, F1 score
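As a rough sketch of how these metrics can be computed from anomaly scores, the snippet below flags the highest-scoring fraction of test samples as anomalous and compares them with the ground truth. Flagging a fixed top fraction (for example 0.20 or 0.15, matching the anomaly ratios above) is an assumption made here for illustration, and `precision_recall_f1` and `anomaly_fraction` are hypothetical names.

```python
def precision_recall_f1(scores, labels, anomaly_fraction):
    """Flag the top `anomaly_fraction` of scores as anomalies and score them.

    labels: 1 = anomalous, 0 = normal (an assumed encoding).
    """
    n_flagged = int(round(anomaly_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_flagged])

    tp = sum(1 for i in flagged if labels[i] == 1)   # true positives
    fp = len(flagged) - tp                           # false positives
    fn = sum(labels) - tp                            # missed anomalies

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy scores and labels, for illustration only.
scores = [0.1, 0.9, 0.2, 0.8, 0.3]
labels = [0, 1, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(scores, labels, anomaly_fraction=0.4)
```

In the toy example two samples are flagged, one of them correctly, and one true anomaly is missed, so precision, recall and F1 all equal 0.5.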

Baselines:

  1. One Class Support Vector Machines (OC-SVM)

    Support vector method for novelty detection 1999

  2. Isolation Forests (IF)

    Isolation forest 2008

  3. Deep Structured Energy Based Models (DSEBM)

    Deep structured energy based models for anomaly detection 2016

  4. Deep Autoencoding Gaussian Mixture Model (DAGMM)

    Deep autoencoding gaussian mixture model for unsupervised anomaly detection 2018

  5. AnoGAN

    Unsupervised anomaly detection with generative adversarial networks to guide marker discovery 2017

Results
[Figure 2: experimental results]

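Conclusion

We propose ALAD, a GAN-based anomaly detection method that learns an encoder from the data space to the latent space during training, which makes it more efficient at test time than previously published GAN methods. In addition, we use extra discriminators to improve the encoder, together with spectral normalization, which has been found to stabilize GAN training.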
