Deep Learning for Anomaly Detection: A Comprehensive Survey

This post summarizes a comprehensive survey paper on deep learning for anomaly detection, “Deep Learning for Anomaly Detection: A Review” [1], discussing the challenges, methods, and opportunities in this direction.

Anomaly detection, a.k.a. outlier detection, has been an active research area for several decades, due to its broad applications in a large number of key domains such as risk management, compliance, security, financial surveillance, health and medical risk, and AI safety. Although it is a problem widely studied in various communities including data mining, machine learning, computer vision and statistics, there are still some unique problem complexities and challenges that require advanced approaches. In recent years, deep learning-enabled anomaly detection has emerged as a critical direction for addressing these challenges. However, there has been a lack of systematic review and discussion of the research progress in this direction. We aim to present a comprehensive review of this direction, discussing the main challenges, a large number of state-of-the-art methods, how they address the challenges, and future opportunities.

Largely Unsolved Challenges in Anomaly Detection

Although anomaly detection has been an active research area for many years, a number of challenges remain largely unsolved due to the unique and complex nature of anomalies, e.g., unknownness (anomalies remain unknown until they actually occur), heterogeneity (different anomalies demonstrate completely different abnormal characteristics), rareness (anomalies are rarely occurring data instances), and the diverse forms of anomalies (point anomalies, contextual anomalies, and group anomalies).

One of the most challenging issues is the difficulty of achieving a high anomaly detection recall rate (Challenge #1). Since anomalies are highly rare and heterogeneous, it is difficult to identify all of them: many normal instances are wrongly reported as anomalies while true yet sophisticated anomalies are missed.

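To make the recall notion concrete, here is a minimal sketch (with synthetic scores and an assumed top-1% alert budget, both placeholders) of how recall and precision on the anomaly class are computed once a scoring model and a threshold are in place:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
y_true = np.r_[np.zeros(990, dtype=int), np.ones(10, dtype=int)]      # 1% of instances are anomalies
scores = np.r_[rng.normal(0.0, 1.0, 990), rng.normal(2.5, 1.0, 10)]   # higher score = more anomalous

# Flag the top 1% highest-scored instances as anomalies (an assumed operating point).
threshold = np.quantile(scores, 0.99)
y_pred = (scores >= threshold).astype(int)

print("recall   :", recall_score(y_true, y_pred))     # fraction of true anomalies that are caught
print("precision:", precision_score(y_true, y_pred))  # fraction of flagged instances that are truly anomalous
```

Raising the threshold typically trades recall for precision, which is exactly the tension described above: with so few true anomalies, even a handful of missed instances drags recall down sharply.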

Anomaly detection in high-dimensional and/or not-independent data (Challenge #2) is also a significant challenge. Anomalies often exhibit evident abnormal characteristics in a low-dimensional space yet become hidden and unnoticeable in a high-dimensional space. High-dimensional anomaly detection has been a long-standing problem. Subspace/feature selection-based methods may be a straightforward solution. However, identifying intricate (e.g., high-order, nonlinear and heterogeneous) feature interactions and couplings may be essential in high-dimensional data yet remains a major challenge for anomaly detection.

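To make the masking effect concrete, the following sketch (not from the survey; the data sizes, the number of noise dimensions, and the average k-nearest-neighbour distance score are all assumptions chosen for the demonstration) plants an instance that is clearly anomalous in a 2-D relevant subspace and shows how its score contrast collapses once 200 irrelevant dimensions are appended:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n, noise_dims = 500, 200
relevant = rng.normal(size=(n, 2))
relevant[0] = [6.0, 6.0]                      # one clear outlier in the 2-D relevant subspace
noise = rng.normal(size=(n, noise_dims))      # irrelevant features shared by all instances
full = np.hstack([relevant, noise])           # 202-D data hiding the same outlier

def knn_score(X, k=10):
    # Average distance to the k nearest neighbours as a simple outlier score.
    dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    return dist[:, 1:].mean(axis=1)           # drop the zero self-distance

for name, X in [("2-D relevant subspace", relevant), ("full 202-D space", full)]:
    s = knn_score(X)
    print(f"{name}: outlier score / median score = {s[0] / np.median(s):.2f}")
```

In the 2-D subspace the planted instance scores far above the median, whereas in the full space pairwise distances concentrate and the contrast largely disappears.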

Due to the difficulty and cost of collecting large-scale labeled anomaly data, it is important to have data-efficient learning of normality/abnormality (Challenge #3). Two major challenges are how to learn expressive normality/abnormality representations with a small amount of labeled anomaly data, and how to learn detection models that generalize to novel anomalies uncovered by the given labeled anomaly data.

Many weakly/semi-supervised anomaly detection methods assume the given labeled training data is clean, which makes them highly vulnerable to noisy instances that are mistakenly labeled with the opposite class label. One main challenge here is how to develop noise-resilient anomaly detection (Challenge #4).

Most existing methods are designed for point anomalies and cannot be used for conditional anomalies or group anomalies, since these exhibit completely different behaviors from point anomalies. One main challenge here is to incorporate the concept of conditional/group anomalies into anomaly measures/models for the detection of those complex anomalies (Challenge #5).

In many critical domains there may be major risks if anomaly detection models are directly used as black-box models. For example, the rare data instances reported as anomalies may lead to algorithmic bias against the minority groups present in the data, such as under-represented groups in fraud detection and crime detection systems. An effective approach to mitigating this type of risk is to have anomaly explanation (Challenge #6) algorithms that provide straightforward clues about why a specific data instance is identified as an anomaly. Providing such explanations can be as important as detection accuracy in some applications. Deriving anomaly explanations from specific detection methods is still a largely unsolved problem, especially for complex models. Developing inherently interpretable anomaly detection models is also crucial, but balancing the model’s interpretability and effectiveness remains a main challenge.

Addressing the Challenges with Deep Anomaly Detection

In a nutshell, deep anomaly detection aims to learn feature representations or anomaly scores via neural networks for the sake of anomaly detection. In recent years, a large number of deep anomaly detection methods have been introduced, demonstrating significantly better performance than conventional anomaly detection methods in addressing challenging detection problems in a variety of real-world applications. We systematically review the current deep anomaly detection methods and their capabilities in addressing the aforementioned challenges.

To have a thorough understanding of the area, we introduce a hierarchical taxonomy to classify existing deep anomaly detection methods into three main categories and 11 fine-grained categories from the modeling perspective. An overview of the taxonomy of the methods, together with the challenges they address, is shown in Figure 1. Specifically, deep anomaly detection consists of three conceptual paradigms — Deep Learning for Feature Extraction, Learning Feature Representations of Normality, and End-to-end Anomaly Score Learning.


Figure 1. A Hierarchical Taxonomy of Current Deep Anomaly Detection Techniques. The detection challenges that each category of methods can address are also presented.

In the Deep Learning for Feature Extraction framework, deep learning and anomaly detection are fully separated in the first main category, so deep learning techniques are used only as independent feature extractors. The two modules are dependent on each other in some form in the second main category, Learning Feature Representations of Normality, with the objective of learning expressive representations of normality. This category of methods can be further divided into two subcategories based on whether traditional anomaly measures are incorporated into their objective functions. These two subcategories encompass seven fine-grained categories of methods, each taking a different approach to formulating its objective function. The two modules are fully unified in the third main category, End-to-end Anomaly Score Learning, in which the methods are dedicated to learning anomaly scores via neural networks in an end-to-end fashion. These methods are further grouped into four categories based on the formulation of neural network-enabled anomaly scoring.

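As a concrete illustration of the first paradigm, the sketch below uses a frozen pretrained network purely as a feature extractor and hands the resulting embeddings to a conventional detector. The particular choices here (ResNet-18 via torchvision, IsolationForest, and random placeholder tensors standing in for preprocessed images) are assumptions for the sketch, not components prescribed by the survey:

```python
import torch
from torchvision import models
from sklearn.ensemble import IsolationForest

# Frozen pretrained encoder used purely as a feature extractor.
encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()          # drop the classifier head, keep 512-D embeddings
encoder.eval()

@torch.no_grad()
def embed(images):                        # images: (N, 3, 224, 224) normalized tensors
    return encoder(images).numpy()

# Placeholder tensors standing in for preprocessed images (assumption for the sketch).
train_images = torch.randn(32, 3, 224, 224)
test_images = torch.randn(8, 3, 224, 224)

# A conventional detector is fitted on the deep features and scores new instances.
detector = IsolationForest(random_state=0).fit(embed(train_images))
anomaly_scores = -detector.score_samples(embed(test_images))   # higher = more anomalous
print(anomaly_scores)
```

Because the two modules are fully separated, any off-the-shelf detector can be plugged in, but the extracted features are not optimized for anomaly detection, which is what motivates the other two paradigms.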

For each category of methods, we review the detailed methodologies and algorithms, covering their key intuitions, objective functions, underlying assumptions, advantages and disadvantages, and discuss how they address the aforementioned challenges. The full details are difficult to present here; see the full paper below for details.

Future Opportunities

Through such a review, we identify some exciting opportunities. Some of them are described as follows.


Exploring Anomaly-supervisory Signals

Informative supervisory signals are the key for deep anomaly detection to learn expressive representations of normality/abnormality or anomaly scores and to reduce false positives. While a wide range of unsupervised or self-supervised supervisory signals have been explored to learn such representations, a key issue for these formulations is that their objective functions are generic and not optimized specifically for anomaly detection. Current anomaly measure-dependent feature learning approaches help address this issue by imposing constraints derived from traditional anomaly measures. However, these constraints can have some inherent limitations, e.g., implicit assumptions in the anomaly measures. It is critical to explore new sources of anomaly-supervisory signals that lie beyond the widely used formulations such as data reconstruction and GANs, and that have weak assumptions on the anomaly distribution. Another possibility is to develop domain-driven anomaly detection by leveraging domain knowledge, such as application-specific knowledge of anomaly and/or expert rules, as the supervision source.

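As an example of one such widely used, generic supervisory signal, the sketch below trains a small autoencoder and uses its per-instance reconstruction error as the anomaly score; the architecture, data, and hyperparameters are placeholders for illustration. Note that the objective optimizes reconstruction rather than anomaly detection itself, which is precisely the issue raised above:

```python
import torch
import torch.nn as nn

d = 20                                            # feature dimension (placeholder)
autoencoder = nn.Sequential(
    nn.Linear(d, 8), nn.ReLU(),                   # encoder
    nn.Linear(8, d),                              # decoder
)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

X_train = torch.randn(1024, d)                    # assumed to be predominantly normal data
for _ in range(200):
    recon = autoencoder(X_train)
    loss = ((recon - X_train) ** 2).mean()        # reconstruction objective = the supervisory signal
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

def anomaly_score(x):
    # Per-instance reconstruction error; larger error = more anomalous.
    with torch.no_grad():
        return ((autoencoder(x) - x) ** 2).mean(dim=1)

print(anomaly_score(torch.randn(5, d)))
```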

Deep Weakly-supervised Anomaly Detection

Deep weakly-supervised anomaly detection aims at leveraging deep neural networks to learn anomaly-informed detection models with some weakly-supervised anomaly signals, e.g., partially/inexactly/inaccurately labeled anomaly data. Such labeled data provides important knowledge of anomaly and can be a major driving force to lift detection recall rates. One exciting opportunity is to utilize a small number of accurately labeled anomaly examples to enhance detection models, as such examples are often available in real-world applications, e.g., some intrusions/frauds reported by deployed detection systems or end-users and verified by human experts. However, since anomalies can be highly heterogeneous, there can be unknown/novel anomalies that lie beyond the span set of the given anomaly examples. Thus, one important direction here is unknown anomaly detection, in which we aim to build detection models that generalize from the limited labeled anomalies to unknown anomalies.

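Below is a minimal sketch of how a handful of labeled anomalies might be exploited: a scalar scoring network is trained so that the few verified anomalies receive markedly higher scores than the unlabeled (assumed mostly normal) data. The hinge-style margin loss, the margin value, the architecture, and the synthetic data are illustrative assumptions rather than a specific method from the survey:

```python
import torch
import torch.nn as nn

d, margin = 20, 5.0
scorer = nn.Sequential(nn.Linear(d, 16), nn.ReLU(), nn.Linear(16, 1))   # scalar anomaly scorer
optimizer = torch.optim.Adam(scorer.parameters(), lr=1e-3)

X_unlabeled = torch.randn(1024, d)                # unlabeled data, assumed mostly normal
X_anomaly = torch.randn(10, d) + 4.0              # a handful of verified anomaly examples

for _ in range(300):
    s_norm = scorer(X_unlabeled).squeeze(1)
    s_anom = scorer(X_anomaly).squeeze(1)
    # Keep scores of (assumed-normal) unlabeled data near zero; push the few
    # labeled anomalies above the margin.
    loss = (s_norm ** 2).mean() + torch.relu(margin - s_anom).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("mean unlabeled score:", scorer(X_unlabeled).mean().item())
print("mean anomaly score  :", scorer(X_anomaly).mean().item())
```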

Detecting anomalies that belong to the same classes as the given anomaly examples can be as important as detecting novel/unknown anomalies. Thus, another important direction is to develop data-efficient anomaly detection or few-shot anomaly detection, in which we aim at learning highly expressive representations of the known anomaly classes given only limited anomaly examples. It should be noted that the limited anomaly examples may come from different anomaly classes and thus exhibit completely different manifold/class features. This scenario is fundamentally different from general few-shot learning, in which the limited examples are class-specific and assumed to share the same manifold/class structure.

Large-scale Normality Learning

Large-scale unsupervised/self-supervised representation learning has achieved tremendous success in enabling downstream learning tasks. This is particularly important for learning tasks in which it is difficult to obtain sufficient labeled data, such as anomaly detection. The goal is to learn transferable pre-trained representation models from large-scale unlabeled data in an unsupervised/self-supervised mode and to fine-tune detection models in a semi-supervised mode. Self-supervised classification-based anomaly detection methods may provide some initial sources of supervision for the normality learning. However, precautions must be taken to ensure that (i) the unlabeled data is free of anomaly contamination and/or (ii) the representation learning methods are robust w.r.t. possible anomaly contamination, because most methods implicitly assume that the training data is clean and does not contain any noise/anomaly instances. This robustness is important in both the pre-training and fine-tuning stages. Additionally, anomalies and datasets in different domains vary significantly, so large-scale normality learning may need to be domain/application-specific.

If you find this summary of the survey paper interesting and helpful, you can read the full paper for details.

[1] Guansong Pang, Chunhua Shen, Longbing Cao, Anton van den Hengel. “Deep Learning for Anomaly Detection: A Review”. 2020. arXiv preprint: 2007.02500.


Translated from: https://towardsdatascience.com/a-comprehensive-survey-on-deep-learning-for-anomaly-detection-b1989b09ae38
