
White Paper

  • "Enterprise AIOps Implementation Recommendations" white paper

Course and Slides

  • Tsinghua-Peidan - AIOps course at Tsinghua University.
  • Intelligent Operations Based on Machine Learning

Industry Practice

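  • AI Practice in Tencent Operations
  • Tencent's Intelligent Monitoring Practice for Massive-Scale Services in the AI Era
  • A Comprehensive Analysis of Time Series Anomaly Detection in Zhiyun (织云) Metis
  • Open-Source Code of Tencent Zhiyun (织云) Metis, an AIOps Learnware Platform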

  • Alibaba's Full-Link Monitoring Solution

  • Intelligent Traffic Monitoring in Practice at Baidu
  • Anomaly Detection: How Baidu Does It
  • Next Generation of DevOps AIOps in Practice @Baidu [video]

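  • Building a Large-Scale, High-Performance Time Series Big Data Platform
  • Yahoo's Large-Scale Time Series Anomaly Detection Technology and Its High-Performance, Scalable Architecture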
  • Netflix: Robust PCA
  • LinkedIn: exponential smoothing
  • Uber: multivariate non-linear model

Articles

  • Intelligent Operations | Who Are the Big Four of AIOps?
  • A Comparison of Mapping Approaches for Distributed Cloud Applications
  • Exploring AIOps: A VAE-Based Anomaly Detection Method for Periodic KPIs

Tools and Algorithms

  • Tools to Monitor and Visualize Microservices Architecture
  • python-fp-growth, for mining frequent itemsets
  • Anomaly Detection with Twitter in R
  • Curve: Baidu's open-source time series labeling tool
  • TagAnomaly: Microsoft's open-source time series labeling tool
  • Anomaly Detection Examples
  • facebook/prophet, Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
  • google/CausalImpact, An R package for causal inference in time series
  • Time Series Analysis: ARIMA
  • tsfresh, a time series feature extraction library
  • Awesome Time Series Analysis and Data Mining

Papers

  • Survey on Models and Techniques for Root-Cause Analysis
  • Intelligent Operations Based on Machine Learning
  • HotSpot: Anomaly Localization for Additive KPIs With Multi-Dimensional Attributes
    • Chinese write-up: New AIOps Work from Tsinghua: Localizing Anomalies in Multi-Dimensional Metrics with Monte Carlo Tree Search
  • Opprentice: Towards Practical and Automatic Anomaly Detection Through Machine Learning
  • Robust and Rapid Clustering of KPIs for Large-Scale Anomaly Detection

Datasets

  • Alibaba/clusterdata
  • Azure/AzurePublicDataset
  • Google/cluster-data
  • The Numenta Anomaly Benchmark(NAB)
  • Yahoo: A Labeled Anomaly Detection Dataset
  • CUHK loghub dataset

Useful WeChat Official Accounts

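  • 腾讯织云 (Tencent Zhiyun; run by Tencent)
  • 智能运维前沿 (AIOps Frontier; run by Pei Dan's team at Tsinghua)
  • AIOps智能运维 (AIOps Intelligent Operations; run by Baidu)
  • 华为产品可服务能力 (Huawei Product Serviceability; run by Huawei)
  • Zhihu column: Intelligent Operations (AIOps)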

