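Integrating Prometheus with Python
Prometheus is a powerful monitoring system. Combined with a visualization tool such as Grafana, it can produce very polished monitoring dashboards. Prometheus is split into a client side and a server side. This article only covers a few basic client-side concepts and how to integrate the client into Python code; for a detailed introduction, please consult the official documentation. The most important parts of the official documentation are the client metric types, and the operators and functions of PromQL (Prometheus Query Language). See references 1, 2, and 3 at the end of this article, respectively.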
METRIC TYPES
Metric types are a distinction made only on the client side; the server does not keep these types. The client offers four metric types in total.
Counter
A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.
A counter is a metric that only goes up and never goes down. Typical use: counting the number of requests.
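As a minimal sketch of the counter behaviour described above (the metric name `demo_requests` and the private registry are assumptions made for illustration, not part of the original code), the following increments a counter and reads the accumulated value back:

```python
from prometheus_client import CollectorRegistry, Counter

# A private registry keeps this sketch from colliding with metrics
# registered elsewhere in the same process.
registry = CollectorRegistry()

# "demo_requests" is a hypothetical name; the client exposes it as demo_requests_total.
requests_counter = Counter("demo_requests", "Number of handled requests", registry=registry)

requests_counter.inc()    # add 1
requests_counter.inc(2)   # add any non-negative amount

# A counter refuses to go down: a negative increment raises ValueError.
rejected = False
try:
    requests_counter.inc(-1)
except ValueError:
    rejected = True

total = registry.get_sample_value("demo_requests_total")
```

Here `total` holds the sum of the two accepted increments, and the rejected negative increment leaves it unchanged.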
Gauge
A gauge is a metric that represents a single numerical value that can arbitrarily go up and down.
A gauge can go both up and down. Typical use: memory usage.
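A gauge can be sketched in the same way. The metric name `demo_memory_bytes` and the numbers are made up for illustration:

```python
from prometheus_client import CollectorRegistry, Gauge

# Private registry so the sketch does not touch the process-wide default one.
registry = CollectorRegistry()

# "demo_memory_bytes" is a hypothetical metric name.
memory_gauge = Gauge("demo_memory_bytes", "Memory currently in use", registry=registry)

memory_gauge.set(1024)   # set an absolute value
memory_gauge.inc(512)    # move up
memory_gauge.dec(256)    # move down

current = registry.get_sample_value("demo_memory_bytes")
```

Unlike a counter, the exposed name has no `_total` suffix, and `current` simply reflects the last state after all three calls.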
Histogram
A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.
A histogram sorts each observed value into a set of predefined buckets. The default buckets of the Python client are:
DEFAULT_BUCKETS = (.005, .01, .025, .05, .075, .1, .25, .5, .75, 1.0, 2.5, 5.0, 7.5, 10.0, INF)
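To make the bucket behaviour concrete, here is a small sketch that relies on the default buckets listed above. The metric name `demo_latency_seconds` and the observed values are assumptions chosen for illustration. Note that the buckets are cumulative: each `le` bucket counts every observation less than or equal to its upper bound.

```python
from prometheus_client import CollectorRegistry, Histogram

registry = CollectorRegistry()

# Hypothetical metric; no buckets argument, so DEFAULT_BUCKETS applies.
latency = Histogram("demo_latency_seconds", "Request latency", registry=registry)

for seconds in (0.03, 0.2, 3.0):
    latency.observe(seconds)

def bucket(le):
    # Read the cumulative count of the bucket whose upper bound is `le`.
    return registry.get_sample_value("demo_latency_seconds_bucket", {"le": le})

le_005 = bucket("0.05")   # only 0.03 fits
le_025 = bucket("0.25")   # 0.03 and 0.2 fit
le_inf = bucket("+Inf")   # every observation fits
count = registry.get_sample_value("demo_latency_seconds_count")
total = registry.get_sample_value("demo_latency_seconds_sum")
```

The `+Inf` bucket always equals `demo_latency_seconds_count`, which is why quantile estimates on the server can be derived from the bucket series alone.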
Summary
Similar to a histogram, a summary samples observations (usually things like request durations and response sizes). While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
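Besides the sum of the observed values, a summary also records the number of observations.
For all of the types above, the client keeps only the current accumulated value of each time series in memory: even though you report a metric many times, only the running total is stored. The server then pulls (scrapes) these statistics from the client periodically. With the sample configuration file shipped with Prometheus this happens every 15 seconds; when scrape_interval is not configured at all, the default is 1 minute.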
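A short sketch of a summary follows; the metric name `demo_task_seconds` and the values are hypothetical. To the best of my knowledge, the Python client exposes only the `_count` and `_sum` series for a summary and no quantiles, so the example sticks to those two:

```python
from prometheus_client import CollectorRegistry, Summary

registry = CollectorRegistry()

# Hypothetical metric name used only for this sketch.
task_seconds = Summary("demo_task_seconds", "Time spent per task", registry=registry)

task_seconds.observe(0.5)
task_seconds.observe(1.5)

observations = registry.get_sample_value("demo_task_seconds_count")
observed_sum = registry.get_sample_value("demo_task_seconds_sum")

# The two series together are enough to compute an average.
mean = observed_sum / observations
```

On the server side, the same average is usually computed over a time range by dividing the rate of `_sum` by the rate of `_count`.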
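The point about accumulated values can be checked by rendering what the server would receive on a scrape. In this sketch (the metric name `demo_jobs` is an assumption), the counter is incremented three times, yet the exposition text contains a single sample line with the running total:

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

registry = CollectorRegistry()
jobs = Counter("demo_jobs", "Jobs processed", registry=registry)  # hypothetical metric

# Report the metric three times.
for _ in range(3):
    jobs.inc()

# generate_latest renders the text a scrape of this registry would return.
exposition = generate_latest(registry).decode("utf-8")

# Keep only the sample lines for the counter itself.
total_lines = [line for line in exposition.splitlines()
               if line.startswith("demo_jobs_total")]
```

Individual increments are not stored anywhere; whatever happens between two scrapes is visible to the server only as the difference between the two totals.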
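Python integration code
from prometheus_client import start_http_server, Counter, Gauge, Histogram, Summary

# Cache of already registered metric objects, keyed by metric name.
metrics_map = {}

"""
name identifies a metric and must be unique.
labels are used for aggregation.
If a name is used with labels the first time, every later call must use the same set of labels.
"""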
def counter_reporter(name, *labels):
    """
    Can only increase. The exported metric is named {name}_total.
    :param name: metric name
    :param labels: metric tags, must be strings
    :return:
    """
    counter = metrics_map.get(name)
    if counter is None:
        # First use: register the metric. The label strings passed on this
        # call become its label names, so they must be valid label names.
        counter = Counter(name, f'{name} metrics', labelnames=labels)
        metrics_map[name] = counter
    if labels:
        counter.labels(*labels).inc()
    else:
        counter.inc()

def gauge_reporter(name, value, *labels):
    """
    Can go up or down. The exported metric is named {name}.
    :param name: metric name
    :param value: value to set the gauge to
    :param labels: metric tags
    :return:
    """
    gauge = metrics_map.get(name)
    if gauge is None:
        # First use: register the metric. The label strings passed on this
        # call become its label names, so they must be valid label names.
        gauge = Gauge(name, f'{name} metrics', labelnames=labels)
        metrics_map[name] = gauge
    if labels:
        gauge.labels(*labels).set(value)
    else:
        gauge.set(value)

def summary_reporter(name, value, *labels):
    """
    Exports two metrics, named {name}_sum and {name}_count.
    :param name: metric name
    :param value: observed value
    :param labels: metric tags
    :return:
    """
    summary = metrics_map.get(name)
    if summary is None:
        # First use: register the metric. The label strings passed on this
        # call become its label names, so they must be valid label names.
        summary = Summary(name, f'{name} metrics', labelnames=labels)
        metrics_map[name] = summary
    if labels:
        summary.labels(*labels).observe(value)
    else:
        summary.observe(value)

def histogram_reporter(name, value, *labels):
    """
    Creates many buckets, exported as {name}_bucket{le="xx"}.
    :param name: metric name
    :param value: observed value
    :param labels: metric tags
    :return:
    """
    histogram = metrics_map.get(name)
    if histogram is None:
        # First use: register the metric. The label strings passed on this
        # call become its label names, so they must be valid label names.
        histogram = Histogram(name, f'{name} metrics', labelnames=labels)
        metrics_map[name] = histogram
    if labels:
        histogram.labels(*labels).observe(value)
    else:
        histogram.observe(value)

You can use the code above as is. If you need to combine it with a web service such as Flask, see reference 4.
References
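The import list of the integration code includes start_http_server, which is what makes the metrics reachable for the server's scrapes. The following is only a sketch of how a process might expose its metrics: the metric name `demo_http_requests`, the label name `path`, the port 8000, and the helper names `handle` and `main` are all assumptions for illustration. It also declares the label name separately from the label values, which is the pattern the client library documents.

```python
import time

from prometheus_client import CollectorRegistry, Counter, start_http_server

registry = CollectorRegistry()

# The label *name* ("path") is declared once; label *values* are supplied per call.
http_requests = Counter("demo_http_requests", "Requests handled",
                        ["path"], registry=registry)

def handle(path):
    # Stand-in for real request handling: just count the request.
    http_requests.labels(path).inc()

def main(port=8000):
    # Serve the metrics of `registry` over HTTP on the given (assumed) port,
    # then keep the process alive while producing some sample traffic.
    start_http_server(port, registry=registry)
    while True:
        handle("/index")
        time.sleep(1)
```

Calling `main()` blocks forever, so it is left uncalled here. Once it is running, the Prometheus server can be pointed at the chosen port as a scrape target.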
- metric types
- operators
- functions
- Prometheus Python Client
- A detailed explanation of the step parameter in Prometheus range queries
- Understanding Prometheus Range Vectors
- rate()/increase() extrapolation considered harmful