How Prometheus handles metric timestamps

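Prometheus can take a sample's timestamp from two sources:

  • the moment Prometheus scrapes the target, i.e. the server-side time, time.Now();
  • the exporter's /metrics endpoint, which can return a timestamp in addition to the metric name and value, for example: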
# HELP container_cpu_user_seconds_total Cumulative user cpu time consumed in seconds.
# TYPE container_cpu_user_seconds_total counter
container_cpu_user_seconds_total{container="",id="/",image="",name="",namespace="",pod=""} 788250.59 1692241058502
container_cpu_user_seconds_total{container="",id="/kubepods.slice",image="",name="",namespace="",pod=""} 378238.54 1692241058529

I. Prometheus configuration

Prometheus decides which of these two timestamps to use based on the configuration below.

1. Configuration

# honor_timestamps controls whether Prometheus respects the timestamps present
# in scraped data.
#
# If honor_timestamps is set to "true", the timestamps of the metrics exposed
# by the target will be used.
#
# If honor_timestamps is set to "false", the timestamps of the metrics exposed
# by the target will be ignored.
[ honor_timestamps: <boolean> | default = true ]

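honor_timestamps controls which timestamp is stored for scraped samples:

  • honor_timestamps defaults to true;
  • when honor_timestamps=true:

    • the timestamp returned by the exporter in /metrics is used;
    • if the exporter returns no timestamp, the scrape time (the Prometheus server's time) is used;
  • when honor_timestamps=false:

    • the scrape time (the Prometheus server's time) is always used;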

2. Source code

// scrape/scrape.go
func (sl *scrapeLoop) append(app storage.Appender, b []byte, contentType string, ts time.Time) (total, added, seriesAdded int, err error) {
    var (
        p              = textparse.New(b, contentType)
        defTime        = timestamp.FromTime(ts)     // ts is time.Now(), i.e. the scrape time
    )
    ...
    for {
        var (
            et          textparse.Entry
            sampleAdded bool
        )
        if et, err = p.Next(); err != nil {
            if err == io.EOF {
                err = nil
            }
            break
        }  
        ...
        t := defTime                    // default to the scrape time
        met, tp, v := p.Series()        // parse the scraped sample: met=metric, tp=timestamp, v=value
        if !sl.honorTimestamps {        // honor_timestamps=false: drop the exposed timestamp and use the scrape time
            tp = nil
        }
        if tp != nil {                  // a timestamp was parsed: use it; otherwise keep the scrape time
            t = *tp
        }
        ...
        err = app.AddFast(ce.ref, t, v)     // write the timestamp and value for this series to the TSDB
        ...
    }  
    ...
}

II. Timestamps in exporters

1. Without timestamps

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.000112409
go_gc_duration_seconds{quantile="0.25"} 0.000435099
go_gc_duration_seconds{quantile="0.5"} 0.000530901
go_gc_duration_seconds{quantile="0.75"} 0.000681327
go_gc_duration_seconds{quantile="1"} 0.00163155
go_gc_duration_seconds_sum 11.546457813
go_gc_duration_seconds_count 2331

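When developing an exporter, note that client_golang's SetToCurrentTime() only sets a gauge's value to the current Unix time; it does not make /metrics return a timestamp. To attach a timestamp to a sample, wrap the metric with prometheus.NewMetricWithTimestamp, as cAdvisor does in the next section.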

2. With timestamps (e.g. cAdvisor)

cAdvisor's /metrics output includes a timestamp, placed right after the value:

# HELP container_cpu_user_seconds_total Cumulative user cpu time consumed in seconds.
# TYPE container_cpu_user_seconds_total counter
container_cpu_user_seconds_total{container="",id="/",image="",name="",namespace="",pod=""} 788250.59 1692241058502
container_cpu_user_seconds_total{container="",id="/kubepods.slice",image="",name="",namespace="",pod=""} 378238.54 1692241058529
container_cpu_user_seconds_total{container="",id="/kubepods.slice/kubepods-besteffort.slice",image="",name="",namespace="",pod=""} 17053.5 1692241041865
container_cpu_user_seconds_total{container="",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod6cc5ddd5_aa45_4889_9106_20fcae0951e8.slice",image="",name="",namespace="kube-system",pod="kube-proxy-f6pjd"} 9980.23 1692241057892
container_cpu_user_seconds_total{container="",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-poddabdd3fc_2893_4a90_99b8_464462a5ab6a.slice",image="",name="",namespace="kube-system",pod="calico-kube-controllers-75ddb95444-gv7j7"} 7073.3 1692241058973

How cAdvisor sets the timestamp when building its Prometheus metrics:

// cadvisor/metrics/prometheus.go

{
   name:      "container_cpu_user_seconds_total",
   help:      "Cumulative user cpu time consumed in seconds.",
   valueType: prometheus.CounterValue,
   getValues: func(s *info.ContainerStats) metricValues {
      return metricValues{
         {
            value:     float64(s.Cpu.Usage.User) / float64(time.Second),
            timestamp: s.Timestamp,
         },
      }
   },
}

As shown, cAdvisor records the timestamp at collection time and then passes that same timestamp on to Prometheus:

for _, metricValue := range cm.getValues(stats) {
   ch <- prometheus.NewMetricWithTimestamp(
      metricValue.timestamp,
      prometheus.MustNewConstMetric(desc, cm.valueType, float64(metricValue.value), append(values, metricValue.labels...)...),
   )
}

References:

1. https://prometheus.io/docs/prometheus/latest/configuration/co...
2. https://mp.weixin.qq.com/s/kxHgNN_d83nT2LTNQyj6tg
