Prometheus Configuration Explained

Note: this article is largely a translation of the relevant official documentation; for full details see the official configuration reference.
Overview
Prometheus is configured via command-line flags and a configuration file. The command-line flags mainly control system-level parameters, such as the storage path or the disk to mount.
Prometheus can reload its configuration at runtime. If the new configuration is not well-formed, the changes are not applied. A configuration reload is triggered by sending a SIGHUP to the Prometheus process or by sending an HTTP POST request to the /-/reload endpoint (when the --web.enable-lifecycle flag is enabled). This also reloads any configured rule files.

The configuration file to load is specified with the --config.file command-line flag.

1. Parameter types (the generic placeholders are defined as follows)

1. <boolean>: a boolean that can take the values true or false
2. <duration>: a duration matching the regular expression [0-9]+(ms|[smhdwy])
3. <labelname>: a string matching the regular expression [a-zA-Z_][a-zA-Z0-9_]*
4. <labelvalue>: a string of unicode characters
5. <filename>: a valid path in the current working directory
6. <host>: a valid string consisting of a hostname or IP followed by an optional port number
7. <path>: a valid URL path
8. <scheme>: a string that can take the values http or https
9. <string>: a regular string
10. <secret>: a regular string that is a secret, such as a password
11. <tmpl_string>: a string which is template-expanded before usage
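As an illustration, here is how some of these placeholders map to concrete values in a configuration file (the values themselves are arbitrary examples, not defaults):

```yaml
scrape_interval: 30s      # a <duration>
scheme: https             # a <scheme>
metrics_path: /metrics    # a <path>
external_labels:
  region: eu-west-1       # a <labelname>: <labelvalue> pair
```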

The global configuration specifies parameters that are valid in all other configuration contexts. They also serve as defaults for the other configuration sections.

global:
  # How frequently to scrape targets by default.
  [ scrape_interval: <duration> | default = 1m ]

  # How long until a scrape request times out.
  [ scrape_timeout: <duration> | default = 10s ]

  # How frequently to evaluate rules.
  [ evaluation_interval: <duration> | default = 1m ]

  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    [ <labelname>: <labelvalue> ... ]

# Rule files specifies a list of globs. Rules and alerts are read from
# all matching files.
rule_files:
  [ - <filepath_glob> ... ]
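For reference, a rule file matched by one of these globs might look like the following in the Prometheus 2.x YAML rules format (a minimal sketch; the group name, metric names, and threshold are made up for illustration):

```yaml
groups:
  - name: example
    rules:
      # A recording rule precomputing a per-job request rate.
      - record: job:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job)
      # An alerting rule firing when a target has been down for 5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
```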

# A list of scrape configurations.
scrape_configs:
  [ - <scrape_config> ... ]

# Alerting specifies settings related to the Alertmanager.
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]

# Settings related to the remote write feature.
remote_write:
  [ - <remote_write> ... ]

# Settings related to the remote read feature.
remote_read:
  [ - <remote_read> ... ]

A <scrape_config> section defines a set of targets and the collection rules describing how to scrape them. In the general case, one scrape configuration specifies a single job. In advanced configurations, this may change.

relabel_configs allow modifying any target and its labels before scraping.
Targets may be statically configured via the static_configs parameter or dynamically discovered using one of the supported service-discovery mechanisms.
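Put together, a minimal scrape config with statically configured targets might look like this (job name, target addresses, and the env label are illustrative placeholders):

```yaml
scrape_configs:
  - job_name: 'node'
    scrape_interval: 30s
    static_configs:
      # Two hypothetical node_exporter instances.
      - targets: ['node1:9100', 'node2:9100']
        labels:
          env: 'production'
```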

static_configs

# The targets specified by the static config.
targets:
  [ - '<host>' ]

# Labels assigned to all metrics scraped from the targets.
labels:
  [ <labelname>: <labelvalue> ... ]
# The job name assigned to scraped metrics by default.
job_name: <job_name>

# How frequently to scrape targets from this job.
[ scrape_interval: <duration> | default = <global_config.scrape_interval> ]

# Per-scrape timeout when scraping this job.
[ scrape_timeout: <duration> | default = <global_config.scrape_timeout> ]

# The HTTP resource path on which to fetch metrics from targets.
[ metrics_path: <path> | default = /metrics ]

# honor_labels controls how Prometheus handles conflicts between labels that are
# already present in scraped data and labels that Prometheus would attach
# server-side ("job" and "instance" labels, manually configured target
# labels, and labels generated by service discovery implementations).
#
# If honor_labels is set to "true", label conflicts are resolved by keeping label
# values from the scraped data and ignoring the conflicting server-side labels.
#
# If honor_labels is set to "false", label conflicts are resolved by renaming
# conflicting labels in the scraped data to "exported_<original-label>" (for
# example "exported_instance", "exported_job") and then attaching server-side
# labels. This is useful for use cases such as federation, where all labels
# specified in the target should be preserved.
#
# Note that any globally configured "external_labels" are unaffected by this
# setting. In communication with external systems, they are always applied only
# when a time series does not have a given label yet and are ignored otherwise.
[ honor_labels: <boolean> | default = false ]
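A typical use of honor_labels is federation, where one Prometheus scrapes another's /federate endpoint and must preserve the original job and instance labels of the source. A sketch (the source hostname and match[] selector are assumptions):

```yaml
- job_name: 'federate'
  honor_labels: true
  metrics_path: '/federate'
  params:
    # Only pull series belonging to the source's own "prometheus" job.
    'match[]':
      - '{job="prometheus"}'
  static_configs:
    - targets: ['source-prometheus:9090']  # hypothetical source server
```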

# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]

# Optional HTTP URL parameters.
params:
  [ <string>: [<string>, ...] ]

# Sets the `Authorization` header on every scrape request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <secret> ]

# Sets the `Authorization` header on every scrape request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <secret> ]

# Sets the `Authorization` header on every scrape request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# List of Azure service discovery configurations.
azure_sd_configs:
  [ - <azure_sd_config> ... ]

# List of Consul service discovery configurations.
consul_sd_configs:
  [ - <consul_sd_config> ... ]

# List of DNS service discovery configurations.
dns_sd_configs:
  [ - <dns_sd_config> ... ]

# List of EC2 service discovery configurations.
ec2_sd_configs:
  [ - <ec2_sd_config> ... ]

# List of OpenStack service discovery configurations.
openstack_sd_configs:
  [ - <openstack_sd_config> ... ]

# List of file service discovery configurations.
file_sd_configs:
  [ - <file_sd_config> ... ]

# List of GCE service discovery configurations.
gce_sd_configs:
  [ - <gce_sd_config> ... ]

# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
  [ - <kubernetes_sd_config> ... ]

# List of Marathon service discovery configurations.
marathon_sd_configs:
  [ - <marathon_sd_config> ... ]

# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
  [ - <nerve_sd_config> ... ]

# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
  [ - <serverset_sd_config> ... ]

# List of Triton service discovery configurations.
triton_sd_configs:
  [ - <triton_sd_config> ... ]

# List of labeled statically configured targets for this job.
static_configs:
  [ - <static_config> ... ]

# List of target relabel configurations.
relabel_configs:
  [ - <relabel_config> ... ]

# List of metric relabel configurations.
metric_relabel_configs:
  [ - <relabel_config> ... ]

# Per-scrape limit on number of scraped samples that will be accepted.
# If more than this number of samples are present after metric relabelling
# the entire scrape will be treated as failed. 0 means no limit.
[ sample_limit: <int> | default = 0 ]

tls_config details (a tls_config configures TLS connections)

# CA certificate to validate API server certificate with.
[ ca_file: <filename> ]

# Certificate and key files for client cert authentication to the server.
[ cert_file: <filename> ]
[ key_file: <filename> ]

# ServerName extension to indicate the name of the server.
# http://tools.ietf.org/html/rfc4366#section-3.1
[ server_name: <string> ]

# Disable validation of the server certificate.
[ insecure_skip_verify: <boolean> ]
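Filled in, a tls_config block might look like this (the file paths and server name are illustrative assumptions, not defaults):

```yaml
tls_config:
  ca_file: /etc/prometheus/ca.crt        # CA used to verify the scraped server
  cert_file: /etc/prometheus/client.crt  # client certificate for mutual TLS
  key_file: /etc/prometheus/client.key   # client private key
  server_name: metrics.example.com       # expected server name (SNI)
  insecure_skip_verify: false
```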

Prometheus supports many service-discovery mechanisms, as the configuration options above show: kubernetes, openstack, ec2, dns, and so on. The kubernetes mechanism is described in detail below.


Kubernetes SD discovers scrape targets by role: node, service, pod, endpoints, or ingress.
node
The node role discovers one target per cluster node, with the address defaulting to the Kubelet's HTTP port. The address is taken from the first existing address type in the order NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, NodeHostName, forming an address:port target.

labels

__meta_kubernetes_node_name: the name of the node
__meta_kubernetes_node_label_<labelname>: each label of the node
__meta_kubernetes_node_annotation_<annotationname>: each annotation of the node
__meta_kubernetes_node_address_<address_type>: the first address for each node address type, if it exists (one of NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, NodeHostName)
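A common node-role job copies all Kubernetes node labels onto the target via labelmap (a sketch assuming Prometheus runs in-cluster with sufficient RBAC permissions):

```yaml
- job_name: 'kubernetes-nodes'
  kubernetes_sd_configs:
    - role: node
  relabel_configs:
    # Turn every __meta_kubernetes_node_label_<name> into a plain <name> label.
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
```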

service
The service role discovers a target for each service port; the target address is the service's Kubernetes DNS name plus the port.
labels

__meta_kubernetes_namespace: the namespace of the service
__meta_kubernetes_service_name: the name of the service
__meta_kubernetes_service_label_<labelname>: each label of the service
__meta_kubernetes_service_annotation_<annotationname>: each annotation of the service
__meta_kubernetes_service_port_name: the name of the service port
__meta_kubernetes_service_port_number: the port number of the service
__meta_kubernetes_service_port_protocol: the protocol of the service port

pod
labels

__meta_kubernetes_namespace: the namespace of the pod
__meta_kubernetes_pod_name: the name of the pod
__meta_kubernetes_pod_ip: the pod IP
__meta_kubernetes_pod_label_<labelname>: each label of the pod
__meta_kubernetes_pod_annotation_<annotationname>: each annotation of the pod
__meta_kubernetes_pod_container_name: the name of the container
__meta_kubernetes_pod_container_port_name: the name of the container port
__meta_kubernetes_pod_container_port_number: the container port number
__meta_kubernetes_pod_container_port_protocol: the protocol of the container port
__meta_kubernetes_pod_ready: the pod's ready state, true or false
__meta_kubernetes_pod_node_name: the name of the node the pod is scheduled on
__meta_kubernetes_pod_host_ip: the IP of the pod's host node
__meta_kubernetes_pod_uid: the UID of the pod
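These pod meta labels are typically used to opt pods into scraping via annotations. A common pattern (the prometheus.io/* annotation names are a community convention, not built into Prometheus):

```yaml
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    # Only keep pods annotated with prometheus.io/scrape: "true".
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    # Allow pods to override the metrics path via an annotation.
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
```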

endpoints

__meta_kubernetes_namespace: the namespace of the endpoints object
__meta_kubernetes_endpoints_name: the name of the endpoints object
__meta_kubernetes_endpoint_ready: the endpoint's ready state, true or false
__meta_kubernetes_endpoint_port_name: the name of the endpoint port
__meta_kubernetes_endpoint_port_protocol: the protocol of the endpoint port
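With the endpoints role, targets also carry the meta labels of the owning service, so a similar annotation-based filter on the service works (again using the conventional prometheus.io annotations as an assumption):

```yaml
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    # Keep only endpoints whose service is annotated prometheus.io/scrape: "true".
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
      action: keep
      regex: true
```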

ingress

__meta_kubernetes_namespace: the namespace of the ingress
__meta_kubernetes_ingress_name: the name of the ingress
__meta_kubernetes_ingress_label_<labelname>: each label of the ingress
__meta_kubernetes_ingress_annotation_<annotationname>: each annotation of the ingress
__meta_kubernetes_ingress_scheme: https if TLS is configured, default http
__meta_kubernetes_ingress_path: path from the ingress spec, defaults to /

For more details, see the prometheus-kubernetes examples.

Example configuration file

# my global config
global:
  scrape_interval:     15s
  evaluation_interval: 30s
  # scrape_timeout is set to the global default (10s).

  external_labels:
    monitor: codelab
    foo:     bar

rule_files:
- "first.rules"
- "my/*.rules"

remote_write:
  - url: http://remote1/push
    write_relabel_configs:
    - source_labels: [__name__]
      regex:         expensive.*
      action:        drop
  - url: http://remote2/push

remote_read:
  - url: http://remote1/read
    read_recent: true
  - url: http://remote3/read
    read_recent: false
    required_matchers:
      job: special

scrape_configs:
- job_name: prometheus

  honor_labels: true
  # scrape_interval is defined by the configured global (15s).
  # scrape_timeout is defined by the global default (10s).

  # metrics_path defaults to '/metrics'
  # scheme defaults to 'http'.

  file_sd_configs:
    - files:
      - foo/*.slow.json
      - foo/*.slow.yml
      - single/file.yml
      refresh_interval: 10m
    - files:
      - bar/*.yaml

  static_configs:
  - targets: ['localhost:9090', 'localhost:9191']
    labels:
      my:   label
      your: label

  relabel_configs:
  - source_labels: [job, __meta_dns_name]
    regex:         (.*)some-[regex]
    target_label:  job
    replacement:   foo-${1}
    # action defaults to 'replace'
  - source_labels: [abc]
    target_label:  cde
  - replacement:   static
    target_label:  abc
  - regex:
    replacement:   static
    target_label:  abc

  bearer_token_file: valid_token_file


- job_name: service-x

  basic_auth:
    username: admin_name
    password: "multiline\nmysecret\ntest"

  scrape_interval: 50s
  scrape_timeout:  5s

  sample_limit: 1000

  metrics_path: /my_path
  scheme: https

  dns_sd_configs:
  - refresh_interval: 15s
    names:
    - first.dns.address.domain.com
    - second.dns.address.domain.com
  - names:
    - first.dns.address.domain.com
    # refresh_interval defaults to 30s.

  relabel_configs:
  - source_labels: [job]
    regex:         (.*)some-[regex]
    action:        drop
  - source_labels: [__address__]
    modulus:       8
    target_label:  __tmp_hash
    action:        hashmod
  - source_labels: [__tmp_hash]
    regex:         1
    action:        keep
  - action:        labelmap
    regex:         1
  - action:        labeldrop
    regex:         d
  - action:        labelkeep
    regex:         k

  metric_relabel_configs:
  - source_labels: [__name__]
    regex:         expensive_metric.*
    action:        drop

- job_name: service-y

  consul_sd_configs:
  - server: 'localhost:1234'
    token: mysecret
    services: ['nginx', 'cache', 'mysql']
    scheme: https
    tls_config:
      ca_file: valid_ca_file
      cert_file: valid_cert_file
      key_file:  valid_key_file
      insecure_skip_verify: false

  relabel_configs:
  - source_labels: [__meta_sd_consul_tags]
    separator:     ','
    regex:         label:([^=]+)=([^,]+)
    target_label:  ${1}
    replacement:   ${2}

- job_name: service-z

  tls_config:
    cert_file: valid_cert_file
    key_file: valid_key_file

  bearer_token: mysecret

- job_name: service-kubernetes

  kubernetes_sd_configs:
  - role: endpoints
    api_server: 'https://localhost:1234'

    basic_auth:
      username: 'myusername'
      password: 'mysecret'

- job_name: service-kubernetes-namespaces

  kubernetes_sd_configs:
  - role: endpoints
    api_server: 'https://localhost:1234'
    namespaces:
      names:
        - default

- job_name: service-marathon
  marathon_sd_configs:
  - servers:
    - 'https://marathon.example.com:443'

    tls_config:
      cert_file: valid_cert_file
      key_file: valid_key_file

- job_name: service-ec2
  ec2_sd_configs:
    - region: us-east-1
      access_key: access
      secret_key: mysecret
      profile: profile

- job_name: service-azure
  azure_sd_configs:
    - subscription_id: 11AAAA11-A11A-111A-A111-1111A1111A11
      tenant_id: BBBB222B-B2B2-2B22-B222-2BB2222BB2B2
      client_id: 333333CC-3C33-3333-CCC3-33C3CCCCC33C
      client_secret: mysecret
      port: 9100

- job_name: service-nerve
  nerve_sd_configs:
    - servers:
      - localhost
      paths:
      - /monitoring

- job_name: 0123service-xxx
  metrics_path: /metrics
  static_configs:
    - targets:
      - localhost:9090

- job_name: 測試
  metrics_path: /metrics
  static_configs:
    - targets:
      - localhost:9090

- job_name: service-triton
  triton_sd_configs:
  - account: 'testAccount'
    dns_suffix: 'triton.example.com'
    endpoint: 'triton.example.com'
    port: 9163
    refresh_interval: 1m
    version: 1
    tls_config:
      cert_file: testdata/valid_cert_file
      key_file: testdata/valid_key_file

alerting:
  alertmanagers:
  - scheme: https
    static_configs:
    - targets:
      - "1.2.3.4:9093"
      - "1.2.3.5:9093"
      - "1.2.3.6:9093"

Reference:
the official Prometheus configuration documentation
