Disclaimer: this article is only a translation of the relevant material from the official site; for details, refer to the official configuration documentation.
Overview
Prometheus parameters can be changed via command-line flags and the configuration file. The command-line flags mainly control system-level parameters, such as specifying the storage path or the disk to mount.
Prometheus can reload its configuration at runtime. If the new configuration is not well-formed, the changes will not be applied. A configuration reload is triggered by sending a SIGHUP to the Prometheus process, or by sending an HTTP POST request to the /-/reload endpoint (when the --web.enable-lifecycle flag is enabled). This will also reload any configured rule files.
Prometheus specifies its configuration file with the --config.file command-line flag.
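The reload workflow above can be sketched with shell commands. This is only an illustration: the binary location, config path, and listen address are assumptions, not values from the original text.

```shell
# Start Prometheus with an explicit config file and enable the lifecycle
# endpoints (required for HTTP-triggered reloads); paths are illustrative.
prometheus --config.file=/etc/prometheus/prometheus.yml --web.enable-lifecycle &

# Option 1: signal-based reload.
kill -SIGHUP "$(pidof prometheus)"

# Option 2: HTTP-based reload (only available with --web.enable-lifecycle).
curl -X POST http://localhost:9090/-/reload
```

If the reloaded file fails to parse, Prometheus keeps running with the previous configuration.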
I. Parameter type definitions (the generic placeholders are defined as follows)
1. <boolean>: a boolean that can take the values true or false
2. <duration>: a duration matching the regular expression [0-9]+(ms|[smhdwy])
3. <labelname>: a string matching the regular expression [a-zA-Z_][a-zA-Z0-9_]*
4. <labelvalue>: a string of unicode characters
5. <filename>: a valid path in the current working directory
6. <host>: a valid string consisting of a hostname or IP followed by an optional port number
7. <path>: a valid URL path
8. <scheme>: a string that can take the values http or https
9. <string>: a regular string
10. <secret>: a regular string that is a secret, such as a password
11. <tmpl_string>: a string which is template-expanded before usage
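As a quick sanity check, the `<duration>` and `<labelname>` regular expressions above can be exercised directly. This is just a sketch for validating values before putting them into a config file; the function names are illustrative:

```python
import re

# Regexes taken verbatim from the placeholder definitions above.
DURATION_RE = re.compile(r'^[0-9]+(ms|[smhdwy])$')
LABELNAME_RE = re.compile(r'^[a-zA-Z_][a-zA-Z0-9_]*$')

def is_duration(s: str) -> bool:
    """True if s is a valid Prometheus <duration>, e.g. '15s', '1m', '500ms'."""
    return DURATION_RE.match(s) is not None

def is_labelname(s: str) -> bool:
    """True if s is a valid Prometheus <labelname>, e.g. 'job', '__address__'."""
    return LABELNAME_RE.match(s) is not None

print(is_duration("15s"))           # True
print(is_duration("1.5m"))          # False: fractions are not allowed by this regex
print(is_labelname("__address__"))  # True
print(is_labelname("9lives"))       # False: must not start with a digit
```

Note that the regex anchors whole values; a compound duration like "1h30m" is rejected by this particular pattern.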
The global configuration specifies parameters that are valid in all other configuration contexts. They also serve as defaults for other configuration sections.
global:
  # How frequently to scrape targets by default.
  [ scrape_interval: <duration> | default = 1m ]
  # How long until a scrape request times out.
  [ scrape_timeout: <duration> | default = 10s ]
  # How frequently to evaluate rules.
  [ evaluation_interval: <duration> | default = 1m ]
  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    [ <labelname>: <labelvalue> ... ]

# Rule files specifies a list of globs. Rules and alerts are read from
# all matching files.
rule_files:
  [ - <filepath_glob> ... ]

# A list of scrape configurations.
scrape_configs:
  [ - <scrape_config> ... ]

# Alerting specifies settings related to the Alertmanager.
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]

# Settings related to the remote write feature.
remote_write:
  [ - <remote_write> ... ]

# Settings related to the remote read feature.
remote_read:
  [ - <remote_read> ... ]
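A minimal concrete instance of the global skeleton above might look like this. It is only a sketch: the interval, label values, and file names are illustrative assumptions.

```yaml
global:
  scrape_interval: 30s        # scrape every 30 seconds instead of the 1m default
  evaluation_interval: 30s    # evaluate rules at the same cadence
  external_labels:
    cluster: demo             # attached when talking to external systems

rule_files:
  - "rules/*.rules"           # glob; all matching files are loaded

scrape_configs:
  - job_name: prometheus      # self-scrape on the default /metrics path
    static_configs:
      - targets: ['localhost:9090']
```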
<scrape_config>: defines how a set of targets is scraped. In the general case, one scrape configuration specifies a single job. In advanced configurations, this may change.
relabel_configs allows modifying any target and its labels before scraping.
Targets may be statically configured via the static_configs parameter, or dynamically discovered using one of the supported service-discovery mechanisms.
<static_config>
# The targets specified by the static config.
targets:
  [ - '<host>' ]
# Labels assigned to all metrics scraped from the targets.
labels:
  [ <labelname>: <labelvalue> ... ]
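For example, a static_config that scrapes two hosts and attaches an environment label to everything scraped from them (the addresses and label are illustrative):

```yaml
static_configs:
  - targets: ['10.0.0.1:9100', '10.0.0.2:9100']   # e.g. two node_exporter hosts
    labels:
      env: production    # added to every series scraped from these targets
```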
# The job name assigned to scraped metrics by default.
job_name: <job_name>

# How frequently to scrape targets from this job.
[ scrape_interval: <duration> | default = <global_config.scrape_interval> ]

# Per-scrape timeout when scraping this job.
[ scrape_timeout: <duration> | default = <global_config.scrape_timeout> ]

# The HTTP resource path on which to fetch metrics from targets.
[ metrics_path: <path> | default = /metrics ]

# honor_labels controls how Prometheus handles conflicts between labels that are
# already present in scraped data and labels that Prometheus would attach
# server-side ("job" and "instance" labels, manually configured target
# labels, and labels generated by service discovery implementations).
#
# If honor_labels is set to "true", label conflicts are resolved by keeping label
# values from the scraped data and ignoring the conflicting server-side labels.
#
# If honor_labels is set to "false", label conflicts are resolved by renaming
# conflicting labels in the scraped data to "exported_<original-label>" (for
# example "exported_instance", "exported_job") and then attaching server-side
# labels. This is useful for use cases such as federation, where all labels
# specified in the target should be preserved.
#
# Note that any globally configured "external_labels" are unaffected by this
# setting. In communication with external systems, they are always applied only
# when a time series does not have a given label yet and are ignored otherwise.
[ honor_labels: <boolean> | default = false ]

# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]

# Optional HTTP URL parameters.
params:
  [ <string>: [<string>, ...] ]

# Sets the `Authorization` header on every scrape request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <secret> ]

# Sets the `Authorization` header on every scrape request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <secret> ]

# Sets the `Authorization` header on every scrape request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# List of Azure service discovery configurations.
azure_sd_configs:
  [ - <azure_sd_config> ... ]

# List of Consul service discovery configurations.
consul_sd_configs:
  [ - <consul_sd_config> ... ]

# List of DNS service discovery configurations.
dns_sd_configs:
  [ - <dns_sd_config> ... ]

# List of EC2 service discovery configurations.
ec2_sd_configs:
  [ - <ec2_sd_config> ... ]

# List of OpenStack service discovery configurations.
openstack_sd_configs:
  [ - <openstack_sd_config> ... ]

# List of file service discovery configurations.
file_sd_configs:
  [ - <file_sd_config> ... ]

# List of GCE service discovery configurations.
gce_sd_configs:
  [ - <gce_sd_config> ... ]

# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
  [ - <kubernetes_sd_config> ... ]

# List of Marathon service discovery configurations.
marathon_sd_configs:
  [ - <marathon_sd_config> ... ]

# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
  [ - <nerve_sd_config> ... ]

# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
  [ - <serverset_sd_config> ... ]

# List of Triton service discovery configurations.
triton_sd_configs:
  [ - <triton_sd_config> ... ]

# List of labeled statically configured targets for this job.
static_configs:
  [ - <static_config> ... ]

# List of target relabel configurations.
relabel_configs:
  [ - <relabel_config> ... ]

# List of metric relabel configurations.
metric_relabel_configs:
  [ - <relabel_config> ... ]

# Per-scrape limit on number of scraped samples that will be accepted.
# If more than this number of samples are present after metric relabelling
# the entire scrape will be treated as failed. 0 means no limit.
[ sample_limit: <int> | default = 0 ]
<tls_config> configuration details
# CA certificate to validate API server certificate with.
[ ca_file: <filename> ]
# Certificate and key files for client cert authentication to the server.
[ cert_file: <filename> ]
[ key_file: <filename> ]
# ServerName extension to indicate the name of the server.
# http://tools.ietf.org/html/rfc4366#section-3.1
[ server_name: <string> ]
# Disable validation of the server certificate.
[ insecure_skip_verify: <boolean> ]
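Put together, a scrape job using client-certificate authentication might look like the following sketch; all file paths and hostnames are illustrative assumptions:

```yaml
scrape_configs:
  - job_name: secure-service
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/ca.crt        # verify the server's certificate
      cert_file: /etc/prometheus/client.crt  # client cert for mutual TLS
      key_file: /etc/prometheus/client.key
      insecure_skip_verify: false            # keep server verification on
    static_configs:
      - targets: ['secure.example.com:443']
```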
Prometheus supports many kinds of service discovery, as the configuration above shows: kubernetes, openstack, ec2, dns, and so on. The kubernetes mechanism is explained in detail below.
It discovers targets by role, such as node, service, pod, endpoints, and ingress.
node
By default the target address is the kubelet's HTTP port. The IP address is taken from one of NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, and NodeHostName, forming the pattern NodeInternalIP:Port.
labels
__meta_kubernetes_node_name: the node name
__meta_kubernetes_node_label_<labelname>: each label of the node
__meta_kubernetes_node_annotation_<annotationname>: each annotation of the node
__meta_kubernetes_node_address_<address_type>: the first address for each node address type (NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, NodeHostName), if one exists
service
The Kubernetes DNS name of the service plus the service port.
labels
__meta_kubernetes_namespace: the namespace the service is in
__meta_kubernetes_service_name: the service name
__meta_kubernetes_service_label_<labelname>: each label of the service
__meta_kubernetes_service_annotation_<annotationname>: each annotation of the service
__meta_kubernetes_service_port_name: the name of the service port
__meta_kubernetes_service_port_number: the number of the service port
__meta_kubernetes_service_port_protocol: the protocol of the service port
pod
labels
__meta_kubernetes_namespace: the namespace the pod is in
__meta_kubernetes_pod_name: the pod name
__meta_kubernetes_pod_ip: the pod IP
__meta_kubernetes_pod_label_<labelname>: each label of the pod
__meta_kubernetes_pod_annotation_<annotationname>: each annotation of the pod
__meta_kubernetes_pod_container_name: the container name
__meta_kubernetes_pod_container_port_name: the container port name
__meta_kubernetes_pod_container_port_number: the container port number
__meta_kubernetes_pod_container_port_protocol: the container port protocol
__meta_kubernetes_pod_ready: the pod's ready state, true or false
__meta_kubernetes_pod_node_name: the name of the node the pod runs on
__meta_kubernetes_pod_host_ip: the IP of the node the pod runs on
__meta_kubernetes_pod_uid: the UID of the pod
endpoints
__meta_kubernetes_namespace: the namespace of the endpoints object
__meta_kubernetes_endpoints_name: the name of the endpoints object
__meta_kubernetes_endpoint_ready: the endpoint's ready state, true or false
__meta_kubernetes_endpoint_port_name: the name of the endpoint port
__meta_kubernetes_endpoint_port_protocol: the protocol of the endpoint port
ingress
__meta_kubernetes_namespace: the namespace of the ingress
__meta_kubernetes_ingress_name: the ingress name
__meta_kubernetes_ingress_label_<labelname>: each label of the ingress
__meta_kubernetes_ingress_annotation_<annotationname>: each annotation of the ingress
__meta_kubernetes_ingress_scheme: http by default, https if TLS is configured
__meta_kubernetes_ingress_path: path from the ingress spec; defaults to /
For details see: prometheus-kubernetes
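A typical pod-role discovery config keeps only pods annotated for scraping and rewrites the metrics path from an annotation, using the __meta_* labels listed above. The sketch below assumes the common prometheus.io/* annotation convention, which is not part of the role itself:

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use the prometheus.io/path annotation as the metrics path, if present.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Carry the namespace and pod name through as ordinary labels.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```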
Example configuration file
# my global config
global:
  scrape_interval: 15s
  evaluation_interval: 30s
  # scrape_timeout is set to the global default (10s).

  external_labels:
    monitor: codelab
    foo: bar

rule_files:
  - "first.rules"
  - "my/*.rules"

remote_write:
  - url: http://remote1/push
    write_relabel_configs:
      - source_labels: [__name__]
        regex: expensive.*
        action: drop
  - url: http://remote2/push

remote_read:
  - url: http://remote1/read
    read_recent: true
  - url: http://remote3/read
    read_recent: false
    required_matchers:
      job: special

scrape_configs:
  - job_name: prometheus

    honor_labels: true
    # scrape_interval is defined by the configured global (15s).
    # scrape_timeout is defined by the global default (10s).

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    file_sd_configs:
      - files:
          - foo/*.slow.json
          - foo/*.slow.yml
          - single/file.yml
        refresh_interval: 10m
      - files:
          - bar/*.yaml

    static_configs:
      - targets: ['localhost:9090', 'localhost:9191']
        labels:
          my: label
          your: label

    relabel_configs:
      - source_labels: [job, __meta_dns_name]
        regex: (.*)some-[regex]
        target_label: job
        replacement: foo-${1}
        # action defaults to 'replace'
      - source_labels: [abc]
        target_label: cde
      - replacement: static
        target_label: abc
      - regex:
        replacement: static
        target_label: abc

    bearer_token_file: valid_token_file

  - job_name: service-x

    basic_auth:
      username: admin_name
      password: "multiline\nmysecret\ntest"

    scrape_interval: 50s
    scrape_timeout: 5s

    sample_limit: 1000

    metrics_path: /my_path
    scheme: https

    dns_sd_configs:
      - refresh_interval: 15s
        names:
          - first.dns.address.domain.com
          - second.dns.address.domain.com
      - names:
          - first.dns.address.domain.com
        # refresh_interval defaults to 30s.

    relabel_configs:
      - source_labels: [job]
        regex: (.*)some-[regex]
        action: drop
      - source_labels: [__address__]
        modulus: 8
        target_label: __tmp_hash
        action: hashmod
      - source_labels: [__tmp_hash]
        regex: 1
        action: keep
      - action: labelmap
        regex: 1
      - action: labeldrop
        regex: d
      - action: labelkeep
        regex: k

    metric_relabel_configs:
      - source_labels: [__name__]
        regex: expensive_metric.*
        action: drop

  - job_name: service-y

    consul_sd_configs:
      - server: 'localhost:1234'
        token: mysecret
        services: ['nginx', 'cache', 'mysql']
        scheme: https
        tls_config:
          ca_file: valid_ca_file
          cert_file: valid_cert_file
          key_file: valid_key_file
          insecure_skip_verify: false

    relabel_configs:
      - source_labels: [__meta_sd_consul_tags]
        separator: ','
        regex: label:([^=]+)=([^,]+)
        target_label: ${1}
        replacement: ${2}

  - job_name: service-z

    tls_config:
      cert_file: valid_cert_file
      key_file: valid_key_file

    bearer_token: mysecret

  - job_name: service-kubernetes

    kubernetes_sd_configs:
      - role: endpoints
        api_server: 'https://localhost:1234'

        basic_auth:
          username: 'myusername'
          password: 'mysecret'

  - job_name: service-kubernetes-namespaces

    kubernetes_sd_configs:
      - role: endpoints
        api_server: 'https://localhost:1234'
        namespaces:
          names:
            - default

  - job_name: service-marathon
    marathon_sd_configs:
      - servers:
          - 'https://marathon.example.com:443'
        tls_config:
          cert_file: valid_cert_file
          key_file: valid_key_file

  - job_name: service-ec2
    ec2_sd_configs:
      - region: us-east-1
        access_key: access
        secret_key: mysecret
        profile: profile

  - job_name: service-azure
    azure_sd_configs:
      - subscription_id: 11AAAA11-A11A-111A-A111-1111A1111A11
        tenant_id: BBBB222B-B2B2-2B22-B222-2BB2222BB2B2
        client_id: 333333CC-3C33-3333-CCC3-33C3CCCCC33C
        client_secret: mysecret
        port: 9100

  - job_name: service-nerve
    nerve_sd_configs:
      - servers:
          - localhost
        paths:
          - /monitoring

  - job_name: 0123service-xxx
    metrics_path: /metrics
    static_configs:
      - targets:
          - localhost:9090

  - job_name: 測試
    metrics_path: /metrics
    static_configs:
      - targets:
          - localhost:9090

  - job_name: service-triton
    triton_sd_configs:
      - account: 'testAccount'
        dns_suffix: 'triton.example.com'
        endpoint: 'triton.example.com'
        port: 9163
        refresh_interval: 1m
        version: 1
        tls_config:
          cert_file: testdata/valid_cert_file
          key_file: testdata/valid_key_file

alerting:
  alertmanagers:
    - scheme: https
      static_configs:
        - targets:
            - "1.2.3.4:9093"
            - "1.2.3.5:9093"
            - "1.2.3.6:9093"
Reference:
configuration