Prometheus-Operator是一套为了方便整合prometheus和kubernetes的软件,使用Prometheus-Operator可以非常简单的在kubernetes集群中部署Prometheus服务,并且提供对kubernetes集群的监控,并且通过Prometheus-Operator用户能够使用简单的声明性配置来配置和管理Prometheus实例,这些配置将响应、创建、配置和管理Prometheus监控实例
Operator的核心思想是将Prometheus的部署与它监控的对象的配置分离,做到部署与监控对象的配置分离之后,就可以轻松实现动态配置。使用Operator部署了Prometheus之后就可以不用再管Prometheus Server了,以后如果要添加监控对象或者添加告警规则,只需要编写对应的ServiceMonitor和Prometheus资源就可以,不用再重启Prometheus服务,Operator会动态的观察配置的改动,并将其生成为对应的prometheus配置文件其中Operator可以部署、管理Prometheus Service
上图是Prometheus-Operator官方提供的架构图,从下向上看,Operator可以部署并且管理Prometheus Server,并且Operator可以观察Prometheus,那么这个观察是什么意思呢,上图中Service1 - Service5其实就是kubernetes的service,kubernetes的资源分为Service、Deployment、ServiceMonitor、ConfigMap等等,所以上图中的Service和ServiceMonitor都是kubernetes资源,一个ServiceMonitor可以通过labelSelector的方式去匹配一类Service(一般来说,一个ServiceMonitor对应一个Service),Prometheus也可以通过labelSelector去匹配多个ServiceMonitor。
Operator : Operator是整个系统的主要控制器,会以Deployment的方式运行于Kubernetes集群上,并根据自定义资源(Custom Resources Definition)CRD 来负责管理与部署Prometheus,Operator会通过监听这些CRD的变化来做出相对应的处理。
Prometheus : Operator会观察集群内的Prometheus CRD(Prometheus 也是一种CRD)来创建一个合适的statefulset在monitoring(.metadata.namespace指定)命名空间,并且挂载了一个名为prometheus-k8s的Secret为Volume到/etc/prometheus/config目录,Secret的data包含了以下内容
ServiceMonitor : ServiceMonitor就是一种kubernetes自定义资源(CRD),Operator会通过监听ServiceMonitor的变化来动态生成Prometheus的配置文件中的Scrape targets,并让这些配置实时生效,operator通过将生成的job更新到上面的prometheus-k8s这个Secret的Data的prometheus.yaml字段里,然后prometheus这个pod里的sidecar容器prometheus-config-reloader
当检测到挂载路径的文件发生改变后自动去执行HTTP Post请求到/api/-reload-
路径去reload配置。该自定义资源(CRD)通过labels选取对应的Service,并让prometheus server通过选取的Service拉取对应的监控信息(metric)
Service :Service其实就是指kubernetes的service资源,这里特指Prometheus exporter的service,比如部署在kubernetes上的mysql-exporter的service
想象一下,我们以传统的方式去监控一个mysql服务,首先需要安装mysql-exporter,获取mysql metrics,并且暴露一个端口,等待prometheus服务来拉取监控信息,然后去Prometheus Server的prometheus.yaml文件中在scarpe_config中添加mysql-exporter的job,配置mysql-exporter的地址和端口等信息,再然后,需要重启Prometheus服务,就完成添加一个mysql监控的任务
现在我们以Prometheus-Operator的方式来部署Prometheus,当我们需要添加一个mysql监控我们会怎么做,首先第一步和传统方式一样,部署一个mysql-exporter来获取mysql监控项,然后编写一个ServiceMonitor通过labelSelector选择刚才部署的mysql-exporter,由于Operator在部署Prometheus的时候默认指定了Prometheus选择label为:prometheus: kube-prometheus
的ServiceMonitor,所以只需要在ServiceMonitor上打上prometheus: kube-prometheus
标签就可以被Prometheus选择了,完成以上两步就完成了对mysql的监控,不需要改Prometheus配置文件,也不需要重启Prometheus服务,是不是很方便,Operator观察到ServiceMonitor发生变化,会动态生成Prometheus配置文件,并保证配置文件实时生效
ServiceMonitor是一个kubernetes的自定义资源,所以得遵循Kubernetes ServiceMonitor编写规范,这里通过动态添加一个mysql监控的示例来演示如何编写ServiceMonitor
先决条件:
在满足上述先决条件的情况下,首先打开prometheus server的界面,如下图,可以在Targets下看到已经有prometheus和kubernetes自身的监控了,还没有mysql监控
接下来在kubernetes中添加mysql监控的exporter:prometheus-mysql-exporter 这里采用helm的方式安装prometheus-mysql-exporter,按照github上的步骤进行安装,修改values.yaml中的datasource为安装在kubernetes中mysql的地址,然后执行命令helm install --name me-release -f values.yaml stable/prometheus-mysql-exporter
,接下来通过执行命令kubectl get service
查看刚才运行的mysql-exporter的service:
root@k8s1:/usr/local/src/charts# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None 9093/TCP,6783/TCP 4d
grafana-grafana NodePort 10.106.170.105 80:32324/TCP 4d
kube-prometheus NodePort 10.111.168.249 9090:30900/TCP 22h
kube-prometheus-alertmanager NodePort 10.101.224.156 9093:30903/TCP 22h
kube-prometheus-exporter-kube-state ClusterIP 10.107.60.242 80/TCP 22h
kube-prometheus-exporter-node ClusterIP 10.111.96.180 9100/TCP 22h
me-release-prometheus-mysql-exporter ClusterIP 10.109.252.44 9104/TCP 50s
prometheus-operated ClusterIP None 9090/TCP 4d
找到me-release-prometheus-mysql-exporter,说明服务已经起来了,然后执行命令kubectl describe service me-release-prometheus-mysql-exporter
查看该service信息:
root@k8s1:/usr/local/src/charts# kubectl describe service me-release-prometheus-mysql-exporter
Name: me-release-prometheus-mysql-exporter
Namespace: monitoring
Labels: app=prometheus-mysql-exporter
chart=prometheus-mysql-exporter-0.1.0
heritage=Tiller
release=me-release
Annotations:
Selector: app=prometheus-mysql-exporter,release=me-release
Type: ClusterIP
IP: 10.109.252.44
Port: mysql-exporter 9104/TCP
TargetPort: 9104/TCP
Endpoints: 10.244.4.15:9104
Session Affinity: None
Events:
可以看到该service在k8s内的端口是Port: mysql-exporter 9104/TCP
接下来编写ServiceMonitor文件,执行命令vim servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor #资源类型为ServiceMonitor
metadata:
labels:
prometheus: kube-prometheus #prometheus默认通过 prometheus: kube-prometheus发现ServiceMonitor,只要写上这个标签prometheus服务就能发现这个ServiceMonitor
name: prometheus-exporter-mysql
spec:
jobLabel: app #jobLabel指定的标签的值将会作为prometheus配置文件中scrape_config下job_name的值,也就是Target,如果不写,默认为service的name
selector:
matchLabels: #该ServiceMonitor匹配的Service的labels,如果使用mathLabels,则下面的所有标签都匹配时才会匹配该service,如果使用matchExpressions,则至少匹配一个标签的service都会被选择
app: prometheus-mysql-exporter # 由于前面查看mysql-exporter的service信息中标签包含了app: prometheus-mysql-exporter这个标签,写上就能匹配到
namespaceSelector:
any: true #表示从所有namespace中去匹配,如果只想选择某一命名空间中的service,可以使用matchNames: []的方式
# mathNames: []
endpoints:
- port: mysql-exporter #前面查看mysql-exporter的service信息中,提供mysql监控信息的端口是Port: mysql-exporter 9104/TCP,所以这里填mysql-exporter
interval: 30s #每30s获取一次信息
# path: /metrics HTTP path to scrape for metrics,默认值为/metrics
honorLabels: true
保存并退出文件,然后执行命令:kubectl create -f servicemonitor.yaml
,创建成功之后执行命令kubectl get serviceMonitor
查看是否有刚才创建的serviceMonitor:
root@k8s1:~/mysql-exporter# kubectl create -f servicemonitor.yaml
servicemonitor.monitoring.coreos.com/prometheus-exporter-mysql created
root@k8s1:~/mysql-exporter# kubectl get serviceMonitor
NAME AGE
grafana 4d
kafka-release 5h
kafka-release-exporter 5h
kube-prometheus 23h
kube-prometheus-alertmanager 23h
kube-prometheus-exporter-coredns 23h
kube-prometheus-exporter-kube-controller-manager 23h
kube-prometheus-exporter-kube-etcd 23h
kube-prometheus-exporter-kube-scheduler 23h
kube-prometheus-exporter-kube-state 23h
kube-prometheus-exporter-kubelets 23h
kube-prometheus-exporter-kubernetes 23h
kube-prometheus-exporter-node 23h
prometheus-exporter-mysql 13s
prometheus-operator 4d
root@k8s1:~/mysql-exporter#
可以看到Prometheus-exporter-mysql已经存在了,表示创建成功了,过1分钟左右,在prometheus的界面中查看Targets,可以看到已经成功添加了mysql监控。
上面提到prometheus通过标签prometheus: kube-prometheus
选择ServiceMonitor,该配置写在这里, 当然,你可以通过在values.yaml中配置serviceMonitorsSelector来指定按照自己的规则选择serviceMonitor,关于如何配置serviceMonitorsSelector将放在后文统一讲解
当我们动态添加了监控对象,一般会对该对象配置告警规则,采用prometheus-operator的架构模式下,当我们需要动态配置告警规则的时候,可以使用另一种自定义资源(CRD)PrometheusRule,PrometheusRule和ServiceMonitor都是一种自定义资源,ServiceMonitor用于动态添加监控实例,而PrometheusRule则用于动态添加告警规则,下面依然通过动态添加mysql的告警规则为例来演示如何使用PrometheusRule资源。
执行命令vim mysql-rule.yaml
,输入以下内容
apiVersion: monitoring.coreos.com/v1 #这和ServiceMonitor一样
kind: PrometheusRule #该资源类型是Prometheus,这也是一种自定义资源(CRD)
metadata:
labels:
app: "prometheus-rule-mysql"
prometheus: kube-prometheus #同ServiceMonitor,ruleSelector也会默认选择标签为prometheus: kube-prometheus的PrometheusRule资源
name: prometheus-rule-mysql
spec:
groups: #编写告警规则,和prometheus的告警规则语法相同
- name: mysql.rules
rules:
- alert: TooManyErrorFromMysql
expr: sum(irate(mysql_global_status_connection_errors_total[1m])) > 10
labels:
severity: critical
annotations:
description: mysql产生了太多的错误.
summary: TooManyErrorFromMysql
- alert: TooManySlowQueriesFromMysql
expr: increase(mysql_global_status_slow_queries[1m]) > 10
labels:
severity: critical
annotations:
description: mysql一分钟内产生了{{ $value }}条慢查询日志.
summary: TooManySlowQueriesFromMysql
Prometheus选择PrometheusRule资源是通过ruleSelector来选择,默认也是通过标签:prometheus: kube-prometheus
来选择,在这里可以看到,ruleSelector和ServiceMonitorsSelector都是可以配置的,如何配置将放在后文的配置统一讲解
保存以上文件之后执行kubectl create -f mysql-rule.yaml
,创建成功之后执行命令kubectl get prometheusRule
可以看到刚才创建的PrometheusRule资源prometheus-rule-mysql:
root@k8s1:~/mysql-exporter# kubectl create -f mysql-rule.yaml
prometheusrule.monitoring.coreos.com/prometheus-rule-mysql created
root@k8s1:~/mysql-exporter# kubectl get prometheusRule
NAME AGE
kube-prometheus 1h
kube-prometheus-alertmanager 1h
kube-prometheus-exporter-kube-controller-manager 1h
kube-prometheus-exporter-kube-etcd 1h
kube-prometheus-exporter-kube-scheduler 1h
kube-prometheus-exporter-kube-state 1h
kube-prometheus-exporter-kubelets 1h
kube-prometheus-exporter-kubernetes 1h
kube-prometheus-exporter-node 1h
kube-prometheus-rules 1h
prometheus-rule-mysql 8s
等待1分钟左右,在prometheus图形界面中可以找到刚才添加的mysql.rule的内容了
Operator部署Alertmanager的时候会生成一个statefulset类型对象,通过命令kubectl get statefulset --all-namespaces
可以找到这个statefulset,可以看到name是alertmanager-kube-prometheus
root@k8s1:~/prometheus-operator/helm/alertmanager/templates# kubectl get statefulset --all-namespaces
NAMESPACE NAME DESIRED CURRENT AGE
elk elastic-release-elasticsearch-data 2 2 8d
elk elastic-release-elasticsearch-master 3 3 8d
elk logstash-release 1 1 9d
kafka kafka-release 3 3 1d
kafka zookeeper-release 3 3 12d
kube-system mongodb-release-arbiter 1 1 16d
kube-system mongodb-release-primary 1 1 16d
kube-system mongodb-release-secondary 1 1 16d
kube-system my-release-mysqlha 3 3 15d
monitoring alertmanager-kube-prometheus 1 1 9h
monitoring prometheus-kube-prometheus 1 1 9h
然后执行命令kubectl describe statefulset alertmanager-kube-prometheus -n monitoring
可以看到该statefulset的详细信息:
root@k8s1:~/prometheus-operator/helm/alertmanager/templates# kubectl describe statefulset alertmanager-kube-prometheus -n monitoring
Name: alertmanager-kube-prometheus
Namespace: monitoring
CreationTimestamp: Wed, 05 Sep 2018 09:46:08 +0800
Selector: alertmanager=kube-prometheus,app=alertmanager
Labels: alertmanager=kube-prometheus
app=alertmanager
chart=alertmanager-0.1.6
heritage=Tiller
release=kube-prometheus
Annotations:
Replicas: 1 desired | 1 total
Update Strategy: RollingUpdate
Pods Status: 1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: alertmanager=kube-prometheus
app=alertmanager
Containers:
alertmanager:
Image: intellif.io/prometheus-operator/alertmanager:v0.15.1
Ports: 9093/TCP, 6783/TCP
Host Ports: 0/TCP, 0/TCP
Args:
--config.file=/etc/alertmanager/config/alertmanager.yaml
--cluster.listen-address=$(POD_IP):6783
--storage.path=/alertmanager
--web.listen-address=:9093
--web.external-url=http://192.168.11.178:30903
--web.route-prefix=/
--cluster.peer=alertmanager-kube-prometheus-0.alertmanager-operated.monitoring.svc:6783
Requests:
memory: 200Mi
Liveness: http-get http://:web/api/v1/status delay=0s timeout=3s period=10s #success=1 #failure=10
Readiness: http-get http://:web/api/v1/status delay=3s timeout=3s period=5s #success=1 #failure=10
Environment:
POD_IP: (v1:status.podIP)
Mounts:
/alertmanager from alertmanager-kube-prometheus-db (rw)
/etc/alertmanager/config from config-volume (rw) #挂载的secert目录
config-reloader:
Image: intellif.io/prometheus-operator/configmap-reload:v0.0.1
Port:
Host Port:
Args:
-webhook-url=http://localhost:9093/-/reload
-volume-dir=/etc/alertmanager/config
Limits:
cpu: 5m
memory: 10Mi
Environment:
Mounts:
/etc/alertmanager/config from config-volume (ro)
Volumes:
config-volume:
Type: Secret (a volume populated by a Secret)
SecretName: alertmanager-kube-prometheus
Optional: false
alertmanager-kube-prometheus-db:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
Volume Claims:
Events:
该statefulset挂载了一个名为alertmanager-kube-prometheus的secret
资源到alertmanager容器内部的/etc/alertmanager/config/alertmanager.yaml
,上面的Volumes:
下面的config-volume:
标签下可以看到,Type:
字段的值为Secret表示挂载一个secret
资源,secrect
的name是alertmanager-kube-prometheus
,通过一下命令查看该secret: kubectl describe secrets alertmanager-kube-prometheus -n monitoring
root@k8s1:~# kubectl describe secrets alertmanager-kube-prometheus -n monitoring
Name: alertmanager-kube-prometheus
Namespace: monitoring
Labels: alertmanager=kube-prometheus
app=alertmanager
chart=alertmanager-0.1.6
heritage=Tiller
release=kube-prometheus
Annotations:
Type: Opaque
Data
====
alertmanager.yaml: 567 bytes
可以看到该secert的Data项里面有一个key为alertmanager.yaml
的属性,其value包含567bytes,而这个alertmanager.yaml的值其实就是alertmanager容器的/etc/alertmanager/config/alertmanager.yaml
中的内容,statefulset通过挂载的方式将/etc/alertmanager/config
挂载成一个secret,执行kubectl edit secrets -n monitoring alertmanager-kube-prometheus
可以看到该secret的内容:
apiVersion: v1
data:
alertmanager.yaml: Z2xvYmFsOgogIHJlc29sdmVfdGltZW91dDogNW0KICBzbXRwX2F1dGhfcGFzc3dvcmQ6ICoqKioKICBzbXRwX2F1dGhfdXNlcm5hbWU6IGlub3JpX3lpbmprQDE2My5jb20KICBzbXRwX2Zyb206IGlub3JpX3lpbmprMUAxNjMuY29tCiAgc210cF9yZXF1aXJlX3RsczogZmFsc2UKICBzbXRwX3NtYXJ0aG9zdDogc210cC4xNjMuY29tOjI1CnJlY2VpdmVyczoKLSBlbWFpbF9jb25maWdzOgogIC0gaGVhZGVyczoKICAgICAgU3ViamVjdDogJ1tFUlJPUl0gcHJvbWV0aGV1cy4uLi4uLi4uLi4uLicKICAgIHRvOiB4eHh4QHFxLmNvbQogIG5hbWU6IHRlYW0tWC1tYWlscwotIG5hbWU6ICJudWxsIgpyb3V0ZToKICBncm91cF9ieToKICAtIGFsZXJ0bmFtZQogIC0gY2x1c3RlcgogIC0gc2VydmljZQogIGdyb3VwX2ludGVydmFsOiA1bQogIGdyb3VwX3dhaXQ6IDYwcwogIHJlY2VpdmVyOiB0ZWFtLVgtbWFpbHMKICByZXBlYXRfaW50ZXJ2YWw6IDI0aAogIHJvdXRlczoKICAtIG1hdGNoOgogICAgICBhbGVydG5hbWU6IERlYWRNYW5zU3dpdGNoCiAgICByZWNlaXZlcjogIm51bGwi
kind: Secret
metadata:
creationTimestamp: 2018-09-05T01:46:08Z
labels:
alertmanager: kube-prometheus
app: alertmanager
chart: alertmanager-0.1.6
heritage: Tiller
release: kube-prometheus
name: alertmanager-kube-prometheus
namespace: monitoring
resourceVersion: "5820063"
selfLink: /api/v1/namespaces/monitoring/secrets/alertmanager-kube-prometheus
uid: 75a589e8-b0ad-11e8-8746-005056bf1d6e
type: Opaque
其中data:
下面的alertmanager.yaml
这个key对应的值是一串base64编码过后的字符串,将这段字符串复制出来通过base64反编码之后内容如下:
global:
resolve_timeout: 5m
smtp_auth_password: xxxxxx
smtp_auth_username: [email protected]
smtp_from: [email protected]
smtp_require_tls: false
smtp_smarthost: smtp.163.com:25
receivers:
- email_configs:
- headers:
Subject: '[ERROR] prometheus............'
to: [email protected]
name: team-X-mails
- name: "null"
route:
group_by:
- alertname
- cluster
- service
group_interval: 5m
group_wait: 60s
receiver: team-X-mails
repeat_interval: 24h
routes:
- match:
alertname: DeadMansSwitch
receiver: "null"
这其实就是alertmanager的config配置,上面说到,该内容会被挂载到alertmanager容器的/etc/alertmanager/config/alertmanager.yaml
,我们进入alertmanager容器去看看该文件,执行命令kubectl exec -it alertmanager-kube-prometheus-0 -n monitoring sh
进入到容器(可能你的容器名和我的不同,可以通过kubectl get pods --all-namespaces命令查看所有的容器),然后进入目录/etc/alertmanager/config
,然后ls
可以看到该目录下有一个叫alertmanager.yaml的文件,而该文件的内容就是上面base64反编译之后的内容,我们通过修改名为alertmanager-kube-prometheus的secret的data属性中的alertmanager.yaml
字段对应的值就相当于修改了该文件中的内容,所以现在问题就变简单了,在alertmanager的pod中还有另一个container叫做config-reloader,它会监听/etc/alertmanager/config
目录,当该目录下的文件发生变化的时候,config-reloader会向alertmanager发起http://localhost:9093/-/reload
POST请求,alertmanager会重新加载该目录下的配置文件,从而实现了动态配置更新
在理解了alertmanager动态配置的原理之后,问题就很清晰了,我们需要动态配置alertmanager只需要更新名为alertmanager-kube-prometheus(你的secret名不一定为这个名字,但一定是alertmanager-{*}格式)的secret的data属性中的alertmanager.yaml字段的值就可以了,更新secret有两种方法,一是通过kubectl edit secret
的方式,一种是通过kubectl patch secret
的方式,但是两种方式更新secret都需要输入base64编码之后的字符串,这里通过linux下的base64命令进行编码:
首先修改上面base64反编译后的文件,比如smtp_from改成另一个邮箱发送,修改完成之后保存文件,然后通过命令base64 file > test.txt
的方式将配置通过base64编码并将编码结果输出到test.txt文件中,然后进入test.txt文件中复制编码之后的字符串,如果通过第一种方式更新secret,执行命令kubectl edit secrets -n monitoring alertmanager-kube-prometheus
然后data下面的alertmanager.yaml的值为刚才复制的字符串,保存并退出就可以了。如果通过第二种方式更新secert,执行命令kubectl patch secert alertmanager-kube-prometheus -n -n monitoring -p '{"data":{"alertmanager.yaml":"此处填写刚才复制的base64编码之后的配置字符串"}'
即可完成更新,该命令中 -p参数后面跟的是一个JSON字符串,将刚才复制的base64编码后的字符串填入正确的位置可以了
在完成更新之后可以访问alertmanager的界面http://192.168.11.178:30903/#/status
,查看配置已经生效了
通过上面的操作我们已经实现了监控对象的动态发现,监控告警规则的动态添加,告警配置(发送邮件)的动态配置,基本上已经实现了所有配置的动态配置
配置Prometheus一般情况下只需要配置kube-prometheus下的values.yaml就能实现对alertmanager、promethes的配置,该文件该如何配置在配置项上一般都有说明,这里主要讲解上文提到的几个配置以及其他比较常用的几个配置,避免占用太多篇幅,我已经将比较简单或不常用的配置以省略号代替:
...
alertmanager: #所有alertmanager的配置都在这个标签下面
config: #alermanager的config,和传统的alertmanager的config配置相同
global:
resolve_timeout: 5m
route:
group_by: ['job']
group_wait: 30s
group_interval: 5m
repeat_interval: 12h
receiver: 'null'
routes:
- match:
alertname: DeadMansSwitch
receiver: 'null'
receivers:
- name: 'null'
## 外部url,报警发送邮件后可以通过该Url访问Alertmanager的界面
externalUrl: "http://192.168.11.178:30903"
...
## Node labels for Alertmanager pod assignment 通过该选择器将会选择alertmanager在哪个node上面部署
## Ref: https://kubernetes.io/docs/user-guide/node-selection/
##
nodeSelector: {}
...
## List of Secrets in the same namespace as the AlertManager
## object, which shall be mounted into the AlertManager Pods.
## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#alertmanagerspec
##
secrets: []
service:
...
## Port to expose on each node
## Only used if service.type is 'NodePort'
##
nodePort: 30903 #暴露的端口
## Service type
##
type: NodePort # 以NodePort的方式部署alertmanager 的Service,这将可以使用node的ip访问服务
prometheus: #所有的prometheus的配置都在这里编写
## Alertmanagers to which alerts will be sent
## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#alertmanagerendpoints
##
alertingEndpoints: []
#alertmanager地址,不填写默认会使用同时部署的alertmanager,在helm/prometheus/templates/prometheus.yaml文件第40行,
#github中https://github.com/coreos/prometheus-operator/blob/master/helm/prometheus/templates/prometheus.yaml#L40
# - name: ""
# namespace: ""
# port: 9093
# scheme: http
...
## External URL at which Prometheus will be reachable
##同上,外部访问地址
externalUrl: ""
## List of Secrets in the same namespace as the Prometheus
## object, which shall be mounted into the Prometheus Pods.
## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#prometheusspec
##
secrets: []
## How long to retain metrics
## prometheus时序数据库数据保存时间
retention: 24h
## Namespaces to be selected for PrometheusRules discovery.
## If unspecified, only the same namespace as the Prometheus object is in is used.
## 选择PrometheusRule的namespace,如何配置any:true则回去所有命名空间中去寻找PrometheusRule资源
ruleNamespaceSelector: {}
## any: true
## or
##
## Rules PrometheusRule CRD selector
## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/design.md
##
## 1. If `matchLabels` is used, `rules.additionalLabels` must contain all the labels from
## `matchLabels` in order to be be matched by Prometheus
## 2. If `matchExpressions` is used `rules.additionalLabels` must contain at least one label
## from `matchExpressions` in order to be matched by Prometheus
## Ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels
## 上文提到的ruleNamespaceSelector,不填写默认会使用prometheus: {{.Values.prometheusLabelValue}},
## 而prometheusLabel定义在helm/prometheus/values.yaml中,默认为.Release.Name即releaseName(kube-prometheus)
## 我们可以通过修改helm/prometheus/values.yaml中的prometheusLabel值来修改默认选择的标签,也可以在这里定义自己的标签,
## 在这里定义了标签之后默认的会失效,比如在这里定义一个comment: prometheus标签,默认的prometheus: kube-prometheus就会失效,prometheus也将根据新定义的标签来选择PrometheusRule资源
rulesSelector: {}
# rulesSelector: {
# matchExpressions: [{key: prometheus, operator: In, values: [example-rules, example-rules-2]}]
# }
### OR
# rulesSelector: {
# matchLabels: [{role: example-rules}]
# }
## Prometheus alerting & recording rules
## Ref: https://prometheus.io/docs/querying/rules/
## Ref: https://prometheus.io/docs/alerting/rules/
##
rules: #可以在这里配置PrometheusRule资源,一般不在这里配置,而是单独编写PrometheusRule这样可以实现动态配置
specifiedInValues: true
## What additional rules to be added to the PrometheusRule CRD
## You can use this together with `rulesSelector`
additionalLabels: {}
# prometheus: example-rules
# application: etcd
value: {}
service:
## Port to expose on each node
## Only used if service.type is 'NodePort'
##
nodePort: 30900 #同上,暴露的端口
## Service type
##
type: NodePort #同上,将Prometheus Service映射到外网
## Service monitors selector
## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/design.md
## 同上,prometheus默认会选择带有prometheus: kube-prometheus标签的ServiceMonitor资源,也可以在这里配置自定义的标签,一旦在这里配置了标签,默认的将会失效,prometheus会按照新的serviceMonitorSelector中定义的标签来选择对应的ServiceMonitor
serviceMonitorsSelector: {}
# matchLabels:
# - comment: prometheus
# - release: kube-prometheus
## ServiceMonitor CRDs to create & be scraped by the Prometheus instance.
## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/service-monitor.md
##
serviceMonitors: [] #可以在这里配置serviceMonitor资源,一般不再这里配置,参见如何编写ServiceMonitor章节