Prometheus - SSL Certificate Expiry Monitoring - DingTalk Alerts



Contents

    • Preface
    • I. Configure Prometheus alerting rules
    • II. Configure Alertmanager
    • III. Configure DingTalk
    • IV. Simulate an alert and its recovery
    • Summary


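Preface

The previous post, "Prometheus - SSL Certificate Expiry Monitoring", set up a Grafana dashboard that shows SSL certificate expiry. This post adds alerting on top of that dashboard, which is the real goal of the monitoring.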

I. Configure Prometheus alerting rules

1. First, confirm the path Prometheus loads rule files from

[Screenshot: the rule file path in the Prometheus configuration]

2. Write the alerting rules

vim /home/data/prometheus/rules/ssl_cert_alerts.yml
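groups:
- name: "SSL certificate expiry reminders"
  rules:
  # The job label must match the scrape job name used in the Prometheus configuration.
  # probe_ssl_earliest_cert_expiry is a Unix timestamp, so subtracting time() gives the seconds remaining.
  - alert: "Certificate expires in < 30 days"
    expr: probe_ssl_earliest_cert_expiry{job="SSL证书时间"} - time() < 86400 * 30
    for: 0s
    labels:
      severity: "info"
    annotations:
      summary: "{{ $labels.instance }}: SSL certificate expires in less than 30 days, please renew it in time!"
      description: "{{ $labels.instance }}: SSL certificate expires in less than 30 days, please renew it in time!"
  - alert: "Certificate expires in < 7 days"
    expr: probe_ssl_earliest_cert_expiry{job="SSL证书时间"} - time() < 86400 * 7
    for: 0s
    labels:
      severity: "warning"
    annotations:
      summary: "{{ $labels.instance }}: SSL certificate expires in less than 7 days, please renew it in time!"
      description: "{{ $labels.instance }}: SSL certificate expires in less than 7 days, please renew it in time!"
  - alert: "Certificate expires in < 1 day"
    expr: probe_ssl_earliest_cert_expiry{job="SSL证书时间"} - time() < 86400 * 1
    for: 0s
    labels:
      severity: "critical"
    annotations:
      summary: "{{ $labels.instance }}: SSL certificate expires in less than 1 day, please renew it in time!"
      description: "{{ $labels.instance }}: SSL certificate expires in less than 1 day, please renew it in time!"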
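
The three rules share one piece of arithmetic: the metric is an expiry time in Unix seconds, the current time is subtracted from it, and the remainder is compared against N days expressed in seconds. The following is a minimal Python sketch of that comparison, not something Prometheus runs; the function names and the sample timestamp are hypothetical. It also shows that the thresholds nest, so a certificate with less than one day left satisfies all three expressions at once.

```python
SECONDS_PER_DAY = 86400

# Thresholds in days, mirroring the three rules above.
THRESHOLD_DAYS = (30, 7, 1)


def seconds_remaining(expiry_ts: float, now_ts: float) -> float:
    """Same arithmetic as `probe_ssl_earliest_cert_expiry - time()`."""
    return expiry_ts - now_ts


def matching_thresholds(expiry_ts: float, now_ts: float) -> list[int]:
    """Return every threshold (in days) whose comparison would be true."""
    remaining = seconds_remaining(expiry_ts, now_ts)
    return [days for days in THRESHOLD_DAYS if remaining < SECONDS_PER_DAY * days]


if __name__ == "__main__":
    now = 1_700_000_000  # arbitrary "current" Unix time for the example
    for days_left in (45, 10, 3, 0.5):
        expiry = now + days_left * SECONDS_PER_DAY
        print(days_left, matching_thresholds(expiry, now))
```

If receiving three notifications for the same certificate is unwanted, Alertmanager inhibition rules are one possible way to suppress the lower tiers; that is outside the setup shown here.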

3. Restart Prometheus

docker restart prometheus

II. Configure Alertmanager

1. Edit the configuration file

vim /home/data/alertmanager/conf/config.yml
global:
  resolve_timeout: 5m
route:
  group_wait: 0s
  group_interval: 5s
  repeat_interval: 1m
  group_by: ['instance']
  receiver: 'web.hook.prometheusalert'

receivers:
- name: 'web.hook.prometheusalert'
  webhook_configs:
  - url: 'http://YourDingTalk_IP:8060/dingtalk/webhook1/send'
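
To make the hand-off to the webhook concrete, here is a rough Python sketch of the JSON body Alertmanager posts to the configured URL (webhook format version 4). The alert name, severity, annotation text, timestamps, and external URL are made-up values for illustration. Only the receiver name and the grouping by `instance` come from the configuration above.

```python
import json

RECEIVER = "web.hook.prometheusalert"  # receiver name from the config above


def build_webhook_payload(instance: str, status: str = "firing") -> dict:
    """Build an illustrative Alertmanager webhook body for a single alert."""
    labels = {
        "alertname": "ExampleCertExpiry",  # hypothetical alert name
        "instance": instance,
        "severity": "example",  # hypothetical severity value
    }
    annotations = {"summary": f"{instance} certificate is close to expiry"}
    alert = {
        "status": status,
        "labels": labels,
        "annotations": annotations,
        "startsAt": "2024-01-01T00:00:00Z",
        # While an alert is firing, endsAt typically appears as the zero time.
        "endsAt": "0001-01-01T00:00:00Z" if status == "firing" else "2024-01-01T00:10:00Z",
        "generatorURL": "",
    }
    return {
        "version": "4",
        # group_by: ['instance'] means one group (and one message) per instance.
        "groupKey": f'{{}}:{{instance="{instance}"}}',
        "status": status,
        "receiver": RECEIVER,
        "groupLabels": {"instance": instance},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "http://alertmanager.example:9093",  # placeholder
        "alerts": [alert],
    }


if __name__ == "__main__":
    print(json.dumps(build_webhook_payload("example.com:443"), indent=2))
```

The `alerts` list is what the DingTalk template later iterates over, reading `.StartsAt`, `.EndsAt`, `.Annotations`, and `.Labels` from each entry.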

2. Restart Alertmanager

docker restart alertmanager

III. Configure DingTalk

1. Configuration file

vim /home/data/dingtalk/conf/config.yml
templates:
  - /etc/prometheus-webhook-dingtalk/templates/default.tmpl
targets:
  webhook1:
    url: https://oapi.dingtalk.com/robot/send?access_token=8cf8d025f***a4537b22
    secret: SECb***95fbab
    mention:
      all: true
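
The `secret` value corresponds to the signature security setting of a DingTalk custom robot. The webhook bridge is expected to handle signing itself once `secret` is set, so nothing below needs to be deployed. It is only a Python sketch of the signing scheme DingTalk documents (HMAC-SHA256 over the timestamp, a newline, and the secret, then Base64 and URL encoding), with a hypothetical function name and placeholder credentials.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse
from typing import Optional


def sign_dingtalk_url(webhook_url: str, secret: str,
                      timestamp_ms: Optional[int] = None) -> str:
    """Append the timestamp and sign query parameters to a robot webhook URL."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # DingTalk expects milliseconds
    string_to_sign = f"{timestamp_ms}\n{secret}".encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    sign = urllib.parse.quote_plus(base64.b64encode(digest))
    return f"{webhook_url}&timestamp={timestamp_ms}&sign={sign}"


if __name__ == "__main__":
    # Placeholder token and secret, not real credentials.
    url = "https://oapi.dingtalk.com/robot/send?access_token=PLACEHOLDER"
    print(sign_dingtalk_url(url, "SECplaceholder"))
```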

2. Template file

vim /home/data/dingtalk/templates/default.tmpl

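# Note: the host path edited here differs from the templates path in config.yml above because DingTalk runs in a container, and config.yml refers to the path inside the container.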
...
...
{{/* Firing */}}

{{ define "default.__text_alert_list" }}{{ range . }}

**Triggered at:** {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}

**Summary:** {{ .Annotations.summary }}

**Description:** {{ .Annotations.description }}

**Monitoring:** [grafana](http://grafana_ip:8000/grafana/d/GuJ5DHMnz/fu-wu-qi-jian-kong-tu-biao?orgId=1)

**Details:**
{{ range .Labels.SortedPairs }}{{ if and (ne (.Name) "severity") (ne (.Name) "summary") }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}{{ end }}
{{ end }}{{ end }}

{{/* Resolved */}}

{{ define "default.__text_resolved_list" }}{{ range . }}

**Triggered at:** {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}

**Resolved at:** {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}

**Summary:** {{ .Annotations.summary }}

**Monitoring:** [grafana](http://grafana_ip:8000/grafana/d/GuJ5DHMnz/fu-wu-qi-jian-kong-tu-biao?orgId=1)

**Details:**
{{ range .Labels.SortedPairs }}{{ if and (ne (.Name) "severity") (ne (.Name) "summary") }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}{{ end }}
{{ end }}{{ end }}
...
...
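
The layout string `2006.01.02 15:04:05` passed to `dateInZone` is a Go reference-time layout, not a literal date: it stands for year, month, day, hour, minute, and second. As a rough illustration of what the template prints, the Python sketch below performs the same conversion. It assumes Asia/Shanghai is a fixed UTC+8 offset (there is currently no daylight saving time there) and only accepts timestamps without fractional seconds, which is a simplification. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai modelled as a fixed UTC+8 offset.
SHANGHAI = timezone(timedelta(hours=8), "Asia/Shanghai")


def format_alert_time(rfc3339_utc: str) -> str:
    """Convert e.g. '2024-01-01T20:30:00Z' to the template's display format."""
    parsed = datetime.strptime(rfc3339_utc, "%Y-%m-%dT%H:%M:%SZ")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Go layout "2006.01.02 15:04:05" corresponds to "%Y.%m.%d %H:%M:%S".
    return parsed.astimezone(SHANGHAI).strftime("%Y.%m.%d %H:%M:%S")


print(format_alert_time("2024-01-01T20:30:00Z"))  # 2024.01.02 04:30:00
```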

3. Restart DingTalk

docker restart dingtalk

IV. Simulate an alert and its recovery

1. DingTalk alert notification

[Screenshot: DingTalk alert notification]

2. DingTalk alert-resolved notification

[Screenshot: DingTalk alert-resolved notification]
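
The post does not say how the alert and its recovery were triggered. One possible way to exercise the Alertmanager-to-DingTalk path without waiting for a certificate to expire is to post a synthetic alert to Alertmanager's v2 API. The sketch below assumes Alertmanager is reachable at `http://localhost:9093`; that address, the label values, and the function names are all hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone
from typing import Optional

ALERTMANAGER_URL = "http://localhost:9093"  # assumption: adjust to your setup
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_test_alert(instance: str, resolved: bool = False,
                     now: Optional[datetime] = None) -> list:
    """Build the JSON list expected by POST /api/v2/alerts."""
    now = now or datetime.now(timezone.utc)
    starts_at = now - timedelta(minutes=1)
    # An endsAt that is not in the future marks the alert as resolved.
    ends_at = now if resolved else now + timedelta(minutes=5)
    return [{
        "labels": {
            "alertname": "TestCertExpiry",  # hypothetical alert name
            "instance": instance,
            "severity": "test",  # hypothetical value
        },
        "annotations": {
            "summary": f"{instance} test alert",
            "description": f"{instance} synthetic alert for pipeline testing",
        },
        "startsAt": starts_at.strftime(TIME_FORMAT),
        "endsAt": ends_at.strftime(TIME_FORMAT),
    }]


def send_alerts(alerts: list, base_url: str = ALERTMANAGER_URL) -> int:
    """POST the alerts to Alertmanager and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


if __name__ == "__main__":
    print(send_alerts(build_test_alert("example.com:443")))
    # Later, send the same labels with resolved=True to trigger the recovery message.
```

Because Alertmanager identifies an alert by its label set, the resolved call has to use exactly the same labels as the firing call.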

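Summary

Overall the setup is fairly simple. The key is to understand the whole chain, from the Prometheus rule through Alertmanager and the DingTalk webhook to the DingTalk robot, and to be careful at each configuration step. A follow-up post will look more closely at how alerting works and when alerts fire.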
