[Blog 688] How to monitor and alert on keepalived VIPs

1. The exporter used

https://github.com/mehdy/keepalived-exporter

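2. What the state values in the metrics mean

Note: the exporter keeps the state names in arrays of strings, and the numeric value a metric reports for a state is that state's index in the corresponding array.

For details, see: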

https://github.com/mehdy/keepalived-exporter/blob/master/internal/collector/parser.go
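
As a rough illustration of that index-to-name mapping, the shell sketch below turns a numeric VRRP state value into a readable name. The ordering used here (INIT, BACKUP, MASTER, FAULT) and the helper name `vrrp_state_name` are assumptions made for illustration; check the actual array order in the exporter source linked above before relying on it.

```shell
# Hypothetical helper: map a numeric VRRP state value to a name.
# ASSUMPTION: the exporter's state array is ordered INIT, BACKUP, MASTER, FAULT.
vrrp_state_name() {
  case "$1" in
    0) echo "INIT" ;;
    1) echo "BACKUP" ;;
    2) echo "MASTER" ;;
    3) echo "FAULT" ;;
    *) echo "UNKNOWN" ;;   # any value outside the assumed array
  esac
}
```

For example, `vrrp_state_name 2` prints `MASTER` under the assumed ordering.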

3. Prepare the keepalived-exporter binary

export VERSION=1.3.0
wget https://github.com/mehdy/keepalived-exporter/releases/download/v${VERSION}/keepalived-exporter-${VERSION}.linux-amd64.tar.gz
tar xvzf keepalived-exporter-${VERSION}.linux-amd64.tar.gz keepalived-exporter-${VERSION}.linux-amd64/keepalived-exporter
sudo mv keepalived-exporter-${VERSION}.linux-amd64/keepalived-exporter /usr/local/bin/

4. keepalived-exporter.service

# /usr/lib/systemd/system/keepalived-exporter.service
[Unit]
Description=Keepalived Exporter
After=network.target

[Service]
User=root
Type=simple
ExecStart=/usr/local/bin/keepalived-exporter
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
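
Once the unit file is in place, it can be loaded and started with `sudo systemctl daemon-reload` followed by `sudo systemctl enable --now keepalived-exporter` (assuming the file was saved as keepalived-exporter.service). To confirm the exporter is actually serving metrics, a small helper along these lines may be handy. The function name `metric_values` is hypothetical, and the sketch assumes samples are exposed without trailing timestamps.

```shell
# Hypothetical helper: print the value(s) of one metric from Prometheus
# text-format output read on stdin.
# $1 is the metric name; both "name value" and "name{labels} value" match.
metric_values() {
  awk -v name="$1" '
    /^#/ { next }                 # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)    # drop the label set, if any
      if (metric == name) print $NF   # last field is the sample value
    }'
}
```

A possible check, assuming the exporter listens on port 9165 on the local host: `curl -s http://localhost:9165/metrics | metric_values keepalived_up`. The alert rules later in this post treat a value of 1 as healthy.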

5. Configure scraping

  - job_name: keepalived
    static_configs:
      - targets: ['x.x.x.x:9165']

6. Alerting rules

- name: Keepalived down
  rules:
  - alert: KeepalivedDown
    expr: keepalived_up != 1
    for: 0s
    labels:
      severity: critical
    annotations:
      summary: "keepalived agent down"
      description: "{{ $labels.instance }} keepalived agent down"

- name: Keepalived script check failed
  rules:
  - alert: KeepalivedScriptCheckFailed
    expr: keepalived_script_status != 1
    for: 0s
    labels:
      severity: critical
    annotations:
      summary: "Keepalived script check failed"
      description: "{{ $labels.instance }} keepalived script check failed"

- name: Keepalived leader changed
  rules:
  - alert: KeepalivedLeaderChanged
    expr: keepalived_become_master_total > keepalived_become_master_total offset 15s
    for: 0s
    labels:
      severity: critical
    annotations:
      summary: "Keepalived leader changed"
      description: "{{ $labels.instance }} became the new leader"

- name: Keepalived leader released
  rules:
  - alert: KeepalivedLeaderReleased
    expr: keepalived_release_master_total > keepalived_release_master_total offset 15s
    for: 0s
    labels:
      severity: critical
    annotations:
      summary: "Keepalived leader released"
      description: "{{ $labels.instance }} released leadership"
