3. Installing Alertmanager and configuring alert rules

Part 3: Installing Alertmanager

1. Download and install

https://prometheus.io/download/

tar -zxvf alertmanager-0.15.2.linux-amd64.tar.gz && mv alertmanager-0.15.2.linux-amd64 alertmanager

Edit alertmanager.yml (the rest of this guide assumes the extracted directory is placed at /apps/alertmanager)

global:

  smtp_smarthost: 'smtp.163.com:25'

  smtp_from: '[email protected]'

  smtp_auth_username: '[email protected]'

  smtp_auth_password: 'tong161430222'

  smtp_require_tls: false

#templates:

#  - '/apps/alertmanager/template/*.tmpl'

route:

  group_by: ['alertname', 'cluster', 'service']

  group_wait: 30s

  group_interval: 5m

  repeat_interval: 10m

  receiver: default-receiver

receivers:

- name: 'default-receiver'

  email_configs:

  - to: '[email protected]'

    html: '{{ template "alert.html" . }}'

    headers: { Subject: "[WARN] Alert email test" }

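Note: the alert email can use the built-in default template or a custom one. The html line above refers to a template named alert.html, which must be defined in a file loaded through the templates setting. For a custom template, uncomment templates and point it at your own template location; to use the default, remove the html line.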
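
Before starting the service, the edited file can be checked for syntax errors. The following is a minimal sketch, assuming the amtool binary shipped in the release tarball sits next to alertmanager in /apps/alertmanager and supports the check-config subcommand; the function name check_am_config and the AM_DIR variable are made up for this example.

```shell
#!/bin/sh
# Validate an Alertmanager config file before (re)starting the service.
# Assumption: amtool lives beside the alertmanager binary in AM_DIR.
AM_DIR="${AM_DIR:-/apps/alertmanager}"

check_am_config() {
    # Default to the config path used in this guide.
    config="${1:-$AM_DIR/alertmanager.yml}"
    if [ ! -x "$AM_DIR/amtool" ]; then
        echo "amtool not found in $AM_DIR" >&2
        return 2
    fi
    # amtool exits non-zero if the file cannot be parsed.
    "$AM_DIR/amtool" check-config "$config"
}
```

Calling check_am_config with no arguments checks /apps/alertmanager/alertmanager.yml; a non-zero exit status means the file should be fixed before the service is started.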


Create a user account to run the service, then create the systemd unit file

# useradd prometheus

# chown -R prometheus:prometheus /apps/alertmanager

# vim /usr/lib/systemd/system/alertmanager.service

[Unit]

Description=Alertmanager

After=network.target

[Service]

Type=simple

User=prometheus

ExecStart=/apps/alertmanager/alertmanager --config.file=/apps/alertmanager/alertmanager.yml --storage.path=/apps/alertmanager/data

Restart=on-failure

[Install]

WantedBy=multi-user.target

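Reload systemd so it picks up the new unit file, then enable and start the service

# systemctl daemon-reload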

# systemctl enable alertmanager.service

# systemctl start alertmanager.service

Access the web UI

Address: http://ip:9093 (replace ip with the address of the Alertmanager host)

2. Connect Prometheus to Alertmanager

Edit prometheus.yml and change the following settings

# Alertmanager configuration

alerting:

  alertmanagers:

  - static_configs:

    - targets: ["localhost:9093"]

      #- alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.

rule_files:

  - "/apps/prometheus/node_down.yml"

  - "/apps/prometheus/memory_over.yml"

  # - "second_rules.yml"

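Save the file and restart Prometheus

Rule files (these are loaded through rule_files in prometheus.yml; after changing them, Prometheus must be restarted to load the changes)

Memory alert rule file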

cat memory_over.yml

groups:

- name: memory alert rules

  rules:

  - alert: NodeMemoryUsage

    expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes+node_memory_Buffers_bytes+node_memory_Cached_bytes )) / node_memory_MemTotal_bytes * 100 > 80

    for: 1m

    labels:

      user: prometheus

    annotations:

      summary: "{{$labels.instance}}: High Memory usage detected"

      description: "{{$labels.instance}}: Memory usage is above 80% (current value is:{{ $value }})"
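
To see what the expression in this rule computes, the same arithmetic can be done by hand. This is only an illustrative sketch: the function name mem_used_pct and the sample numbers are hypothetical and are not taken from a real host.

```shell
#!/bin/sh
# Evaluate the rule's memory formula by hand:
#   (total - (free + buffers + cached)) / total * 100
# Arguments: total, free, buffers, cached (same unit for all four).
mem_used_pct() {
    awk -v total="$1" -v free="$2" -v buffers="$3" -v cached="$4" \
        'BEGIN { printf "%.1f\n", (total - (free + buffers + cached)) / total * 100 }'
}

# Hypothetical 8 GiB host, values in MiB: 600 free, 200 buffers, 1200 cached.
mem_used_pct 8192 600 200 1200   # prints 75.6
```

With these sample values the result stays under the 80 threshold, so the rule would not fire. The alert is sent only once the value has stayed above 80 for the full 1m given in the for field.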

Instance-down alert rule file

cat node_down.yml

groups:

- name: node liveness alert rules

  rules:

  - alert: InstanceDown

    expr: up == 0

    for: 1m

    labels:

      user: prometheus

    annotations:

      summary: "Instance {{ $labels.instance }} down"

      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
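
With prometheus.yml and both rule files in place, they can be validated before Prometheus is restarted. The following is a sketch, assuming Prometheus 2.x is installed in /apps/prometheus (the directory the rule file paths point to) with promtool beside it; the function name check_prometheus_config and the PROM_DIR variable are made up for this example.

```shell
#!/bin/sh
# Validate prometheus.yml and its rule files before restarting Prometheus.
# Assumption: Prometheus 2.x lives in PROM_DIR with promtool beside it.
PROM_DIR="${PROM_DIR:-/apps/prometheus}"

check_prometheus_config() {
    if [ ! -x "$PROM_DIR/promtool" ]; then
        echo "promtool not found in $PROM_DIR" >&2
        return 2
    fi
    # "check config" also checks the files listed under rule_files.
    "$PROM_DIR/promtool" check config "$PROM_DIR/prometheus.yml"
}
```

Running check_prometheus_config and restarting only when it exits with status 0 catches YAML indentation mistakes and invalid expressions before they prevent Prometheus from starting.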

3. Verify the result

Lower the memory alert threshold so that the alert fires; the resulting email looks like this:

[Figure 1: screenshot of the alert email]
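
The email path can also be exercised without touching any rule by pushing a hand-made alert straight to Alertmanager. This is a sketch under assumptions: it posts to the /api/v1/alerts endpoint of the 0.15.x release installed above (newer releases use /api/v2/alerts instead), and the function names, the alert name TestAlert and the instance label test-instance are all hypothetical.

```shell
#!/bin/sh
# Build a one-element JSON array describing a test alert.
# The alert name and labels are placeholders for this example.
build_test_alert() {
    name="${1:-TestAlert}"
    printf '[{"labels":{"alertname":"%s","instance":"test-instance","user":"prometheus"},"annotations":{"summary":"Manual test alert"}}]' "$name"
}

# POST the test alert to Alertmanager.
# Assumption: the v1 API of Alertmanager 0.15.x at the given base URL.
send_test_alert() {
    url="${1:-http://localhost:9093}"
    build_test_alert "${2:-TestAlert}" |
        curl -sS -X POST -H 'Content-Type: application/json' \
            --data-binary @- "$url/api/v1/alerts"
}
```

Running send_test_alert http://localhost:9093 should make the alert appear in the web UI. Since the route sends everything to default-receiver, the email should follow once the 30s group_wait has passed.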
