Installing and configuring Prometheus + Grafana + Alertmanager

Environment:

Two servers:
pro_server: monitoring and visualization host, running Oracle Linux 7.3
exp_agent: monitored host, running Oracle Linux 7.3
Software versions:
prometheus-2.3.2.linux-amd64.tar.gz
alertmanager-0.15.2.linux-amd64.tar.gz
node_exporter-0.16.0.linux-amd64.tar.gz
grafana-5.1.4-1.x86_64.rpm


Before starting, stop the Linux firewall on both servers:

# systemctl stop firewalld
# systemctl disable firewalld
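
If turning the firewall off entirely is not acceptable, opening only the ports used in this guide is an alternative. The sketch below assumes firewalld is left running and that the default ports are used (9090 Prometheus, 9093 Alertmanager, 9100 node_exporter, 3000 Grafana). The function name open_ports and the DRY_RUN variable are made up for illustration; by default the function only prints the commands it would run.

```shell
#!/bin/sh
# open_ports PORT...: open the given TCP ports in firewalld.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 runs them (as root).
open_ports() {
    for port in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "firewall-cmd --permanent --add-port=${port}/tcp"
        else
            firewall-cmd --permanent --add-port="${port}/tcp"
        fi
    done
    # Permanent rules only take effect after a reload.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "firewall-cmd --reload"
    else
        firewall-cmd --reload
    fi
}
```

On pro_server this would be called as "DRY_RUN=0 open_ports 9090 9093 3000", and on exp_agent as "DRY_RUN=0 open_ports 9100".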

Installing the software

Installing Prometheus server
Install Prometheus Server on the monitoring host (pro_server):

[root@pro_server ~]# tar xf prometheus-2.3.2.linux-amd64.tar.gz
[root@pro_server ~]# cd prometheus-2.3.2.linux-amd64
[root@pro_server prometheus-2.3.2.linux-amd64]# ./prometheus  --config.file=prometheus.yml

Then open http://<server IP address>:9090 in a browser to verify that Prometheus was installed successfully; the Prometheus web UI should load.
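
The command above keeps Prometheus in the foreground, so the check can also be done from a second terminal instead of a browser. The sketch below polls Prometheus's /-/healthy endpoint with curl until it answers. The helper name wait_for_http is hypothetical, and localhost:9090 assumes the check runs on pro_server with the default port.

```shell
#!/bin/sh
# wait_for_http URL [TRIES]: poll URL once per second until it answers with an
# HTTP success status; give up after TRIES attempts (default 30).
wait_for_http() {
    url="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: drop the body
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Usage, from the Prometheus directory (log file name is an assumption):
#   nohup ./prometheus --config.file=prometheus.yml > prometheus.log 2>&1 &
#   wait_for_http http://localhost:9090/-/healthy && echo "Prometheus is up"
```

The same helper can be pointed at http://<agent IP>:9100/metrics to confirm that node_exporter is serving metrics.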

Installing node_exporter
Install node_exporter on the monitored host (exp_agent):

[root@exp_agent ~]# tar xf node_exporter-0.16.0.linux-amd64.tar.gz 
[root@exp_agent ~]# cd node_exporter-0.16.0.linux-amd64
[root@exp_agent node_exporter-0.16.0.linux-amd64]# ./node_exporter &

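Then add the node on the monitoring host.
Open #prometheus_path#/prometheus.yml and add a new job under scrape_configs with the node's IP address and port. YAML is sensitive to spaces and indentation, so keep the new entry aligned with the existing jobs.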

  - job_name: 'exporter'
    static_configs:
    - targets: ['10.18.34.72:9100']

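After saving the file, restart Prometheus (or send it SIGHUP) so that it reloads the configuration. Then open the Prometheus web page and check under Status -> Targets that the new node is listed.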
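
Prometheus reads prometheus.yml only at startup or when asked to reload, and a bad indent will stop it from loading the file. A minimal sketch of a safer edit cycle: validate the file with promtool (shipped in the same tarball), and only if that passes, send SIGHUP to the running process. The function name reload_prometheus and the PROMTOOL override are illustrative, not part of Prometheus.

```shell
#!/bin/sh
# reload_prometheus CONFIG: validate CONFIG with promtool, then ask the running
# Prometheus to re-read it via SIGHUP. Nothing is signalled if the check fails.
# PROMTOOL defaults to ./promtool, i.e. run this from the Prometheus directory.
reload_prometheus() {
    config="$1"
    if ! "${PROMTOOL:-./promtool}" check config "$config"; then
        echo "config check failed; Prometheus left untouched" >&2
        return 1
    fi
    if ! pid="$(pgrep -x prometheus)"; then
        echo "no running prometheus process found" >&2
        return 1
    fi
    kill -HUP $pid
}

# Usage, from the Prometheus directory:
#   reload_prometheus prometheus.yml
```

Because "promtool check config" also checks the rule files listed under rule_files, the same call can be reused after editing alert rules.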

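Installing Alertmanager
Install Alertmanager on the monitoring host (pro_server):

[root@pro_server ~]# tar xf alertmanager-0.15.2.linux-amd64.tar.gz
[root@pro_server ~]# cd alertmanager-0.15.2.linux-amd64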

Configuring alert rules

[root@pro_server alertmanager-0.15.2.linux-amd64]# mkdir /etc/prometheus
[root@pro_server alertmanager-0.15.2.linux-amd64]# vi /etc/prometheus/alert.rules

Add the following to alert.rules:

groups:
  - name: web.hook
    rules:
    # Alert for any instance that is unreachable for more than 1 minute.
    - alert: InstanceDown
      expr: up == 0
      for: 1m
      labels:
        severity: page
      annotations:
        summary: "Instance {{ $labels.instance }} down"
        description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."

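Then open #prometheus_path#/prometheus.yml and configure the alerting and rule_files sections. The stock prometheus.yml already contains both keys (partly commented out), so edit them in place rather than adding a second copy.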

alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - "localhost:9093"
rule_files:
  - /etc/prometheus/alert.rules

Finally, configure #alertmanagers_path#/alertmanager.yml:

global:
  smtp_smarthost: 'smtp.qq.com:587'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'password'  # not the mailbox password; use the mailbox's authorization code
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers:
  - name: 'web.hook'
    email_configs:
    - to: '[email protected]'  # mailbox that receives the alert emails
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
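
The steps so far extract and configure Alertmanager, but Prometheus can only deliver alerts to localhost:9093 if Alertmanager is actually running. A minimal sketch for starting it: the configuration is first validated with amtool, which should sit next to the alertmanager binary in the same tarball. The function name start_alertmanager, the AMTOOL override, and the log file name are assumptions for illustration.

```shell
#!/bin/sh
# start_alertmanager [DIR]: validate DIR/alertmanager.yml with amtool, then start
# Alertmanager in the background, logging to DIR/alertmanager.log.
# DIR defaults to the current directory; AMTOOL may point at another amtool binary.
start_alertmanager() {
    dir="${1:-.}"
    if ! "${AMTOOL:-$dir/amtool}" check-config "$dir/alertmanager.yml"; then
        echo "alertmanager.yml failed validation; not starting" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$dir" || exit 1
        nohup ./alertmanager --config.file=alertmanager.yml > alertmanager.log 2>&1 &
    )
}

# Usage, from the Alertmanager directory:
#   start_alertmanager .
```

Once it is up, the Alertmanager web UI should be reachable at http://<server IP address>:9093.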

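With Alertmanager running and Prometheus restarted so that it picks up the new configuration, test the alert path by killing the node_exporter process on the monitored host. Once the target has been down for the 1-minute period set by "for: 1m", the InstanceDown alert should be firing on the Alerts page of the Prometheus web UI.
Then check the mailbox for the alert email.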
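
To test only the email delivery, without taking node_exporter down, an alert can be posted to Alertmanager by hand. The sketch below assumes the v1 HTTP API of Alertmanager 0.15.x (later releases moved to /api/v2/alerts) and Alertmanager listening on localhost:9093. The function names, the ManualTest alert name, and the ALERTMANAGER_URL variable are hypothetical.

```shell
#!/bin/sh
# alert_payload NAME INSTANCE: print a one-element JSON alert list.
alert_payload() {
    printf '[{"labels":{"alertname":"%s","instance":"%s","severity":"page"},"annotations":{"summary":"test alert sent by hand"}}]\n' "$1" "$2"
}

# send_test_alert [NAME] [INSTANCE]: post the alert to Alertmanager's v1 API.
send_test_alert() {
    alert_payload "${1:-ManualTest}" "${2:-exp_agent}" |
        curl -fsS -H 'Content-Type: application/json' --data @- \
            "${ALERTMANAGER_URL:-http://localhost:9093}/api/v1/alerts"
}

# Usage on pro_server:
#   send_test_alert
```

With the route shown earlier, the email should go out after the 10-second group_wait, since the alert is routed to the web.hook receiver like any other.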

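Installing Grafana
The Grafana installation procedure depends on the operating system; for details see http://docs.grafana.org/installation/
On an RPM-based Linux system, for example: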

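# Option 1: let yum fetch the package and its dependencies
sudo yum install https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-5.1.4-1.x86_64.rpm
# Option 2: install the dependencies, then a locally downloaded RPM (adjust the path to your copy)
sudo yum install initscripts fontconfig
sudo rpm -Uvh /var/tmp/yum-root-G9wqx0/grafana-5.1.4-1.x86_64.rpm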

After installation, start the service:

# SysV init:
service grafana-server start
# systemd (the case on Oracle Linux 7):
systemctl daemon-reload
systemctl start grafana-server
systemctl status grafana-server   # check the service status

Then open http://<server IP address>:3000 in a browser to verify that Grafana was installed successfully; the Grafana login page should load.
Log in to get started (the default account is admin / admin).
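
Grafana still needs Prometheus registered as a data source before it can chart anything. This can be done in the web UI; the sketch below does it through Grafana's HTTP API instead. It assumes the default admin/admin login is still in place and that Prometheus runs on the same host at http://localhost:9090. The function names and the GRAFANA_URL and GRAFANA_AUTH variables are made up for illustration.

```shell
#!/bin/sh
# datasource_payload NAME URL: print the JSON body for a Prometheus data source.
datasource_payload() {
    printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

# add_prometheus_datasource [PROMETHEUS_URL]: register the data source in Grafana.
# GRAFANA_URL and GRAFANA_AUTH are assumed defaults; override them as needed.
add_prometheus_datasource() {
    datasource_payload "Prometheus" "${1:-http://localhost:9090}" |
        curl -fsS -u "${GRAFANA_AUTH:-admin:admin}" \
            -H 'Content-Type: application/json' --data @- \
            "${GRAFANA_URL:-http://localhost:3000}/api/datasources"
}

# Usage on pro_server:
#   add_prometheus_datasource
```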
