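1. Summary
This article describes how to use blackbox_exporter to collect website status, port availability, and similar information from monitored hosts, store it in Prometheus, and display it as dashboards in Grafana.
blackbox_exporter is one of the exporters officially provided by Prometheus. It collects monitoring data by probing targets over HTTP, DNS, TCP, and ICMP.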
2. blackbox_exporter use cases
HTTP test
TCP test
ICMP test
POST test
SSL certificate expiry time
3. Install blackbox_exporter
3.1 All blackbox_exporter releases are listed at:
https://github.com/prometheus/blackbox_exporter/releases
Taking Linux as an example, download the precompiled binary package and extract it:
$ wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.15.1/blackbox_exporter-0.15.1.linux-amd64.tar.gz
$ tar -xvf blackbox_exporter-0.15.1.linux-amd64.tar.gz
$ mv blackbox_exporter-0.15.1.linux-amd64 /usr/local/blackbox_exporter
3.2 Verify the installation
[root@izuf61mqd75uk09tjnh7dfz local]# cd blackbox_exporter/
[root@izuf61mqd75uk09tjnh7dfz blackbox_exporter]# ./blackbox_exporter --version
blackbox_exporter, version 0.15.1 (branch: HEAD, revision: 7dd86a593b5a2270e738be1654d9c112509e46ce)
build user: root@626ba8fd110c
build date: 20190917-12:31:25
go version: go1.13
3.3 Create a systemd service
$ vim /lib/systemd/system/blackbox_exporter.service
[Unit]
Description=blackbox_exporter
After=network.target
[Service]
User=root
Type=simple
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
If blackbox_exporter runs as a non-root user, the icmp prober requires the CAP_NET_RAW capability. Grant it by running the following commands on the blackbox_exporter executable:
$ cd /usr/local/blackbox_exporter
$ setcap cap_net_raw+ep blackbox_exporter
3.4 Start blackbox_exporter
$ systemctl daemon-reload
$ systemctl start blackbox_exporter
3.5 Verify that the service started (the default listening port is 9115)
$ systemctl status blackbox_exporter
$ netstat -lnpt|grep 9115
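Beyond checking that the port is open, one way to confirm that the exporter actually works is to request its /probe endpoint by hand, passing module and target as URL parameters. The sketch below assumes the exporter listens on 127.0.0.1:9115 and that the http_2xx module from the bundled blackbox.yml is available. probe_url is a hypothetical helper name, and https://example.com is only a placeholder target.

```shell
# Build the URL of a manual probe request.
# Usage: probe_url EXPORTER_ADDR MODULE TARGET
# (probe_url is a hypothetical helper, not part of blackbox_exporter.)
# Targets containing '&' or '?' would need URL-encoding first.
probe_url() {
  printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

# Assumed values: adjust to your environment.
url="$(probe_url 127.0.0.1:9115 http_2xx https://example.com)"
echo "$url"

# probe_success is 1 when the probe succeeded and 0 when it failed.
# The fallback message is printed when no such line comes back.
curl -s --max-time 10 "$url" | grep '^probe_success' \
  || echo "no probe_success line returned; is blackbox_exporter running?"
```

This is the same kind of request Prometheus sends when it scrapes the exporter, so a probe that succeeds here should also succeed once the job is configured.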
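4. Add blackbox_exporter to prometheus.yml
Add the following jobs under scrape_configs. In each job, relabel_configs copies the target address into the target URL parameter and then rewrites __address__ so that Prometheus scrapes blackbox_exporter rather than the target itself.
4.1 Monitor website status
$ vim /usr/local/prometheus/prometheus.yml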
  - job_name: web_status
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets: ['https://www.ssssss.cn']
        labels:
          instance: web_status
          group: web
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: 172.19.14.253:9115
Monitor host liveness:
$ vim /usr/local/prometheus/prometheus.yml
  - job_name: 'node_status'
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ['172.19.14.253']
        labels:
          instance: 'node_status'
          group: 'node'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      # - source_labels: [__param_target]
      #   target_label: instance
      - target_label: __address__
        replacement: 172.19.14.253:9115
Monitor host port liveness:
$ vim /usr/local/prometheus/prometheus.yml
  - job_name: 'port_status'
    metrics_path: /probe
    params:
      module: [tcp_connect]
    static_configs:
      - targets: ['172.19.14.253:3306','172.19.14.253:80']
        labels:
          instance: 'port_status'
          group: 'tcp'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      # - source_labels: [__param_target]
      #   target_label: instance
      - target_label: __address__
        replacement: 172.19.14.253:9115
4.2 Check that the configuration file is valid
$ cd /usr/local/prometheus
[root@iZuf6ioqjurm6w0x1o7exjZ prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: 0 rule files found
Restart Prometheus to load the new configuration:
[root@iZuf6ioqjurm6w0x1o7exjZ prometheus]# systemctl restart prometheus
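A full restart is not the only way to pick up the new configuration. Prometheus also re-reads its configuration when it receives SIGHUP, or on an HTTP POST to /-/reload if it was started with --web.enable-lifecycle. The sketch below wraps both approaches. reload_prometheus is a hypothetical helper, and the address 127.0.0.1:9090 assumes the default listening port.

```shell
# Reload the Prometheus configuration without restarting the process.
# Usage: reload_prometheus [ADDR]   (hypothetical helper)
reload_prometheus() {
  addr="${1:-127.0.0.1:9090}"   # assumed default Prometheus address
  # The HTTP endpoint only works when Prometheus was started with
  # --web.enable-lifecycle; curl -f makes HTTP errors count as failure.
  if curl -sf --max-time 10 -X POST "http://${addr}/-/reload"; then
    echo "reloaded via HTTP"
    return 0
  fi
  # Fall back to SIGHUP, which Prometheus handles as a reload request.
  pid="$(pidof prometheus)" || { echo "no prometheus process found" >&2; return 1; }
  kill -HUP $pid && echo "reloaded via SIGHUP"
}

# Usage (not run here): reload_prometheus 127.0.0.1:9090
```

A reload keeps the process running, which avoids a gap in scraping. Running promtool check config first remains worthwhile, since an invalid file should not be loaded either way.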
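5. Add blackbox_exporter monitoring data to Grafana
5.1 Import the blackbox_exporter dashboard template
This is dashboard 9965; select Prometheus as the data source. Download: https://grafana.com/grafana/dashboards/9965
The dashboard requires the pie chart panel plugin. Install it, then restart Grafana for it to take effect: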
$ grafana-cli plugins install grafana-piechart-panel
$ service grafana-server restart
Note: check that the plugin was installed into Grafana's plugin directory.
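5.2 Access Grafana
The Prometheus configuration below additionally enables Alertmanager and loads alert rules, so that a failed probe raises an alert: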
[root@iZ prometheus]# more prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 127.0.0.1:9093
rule_files:
  - "rules/*.yml"
[root@iZ prometheus]# more rules/blackbox_exporter.yml
groups:
  - name: blackbox_network_stats
    rules:
      - alert: blackbox_network_stats
        expr: probe_success == 0
        for: 1m # fire only if the value stays 0 for 1 minute
        labels:
          severity: critical
        annotations:
          description: 'Website/endpoint {{ $labels.instance }} in job {{ $labels.job }} has been down for more than one minute.'
          summary: 'Website/endpoint {{ $labels.instance }} is down!'
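The use cases in section 2 include SSL certificate expiry, which the probe_success rule above does not cover. For HTTPS targets, the HTTP prober also exports probe_ssl_earliest_cert_expiry, the expiry time of the earliest-expiring certificate as a Unix timestamp, so a rule expression along the lines of probe_ssl_earliest_cert_expiry - time() < 86400 * 30 would flag certificates with fewer than 30 days left. As a rough sketch, the same arithmetic can be checked by hand from the shell. days_until_expiry is a hypothetical helper, and the exporter address and target in the commented example are assumptions.

```shell
# Convert a probe_ssl_earliest_cert_expiry value into whole days remaining.
# Usage: days_until_expiry EXPIRY_EPOCH [NOW_EPOCH]
# awk is used because the exporter may print the value in exponent
# notation (for example 1.7e+09), which shell arithmetic cannot parse.
days_until_expiry() {
  awk -v expiry="$1" -v now="${2:-$(date +%s)}" \
    'BEGIN { printf "%d\n", (expiry - now) / 86400 }'
}

# Example with fixed numbers: 864000 seconds is exactly 10 days.
days_until_expiry 1.7e+09 1699136000   # prints 10

# Against a live exporter (assumed address and placeholder target):
# expiry="$(curl -s 'http://127.0.0.1:9115/probe?module=http_2xx&target=https://example.com' \
#   | awk '/^probe_ssl_earliest_cert_expiry/ { print $2 }')"
# days_until_expiry "$expiry"
```

The 30-day threshold is only an illustration; pick a window that leaves enough time to renew the certificate.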
References: https://www.centoscn.vip/8412.html
https://blog.csdn.net/qq_43190337/article/details/100577728
https://blog.csdn.net/qq_25934401/article/details/84325356