Contents
Preface
1. Deploying Prometheus + Grafana
2. Configuring the monitoring data source
3. Configuring monitored services
4. Configuring alerting
Summary
Prometheus: scrapes metrics data and stores it.
Grafana: displays the data through ready-made templates or custom dashboards.
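To make "scraping metrics" more concrete: exporters such as node-exporter serve plain text over HTTP, and Prometheus parses that text on every scrape. The following is only a simplified sketch of that parsing step, not Prometheus's own parser. The function name parse_metrics and the EXAMPLE text are made up for illustration, and timestamps and escape handling are ignored.

```python
import re

# Matches lines such as: metric_name{label="value",other="x"} 12.5
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse text exposition lines into (name, labels, value) tuples.

    Simplified on purpose: "# HELP" / "# TYPE" comment lines are skipped
    and an optional trailing timestamp is ignored.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ''))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical scrape output, shaped like what node-exporter returns.
EXAMPLE = '''\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
up 1
'''

if __name__ == '__main__':
    for sample in parse_metrics(EXAMPLE):
        print(sample)
```

Each parsed sample becomes one point in a time series identified by the metric name plus its labels, which is what Grafana later queries.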
Create docker-compose.yml:
version: '2'

networks:
  monitor:
    driver: bridge

# Application services
services:
  # Monitoring server
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    hostname: prometheus
    restart: always
    volumes:
      - /mydata/pan/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - /mydata/pan/node_down/node_down.yml:/etc/prometheus/node_down.yml
    ports:
      - "9090:9090"
    networks:
      - monitor

  # Alerting component
  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    hostname: alertmanager
    restart: always
    volumes:
      - /mydata/pan/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - "9093:9093"
    networks:
      - monitor

  # Dashboards
  grafana:
    image: grafana/grafana
    container_name: grafana
    hostname: grafana
    restart: always
    ports:
      - "3000:3000"
    networks:
      - monitor

  # Collects host (server) metrics
  node-exporter:
    image: quay.io/prometheus/node-exporter
    container_name: node-exporter
    hostname: node-exporter
    restart: always
    ports:
      - "9100:9100"
    networks:
      - monitor

  # Monitors Docker containers
  cadvisor:
    image: google/cadvisor:latest
    container_name: cadvisor
    hostname: cadvisor
    restart: always
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"
    networks:
      - monitor
Create alertmanager.yml and configure the mailboxes used to send and receive alert emails:
global:
  smtp_smarthost: 'smtp.qq.com:465'       # QQ mail SMTP server
  smtp_from: '[email protected]'           # sender address
  smtp_auth_username: '[email protected]'  # sender user name, i.e. your mailbox address
  smtp_auth_password: 'xxxxx'             # sender mailbox password
  smtp_require_tls: false                 # do not require STARTTLS

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 10m
  receiver: live-monitoring

receivers:
  - name: 'live-monitoring'
    email_configs:
      - to: '[email protected]'             # recipient address
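The group_by: ['alertname'] setting above means that alerts sharing the same alertname are bundled into a single notification. The sketch below is only a simplified model of that idea, not Alertmanager's implementation. The function group_alerts and the sample alerts are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Group alert label sets by the values of the labels listed in group_by."""
    groups = defaultdict(list)
    for labels in alerts:
        # Alerts with identical values for every group_by label share a key.
        key = tuple((name, labels.get(name, '')) for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# Two hypothetical InstanceDown alerts for different instances.
ALERTS = [
    {'alertname': 'InstanceDown', 'instance': '192.168.2.170:9100', 'job': 'node'},
    {'alertname': 'InstanceDown', 'instance': '192.168.2.170:8080', 'job': 'cadvisor'},
]

if __name__ == '__main__':
    by_name = group_alerts(ALERTS, ['alertname'])
    by_name_and_instance = group_alerts(ALERTS, ['alertname', 'instance'])
    print(len(by_name), 'group(s) when grouping by alertname')
    print(len(by_name_and_instance), 'group(s) when grouping by alertname and instance')
```

With the configuration in this article both alerts would land in one group, so a single email covers both instances; adding instance to group_by would produce one notification per instance.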
Create prometheus.yml:
global:
  scrape_interval: 15s      # scrape targets every 15s (the default is 1 minute)
  evaluation_interval: 15s  # evaluate rules every 15s (the default is 1 minute)

# Alertmanager settings
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['192.168.2.170:9093']
        # - alertmanager:9093

# Rule files are loaded once and then evaluated periodically according to the global 'evaluation_interval'.
rule_files:
  - "node_down.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# Scrape configuration:
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['192.168.2.170:9090']

  - job_name: 'cadvisor'
    static_configs:
      - targets: ['192.168.2.170:8080']

  - job_name: 'node'
    scrape_interval: 8s
    static_configs:
      - targets: ['192.168.2.170:9100']
Add the alert rule file node_down.yml (it monitors the Prometheus targets):
groups:
  - name: node_down
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          user: test
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
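The for: 1m clause means the alert does not fire the moment up == 0 is first seen; it stays pending until the condition has held for a full minute of rule evaluations. The following sketch simulates that behaviour under the 15s evaluation_interval configured earlier. The function alert_states is hypothetical and only models the state transitions; it is not Prometheus code.

```python
def alert_states(up_samples, eval_interval_s=15, for_s=60):
    """Return the alert state after each rule evaluation.

    up_samples holds the value of `up` seen at successive evaluations,
    spaced eval_interval_s seconds apart.
    """
    states = []
    active_since = None  # evaluation time at which `up == 0` first held
    for index, up in enumerate(up_samples):
        now = index * eval_interval_s
        if up != 0:
            # Condition no longer holds: the alert resets.
            active_since = None
            states.append('inactive')
            continue
        if active_since is None:
            active_since = now
        held_for = now - active_since
        states.append('firing' if held_for >= for_s else 'pending')
    return states


if __name__ == '__main__':
    # Target is up, then down for five evaluations, then back up.
    print(alert_states([1, 0, 0, 0, 0, 0, 1]))
```

In this model a target that recovers before the minute has elapsed never reaches the firing state, which is how for: suppresses alerts caused by brief scrape failures.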
Run docker-compose:
# Pull the images, then create and start the containers
docker-compose up -d
# Stop and remove the containers
docker-compose down
Prometheus address: http://192.168.2.170:9090/targets
Grafana address: http://192.168.2.170:3000
(On first access both the username and the password are admin; after the first login you can set a new password.)
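Besides opening the /targets page in a browser, the same information is available from the Prometheus HTTP API at /api/v1/targets. The sketch below lists targets that are not healthy. The helper names and the trimmed SAMPLE payload are illustrative assumptions; a real response carries more fields per target.

```python
import json
from urllib.request import urlopen


def unhealthy_targets(payload):
    """Return (job, instance, lastError) for every active target whose health is not 'up'."""
    result = []
    for target in payload['data']['activeTargets']:
        if target['health'] != 'up':
            labels = target['labels']
            result.append((labels.get('job'), labels.get('instance'),
                           target.get('lastError', '')))
    return result


def fetch_targets(base_url='http://192.168.2.170:9090'):
    """Fetch the target list from a running Prometheus server."""
    with urlopen(base_url + '/api/v1/targets', timeout=5) as response:
        return json.load(response)


# Trimmed, made-up payload with the same shape as the API response.
SAMPLE = {
    'status': 'success',
    'data': {
        'activeTargets': [
            {'labels': {'job': 'node', 'instance': '192.168.2.170:9100'},
             'health': 'up', 'lastError': ''},
            {'labels': {'job': 'cadvisor', 'instance': '192.168.2.170:8080'},
             'health': 'down', 'lastError': 'connection refused'},
        ],
    },
}

if __name__ == '__main__':
    # Against a live server: unhealthy_targets(fetch_targets())
    print(unhealthy_targets(SAMPLE))
```

An empty result means every configured target was scraped successfully at the last attempt.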
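Open Grafana and create a Prometheus data source (other data sources such as MySQL or Elasticsearch can be configured as needed).
Select Prometheus 2.0 Stats.
Enter the Prometheus address.
Save.
Grafana templates can be browsed and downloaded at: Grafana: The open observability platform | Grafana Labs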
docker-compose file for node-exporter, deployed on each server to be monitored:
version: '2'

networks:
  monitor:
    driver: bridge

services:
  node-exporter:
    image: quay.io/prometheus/node-exporter
    container_name: node-exporter
    hostname: node-exporter
    restart: always
    ports:
      - "9100:9100"
    networks:
      - monitor
Edit prometheus.yml and add the ip:port of each server to targets (the port is the node-exporter port):
scrape_configs:
  - job_name: 'node'
    scrape_interval: 8s
    static_configs:
      - targets: ['192.168.2.170:9100','192.168.2.169:9100','192.168.1.69:9100']
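When the same exporter runs on many hosts, typing the targets list by hand becomes error-prone. As a small convenience sketch, the list can be generated from the host addresses and the exporter port. Both helper functions here are hypothetical and simply produce text to paste into prometheus.yml.

```python
def exporter_targets(hosts, port):
    """Build 'ip:port' target strings for one exporter port across several hosts."""
    return [f'{host}:{port}' for host in hosts]


def render_targets_line(targets):
    """Render the list in the inline YAML style used in prometheus.yml."""
    quoted = ','.join(f"'{target}'" for target in targets)
    return f'- targets: [{quoted}]'


HOSTS = ['192.168.2.170', '192.168.2.169', '192.168.1.69']

if __name__ == '__main__':
    print(render_targets_line(exporter_targets(HOSTS, 9100)))  # node-exporter
    print(render_targets_line(exporter_targets(HOSTS, 8080)))  # cAdvisor
```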
Restart the Prometheus service and open http://192.168.2.170:9090/targets to check that the targets are running correctly.
docker-compose file for cAdvisor, deployed on each Docker host to be monitored:
version: '2'

networks:
  monitor:
    driver: bridge

services:
  cadvisor:
    image: google/cadvisor:latest
    container_name: cadvisor
    hostname: cadvisor
    restart: always
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"
    networks:
      - monitor
Edit prometheus.yml and add the ip:port of each Docker host to targets (the port is the cAdvisor port):
scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['192.168.2.170:8080','192.168.2.169:8080','192.168.1.69:8080']
Restart the Prometheus service and open http://192.168.2.170:9090/targets to check that the targets are running correctly.
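For a Spring Boot project, add the following dependencies (shown here in Maven pom.xml form):
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>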
Add the following configuration to the project's application file:
spring:
  application:
    name: test
server:
  port: 11111
management:
  endpoints:
    web:
      exposure:
        # Expose Actuator's /actuator/prometheus endpoint
        include: 'prometheus'
  metrics:
    tags:
      application: ${spring.application.name}
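Note: if the project uses a security framework, add the path /actuator/prometheus to its whitelist.
Startup log line showing that the endpoint is exposed:
o.s.b.a.e.web.EndpointLinksResolver : Exposing 1 endpoint(s) beneath base path '/actuator'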
Edit prometheus.yml:
scrape_configs:
  - job_name: 'custom-job-name'  # any name you choose
    scrape_interval: 5s
    metrics_path: '/xxxx/actuator/prometheus'  # if the project sets server.servlet.context-path, prefix the path with it (xxxx)
    static_configs:
      - targets: ['192.168.1.69:18077','192.168.1.69:18078']
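The metrics_path value depends on whether the application sets server.servlet.context-path. The sketch below shows one way to derive it and the resulting scrape URL. The helpers metrics_path and scrape_url are hypothetical, and xxxx stands for whatever context path the project actually uses.

```python
def metrics_path(context_path=''):
    """Prefix /actuator/prometheus with the servlet context path, if any."""
    prefix = context_path.strip('/')
    return ('/' + prefix if prefix else '') + '/actuator/prometheus'


def scrape_url(target, context_path=''):
    """Full URL Prometheus would request for one 'ip:port' target."""
    return f'http://{target}{metrics_path(context_path)}'


if __name__ == '__main__':
    print(metrics_path())                            # no context path configured
    print(metrics_path('/xxxx'))                     # context path /xxxx
    print(scrape_url('192.168.1.69:18077', '/xxxx'))
```

Requesting the printed URL with a browser or curl is a quick way to confirm the endpoint is reachable before restarting Prometheus.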
Restart the Prometheus service and open http://192.168.2.170:9090/targets to check that the targets are running correctly.
Monitoring MySQL (template ID: 7362)
Install mysqld-exporter on the server whose MySQL instance is to be monitored by adding this service to the docker-compose file:
services:
  mysqld-exporter:
    image: prom/mysqld-exporter
    container_name: mysqld-exporter
    hostname: mysqld-exporter
    restart: always
    ports:
      - "9104:9104"
    environment:
      - DATA_SOURCE_NAME=root:root@(192.168.2.169:3306)/ # username:password@(ip:port)
    networks:
      - monitor
Edit prometheus.yml:
scrape_configs:
  - job_name: 'mysql-exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['192.168.2.170:9104']
Restart the Prometheus service and open http://192.168.2.170:9090/targets to check that the targets are running correctly.
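To be added later....
Overall, the monitoring flow is as follows: the Prometheus configuration file defines how metrics are scraped and stored; Grafana is then configured with the corresponding Prometheus data source and a suitable template to display the data.
This article is only a record of my own learning, kept for later review.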