Deploying Prometheus Monitoring on Linux (2): Configuring Rules

First, make sure the Prometheus service is running.


Create or edit the rules file:

vim node_rules.yml

Note: do not use the Tab key when writing this file; indent with spaces only.
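Prometheus only evaluates a rules file that its main configuration points to. The fragment below is a minimal sketch of that wiring, assuming node_rules.yml sits in the same directory as prometheus.yml; the evaluation_interval value is an illustrative assumption, not taken from the original setup.

```yaml
# prometheus.yml (fragment, sketch only)
global:
  evaluation_interval: 15s   # how often rules are evaluated (assumed value)

rule_files:
  - "node_rules.yml"         # assumed to sit next to prometheus.yml
```

After changing either file, Prometheus has to reread its configuration. It does so on SIGHUP, or on an HTTP POST to /-/reload when it was started with --web.enable-lifecycle.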


Visit localhost:9090/rules to confirm that the rules have been loaded.

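If the rules still have not taken effect after a reload, restart the service. First find the Prometheus process ID:

netstat -lntp | grep prom

Stop the process. A plain kill sends SIGTERM, which lets Prometheus shut down cleanly; fall back to kill -9 only if the process does not exit:

kill <PID>

Start it again in the background:

./prometheus &

Then visit localhost:9090/rules again.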


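CPU usage above 80%

100 - (avg(irate(node_cpu_seconds_total{mode='idle'}[5m])) by (instance) * 100) > 80

The irate of the idle counter is the fraction of time each core spent idle; averaging it per instance and subtracting from 100 gives the busy percentage.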

Memory usage

100 - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100
 


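Disk usage

(node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_free_bytes{fstype=~"xfs|ext4"}) / node_filesystem_size_bytes{fstype=~"xfs|ext4"} * 100

Here (size - free) / size is the used fraction of each filesystem, so multiplying by 100 gives the usage percentage directly.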


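Node status

The up metric

Another useful metric for watching a specific target is up. Prometheus sets it to 1 when the scrape of an instance succeeds and to 0 when the scrape fails.

It therefore shows whether a node is healthy. A value of 0 means that either the node exporter service on that server has stopped or the node itself is down, so it should be checked immediately.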

- alert: NodeDown
  expr: up == 0
  for: 0m
  labels:
    severity: serious
  annotations:
    summary: "NodeDown"

The following alerts can be configured from the same template:

MysqlDown

RedisDown

NginxDown

JavaDown
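One caveat when copying the template: a bare up == 0 matches every scrape target, so each of these alerts would fire whenever any target goes down. A possible way to tell them apart is to narrow each expression by the job label. The sketch below assumes jobs named "mysql" and "redis"; these names are hypothetical and have to match the job_name entries in your own scrape_configs.

```yaml
# Sketch only: the job values are assumptions, not taken from the original setup.
- alert: MysqlDown
  expr: up{job="mysql"} == 0      # only targets scraped by the "mysql" job
  for: 0m
  labels:
    severity: serious
  annotations:
    summary: "MysqlDown on {{ $labels.instance }}"
- alert: RedisDown
  expr: up{job="redis"} == 0      # only targets scraped by the "redis" job
  for: 0m
  labels:
    severity: serious
  annotations:
    summary: "RedisDown on {{ $labels.instance }}"
```

NginxDown and JavaDown would follow the same pattern with their own job values.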


groups: 
- name: Hoststate-alert()
  rules: 
  - alert: RedisDown
    expr: up == 0
    for: 0m
    labels: 
      status: critical
    annotations: 
      summary: "Redisdown"
      description: "Redis instance is down"
  - alert: MysqlDown
    expr: up == 0
    for: 0m
    labels: 
      status: critical
    annotations: 
      summary: "Mysqldown"
      description: "Mysql instance is down"
  - alert: NginxDown
    expr: up == 0
    for: 0m
    labels: 
      status: critical
    annotations: 
      summary: "Nginxdown"
      description: "Nginx instance is down"
  - alert: NodeDown
    expr: up == 0
    for: 0m
    labels: 
      status: critical
    annotations: 
      summary: "Nodedown"
      description: "Node instance is down"
  - alert: JavaDown
    expr: up == 0
    for: 0m
    labels: 
      status: critical
    annotations: 
      summary: "Javadown"
      description: "Java instance is down"
  - alert: CPUusage
    expr: 100-(avg(irate(node_cpu_seconds_total{mode='idle'}[5m]))by(instance) * 100) > 80
    for: 5m
    labels: 
      status: critical
    annotations: 
      summary: "{{$labels.instance}} CPU usage high"
      description: "{{$labels.instance}} CPU usage above 80% (current usage: {{$value}})"
  - alert: Memoryusage
    expr: 100 - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes)/ node_memory_MemTotal_bytes * 100 > 80
    for: 5m
    labels: 
      status: critical
    annotations: 
      summary: " Memory usage high"
      description: "Memory usage above 80%.( current usage:{{$value}})"
  - alert: Diskusage
    expr: (node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_free_bytes{fstype=~"xfs|ext4"}) / node_filesystem_size_bytes{fstype=~"xfs|ext4"} * 100 > 80
    for: 5m
    labels: 
      status: critical
    annotations: 
      summary: "Disk usage high"
      description: "Disk usage above 80% ( current usage:{{$value}})"
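
Before restarting Prometheus, the file can be checked for syntax errors with promtool check rules node_rules.yml. Going one step further, promtool can also run unit tests against the rules. The following is a rough sketch of such a test for the NodeDown alert; the file name node_rules_test.yml and the job and instance label values are hypothetical, chosen only for illustration.

```yaml
# node_rules_test.yml (hypothetical file name, sketch only)
rule_files:
  - node_rules.yml

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Simulated target: healthy for two samples, then down.
      - series: 'up{job="node", instance="localhost:9100"}'
        values: '1 1 0 0 0'
    alert_rule_test:
      - eval_time: 3m
        alertname: NodeDown
        exp_alerts:
          - exp_labels:
              status: critical
              job: node
              instance: localhost:9100
            exp_annotations:
              summary: "Nodedown"
              description: "Node instance is down"
```

It would be run with promtool test rules node_rules_test.yml. The expected labels and annotations have to match the rule text exactly, so they need adjusting whenever the rule is edited.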
