Monitoring Elasticsearch cluster health with Zabbix

1. Elasticsearch provides an API that reports cluster health. Request http://url:9200/_cluster/health?pretty and, like other Elasticsearch APIs, cluster-health returns a JSON response.

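Field breakdown:

"cluster_name": "my-es", # cluster name

"status": "yellow", # cluster health: green when healthy, yellow when replica shards are unassigned, red when primary shards are unassigned

"timed_out": false, # whether the health request timed out

"number_of_nodes": 1, # number of nodes in the cluster

"number_of_data_nodes": 1, # number of data nodes

"active_primary_shards": 15, # number of active primary shards

"active_shards": 15, # number of active shards (primaries and replicas)

"relocating_shards": 0, # number of shards being relocated

"initializing_shards": 0, # number of shards being initialized

"unassigned_shards": 15, # shards that exist in the cluster state but are not allocated to any node

"delayed_unassigned_shards": 0, # shards whose allocation to a node has been delayed

"number_of_pending_tasks": 0, # pending cluster-level tasks on the master node, such as creating indices and allocating shards

"number_of_in_flight_fetch": 0, # number of unfinished fetches

"task_max_waiting_in_queue_millis": 0, # how long, in milliseconds, the oldest pending task has been waiting

"active_shards_percent_as_number": 50 # active shards as a percentage of all shards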
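Numeric values are usually easier to use in Zabbix trigger expressions than strings. The following is a minimal sketch of turning the status field above into a number. The helper name `status_to_code` and the 0/1/2/3 codes are illustrative assumptions, not something Elasticsearch or Zabbix defines.

```python
# Hypothetical helper: the numeric codes are an assumed convention,
# not part of the Elasticsearch or Zabbix APIs.
STATUS_CODES = {"green": 0, "yellow": 1, "red": 2}


def status_to_code(health):
    """Return a numeric code for the "status" field of a cluster-health response.

    Unknown or missing statuses map to 3 so a trigger can catch them as well.
    """
    return STATUS_CODES.get(health.get("status"), 3)


# Sample response trimmed to a few of the fields listed above.
sample = {"cluster_name": "my-es", "status": "yellow", "unassigned_shards": 15}
print(status_to_code(sample))  # prints 1
```

With such a mapping, a trigger could fire on any value greater than 0 instead of comparing strings.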

2. Write a collection script to fetch the cluster status

vi es.py

# -*- coding: utf-8 -*-
import json
import sys

import requests

# Browser-style User-Agent kept from the original script
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36"}
# Query the cluster-health API and decode the JSON body
response = requests.get("http://192.58.23.46:9200/_cluster/health", headers=headers)
s = json.loads(response.content.decode())
# The field to report is passed as the first command-line argument
parm = sys.argv[1]

itemlist = ["cluster_name", "status", "timed_out", "number_of_nodes", "number_of_data_nodes", "active_primary_shards", "active_shards", "relocating_shards", "initializing_shards", "unassigned_shards", "delayed_unassigned_shards", "number_of_pending_tasks", "number_of_in_flight_fetch", "task_max_waiting_in_queue_millis", "active_shards_percent_as_number"]

# Reject unknown field names, otherwise print the requested value
if parm not in itemlist:
    print("parm failed")
    sys.exit(1)
else:
    print(s[parm])
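The script above has no request timeout and prints booleans as Python text (`False`). Below is a sketch of a slightly more defensive variant, assuming Python 3 and the same endpoint. It swaps `requests` for the standard-library `urllib.request` so it runs without extra packages. The names `fetch_health` and `pick_metric` and the 5-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

# Assumed endpoint, copied from the script above; adjust it to your cluster.
HEALTH_URL = "http://192.58.23.46:9200/_cluster/health"


def fetch_health(url=HEALTH_URL, timeout=5):
    """Fetch and decode the cluster-health JSON (the timeout value is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def pick_metric(health, key):
    """Return one field; booleans become 1/0 so a numeric item can store them."""
    if key not in health:
        raise KeyError("unknown metric: %s" % key)
    value = health[key]
    if isinstance(value, bool):
        return int(value)
    return value


# Offline demonstration with a hand-written sample instead of a live request.
sample = {"status": "yellow", "timed_out": False, "number_of_nodes": 1}
print(pick_metric(sample, "timed_out"))        # prints 0
print(pick_metric(sample, "number_of_nodes"))  # prints 1
```

In a real script the last lines would be replaced by something like `print(pick_metric(fetch_health(), sys.argv[1]))`, after checking that an argument was actually passed.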

2.1 Make the script executable

chmod +x /home/zabbix_agents/es.py

2.2 Edit the zabbix-agent configuration file

vim userparameter_es.conf

UserParameter=es.[*],/usr/bin/python /home/zabbix_agents/es.py $1
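In this flexible user parameter, `[*]` means the item key accepts parameters and `$1` is replaced by the first one, so the item key `es.[status]` runs the script with `status` as its argument. The sketch below is a simplified model of that substitution for illustration only. It is not Zabbix code, and the function name `expand` is made up.

```python
import re

# Simplified model of a flexible user parameter; this is NOT Zabbix code.
USER_PARAMETER = "UserParameter=es.[*],/usr/bin/python /home/zabbix_agents/es.py $1"


def expand(user_parameter, item_key):
    """Return the command this user parameter would run for item_key (first parameter only)."""
    definition = user_parameter.split("=", 1)[1]
    key_pattern, command = definition.split(",", 1)
    base = key_pattern[: -len("[*]")]  # "es."
    match = re.fullmatch(re.escape(base) + r"\[([^\],]*)\]", item_key)
    if match is None:
        raise ValueError("item key does not match " + key_pattern)
    return command.replace("$1", match.group(1))


print(expand(USER_PARAMETER, "es.[status]"))
# prints: /usr/bin/python /home/zabbix_agents/es.py status
```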

2.3 Restart zabbix-agent

2.4 Create the items (see the official Zabbix documentation for details)

2.5 Check the latest data
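When creating the items, the type of information has to match what the script prints: `cluster_name`, `status` and `timed_out` come back as text, the percentage may be fractional, and the rest are whole numbers. The following sketch shows how item definitions might be generated, assuming they are created through the Zabbix API method `item.create`. The host ID, interface ID, item names, 60s interval, and the helper name `build_item` are placeholders, and the numeric codes should be checked against the API documentation for your Zabbix version.

```python
import json

# Zabbix API value_type codes: 0 = numeric (float), 1 = character, 3 = numeric (unsigned).
TEXT_FIELDS = {"cluster_name", "status", "timed_out"}  # es.py prints these as text
FLOAT_FIELDS = {"active_shards_percent_as_number"}


def build_item(field, hostid, interfaceid):
    """Build the params for one item.create call (hostid and interfaceid are placeholders)."""
    if field in TEXT_FIELDS:
        value_type = 1
    elif field in FLOAT_FIELDS:
        value_type = 0
    else:
        value_type = 3
    return {
        "name": "ES " + field,       # assumed naming scheme
        "key_": "es.[%s]" % field,   # matches UserParameter=es.[*]
        "hostid": hostid,
        "interfaceid": interfaceid,
        "type": 0,                   # 0 = Zabbix agent
        "value_type": value_type,
        "delay": "60s",              # assumed polling interval
    }


# Placeholder IDs; real values come from your Zabbix server.
print(json.dumps(build_item("status", "10084", "1"), indent=2))
```

The same choices apply when the items are created by hand in the web interface.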

