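A script is needed so that Nagios can check the node status of the cockroach, nomad, and consul service clusters. The commands for viewing cluster status are:
1. The consul command for viewing cluster node status:
[root@cgw122 ~]# consul members
Node                 Address               Status  Type    Build  Protocol  DC  Segment
cgw122.zencoo.com    192.168.196.122:8301  alive   server  1.4.3  2         cd
cws113.zencoo.com    192.168.196.113:8301  alive   server  1.4.3  2         cd
mandrill.zencoo.com  192.168.196.205:8301  alive   client  1.4.3  2         cd
The column to watch is Status: if every node shows alive, the cluster is healthy. cockroach and nomad can be checked in a similar way. Beyond that, I am not familiar with these services.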
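Counting the alive rows can be made a little stricter by matching only the Status column (the third field) instead of the whole line, so that a hostname containing "alive" is not counted by mistake. The following is a minimal sketch, assuming the `consul members` layout shown above; the function name `count_alive` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# count_alive (hypothetical helper): reads `consul members` output on stdin
# and prints how many rows have "alive" in the Status column (field 3).
# The header row (NR == 1) is skipped.
count_alive() {
    awk 'NR > 1 && $3 == "alive" { n++ } END { print n + 0 }'
}

# Sample rows copied from the output above.
sample='Node Address Status Type Build Protocol DC Segment
cgw122.zencoo.com 192.168.196.122:8301 alive server 1.4.3 2 cd
cws113.zencoo.com 192.168.196.113:8301 alive server 1.4.3 2 cd
mandrill.zencoo.com 192.168.196.205:8301 alive client 1.4.3 2 cd'

printf '%s\n' "$sample" | count_alive   # prints 3
```

On a live host the same function could be fed directly, for example `consul members | count_alive`.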

2. The script is as follows:

#!/bin/bash
# Usage: <script> {cockroach|nomad|consul} <expected node count>
# Exit codes follow the Nagios plugin convention: 0 OK, 2 CRITICAL, 3 UNKNOWN.

# Check cockroach status
function cockroach { # counts the rows whose 9th field is 'true'
    NUM=$(awk '{print $9}' /tmp/.cockroach.status | grep -c 'true')
    if [ "$1" -eq "$NUM" ]; then
        echo "$NUM nodes is alive"; exit 0
    else
        echo "Only $NUM nodes is alive,expect $1"; exit 2
    fi
}

# Check nomad status
function nomad { # counts the rows that contain 'ready'
    NUM=$(grep -c ready /tmp/.nomad.status)
    if [ "$1" -eq "$NUM" ]; then
        echo "$NUM nodes is alive"; exit 0
    else
        echo "Only $NUM nodes is alive,expect $1"; exit 2
    fi
}

# Check consul status
function consul { # counts the rows that contain 'alive'
    NUM=$(grep -c alive /tmp/.consul.status)
    if [ "$1" -eq "$NUM" ]; then
        echo "$NUM nodes is alive"; exit 0
    else
        echo "Only $NUM nodes is alive,expect $1"; exit 2
    fi
}

case $1 in
    cockroach)
        cockroach "$2";;
    nomad)
        nomad "$2";;
    consul)
        consul "$2";;
    *) # unknown or missing service name
        echo "Usage: $0 {cockroach|nomad|consul} <expected node count>"; exit 3;;
esac
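The script reads /tmp/.cockroach.status, /tmp/.nomad.status, and /tmp/.consul.status, but the article does not show how those files are written. One possibility, stated here purely as an assumption, is a periodic job (cron, for instance) that saves the output of each status command. The sketch below uses a hypothetical `snapshot` helper that writes to a temporary file and renames it into place, so the check never reads a half-written file. Only `consul members` comes from the article; the matching commands for cockroach and nomad are not given there and are left out.

```shell
#!/bin/bash
# snapshot (hypothetical helper): run a command and store its stdout in a file.
# Usage: snapshot <target file> <command> [args...]
# The output is written to a temporary file next to the target and renamed
# only if the command succeeds, so the previous snapshot survives a failure.
snapshot() {
    local target=$1; shift
    local tmp
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if "$@" > "$tmp" 2>/dev/null; then
        chmod 644 "$tmp"        # mktemp creates 0600; let the Nagios user read it
        mv "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Assumed usage from a periodic job:
#   snapshot /tmp/.consul.status consul members
```

Keeping the old snapshot on failure is a design choice that may or may not suit a given setup: it avoids empty files, but a stale file could hide an outage unless its age is checked as well.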

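2.1. Each function counts how many nodes of its service are alive and assigns that count to the NUM variable.
2.2. $2 in the case statement is the expected node count passed to the script; inside the function it becomes $1.
2.3. The function checks whether the expected node count ($1) equals the detected number of alive nodes ($NUM).
2.4. Nagios displays the result as follows: the status is OK with the message "2 nodes is alive", where the $2 passed in was 2.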
[Screenshot: the Nagios script for checking cockroach, nomad, and consul cluster node status]
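Steps 2.1 to 2.3 come down to one numeric comparison whose result becomes the plugin's exit code (Nagios reads 0 as OK, 1 as WARNING, 2 as CRITICAL, and 3 as UNKNOWN). A minimal sketch of that comparison in isolation follows. The function name `check_count` is hypothetical, and the UNKNOWN branch for non-numeric input is an addition that the original script does not have.

```shell
#!/bin/bash
# check_count (hypothetical helper): compare an expected node count with the
# detected one and return a Nagios-style status code instead of exiting.
# Usage: check_count <expected> <actual>
check_count() {
    local expected=$1 actual=$2 v
    # Reject empty or non-numeric arguments (not done in the original script).
    for v in "$expected" "$actual"; do
        case $v in
            ''|*[!0-9]*) echo "UNKNOWN: counts must be non-negative integers"
                         return 3 ;;
        esac
    done
    if [ "$expected" -eq "$actual" ]; then
        echo "$actual nodes is alive"
        return 0    # OK
    else
        echo "Only $actual nodes is alive,expect $expected"
        return 2    # CRITICAL
    fi
}

# Worked example matching step 2.4: two nodes expected, two found.
check_count 2 2     # prints "2 nodes is alive", status 0
```

With one node down, `check_count 2 1` would print "Only 1 nodes is alive,expect 2" and return 2, which Nagios shows as CRITICAL.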