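To monitor specific services at our company, I wrote a script that collects the system resources used by each process, so that Zabbix can gather this data and provide basic monitoring of those processes.
The programs I mainly need to monitor are: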
nginx redis mysql tomcat sentinel mongodb openfire kafka zookeeper twemproxy mycat memcached php httpd
First, write the monitoring script on the agent side. Its contents are as follows:
[root@monitor sbin]$ cat /data/zabbix/sbin/processstatus.sh
#!/bin/bash
#date:2015.06.15
# In `ps aux` output, column 3 is %CPU and column 6 is RSS in KB.
# Each *mem function sums RSS, each *cpu function sums %CPU,
# and each *num function counts the matching processes.

nginxmem(){
    ps aux|grep "nginx"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
nginxcpu(){
    ps aux|grep "nginx"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
nginxnum(){
    ps aux|grep "nginx"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
redismem(){
    ps aux|grep "redis"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
rediscpu(){
    ps aux|grep "redis"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
redisnum(){
    ps aux|grep "redis"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
mysqlmem(){
    ps aux|grep "mysql"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
mysqlcpu(){
    ps aux|grep "mysql"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
mysqlnum(){
    ps aux|grep "mysql"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
tomcatmem(){
    ps aux|grep "tomcat"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
tomcatcpu(){
    ps aux|grep "tomcat"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
tomcatnum(){
    ps aux|grep "tomcat"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
sentinelmem(){
    ps aux|grep "sentinel"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
sentinelcpu(){
    ps aux|grep "sentinel"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
sentinelnum(){
    ps aux|grep "sentinel"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
mongodbmem(){
    ps aux|grep "mongod"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
mongodbcpu(){
    ps aux|grep "mongod"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
mongodbnum(){
    ps aux|grep "mongod"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
openfiremem(){
    ps aux|grep "openfire"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
openfirecpu(){
    ps aux|grep "openfire"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
openfirenum(){
    ps aux|grep "openfire"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
kafkamem(){
    ps aux|grep "kafka"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
kafkacpu(){
    ps aux|grep "kafka"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
kafkanum(){
    ps aux|grep "kafka"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
zookeepermem(){
    ps aux|grep "zookeeper"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
zookeepercpu(){
    ps aux|grep "zookeeper"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
zookeepernum(){
    ps aux|grep "zookeeper"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
twemproxymem(){
    ps aux|grep "twemproxy"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
twemproxycpu(){
    ps aux|grep "twemproxy"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
twemproxynum(){
    ps aux|grep "twemproxy"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
mycatmem(){
    ps aux|grep "mycat"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
mycatcpu(){
    ps aux|grep "mycat"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
mycatnum(){
    ps aux|grep "mycat"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
httpdmem(){
    ps aux|grep "httpd"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
httpdcpu(){
    ps aux|grep "httpd"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
httpdnum(){
    ps aux|grep "httpd"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
memcachedmem(){
    ps aux|grep "memcached"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
memcachedcpu(){
    ps aux|grep "memcached"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
memcachednum(){
    ps aux|grep "memcached"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}
phpmem(){
    ps aux|grep "php"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$6}; END{print sum}'
}
phpcpu(){
    ps aux|grep "php"|grep -v "grep"|grep -v "processstatus.sh"|awk '{sum+=$3}; END{print sum}'
}
phpnum(){
    ps aux|grep "php"|grep -v "grep"|grep -v "processstatus.sh"| wc -l
}

# Dispatch on the first argument; each argument names one of the functions above.
case "$1" in
nginxmem) nginxmem ;;
nginxcpu) nginxcpu ;;
nginxnum) nginxnum ;;
redismem) redismem ;;
rediscpu) rediscpu ;;
redisnum) redisnum ;;
mysqlmem) mysqlmem ;;
mysqlcpu) mysqlcpu ;;
mysqlnum) mysqlnum ;;
tomcatmem) tomcatmem ;;
tomcatcpu) tomcatcpu ;;
tomcatnum) tomcatnum ;;
sentinelmem) sentinelmem ;;
sentinelcpu) sentinelcpu ;;
sentinelnum) sentinelnum ;;
mongodbmem) mongodbmem ;;
mongodbcpu) mongodbcpu ;;
mongodbnum) mongodbnum ;;
openfiremem) openfiremem ;;
openfirecpu) openfirecpu ;;
openfirenum) openfirenum ;;
kafkamem) kafkamem ;;
kafkacpu) kafkacpu ;;
kafkanum) kafkanum ;;
zookeepermem) zookeepermem ;;
zookeepercpu) zookeepercpu ;;
zookeepernum) zookeepernum ;;
twemproxymem) twemproxymem ;;
twemproxycpu) twemproxycpu ;;
twemproxynum) twemproxynum ;;
mycatmem) mycatmem ;;
mycatcpu) mycatcpu ;;
mycatnum) mycatnum ;;
httpdmem) httpdmem ;;
httpdcpu) httpdcpu ;;
httpdnum) httpdnum ;;
memcachedmem) memcachedmem ;;
memcachedcpu) memcachedcpu ;;
memcachednum) memcachednum ;;
phpmem) phpmem ;;
phpcpu) phpcpu ;;
phpnum) phpnum ;;
*)
    echo "Usage: $0 {nginxmem|nginxcpu|nginxnum|redismem|rediscpu|redisnum|mysqlmem|mysqlcpu|mysqlnum|tomcatmem|tomcatcpu|tomcatnum|sentinelmem|sentinelcpu|sentinelnum|mongodbmem|mongodbcpu|mongodbnum|openfiremem|openfirecpu|openfirenum|kafkamem|kafkacpu|kafkanum|zookeepermem|zookeepercpu|zookeepernum|twemproxymem|twemproxycpu|twemproxynum|mycatmem|mycatcpu|mycatnum|httpdmem|httpdcpu|httpdnum|memcachedmem|memcachedcpu|memcachednum|phpmem|phpcpu|phpnum}"
    ;;
esac
Then make the script executable:
chmod +x processstatus.sh
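Every function in the script runs the same pipeline and differs only in the process name and the column it sums. As a rough sketch of a more compact variant (not the script used in this post), the process name and metric could be passed as arguments instead. The function name `summarize` and the two-argument calling convention are my own assumptions. Unlike the original, it prints 0 rather than an empty line when no process matches.

```shell
#!/bin/bash
# Hypothetical generic variant of processstatus.sh (a sketch, not the original).

# Read `ps aux`-style lines on stdin and print one metric for the lines
# that match the pattern:
#   mem = sum of RSS in KB (column 6)
#   cpu = sum of %CPU      (column 3)
#   num = number of matching lines
# Lines containing "grep" or "processstatus" are skipped, as in the original,
# which assumes this file is also saved as processstatus.sh.
summarize() {
    awk -v pat="$1" -v metric="$2" '
        $0 ~ pat && $0 !~ /grep|processstatus/ {
            mem += $6; cpu += $3; num++
        }
        END {
            if (metric == "mem") print mem + 0
            else if (metric == "cpu") print cpu + 0
            else print num + 0
        }'
}

# Usage: processstatus.sh <process-name> <mem|cpu|num>
if [ "$#" -eq 2 ]; then
    # Take the snapshot before awk starts, so awk's own command line
    # (which contains the pattern) cannot be counted.
    snapshot=$(ps aux)
    printf '%s\n' "$snapshot" | summarize "$1" "$2"
fi
```

With this layout, adding a new service needs no new functions, only a new argument value on the agent side.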
Add the following to a configuration file under zabbix_agentd.conf.d/:
[9kgame@monitor zabbix_agentd.conf.d]$ cat /data/zabbix/etc/zabbix_agentd.conf.d/process_num_cpu_mem.conf
#monitor process
UserParameter=process.nginx.memory,/data/zabbix/sbin/processstatus.sh nginxmem
UserParameter=process.nginx.cpu,/data/zabbix/sbin/processstatus.sh nginxcpu
UserParameter=process.nginx.number,/data/zabbix/sbin/processstatus.sh nginxnum
UserParameter=process.redis.memory,/data/zabbix/sbin/processstatus.sh redismem
UserParameter=process.redis.cpu,/data/zabbix/sbin/processstatus.sh rediscpu
UserParameter=process.redis.number,/data/zabbix/sbin/processstatus.sh redisnum
UserParameter=process.mysql.memory,/data/zabbix/sbin/processstatus.sh mysqlmem
UserParameter=process.mysql.cpu,/data/zabbix/sbin/processstatus.sh mysqlcpu
UserParameter=process.mysql.number,/data/zabbix/sbin/processstatus.sh mysqlnum
UserParameter=process.tomcat.memory,/data/zabbix/sbin/processstatus.sh tomcatmem
UserParameter=process.tomcat.cpu,/data/zabbix/sbin/processstatus.sh tomcatcpu
UserParameter=process.tomcat.number,/data/zabbix/sbin/processstatus.sh tomcatnum
UserParameter=process.sentinel.memory,/data/zabbix/sbin/processstatus.sh sentinelmem
UserParameter=process.sentinel.cpu,/data/zabbix/sbin/processstatus.sh sentinelcpu
UserParameter=process.sentinel.number,/data/zabbix/sbin/processstatus.sh sentinelnum
UserParameter=process.mongodb.memory,/data/zabbix/sbin/processstatus.sh mongodbmem
UserParameter=process.mongodb.cpu,/data/zabbix/sbin/processstatus.sh mongodbcpu
UserParameter=process.mongodb.number,/data/zabbix/sbin/processstatus.sh mongodbnum
UserParameter=process.openfire.memory,/data/zabbix/sbin/processstatus.sh openfiremem
UserParameter=process.openfire.cpu,/data/zabbix/sbin/processstatus.sh openfirecpu
UserParameter=process.openfire.number,/data/zabbix/sbin/processstatus.sh openfirenum
UserParameter=process.kafka.memory,/data/zabbix/sbin/processstatus.sh kafkamem
UserParameter=process.kafka.cpu,/data/zabbix/sbin/processstatus.sh kafkacpu
UserParameter=process.kafka.number,/data/zabbix/sbin/processstatus.sh kafkanum
UserParameter=process.zookeeper.memory,/data/zabbix/sbin/processstatus.sh zookeepermem
UserParameter=process.zookeeper.cpu,/data/zabbix/sbin/processstatus.sh zookeepercpu
UserParameter=process.zookeeper.number,/data/zabbix/sbin/processstatus.sh zookeepernum
UserParameter=process.twemproxy.memory,/data/zabbix/sbin/processstatus.sh twemproxymem
UserParameter=process.twemproxy.cpu,/data/zabbix/sbin/processstatus.sh twemproxycpu
UserParameter=process.twemproxy.number,/data/zabbix/sbin/processstatus.sh twemproxynum
UserParameter=process.mycat.memory,/data/zabbix/sbin/processstatus.sh mycatmem
UserParameter=process.mycat.cpu,/data/zabbix/sbin/processstatus.sh mycatcpu
UserParameter=process.mycat.number,/data/zabbix/sbin/processstatus.sh mycatnum
UserParameter=process.httpd.memory,/data/zabbix/sbin/processstatus.sh httpdmem
UserParameter=process.httpd.cpu,/data/zabbix/sbin/processstatus.sh httpdcpu
UserParameter=process.httpd.number,/data/zabbix/sbin/processstatus.sh httpdnum
UserParameter=process.memcached.memory,/data/zabbix/sbin/processstatus.sh memcachedmem
UserParameter=process.memcached.cpu,/data/zabbix/sbin/processstatus.sh memcachedcpu
UserParameter=process.memcached.number,/data/zabbix/sbin/processstatus.sh memcachednum
UserParameter=process.php.memory,/data/zabbix/sbin/processstatus.sh phpmem
UserParameter=process.php.cpu,/data/zabbix/sbin/processstatus.sh phpcpu
UserParameter=process.php.number,/data/zabbix/sbin/processstatus.sh phpnum
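The configuration above follows a fixed pattern of three lines per service, so it could also be generated rather than typed by hand. The following is only a sketch: the function name `gen_userparams` is hypothetical, and it assumes the script path and the `<service>mem|cpu|num` argument naming shown above.

```shell
#!/bin/bash
# Hypothetical helper: print the three UserParameter lines for each service.
# $1 is the path to the monitoring script, the remaining arguments are
# service names as used in the script's argument names (e.g. nginx, redis).
gen_userparams() {
    script=$1
    shift
    for svc in "$@"; do
        printf 'UserParameter=process.%s.memory,%s %smem\n' "$svc" "$script" "$svc"
        printf 'UserParameter=process.%s.cpu,%s %scpu\n' "$svc" "$script" "$svc"
        printf 'UserParameter=process.%s.number,%s %snum\n' "$svc" "$script" "$svc"
    done
}

# Example: gen_userparams /data/zabbix/sbin/processstatus.sh nginx redis mysql
```

Redirecting the output into the .conf file keeps the key names and script arguments in step when a service is added.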
Finally, restart the zabbix_agentd service:
service zabbix_agentd restart
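Then, on the Zabbix server, use zabbix_get to check whether the data can be retrieved. Output like the following means the data was fetched successfully.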
[root@localhost zabbix]# bin/zabbix_get -s 172.16.1.20 -p 10050 -k process.nginx.memory
184876
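To check all three keys of a service in one go, the same zabbix_get call can be wrapped in a small loop. This is a sketch under assumptions: `is_number` and `check_service` are hypothetical helper names, zabbix_get is assumed to be on the PATH, and any reply that is not a plain number (an empty line or an error string) is treated as a failure.

```shell
#!/bin/bash
# Return success if the argument looks like a non-negative number,
# e.g. 184876 or 0.3; empty strings and error text are rejected.
is_number() {
    case $1 in
        ''|.|*[!0-9.]*|*.*.*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: query memory, cpu and number for one service
# on one agent and report whether each reply is numeric.
check_service() {
    host=$1
    svc=$2
    for metric in memory cpu number; do
        key="process.$svc.$metric"
        value=$(zabbix_get -s "$host" -p 10050 -k "$key")
        if is_number "$value"; then
            echo "OK   $key = $value"
        else
            echo "FAIL $key returned '$value'"
        fi
    done
}

# Example: check_service 172.16.1.20 nginx
```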
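Finally, define a template in Zabbix. The link to the template attachment is below.
Zabbix template download
If the template cannot be downloaded from the link, it is also available in the attachment.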
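Note that the memory values are in KB, so when defining the item, set a custom multiplier of 1024 (ps reports RSS in units of 1024 bytes) and change the unit to B (bytes). The CPU usage is a number with a decimal point, so the item's value type must be set to floating point. It is also the usage relative to a single logical core, so it needs a custom multiplier as well: my test server has 2 CPUs, each with 8 cores and 16 threads, so the multiplier divides the raw value by 32, and the unit becomes %.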
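As a quick worked example of these two conversions, here is a small sketch. The helper names `kb_to_bytes` and `cpu_share` are hypothetical, and it assumes ps reports RSS in units of 1024 bytes and that the machine has 32 logical cores as described above.

```shell
#!/bin/bash
# Convert an RSS value in KB (as printed by ps) to bytes.
kb_to_bytes() {
    echo $(( $1 * 1024 ))
}

# Convert a per-core %CPU sum into a share of the whole machine.
# $1 = summed %CPU from ps, $2 = number of logical cores
# (2 sockets x 16 threads = 32 here; `nproc` usually reports this on Linux).
cpu_share() {
    awk -v pct="$1" -v cores="$2" 'BEGIN { printf "%.4f\n", pct / cores }'
}

kb_to_bytes 184876   # the nginx memory value fetched earlier, in bytes
cpu_share 3.2 32     # 3.2% of one core as a share of 32 cores
```

In Zabbix itself, the same effect comes from the item's custom multiplier: 1024 for memory and 1/32 = 0.03125 for CPU.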
Here is how it looks once everything is set up:
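Note that MySQL's CPU usage is shown here as 0.0031%. Because the item already divides the raw value by 32, this figure is MySQL's share of all 32 logical cores. Expressed as usage of a single core, it is:
0.0031% * 32 = 0.0992%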
http://xianglinhu.blog.51cto.com/5787032/1657570