Monitoring disk health through system error logs

I. Symptoms

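     This week two disks stopped accepting reads and writes. The system log showed entries with the keyword "EXT4-fs error" naming the affected disk, so I use Zabbix to pick that message out of the system log and raise an alert.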

II. Steps

1. There are too many machines to configure by hand, so an Ansible playbook pushes everything out in one go.

2. Define the key in /etc/zabbix/zabbix_agentd.conf.d/agentd.conf

### kernel_error of disk, from /var/log/messages
### Prints 1 if the last lines of the log contain "EXT4-fs error", otherwise 0
UserParameter=disk_health,awk -v kernel_error=`sudo tail /var/log/messages | grep "EXT4-fs error" | wc -l` 'BEGIN{if(kernel_error > 0){print 1} else {print 0}}'
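
The logic of this one-liner may be easier to read and test as a small shell function. The sketch below is not the author's script: the function name disk_health and its two arguments (log file path, number of lines to scan) are hypothetical and chosen for illustration. It mirrors the behaviour described above, printing 1 when "EXT4-fs error" appears in the tail of the log and 0 otherwise.

```shell
#!/bin/sh
# Illustrative sketch only; names and arguments are assumptions.
# disk_health LOGFILE [LINES]
#   Prints 1 if "EXT4-fs error" appears in the last LINES lines of LOGFILE,
#   otherwise prints 0 (also 0 when the file cannot be read).
disk_health() {
    logfile=$1
    lines=${2:-10}    # 10 matches the default number of lines printed by tail
    if tail -n "$lines" "$logfile" 2>/dev/null | grep -q "EXT4-fs error"; then
        echo 1
    else
        echo 0
    fi
}
```

Because only the last few lines are scanned, an error can scroll out of view between two agent polls on a busy log. Passing a larger line count as the second argument is one way to widen the window.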

3. sudo privileges for the zabbix user

vim /etc/sudoers.d/zabbix 

zabbix ALL=(root) NOPASSWD:/bin/bash,/bin/netstat,/usr/bin/nmap,/bin/grep,/bin/awk,/usr/local/mysql/bin/mysql,/usr/bin/tail,/bin/cat
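
This rule covers several commands used by other checks, and passwordless /bin/bash in particular amounts to unrestricted root. If the agent only needs the disk_health key, a narrower rule might look like the sketch below. It assumes that tail lives at /usr/bin/tail (as in the rule above) and that no other UserParameter depends on the remaining commands.

```
# /etc/sudoers.d/zabbix
# Sketch: allow only the exact command that the disk_health check runs
zabbix ALL=(root) NOPASSWD: /usr/bin/tail /var/log/messages
```

The file can be syntax-checked with visudo -cf /etc/sudoers.d/zabbix before it is pushed to the hosts.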

playbook

---
 - hosts: "{{hosts}}"
   gather_facts: false
   tasks:
   - name: Add include path
     lineinfile:
        dest: "{{ item.dest }}"
        regexp: "{{ item.regexp }}"
        line: "{{ item.line }}"
     with_items:
     - {
       dest: "/etc/zabbix/zabbix_agentd.conf",
       regexp: "^Include",
       line: "\n\n###Add include\nInclude=/etc/zabbix/zabbix_agentd.conf.d/*.conf" }
     - {
       dest: "/etc/sudoers",
       regexp: "^Defaults    requiretty",
       line: " #Defaults    requiretty" }

   - name: Copy configuration file
     copy:
        src: "{{ item.src }}"
        dest: "{{ item.dest }}"
     with_items:
     - {
       src: "/etc/sudoers.d/zabbix",
       dest: "/etc/sudoers.d/" }
     - {
       src: "/etc/zabbix/zabbix_agentd.conf.d/agentd.conf",
       dest: "/etc/zabbix/zabbix_agentd.conf.d/" }

   - name: Restart zabbix service
     service: name=zabbix_agentd state=restarted


4. Run the playbook

ansible-playbook copyfile.yml -e "hosts=all"

