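The check is to run select sequence#,applied from v$archived_log order by sequence#; in sqlplus. If every row shows YES, the standby has applied all archived logs, so the primary and standby are consistent with no lag; otherwise there is lag. For example: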
SQL> select sequence#,applied from v$archived_log order by sequence#;

 SEQUENCE# APPLIED
---------- ---------
      5118 YES
      5119 YES
      5120 YES
      5121 YES
      5122 YES
      5123 YES
      5124 YES
      5125 YES
      5126 YES
      5127 YES
      5128 YES

 SEQUENCE# APPLIED
---------- ---------
      5129 YES

419 rows selected.

SQL>
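When some rows are not YES, it helps to see which sequence numbers are holding the check back. The following is a minimal sketch, assuming the query output has been reduced to two whitespace-separated columns (sequence number, applied status) with no heading. The function name list_unapplied is hypothetical, and treating IN-MEMORY as applied is an assumption of this sketch.

```shell
#!/bin/bash
# list_unapplied: read rows of the form "<sequence#> <applied>" from stdin and
# print the sequence numbers whose status is neither YES nor IN-MEMORY.
# Assumption: IN-MEMORY is counted as applied.
list_unapplied() {
    awk 'NF >= 2 && $2 != "YES" && $2 != "IN-MEMORY" { print $1 }'
}
```

For example, piping a spooled result file into list_unapplied prints one sequence number per line for every archived log that has not yet been applied, and prints nothing when the standby is fully caught up.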
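So the check is done by logging in as the oracle account, running select sequence#,applied from v$archived_log order by sequence#; in the sqlplus console, and inspecting the result set: if every row is YES, standby replication is healthy; otherwise something is wrong. The scripts are as follows:
Script 1, get_dg.sh, fetches the standby status and spools every row of the result into the file bts.csv in the current directory: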
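#!/bin/bash
# $1 is the view to query, e.g. v$archived_log. It is expanded here by the
# outer shell; the quoted 'EOF' stops the inner shell from expanding it again.
su - oracle -c "
cd /oracle/backup/data
sqlplus -S zabbix/ys_zb_0418@PD2 << 'EOF'
set heading off
set feedback off
set pagesize 0
set verify off
set echo off
spool bts.csv
select sequence#,applied from $1 order by sequence#;
spool off
exit
EOF
"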
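Script 2, check_dg.sh, counts the rows marked YES (plus those marked IN-MEMORY) and the total number of rows returned from the standby. If the two numbers are equal, standby replication is healthy and the script prints 1; otherwise it prints 0, which is what raises the alert. check_dg.sh is as follows: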
#!/bin/bash
#> /oracle/backup/data/bts.csv
/bin/sh /oracle/backup/data/get_dg.sh $1 > /dev/null

cd /oracle/backup/data
result_num=`cat bts.csv |wc |awk '{print $1}'`
yes_num=`cat bts.csv |grep YES |wc |awk '{print $1}'`
in_memory_num=`cat bts.csv |grep IN-MEMORY |wc |awk '{print $1}'`
out_num=0;
standby_num=`expr $yes_num + $in_memory_num + $out_num`
if [ "$result_num" -eq 0 ]
then
    echo 0
elif [ "$result_num" -eq 1 ]
then
    echo 0
elif [ "$result_num" -eq "$standby_num" ]
then
    echo 1
else
    echo 0
fi
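The counting logic can also be expressed in a single awk pass. This is only a sketch of the same idea, not the script used in this post: the function name standby_status is hypothetical, it matches the status exactly in the second column instead of grepping the whole line, and it ignores blank lines, so its result can differ from check_dg.sh if the spool file contains error text or empty rows.

```shell
#!/bin/bash
# standby_status FILE: print 1 if FILE holds more than one data row and every
# row has YES or IN-MEMORY in its second column; print 0 in all other cases
# (file unreadable, zero or one row, or any row with another status).
standby_status() {
    local file=$1
    [ -r "$file" ] || { echo 0; return; }
    awk '
        NF >= 2 { total++; if ($2 == "YES" || $2 == "IN-MEMORY") ok++ }
        END { if (total + 0 > 1 && ok + 0 == total + 0) print 1; else print 0 }
    ' "$file"
}
```

Called as standby_status /oracle/backup/data/bts.csv, it prints the same 1 or 0 value that the Zabbix item expects.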
Next, register the check as a user parameter in the agent configuration:
vim /usr/local/zabbix/conf/zabbix_agentd.conf
UnsafeUserParameters=1
UserParameter=oracle.standby_status,/oracle/backup/data/check_dg.sh 'v$archived_log'
After adding these lines, restart zabbix_agentd.
On the oracle server the script can be called normally:
[root@pldb02 data]# sh check_dg.sh 'v$archived_log'
1
[root@pldb02 data]#
But calling it from the zabbix-server produces an error:
[root@zabbix_serv_121_12 ~]# /usr/local/zabbix/bin/zabbix_get -s 192.168.3.13 -p10050 -k "oracle.standby_status"
standard in must be a tty
1
[root@zabbix_serv_121_12 ~]#
The message comes from su, which needs a terminal and does not get one when the agent runs the script. The fix is to drop su and set the Oracle environment variables in the script itself, so that sqlplus can be run directly under the zabbix account. get_dg.sh becomes:
[root@pldb02 data]# more get_dg.sh
#!/bin/bash
export NLS_LANG=american_america.ZHS16GBK
export ORACLE_BASE=/oracle/app/oracle
export ORACLE_HOME=/oracle/app/oracle/product/11.2.0/dbhome_1
export ORACLE_SID=powerdesdg2
export PATH=$ORACLE_HOME/bin:$PATH

cd /oracle/backup/data
sqlplus -S zabbix/ys_zb_0418@PD2 << EOF
set heading off
set feedback off
set pagesize 0
set verify off
set echo off
spool bts.csv
select sequence#,applied from $1 order by sequence#;
spool off
exit
EOF
[root@pldb02 data]#
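Because the agent runs the script with its own, much smaller environment, a missing variable tends to show up only when the item is queried through zabbix_get. A small guard like the one below could be placed near the top of get_dg.sh to fail early with a readable message. It is a sketch under assumptions: the function name check_oracle_env is hypothetical, and it only checks ORACLE_HOME and the sqlplus binary, not connectivity.

```shell
#!/bin/bash
# check_oracle_env: return 0 if ORACLE_HOME is set and contains an executable
# bin/sqlplus; otherwise print the reason to stderr and return 1.
check_oracle_env() {
    if [ -z "${ORACLE_HOME:-}" ]; then
        echo "ORACLE_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$ORACLE_HOME/bin/sqlplus" ]; then
        echo "sqlplus not found or not executable under $ORACLE_HOME/bin" >&2
        return 1
    fi
    return 0
}
```

In get_dg.sh it would be used after the export lines, for example as check_oracle_env || exit 1, so that a broken environment stops the script before sqlplus is invoked.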
Script 2, check_dg.sh, is essentially unchanged:
[root@pldb02 data]# more check_dg.sh
#!/bin/bash
#> /oracle/backup/data/bts.csv
/bin/sh /oracle/backup/data/get_dg.sh $1 > /dev/null

cd /oracle/backup/data
result_num=`cat bts.csv |wc |awk '{print $1}'`
yes_num=`cat bts.csv |grep YES |wc |awk '{print $1}'`
in_memory_num=`cat bts.csv |grep IN-MEMORY |wc |awk '{print $1}'`
out_num=0;
standby_num=`expr $yes_num + $in_memory_num + $out_num`
if [ "$result_num" -eq 0 ]
then
    echo 0
elif [ "$result_num" -eq 1 ]
then
    echo 0
elif [ "$result_num" -eq "$standby_num" ]
then
    echo 1
else
    echo 0
fi
[root@pldb02 data]#
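Add the zabbix user to the Oracle groups so that it is allowed to run the Oracle binaries and write to the data directory:
gpasswd -a zabbix oinstall
gpasswd -a zabbix dba
gpasswd -a zabbix oracle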
On the zabbix-server, query the standby status item on the oracle host remotely; it now returns the actual value 1:
[root@zabbix_serv_121_12 ~]# /usr/local/zabbix/bin/zabbix_get -s 192.168.3.13 -p10050 -k "oracle.standby_status"
1
[root@zabbix_serv_121_12 ~]#
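Add the template to the standby host (screenshot: 024.png).
Under Actions, add an action whose condition is tied to this template's trigger, so that when the standby trigger fires, the action sends an alert by SMS, email, phone call and so on (screenshot: 21.png).
That completes the setup: Zabbix now monitors the replication status of the Oracle high-availability standby database through a custom shell script.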