Linux/Unix shell 监控Oracle告警日志(monitor alter log file)

    使用shell脚本实现对Oracle数据库的监控与管理将大大简化DBA的工作负担,如常见的对实例的监控,监听的监控,告警日志的监控,以及数据库的备份,AWR report的自动邮件等。本文给出Linux 下使用 shell 脚本来监控 Oracle 告警日志(monitor alter log file)。

    Linux Shell的相关参考:
        Linux/Unix shell 脚本中调用SQL,RMAN脚本
        Linux/Unix shell sql 之间传递变量
        Linux/Unix shell 调用 PL/SQL
        Linux/Unix shell 监控Oracle实例(monitor instance)
        Linux/Unix shell 监控Oracle监听器(monitor listener)

 

1、监控Oracle告警日志脚本

robin@SZDB:~/dba_scripts/custom/bin> more ck_alert.sh
#!/bin/bash
# --------------------------------------------------------------------------+
#                  CHECK ALERT LOG FILE                                     |
#   Filename: ck_alert.sh                                                   |
#   Desc:                                                                   |
#       The script use to check alert log file.                             |
#       Once any error was caught, a mail alert will be sent.               |   
#       Deploy it by crontab. e.g. per 15 min                               |  
#   Usage:                                                                  |
#       ./ck_alert.sh $ORACLE_SID                                           |  
#                                                                           |
#   Author : Robinson                                                       | 
#   Blog   : http://blog.csdn.net/robinson_0612                             |
# --------------------------------------------------------------------------+
#
# --------------------------
#   Check SID
# --------------------------

if [ -z "${1}" ];then
    echo "Usage: "
    echo "      `basename $0` ORACLE_SID"
    exit 1
fi

# -------------------------------
#  Set environment here 
# ------------------------------

if [ -f ~/.bash_profile ]; then
    . ~/.bash_profile
fi

export ORACLE_SID=$1
export MACHINE=`hostname`
export MAIL_DIR=/users/robin/dba_scripts/sendEmail-v1.56
export MAIL_LIST='[email protected]'
export MAIL_FM='[email protected]'

# ----------------------------------------------
# check the database is running, if not exit
# ----------------------------------------------

db_stat=`ps -ef | grep pmon_$ORACLE_SID | grep -v grep| cut -f3 -d_`
if [ -z "$db_stat" ]; then
    date >/tmp/db_${ORACLE_SID}_stauts.log
    echo " $ORACLE_SID is not available on ${MACHINE} !!!" >>/tmp/db_${ORACLE_SID}_stauts.log 
    MAIL_SUB=" $ORACLE_SID is not available on ${MACHINE} !!!"
    $MAIL_DIR/sendEmail -u $MAIL_SUB -f $MAIL_FM -t $MAIL_LIST -o message-file=/tmp/db_${ORACLE_SID}_stauts.log
    exit 1
fi;

# --------------------------------------
#  Get the location of alert log file
# --------------------------------------

sqlplus '/ as sysdba' << EOF > /tmp/${ORACLE_SID}_monitor_temp.txt
column xxxx format a10
column value format a80
set lines 132
SELECT 'xxxx' ,value FROM  v\$parameter WHERE  name = 'background_dump_dest'
/
exit
EOF

cat /tmp/${ORACLE_SID}_monitor_temp.txt | awk '$1 ~ /xxxx/ {print $2}' > /tmp/${ORACLE_SID}_monitor_location.txt
read ALERT_DIR < /tmp/${ORACLE_SID}_monitor_location.txt
rm /tmp/${ORACLE_SID}_monitor_temp.txt 2>/dev/null

# ----------------------------------------
#  Define archive directory and log file
# ----------------------------------------

DT=`date +%Y%m%d`
DT_DIR=`date +%Y%m`
ARCH_DIR=${ALERT_DIR}/${DT_DIR}

if [ ! -d "${ARCH_DIR}" ] ; then
    mkdir $ARCH_DIR
fi

ORIG_ALERT_LOG=${ALERT_DIR}/alert_${ORACLE_SID}.log
NEW_ALERT_LOG=${ARCH_DIR}/alert_${ORACLE_SID}.log.${DT}
TEMP_ALERT_LOG=${ORIG_ALERT_LOG}.temp
AWK_DIR=/users/robin/dba_scripts/custom/bin

# -------------------------------------
#  Check alert log file and send email
# -------------------------------------
cat ${ORIG_ALERT_LOG} | awk -f $AWK_DIR/check_alert.awk > /tmp/${ORACLE_SID}_check_monitor_log.log
if [ -s "/tmp/${ORACLE_SID}_check_monitor_log.log" ];
   then 
     echo "Found errors in sid ${ORACLE_SID}, mailed errors"
     echo -e "The following errors were found in the alert log for ${ORACLE_SID} \n" > /tmp/${ORACLE_SID}_check_monitor_log.mail
     echo -e "Alert log was copied into ${NEW_ALERT_LOG} \n">> /tmp/${ORACLE_SID}_check_monitor_log.mail
     date >> /tmp/${ORACLE_SID}_check_monitor_log.mail 
     echo "--------------------------------------------------------------">>/tmp/${ORACLE_SID}_check_monitor_log.mail
     echo " "
     echo " " >> /tmp/${ORACLE_SID}_check_monitor_log.mail 
     echo " " >> /tmp/${ORACLE_SID}_check_monitor_log.mail 
     cat /tmp/${ORACLE_SID}_check_monitor_log.log >>  /tmp/${ORACLE_SID}_check_monitor_log.mail
     MAIL_SUB="Found errors in ${ORACLE_SID} on ${MACHINE}"
     $MAIL_DIR/sendEmail -u $MAIL_SUB -f $MAIL_FM -t $MAIL_LIST -o message-file=/tmp/${ORACLE_SID}_check_monitor_log.mail

# --------------------------------
#  Backup current alert log file
# --------------------------------
    mv ${ORIG_ALERT_LOG} ${TEMP_ALERT_LOG}
    cat ${TEMP_ALERT_LOG} >> ${NEW_ALERT_LOG}
    #touch ${ORIG_ALERT_LOG}
    cat /dev/null > ${ORIG_ALERT_LOG}
       rm /tmp/${ORACLE_SID}_check_monitor_log.log 
       rm /tmp/${ORACLE_SID}_check_monitor_log.mail
       rm ${TEMP_ALERT_LOG} > /dev/null 
    exit
fi

rm /tmp/${ORACLE_SID}_check_monitor_log.log > /dev/null
rm /tmp/${ORACLE_SID}_monitor_location.txt > /dev/null

exit

2、过滤Oracle告警日志错误信息

robin@SZDB:~/dba_scripts/custom/bin> more check_alert.awk
$0 ~ /Errors in file/ {print $0}
$0 ~ /PMON: terminating instance due to error 600/ {print $0}
$0 ~ /Started recovery/{print $0}
$0 ~ /Archival required/{print $0}
$0 ~ /Instance terminated/ {print $0}
$0 ~ /Checkpoint not complete/ {print $0}
$1 ~ /ORA-/ { print $0; flag=1 }
$0 !~ /ORA-/ {if (flag==1){print $0; flag=0;print " "} }
$0 ~ /ERROR_AUDIT/ {print $0}

3、老化Oracle告警日志脚本

robin@SZDB:~/dba_scripts/custom/bin> more age_alert.sh
#!/bin/bash
# ------------------------------------------------------------+
#                 Age the alert log file                      |
#   FileName: age_alert.sh                                    |
#   Desc:                                                     | 
#        The script use to age the alert log file             |
#   Usage:                                                    |
#        ./age_alert.sh $ORACLE_SID                           |
#                                                             |
#   Authror : Robinson                                        |
#   Blog    : http://blog.csdn.net/robinson_0612              |
# ------------------------------------------------------------+

# --------------------------
#   Check SID
# --------------------------

if [ -z "${1}" ];then
    echo "Usage: "
    echo "      `basename $0` ORACLE_SID"
    exit 1
fi

# -------------------------------
#  Set environment here
# ------------------------------

if [ -f ~/.bash_profile ]; then
    . ~/.bash_profile
fi

export MAIL_DIR=/users/robin/dba_scripts/sendEmail-v1.56
export MAIL_LIST='[email protected]'
export MAIL_FM='[email protected]'
ORACLE_SID=$1;  export ORACLE_SID

# ----------------------------------------------
# check if the database is running, if not exit
# ----------------------------------------------

db_stat=`ps -ef | grep pmon_$ORACLE_SID | grep -v grep| cut -f3 -d_`
if [ -z "$db_stat" ]; then
    echo " $ORACLE_SID is not available on `hostname` !!!"  
    MAIL_SUB=" $ORACLE_SID is not available on `hostname` !!!"
    MAIL_MSG="$ORACLE_SID is not available on `hostname` before age alert log file, exit, please check !"
    $MAIL_DIR/sendEmail -u $MAIL_SUB -f $MAIL_FM -t $MAIL_LIST -m $MAIL_MSG
    exit 1
fi

# -----------------------------------
# Find bdump directory for database
# -----------------------------------

DUMP_DIR=`sqlplus -S '/ as sysdba' << EOF
set pagesize 0 feedback off verify off heading off echo off
SELECT value FROM  v\\$parameter WHERE  name = 'background_dump_dest';
exit
EOF`
if [ -z ${DUMP_DIR} ]; then
    echo "The bdump directory was not found for ${ORACLE_SID}"
    MAIL_SUB="The bdump directory was not found for ${ORACLE_SID}"
    MAIL_MSG="The bdump directory was not found for ${ORACLE_SID} on `hostname` before age log file,exit,please check !"
    $MAIL_DIR/sendEmail -u $MAIL_SUB -f $MAIL_FM -t $MAIL_LIST -m $MAIL_MSG
    exit 1
else
    echo ${DUMP_DIR}
fi

# -------------------------------
#  Archive alert log file
# -------------------------------

DT=`date +%Y%m%d -d '-1 day'`
OLD_DIR=${DT:0:6}
NEW_DIR=`date +%Y%m`
ORIG_ALERT_LOG=${DUMP_DIR}/alert_${ORACLE_SID}.log
OLD_ARC_DIR=${DUMP_DIR}/${OLD_DIR}
NEW_ARC_DIR=${DUMP_DIR}/${NEW_DIR}

if [ ! -d "${NEW_ARC_DIR}" ] ; then
    mkdir ${NEW_ARC_DIR}
fi

if [ "${OLD_DIR}" \< "${NEW_DIR}" ];then
    ARC_LOG=${OLD_ARC_DIR}/alert_${ORACLE_SID}.log.${DT}
else
    ARC_LOG=${NEW_ARC_DIR}/alert_${ORACLE_SID}.log.${DT}
fi

cat ${ORIG_ALERT_LOG} >>${ARC_LOG}
cat /dev/null>${ORIG_ALERT_LOG}
exit

4、部署脚本到crontab

*/15 * * * * /users/robin/dba_scripts/custom/bin/ck_alert.sh MMBOTST
0 0 * * * /users/robin/dba_scripts/custom/bin/age_alert.sh MMBOTST


5、补充
  a、上面脚本用于实时监控Oracle告警日志,一旦检测到错误,将发送邮件。
  b、对于已经检查过且发现错误的日志将被移动作为归档,也就是Oracle错误不会被重复检测。
  c、所有有关错误检测的过滤条件被放置到过滤文件check_alert.awk中。
  d、第3个脚本用于老化告警日志,建议设置老化的时间为每天0点,这样子,每天将会保留当天的告警日志。
  e、对于老化的告警日值,按年月来存放,也即是以年月命名文件夹,当天告警日志会存放在当月文件夹。
  f、使用了sendEmail邮件发送程序来发送邮件。参阅:不可或缺的 sendEmail
  g、该脚本仅在Oracle 10g测试可用,Oracle 11g应做相应修改。

 

更多参考

有关Oracle RAC请参考
     使用crs_setperm修改RAC资源的所有者及权限
     使用crs_profile管理RAC资源配置文件
     RAC 数据库的启动与关闭
     再说 Oracle RAC services
     Services in Oracle Database 10g
     Migrate datbase from single instance to Oracle RAC
     Oracle RAC 连接到指定实例
     Oracle RAC 负载均衡测试(结合服务器端与客户端)
     Oracle RAC 服务器端连接负载均衡(Load Balance)
     Oracle RAC 客户端连接负载均衡(Load Balance)
     ORACLE RAC 下非缺省端口监听配置(listener.ora tnsnames.ora)
     ORACLE RAC 监听配置 (listener.ora tnsnames.ora)
     配置 RAC 负载均衡与故障转移
     CRS-1006 , CRS-0215 故障一例 
     基于Linux (RHEL 5.5) 安装Oracle 10g RAC
     使用 runcluvfy 校验Oracle RAC安装环境

有关Oracle 网络配置相关基础以及概念性的问题请参考:
     配置非默认端口的动态服务注册
     配置sqlnet.ora限制IP访问Oracle
     Oracle 监听器日志配置与管理
     设置 Oracle 监听器密码(LISTENER)
     配置ORACLE 客户端连接到数据库

有关基于用户管理的备份和备份恢复的概念请参考
     Oracle 冷备份
     Oracle 热备份
     Oracle 备份恢复概念
     Oracle 实例恢复
     Oracle 基于用户管理恢复的处理
     SYSTEM 表空间管理及备份恢复
     SYSAUX表空间管理及恢复
     Oracle 基于备份控制文件的恢复(unsing backup controlfile)

有关RMAN的备份恢复与管理请参考
     RMAN 概述及其体系结构
     RMAN 配置、监控与管理
     RMAN 备份详解
     RMAN 还原与恢复
     RMAN catalog 的创建和使用
     基于catalog 创建RMAN存储脚本
     基于catalog 的RMAN 备份与恢复
     RMAN 备份路径困惑
     使用RMAN实现异机备份恢复(WIN平台)
     使用RMAN迁移文件系统数据库到ASM
     linux 下RMAN备份shell脚本
     使用RMAN迁移数据库到异机

有关ORACLE体系结构请参考
     Oracle 表空间与数据文件
     Oracle 密码文件
     Oracle 参数文件
     Oracle 联机重做日志文件(ONLINE LOG FILE)
     Oracle 控制文件(CONTROLFILE)
     Oracle 归档日志
     Oracle 回滚(ROLLBACK)和撤销(UNDO)
     Oracle 数据库实例启动关闭过程
     Oracle 10g SGA 的自动化管理
     Oracle 实例和Oracle数据库(Oracle体系结构) 

你可能感兴趣的:(-----SQL/Shell,脚本)