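During a routine database inspection I noticed that the grid installation directory had grown very large. Further investigation showed that many *.l10 log files had been deleted but their space had never been released, which the lsof command makes visible: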
lsof|grep delete
[smisa@smidb11 oraagent_grid]$ sudo lsof |grep delete|grep -E "oraagent|audit"
oracle 18375 grid 33w REG 253,18 10549871 1715713 /oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.l10 (deleted)
oracle 25466 grid 4w REG 253,18 10576898 1704191 /oracle/app/11.2.0/grid_1/log/smidb11/agent/ohasd/oraagent_grid/oraagent_grid.l10 (deleted)
oracle 25468 grid 4w REG 253,18 10576898 1704191 /oracle/app/11.2.0/grid_1/log/smidb11/agent/ohasd/oraagent_grid/oraagent_grid.l10 (deleted)
......
oracle 25603 grid 4w REG 253,18 10576898 1704191 /oracle/app/11.2.0/grid_1/log/smidb11/agent/ohasd/oraagent_grid/oraagent_grid.l10 (deleted)
oracle 28628 grid 4w REG 253,18 10511009 1704519 /oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.l10 (deleted)

Node 2:
[smisa@smidb12 ~]$ sudo lsof |grep delete|grep -E "oraagent|audit"
oracle 15647 grid 4w REG 253,18 10574076 3150129 /oracle/app/11.2.0/grid_1/log/smidb12/agent/ohasd/oraagent_grid/oraagent_grid.l10 (deleted)
oracle 15649 grid 4w REG 253,18 10574076 3150129 /oracle/app/11.2.0/grid_1/log/smidb12/agent/ohasd/oraagent_grid/oraagent_grid.l10 (deleted)
......
oracle 15887 grid 4w REG 253,18 10543680 3150126 /oracle/app/11.2.0/grid_1/log/smidb12/agent/crsd/oraagent_grid/oraagent_grid.l10 (deleted)
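In the output above, several PIDs hold the same deleted file open (the same NODE value repeats), so adding up the SIZE/OFF column line by line would overcount. The following is a rough sketch for estimating how much space is actually pinned, counting each device and inode pair once. The function name sum_deleted_bytes is made up for illustration, and the field positions assume the lsof layout shown above, where each line ends with DEVICE, SIZE/OFF, NODE, the path, and "(deleted)", with no spaces in the path.

```shell
# Sum the bytes held by deleted-but-open files, counting each inode once.
# Reads lsof-style lines on stdin. Fields are taken relative to the end of
# the line, assuming:  ... DEVICE SIZE/OFF NODE /path/to/file (deleted)
sum_deleted_bytes() {
    awk 'NF >= 5 && $NF == "(deleted)" {
             key = $(NF-4) ":" $(NF-2)        # device + inode identifies the file
             if (!(key in seen)) {
                 seen[key] = 1
                 total += $(NF-3)             # SIZE/OFF column, in bytes
             }
         }
         END { printf "%d\n", total }'
}

# Usage (root is needed to see other users' processes):
#   sudo lsof | grep -E "oraagent|audit" | sum_deleted_bytes
```

The result is only an estimate: for files still being written, SIZE/OFF may report an offset rather than the file size.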
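As grid keeps running, these deleted-but-unreleased logs keep accumulating and will eventually exhaust the disk space. A search on MOS showed that this matches Bug 17034444.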
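However, there is no patch for this bug yet; one would have to be requested from development. So I wrote a script to work around it. The idea is:
Scan the directory at regular intervals to check whether an oraagent_grid.l10 file has been created.
If the file exists, delete it before grid deletes it automatically, which avoids the bug where grid does not release the file.
The script is as follows: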
#!/bin/bash
# Load the grid user's environment
source /home/grid/.bash_profile

HOSTNAME=`hostname`
# The two oraagent_grid.l10 files affected by the bug
LOGFILE1=/oracle/app/11.2.0/grid_1/log/${HOSTNAME}/agent/crsd/oraagent_grid/oraagent_grid.l10
LOGFILE2=/oracle/app/11.2.0/grid_1/log/${HOSTNAME}/agent/ohasd/oraagent_grid/oraagent_grid.l10
# Listener XML alert logs and text trace log
LISTENER_XML_LOG=/oracle/app/grid/diag/tnslsnr/${HOSTNAME}/listener/alert/log_*.xml
LISTENER_TRACE_FILE=/oracle/app/grid/diag/tnslsnr/${HOSTNAME}/listener/trace/listener.log
DATETIME=`date +%Y%m%d%H%M%S`

echo "${DATETIME} Is Run!" >>/home/grid/script/clearlog.log

# Remove the .l10 files before grid rotates them away itself
if [ -e ${LOGFILE1} ] ; then
    rm -rf ${LOGFILE1}
    echo "${LOGFILE1} Deleted!">>/home/grid/script/clearlog.log
fi
if [ -e ${LOGFILE2} ] ; then
    rm -rf ${LOGFILE2}
    echo "${LOGFILE2} Deleted!">>/home/grid/script/clearlog.log
fi

# Purge the listener XML logs once the alert directory holds more than 100 entries
FILE_CNT=`ls -l /oracle/app/grid/diag/tnslsnr/${HOSTNAME}/listener/alert/|wc -l`
if [ ${FILE_CNT} -gt 100 ] ; then
    ls -lrt ${LISTENER_XML_LOG}>>/home/grid/script/clearlog.log
    rm -rf ${LISTENER_XML_LOG}
fi

# Truncate listener.log once it exceeds 1 GB (du -sk reports KB; 1048576 KB = 1 GB)
if [ `du -sk ${LISTENER_TRACE_FILE}|awk '{print $1}'` -gt 1048576 ] ; then
    echo "${LISTENER_TRACE_FILE} Clear!!">>/home/grid/script/clearlog.log
    >${LISTENER_TRACE_FILE}
fi
exit 0
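Before using the script, change the variables at the top to the values that match the current server. Besides deleting the *.l10 log files, the script also cleans up the listener logs automatically.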
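The approach relies on the script running at a fixed interval, and cron is one way to arrange that. The sketch below only builds the crontab line. The helper name cron_entry, the 10-minute interval, and the script path /home/grid/script/clearlog.sh are all assumptions for illustration (the path is guessed from the clearlog.log location used in the script); adjust them to the actual environment.

```shell
# Print a crontab line that runs a script every N minutes and discards its output.
# $1 = interval in minutes (assumed value, choose one shorter than the log rotation cycle)
# $2 = absolute path of the script to run (assumed name)
cron_entry() {
    local minutes="$1" script="$2"
    printf '*/%s * * * * /bin/bash %s >/dev/null 2>&1\n' "$minutes" "$script"
}

# Example: append the entry to the grid user's crontab (run as grid):
#   ( crontab -l 2>/dev/null; cron_entry 10 /home/grid/script/clearlog.sh ) | crontab -
```

Since the script already appends to its own clearlog.log, discarding cron's output here should not lose information.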
Oracle's official description: