RAC Node Restarts Frequently

  Current log# 8 seq# 207761 mem# 0: +ILN_DATA/ilndb/onlinelog/group_8.269.757270173
Fri Sep 09 12:00:04 EAT 2016
Thread 3 advanced to log sequence 207762 (LGWR switch)
  Current log# 7 seq# 207762 mem# 0: +ILN_DATA/ilndb/onlinelog/group_7.268.757270171
Fri Sep 09 12:00:29 EAT 2016
Thread 3 advanced to log sequence 207763 (LGWR switch)
  Current log# 9 seq# 207763 mem# 0: +ILN_DATA/ilndb/onlinelog/group_9.270.757270173
Thread 3 cannot allocate new log, sequence 207764
Checkpoint not complete
  Current log# 9 seq# 207763 mem# 0: +ILN_DATA/ilndb/onlinelog/group_9.270.757270173
Fri Sep 09 12:00:44 EAT 2016
Thread 3 advanced to log sequence 207764 (LGWR switch)
  Current log# 8 seq# 207764 mem# 0: +ILN_DATA/ilndb/onlinelog/group_8.269.757270173
Fri Sep 09 12:23:15 EAT 2016
Errors in file /home/oracle/product/admin/ilndb/udump/ilndb3_ora_15090.trc:
ORA-07445: exception encountered: core dump [_memcmp()+16] [SIGSEGV] [unknown code] [0x13139332C31] [] []
Fri Sep 09 12:23:17 EAT 2016
Trace dumping is performing id=[cdmp_20160909122317]
Fri Sep 09 12:23:32 EAT 2016
Errors in file /home/oracle/product/admin/ilndb/bdump/ilndb3_pmon_14666.trc:
ORA-07445: exception encountered: core dump [kgllkdl()+548] [SIGSEGV] [unknown code] [0x800000065556B2C8] [] []
Fri Sep 09 12:23:34 EAT 2016
Errors in file /home/oracle/product/admin/ilndb/bdump/ilndb3_pmon_14666.trc:
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], []
Fri Sep 09 12:23:35 EAT 2016
Errors in file /home/oracle/product/admin/ilndb/bdump/ilndb3_pmon_14666.trc:
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], []
Fri Sep 09 12:23:35 EAT 2016
PMON: terminating instance due to error 472
Fri Sep 09 12:23:37 EAT 2016
Shutting down instance (abort)

The main trace files are:
/home/oracle/product/admin/ilndb/udump/ilndb3_ora_15090.trc
/home/oracle/product/admin/ilndb/bdump/ilndb3_pmon_14666.trc
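
Picking these trace files out of the alert log by hand is error-prone when the log is long. The following is a minimal sketch, assuming the alert log uses the "Errors in file ...:" followed by "ORA-..." layout shown above; the function name collect_trace_errors is hypothetical and is not part of any Oracle tooling.

```python
import re

# Matches alert log lines such as "Errors in file /path/to/x.trc:"
TRACE_RE = re.compile(r"^Errors in file (\S+\.trc):\s*$")
# Matches the ORA- code at the start of an error line
ORA_RE = re.compile(r"^(ORA-\d+):")


def collect_trace_errors(alert_text):
    """Map each trace file path to the ORA- codes listed directly after it."""
    result = {}
    current = None  # trace file whose error lines we are currently reading
    for raw in alert_text.splitlines():
        line = raw.strip()
        m = TRACE_RE.match(line)
        if m:
            current = m.group(1)
            result.setdefault(current, [])
            continue
        m = ORA_RE.match(line)
        if m and current is not None:
            result[current].append(m.group(1))
        else:
            # Any other line (timestamp, notice) ends the current error group
            current = None
    return result
```

Calling collect_trace_errors(open(path).read()) on the alert log returns a dictionary keyed by trace file, which makes it easy to see which process (here a user session and PMON) reported which errors.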

Examining the trace files, starting with ilndb3_ora_15090.trc, shows that the file contains a large number of the following warning:
WARNING:Could not increase the asynch I/O limit to 192 for SQL direct I/O. It is set to 0

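Matching the operating system type and database version against known issues identifies this warning as an Oracle bug.
The original Metalink entry is: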

Bug 9949948 - Linux: Process spin under ksfdrwat0 if OS Async IO not configured high enough [ID 9949948.8]

The fix is either to raise the system AIO limit or to disable asynchronous I/O:

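  1. Raise the limit:
  2. Disable asynchronous I/O: set the Oracle parameter DISK_ASYNC_IO=FALSE (a static parameter, so it takes effect only after the instance is restarted)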
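
Before choosing between the two options it helps to know how close the system is to its AIO ceiling. The following is a minimal sketch, assuming the host is Linux as the bug title states (the excerpts in this post do not confirm the platform). On Linux the kernel exposes the current and maximum number of reserved AIO requests in /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr, and the ceiling is controlled by the fs.aio-max-nr sysctl. The function names read_aio_counters and aio_headroom are hypothetical.

```python
from pathlib import Path


def read_aio_counters(proc_dir="/proc/sys/fs"):
    """Return (aio_nr, aio_max_nr) read from the Linux proc files."""
    base = Path(proc_dir)
    aio_nr = int((base / "aio-nr").read_text().strip())
    aio_max_nr = int((base / "aio-max-nr").read_text().strip())
    return aio_nr, aio_max_nr


def aio_headroom(aio_nr, aio_max_nr):
    """Number of AIO requests that can still be reserved system-wide."""
    # Clamp at zero so an exhausted limit never reports a negative value
    return max(aio_max_nr - aio_nr, 0)
```

A typical use is used, limit = read_aio_counters() followed by aio_headroom(used, limit). A headroom at or near zero would be consistent with the "Could not increase the asynch I/O limit" warnings, and would point toward option 1 rather than disabling asynchronous I/O.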

At 12:23:15 the trace no longer stops at warnings; instead, the failure that leads to the database restart is triggered:

*** 2016-09-09 12:23:15.468
WARNING:Could not increase the asynch I/O limit to 64 for SQL direct I/O. It is set to 0
WARNING:Could not increase the asynch I/O limit to 64 for SQL direct I/O. It is set to 0
NOTICE: Signal generated by another process. A call stack
        dump does not mean that a problem exists in this process
Exception signal: 11 (SIGSEGV), code: 0 (unknown code), addr: 0x0000013139332c31, PC: [0xc00000000018eba0, _memcmp()+16]
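
To confirm that this trace entry is the same event as the ORA-07445 reported in the alert log, the fault address and function can be compared mechanically. The following is a minimal sketch, assuming the "Exception signal" line always has the layout shown above; the names parse_exception_line and same_address are hypothetical.

```python
import re

# Layout assumed from the trace line above:
# "Exception signal: N (NAME), code: N (text), addr: 0x..., PC: [0x..., symbol]"
EXC_RE = re.compile(
    r"Exception signal: (?P<signo>\d+) \((?P<signame>\w+)\), "
    r"code: (?P<code>\d+) \((?P<codename>[^)]*)\), "
    r"addr: (?P<addr>0x[0-9a-fA-F]+), "
    r"PC: \[(?P<pc>0x[0-9a-fA-F]+), (?P<symbol>[^\]]+)\]"
)


def parse_exception_line(line):
    """Return the fields of an 'Exception signal' line as a dict, or None."""
    m = EXC_RE.search(line)
    return m.groupdict() if m else None


def same_address(trace_addr, alert_addr):
    """Compare two hex addresses, ignoring case and leading zeros."""
    return int(trace_addr, 16) == int(alert_addr, 16)
```

For the line above, the parsed symbol is _memcmp()+16 and the address 0x0000013139332c31 compares equal to the 0x13139332C31 argument of the ORA-07445 in the alert log, which ties the two records together.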

To be continued.
