LMD0 (ospid: 8664): terminating the instance due to error 481

An example of a bug

LGWR: Standby redo logfile selected for thread 2 sequence 5168 for destination LOG_ARCHIVE_DEST_3
Thread 2 advanced to log sequence 5168 (LGWR switch)
Current log# 8 seq# 5168 mem# 0: +DATA/sqmesdb/onlinelog/group_8.275.966805977
Current log# 8 seq# 5168 mem# 1: +FRA/sqmesdb/onlinelog/group_8.268.966805979
Sat Nov 24 16:36:10 2018
Archived Log entry 29259 added for thread 2 sequence 5167 ID 0x620b7fd7 dest 1:
Sat Nov 24 16:50:48 2018
LMD0 (ospid: 8664) received an instance eviction notification from instance 1 [2]
Sat Nov 24 16:50:48 2018
LMON received an instance eviction notification from instance 1
The instance eviction reason is 0x2
The instance eviction map is 2
Received an instance abort message from instance 3
Please check instance 3 alert and LMON trace files for detail.
Sat Nov 24 16:50:49 2018
Received an instance abort message from instance 3
Please check instance 3 alert and LMON trace files for detail.
LMD0 (ospid: 8664): terminating the instance due to error 481
Sat Nov 24 16:50:49 2018
System state dump requested by (instance=2, osid=8664 (LMD0)), summary=[abnormal instance termination].
System State dumped to trace file /u01/oracle/diag/rdbms/sqmesdb/sqmesdb2/trace/sqmesdb2_diag_8652_20181124165049.trc
Errors in file /u01/oracle/diag/rdbms/sqmesdb/sqmesdb2/trace/sqmesdb2_diag_8652_20181124165049.trc:
ORA-00601: cleanup lock conflict
Dumping diagnostic data in directory=[cdmp_20181124165049], requested by (instance=2, osid=8664 (LMD0)), summary=[abnormal instance termination].
Instance terminated by LMD0, pid = 8664
Sat Nov 24 16:57:26 2018
Adjusting the default value of parameter parallel_max_servers
from 3600 to 1470 due to the value of parameter processes (1500)
Starting ORACLE instance (normal)
************************ Large Pages Information *******************
Per process system memlock (soft) limit = 64 KB

Total Shared Global Region in Large Pages = 0 KB (0%)

Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB

RECOMMENDATION:

#4 bug 13593999 - Instance startup fails after Timeout waiting for receiver sync for indirect messaging
Symptoms

This is an example of instance 2 being evicted by instance 1 while instance 2 is starting up.

lmon trace (evicting instance)

2012-05-10 08:09:37.047017 : * kjfcdrmrcfg: waited 249 secs for lmd to receive all  ftdones in sync step 34, requesting memberkill of instances w/o ftdones:
2012-05-10 08:09:37.047266 : kjfsprn: sync status  inst 1  tmout 0 (sec)
2012-05-10 08:09:37.047281 : kjfsprn: sync propose inc 0  level 0
2012-05-10 08:09:37.047294 : kjfsprn: ftdone bitmap ver 118 (step 0.34/0.0)
..
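
When scanning a long LMON trace for this symptom, it can help to pull out just the reconfiguration wait time and sync step. The following is a minimal sketch, assuming the trace line keeps the wording shown above; the function name `find_reconfig_waits` and the regular expression are illustrative and not part of any Oracle tool.

```python
import re

# Pattern modelled on the "kjfcdrmrcfg: waited N secs ... sync step M" line
# quoted above. The exact wording may differ between versions (assumption).
WAIT_RE = re.compile(
    r"kjfcdrmrcfg: waited (?P<secs>\d+) secs .*? sync step (?P<step>\d+)"
)


def find_reconfig_waits(trace_lines):
    """Return (seconds_waited, sync_step) for each matching LMON trace line."""
    waits = []
    for line in trace_lines:
        m = WAIT_RE.search(line)
        if m:
            waits.append((int(m.group("secs")), int(m.group("step"))))
    return waits


# Sample line copied from the trace excerpt above.
sample_trace = [
    "2012-05-10 08:09:37.047017 : * kjfcdrmrcfg: waited 249 secs for lmd to "
    "receive all  ftdones in sync step 34, requesting memberkill of instances "
    "w/o ftdones:",
    "2012-05-10 08:09:37.047266 : kjfsprn: sync status  inst 1  tmout 0 (sec)",
]

print(find_reconfig_waits(sample_trace))
```

A long wait followed by "requesting memberkill" in the evicting instance's LMON trace is the part of the symptom that this sketch isolates.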

alert_racdb2.log (evicted instance - the one trying to startup)

LMON received an instance eviction notification from instance 1
The instance eviction reason is 0x2
The instance eviction map is 2
Thu May 10 08:09:39 2012
LMS1 (ospid: 6947492) received an instance eviction notification from instance 1 [2]
Thu May 10 08:09:40 2012
PMON (ospid: 6750786): terminating the instance due to error 481
Thu May 10 08:09:40 2012
opiodr aborting process unknown ospid (58328524) as a result of ORA-1092
Thu May 10 08:09:40 2012
System state dump requested by (instance=2, osid=6750786 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /dump/diag/rdbms/coredb/coredb2/trace/coredb2_diag_5964522.trc
Instance terminated by PMON, pid = 6750786
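
The alert log pattern above (an eviction notification followed by termination with error 481) can be checked for mechanically. This is a rough sketch, assuming the message wording shown in the excerpts; `summarize_eviction` and both regular expressions are hypothetical helpers written for illustration only.

```python
import re

# Patterns modelled on the alert log lines quoted above (assumed wording).
EVICTION_RE = re.compile(
    r"^(?P<proc>\w+)(?: \(ospid: (?P<ospid>\d+)\))? received an instance "
    r"eviction notification from instance (?P<evictor>\d+)"
)
TERMINATE_RE = re.compile(
    r"^(?P<proc>\w+) \(ospid: (?P<ospid>\d+)\): terminating the instance "
    r"due to error (?P<error>\d+)"
)


def summarize_eviction(alert_lines):
    """Collect evicting instance numbers and the first termination record."""
    evictors = set()
    termination = None
    for raw in alert_lines:
        line = raw.strip()
        m = EVICTION_RE.match(line)
        if m:
            evictors.add(int(m.group("evictor")))
            continue
        m = TERMINATE_RE.match(line)
        if m and termination is None:
            termination = (
                m.group("proc"),
                int(m.group("ospid")),
                int(m.group("error")),
            )
    return {"evictors": sorted(evictors), "termination": termination}


# Sample lines copied from the alert log excerpt above.
sample_alert = [
    "LMON received an instance eviction notification from instance 1",
    "The instance eviction reason is 0x2",
    "LMS1 (ospid: 6947492) received an instance eviction notification from instance 1 [2]",
    "PMON (ospid: 6750786): terminating the instance due to error 481",
]

print(summarize_eviction(sample_alert))
```

The same function applies to the first alert log in this post, where LMD0 rather than PMON terminates the instance.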

lmd0 trace (evicted instance)

Timeout waiting for receiver sync for indirect messaging, waited=164 seconds (-856134688 -684158216)   SYNC_RCVCHANNEL 70000112245d9e8 from 1 spnum 11 ver[118,1]
     receiver 0 last sent seq 0.268500122
     receiver 1 last sent seq 0.381803485
..
MSG [85:KJX_SYNCALL] inc=118 len=64 sender=(1,1) seq=299546896
     fg=s stat=KJUSERSTAT_NOVALUE spnum=11 flg=x0
  START DEFER MSG QUEUE 0 ON LMD0 flg xc4:
MSG [61:KJX_SYNC_RCVCHANNEL] inc=118 len=424 sender=(4,4) seq=267582035
     fg=iq stat=KJUSERSTAT_DONE spnum=11 flg=x1
     SYNC_RCVCHANNEL 700001121ed1e70 from 4 spnum 11 ver[118,-659046560]
     receiver 0 last sent seq 0.267526687
     receiver 1 last sent seq 0.52735268
..
MSG [25:KJX_FTDONE] inc=118 len=64 sender=(4,4) seq=29564188
     fg=s stat=KJUSERSTAT_TIMEOUT spnum=11 flg=x0
MSG [25:KJX_FTDONE] inc=118 len=64 sender=(4,4) seq=29564188
     fg=s stat=KJUSERSTAT_TIMEOUT spnum=11 flg=x0
...
  START DEFER MSG QUEUE 2 ON LMS0 flg x87:
MSG [61:KJX_SYNC_RCVCHANNEL] inc=118 len=424 sender=(1,1) seq=268500218
     fg=iq stat=KJUSERSTAT_DONE spnum=11 flg=x1
     SYNC_RCVCHANNEL 70000112245d9e8 from 1 spnum 11 ver[118,1]
     receiver 0 last sent seq 0.268500122
     receiver 1 last sent seq 0.381803485
..

Solutions

Bug 13593999 is fixed in 11.2.0.4. If the fix is not available for your platform/version, request an interim patch.
