问题背景:
将11gR2 RAC正常部署完成之后执行两节点重启操作发现其中有一个节点的集群资源无法启动,遂再次重启该无法启动集群资源的节点,还是不可。随即将正常节点重启发现原故障节点资源起来了,待重启完毕后原正常节点资源无法启动。
集群环境:
OS:RedHat EnterPrise5.8 x86_x64
DB:Oracle EnterPrise Database 11.2.0.4.0 x86_x64
GRID:Oracle Grid Infrastructure 11.2.0.4 x86_x64
心跳和公网网卡均做bond1绑定;
存储是IBM的,故采用RDAC做多路径聚合;
问题分析:
检查OS系统日志和集群日志
db01 grid 日志信息
2014-09-29 15:36:36.587: [ctssd(12616)]CRS-2405:The Cluster Time Synchronization Service on host db01 is shutdown by user 2014-09-29 15:36:36.589: [mdnsd(6173)]CRS-5602:mDNS service stopping by request. [client(17411)]CRS-10001:29-Sep-14 15:36 ACFS-9290: Waiting for ASM to shutdown. 2014-09-29 15:36:46.395: [cssd(8958)]CRS-1603:CSSD on node db01 shutdown by user. 2014-09-29 15:36:46.509: [ohasd(12463)]CRS-2767:Resource state recovery not attempted for 'ora.cssdmonitor' as its target state is OFFLINE 2014-09-29 15:36:46.509: [ohasd(12463)]CRS-2769:Unable to failover resource 'ora.cssdmonitor'. 2014-09-29 15:36:46.608: [cssd(8958)]CRS-1660:The CSS daemon shutdown has completed ocssd 2014-09-29 15:36:46.052: [ CSSD][1122224448]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:36:46.053: [ CSSD][1122224448]clssnmSendingThread: sent 4 status msgs to all nodes 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmExitGrock: client 1 (0x1261ee30), grock haip.cluster_interconnect, member 0 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmUnregisterPrimary: Unregistering member 0 (0x1261ace0) in global grock haip.cluster_interconnect 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmUnreferenceMember: global grock haip.cluster_interconnect member 0 refcount is 0 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmAllocateRPCIndex: allocated rpc 357 (0x2aaaaaee8a58) 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmRPC: rpc 0x2aaaaaee8a58 (RPC#357) tag(165002a) sent to node 1 2014-09-29 15:36:46.388: [ CSSD][1079531840]clssgmHandleMasterMemberExit: [s(1) d(1)] 2014-09-29 15:36:46.388: [ CSSD][1079531840]clssgmRemoveMember: grock haip.cluster_interconnect, member number 0 (0x1261ace0) node number 1 state 0x4 grock type 2 2014-09-29 15:36:46.388: [ CSSD][1079531840]clssgmGrantLocks: 0-> new master (1/2) group haip.cluster_interconnect 2014-09-29 15:36:46.388: [ CSSD][1079531840]clssgmRPCDone: rpc 0x2aaaaaee8a58 (RPC#357) state 6, flags 0x100 2014-09-29 15:36:46.388: [ CSSD][1079531840]clssgmDelMemCmpl: rpc 0x2aaaaaee8a58, ret 0, client 0x1261ee30 member 0x1261ace0 2014-09-29 15:36:46.388: [ CSSD][1079531840]clssgmFreeRPCIndex: freeing rpc 357 2014-09-29 15:36:46.388: [ CSSD][1079531840]clssgmAllocateRPCIndex: allocated rpc 358 (0x2aaaaaee8b00) 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmDiscEndpcl: gipcDestroy 0x1a370 2014-09-29 15:36:46.388: [ CSSD][1079531840]clssgmRPCBroadcast: rpc(0x166002a), status(1), sendcount(1), filtered by specific properties: 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmDeadProc: proc 0x1262a8a0 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmDestroyProc: cleaning up proc(0x1262a8a0) con(0x1a341) skgpid ospid 12515 with 0 clients, refcount 0 2014-09-29 15:36:46.388: [ CSSD][1089472832]clssgmDiscEndpcl: gipcDestroy 0x1a341 2014-09-29 15:36:46.389: [ CSSD][1079531840]clssgmRPCDone: rpc 0x2aaaaaee8b00 (RPC#358) state 4, flags 0x402 2014-09-29 15:36:46.389: [ CSSD][1079531840]clssgmBroadcastGrockRcfgCmpl: RPC(0x166002a) of grock(haip.cluster_interconnect) received all acks, grock update sequence(4) 2014-09-29 15:36:46.389: [ CSSD][1079531840]clssgmFreeRPCIndex: freeing rpc 358 2014-09-29 15:36:46.395: [ CSSD][1089472832]clssgmExecuteClientRequest: MAINT recvd from proc 7 (0x1264c8e0) 2014-09-29 15:36:46.395: [ CSSD][1089472832]clssgmShutDown: Received explicit shutdown request from client. 2014-09-29 15:36:46.395: [ CSSD][1089472832]clssgmClientShutdown: total iocapables 0 2014-09-29 15:36:46.395: [ CSSD][1089472832]clssgmClientShutdown: graceful shutdown completed. 2014-09-29 15:36:46.395: [ CSSD][1089472832]clssgmClientShutdown: signaling to the agent that resource should remain down 2014-09-29 15:36:46.395: [ CSSD][1089472832]clssgmCompareSwapEventValue: changed CmInfo State val 0, from 11, changes 21 2014-09-29 15:36:46.395: [ CSSD][1079531840]clssgmPeerListener: terminating at incarn(307480783) 2014-09-29 15:36:46.395: [ CSSD][1079531840]clssgmPeerDeactivate: node 2 (db02), death 0, state 0x1 connstate 0xf 2014-09-29 15:36:46.395: [ CSSD][1079531840]clssgmCleanFuture: discarded 0 future msgs for 2 2014-09-29 15:36:46.395: [ CSSD][1079531840]clssgmDiscEndppl: gipcDestroy 0x26027 2014-09-29 15:36:46.496: [ CSSD][1089472832]clssnmSendManualShut: Notifying all nodes that this node has been manually shut down 2014-09-29 15:36:46.497: [GIPCHAUP][1091049792] gipchaUpperDisconnect: initiated discconnect umsg 0x12ac30d0 { msg 0x12ad9f08, ret gipcretRequestPending (15), flags 0x2 }, msg 0x12ad9f08 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00026038, dstCid 00000000-00000d5b }, endp 0x12ac12d0 [0000000000026038] { gipchaEndpoint : port '5996-da20-cc7e-b119', peer 'db02:gm2_db-cluster/2991-10ef-6fca-054c', srcCid 00000000-00026038, dstCid 00000000-00000d5b, numSend 0, maxSend 100, groupListType 2, hagroup 0x12a9f4e0, usrFlags 0x4000, flags 0x21c } 2014-09-29 15:36:46.497: [GIPCHAUP][1091049792] gipchaUpperCallbackDisconnect: completed DISCONNECT ret gipcretSuccess (0), umsg 0x12ac30d0 { msg 0x12ad9f08, ret gipcretSuccess (0), flags 0x2 }, msg 0x12ad9f08 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00026038, dstCid 00000000-00000d5b }, hendp 0x12ac12d0 [0000000000026038] { gipchaEndpoint : port '5996-da20-cc7e-b119', peer 'db02:gm2_db-cluster/2991-10ef-6fca-054c', srcCid 00000000-00026038, dstCid 00000000-00000d5b, numSend 0, maxSend 100, groupListType 2, hagroup 0x12a9f4e0, usrFlags 0x4000, flags 0x21c } 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x2aaaac8b3180) proc (0x2aaaac937fd0), iocapables 1. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x2aaaac937fd0), pid (6300), iocapables 1, client (0x2aaaac8b3180) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x2aaaac9773e0) proc (0x2aaaac937fd0), iocapables 2. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x2aaaac937fd0), pid (6300), iocapables 2, client (0x2aaaac9773e0) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x2aaaac958d50) proc (0x2aaaac99d570), iocapables 3. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x2aaaac99d570), pid (6208), iocapables 3, client (0x2aaaac958d50) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x12645650) proc (0x1264c8e0), iocapables 4. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x1264c8e0), pid (12518), iocapables 4, client (0x12645650) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x126171b0) proc (0x1264c8e0), iocapables 5. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x1264c8e0), pid (12518), iocapables 5, client (0x126171b0) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x1264cef0) proc (0x1264c8e0), iocapables 6. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x1264c8e0), pid (12518), iocapables 6, client (0x1264cef0) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x125f0340) proc (0x1260f910), iocapables 7. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x1260f910), pid (12520), iocapables 7, client (0x125f0340) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x125e98b0) proc (0x1260f910), iocapables 8. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x1260f910), pid (12520), iocapables 8, client (0x125e98b0) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: Aborting client (0x12ac2790) proc (0x1260f910), iocapables 9. 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmSendShutdown: I/O capable proc (0x1260f910), pid (12520), iocapables 9, client (0x12ac2790) 2014-09-29 15:36:46.506: [ CSSD][1089472832]clssgmClientShutdown: sending shutdown, fence_done 1 2014-09-29 15:36:46.608: [ default][1089472832]kgzf_fini: called 2014-09-29 15:36:46.608: [ default][1089472832]kgzf_fini1: completed. kgzf layer has quit. crsd 2014-09-29 15:36:13.419: [UiServer][1180916032]{1:44501:180} Sending message to PE. ctx= 0x8a97b60 2014-09-29 15:36:13.420: [ CRSPE][1178814784]{1:44501:180} Cmd : 0x2aaab018b580 : flags: FORCE_TAG 2014-09-29 15:36:13.420: [ CRSPE][1178814784]{1:44501:180} Processing PE command id=238. Description: [Server Shutdown {} : pass=0 : 0x2aaab018b580] 2014-09-29 15:36:13.420: [ CRSPE][1178814784]{1:44501:180} Prepared shutdown cmd for: db01 2014-09-29 15:36:13.420: [ CRSPE][1178814784]{1:44501:180} Server [db01] has changed state from [ONLINE] to [LEAVING] 2014-09-29 15:36:13.420: [ CRSOCR][1176713536]{1:44501:180} Multi Write Batch processing... 2014-09-29 15:36:13.420: [ CRSRPT][1180916032]{1:44501:180} Published to EVM CRS_SERVER_STATE_CHANGE for db01 2014-09-29 15:36:13.437: [ CRSPE][1178814784]{1:44501:180} Op 0x2aaab02145f0 has 3 WOs 2014-09-29 15:36:13.437: [ CRSPE][1178814784]{1:44501:180} Op 0x2aaab0215980 has 4 WOs 2014-09-29 15:36:13.438: [ CRSPE][1178814784]{1:44501:180} Op 0x2aaab02765c0 has 5 WOs 2014-09-29 15:36:13.438: [UiServer][1180916032]{1:44501:180} Container [ Name: ORDER MESSAGE: TextMessage[CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'db01'] MSGTYPE: TextMessage[3] OBJID: TextMessage[db01] WAIT: TextMessage[0] ] 2014-09-29 15:36:13.438: [ CRSPE][1178814784]{1:44501:180} Op 0x2aaab0217fb0 has 11 WOs 2014-09-29 15:36:13.438: [ CRSPE][1178814784]{1:44501:180} Op 0x2aaab0214150 has 9 WOs 2014-09-29 15:36:13.438: [ CRSPE][1178814784]{1:44501:180} Op 0x2aaab0213ee0 has 3 WOs 2014-09-29 15:36:13.439: [ CRSPE][1178814784]{1:44501:180} Op 0x2aaab018ab50 has 14 WOs 2014-09-29 15:36:13.439: [ CRSPE][1178814784]{1:44501:180} RI [ora.cvu 1 1] new internal state: [STOPPING] old value: [STABLE] 2014-09-29 15:36:13.439: [ CRSPE][1178814784]{1:44501:180} Sending message to agfw: id = 1238 2014-09-29 15:36:13.439: [ AGFW][1168308544]{1:44501:180} Agfw Proxy Server received the message: RESOURCE_STOP[ora.cvu 1 1] ID 4099:1238 2014-09-29 15:36:13.439: [ CRSPE][1178814784]{1:44501:180} CRS-2673: Attempting to stop 'ora.cvu' on 'db01' 2014-09-29 15:36:13.439: [ AGFW][1168308544]{1:44501:180} Agfw Proxy Server forwarding the message: RESOURCE_STOP[ora.cvu 1 1] ID 4099:1238 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.440: [UiServer][1180916032]{1:44501:180} Container [ Name: ORDER MESSAGE: TextMessage[CRS-2673: Attempting to stop 'ora.cvu' on 'db01'] MSGTYPE: TextMessage[3] OBJID: TextMessage[ora.cvu] WAIT: TextMessage[0] ] ………….
db02grid 日志信息
grid crsd 2014-09-29 15:36:13.348: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server received the message: RESOURCE_START[ora.cvu 1 1] ID 4098:1263 2014-09-29 15:36:13.348: [ AGFW][1163692352]{1:44501:180} Creating the resource: ora.cvu 1 1 2014-09-29 15:36:13.348: [ AGFW][1163692352]{1:44501:180} Initializing the resource ora.cvu 1 1 for type ora.cvu.type 2014-09-29 15:36:13.348: [ AGFW][1163692352]{1:44501:180} SR: acl = owner:grid:rwx,pgrp:oinstall:rwx,other::r-- 2014-09-29 15:36:13.349: [ AGFW][1163692352]{1:44501:180} Starting the agent: /DBSoft/11.2.4/grid/bin/scriptagent with user id: grid and incarnation:1 2014-09-29 15:36:13.351: [ AGFW][1163692352]{1:44501:180} Starting the HB [Interval = 30000, misscount = 6kill allowed=1] for agent: /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.418: [ CRSCOMM][1157388608] Accepted connection from user: grid 2014-09-29 15:36:13.418: [ CRSCOMM][1157388608] Sending member number msg 0x2aaab00ab4e0 2014-09-29 15:36:13.418: [ CRSCOMM][1157388608] IpcL: Sent member number to client 5 2014-09-29 15:36:13.418: [ CRSCOMM][1157388608] Sending member data msg 0x2aaab00a3800 2014-09-29 15:36:13.418: [ CRSCOMM][1157388608] IpcL: Sent member data to client 5 2014-09-29 15:36:13.419: [ CRSCOMM][1157388608] IpcL: Received member data 2014-09-29 15:36:13.419: [ CRSCOMM][1157388608] Ipc: client version (5) version: 11.2.1.0 2014-09-29 15:36:13.419: [ CRSCOMM][1157388608] IpcL: Found connection in pending connections 2014-09-29 15:36:13.419: [ CRSCOMM][1157388608] IpcL: Adding connection: 5 2014-09-29 15:36:13.419: [CLSFRAME][1157388608] New IPC Member:{Relative|Node:0|Process:5|Type:3}:AGENT username=grid 2014-09-29 15:36:13.419: [CLSFRAME][1157388608] New process connected to us ID:{Relative|Node:0|Process:5|Type:3} Info:AGENT 2014-09-29 15:36:13.420: [ AGFW][1163692352]{0:5:2} Agfw Proxy Server received the message: AGENT_HANDSHAKE[Proxy] ID 20484:11 2014-09-29 15:36:13.420: [ AGFW][1163692352]{0:5:2} Expected username [grid] actual [grid] for:/DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.420: [ AGFW][1163692352]{0:5:2} Agent /DBSoft/11.2.4/grid/bin/scriptagent_grid with pid:7275 connected to server. 2014-09-29 15:36:13.421: [ AGFW][1163692352]{0:5:2} Agfw Proxy Server sending message: RESTYPE_ADD[cluster_resource] ID 8196:280 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.421: [ AGFW][1163692352]{0:5:2} Agfw Proxy Server sending message: RESTYPE_ADD[local_resource] ID 8196:281 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.422: [ AGFW][1163692352]{0:5:2} Agfw Proxy Server sending message: RESTYPE_ADD[ora.cluster_resource.type] ID 8196:282 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.422: [ AGFW][1163692352]{0:5:2} Agfw Proxy Server sending message: RESTYPE_ADD[ora.cvu.type] ID 8196:283 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.422: [ AGFW][1163692352]{0:5:2} Agfw Proxy Server sending message: RESTYPE_ADD[ora.local_resource.type] ID 8196:284 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.422: [ AGFW][1163692352]{0:5:2} Agfw Proxy Server sending message: RESTYPE_ADD[ora.oc4j.type] ID 8196:285 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.423: [ AGFW][1163692352]{0:5:2} Agfw Proxy Server sending message: RESOURCE_ADD[ora.cvu 1 1] ID 4356:286 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.423: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server forwarding the message: RESOURCE_START[ora.cvu 1 1] ID 4098:1263 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.423: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server replying to the message: AGENT_HANDSHAKE[Proxy] ID 20484:11 2014-09-29 15:36:13.425: [ AGFW][1163692352]{0:5:2} Received the reply to the message: RESTYPE_ADD[cluster_resource] ID 8196:280 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.426: [ AGFW][1163692352]{0:5:2} Received the reply to the message: RESTYPE_ADD[local_resource] ID 8196:281 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.426: [ AGFW][1163692352]{0:5:2} Received the reply to the message: RESTYPE_ADD[ora.cluster_resource.type] ID 8196:282 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.427: [ AGFW][1163692352]{0:5:2} Received the reply to the message: RESTYPE_ADD[ora.cvu.type] ID 8196:283 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.428: [ AGFW][1163692352]{0:5:2} Received the reply to the message: RESTYPE_ADD[ora.local_resource.type] ID 8196:284 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.428: [ AGFW][1163692352]{0:5:2} Received the reply to the message: RESTYPE_ADD[ora.oc4j.type] ID 8196:285 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.428: [ AGFW][1163692352]{0:5:2} Received the reply to the message: RESOURCE_ADD[ora.cvu 1 1] ID 4356:286 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.480: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.cvu 1 1] ID 4098:287 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.481: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.cvu 1 1] ID 4098:1263 2014-09-29 15:36:13.532: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.cvu 1 1] ID 4098:287 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:13.532: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.cvu 1 1] ID 4098:1263 2014-09-29 15:36:14.975: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server received the message: RESOURCE_START[ora.scan1.vip 1 1] ID 4098:1294 2014-09-29 15:36:14.975: [ AGFW][1163692352]{1:44501:180} Creating the resource: ora.scan1.vip 1 1 2014-09-29 15:36:14.975: [ AGFW][1163692352]{1:44501:180} Initializing the resource ora.scan1.vip 1 1 for type ora.scan_vip.type 2014-09-29 15:36:14.975: [ AGFW][1163692352]{1:44501:180} SR: acl = owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:grid:r-x 2014-09-29 15:36:14.976: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending message: RESOURCE_ADD[ora.scan1.vip 1 1] ID 4356:298 to the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:14.977: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_ADD[ora.scan1.vip 1 1] ID 4356:298 from the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:14.977: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server forwarding the message: RESOURCE_START[ora.scan1.vip 1 1] ID 4098:1294 to the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:15.477: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server received the message: RESOURCE_START[ora.db01.vip 1 1] ID 4098:1297 2014-09-29 15:36:15.477: [ AGFW][1163692352]{1:44501:180} Creating the resource: ora.db01.vip 1 1 2014-09-29 15:36:15.477: [ AGFW][1163692352]{1:44501:180} Initializing the resource ora.db01.vip 1 1 for type ora.cluster_vip_net1.type 2014-09-29 15:36:15.477: [ AGFW][1163692352]{1:44501:180} SR: acl = owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:grid:r-x 2014-09-29 15:36:15.477: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending message: RESOURCE_ADD[ora.db01.vip 1 1] ID 4356:302 to the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:15.479: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_ADD[ora.db01.vip 1 1] ID 4356:302 from the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:15.479: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server forwarding the message: RESOURCE_START[ora.db01.vip 1 1] ID 4098:1297 to the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:17.491: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.scan1.vip 1 1] ID 4098:300 from the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:17.491: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.scan1.vip 1 1] ID 4098:1294 2014-09-29 15:36:17.491: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.scan1.vip 1 1] ID 4098:300 from the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:17.491: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.scan1.vip 1 1] ID 4098:1294 2014-09-29 15:36:17.520: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server received the message: RESOURCE_START[ora.LISTENER_SCAN1.lsnr 1 1] ID 4098:1306 2014-09-29 15:36:17.520: [ AGFW][1163692352]{1:44501:180} Creating the resource: ora.LISTENER_SCAN1.lsnr 1 1 2014-09-29 15:36:17.521: [ AGFW][1163692352]{1:44501:180} Initializing the resource ora.LISTENER_SCAN1.lsnr 1 1 for type ora.scan_listener.type 2014-09-29 15:36:17.521: [ AGFW][1163692352]{1:44501:180} SR: acl = owner:grid:rwx,pgrp:oinstall:r-x,other::r-- 2014-09-29 15:36:17.521: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending message: RESOURCE_ADD[ora.LISTENER_SCAN1.lsnr 1 1] ID 4356:308 to the agent /DBSoft/11.2.4/grid/bin/oraagent_grid 2014-09-29 15:36:17.522: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_ADD[ora.LISTENER_SCAN1.lsnr 1 1] ID 4356:308 from the agent /DBSoft/11.2.4/grid/bin/oraagent_grid 2014-09-29 15:36:17.523: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server forwarding the message: RESOURCE_START[ora.LISTENER_SCAN1.lsnr 1 1] ID 4098:1306 to the agent /DBSoft/11.2.4/grid/bin/oraagent_grid 2014-09-29 15:36:17.538: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server received the message: RESOURCE_START[ora.oc4j 1 1] ID 4098:1310 2014-09-29 15:36:17.538: [ AGFW][1163692352]{1:44501:180} Creating the resource: ora.oc4j 1 1 2014-09-29 15:36:17.538: [ AGFW][1163692352]{1:44501:180} Initializing the resource ora.oc4j 1 1 for type ora.oc4j.type 2014-09-29 15:36:17.538: [ AGFW][1163692352]{1:44501:180} SR: acl = owner:grid:rwx,pgrp:oinstall:rwx,other::r-- 2014-09-29 15:36:17.539: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending message: RESOURCE_ADD[ora.oc4j 1 1] ID 4356:312 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:17.540: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_ADD[ora.oc4j 1 1] ID 4356:312 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:17.540: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server forwarding the message: RESOURCE_START[ora.oc4j 1 1] ID 4098:1310 to the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:17.569: [UiServer][1178401088] CS(0x2aaab00becd0)set Properties ( grid,0x1b344150) 2014-09-29 15:36:17.581: [UiServer][1176299840]{2:33589:66} Container [ Name: UI_START API_HDR_VER: TextMessage[2] ASYNC_TAG: TextMessage[1] CLIENT: TextMessage[] CLIENT_NAME: TextMessage[tnslsnr] CLIENT_PID: TextMessage[7304] CLIENT_PRIMARY_GROUP: TextMessage[oinstall] EVENT_TAG: TextMessage[1] HOST: TextMessage[db02] HOST_TAG: TextMessage[1] LOCALE: TextMessage[AMERICAN_AMERICA.AL32UTF8] NO_WAIT_TAG: TextMessage[1] RESOURCE: TextMessage[ora.LISTENER_SCAN1.lsnr USR_ORA_OPI=true] ] 2014-09-29 15:36:17.581: [UiServer][1176299840]{2:33589:66} Sending message to PE. ctx= 0x2aaab40525b0, Client PID: 7304 2014-09-29 15:36:17.584: [UiServer][1176299840]{2:33589:66} Container [ Name: ORDER MESSAGE: TextMessage[CRS-2545: Cannot operate on 'ora.LISTENER_SCAN1.lsnr'. It is locked by 'root' for command 'Stack Shutdown : db01' issued from 'db01'] MSGTYPE: TextMessage[1] OBJID: TextMessage[ora.LISTENER_SCAN1.lsnr] WAIT: TextMessage[0] ] 2014-09-29 15:36:17.584: [ COMMCRS][1176299840]clscsendx: (0x1b344150) Connection not active 2014-09-29 15:36:17.584: [UiServer][1176299840]{2:33589:66} CS(0x2aaab00becd0)Error sending msg over socket.6 2014-09-29 15:36:17.584: [UiServer][1176299840]{2:33589:66} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client. (File: clsMessageStream.cpp, line: 275 2014-09-29 15:36:17.584: [UiServer][1176299840]{2:33589:66} Container [ Name: UI_DATA ora.LISTENER_SCAN1.lsnr: TextMessage[233] ] 2014-09-29 15:36:17.584: [UiServer][1176299840]{2:33589:66} CS(0x2aaab00becd0)No connection to client.6 2014-09-29 15:36:17.584: [UiServer][1176299840]{2:33589:66} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client. (File: clsMessageStream.cpp, line: 275 2014-09-29 15:36:17.584: [UiServer][1176299840]{2:33589:66} Done for ctx=0x2aaab40525b0 2014-09-29 15:36:17.638: [UiServer][1178401088] CS(0x2aaab00bded0)set Properties ( grid,0x1b343e90) 2014-09-29 15:36:17.650: [UiServer][1176299840]{2:33589:67} Sending message to PE. ctx= 0x2aaab40375b0, Client PID: 6871 2014-09-29 15:36:17.653: [UiServer][1176299840]{2:33589:67} Done for ctx=0x2aaab40375b0 2014-09-29 15:36:17.662: [UiServer][1178401088] CS(0x2aaab00bdd80)set Properties ( grid,0x1b54b280) 2014-09-29 15:36:17.674: [UiServer][1176299840]{2:33589:68} Sending message to PE. ctx= 0x2aaab40375b0, Client PID: 6871 2014-09-29 15:36:17.676: [UiServer][1176299840]{2:33589:68} Done for ctx=0x2aaab40375b0 2014-09-29 15:36:17.686: [UiServer][1178401088] CS(0x2aaab00bd4e0)set Properties ( grid,0x1b346970) 2014-09-29 15:36:17.698: [UiServer][1176299840]{2:33589:69} Sending message to PE. ctx= 0x2aaab40375b0, Client PID: 6871 2014-09-29 15:36:17.700: [UiServer][1176299840]{2:33589:69} Done for ctx=0x2aaab40375b0 2014-09-29 15:36:17.990: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.db01.vip 1 1] ID 4098:304 from the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:17.991: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.db01.vip 1 1] ID 4098:1297 2014-09-29 15:36:17.991: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.db01.vip 1 1] ID 4098:304 from the agent /DBSoft/11.2.4/grid/bin/orarootagent_root 2014-09-29 15:36:17.991: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.db01.vip 1 1] ID 4098:1297 [ OCRSRV][1140578624]proas_get_value: failed in proas_relay_ack_request 207 2014-09-29 15:36:18.808: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.LISTENER_SCAN1.lsnr 1 1] ID 4098:310 from the agent /DBSoft/11.2.4/grid/bin/oraagent_grid 2014-09-29 15:36:18.809: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.LISTENER_SCAN1.lsnr 1 1] ID 4098:1306 2014-09-29 15:36:18.923: [UiServer][1178401088] CS(0x2aaab00bc3e0)set Properties ( grid,0x1b022650) 2014-09-29 15:36:18.935: [UiServer][1176299840]{2:33589:70} Sending message to PE. ctx= 0x2aaab4022160, Client PID: 6871 2014-09-29 15:36:18.938: [UiServer][1176299840]{2:33589:70} Done for ctx=0x2aaab4022160 2014-09-29 15:36:18.940: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.LISTENER_SCAN1.lsnr 1 1] ID 4098:310 from the agent /DBSoft/11.2.4/grid/bin/oraagent_grid 2014-09-29 15:36:18.940: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.LISTENER_SCAN1.lsnr 1 1] ID 4098:1306 2014-09-29 15:36:27.059: [UiServer][1178401088] CS(0x2aaab00bbd10)set Properties ( grid,0x1b022650) 2014-09-29 15:36:27.070: [UiServer][1176299840]{2:33589:71} Sending message to PE. ctx= 0x2aaab4056190, Client PID: 6871 2014-09-29 15:36:27.073: [UiServer][1176299840]{2:33589:71} Done for ctx=0x2aaab4056190 2014-09-29 15:36:34.502: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.oc4j 1 1] ID 4098:314 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:34.502: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.oc4j 1 1] ID 4098:1310 2014-09-29 15:36:34.604: [ AGFW][1163692352]{1:44501:180} Received the reply to the message: RESOURCE_START[ora.oc4j 1 1] ID 4098:314 from the agent /DBSoft/11.2.4/grid/bin/scriptagent_grid 2014-09-29 15:36:34.604: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.oc4j 1 1] ID 4098:1310 2014-09-29 15:36:35.356: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server received the message: CMD_COMPLETED[Proxy] ID 20482:1367 2014-09-29 15:36:35.356: [ AGFW][1163692352]{1:44501:180} Agfw Proxy Server replying to the message: CMD_COMPLETED[Proxy] ID 20482:1367 2014-09-29 15:36:35.462: [GIPCHDEM][1115363648] gipchaDaemonProcessHAInvalidate: completed ha name invalidate for node 0x2aaaac14ccd0 { host 'db01', haName 'e20e-bb89-5130-202d', srcLuid a24af02a-da350a22, dstLuid 6961d085-3c142640 numInf 1, contigSeq 205, lastAck 142, lastValidAck 205, sendSeq [142 : 142], createTime 4294208350, sentRegister 1, localMonitor 0, flags 0x28 } 2014-09-29 15:36:35.630: [ CRSCCL][1086126400]Reconfig event received by clssgsgrpstat 2014-09-29 15:36:35.630: [ CRSCCL][1086126400]cclGetMemberData called 2014-09-29 15:36:35.631: [ CRSCCL][1086126400]Member (2, 4294214460, db02:11.2.0.4.0) @ found. 2014-09-29 15:36:35.631: [CLSFRAME][1086126400] CCL MEMBER LEFT:1:1:CRSD:db01 2014-09-29 15:36:35.631: [CLSFRAME][1086126400] Disconnected from CRSD:db01 process: {Absolute|Node:1|Process:2338224|Type:1} 2014-09-29 15:36:35.631: [ AGFW][1163692352]{2:33589:74} Agfw Proxy Server received process disconnected notification, count=1 2014-09-29 15:36:35.631: [ CRSCCL][1086126400]Reconfig handled 2014-09-29 15:36:35.894: [ OCRMAS][1121667392]rcfg_con:2: Member [1] left. Inc [3]. 2014-09-29 15:36:35.894: [ OCRMAS][1121667392]proath_master:24: Set context state to master_changing: last established master [1] new master [2]. 2014-09-29 15:36:35.894: [ OCRSRV][1121667392]th_not_master_change: Invoking master change callback. Master [2] Inc [3] 2014-09-29 15:36:35.894: [ OCRMSG][1119566144]prom_recv: Failed to receieve [3] 2014-09-29 15:36:35.894: [ OCRMSG][1119566144]GIPC error [3] msg [gipcretInvalidObject] 2014-09-29 15:36:35.894: [ CRSSE][1176299840]{2:33589:79} Master Change Event; New Master Node ID:2 This Node's ID:2 2014-09-29 15:36:35.895: [ OCRMAS][1121667392]th_master:13: I AM THE NEW OCR MASTER at incar 2. Node Number 2 2014-09-29 15:36:35.895: [ CRSPE][1174198592]{2:33589:79} PE Role|State Update: old role [SLAVE] new [MASTER]; old state [Running] new [Configuring] 2014-09-29 15:36:35.895: [ CRSPE][1174198592]{2:33589:79} PE MASTER NAME: db02 2014-09-29 15:36:35.895: [ CRSPE][1174198592]{2:33589:79} Starting to read configuration 2014-09-29 15:36:35.895: [ OCRSRV][1165793600]proas_amiwriter: ctx is MASTER CHANGING/CONNECTING 2014-09-29 15:36:35.895: [ OCRSRV][1167894848]proas_amiwriter: ctx is MASTER CHANGING/CONNECTING 2014-09-29 15:36:35.895: [ OCRASM][1121667392]proprasmcache: ASM cache size is [5MB] 2014-09-29 15:36:35.905: [ OCRASM][1121667392]proprasmcache: ASM cache [5MB] enabled for disk group [OCR_VOTE]. 2014-09-29 15:36:35.915: [ OCRRAW][1121667392]proprioo: for disk 0 (+OCR_VOTE), id match (1), total id sets, (1) need recover (0), my votes (0), total votes (0), commit_lsn (75), lsn (75) 2014-09-29 15:36:35.915: [ OCRRAW][1121667392]proprioo: my id set: (760227868, 1028247821, 0, 0, 0) 2014-09-29 15:36:35.915: [ OCRRAW][1121667392]proprioo: 1st set: (760227868, 1028247821, 0, 0, 0) 2014-09-29 15:36:35.915: [ OCRRAW][1121667392]proprioo: 2nd set: (0, 0, 0, 0, 0) 2014-09-29 15:36:35.925: [ OCRSRV][1121667392]proath_update_grppubdata: Successfully updated and published the configured devices in public data. 2014-09-29 15:36:35.938: [ OCRMAS][1121667392]th_master:21.1: Wake up upgrade thread [ OCRMAS][1121667392]th_master: Received group public data event. Incarnation [2] 2014-09-29 15:36:35.939: [ OCRMAS][1121667392]th_master:1': Recvd pubdata event from node [2] 2014-09-29 15:36:35.939: [ OCRMAS][1121667392]th_master:2': Recvd pubdata event for self. Do nothing. 2014-09-29 15:36:35.939: [ OCRSRV][1182603584]th_upgrade: Starting upgrade calculation 2014-09-29 15:36:35.951: [ CRSPE][1174198592]{2:33589:79} DM: set global config version to: 57 2014-09-29 15:36:35.951: [ CRSPE][1174198592]{2:33589:79} DM: set pool freeze timeout to: 60000 2014-09-29 15:36:35.951: [ CRSPE][1174198592]{2:33589:79} DM: Set event seq number to: 300000 2014-09-29 15:36:35.951: [ CRSPE][1174198592]{2:33589:79} DM: Set threshold event seq number to: 380000 2014-09-29 15:36:35.951: [ CRSPE][1174198592]{2:33589:79} Sent request to write event sequence number 400000 to repository 2014-09-29 15:36:35.951: [ CRSPE][1174198592]{2:33589:79} Requesting recovery information from server: db02 2014-09-29 15:36:35.957: [ CRSPE][1174198592]{2:33589:79} Starting the recovery timer for: 93 2014-09-29 15:36:35.957: [ CRSPE][1174198592]{2:33589:79} Reading (2) servers 2014-09-29 15:36:35.957: [ AGFW][1163692352]{2:33589:79} Agfw Proxy Server received the message: REQ_RECOVERYINFO[Proxy] ID 20485:358 2014-09-29 15:36:35.957: [ AGFW][1163692352]{2:33589:79} AGFW global config version = 57, PE global config version = 57 2014-09-29 15:36:35.957: [ AGFW][1163692352]{2:33589:79} Sending the recovery information to the master..Total number of pending cmds = 0 2014-09-29 15:36:35.958: [ AGFW][1163692352]{2:33589:79} Agfw Proxy Server sending message to PE, Contents = [MIDTo:2|OpID:9|FromA:{Invalid|Node:0|Process:0|Type:0}|ToA:{Invalid|Node:-1|Process:-1|Type:-1}|MIDFrom:0|Type:4|Pri2|Id:360:Ver:2] 2014-09-29 15:36:35.959: [ AGFW][1163692352]{2:33589:79} Agfw Proxy Server replying to the message: REQ_RECOVERYINFO[Proxy] ID 20485:358 2014-09-29 15:36:35.959: [ CRSPE][1174198592]{2:33589:79} Received recovery information from: db02 2014-09-29 15:36:35.978: [ CRSPE][1174198592]{2:33589:79} Wrote new event sequence to repository 2014-09-29 15:36:35.980: [ OCRSRV][1182603584]th_upgrade:10.1 AV [186647552]. State [11]. Already upgraded.Updated global data to the crs version group. Return [0] 2014-09-29 15:36:35.992: [ CRSPE][1174198592]{2:33589:79} Reading (16) types 2014-09-29 15:36:35.994: [ CRSPE][1174198592]{2:33589:79} Reading (2) server pools 2014-09-29 15:36:36.002: [ CRSPE][1174198592]{2:33589:79} Reading (16) resources 2014-09-29 15:36:36.228: [ CRSPE][1174198592]{2:33589:79} Finished reading configuration. Parsing... 2014-09-29 15:36:36.228: [ CRSPE][1174198592]{2:33589:79} Parsing resource types... 2014-09-29 15:36:36.255: [ CRSPE][1174198592]{2:33589:79} Resource Types parsed 2014-09-29 15:36:36.263: [ CRSPE][1174198592]{2:33589:79} Parsing server pools... 2014-09-29 15:36:36.264: [ CRSPE][1174198592]{2:33589:79} Parsed and validated SERVERPOOL: Free [min:0][max:-1][importance:0] NO SERVERS ASSIGNED 2014-09-29 15:36:36.264: [ CRSPE][1174198592]{2:33589:79} Parsed and validated SERVERPOOL: Generic [min:2147483647][max:-1][importance:0] NO SERVERS ASSIGNED 2014-09-29 15:36:36.264: [ CRSPE][1174198592]{2:33589:79} Server pools parsed 2014-09-29 15:36:36.264: [ CRSPE][1174198592]{2:33589:79} Server Pool Free has been registered 2014-09-29 15:36:36.264: [ CRSPE][1174198592]{2:33589:79} Server Pool Generic has been registered 2014-09-29 15:36:36.264: [ CRSPE][1174198592]{2:33589:79} Parsing resources... 2014-09-29 15:36:36.275: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.BAK001.dg db01 1] 2014-09-29 15:36:36.275: [ CRSPE][1174198592]{2:33589:79} RI [ora.BAK001.dg db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.275: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.BAK001.dg db02 1] 2014-09-29 15:36:36.275: [ CRSPE][1174198592]{2:33589:79} RI [ora.BAK001.dg db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.276: [ CRSPE][1174198592]{2:33589:79} Resource ora.BAK001.dg has been registered with the PE data model:0x2aaaac458a10 2014-09-29 15:36:36.276: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.DATA001.dg db01 1] 2014-09-29 15:36:36.276: [ CRSPE][1174198592]{2:33589:79} RI [ora.DATA001.dg db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.276: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.DATA001.dg db02 1] 2014-09-29 15:36:36.276: [ CRSPE][1174198592]{2:33589:79} RI [ora.DATA001.dg db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.276: [ CRSPE][1174198592]{2:33589:79} Resource ora.DATA001.dg has been registered with the PE data model:0x2aaaac45ddf0 2014-09-29 15:36:36.277: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.FRA001.dg db01 1] 2014-09-29 15:36:36.277: [ CRSPE][1174198592]{2:33589:79} RI [ora.FRA001.dg db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.277: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.FRA001.dg db02 1] 2014-09-29 15:36:36.277: [ CRSPE][1174198592]{2:33589:79} RI [ora.FRA001.dg db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.277: [ CRSPE][1174198592]{2:33589:79} Resource ora.FRA001.dg has been registered with the PE data model:0x2aaaac462e30 2014-09-29 15:36:36.278: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.LISTENER.lsnr db01 1] 2014-09-29 15:36:36.278: [ CRSPE][1174198592]{2:33589:79} RI [ora.LISTENER.lsnr db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.278: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.LISTENER.lsnr db02 1] 2014-09-29 15:36:36.278: [ CRSPE][1174198592]{2:33589:79} RI [ora.LISTENER.lsnr db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.278: [ CRSPE][1174198592]{2:33589:79} Resource ora.LISTENER.lsnr has been registered with the PE data model:0x2aaaac468740 2014-09-29 15:36:36.279: [ CRSPE][1174198592]{2:33589:79} Resource ora.LISTENER_SCAN1.lsnr has been registered with the PE data model:0x2aaaac46e410 2014-09-29 15:36:36.280: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.OCR_VOTE.dg db01 1] 2014-09-29 15:36:36.280: [ CRSPE][1174198592]{2:33589:79} RI [ora.OCR_VOTE.dg db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.280: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.OCR_VOTE.dg db02 1] 2014-09-29 15:36:36.280: [ CRSPE][1174198592]{2:33589:79} RI [ora.OCR_VOTE.dg db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.280: [ CRSPE][1174198592]{2:33589:79} Resource ora.OCR_VOTE.dg has been registered with the PE data model:0x2aaaac472d80 2014-09-29 15:36:36.281: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.asm db01 1] 2014-09-29 15:36:36.281: [ CRSPE][1174198592]{2:33589:79} RI [ora.asm db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.281: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.asm db02 1] 2014-09-29 15:36:36.281: [ CRSPE][1174198592]{2:33589:79} RI [ora.asm db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.281: [ CRSPE][1174198592]{2:33589:79} Resource ora.asm has been registered with the PE data model:0x2aaaac4784c0 2014-09-29 15:36:36.282: [ CRSPE][1174198592]{2:33589:79} Resource ora.cvu has been registered with the PE data model:0x2aaaac47d7a0 2014-09-29 15:36:36.283: [ CRSPE][1174198592]{2:33589:79} Resource ora.db01.vip has been registered with the PE data model:0x2aaaac482940 2014-09-29 15:36:36.284: [ CRSPE][1174198592]{2:33589:79} Resource ora.db02.vip has been registered with the PE data model:0x2aaaac487bc0 2014-09-29 15:36:36.285: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.gsd db01 1] 2014-09-29 15:36:36.285: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.gsd db02 1] 2014-09-29 15:36:36.285: [ CRSPE][1174198592]{2:33589:79} Resource ora.gsd has been registered with the PE data model:0x2aaaac48bfa0 2014-09-29 15:36:36.286: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.net1.network db01 1] 2014-09-29 15:36:36.286: [ CRSPE][1174198592]{2:33589:79} RI [ora.net1.network db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.286: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.net1.network db02 1] 2014-09-29 15:36:36.286: [ CRSPE][1174198592]{2:33589:79} RI [ora.net1.network db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.286: [ CRSPE][1174198592]{2:33589:79} Resource ora.net1.network has been registered with the PE data model:0x2aaaac490a60 2014-09-29 15:36:36.287: [ CRSPE][1174198592]{2:33589:79} Resource ora.oc4j has been registered with the PE data model:0x2aaaac496020 2014-09-29 15:36:36.287: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.ons db01 1] 2014-09-29 15:36:36.287: [ CRSPE][1174198592]{2:33589:79} RI [ora.ons db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.287: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.ons db02 1] 2014-09-29 15:36:36.287: [ CRSPE][1174198592]{2:33589:79} RI [ora.ons db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.288: [ CRSPE][1174198592]{2:33589:79} Resource ora.ons has been registered with the PE data model:0x2aaaac49a7e0 2014-09-29 15:36:36.288: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db01 for [ora.registry.acfs db01 1] 2014-09-29 15:36:36.288: [ CRSPE][1174198592]{2:33589:79} RI [ora.registry.acfs db01 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.288: [ CRSPE][1174198592]{2:33589:79} Set LAST_SERVER to db02 for [ora.registry.acfs db02 1] 2014-09-29 15:36:36.288: [ CRSPE][1174198592]{2:33589:79} RI [ora.registry.acfs db02 1] new target state: [ONLINE] old value: [OFFLINE] 2014-09-29 15:36:36.289: [ CRSPE][1174198592]{2:33589:79} Resource ora.registry.acfs has been registered with the PE data model:0x2aaaac49fa50 2014-09-29 15:36:36.290: [ CRSPE][1174198592]{2:33589:79} Resource ora.scan1.vip has been registered with the PE data model:0x2aaaac4a5880 2014-09-29 15:36:36.290: [ CRSPE][1174198592]{2:33589:79} Resources parsed 2014-09-29 15:36:36.290: [ CRSPE][1174198592]{2:33589:79} Server [db01] has been registered with the PE data model 2014-09-29 15:36:36.290: [ CRSPE][1174198592]{2:33589:79} STARTUPCMD_REQ = false: 0 2014-09-29 15:36:36.290: [ CRSPE][1174198592]{2:33589:79} Server [db01] has changed state from [Invalid/unitialized] to [VISIBLE] 2014-09-29 15:36:36.291: [ CRSOCR][1172097344]{2:33589:79} Multi Write Batch processing... 2014-09-29 15:36:36.291: [ CRSPE][1174198592]{2:33589:79} Server [db02] has been registered with the PE data model 2014-09-29 15:36:36.291: [ CRSPE][1174198592]{2:33589:79} STARTUPCMD_REQ = false: 0 2014-09-29 15:36:36.291: [ CRSRPT][1176299840]{2:33589:79} Connecting to EVM 2014-09-29 15:36:36.291: [ CRSCEVT][1176299840]{2:33589:79} ClusterPublisher::connect connecting 2014-09-29 15:36:36.291: [ CRSPE][1174198592]{2:33589:79} Server [db02] has changed state from [Invalid/unitialized] to [JOINING] 2014-09-29 15:36:36.291: [ CRSPE][1174198592]{2:33589:79} Server db02 is expected to rejoin the cluster. 2014-09-29 15:36:36.291: [ CRSPE][1174198592]{2:33589:79} PE Role|State Update: old role [MASTER] new [MASTER]; old state [Configuring] new [Starting] 2014-09-29 15:36:36.291: [ CRSPE][1174198592]{2:33589:79} Sending join request: MIDTo:2|OpID:11|FromA:{Invalid|Node:0|Process:0|Type:0}|ToA:{Invalid|Node:-1|Process:-1|Type:-1}|MIDFrom:0|Type:4|Pri2|Id:368:Ver:2 2014-09-29 15:36:36.292: [ CRSPE][1174198592]{2:33589:79} Configuration has been parsed 2014-09-29 15:36:36.292: [ CRSCEVT][1176299840]{2:33589:79} ClusterPublisher::connect Connected 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Got Join Req from [db02] Msg Details: MIDTo:2|OpID:11|FromA:{Absolute|Node:2|Process:-752836|Type:1}|ToA:{Absolute|Node:2|Process:-752836|Type:1}|MIDFrom:2|Type:1|Pri2|Id:368:Ver:2String params:SIGNATURE=db-cluster|SNN=db02|Int params:NODE_INC=307480783|RES_PROBE_TAG=1|_RC=0|Map params: Map [SV_MAP] BEGIN OF VALUES:CRS_CSS_NODENAME=db02|CRS_CSS_NODENUMBER=2|CRS_CSS_NODENUMBER_PLUS1=3|CRS_HOME=/DBSoft/11.2.4/grid| END OF VALUES 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} PE Role|State Update: old role [MASTER] new [MASTER]; old state [Starting] new [Running] 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Processing pending join requests: 1 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Server [db02] has changed state from [JOINING] to [ONLINE] 2014-09-29 15:36:36.293: [ AGFW][1163692352]{2:33589:79} Agfw Proxy Server received the message: PE_HANDHSAKE[Proxy] ID 20487:370 2014-09-29 15:36:36.293: [ AGFW][1163692352]{2:33589:79} Received handshake message from PE. 2014-09-29 15:36:36.293: [ AGFW][1163692352]{2:33589:79} Agfw Proxy Server replying to the message: PE_HANDHSAKE[Proxy] ID 20487:370 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Appending a resource alias for: ora.LISTENER.lsnr 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Resource ora.db02.LISTENER_DB02.lsnr has been registered with the PE data model:0x2aaaac46f9e0 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Appending a resource alias for: ora.asm 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Resource ora.db02.ASM2.asm has been registered with the PE data model:0x2aaaac372ab0 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Appending a resource alias for: ora.gsd 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Resource ora.db02.gsd has been registered with the PE data model:0x2aaaac47e960 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Appending a resource alias for: ora.ons 2014-09-29 15:36:36.293: [ CRSPE][1174198592]{2:33589:79} Resource ora.db02.ons has been registered with the PE data model:0x2aaaac47fbd0 2014-09-29 15:36:36.293: [ CRSRPT][1176299840]{2:33589:79} Published to EVM CRS_SERVER_STATE_CHANGE for db01 2014-09-29 15:36:36.294: [ CRSRPT][1176299840]{2:33589:79} Published to EVM CRS_SERVER_STATE_CHANGE for db02 2014-09-29 15:36:36.340: [ CRSOCR][1172097344]{2:33589:79} Multi Write Batch done. 2014-09-29 15:36:36.340: [ CRSOCR][1172097344]{2:33589:79} Multi Write Batch processing... 2014-09-29 15:36:36.356: [GIPCHAUP][1113262400] gipchaUpperDisconnect: initiated discconnect umsg 0x2aaab0084570 { msg 0x2aaab0120068, ret gipcretRequestPending (15), flags 0x2 }, msg 0x2aaab0120068 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00000aca, dstCid 00000000-00001f5c }, endp 0x2aaaac370c90 [0000000000000aca] { gipchaEndpoint : port 'bdc3-4793-190e-2853/dfa9-e361-7387-7e79', peer 'db01:b217-bf27-b3fe-627f', srcCid 00000000-00000aca, dstCid 00000000-00001f5c, numSend 0, maxSend 100, groupListType 2, hagroup 0x2aaaac15f440, usrFlags 0x4000, flags 0x21c } 2014-09-29 15:36:36.356: [GIPCHAUP][1113262400] gipchaUpperDisconnect: initiated discconnect umsg 0x2aaab010f5a0 { msg 0x2aaab00b16d8, ret gipcretRequestPending (15), flags 0x2 }, msg 0x2aaab00b16d8 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00000253, dstCid 00000000-00001b89 }, endp 0x1b0f1d70 [0000000000000253] { gipchaEndpoint : port '542c-b1b2-8427-5793', peer 'db01:ee7e-889d-248a-ce00/e36a-0584-a7b1-7255', srcCid 00000000-00000253, dstCid 00000000-00001b89, numSend 0, maxSend 100, groupListType 2, hagroup 0x2aaaac15f440, usrFlags 0x4000, flags 0x21c } 2014-09-29 15:36:36.356: [GIPCHAUP][1113262400] gipchaUpperDisconnect: initiated discconnect umsg 0x2aaab011d410 { msg 0x2aaab00b0638, ret gipcretRequestPending (15), flags 0x2 }, msg 0x2aaab00b0638 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-000002ae, dstCid 00000000-00001bb7 }, endp 0x1b0de1f0 [00000000000002ae] { gipchaEndpoint : port '1628-1cad-f0b0-838c', peer 'db01:ee7e-889d-248a-ce00/afbd-2f08-61ec-7c58', srcCid 00000000-000002ae, dstCid 00000000-00001bb7, numSend 0, maxSend 100, groupListType 2, hagroup 0x2aaaac15f440, usrFlags 0x4000, flags 0x21c } 2014-09-29 15:36:36.356: [GIPCHAUP][1113262400] gipchaUpperDisconnect: initiated discconnect umsg 0x2aaab00017e0 { msg 0x2aaab009a068, ret gipcretRequestPending (15), flags 0x2 }, msg 0x2aaab009a068 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00000779, dstCid 00000000-00001da4 }, endp 0x1b545060 [0000000000000779] { gipchaEndpoint : port '6251-7879-6e3f-e18d', peer 'db01:CLSFRAME_1/96fc-33f4-8dc1-8aa7', srcCid 00000000-00000779, dstCid 00000000-00001da4, numSend 0, maxSend 100, groupListType 2, hagroup 0x2aaaac1db5b0, usrFlags 0x4000, flags 0x21c } 2014-09-29 15:36:36.356: [GIPCHAUP][1113262400] gipchaUpperProcessNodeDeath: destroying the failed node interface 0x1b0f5910 { host 'db01', haName 'e20e-bb89-5130-202d', local 0x1b077f30, ip '10.10.11.1:19144', subnet '10.10.11.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 3, flags 0x6 } 2014-09-29 15:36:36.369: [ CRSOCR][1172097344]{2:33589:79} Multi Write Batch done. 2014-09-29 15:36:36.369: [ CRSOCR][1172097344]{2:33589:79} Multi Write Batch processing... 2014-09-29 15:36:36.400: [ CRSOCR][1172097344]{2:33589:79} Multi Write Batch done. 2014-09-29 15:36:36.400: [ CRSPE][1174198592]{2:33589:79} Server JOIN completed for : db02, incarnation num = 307480783 2014-09-29 15:36:36.400: [ CRSPE][1174198592]{2:33589:79} No startup command created for server:db02 2014-09-29 15:36:36.400: [ CRSPE][1174198592]{2:33589:79} Starting Master Recovery... 2014-09-29 15:36:36.400: [ CRSPE][1174198592]{2:33589:79} Processing recovery info messages... 2014-09-29 15:36:36.400: [ CRSPE][1174198592]{2:33589:79} Processing recovery information for [db02] MIDTo:2|OpID:9|FromA:{Absolute|Node:2|Process:-752836|Type:1}|ToA:{Absolute|Node:2|Process:-752836|Type:1}|MIDFrom:4|Type:1|Pri2|Id:360:Ver:2 2014-09-29 15:36:36.400: [CLSFRAME][1174198592]{2:33589:79} String params:SERVER_NAME=db02| 2014-09-29 15:36:36.400: [CLSFRAME][1174198592]{2:33589:79} Int params:RECV_CMDS_COUNT=0|RES_COUNT=15|_RC=0| 2014-09-29 15:36:36.400: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.BAK001.dg db02 1 2014-09-29 15:36:36.403: [ CRSPE][1174198592]{2:33589:79} RI [ora.BAK001.dg db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.403: [ CRSPE][1174198592]{2:33589:79} ora.BAK001.dg db02 1 lock value: 0 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.DATA001.dg db02 1 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} RI [ora.DATA001.dg db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} ora.DATA001.dg db02 1 lock value: 0 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.FRA001.dg db02 1 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} RI [ora.FRA001.dg db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} ora.FRA001.dg db02 1 lock value: 0 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.LISTENER.lsnr db02 1 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} RI [ora.LISTENER.lsnr db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} ora.LISTENER.lsnr db02 1 lock value: 0 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.LISTENER_SCAN1.lsnr 1 1 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} RI [ora.LISTENER_SCAN1.lsnr 1 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} ora.LISTENER_SCAN1.lsnr 1 1 lock value: 0 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.OCR_VOTE.dg db02 1 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} RI [ora.OCR_VOTE.dg db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} ora.OCR_VOTE.dg db02 1 lock value: 0 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.asm db02 1 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} RI [ora.asm db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [Started] 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Set State Details to [Started] from [ ] for [ora.asm db02 1] 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} ora.asm db02 1 lock value: 0 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.cvu 1 1 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} RI [ora.cvu 1 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} ora.cvu 1 1 lock value: 0 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.db01.vip 1 1 2014-09-29 15:36:36.404: [ CRSPE][1174198592]{2:33589:79} RI [ora.db01.vip 1 1] new external state [INTERMEDIATE] old value: [OFFLINE] on db02 label = [FAILED OVER] 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Set State Details to [FAILED OVER] from [ ] for [ora.db01.vip 1 1] 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} ora.db01.vip 1 1 lock value: 0 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.db02.vip 1 1 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} RI [ora.db02.vip 1 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} ora.db02.vip 1 1 lock value: 0 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.net1.network db02 1 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} RI [ora.net1.network db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} ora.net1.network db02 1 lock value: 0 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.oc4j 1 1 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} RI [ora.oc4j 1 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} ora.oc4j 1 1 lock value: 0 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.ons db02 1 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} RI [ora.ons db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} ora.ons db02 1 lock value: 0 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.registry.acfs db02 1 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} RI [ora.registry.acfs db02 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} ora.registry.acfs db02 1 lock value: 0 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Extracting resource info for: ora.scan1.vip 1 1 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} RI [ora.scan1.vip 1 1] new external state [ONLINE] old value: [OFFLINE] on db02 label = [] 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} ora.scan1.vip 1 1 lock value: 0 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Creating recovery command... 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Processing orphan command messages... 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Processing the queued orphan cmd messages .. count=0 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Processing resource fault messages... 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Processing res faults for db02 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Done processing res faults for db02 2014-09-29 15:36:36.405: [ CRSPE][1174198592]{2:33589:79} Sending master change update to UI servers... 2014-09-29 15:36:36.405: [ AGFW][1163692352]{2:33589:79} Deleting the message: MIDTo:2|OpID:9|FromA:{Absolute|Node:2|Process:-752836|Type:1}|ToA:{Absolute|Node:2|Process:-752836|Type:1}|MIDFrom:4|Type:2|Pri2|Id:360:Ver:2 2014-09-29 15:36:36.406: [UiServer][1176299840]{2:33589:79} Master change notification has received. New master: 2 2014-09-29 15:36:36.406: [ CRSPE][1174198592]{2:33589:79} Processing pernding UI messages... 2014-09-29 15:36:36.406: [ CRSPE][1174198592]{2:33589:79} Master Recovery Completed 2014-09-29 15:36:36.406: [UiServer][1176299840]{2:33589:79} Master change notification has received. New master: 2 2014-09-29 15:36:36.406: [ CRSPE][1174198592]{2:33589:79} Got confirmation from UI on 2 2014-09-29 15:36:46.299: [ CRSSE][1148983616] Forwarding Node Leave to PE for: db01 2014-09-29 15:36:46.299: [ CRSPE][1174198592]{2:33589:80} Server [db01] has changed state from [VISIBLE] to [OFFLINE] 2014-09-29 15:36:46.299: [ CRSPE][1174198592]{2:33589:80} Processing PE command id=149. Description: [Server Fault {db01 } : forceEvents=0 : 0x2aaaac485250] 2014-09-29 15:36:46.299: [ CRSD][1174198592]{2:33589:80} {2:33589:80} Created alert : (:CRSPE00141:) : Posting Node Down Event 2014-09-29 15:36:46.299: [ CRSPE][1174198592]{2:33589:80} Filtering duplicate ops: server [] state [ONLINE] 2014-09-29 15:36:46.299: [ CRSPE][1174198592]{2:33589:80} PE Command [ Server Fault {db01 } : forceEvents=0 : 0x2aaaac485250 ] has completed 2014-09-29 15:36:46.299: [ CRSPE][1174198592]{2:33589:80} Deleting the server : db01 2014-09-29 15:36:46.299: [ CRSPE][1174198592]{2:33589:80} Server [db01] has been unregistered with the PE data model 2014-09-29 15:36:46.300: [ CRSD][1174198592]{2:33589:80} {2:33589:80}Server [db01] has been un-assigned from the server pool: Free 2014-09-29 15:36:46.300: [ CRSRPT][1176299840]{2:33589:80} Reporter publishing Node Down to EVM 2014-09-29 15:36:46.300: [ CRSOCR][1172097344]{2:33589:80} Multi Write Batch processing... 2014-09-29 15:36:46.310: [ CRSRPT][1176299840]{2:33589:80} Published to EVM CRS_SERVER_STATE_CHANGE for db01 2014-09-29 15:36:46.310: [ CRSRPT][1176299840]{2:33589:80} Published to EVM CRS_SERVER_POOL_STATE_CHANGE for Free 2014-09-29 15:36:46.359: [ CRSOCR][1172097344]{2:33589:80} Multi Write Batch done. 2014-09-29 15:36:47.390: [UiServer][1178401088] CS(0x2aaab00d3230)set Properties ( ,0x1b269870) 2014-09-29 15:36:47.402: [UiServer][1176299840]{2:33589:81} Sending message to PE. ctx= 0x2aaab008b430 2014-09-29 15:36:47.402: [ CRSPE][1174198592]{2:33589:81} Processing PE command id=150. Description: [Stat Resource : 0x2aaaac45f580] 2014-09-29 15:36:47.408: [UiServer][1176299840]{2:33589:81} Done for ctx=0x2aaab008b430 2014-09-29 15:36:56.748: [GIPCHGEN][1115363648] gipchaInterfaceFail: marking interface failing 0x1b077f30 { host '', haName 'ab7a-cf72-e3c3-0259', local (nil), ip '10.10.11.2:33873', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0xd } 2014-09-29 15:36:57.356: [GIPCHGEN][1113262400] gipchaInterfaceDisable: disabling interface 0x1b077f30 { host '', haName 'ab7a-cf72-e3c3-0259', local (nil), ip '10.10.11.2:33873', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0x18d } 2014-09-29 15:36:57.356: [GIPCHDEM][1113262400] gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x1b077f30 { host '', haName 'ab7a-cf72-e3c3-0259', local (nil), ip '10.10.11.2:33873', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0x1ad } 2014-09-29 15:36:57.357: [GIPCXCPT][1115363648] gipchaDaemonProcessRecv: dropping unrecognized daemon request 17, hctx 0x1af1b3f0 [0000000000000010] { gipchaContext : host 'db02', name 'ab7a-cf72-e3c3-0259', luid 'a24af02a-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x5 }, ret gipcretFail (1) 2014-09-29 15:36:57.357: [GIPCHDEM][1115363648] gipchaDaemonProcessRecv: EXCEPTION[ ret gipcretFail (1) ] exception processing requset type 17, hctx 0x1af1b3f0 [0000000000000010] { gipchaContext : host 'db02', name 'ab7a-cf72-e3c3-0259', luid 'a24af02a-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x5 } 2014-09-29 15:36:58.356: [GIPCHDEM][1115363648] gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x1af1b3f0 [0000000000000010] { gipchaContext : host 'db02', name 'ab7a-cf72-e3c3-0259', luid 'a24af02a-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x1 } to gipcd 2014-09-29 15:37:04.357: [GIPCHDEM][1115363648] gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x1af1b3f0 [0000000000000010] { gipchaContext : host 'db02', name 'ab7a-cf72-e3c3-0259', luid 'a24af02a-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x1 } to gipcd 2014-09-29 15:37:10.357: [GIPCHDEM][1115363648] gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x1af1b3f0 [0000000000000010] { gipchaContext : host 'db02', name 'ab7a-cf72-e3c3-0259', luid 'a24af02a-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x1 } to gipcd 2014-09-29 15:37:11.979: [GIPCHGEN][1115363648] gipchaNodeAddInterface: adding interface information for inf 0x1b077f30 { host '', haName 'ab7a-cf72-e3c3-0259', local (nil), ip '10.10.11.2', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0x1 } 2014-09-29 15:37:12.357: [GIPCHTHR][1113262400] gipchaWorkerCreateInterface: created local interface for node 'db02', haName 'ab7a-cf72-e3c3-0259', inf 'udp://10.10.11.2:24301' 2014-09-29 15:37:19.055: [UiServer][1178401088] CS(0x2aaab00bb7b0)set Properties ( grid,0x1b269870) 2014-09-29 15:37:19.067: [UiServer][1176299840]{2:33589:82} Sending message to PE. ctx= 0x2aaab00b94a0, Client PID: 6871 2014-09-29 15:37:19.067: [ CRSPE][1174198592]{2:33589:82} Processing PE command id=151. Description: [Stat Resource : 0x2aaaac45f580] 2014-09-29 15:37:19.067: [ CRSPE][1174198592]{2:33589:82} Expression Filter : ((NAME == ora.scan1.vip) AND (LAST_SERVER == db02)) 2014-09-29 15:37:19.069: [UiServer][1176299840]{2:33589:82} Done for ctx=0x2aaab00b94a0 2014-09-29 15:37:46.761: [GIPCHGEN][1115363648] gipchaInterfaceFail: marking interface failing 0x1b077f30 { host '', haName 'ab7a-cf72-e3c3-0259', local (nil), ip '10.10.11.2:24301', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0xd } 2014-09-29 15:37:47.379: [GIPCHGEN][1113262400] gipchaInterfaceDisable: disabling interface 0x1b077f30 { host '', haName 'ab7a-cf72-e3c3-0259', local (nil), ip '10.10.11.2:24301', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0x18d } 2014-09-29 15:37:47.379: [GIPCHDEM][1113262400] gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x1b077f30 { host '', haName 'ab7a-cf72-e3c3-0259', local (nil), ip '10.10.11.2:24301', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0x1ad } 2014-09-29 15:37:47.379: [GIPCXCPT][1115363648] gipchaDaemonProcessRecv: dropping unrecognized daemon request 17, hctx 0x1af1b3f0 [0000000000000010] { gipchaContext : host 'db02', name 'ab7a-cf72-e3c3-0259', luid 'a24af02a-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x5 }, ret gipcretFail (1) 2014-09-29 15:37:47.379: [GIPCHDEM][1115363648] gipchaDaemonProcessRecv: EXCEPTION[ ret gipcretFail (1) ] exception processing requset type 17, hctx 0x1af1b3f0 [0000000000000010] { gipchaContext : host 'db02', name 'ab7a-cf72-e3c3-0259', luid 'a24af02a-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x5 } 2014-09-29 15:37:48.519: [GIPCHGEN][1115363648] gipchaNodeAddInterface: adding interface information for inf 0x1b077f30 { host '', haName 'ab7a-cf72-e3c3-0259', local (nil), ip '10.10.11.2', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0x1 } 2014-09-29 15:37:49.379: [GIPCHTHR][1113262400] gipchaWorkerCreateInterface: created local interface for node 'db02', haName 'ab7a-cf72-e3c3-0259', inf 'udp://10.10.11.2:63341' 2014-09-29 15:38:19.029: [UiServer][1178401088] CS(0x2aaab0095dc0)set Properties ( root,0x1b54db00) 2014-09-29 15:38:19.041: [UiServer][1176299840]{2:33589:83} Sending message to PE. ctx= 0x2aaaac4583f0, Client PID: 6263 2014-09-29 15:38:19.041: [ CRSPE][1174198592]{2:33589:83} Processing PE command id=152. Description: [Stat Resource : 0x1b349af0] 2014-09-29 15:38:19.041: [UiServer][1176299840]{2:33589:83} Done for ctx=0x2aaaac4583f0 ocssd 2014-09-29 15:37:18.813: [ CSSD][1091881280]clssnmCompleteGMReq: Completed request type 17 with status 1 2014-09-29 15:37:18.813: [ CSSD][1091881280]clssgmDoneQEle: re-queueing req 0x2aaaac9c9900 status 1 2014-09-29 15:37:18.813: [ CSSD][1101797696]clssgmCheckReqNMCompletion: Completing request type 17 for proc (0x2aaaac65d250), operation status 1, client status 0 2014-09-29 15:37:22.990: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:22.990: [ CSSD][1128089920]clssnmSendingThread: sent 5 status msgs to all nodes 2014-09-29 15:37:23.016: [ CSSD][1101797696]clssgmExitGrock: client 1 (0x2aaaac9a88c0), grock CRF-, member 2 2014-09-29 15:37:23.016: [ CSSD][1101797696]clssgmUnregisterPrimary: Unregistering member 2 (0x2aaaac9a8be0) in global grock CRF- 2014-09-29 15:37:23.016: [ CSSD][1101797696]clssgmQueueGrockEvent: groupName(CRF-) count(2) master(3) event(2), incarn 10, mbrc 2, to member 3, events 0x38, state 0x0 2014-09-29 15:37:23.016: [ CSSD][1101797696]clssgmUnreferenceMember: global grock CRF- member 2 refcount is 0 2014-09-29 15:37:23.016: [ CSSD][1101797696]clssgmAllocateRPCIndex: allocated rpc 139 (0x2aaaaaedfb48) 2014-09-29 15:37:23.016: [ CSSD][1101797696]clssgmRPC: rpc 0x2aaaaaedfb48 (RPC#139) tag(8b002a) sent to node 2 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmHandleMasterMemberExit: [s(2) d(2)] 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmRemoveMember: grock CRF-, member number 2 (0x2aaaac9a8be0) node number 2 state 0x4 grock type 2 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmQueueGrockEvent: groupName(CRF-) count(1) master(3) event(1), incarn 11, mbrc 1, to member 3, events 0x38, state 0x0 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmRPCDone: rpc 0x2aaaaaedfb48 (RPC#139) state 6, flags 0x100 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmDelMemCmpl: rpc 0x2aaaaaedfb48, ret 0, client 0x2aaaac9a88c0 member 0x2aaaac9a8be0 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmFreeRPCIndex: freeing rpc 139 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmAllocateRPCIndex: allocated rpc 140 (0x2aaaaaedfbf0) 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmRPCBroadcast: rpc(0x8c002a), status(0), sendcount(0), filtered by specific properties: 2014-09-29 15:37:23.016: [ CSSD][1124936000]clssgmFreeRPCIndex: freeing rpc 140 2014-09-29 15:37:23.016: [ CSSD][1101797696]clssgmDiscEndpcl: gipcDestroy 0x4a20 2014-09-29 15:37:23.016: [ CSSD][1101797696]clssgmUnregNodeGroup: Unregistering member (0x2aaaac9ab960) client (0x2aaaac9ab640) 2014-09-29 15:37:23.017: [ CSSD][1101797696]clssgmDiscEndpcl: gipcDestroy 0x4a4e 2014-09-29 15:37:23.017: [ CSSD][1101797696]clssgmDeadProc: proc 0x2aaaac9c3410 2014-09-29 15:37:23.017: [ CSSD][1101797696]clssgmDestroyProc: cleaning up proc(0x2aaaac9c3410) con(0x49f1) skgpid ospid 7243 with 0 clients, refcount 0 2014-09-29 15:37:23.017: [ CSSD][1101797696]clssgmDiscEndpcl: gipcDestroy 0x49f1 2014-09-29 15:37:23.020: [ CSSD][1101797696]clssscSelect: cookie accept request 0x2aaaac024f00 2014-09-29 15:37:23.020: [ CSSD][1101797696]clssgmAllocProc: (0x2aaaac9c0230) allocated 2014-09-29 15:37:23.020: [ CSSD][1101797696]clssgmClientConnectMsg: properties of cmProc 0x2aaaac9c0230 - 1,2,3,4,5 2014-09-29 15:37:23.020: [ CSSD][1101797696]clssgmClientConnectMsg: Connect from con(0x624f) proc(0x2aaaac9c0230) pid(7243) version 11:2:1:4, properties: 1,2,3,4,5 2014-09-29 15:37:23.020: [ CSSD][1101797696]clssgmClientConnectMsg: msg flags 0x0000 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssscSelect: cookie accept request 0x2aaaac9c0230 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssscevtypSHRCON: getting client with cmproc 0x2aaaac9c0230 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssgmRegisterClient: proc(22/0x2aaaac9c0230), client(1/0x2aaaac9add10) 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssgmJoinGrock: global grock CRF- new client 0x2aaaac9add10 with con 0x627e, requested num -1, flags 0x4000e00 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssgmAddGrockMember: adding member to grock CRF- 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssgmAllocateRPCIndex: allocated rpc 141 (0x2aaaaaedfc98) 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssgmRPC: rpc 0x2aaaaaedfc98 (RPC#141) tag(8d002a) sent to node 2 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmHandleMasterMemberAdd: [s(2) d(2)] 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmAddMember: Adding fencing for member 0, group CRF-, death 1, SAGE 0 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmAddMember: member (0/0x2aaaac5594d0) added. pbsz(0) prsz(336) flags 0x0 to grock (0x9ddf690/CRF-) 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmQueueGrockEvent: groupName(CRF-) count(2) master(3) event(1), incarn 12, mbrc 2, to member 3, events 0x38, state 0x0 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmQueueGrockEvent: groupName(CRF-) count(2) master(3) event(1), incarn 12, mbrc 2, to member 0, events 0x38, state 0x0 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmCommonAddMember: global group grock CRF- member(0/Local) node(2) flags 0x0 0x430d02f0 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmRPCDone: rpc 0x2aaaaaedfc98 (RPC#141) state 6, flags 0x100 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmAddGrockMemCmpl: rpc 0x2aaaaaedfc98, ret 0, client 0x2aaaac9add10 member 0x2aaaac5594d0 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmAddGrockMemCmpl: sending type 6, size 540 to 0x2aaaac9add10 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmFreeRPCIndex: freeing rpc 141 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmAllocateRPCIndex: allocated rpc 142 (0x2aaaaaedfd40) 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmRPCBroadcast: rpc(0x8e002a), status(0), sendcount(0), filtered by specific properties: 2014-09-29 15:37:23.021: [ CSSD][1124936000]clssgmFreeRPCIndex: freeing rpc 142 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssscSelect: cookie accept request 0x2aaaac9c0230 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssscevtypSHRCON: getting client with cmproc 0x2aaaac9c0230 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssgmRegisterClient: proc(22/0x2aaaac9c0230), client(2/0x2aaaac6537f0) 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssgmJoinGrock: local grock CSS_INTERNAL_NODE_GROUP new client 0x2aaaac6537f0 with con 0x6298, requested num 0, flags 0x30000 2014-09-29 15:37:23.021: [ CSSD][1101797696]clssgmAddNodeGrpMember: member (0x2aaaac60e430) added 2014-09-29 15:37:23.022: [ CSSD][1101797696]clssgmQueueGrockEvent: groupName(CRF-) count(2) master(3) event(6), incarn 2, mbrc 1, to member 3, events 0x38, state 0x0 2014-09-29 15:37:23.022: [ CSSD][1101797696]clssgmQueueGrockEvent: groupName(CRF-) count(2) master(3) event(6), incarn 2, mbrc 1, to member 0, events 0x38, state 0x0 2014-09-29 15:37:28.001: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:28.001: [ CSSD][1128089920]clssnmSendingThread: sent 5 status msgs to all nodes 2014-09-29 15:37:32.012: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:32.012: [ CSSD][1128089920]clssnmSendingThread: sent 4 status msgs to all nodes 2014-09-29 15:37:36.024: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:36.024: [ CSSD][1128089920]clssnmSendingThread: sent 4 status msgs to all nodes 2014-09-29 15:37:40.036: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:40.036: [ CSSD][1128089920]clssnmSendingThread: sent 4 status msgs to all nodes 2014-09-29 15:37:44.047: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:44.047: [ CSSD][1128089920]clssnmSendingThread: sent 4 status msgs to all nodes 2014-09-29 15:37:46.761: [GIPCHGEN][1109166400] gipchaInterfaceFail: marking interface failing 0x2aaab015dc30 { host '', haName 'CSS_db-cluster', local (nil), ip '10.10.11.2:27733', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 1, numFail 0, idxBoot 0, flags 0x184d } 2014-09-29 15:37:47.349: [GIPCHGEN][1090304320] gipchaInterfaceFail: marking interface failing 0x9dd4160 { host 'db01', haName 'CSS_db-cluster', local 0x2aaab015dc30, ip '10.10.11.2:27733', subnet '10.10.11.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 1, flags 0x1846 } 2014-09-29 15:37:48.055: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:48.055: [ CSSD][1128089920]clssnmSendingThread: sent 4 status msgs to all nodes 2014-09-29 15:37:48.349: [GIPCHGEN][1090304320] gipchaInterfaceDisable: disabling interface 0x2aaab015dc30 { host '', haName 'CSS_db-cluster', local (nil), ip '10.10.11.2:27733', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 1, idxBoot 0, flags 0x19cd } 2014-09-29 15:37:48.349: [GIPCHGEN][1090304320] gipchaInterfaceDisable: disabling interface 0x9dd4160 { host 'db01', haName 'CSS_db-cluster', local 0x2aaab015dc30, ip '10.10.11.2:27733', subnet '10.10.11.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 1, flags 0x18c6 } 2014-09-29 15:37:48.349: [GIPCHALO][1090304320] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x9dd4160 { host 'db01', haName 'CSS_db-cluster', local 0x2aaab015dc30, ip '10.10.11.2:27733', subnet '10.10.11.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 1, flags 0x18e6 } 2014-09-29 15:37:49.349: [GIPCHDEM][1090304320] gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x2aaab015dc30 { host '', haName 'CSS_db-cluster', local (nil), ip '10.10.11.2:27733', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0x18ed } 2014-09-29 15:37:50.519: [GIPCHDEM][1109166400] gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x9b369c0 [0000000000000010] { gipchaContext : host 'db02', name 'CSS_db-cluster', luid 'feec02b2-00000000', numNode 1, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd 2014-09-29 15:37:50.546: [GIPCHGEN][1109166400] gipchaNodeAddInterface: adding interface information for inf 0x2aaab02fd5d0 { host '', haName 'CSS_db-cluster', local (nil), ip '10.10.11.2', subnet '10.10.11.0', mask '255.255.255.0', mac '40-f2-e9-db-c9-fc', ifname 'bond1', numRef 0, numFail 0, idxBoot 0, flags 0x1841 } 2014-09-29 15:37:51.349: [GIPCHTHR][1090304320] gipchaWorkerCreateInterface: created local interface for node 'db02', haName 'CSS_db-cluster', inf 'udp://10.10.11.2:38014' 2014-09-29 15:37:51.349: [GIPCHTHR][1090304320] gipchaWorkerCreateInterface: created local bootstrap multicast interface for node 'db02', haName 'CSS_db-cluster', inf 'mcast://224.0.0.251:42424/10.10.11.2' 2014-09-29 15:37:51.349: [GIPCHTHR][1090304320] gipchaWorkerCreateInterface: created local bootstrap multicast interface for node 'db02', haName 'CSS_db-cluster', inf 'mcast://230.0.1.0:42424/10.10.11.2' 2014-09-29 15:37:51.349: [GIPCHTHR][1090304320] gipchaWorkerCreateInterface: created local bootstrap broadcast interface for node 'db02', haName 'CSS_db-cluster', inf 'udp://10.10.11.255:42424' 2014-09-29 15:37:51.349: [GIPCHGEN][1090304320] gipchaNodeAddInterface: adding interface information for inf 0x9dd4160 { host 'db01', haName 'CSS_db-cluster', local (nil), ip '10.10.11.2:38014', subnet '10.10.11.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1842 } 2014-09-29 15:37:51.350: [GIPCHTHR][1090304320] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'db01', haName 'CSS_db-cluster', inf 'mcast://224.0.0.251:42424/10.10.11.2:38014' 2014-09-29 15:37:51.350: [GIPCHTHR][1090304320] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'db01', haName 'CSS_db-cluster', inf 'mcast://230.0.1.0:42424/10.10.11.2:38014' 2014-09-29 15:37:51.350: [GIPCHTHR][1090304320] gipchaWorkerCreateInterface: created remote bootstrap broadcast interface for node 'db01', haName 'CSS_db-cluster', inf 'udp://10.10.11.255:42424' 2014-09-29 15:37:51.350: [GIPCHGEN][1090304320] gipchaWorkerAttachInterface: Interface attached inf 0x9dd4160 { host 'db01', haName 'CSS_db-cluster', local 0x2aaab02fd5d0, ip '10.10.11.2:38014', subnet '10.10.11.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 } 2014-09-29 15:37:52.063: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:52.063: [ CSSD][1128089920]clssnmSendingThread: sent 4 status msgs to all nodes 2014-09-29 15:37:57.073: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:37:57.073: [ CSSD][1128089920]clssnmSendingThread: sent 5 status msgs to all nodes 2014-09-29 15:38:02.083: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:38:02.083: [ CSSD][1128089920]clssnmSendingThread: sent 5 status msgs to all nodes 2014-09-29 15:38:06.091: [ CSSD][1128089920]clssnmSendingThread: sending status msg to all nodes 2014-09-29 15:38:06.091: [ CSSD][1128089920]clssnmSendingThread: sent 4 status msgs to all nodes
###db01 message Sep 29 15:29:27 db01 kernel: bonding: bond1: link status definitely down for interface eth3, disabling it Sep 29 15:29:27 db01 kernel: bonding: bond1: making interface eth2 the new active one. Sep 29 15:29:31 db01 kernel: igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX Sep 29 15:29:31 db01 kernel: bonding: bond1: link status definitely up for interface eth3. Sep 29 15:31:28 db01 kernel: igb: eth2 NIC Link is Down Sep 29 15:31:28 db01 kernel: bonding: bond1: link status definitely down for interface eth2, disabling it Sep 29 15:31:28 db01 kernel: bonding: bond1: making interface eth3 the new active one. Sep 29 15:31:28 db01 kernel: igb: eth3 NIC Link is Down Sep 29 15:31:29 db01 kernel: bonding: bond1: link status definitely down for interface eth3, disabling it Sep 29 15:31:29 db01 kernel: bonding: bond1: now running without any active interface ! Sep 29 15:31:54 db01 kernel: igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX Sep 29 15:31:54 db01 kernel: bonding: bond1: link status definitely up for interface eth2. Sep 29 15:31:54 db01 kernel: bonding: bond1: making interface eth2 the new active one. Sep 29 15:31:54 db01 kernel: bonding: bond1: first active interface up! Sep 29 15:31:54 db01 kernel: igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX Sep 29 15:31:54 db01 kernel: bonding: bond1: link status definitely up for interface eth3. Sep 29 15:36:10 db01 shutdown[17047]: shutting down for system reboot Sep 29 15:36:11 db01 gconfd (root-6536): Received signal 15, shutting down cleanly Sep 29 15:36:11 db01 gconfd (root-6536): Exiting ###db02 message Sep 29 15:36:54 db02 kernel: igb: eth2 NIC Link is Down Sep 29 15:36:54 db02 kernel: bonding: bond1: link status definitely down for interface eth2, disabling it Sep 29 15:36:54 db02 kernel: bonding: bond1: making interface eth3 the new active one. Sep 29 15:36:55 db02 kernel: igb: eth3 NIC Link is Down Sep 29 15:36:55 db02 kernel: bonding: bond1: link status definitely down for interface eth3, disabling it Sep 29 15:36:55 db02 kernel: bonding: bond1: now running without any active interface ! Sep 29 15:37:10 db02 kernel: igb: eth2 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX Sep 29 15:37:10 db02 kernel: igb: eth3 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX Sep 29 15:37:10 db02 kernel: bonding: bond1: link status definitely up for interface eth2. Sep 29 15:37:10 db02 kernel: bonding: bond1: making interface eth2 the new active one. Sep 29 15:37:10 db02 kernel: bonding: bond1: first active interface up! Sep 29 15:37:10 db02 kernel: bonding: bond1: link status definitely up for interface eth3.
问题分析:
从如上详细的日志信息我们不难看出有如下动作,在db01上执行关机操作之后ocss和crsd进程都会向远端返送消息告诉对端本机即将执行关闭操作。后再停止各个进程。
在节点二中我们可从message日志中看到之前的私网bond1口状态时Down,在15:37分将第一个节点shutdown -immediate之后私网bond1居然自动执行了up操作。随即我们从ocss和crsd的日志中可以看到集群进程都正在起来。
那么这个时候我们可以分析问题应该是出在私网网络这一部分,可能是网卡绑定的问题。
问题处理过程:
既然从日志中看出是网络问题,那么我们就从网络排除,待节点一重启启动后,首先采用ping私网来确定,节点一启动了,同样集群服务是没有起来的:
ping节点2的私网,不通: [root@db01 ~]# ping pri02 PING pri02.xmtvdb.com (10.10.11.2) 56(84) bytes of data. From pri01.xmtvdb.com (10.10.11.1) icmp_seq=193 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=194 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=195 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=197 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=198 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=199 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=201 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=202 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=203 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=204 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=205 Destination Host Unreachable From pri01.xmtvdb.com (10.10.11.1) icmp_seq=206 Destination Host Unreachable 检查bonding,是好的,没有问题: ###db01 [root@db01 ~]# cat /proc/net/bonding/bond1 Ethernet Channel Bonding Driver: v3.4.0-1 (October 7, 2008) Bonding Mode: fault-tolerance (active-backup) Primary Slave: None Currently Active Slave: eth2 MII Status: up MII Polling Interval (ms): 100 Up Delay (ms): 0 Down Delay (ms): 0 Slave Interface: eth2 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 1 Permanent HW addr: 40:f2:e9:db:c9:c4 Slave Interface: eth3 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 1 Permanent HW addr: 40:f2:e9:db:c9:c5 ###db02 [root@db02 ~]# cat /proc/net/bonding/bond1 Ethernet Channel Bonding Driver: v3.4.0-1 (October 7, 2008) Bonding Mode: fault-tolerance (active-backup) Primary Slave: None Currently Active Slave: eth2 MII Status: up MII Polling Interval (ms): 100 Up Delay (ms): 0 Down Delay (ms): 0 Slave Interface: eth2 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 40:f2:e9:db:c9:fc Slave Interface: eth3 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 40:f2:e9:db:c9:fd 随即尝试down掉节点二的bond1中的eth3网口,发现可以ping通,且集群能够起来。 ###db02 [root@db02 ~]# ifdow eth3 Sep 29 15:40:55 db02 kernel: bonding: bond1: Removing slave eth3 ###db01 [root@db01 ~]# ping pri02 PING pri02.xmtvdb.com (10.10.11.2) 56(84) bytes of data. 64 bytes from pri02.xmtvdb.com (10.10.11.2): icmp_seq=1 ttl=64 time=0.071 ms 64 bytes from pri02.xmtvdb.com (10.10.11.2): icmp_seq=2 ttl=64 time=0.122 ms 64 bytes from pri02.xmtvdb.com (10.10.11.2): icmp_seq=3 ttl=64 time=0.134 ms 64 bytes from pri02.xmtvdb.com (10.10.11.2): icmp_seq=4 ttl=64 time=0.098 ms 同时这个时候集群服务也起来了: [root@db01 ~]# su - grid -c "crsctl status res -t" -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.BAK001.dg ONLINE ONLINE db01 ONLINE ONLINE db02 ora.DATA001.dg ONLINE ONLINE db01 ONLINE ONLINE db02 ora.FRA001.dg ONLINE ONLINE db01 ONLINE ONLINE db02 ora.LISTENER.lsnr ONLINE ONLINE db01 ONLINE ONLINE db02 ora.OCR_VOTE.dg ONLINE ONLINE db01 ONLINE ONLINE db02 ora.asm ONLINE ONLINE db01 Started ONLINE ONLINE db02 Started ora.gsd OFFLINE OFFLINE db01 OFFLINE OFFLINE db02 ora.net1.network ONLINE ONLINE db01 ONLINE ONLINE db02 ora.ons ONLINE ONLINE db01 ONLINE ONLINE db02 ora.registry.acfs ONLINE ONLINE db01 ONLINE ONLINE db02 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE db01 ora.cvu 1 ONLINE ONLINE db01 ora.db01.vip 1 ONLINE ONLINE db01 ora.db02.vip 1 ONLINE ONLINE db02 ora.oc4j 1 ONLINE ONLINE db01 ora.scan1.vip 1 ONLINE ONLINE db01 ora.xmman.db 1 ONLINE ONLINE db01 Open 2 ONLINE ONLINE db02 Open ora.xmman.taf.svc 1 ONLINE ONLINE db01 2 ONLINE ONLINE db02 再次把eth3 up起来,不受影响 ###db02 [root@db02 ~]# ifup eth3 ###db01 [root@db01 ~]# ping pri02 PING pri02.xmtvdb.com (10.10.11.2) 56(84) bytes of data. 64 bytes from pri02.xmtvdb.com (10.10.11.2): icmp_seq=1 ttl=64 time=0.161 ms 64 bytes from pri02.xmtvdb.com (10.10.11.2): icmp_seq=2 ttl=64 time=0.022 ms 64 bytes from pri02.xmtvdb.com (10.10.11.2): icmp_seq=3 ttl=64 time=0.034 ms 64 bytes from pri02.xmtvdb.com (10.10.11.2): icmp_seq=4 ttl=64 time=0.196 ms
随即根据Oracle最佳实践将直连的两根心跳线连接上交换后,问题没有再现;原因未知,有知道的请告知。