Oracle RAC: root.sh fails with "Timed out waiting for the CRS stack to start" and how to fix it


1. Problem description

While installing 11.2.0.1 RAC on Oracle Linux 6.1, running root.sh on the second node timed out, as shown below:

[root@rac2 ~]# /u01/app/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /u01/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
Copying dbhome to /usr/local/bin ...
Copying oraenv to /usr/local/bin ...
Copying coraenv to /usr/local/bin ...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2012-06-27 14:46:35: Parsing the host name
2012-06-27 14:46:35: Checking for super user privileges
2012-06-27 14:46:35: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-6Server-1.0.2.x86_64
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac2'
CRS-2676: Start of 'ora.mdnsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac2'
CRS-2676: Start of 'ora.gipcd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac2'
CRS-2676: Start of 'ora.gpnpd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac2'
CRS-2676: Start of 'ora.ctssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac2'
CRS-2676: Start of 'ora.asm' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac2'
CRS-2676: Start of 'ora.crsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'rac2'
CRS-2676: Start of 'ora.evmd' on 'rac2' succeeded
Timed out waiting for the CRS stack to start.

Checking the cluster status on both nodes:

[oracle@rac1 bin]$ ./crsctl check cluster -all
**************************************************************
rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[oracle@rac2 bin]$ ./crsctl check cluster -all
**************************************************************
rac2:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[oracle@rac1 bin]$ ./crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora.DATA.dg    ora....up.type 0/5    0/     ONLINE    ONLINE    rac1
ora....N1.lsnr ora....er.type 0/5    0/0    ONLINE    ONLINE    rac1
ora.asm        ora.asm.type   0/5    0/     ONLINE    ONLINE    rac1
ora.eons       ora.eons.type  0/3    0/     ONLINE    ONLINE    rac1
ora.gsd        ora.gsd.type   0/5    0/     OFFLINE   OFFLINE
ora....network ora....rk.type 0/5    0/     ONLINE    ONLINE    rac1
ora.oc4j       ora.oc4j.type  0/5    0/0    OFFLINE   OFFLINE
ora.ons        ora.ons.type   0/3    0/     ONLINE    ONLINE    rac1
ora....SM1.asm application    0/5    0/0    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    0/5    0/0    OFFLINE   OFFLINE
ora.rac1.ons   application    0/3    0/0    ONLINE    ONLINE    rac1
ora.rac1.vip   ora....t1.type 0/0    0/0    ONLINE    ONLINE    rac1
ora.scan1.vip  ora....ip.type 0/0    0/0    ONLINE    ONLINE    rac1

[oracle@rac2 bin]$ ./crs_stat -t -v
CRS-0184: Cannot communicate with the CRS daemon.
[oracle@rac2 bin]$

On node 2 the commands fail because CRSD cannot be reached.

2. The MOS note

Root.Sh Failing with 'Prom_rpc: Clsc Send Failure..Ret Code 6' [ID 745215.1]

2.1 Symptoms

During CRS install root.sh fails on the last node with the following message:


Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Timed out waiting for the CRS stack to start.


The crsd.log on the first node shows:

2008-10-21 21:04:55.087: [ OCRMSG][1325496672]prom_rpc: CLSC send failure..ret code 6
2008-10-21 21:04:55.087: [ OCRMSG][1325496672]prom_rpc: possible OCR retry scenario
2008-10-21 21:04:55.087: [ OCRSRV][1325496672]proas_forward_request: PROM_TIME_OUT or Master Fail
2008-10-21 21:04:55.296: [ COMMCRS][2540957248]clscsendx: (0xc43bd0) Connection not active

2008-10-21 21:04:55.296: [ OCRMSG][2540957248]prom_rpc: CLSC send failure..ret code 6
2008-10-21 21:04:55.296: [ OCRMSG][2540957248]prom_rpc: possible OCR retry scenario
2008-10-21 21:04:55.296: [ OCRCLI][2540957248]proac_open_key:[SYSTEM.crs.debug.ist4-db1-3-sfm.COMMNS]: Writer failed. Retval [203]



The ocssd.log on the first node shows:

2008-10-21 20:33:01.626: [ OCRAPI][2540958848]procr_open: Node Failure. Attempting retry #0
2008-10-21 20:33:02.628: [ OCRAPI][2540958848]procr_open: Node Failure. Attempting retry #1
2008-10-21 20:33:03.631: [ OCRAPI][2540958848]procr_open: Node Failure. Attempting retry #2


The ocssd.log on the last node shows:

[ CSSD]2008-10-22 04:46:42.457 [1241577824] >TRACE: clssnmRcfgMgrThread: lastleader(1) unique(1224650474)
[ CSSD]2008-10-22 04:46:43.219 [1168148832] >TRACE: clssnmSendVoteInfo: node(1) syncSeqNo(4)
[ CSSD]2008-10-22 04:46:56.338 [2537955008] >ERROR: clssgmStartNMMon: timed out waiting on nested NM reconfig. Self-sacrificing to kick others awake.
[ CSSD]2008-10-22 04:46:56.338 [2537955008] >ERROR: StartCMMon(): clssnmNMDetach failed - 2
[ CSSD]2008-10-22 04:46:56.338 [2537955008] >TRACE: clssscctx: dump of 0x0x5d2360, len 3792



2.2 Cause

This is due to a failure of communication between crsd.bin on nodes.

2.3 Solution

Check the network for the following items:

- Check to see that there is No firewall between the nodes
- Make sure that the MTU size is same.
- If MTU is larger than 1500, then the switch must be able to support larger MTU size.
- Make sure that you have disabled SELINUX
- Make sure that NICs are using full duplex and not auto negotiate.
- Misconfiguration on the switches will also cause this issue.
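
The checklist above can be walked through on each node with standard Linux tools. The following is only a sketch, not part of the MOS note: the interface name in IFACE (eth1) is an assumption and should be set to the private interconnect NIC, and `service iptables status` assumes an Oracle Linux 6 style init system. Compare the output from both nodes by hand.

```shell
#!/bin/sh
# Sketch: collect the network facts from the MOS checklist on one node.
# IFACE is an assumption; set it to the private interconnect NIC.
IFACE="${IFACE:-eth1}"

# Extract the MTU value from one line of `ip -o link show` output.
parse_mtu() {
    sed -n 's/.* mtu \([0-9][0-9]*\).*/\1/p'
}

# Print firewall, SELinux, MTU and duplex information for IFACE.
report() {
    echo "== firewall =="; service iptables status
    echo "== SELinux ==";  getenforce
    echo "== MTU ==";      ip -o link show "$IFACE" | parse_mtu
    echo "== duplex ==";   ethtool "$IFACE" | grep -Ei 'speed|duplex|auto-negotiation'
}

# Only query the system when the script is called with the argument "run".
if [ "${1:-}" = "run" ]; then
    report
fi
```

Run it as root with the argument `run` on rac1 and rac2; the MTU and duplex values should match on both nodes, the firewall should be off, and `getenforce` should not report Enforcing.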

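3. Solution

The causes listed in the MOS note involve the firewall, MTU, SELinux, and NIC and switch settings, so it is fairly safe to conclude that this kind of failure is network-related. In my case the NIC names differed between the two nodes, so I renamed the NICs to match and then ran root.sh again.

Before the change, rac1 used eth0 and eth1, while node 2 used eth5 and eth6. How to rename a NIC is easy to find online and is not covered here.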
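
A mismatch like this can be spotted before running root.sh by comparing the interface names on the two nodes. The sketch below is an illustration, not something taken from the original troubleshooting session: the helper name only_in_first is hypothetical, and the usage shown in the comments assumes passwordless ssh from rac1 to rac2.

```shell
#!/bin/sh
# Sketch: print the names that appear in list $1 but not in list $2.
# Each argument is a whitespace-separated list of interface names.
only_in_first() {
    # Normalise newlines and repeated blanks to single spaces.
    second=" $(echo $2) "
    for name in $1; do
        case "$second" in
            *" $name "*) ;;                 # present on both nodes
            *) printf '%s\n' "$name" ;;     # missing on the other node
        esac
    done
}

# Hypothetical usage on rac1 (assumes passwordless ssh to rac2):
#   local_nics=$(ls /sys/class/net)
#   remote_nics=$(ssh rac2 ls /sys/class/net)
#   only_in_first "$local_nics" "$remote_nics"
#   only_in_first "$remote_nics" "$local_nics"
```

With the names from this case, comparing "eth0 eth1 lo" against "eth5 eth6 lo" lists eth0 and eth1, and the reverse comparison lists eth5 and eth6. Empty output in both directions means the names match.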

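First undo the configuration left by the failed run on node 2, using the following command:

/u01/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -verbose -force

Note that this command differs between Oracle 11g and 10g.

-- Deconfigure: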

[root@rac2 ~]# /u01/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -verbose -force
2012-06-27 15:12:30: Parsing the host name
2012-06-27 15:12:30: Checking for super user privileges
2012-06-27 15:12:30: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
PRCR-1035 : Failed to look up CRS resource ora.cluster_vip.type for 1
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.eons is registered
Cannot communicate with crsd
ADVM/ACFS is not supported on oraclelinux-release-6Server-1.0.2.x86_64
ACFS-9201: Not Supported
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'rac2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac2'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac2'
CRS-2673: Attempting to stop 'ora.asm' on 'rac2'
CRS-2677: Stop of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac2'
CRS-2677: Stop of 'ora.cssd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.diskmon' on 'rac2'
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'
CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
error: package cvuqdisk is not installed
Successfully deconfigured Oracle clusterware stack on this node

-- Run root.sh again; this time it succeeds.

[root@rac2 ~]# /u01/app/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /u01/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
The file "oraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
The file "coraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2012-06-27 16:21:25: Parsing the host name
2012-06-27 16:21:25: Checking for super user privileges
2012-06-27 16:21:25: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-6Server-1.0.2.x86_64
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac2'
CRS-2676: Start of 'ora.mdnsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac2'
CRS-2676: Start of 'ora.gipcd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac2'
CRS-2676: Start of 'ora.gpnpd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac2'
CRS-2676: Start of 'ora.ctssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac2'
CRS-2676: Start of 'ora.asm' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac2'
CRS-2676: Start of 'ora.crsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'rac2'
CRS-2676: Start of 'ora.evmd' on 'rac2' succeeded
rac2     2012/06/27 16:25:16     /u01/app/11.2.0/grid/cdata/rac2/backup_20120627_162516.olr
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB. Actual 999 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
[root@rac2 ~]#

Verification:

[oracle@rac2 bin]$ ./crsctl check cluster -all
**************************************************************
rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

[oracle@rac2 bin]$ ./crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora.DATA.dg    ora....up.type 0/5    0/     ONLINE    ONLINE    rac1
ora....N1.lsnr ora....er.type 0/5    0/0    ONLINE    ONLINE    rac1
ora.asm        ora.asm.type   0/5    0/     ONLINE    ONLINE    rac1
ora.eons       ora.eons.type  0/3    0/     ONLINE    ONLINE    rac1
ora.gsd        ora.gsd.type   0/5    0/     OFFLINE   OFFLINE
ora....network ora....rk.type 0/5    0/     ONLINE    ONLINE    rac1
ora.oc4j       ora.oc4j.type  0/5    0/0    OFFLINE   OFFLINE
ora.ons        ora.ons.type   0/3    0/     ONLINE    ONLINE    rac1
ora....SM1.asm application    0/5    0/0    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    0/5    0/0    OFFLINE   OFFLINE
ora.rac1.ons   application    0/3    0/0    ONLINE    ONLINE    rac1
ora.rac1.vip   ora....t1.type 0/0    0/0    ONLINE    ONLINE    rac1
ora....SM2.asm application    0/5    0/0    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    0/5    0/0    OFFLINE   OFFLINE
ora.rac2.ons   application    0/3    0/0    ONLINE    ONLINE    rac2
ora.rac2.vip   ora....t1.type 0/0    0/0    ONLINE    ONLINE    rac2
ora.scan1.vip  ora....ip.type 0/0    0/0    ONLINE    ONLINE    rac1
[oracle@rac2 bin]$

root.sh completed successfully, and node 2 has joined the cluster.

-------------------------------------------------------------------------------------------------------

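All rights reserved. This article may be reposted, but the source must be credited with a link to the original; otherwise legal action may be taken.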


Blog: http://www.tianlesoftware.com
