RAC ONS Fails to Start

A friend in the DBA group had a RAC environment in which the ONS process would not start. The platform is 64-bit Red Hat 5.3.

 

The ONS log is as follows:

 

2010-10-18 09:42:11.384: [RACG][3041022624] [16815][3041022624][ora.rac1.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/oracle/product/10.2.0/crs_1

 

2010-10-18 09:42:11.384: [RACG][3041022624] [16815][3041022624][ora.rac1.ons]: clsrcexecut: cmd = /u01/oracle/product/10.2.0/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/oracle/product/10.2.0/crs_1/bin/onsctl stop

 

2010-10-18 09:42:11.384: [RACG][3041022624] [16815][3041022624][ora.rac1.ons]: clsrcexecut: rc = 99, time = 540.630s

 

2010-10-18 10:55:44.720: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: timeout: killed the spawned process

 

2010-10-18 10:55:44.721: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/oracle/product/10.2.0/crs_1

 

2010-10-18 10:55:44.721: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: cmd = /u01/oracle/product/10.2.0/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/oracle/product/10.2.0/crs_1/bin/onsctl start

 

2010-10-18 10:55:44.721: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: rc = 99, time = 540.410s

 

2010-10-18 10:59:12.517: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: /u01/oracle/product/10.2.0/crs_1/bin/onsctl: line 81: 31584Terminated              $ONSADMIN ping

ons is not running ...

 

2010-10-18 10:59:12.517: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/oracle/product/10.2.0/crs_1

 

2010-10-18 10:59:12.517: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: cmd = /u01/oracle/product/10.2.0/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/oracle/product/10.2.0/crs_1/bin/onsctl ping

 

2010-10-18 10:59:12.517: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: rc = 1, time = 207.800s

 

2010-10-18 10:59:12.517: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: end for resource = ora.rac1.ons, action = start, status = 1, time = 748.230s

 

2010-10-18 10:59:13.781: [RACG][1366147744] [1357][1366147744][ora.rac1.ons]: onsctl: shutting down ons daemon ...

/u01/oracle/product/10.2.0/crs_1/bin/onsctl: line 118:  1362 Terminated              $ONSADMIN shutdown

onsctl: shutdown of ons failed!

 

The crsd.log shows the following:

 

timeout for ora.rac1.ons timeout=600

start resource error for ora.rac1.ons error code=-2
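
The excerpts above can be summarized mechanically. What follows is only a minimal sketch, assuming that the number placed before the onsctl path in each racgeut command (540 here) is a time limit in seconds; the log does not say so explicitly, although the "timeout: killed the spawned process" line is consistent with that reading. The function name `summarize_ons_log` and the regular expressions are illustrative and not part of any Oracle tool.

```python
import re

# "clsrcexecut: cmd = .../racgeut -e ... 540 .../onsctl stop"
# Captures the assumed time limit (540) and the onsctl action (stop/start/ping).
CMD_RE = re.compile(r"clsrcexecut: cmd = \S*racgeut\b.*?\s(\d+)\s+\S*onsctl\s+(\w+)")
# "clsrcexecut: rc = 99, time = 540.630s"
RC_RE = re.compile(r"clsrcexecut: rc = (-?\d+), time = ([\d.]+)s")


def summarize_ons_log(log_text):
    """Pair each 'cmd =' line with the 'rc =' line that follows it."""
    records = []
    pending = None  # (limit_s, action) of the last cmd line seen
    for line in log_text.splitlines():
        m = CMD_RE.search(line)
        if m:
            pending = (int(m.group(1)), m.group(2))
            continue
        m = RC_RE.search(line)
        if m and pending is not None:
            limit_s, action = pending
            elapsed_s = float(m.group(2))
            records.append({
                "action": action,
                "rc": int(m.group(1)),
                "elapsed_s": elapsed_s,
                "limit_s": limit_s,
                # Assumption: running at least as long as the limit means the
                # command was cut off rather than finishing on its own.
                "hit_limit": elapsed_s >= limit_s,
            })
            pending = None
    return records


# The cmd/rc lines copied from the ONS log quoted in this post.
SAMPLE = """\
2010-10-18 09:42:11.384: [RACG][3041022624] [16815][3041022624][ora.rac1.ons]: clsrcexecut: cmd = /u01/oracle/product/10.2.0/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/oracle/product/10.2.0/crs_1/bin/onsctl stop
2010-10-18 09:42:11.384: [RACG][3041022624] [16815][3041022624][ora.rac1.ons]: clsrcexecut: rc = 99, time = 540.630s
2010-10-18 10:55:44.721: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: cmd = /u01/oracle/product/10.2.0/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/oracle/product/10.2.0/crs_1/bin/onsctl start
2010-10-18 10:55:44.721: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: rc = 99, time = 540.410s
2010-10-18 10:59:12.517: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: cmd = /u01/oracle/product/10.2.0/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/oracle/product/10.2.0/crs_1/bin/onsctl ping
2010-10-18 10:59:12.517: [RACG][1604653728] [18288][1604653728][ora.rac1.ons]: clsrcexecut: rc = 1, time = 207.800s
"""

if __name__ == "__main__":
    for rec in summarize_ons_log(SAMPLE):
        print(rec)
```

On these lines the sketch flags the `stop` and `start` calls as having run for the full 540 seconds, while `ping` returned on its own after about 208 seconds with rc = 1.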

 

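Judging from the errors, this is a connection timeout. RAC itself was running normally, but there were many ONS processes and they were consuming a large amount of CPU, with CPU usage at 100%. Because this is a production database, we had to proceed carefully. I added 布豆 (Budou) from the DBA1 group to the discussion, since he has a lot of experience with RAC.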
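
The observation about many ONS processes and high CPU could be checked with something like the sketch below. It is an illustration under stated assumptions, not what was run on this system: it assumes a Linux `ps` that accepts `-eo pid,pcpu,args`, and it assumes the daemon's executable is literally named `ons`. The function names are hypothetical. Note also that the `%CPU` column of `ps` is an average over each process's lifetime, so `top` remains the better tool for a live reading.

```python
import os
import subprocess


def summarize_ons_processes(ps_output):
    """Return (pids, total_cpu) for processes whose executable is named 'ons'.

    Expects the text produced by `ps -eo pid,pcpu,args`, header line included.
    """
    pids = []
    total_cpu = 0.0
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 2)  # pid, %cpu, full command line
        if len(fields) < 3:
            continue
        pid, pcpu, args = fields
        executable = args.split()[0]
        # Assumption: the ONS daemon binary is called exactly "ons".
        if os.path.basename(executable) == "ons":
            pids.append(int(pid))
            total_cpu += float(pcpu)
    return pids, total_cpu


def current_ons_processes():
    """Run ps on the local node and summarize the result."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True, text=True, check=True,
    )
    return summarize_ons_processes(result.stdout)
```

Calling `current_ons_processes()` on each node before and after a restart would give a simple count and CPU total to compare.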

 

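According to 布豆, Oracle RAC processes sometimes misbehave for no apparent reason, and even Oracle itself cannot always explain why. After my friend rebooted the node 1 server, ONS started normally, and he then rebooted node 2 as well. He suspects that a change to the network policy had affected the system.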

 

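Once the problem was solved, the three of us chatted for a while, and one of the topics was backups. For a database, backups matter more than anything else. Back up the database, the control files, and the spfile, because all of them are critical for recovery. Only a valid backup can keep the loss to a minimum when something goes wrong.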

