CRS-1006, CRS-0215: A Troubleshooting Case

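After installing SLES 10 SP3 + Oracle 10g RAC, configuring the listener kept reporting that the listener service on host bo2dbp was already running. After ignoring the error and starting the listener manually on node bo2dbp, I always got TNS-12545: Connect failed because target host or object does not exist. I later found that the VIP of node bo2dbp kept drifting to the other node, bo2dbs, and that turned out to be the culprit behind the listener failure.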

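1. After running netca in the cluster, the listener on node bo2dbp would not start
Starting the listener manually still failed. listener.ora and the listener log showed nothing abnormal, and the /etc/hosts configuration on the host also looked fine.
With no better option, I deleted the listener.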
oracle@bo2dbp:~> crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....BP.lsnr application ONLINE OFFLINE
ora.bo2dbp.gsd application ONLINE ONLINE bo2dbp
ora.bo2dbp.ons application ONLINE ONLINE bo2dbp
ora.bo2dbp.vip application ONLINE ONLINE bo2dbs
ora....BS.lsnr application ONLINE ONLINE bo2dbs
ora.bo2dbs.gsd application ONLINE ONLINE bo2dbs
ora.bo2dbs.ons application ONLINE ONLINE bo2dbs
ora.bo2dbs.vip application ONLINE ONLINE bo2dbs

2. The VIP of node bo2dbp (ora.bo2dbp.vip) has drifted to bo2dbs
oracle@bo2dbp:~/product/10.2.0/crs_1/bin> crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.bo2dbp.gsd application ONLINE ONLINE bo2dbp
ora.bo2dbp.ons application ONLINE ONLINE bo2dbp
ora.bo2dbp.vip application ONLINE ONLINE bo2dbs
ora.bo2dbs.gsd application ONLINE ONLINE bo2dbs
ora.bo2dbs.ons application ONLINE ONLINE bo2dbs
ora.bo2dbs.vip application ONLINE ONLINE bo2dbs
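
A drifted VIP is easy to miss in a long crs_stat listing. The following is a minimal sketch, not part of the original procedure, that flags any VIP resource running on a node other than the one in its name. The function name `misplaced_vips` is hypothetical, and it assumes the resource names are shown in full (`ora.<node>.vip`) rather than truncated with dots.

```shell
# Read `crs_stat -t` output on stdin and report every VIP resource whose
# current host differs from the node embedded in its name (ora.<node>.vip).
# Rows without a Host column (OFFLINE resources) are skipped.
misplaced_vips() {
    awk '$1 ~ /^ora\..*\.vip$/ && NF >= 5 {
        split($1, parts, "[.]")        # parts[2] is the home node name
        if (parts[2] != $5)
            print $1 " is on " $5 ", expected " parts[2]
    }'
}

# Example (run on a cluster node):
#   crs_stat -t | misplaced_vips
```

Fed the listing above, this would report ora.bo2dbp.vip as running on bo2dbs.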

3. Trying to relocate the VIP back to bo2dbp does not work either
oracle@bo2dbp:~/product/10.2.0/crs_1/bin> ./crs_relocate ora.bo2dbp.vip -c bo2dbp
Attempting to stop `ora.bo2dbp.vip` on member `bo2dbs`
Stop of `ora.bo2dbp.vip` on member `bo2dbs` succeeded.
Attempting to start `ora.bo2dbp.vip` on member `bo2dbp`
Start of `ora.bo2dbp.vip` on member `bo2dbp` failed.
Attempting to start `ora.bo2dbp.vip` on member `bo2dbs`
Start of `ora.bo2dbp.vip` on member `bo2dbs` succeeded.
CRS-0217: Could not relocate resource 'ora.bo2dbp.vip'.

4. Trying crs_stop and crs_start
oracle@bo2dbp:~/product/10.2.0/crs_1/bin> crs_stop ora.bo2dbp.vip
Attempting to stop `ora.bo2dbp.vip` on member `bo2dbs`
Stop of `ora.bo2dbp.vip` on member `bo2dbs` succeeded.

oracle@bo2dbp:~/product/10.2.0/crs_1/bin> crs_start ora.bo2dbp.vip
Attempting to start `ora.bo2dbp.vip` on member `bo2dbp`
Start of `ora.bo2dbp.vip` on member `bo2dbp` failed.
Attempting to start `ora.bo2dbp.vip` on member `bo2dbs`
Start of `ora.bo2dbp.vip` on member `bo2dbs` succeeded.

Once again it drifted to node bo2dbs.

5. A harsher approach: stop CRS on node 2 (bo2dbs) outright, so the VIP has nowhere left to drift :)
bo2dbs:/u01/app/oracle/product/10.2.0/crs_1/bin # ./crsctl stop crs
Stopping resources. This could take several minutes.
Successfully stopped CRS resources.
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

6. The real problem shows itself
oracle@bo2dbp:~/product/10.2.0/crs_1/bin> crs_start ora.bo2dbp.vip
Attempting to start `ora.bo2dbp.vip` on member `bo2dbp`
Start of `ora.bo2dbp.vip` on member `bo2dbp` failed.
Remote start for `ora.bo2dbp.vip` failed on member `bo2dbs`
CRS-1006: No more members to consider

CRS-0215: Could not start resource 'ora.bo2dbp.vip'.



7. Check the log

oracle@bo2dbp:~/product/10.2.0/crs_1/log/bo2dbp/racg> tail -50 ora.bo2dbp.vip.log

2012-08-30 16:18:55.942: [ RACG][4250161648] [20571][4250161648][ora.bo2dbp.vip]: end for resource = ora.bo2dbp.vip, action = start, status = 1, time = 6.390s

2012-08-30 16:19:30.968: [ RACG][4041843184] [20878][4041843184][ora.bo2dbp.vip]: checkIf: Default gateway is not defined (host=bo2dbp)
Interface eth0 checked failed (host=bo2dbp)
Invalid parameters, or failed to bring up VIP (host=bo2dbp)

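So the real culprit was a missing default gateway on bo2dbp.
I remember hitting this in an earlier installation, where a missing default gateway on node 2 kept vipca from proceeding. Now it has happened again.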
oracle@bo2dbp:~> cat /etc/sysconfig/network/routes
cat: /etc/sysconfig/network/routes: No such file or directory
oracle@bo2dbp:~> ssh bo2dbs cat /etc/sysconfig/network/routes
default 192.168.7.254 - -
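
The routes file only shows the persistent configuration. To confirm what the kernel is actually using, the live routing table can be checked as well. This is a small sketch, assuming the `ip` utility from iproute2 is installed on the node. The helper name `has_default_route` is hypothetical.

```shell
# Read routing-table text on stdin (for example from `ip route show`) and
# succeed only if some line starts with the keyword "default".
has_default_route() {
    awk '$1 == "default" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Example (run on each node):
#   ip route show | has_default_route || echo "no default gateway on $(hostname)"
```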

After adding the default gateway on bo2dbp as root, everything was OK.
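
The post does not show the exact commands used for the fix. One possible way to do it, sketched here under assumptions, is to write the same style of line that bo2dbs has into the routes file on bo2dbp. The helper name `ensure_default_route` is hypothetical, and using 192.168.7.254 for bo2dbp assumes both nodes share a gateway.

```shell
# Append a default route to a SUSE-style routes file unless one is already
# defined there. Usage: ensure_default_route <routes-file> <gateway-ip>
ensure_default_route() {
    routes_file=$1
    gateway=$2
    # A line whose first field is "default" already defines the gateway.
    if [ -f "$routes_file" ] && grep -Eq '^[[:space:]]*default[[:space:]]' "$routes_file"; then
        echo "default route already present in $routes_file"
        return 0
    fi
    # Same four-column layout as the line shown on bo2dbs.
    printf 'default %s - -\n' "$gateway" >> "$routes_file"
    echo "added default route via $gateway to $routes_file"
}

# Example (as root on bo2dbp; the gateway value is an assumption):
#   ensure_default_route /etc/sysconfig/network/routes 192.168.7.254
```

The file change alone does not alter the running routing table. The network service would typically need a restart afterwards, or the route could be added to the live table by hand, before retrying crs_start on the VIP.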

oracle@bo2dbs:~> crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.bo2dbp.gsd application ONLINE ONLINE bo2dbp
ora.bo2dbp.ons application ONLINE ONLINE bo2dbp
ora.bo2dbp.vip application ONLINE ONLINE bo2dbp
ora.bo2dbs.gsd application ONLINE ONLINE bo2dbs
ora.bo2dbs.ons application ONLINE ONLINE bo2dbs
ora.bo2dbs.vip application ONLINE ONLINE bo2dbs

8. Environment
SLES 10 SP3 + Oracle 10g (10.2.0.4)

9. Summary:
a. Be careful, careful, and careful again
b. Use the relevant logs to locate the fault quickly
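
As an illustration of point b, the failure lines can be pulled straight out of the VIP resource log rather than read with tail. This is a rough sketch: the helper name `vip_start_errors` is hypothetical, and the patterns are simply the three messages seen in this case, not a complete list of what the VIP check script can print.

```shell
# Print the lines of a racg VIP log that match the failure messages seen
# in this case. Usage: vip_start_errors <path-to-vip-log>
vip_start_errors() {
    grep -E 'checkIf:|checked failed|failed to bring up VIP' "$1"
}

# Example, using the log path from step 7:
#   vip_start_errors ~/product/10.2.0/crs_1/log/bo2dbp/racg/ora.bo2dbp.vip.log
```

Running this before trying relocate or stopping CRS on the other node would have pointed at the missing default gateway right away.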
