如何正确手工启动Windows下的Oracle RAC数据库

     这是一则来自于某德国客户生产环境的RAC数据库启动出现故障的案例,记录下来一是用于对自己的警醒,二是可以同广大网友分享。

           操作系统环境:Windows Server 2008 R2 Enterprise version 6.1(Build 7601:Service Pack 1)

           数据库环境:10gR2 10.2.0.5.0的64位RAC双节点数据库;

           上周末,在顺利地对RAC数据库的几张分区表做调整之后,正常关闭RAC数据库,并重新启动2台Windows 2008 R2的操作系统之后,尝试启动Oracle CRS时,发现报错:

           1 在任何一个节点的服务项里,启动OracleCRService服务时,报错,其中OracleCSService的状态一直停留在Starting状态,其它服务项无任何变化;

           2 重启Windows服务器后,使用$CRS_HOME\bin\crsctl start crs在命令行尝试启动CRS时,依然报错;

           3 接下去,开始检查CRS的错误日志:在C:\oracle\product\10.2.0\crs\log\dehamora002\crsd\crsd.log日志文件中看到下述报错信息:

             

view source
print ?
1 2012-12-08 11:52:52.606: [  OCRMAS][3876]th_master:13: I AM THE NEW OCR MASTER at incar 2. Node Number 2
2 2012-12-08 11:52:52.606: [  OCROSD][3876]utgdv:11:could not read reg value ocrmirrorconfig_loc os error= The system could not find the environment option that was entered.
3   
4 2012-12-08 11:52:52.621: [  OCROSD][3876]utgdv:11:could not read reg value ocrmirrorconfig_loc os error= The system could not find the environment option that was entered.
5   
6 2012-12-08 11:52:52.637: [  OCRRAW][3876]proprioo: for disk 0 (\\.\ocrcfg), id match (1), my id set (1381592635,1028247821) total id sets (1), 1st set (1381592635,1028247821), 2nd set (0,0) my votes (2), total votes (2)
7 2012-12-08 11:52:52.715: [  OCRMAS][3876]th_master: Deleted ver keys from cache (master)

从上可以看出,问题应该是出现在服务器访问共享存储时出现的。果然,在远程联系德国汉堡客户IT人员检查后,发现是服务器同存储间出现了问题,协调并解决该错误。

 

           4 再次重启Windows,并尝试启动CRS时,C:\oracle\product\10.2.0\crs\log\dehamora002\cssd\cssdOUT.log日志文件中看到下述报错信息:

view source
print ?
01 Oracle Database 10g CSS Release 10.2.0.5.0 Production Copyright 1996, 2004, Oracle.  All rights reserved.
02 12/08/12 12:02:06  ssmain_run_css:  launching boot check 1 with c:\oracle\product\10.2.0\crs\bin\crsctl.exe check boot
03 OCR initialization failed accessing OCR device: PROC-26: Error while accessing the physical storage Operating System error [The system cannot find the file specified.
04   
05 ] [2]
06 12/08/12 12:02:06  ssmain_run_css:  boot check returned 8, looping
07 12/08/12 12:02:07  ssmain_run_css:  launching boot check 2 with c:\oracle\product\10.2.0\crs\bin\crsctl.exe check boot
08 OCR initialization failed accessing OCR device: PROC-26: Error while accessing the physical storage Operating System error [The system cannot find the file specified.
09   
10 ] [2]
11 12/08/12 12:02:07  ssmain_run_css:  boot check returned 8, looping
12 12/08/12 12:02:08  ssmain_run_css:  launching boot check 3 with c:\oracle\product\10.2.0\crs\bin\crsctl.exe check boot
13 OCR initialization failed accessing OCR device: PROC-26: Error while accessing the physical storage Operating System error [The system cannot find the file specified.
14   
15 ] [2]

           通过查询Metalink:

Can not Start CRS on Windows Cluster [ID 1115153.1]

How to Start (or stop) 10gR2 or 11gR1 Oracle Clusterware Services Manually in Windows [ID 729512.1]

OracleCSService does not start – PROC-26 error possible [ID 305093.1]

           找到产生问题的原因:原来这套RAC环境下的所有Oracle服务都是手工启动的方式,正常情况下,手工启动OracleCRService服务时,会自动启动依赖的相关服务。而该环境下,oracle并没有如我们期待的那样去启动与OracleCRService相关的服务。

           准确定位到原因后,解决问题的办法其实也很简单,就是如Metalink文档上说明的方案,手工依次启动.  OracleObjectService OracleClusterVolumeService OracleCSServiceOracleEVMService、OracleCRService  很快,RAC数据库重新正常启动!

        启示:

      1 对于Windows环境下的RAC,最好是将OracleObjectService的启动类型置为自动启动;

        2 如果上述服务是手工启动的,那么正确手工启动Windows下的Oracle RAC数据库的顺序依次是:OracleObjectService OracleClusterVolumeService(if using OCFS) OracleCSServiceOracleEVMServiceOracleCRService      

你可能感兴趣的:(oracle)