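I. Check the shared devices
Normally, the OCFS2 file systems or raw devices that hold the OCR and the voting disk come up automatically at boot. If they are not available, RAC cannot start.
1.1 If you use OCFS2
Check the OCFS2 status: /etc/init.d/o2cb status
Before the file system is mounted, /etc/init.d/o2cb status reports "Checking O2CB heartbeat: Not active". Before formatting and mounting the file system, verify that O2CB is online on both nodes; the heartbeat is inactive at this point only because no OCFS2 file system has been mounted yet. Once the file system is mounted, the heartbeat becomes active. To mount:
mount -t ocfs2 -o datavolume /dev/sdb1 /u02/oradata/orcl
1.2 If you use raw devices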
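The heartbeat check can be scripted so that a startup script refuses to continue while O2CB is still inactive. The following is a minimal sketch, not an official tool: the function name o2cb_heartbeat_active is hypothetical, and it assumes the status output contains a "Checking O2CB heartbeat:" line worded as quoted above.

```shell
#!/bin/sh
# Succeeds (exit 0) when the o2cb status text on stdin reports an active
# heartbeat, and fails (exit 1) otherwise.
# Assumption: the line format "Checking O2CB heartbeat: ..." matches the
# output quoted in this article; other o2cb versions may word it differently.
o2cb_heartbeat_active() {
    awk '
        /Checking O2CB heartbeat:/ {
            seen = 1
            # "Not active" also contains "active", so test the negative first.
            if ($0 ~ /Not active/)      active = 0
            else if ($0 ~ /[Aa]ctive/)  active = 1
        }
        END { exit (seen && active) ? 0 : 1 }
    '
}

# Typical use on each node (needs the o2cb init script to be installed):
#   /etc/init.d/o2cb status | o2cb_heartbeat_active && echo "heartbeat active"
```

The function only reads text from standard input, so it can be tried against saved status output before it is wired into anything that runs on a cluster node.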
[oracle@node1 raw]$ cd /dev/raw/
[oracle@node1 raw]$ ls -l
total 0
crw-r----- 1 oracle oinstall 162, 1 Jun 14 22:20 raw1
crw-r----- 1 oracle oinstall 162, 2 Jun 14 23:56 raw2
crw-r----- 1 oracle oinstall 162, 3 Jun 14 23:56 raw3
crw-r----- 1 oracle oinstall 162, 4 Jun 14 22:20 raw4
[oracle@node1 raw]$
Or:
[root@raw1 init.d]# /etc/init.d/rawdevices status
/dev/raw/raw1: bound to major 8, minor 17
/dev/raw/raw2: bound to major 8, minor 18
1.3 Check ASM
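In the listing above every raw device is owned by oracle:oinstall. As a rough sketch, the helper below reads ls -l output and prints any character device whose owner or group differs. The function name raw_wrong_owner is hypothetical, and oracle/oinstall are defaults taken from the sample listing, not a requirement stated by this article.

```shell
#!/bin/sh
# Reads `ls -l` output on stdin and prints the name of every character
# device (lines starting with "c") whose owner or group is not the
# expected one. Every character device in the listing is checked,
# including a control node if one lives in the same directory.
# Assumption: expected owner/group default to the values in the sample.
raw_wrong_owner() {
    want_owner=${1:-oracle}
    want_group=${2:-oinstall}
    awk -v o="$want_owner" -v g="$want_group" '
        /^c/ && ($3 != o || $4 != g) { print $NF }
    '
}

# Typical use:
#   ls -l /dev/raw | raw_wrong_owner
#   ls -l /dev/raw | raw_wrong_owner oracle dba
```

An empty result means every listed device matches the expected owner and group.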
[oracle@node1 ~]$ /etc/init.d/oracleasm listdisks
VOL1
VOL2
[oracle@node1 ~]$
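To confirm that every ASM disk you expect is actually visible, the listdisks output can be compared against a list of names. This is a small sketch; missing_asm_disks is a hypothetical helper, and VOL1/VOL2 are simply the labels from the output above.

```shell
#!/bin/sh
# Reads `oracleasm listdisks` output on stdin (one disk label per line)
# and prints each expected label, given as an argument, that is absent.
missing_asm_disks() {
    listed=$(cat)
    for want in "$@"; do
        # -x: whole-line match, -F: treat the label as a fixed string.
        printf '%s\n' "$listed" | grep -qxF -- "$want" || printf '%s\n' "$want"
    done
}

# Typical use:
#   /etc/init.d/oracleasm listdisks | missing_asm_disks VOL1 VOL2
```

No output means all expected disks were listed.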
II. Automatic RAC startup and checking the related processes
When RAC starts, CRS and its related daemons are started automatically by init scripts:
[oracle@node1 ~]$ ls -l /etc/init.d/init.*
-r-xr-xr-x 1 root root  1951 Mar 24 05:30 /etc/init.d/init.crs
-r-xr-xr-x 1 root root  4719 Mar 24 05:30 /etc/init.d/init.crsd
-r-xr-xr-x 1 root root 35399 Mar 24 05:30 /etc/init.d/init.cssd
-r-xr-xr-x 1 root root  3195 Mar 24 05:30 /etc/init.d/init.evmd
[oracle@node1 ~]$
Check the CRS status. Under normal conditions every resource is ONLINE:
[root@raw1 bin]# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.raw.db     application    ONLINE    ONLINE    raw1
ora.raw.raw.cs application    ONLINE    ONLINE    raw1
ora....aw1.srv application    ONLINE    ONLINE    raw1
ora....aw2.srv application    ONLINE    ONLINE    raw2
ora....w1.inst application    ONLINE    ONLINE    raw1
ora....w2.inst application    ONLINE    ONLINE    raw2
ora....SM1.asm application    ONLINE    ONLINE    raw1
ora....W1.lsnr application    ONLINE    ONLINE    raw1
ora.raw1.gsd   application    ONLINE    ONLINE    raw1
ora.raw1.ons   application    ONLINE    ONLINE    raw1
ora.raw1.vip   application    ONLINE    ONLINE    raw1
ora....SM2.asm application    ONLINE    ONLINE    raw2
ora....W2.lsnr application    ONLINE    ONLINE    raw2
ora.raw2.gsd   application    ONLINE    ONLINE    raw2
ora.raw2.ons   application    ONLINE    ONLINE    raw2
ora.raw2.vip   application    ONLINE    ONLINE    raw2
If some resources show the UNKNOWN state instead, as below:
[root@rac2 bin]# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.rac.db     application    ONLINE    UNKNOWN   rac1
ora....orcl.cs application    ONLINE    UNKNOWN   rac1
ora....ac1.srv application    OFFLINE   OFFLINE
ora....ac2.srv application    OFFLINE   OFFLINE
ora....c1.inst application    ONLINE    UNKNOWN   rac1
ora....c2.inst application    ONLINE    UNKNOWN   rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    UNKNOWN   rac1
ora.rac1.gsd   application    ONLINE    UNKNOWN   rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    UNKNOWN   rac2
ora.rac2.gsd   application    ONLINE    UNKNOWN   rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2
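With many resources it is easy to overlook a single bad row. As a sketch, the filter below prints only the rows of crs_stat -t output whose State column is not ONLINE. The function name crs_not_online is hypothetical, and it assumes the five-column layout shown above with "application" in the Type column.

```shell
#!/bin/sh
# Reads `crs_stat -t` output on stdin and prints "<name> <state>" for
# every resource row whose State column is not ONLINE.
# Header and separator lines are skipped because their second field is
# not "application" (an assumption based on the sample output).
# Note: crs_stat -t abbreviates long names (ora....c1.inst), so the names
# printed here are for reading, not for passing to crs_start/crs_stop.
crs_not_online() {
    awk '$2 == "application" && $4 != "ONLINE" { print $1, $4 }'
}

# Typical use:
#   ./crs_stat -t | crs_not_online
```

An empty result corresponds to the healthy case in which every resource is ONLINE.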
Solution:
1. Use crs_stat without options to see the full details of every resource:
[root@rac2 bin]# ./crs_stat
NAME=ora.rac.db
TYPE=application
TARGET=ONLINE
STATE=ONLINE on rac2
NAME=ora.rac1.LISTENER_RAC1.lsnr
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on rac1
NAME=ora.rac1.gsd
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on rac1
NAME=ora.rac2.LISTENER_RAC2.lsnr
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on rac2
... ...
2. A resource that is OFFLINE can simply be started by hand:
[root@rac2 bin]# ./crs_start ora.rac.orcl.rac1.srv
Attempting to start `ora.rac.orcl.rac1.srv` on member `rac1`
Start of `ora.rac.orcl.rac1.srv` on member `rac1` succeeded.
3. A resource that is UNKNOWN should be stopped first and then started:
[root@rac2 bin]# ./crs_stop ora.rac2.gsd
Attempting to stop `ora.rac2.gsd` on member `rac2`
Stop of `ora.rac2.gsd` on member `rac2` succeeded.
[root@rac2 bin]# ./crs_start ora.rac2.gsd
Attempting to start `ora.rac2.gsd` on member `rac2`
Start of `ora.rac2.gsd` on member `rac2` succeeded.
4. If crs_stop cannot stop a resource, or crs_start cannot start it, there are two ways to deal with it:
4.1) Use crs_stop -f to force-stop the resources whose state in CRS is UNKNOWN, then use crs_start -f (note the added -f option) to start all of the services. Do this on each of the two nodes:
[oracle@rac2 ~]$ crs_start -f ora.ora9i.ora9i2.inst
Attempting to start `ora.ora9i.ora9i2.inst` on member `rac2`
Start of `ora.ora9i.ora9i2.inst` on member `rac2` succeeded.
[oracle@rac2 ~]$ crs_stop -f ora.ora9i.db
Attempting to stop `ora.ora9i.db` on member `rac2`
Stop of `ora.ora9i.db` on member `rac2` succeeded.
4.2) Switch to the root user, stop CRS with /etc/init.d/init.crs stop, and then start it again with /etc/init.d/init.crs start. Once CRS is up, it starts its resources automatically. This method also has to be carried out on both nodes.
5. All resources can be stopped and started with a single command:
[root@rac2 bin]# ./crs_stop -all
[root@rac2 bin]# ./crs_start -all
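Steps 1 and 3 can be combined: the long crs_stat output carries the full resource names, which is what crs_stop and crs_start expect. The sketch below extracts the names in UNKNOWN state and prints the stop/start commands for review instead of running them. Both function names are hypothetical, and the parser assumes the NAME=/STATE= line format shown in step 1.

```shell
#!/bin/sh
# Reads long-format `crs_stat` output on stdin and prints the full name
# of every resource whose STATE line begins with UNKNOWN.
crs_unknown_resources() {
    awk -F= '
        $1 == "NAME"                     { name = $2 }
        $1 == "STATE" && $2 ~ /^UNKNOWN/ { print name }
    '
}

# Reads resource names on stdin and prints, for each one, the
# stop-then-start pair from step 3. Nothing is executed here; the
# output is meant to be reviewed first.
crs_restart_commands() {
    while IFS= read -r res; do
        [ -n "$res" ] || continue
        printf 'crs_stop %s\ncrs_start %s\n' "$res" "$res"
    done
}

# Typical use (from the CRS bin directory, as root):
#   ./crs_stat | crs_unknown_resources | crs_restart_commands
```

Printing the commands rather than executing them is a deliberate choice here: resources such as instances and listeners depend on each other, so the order in which they are restarted is worth checking by eye.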
III. Starting RAC manually
Normally all services start automatically whenever a node boots. To stop or start a node by hand, proceed as follows.
Stop RAC:
emctl stop dbconsole    (the EM console; run this only if EM is installed)
srvctl stop instance -d <dbname> -i <instance_name1>
srvctl stop instance -d <dbname> -i <instance_name2>
srvctl stop asm -n <node_name1>
srvctl stop asm -n <node_name2>
srvctl stop nodeapps -n <node_name1>
srvctl stop nodeapps -n <node_name2>
Start RAC (the same steps in exactly the reverse order):
srvctl start nodeapps -n <node_name1>
srvctl start nodeapps -n <node_name2>
srvctl start asm -n <node_name1>
srvctl start asm -n <node_name2>
srvctl start instance -d <dbname> -i <instance_name2>
srvctl start instance -d <dbname> -i <instance_name1>
emctl start dbconsole
Use SRVCTL to start or stop all instances together with their enabled services:
srvctl start database -d <dbname>
srvctl stop database -d <dbname>
Note: CRS resources include GSD (Global Services Daemon), ONS (Oracle Notification Service), VIP, Database, Instance, and Service. They fall into two classes: GSD, ONS, VIP, and Listener belong to the Nodeapps class; Database, Instance, and Service belong to the Database-Related Resource class.
Example:
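The ordering rule above (instances, then ASM, then nodeapps when stopping; the reverse when starting) can be captured in a small generator that prints the srvctl commands for any number of nodes. This is only a sketch: rac_stop_plan and rac_start_plan are hypothetical names, the instance:node pair syntax is an invention for this example, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Usage: rac_stop_plan <dbname> <instance>:<node> [<instance>:<node> ...]
# Prints the shutdown commands in dependency order:
# database instances first, then ASM, then nodeapps.
rac_stop_plan() {
    db=$1; shift
    for pair in "$@"; do
        printf 'srvctl stop instance -d %s -i %s\n' "$db" "${pair%%:*}"
    done
    for pair in "$@"; do
        printf 'srvctl stop asm -n %s\n' "${pair#*:}"
    done
    for pair in "$@"; do
        printf 'srvctl stop nodeapps -n %s\n' "${pair#*:}"
    done
}

# Same arguments; prints the startup commands in the opposite layer
# order: nodeapps, then ASM, then the database instances.
rac_start_plan() {
    db=$1; shift
    for pair in "$@"; do
        printf 'srvctl start nodeapps -n %s\n' "${pair#*:}"
    done
    for pair in "$@"; do
        printf 'srvctl start asm -n %s\n' "${pair#*:}"
    done
    for pair in "$@"; do
        printf 'srvctl start instance -d %s -i %s\n' "$db" "${pair%%:*}"
    done
}

# Example: rac_stop_plan RACDB RACDB1:node1 RACDB2:node2
```

Once the printed plan looks right, it can be piped to sh as the oracle user; the EM console commands are left out because they depend on whether EM is installed.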
[root@raw1 bin]# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.raw.db     application    ONLINE    ONLINE    raw1
ora.raw.raw.cs application    ONLINE    ONLINE    raw1
ora....aw1.srv application    ONLINE    ONLINE    raw1
ora....aw2.srv application    ONLINE    ONLINE    raw2
ora....w1.inst application    ONLINE    ONLINE    raw1
ora....w2.inst application    ONLINE    ONLINE    raw2
ora....SM1.asm application    ONLINE    ONLINE    raw1
ora....W1.lsnr application    ONLINE    ONLINE    raw1
ora.raw1.gsd   application    ONLINE    ONLINE    raw1
ora.raw1.ons   application    ONLINE    ONLINE    raw1
ora.raw1.vip   application    ONLINE    ONLINE    raw1
ora....SM2.asm application    ONLINE    ONLINE    raw2
ora....W2.lsnr application    ONLINE    ONLINE    raw2
ora.raw2.gsd   application    ONLINE    ONLINE    raw2
ora.raw2.ons   application    ONLINE    ONLINE    raw2
ora.raw2.vip   application    ONLINE    ONLINE    raw2
-- Shut down
[oracle@node1 bin]$ ./srvctl stop instance -d RACDB -i RACDB1
[oracle@node1 bin]$ ./srvctl stop instance -d RACDB -i RACDB2
[oracle@node1 bin]$ ./srvctl stop asm -n node1
[oracle@node1 bin]$ ./srvctl stop asm -n node2
[oracle@node1 bin]$ ./srvctl stop nodeapps -n node1
[oracle@node1 bin]$ ./srvctl stop nodeapps -n node2
[oracle@node1 bin]$ ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.inst application    OFFLINE   OFFLINE
ora....B2.inst application    OFFLINE   OFFLINE
ora.RACDB.db   application    OFFLINE   OFFLINE
ora....SM1.asm application    OFFLINE   OFFLINE
ora....E1.lsnr application    OFFLINE   OFFLINE
ora.node1.gsd  application    OFFLINE   OFFLINE
ora.node1.ons  application    OFFLINE   OFFLINE
ora.node1.vip  application    OFFLINE   OFFLINE
ora....SM2.asm application    OFFLINE   OFFLINE
ora....E2.lsnr application    OFFLINE   OFFLINE
ora.node2.gsd  application    OFFLINE   OFFLINE
ora.node2.ons  application    OFFLINE   OFFLINE
ora.node2.vip  application    OFFLINE   OFFLINE
[oracle@node1 bin]$
IV. During startup it is best to keep an eye on the CRS, ASM, and database logs:
CRS log:
[oracle@node1 node1]$ tail -f /opt/ora10g/product/10.2.0/crs_1/log/node1/alertnode1.log
ASM log:
[oracle@rac1 ~]$ tail -f /opt/ora10g/admin/+ASM/bdump/alert_+ASM1.log
Database log:
[oracle@rac1 ~]$ tail -f /opt/ora10g/admin/RACDB/bdump/alert_RACDB1.log
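Rather than opening three terminals, the three alert logs can be followed with one tail command. The helper below is a sketch that builds the paths from the layout used in this article; rac_alert_logs is a hypothetical name, and the directory structure is an assumption taken from the example paths above, so adjust it to your own installation.

```shell
#!/bin/sh
# Usage: rac_alert_logs <crs_home> <oracle_base> <node> <dbname> <instance_no>
# Prints the CRS, ASM, and database alert log paths, one per line,
# following the directory layout of the examples in this article.
rac_alert_logs() {
    crs_home=$1
    oracle_base=$2
    node=$3
    db=$4
    n=$5
    printf '%s/log/%s/alert%s.log\n' "$crs_home" "$node" "$node"
    printf '%s/admin/+ASM/bdump/alert_+ASM%s.log\n' "$oracle_base" "$n"
    printf '%s/admin/%s/bdump/alert_%s%s.log\n' "$oracle_base" "$db" "$db" "$n"
}

# Typical use on the first node (paths contain no spaces, so the
# unquoted command substitution splits into three file arguments):
#   tail -f $(rac_alert_logs /opt/ora10g/product/10.2.0/crs_1 /opt/ora10g node1 RACDB 1)
```

When tail follows several files it prefixes each burst of output with the file name, which makes it clear which component a message came from.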