Today, after booting the virtual machines, one of the RAC instances suddenly failed to start:
[oracle@rac2 bdump]$ crs_stat -t
HA Resource Target State
--------------------------------------------------------------------------
[oracle@rac2 bdump]$ crs_stat
HA Resource Target State
--------------------------------------------------------------------------
ora.rac1.ASM1.asm ONLINE ONLINE on rac1
ora.rac1.LISTENER_RAC1.lsnr ONLINE ONLINE on rac1
ora.rac1.gsd ONLINE ONLINE on rac1
ora.rac1.ons ONLINE ONLINE on rac1
ora.rac1.vip ONLINE ONLINE on rac1
ora.rac2.ASM2.asm ONLINE ONLINE on rac2
ora.rac2.LISTENER_RAC2.lsnr ONLINE ONLINE on rac2
ora.rac2.gsd ONLINE ONLINE on rac2
ora.rac2.ons ONLINE ONLINE on rac2
ora.rac2.vip ONLINE ONLINE on rac2
ora.racdb.db ONLINE ONLINE on rac2
ora.racdb.hchservice1.cs ONLINE ONLINE on rac1
ora.racdb.hchservice1.racdb2.srv ONLINE ONLINE on rac1
ora.racdb.racdb1.inst ONLINE ONLINE on rac1
ora.racdb.racdb2.inst OFFLINE OFFLINE
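With a longer resource list, the OFFLINE entry is easy to miss. A minimal sketch for filtering it out, assuming the three-column layout shown above (resource name, target, state); `offline_resources` is a hypothetical helper name, not an Oracle tool:

```shell
# Print the names of cluster resources whose state column reads OFFLINE.
# Assumes the layout shown above: <name> <target> <state> [on <node>].
offline_resources() {
    # Only lines starting with "ora." are resources; this skips the header and ruler.
    awk '$1 ~ /^ora\./ && $3 == "OFFLINE" { print $1 }'
}
```

Fed the listing above on standard input (for example `crs_stat | offline_resources` on this system), it prints only ora.racdb.racdb2.inst.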
My first reaction was to start the instance manually:
SQL> startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/racdb/spfileracdb.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA/racdb/spfileracdb.ora
ORA-15077: could not locate ASM instance serving a required diskgroup
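This shows that startup already fails while reading the spfile, which is stored in ASM under +DATA.
My guess was that ASM was the problem, so I checked the ASM instance on rac2: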
[oracle@rac2 bdump]$ export ORACLE_SID=+ASM2
[oracle@rac2 bdump]$ sqlplus / as sysdba;
SQL*Plus: Release 10.2.0.5.0 - Production on Thu Nov 22 15:12:18 2012
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
SQL> set line 200;
SQL> select name,state from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
DATA1 MOUNTED
DATA2 MOUNTED
That explains it: the DATA disk group has not been mounted on rac2. On rac1, which started normally, all three disk groups are mounted:
[oracle@rac1 raw]$ export ORACLE_SID=+ASM1
[oracle@rac1 raw]$ sqlplus / as sysdba;
SQL*Plus: Release 10.2.0.5.0 - Production on Thu Nov 22 15:14:09 2012
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
SQL> select name ,state from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
DATA1 MOUNTED
DATA2 MOUNTED
DATA MOUNTED
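The comparison between the two nodes can also be done mechanically. A rough sketch, assuming the output of the query above has been saved to one text file per node; `missing_diskgroups` and the file names are hypothetical:

```shell
# Usage: missing_diskgroups <healthy-node-listing> <problem-node-listing>
# Prints disk groups that are MOUNTED in the first listing but not in the second.
missing_diskgroups() {
    awk -v other="$2" '
        BEGIN {
            # Collect the disk groups that are MOUNTED on the problem node.
            while ((getline line < other) > 0) {
                split(line, f, " ")
                if (f[2] == "MOUNTED") seen[f[1]] = 1
            }
        }
        # Header and ruler lines never have MOUNTED in column 2, so they are skipped.
        $2 == "MOUNTED" && !($1 in seen) { print $1 }
    ' "$1"
}
```

With the two listings above saved as, say, rac1.txt and rac2.txt, `missing_diskgroups rac1.txt rac2.txt` prints DATA.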
Restart the ASM instance on rac2 manually to see what error it reports:
SQL> shutdown immediate;
ASM diskgroups dismounted
ASM instance shutdown
SQL> startup
ASM instance started
Total System Global Area 130023424 bytes
Fixed Size 2094544 bytes
Variable Size 102763056 bytes
ASM Cache 25165824 bytes
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
This shows that mounting the DATA disk group failed. Check the corresponding ASM alert log:
NOTE: cache dismounting group 3/0x79818E72 (DATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATA was not mounted
NOTE: cache opening disk 0 of grp 1: DATA1_0000 path:/dev/raw/raw8
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 1/0x5A318E70 (DATA1)
Thu Nov 22 15:15:17 CST 2012
Comparing this with the alert log on rac1 shows that the raw device /dev/raw/raw6 has not been bound on rac2:
[oracle@rac2 raw]$ ls
raw1 raw2 raw3 raw4 raw5 raw7 raw8
The listing confirms that raw6 is missing, even though it is configured in /etc/sysconfig/rawdevices:
[root@rac2 /]# cat /etc/sysconfig/rawdevices
# raw device bindings
# format: <rawdev> <major> <minor>
# <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
# /dev/raw/raw2 8 5
/dev/raw/raw1 /dev/sde1
/dev/raw/raw2 /dev/sdf1
/dev/raw/raw3 /dev/sdg1
/dev/raw/raw4 /dev/sdh1
/dev/raw/raw5 /dev/sdi1
/dev/raw/raw6 /dev/sdb1
/dev/raw/raw7 /dev/sdc1
/dev/raw/raw8 /dev/sdd1
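The manual comparison between the configuration file and the /dev/raw listing can be scripted. A minimal sketch, assuming the two-column `<rawdev> <blockdev>` format used above; `missing_raw_devices` is a hypothetical helper:

```shell
# Usage: missing_raw_devices [config-file]
# Prints every raw device named in the config file that has no device node.
missing_raw_devices() {
    config=${1:-/etc/sysconfig/rawdevices}
    # Skip comment and blank lines; the first field is the raw device path.
    awk '!/^[[:space:]]*#/ && NF >= 2 { print $1 }' "$config" |
    while read -r dev; do
        [ -e "$dev" ] || echo "$dev"
    done
}
```

On rac2 in the state shown above, this should report only /dev/raw/raw6.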
Check the resource status:
Re-apply the raw device configuration manually:
[root@rac2 /]# service rawdevices restart
Assigning devices:
/dev/raw/raw1 --> /dev/sde1
Error setting raw device (Device or resource busy)
/dev/raw/raw2 --> /dev/sdf1
Error setting raw device (Device or resource busy)
/dev/raw/raw3 --> /dev/sdg1
Error setting raw device (Device or resource busy)
/dev/raw/raw4 --> /dev/sdh1
Error setting raw device (Device or resource busy)
/dev/raw/raw5 --> /dev/sdi1
Error setting raw device (Device or resource busy)
/dev/raw/raw6 --> /dev/sdb1
/dev/raw/raw6: bound to major 8, minor 17
/dev/raw/raw7 --> /dev/sdc1
Error setting raw device (Device or resource busy)
/dev/raw/raw8 --> /dev/sdd1
Error setting raw device (Device or resource busy)
done
[root@rac2 /]# cd /dev/raw
[root@rac2 raw]# ls
raw1 raw2 raw3 raw4 raw5 raw6 raw7 raw8
As the listing shows, raw6 is now present.
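In the restart output, the "Device or resource busy" lines most likely belong to devices that were already bound and in use, so only the "bound to major" line marks a new binding. Major 8, minor 17 is the device number of /dev/sdb1, which matches the raw6 entry in the configuration. A small sketch for pulling the new bindings out of such output; `newly_bound_devices` is a hypothetical helper and assumes the message format shown above:

```shell
# Read `service rawdevices restart` output on stdin and print the raw devices
# that were actually bound during this run.
newly_bound_devices() {
    awk '/: *bound to major/ {
        sub(/:$/, "", $1)   # strip the trailing colon from "/dev/raw/rawN:"
        print $1
    }'
}
```

For the output above, it prints /dev/raw/raw6.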
Log in to the ASM instance on rac2 and mount the disk groups:
SQL> select name ,state from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
DATA1 DISMOUNTED
DATA2 DISMOUNTED
DATA DISMOUNTED
SQL> ALTER DISKGROUP ALL MOUNT
2 ;
Diskgroup altered.
SQL> select name ,state from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
DATA1 MOUNTED
DATA2 MOUNTED
DATA MOUNTED