Oracle disk group lost: recovering an Oracle 11g RAC after all ASM disks are lost (Part 1)

1. Environment

(1) 11.2.0.3 RAC on Oracle Linux 6 x86_64, with a single ASM disk group, DATA, using external redundancy.

(2) The OCR, voting disk, datafiles, control files, and SPFILE all reside in this disk group.

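2. Failure description

(1) A storage failure caused the ASM disks to be lost.

(2) Because the OCR and voting disk were lost, all Clusterware services stopped; only OHAS remained online.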

3. Available backups

(1) RMAN backups: controlfile, database, spfile, and archivelog.

(2) OCR backups: no manual backup was ever taken, but the automatic CRS backup files exist under $CRS_HOME/cdata.

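4. Procedure

Note: the plan is to restore the OCR from the automatic CRS backup files and to restore the database from the RMAN backups. While recovering the data, the ASM disk groups are also reorganized so that the OCR and voting disk are stored separately from the database files.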

4.1 Restoring the OCR and voting disk

(1) Stop CRS on all RAC nodes.

[root@rac1 ~]# crsctl stop has -f

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'

CRS-2673: Attempting to stop 'ora.crf' on 'rac1'

CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'

CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'

CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed

CRS-4133: Oracle High Availability Services has been stopped.

[root@rac2 ~]# crsctl stop has -f

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'

CRS-2673: Attempting to stop 'ora.crf' on 'rac2'

CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded

CRS-2677: Stop of 'ora.crf' on 'rac2' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'

CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'

CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed

CRS-4133: Oracle High Availability Services has been stopped.

(2) On one node, start CRS in exclusive mode without CRSD (-excl -nocrs). This also starts the ASM instance.

[root@rac1 ~]# crsctl start crs -excl -nocrs

CRS-4123: Oracle High Availability Services has been started.

CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'

CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'

CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'

CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'

CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded

CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.cssd' on 'rac1'

CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'

CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded

CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded

CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'

CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded

CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.asm' on 'rac1'

CRS-2676: Start of 'ora.asm' on 'rac1' succeeded

(3) Three new disks have been added and already bound with UDEV. Check the disk status.

[root@rac1 ~]# su - grid

[grid@rac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 5 17:41:49 2013

Copyright (c) 1982, 2011, Oracle. All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

SQL> select group_number group#, disk_number disk#, OS_MB, state, path, header_status from v$asm_disk order by 1,2;

    GROUP#      DISK#      OS_MB STATE      PATH                 HEADER_STATUS
---------- ---------- ---------- ---------- -------------------- ----------------------
         0          0       1024 NORMAL     /dev/asm-diskc       CANDIDATE
         0          1       5120 NORMAL     /dev/asm-diskd       CANDIDATE
         0          2      20480 NORMAL     /dev/asm-diskb       CANDIDATE

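(4) Create three disk groups. SYSTEMDG is for CRS and holds the OCR, the voting disk, and the ASM instance SPFILE. The other two are for the Oracle database: DATADG holds the datafiles, control files, redo logs, and SPFILE; ARCLOGDG holds the archived logs.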

SQL> create diskgroup SYSTEMDG external redundancy

2 disk '/dev/asm-diskc'

3 ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.

SQL> create diskgroup DATADG external redundancy

2 disk '/dev/asm-diskb'

3 ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.

SQL> create diskgroup ARCLOGDG external redundancy

2 disk '/dev/asm-diskd'

3 ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.

(5) Prepare to restore the OCR and voting disk. /etc/oracle/ocr.loc records the OCR location; change the value of ocrconfig_loc so that the OCR is restored into the new disk group.

[root@rac1 ~]# more /etc/oracle/ocr.loc

ocrconfig_loc=+DATA

local_only=FALSE

[root@rac1 ~]# vi /etc/oracle/ocr.loc

ocrconfig_loc=+SYSTEMDG

local_only=FALSE
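
The manual edit of ocr.loc can also be scripted. The following is only a sketch: the function name set_ocr_location is made up for illustration, it is not an Oracle tool, and it takes the file path as an argument so that it can be tried on a copy before touching /etc/oracle/ocr.loc. It keeps the original as a .bak file and rewrites only the ocrconfig_loc line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Grid Infrastructure): point ocrconfig_loc
# at a new disk group and keep a backup copy of the original file.
set_ocr_location() {
    local file="$1" new_loc="$2"
    # Refuse to continue if there is no ocrconfig_loc entry to rewrite.
    grep -q '^ocrconfig_loc=' "$file" || return 1
    # Keep the original as <file>.bak.
    cp -p "$file" "$file.bak" || return 1
    # Rewrite only the ocrconfig_loc line; other keys such as local_only are kept.
    sed "s|^ocrconfig_loc=.*|ocrconfig_loc=${new_loc}|" "$file.bak" > "$file"
}

# Example (as root, on the node being recovered):
#   set_ocr_location /etc/oracle/ocr.loc +SYSTEMDG
```

Writing the result back into the existing file, rather than replacing it, leaves its ownership and permissions as they were.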

(6) Restore the OCR.

[root@rac1 ~]# ocrconfig -showbackup

PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy

rac1 2013/07/05 12:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr

rac1 2013/07/05 08:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup01.ocr

rac1 2013/07/05 04:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup02.ocr

rac1 2013/07/05 00:29:59 /u01/app/11.2.0/grid/cdata/rac-cluster/day.ocr

rac1 2013/07/05 00:29:59 /u01/app/11.2.0/grid/cdata/rac-cluster/week.ocr

PROT-25: Manual backups for the Oracle Cluster Registry are not available

[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr

[root@rac1 ~]#
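
Here backup00.ocr was chosen by eye as the most recent automatic backup. As a sketch, the same choice could be made by a small filter over the `ocrconfig -showbackup` output. The function name latest_ocr_backup is hypothetical, and the parsing assumes the line layout shown in the listing above (node, date, time, path) with no spaces in the path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ocrconfig -showbackup` output on stdin and print
# the path of the most recent backup file.
latest_ocr_backup() {
    # Keep only lines shaped like: <node> <YYYY/MM/DD> <HH:MM:SS> <path>.ocr
    # then sort by date and time (zero-padded, so a text sort is enough)
    # and print the path from the last, i.e. newest, line.
    awk '$2 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ && $4 ~ /\.ocr$/ { print $2, $3, $4 }' |
        LC_ALL=C sort -k1,1 -k2,2 |
        tail -n 1 |
        awk '{ print $3 }'
}

# Example (as root):
#   ocrconfig -restore "$(ocrconfig -showbackup | latest_ocr_backup)"
```

The newest backup is not automatically the right one; if the cluster configuration was already damaged when it was taken, an older file from the list may be the better choice.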

[root@rac1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows :

Version : 3

Total space (kbytes) : 262120

Used space (kbytes) : 2840

Available space (kbytes) : 259280

ID : 59415097

Device/File Name : +SYSTEMDG

Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

(7) Create the voting disk.

[root@rac1 ~]# crsctl replace votedisk +SYSTEMDG

CRS-4602: Failed 27 to add voting file afb0ca0f35684f1abfd43d5ec2dc1123.

Failed to replace voting disk group with +SYSTEMDG.

CRS-4000: Command Replace failed, or completed with errors.

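The error above occurs because, when the ASM disks are bound with UDEV, the default disk discovery path has to be changed to /dev/asm*. Change the ASM disk discovery string (asm_diskstring):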

[root@rac1 ~]# su - grid

[grid@rac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 5 19:03:25 2013

Copyright (c) 1982, 2011, Oracle. All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

SQL> show parameter asm_diskstring

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

asm_diskstring string

SQL>

SQL>

SQL> alter system set asm_diskstring = '/dev/asm*';

System altered.

SQL> create spfile from memory;

create spfile from memory

*

ERROR at line 1:

ORA-00349: failure obtaining block size for

'+DATA/rac-cluster/asmparameterfile/registry.253.819922365'

ORA-15001: diskgroup "DATA" does not exist or is not mounted

SQL> create spfile='+SYSTEMDG' from memory;

File created.

SQL> startup force mount;

ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance

ASM instance started

Total System Global Area 283930624 bytes

Fixed Size 2227664 bytes

Variable Size 256537136 bytes

ASM Cache 25165824 bytes

ASM diskgroups mounted

Create the voting disk again; this time it succeeds.

[root@rac1 init]# crsctl replace votedisk +SYSTEMDG

Successful addition of voting disk 8ebb7a63accb4fa8bfa7ab65df7a8c8a.

Successfully replaced voting disk group with +SYSTEMDG.

CRS-4266: Voting file(s) successfully replaced

(8) With both the OCR and the voting disk restored, restart CRS in normal mode.

[root@rac1 ~]# crsctl stop has -f

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'

CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'

CRS-2673: Attempting to stop 'ora.asm' on 'rac1'

CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'

CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'

CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'

CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed

CRS-4133: Oracle High Availability Services has been stopped.

[root@rac1 ~]# crsctl start crs

CRS-4123: Oracle High Availability Services has been started.

[root@rac1 ~]# crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4537: Cluster Ready Services is online

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online

[root@rac1 ~]#
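
`crsctl start crs` returns as soon as OHAS has been started, so the check may have to be repeated until the whole stack is up. A rough sketch of automating that check follows. The function name crs_all_online is hypothetical, and it relies only on the four message codes that appear in the `crsctl check crs` output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `crsctl check crs` output on stdin and succeed only
# if OHAS, CRS, CSS and EVM all report "is online".
crs_all_online() {
    local out code
    out=$(cat)
    # Message codes taken from the healthy output shown in this article.
    for code in CRS-4638 CRS-4537 CRS-4529 CRS-4533; do
        printf '%s\n' "$out" | grep -q "^${code}: .* is online" || return 1
    done
}

# Example (as root): poll every 10 seconds until the stack is fully online.
#   until crsctl check crs | crs_all_online; do sleep 10; done
```

In practice a timeout around the polling loop would be sensible, so that a stack that never comes up does not block a script forever.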
