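On April 25, 2016, an operator mistakenly ran the disk setup script used during database installation, which wiped the first 100 MB of every disk and left the database unusable. A full database backup had been taken on April 22, but it contained only the data files; neither the spfile nor the control file had been backed up.
The recovery approach was: restore the OCR from the backup that CRS takes automatically, build an spfile from a saved pfile, start the instance, restore the control file from the snapshot control file that the database generates automatically, and then restore the data with RMAN.
These steps combine procedures found online with what was actually done in this incident. Use them as a reference if you run into a similar situation.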
1) Stop the CRS stack on both RAC nodes
SMIRRDB1:/# crsctl stop has -f
SMIRRDB2:/# crsctl stop has -f
2) On node SMIRRDB1, start CRS in exclusive mode without CRSD (-excl -nocrs); this also starts the ASM instance
# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2673: Attempting to stop 'ora.gpnpd' on 'smirrdb1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'smirrdb1'
CRS-2673: Attempting to stop 'ora.gipcd' on 'smirrdb1'
CRS-2673: Attempting to stop 'ora.evmd' on 'smirrdb1'
CRS-2789: Cannot stop resource 'ora.cssd' as it is not running on server 'smirrdb1'
CRS-2677: Stop of 'ora.gpnpd' on 'smirrdb1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'smirrdb1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'smirrdb1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'smirrdb1' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'smirrdb1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'smirrdb1'
CRS-2676: Start of 'ora.evmd' on 'smirrdb1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'smirrdb1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'smirrdb1'
CRS-2676: Start of 'ora.gpnpd' on 'smirrdb1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'smirrdb1'
CRS-2672: Attempting to start 'ora.gipcd' on 'smirrdb1'
CRS-2676: Start of 'ora.gipcd' on 'smirrdb1' succeeded
CRS-2676: Start of 'ora.cssdmonitor' on 'smirrdb1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'smirrdb1'
CRS-2672: Attempting to start 'ora.diskmon' on 'smirrdb1'
CRS-2676: Start of 'ora.diskmon' on 'smirrdb1' succeeded
CRS-2676: Start of 'ora.cssd' on 'smirrdb1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'smirrdb1'
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'smirrdb1' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'smirrdb1'
CRS-2672: Attempting to start 'ora.ctssd' on 'smirrdb1'
CRS-2676: Start of 'ora.ctssd' on 'smirrdb1' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'smirrdb1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'smirrdb1'
CRS-2676: Start of 'ora.asm' on 'smirrdb1' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'smirrdb1'
CRS-2676: Start of 'ora.storage' on 'smirrdb1' succeeded
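The startup output is long, and the one line that matters for the next step is the success message for ora.asm. A minimal sketch of an automated check follows. The function name asm_started is made up for illustration, and the pattern assumes the CRS-2676 message format that crsctl prints.

```shell
# Hypothetical helper: reads the output of `crsctl start crs -excl -nocrs`
# on stdin and succeeds only if the CRS-2676 success message for ora.asm
# is present.
asm_started() {
    grep -q "CRS-2676: Start of 'ora.asm' on '[^']*' succeeded"
}
```

On a live node it could be used as `crsctl start crs -excl -nocrs 2>&1 | asm_started && echo "ASM is up"`.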
3) Recreate the disk groups according to the original ASM design
# su - grid
$ sqlplus / as sysasm
SQL> create diskgroup SYSDG external redundancy
2 disk '/dev/rhdisk2'
3 ATTRIBUTE 'compatible.rdbms' = '12.1', 'compatible.asm' = '12.1';
Diskgroup created.
SQL> create diskgroup FRADG external redundancy
2 disk '/dev/rhdisk3', '/dev/rhdisk4'
3 ATTRIBUTE 'compatible.rdbms' = '12.1', 'compatible.asm' = '12.1';
Diskgroup created.
SQL> create diskgroup DATADG external redundancy
2 disk '/dev/rhdisk7', '/dev/rhdisk8'
3 ATTRIBUTE 'compatible.rdbms' = '12.1', 'compatible.asm' = '12.1';
Diskgroup created.
4) Restore the OCR
# su - root
# ocrconfig -showbackup ## list the OCR backups that currently exist
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
smirrdb1 2016/04/25 08:36:34 /oracle/app/12.1.0/grid/cdata/CLR-MIRR/backup00.ocr 0
smirrdb1 2016/04/25 04:36:33 /oracle/app/12.1.0/grid/cdata/CLR-MIRR/backup01.ocr 0
smirrdb1 2016/04/25 00:36:32 /oracle/app/12.1.0/grid/cdata/CLR-MIRR/backup02.ocr 0
smirrdb1 2016/04/24 04:36:27 /oracle/app/12.1.0/grid/cdata/CLR-MIRR/day.ocr 0
smirrdb2 2016/04/12 18:35:55 /oracle/app/12.1.0/grid/cdata/CLR-MIRR/week.ocr 0
PROT-25: Manual backups for the Oracle Cluster Registry are not available
Use the most recent backup, backup00.ocr.
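Picking the newest backup by eye is easy with five entries, but it can also be scripted. The sketch below assumes each backup line has the form "node date time path number", as in the listing above. The function name newest_ocr_backup is hypothetical.

```shell
# Hypothetical helper: reads `ocrconfig -showbackup` output on stdin and
# prints the path of the most recent backup.
# Assumes backup lines look like: <node> <YYYY/MM/DD> <HH:MM:SS> <path> <n>
newest_ocr_backup() {
    # Keep only lines whose second field is a date, emit "date time path",
    # sort newest first (the date format sorts correctly as plain text),
    # then print the path of the first line.
    awk '$2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ { print $2, $3, $4 }' |
        sort -r |
        head -n 1 |
        awk '{ print $3 }'
}
```

A possible use is `ocrconfig -showbackup | newest_ocr_backup`, passing the printed path to `ocrconfig -restore`. Note that the listing also shows the node that owns each backup, so the file must be reachable from the node doing the restore.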
# ocrconfig -restore /oracle/app/12.1.0/grid/cdata/CLR-MIRR/backup00.ocr ## restore the OCR
## Verify that the OCR has been restored
# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 4
Total space (kbytes) : 409568
Used space (kbytes) : 1432
Available space (kbytes) : 408136
ID : 998319658
Device/File Name : +SYSDG
Device/File integrity check succeeded
Device/File Name : +DATADG
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
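The ocrcheck report also shows which disk groups hold the OCR, which is the information the second node needs later. A small sketch for extracting those locations is shown below. The function name ocrcheck_locations is made up, and the parsing assumes the "Device/File Name : value" layout seen above.

```shell
# Hypothetical helper: reads `ocrcheck` output on stdin and prints each
# configured OCR location (one per line).
ocrcheck_locations() {
    # Split on ':' and strip the padding around the value.
    awk -F: '/Device\/File Name/ { gsub(/[ \t]+/, "", $2); print $2 }'
}
```

Run against the report above, `ocrcheck | ocrcheck_locations` would list +SYSDG and +DATADG.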
5) Create the spfile for the ASM instance
# su - grid
$ sqlplus / as sysasm
SQL> create spfile from memory;
Note: if the target directory does not exist yet, create it first with asmcmd.
$ asmcmd
ASMCMD> mkdir +SYSDG/CLR-MIRR/ASMPARAMETERFILE
6) Recreate the voting disk
# crsctl replace votedisk +SYSDG
Successful addition of voting disk 375366a405384f81bfefde626f7d5cc8.
Successfully replaced voting disk group with +SYSDG.
CRS-4266: Voting file(s) successfully replaced
7) Once both the OCR and the voting disk are restored, restart CRS in normal mode
# crsctl stop has -f
# crsctl start crs
# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
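All four services must report online before moving on to the database. As a rough sketch, the check can be automated by looking for the four message codes shown above. The function name crs_all_online is hypothetical.

```shell
# Hypothetical helper: reads `crsctl check crs` output on stdin and succeeds
# only if HAS, CRS, CSS and EVM all report "is online".
crs_all_online() {
    _crs_out=$(cat)   # buffer stdin so it can be scanned once per code
    for _code in CRS-4638 CRS-4537 CRS-4529 CRS-4533; do
        printf '%s\n' "$_crs_out" | grep -q "^$_code: .* is online" || return 1
    done
}
```

For example, `crsctl check crs | crs_all_online || echo "stack not fully up yet"` could be repeated until the stack has finished starting.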
1) Start the database in NOMOUNT state using the backed-up parameter file
SMIRRDB1:/oracle/app/oracle/product/12.1.0/dbhome_1/dbs$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Mon Apr 25 11:10:32 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup nomount pfile='dbmirr_pfile01.ora'
ORACLE instance started.
Total System Global Area 4294967296 bytes
Fixed Size 5365496 bytes
Variable Size 3590324488 bytes
Database Buffers 687865856 bytes
Redo Buffers 11411456 bytes
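2) Create the spfile used for startup
SQL> create spfile='+DATADG/DBMIRR/spfileDBMIRR.ora' from pfile='dbmirr_pfile01.ora';
File created.
## Note: if the target directory does not exist, create it with asmcmd.
## Normally the spfile would be part of the backup and could be restored with RMAN, but this RAC had no spfile backup.
## Restart the instance using the spfile
SQL> shutdown immediate;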
SQL> startup nomount force;
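3) Restore the control file
No control file backup exists for this database, so the only option is to restore from the snapshot control file.
Snapshot path: /oracle/app/oracle/product/12.1.0/dbhome_1/dbs/snapcf_DBMIRR1.f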
# su - oracle
$ rman target /
Recovery Manager: Release 12.1.0.2.0 - Production on Mon Apr 25 12:04:35 2016
Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved.
connected to target database: DBMIRR (not mounted)
RMAN> restore controlfile from 'snapcf_DBMIRR1.f';
Starting restore at 25-APR-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1991 instance=DBMIRR1 device type=DISK
channel ORA_DISK_1: copied control file copy
output filename=+DATADG/ORAMIRRS/CONTROLFILE/current.257.910094701
output filename=+FRADG/ORAMIRRS/CONTROLFILE/current.256.910094701
Finished restore at 25-APR-16
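RMAN writes the restored control file to the locations named by the control_files parameter, so it can help to look at that parameter in the text pfile before running the restore. The sketch below assumes a single-line entry of the form *.control_files='path1','path2'. The function name pfile_control_files is made up for illustration.

```shell
# Hypothetical helper: prints the control_files entries of a text pfile,
# one per line. $1 is the path to the pfile.
# Assumes the parameter sits on one line: *.control_files='a','b'
pfile_control_files() {
    grep -i 'control_files' "$1" |
        sed 's/^[^=]*=//' |   # drop everything up to the first '='
        tr ',' '\n' |         # one entry per line
        tr -d "' "            # strip quotes and spaces
}
```

It could be called as `pfile_control_files dbmirr_pfile01.ora`. With ASM and Oracle-managed files the final file names can differ from what the pfile lists, so treat the result as an indication of the target disk groups only.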
4) Restore the data
SQL> alter database mount; ## mount the database
SMIRRDB1:/soft/orabackup$ rman target /
Recovery Manager: Release 12.1.0.2.0 - Production on Mon Apr 25 12:07:52 2016
Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved.
connected to target database: DBMIRR (DBID=2598089823, not open)
RMAN> restore database;
Starting restore at 25-APR-16
Starting implicit crosscheck backup at 25-APR-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
Crosschecked 12 objects
Finished implicit crosscheck backup at 25-APR-16
Starting implicit crosscheck copy at 25-APR-16
using channel ORA_DISK_1
Finished implicit crosscheck copy at 25-APR-16
searching for all files in the recovery area
cataloging files...
no files cataloged
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to +DATADG/DBMIRR/DATAFILE/system.260.909054177
channel ORA_DISK_1: restoring datafile 00003 to +DATADG/DBMIRR/DATAFILE/sysaux.262.909054201
channel ORA_DISK_1: restoring datafile 00005 to +DATADG/DBMIRR/DATAFILE/undotbs1.264.909054211
channel ORA_DISK_1: restoring datafile 00006 to +DATADG/DBMIRR/DATAFILE/undotbs2.267.909054305
channel ORA_DISK_1: restoring datafile 00007 to +DATADG/DBMIRR/DATAFILE/users.268.909054307
channel ORA_DISK_1: restoring datafile 00011 to +DATADG/ORAMIRR/CDC/icdcdat01.dbf
channel ORA_DISK_1: reading from backup piece /soft/orabackup/Primary_bkp_for_stndby_0br3mcrq_1_1
channel ORA_DISK_1: ORA-19870: error while restoring backup piece /soft/orabackup/Primary_bkp_for_stndby_0
ORA-19504: failed to create file "+DATADG/ORAMIRR/CDC/icdcdat01.dbf"
ORA-17502: ksfdcre:4 Failed to create file +DATADG/ORAMIRR/CDC/icdcdat01.dbf
ORA-15173: entry 'ORAMIRR' does not exist in directory '/'
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00008 to +DATADG/oramirr_system.dbf
channel ORA_DISK_1: restoring datafile 00009 to +DATADG/oramirr_sysaux.dbf
channel ORA_DISK_1: restoring datafile 00010 to +DATADG/oramirr_users01.dbf
channel ORA_DISK_1: reading from backup piece /soft/orabackup/Primary_bkp_for_stndby_0cr3mcsu_1_1
channel ORA_DISK_1: piece handle=/soft/orabackup/Primary_bkp_for_stndby_0cr3mcsu_1_1 tag=TAG20160422T15574
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00002 to +DATADG/DBMIRR/30560331E9710080E0536F010141495F/DATAFILE/s
channel ORA_DISK_1: restoring datafile 00004 to +DATADG/DBMIRR/30560331E9710080E0536F010141495F/DATAFILE/s
channel ORA_DISK_1: reading from backup piece /soft/orabackup/Primary_bkp_for_stndby_0dr3mcte_1_1
channel ORA_DISK_1: piece handle=/soft/orabackup/Primary_bkp_for_stndby_0dr3mcte_1_1 tag=TAG20160422T15574
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
failover to previous backup
creating datafile file number=1 name=+DATADG/DBMIRR/DATAFILE/system.260.909054177
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 04/25/2016 12:08:49
ORA-01180: can not create datafile 1
ORA-01110: data file 1: '+DATADG/DBMIRR/DATAFILE/system.260.909054177'
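Part of this failure comes from the freshly created disk group lacking a directory that the old one had: the ORA-15173 message says the entry ORAMIRR does not exist. A possible way to list such directories from a saved RMAN log, so they can be created with asmcmd mkdir before a retry, is sketched below. The function name missing_asm_dirs is hypothetical, and the pattern assumes the ORA-19504 message format shown in the log. It does not address the ORA-01180 error on datafile 1.

```shell
# Hypothetical helper: reads an RMAN log on stdin and prints the ASM
# directory of every file that failed with ORA-19504 (one per line,
# duplicates removed).
missing_asm_dirs() {
    # Capture the quoted path up to its last '/', i.e. the directory part.
    sed -n 's|^ORA-19504: failed to create file "\(+[^"]*\)/[^"/]*".*|\1|p' |
        sort -u
}
```

For the log above, `missing_asm_dirs < restore.log` (restore.log being an assumed file name for the saved output) would print +DATADG/ORAMIRR/CDC.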
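Because all the disks had been damaged, the data could not be fully recovered.
A hidden parameter was therefore used to skip any further recovery and redo application and open the database directly.
SQL> alter system set "_allow_resetlogs_corruption"=true scope=spfile;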
SQL> shutdown immediate
SQL> startup
SQL> alter database open RESETLOGS;
SQL> alter system set "_allow_resetlogs_corruption"=false scope=spfile;
SQL> shutdown immediate;
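Copy the new OCR location information from SMIRRDB1 into /etc/oracle/ocr.loc on SMIRRDB2.
Then reboot both machines and start the database cluster.
In this case the MGMTDB was not recovered and has to be rebuilt.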