Data Guard缺少部分归档日志解决办法

发现问题

测试环境搭建了一套RAC+DG的系统。数据库版本11203。
RAC双节点,DG单节点。
有段时间把DG节点关机,只开主节点RAC。(毕竟并没有线上业务,且大多用来做实验或数据统计用。)
一段时间之后,打开DG节点,发现无法open read only。
检查发现DG缺少部分归档日志,且该部分归档日志已经被删掉,没有备份。
此时有两种解决办法:
1.重新搭建DG环境。
2.不重新搭建DG环境,利用其他办法解决。
本文详细介绍了利用RMAN增量备份解决DG gap的问题。

操作步骤

1.确认问题存在,分析原因。

查看DG日志应用情况,确定日志应用进度。

thread 1: sequence# 214
thread 2: sequence# 112

SQL> select NAME,THREAD#,SEQUENCE#,FIRST_CHANGE#,NEXT_CHANGE#,ARCHIVED,APPLIED,DELETED from v$archived_log;

NAME                                                  THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE# ARC APPLIED DEL
-------------------------------------------------- ---------- ---------- ------------- ------------ --- --------- ---
/u01/app/oracle/archARC_1_0000000211_885898750.arc          1        211       5889618      5889672 YES YES       NO
/u01/app/oracle/archARC_2_0000000111_885898750.arc          2        111       5889621      5889675 YES YES       NO
/u01/app/oracle/archARC_1_0000000212_885898750.arc          1        212       5889672      5936547 YES YES       NO
/u01/app/oracle/archARC_1_0000000213_885898750.arc          1        213       5936547      5942602 YES YES       NO
/u01/app/oracle/archARC_1_0000000214_885898750.arc          1        214       5942602      5953350 YES YES       NO
/u01/app/oracle/archARC_2_0000000112_885898750.arc          2        112       5889675      5954190 YES NO        NO
/u01/app/oracle/archARC_1_0000000274_885898750.arc          1        274       9065115      9144171 YES NO        NO
/u01/app/oracle/archARC_1_0000000275_885898750.arc          1        275       9144171      9226083 YES NO        NO
/u01/app/oracle/archARC_1_0000000276_885898750.arc          1        276       9226083      9252006 YES NO        NO
/u01/app/oracle/archARC_2_0000000154_885898750.arc          2        154       9155657      9228653 YES NO        NO
..................
..................
/u01/app/oracle/archARC_2_0000000249_885898750.arc          2        249      14437971     14442095 YES NO        NO
/u01/app/oracle/archARC_2_0000000252_885898750.arc          2        252      14459559     14461551 YES NO        NO
/u01/app/oracle/archARC_1_0000000373_885898750.arc          1        373      14459562     14461565 YES NO        NO

155 rows selected.

SQL> 
SQL> 

从以上可以看出DG节点日志应用SCN位于5953350和5954190之间。(根据thread 1 和thread 2日志SCN)

查看DG节点归档日志gap视图

SQL> 
SQL> select * from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
         1           215            273
         2           113            152

SQL> 

查询数据库当前SCN

SQL> select current_scn from v$database; CURRENT_SCN -----------
 5953349

确认主库是否有添加数据文件
DG节点关机期间并未添加数据文件
相关SQL如下:

SQL> select FILE#,name from v$datafile where CREATION_CHANGE#> =5953349;

no rows selected

主库增量备份
增量备份的SCN要比当前数据库SCN小。(否则拷贝到DG节点做恢复的时候不起作用,后面有详细输出。)

[oracle@racnode1 ~]$ rman target /

RMAN> backup incremental from SCN 5953344  DATABASE FORMAT '/home/oracle/ljk_%U' TAG 'zengliangbk';

Starting backup at 22-FEB-16

using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=28 instance=RACDB_1 device type=DISK
backup will be obsolete on date 29-FEB-16
archived logs will not be kept or backed up
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=+DATADG/racdb/datafile/data01.269.886180153
input datafile file number=00002 name=+DATADG/racdb/datafile/sysaux.261.885898927
input datafile file number=00001 name=+DATADG/racdb/datafile/system.260.885898771
input datafile file number=00003 name=+DATADG/racdb/datafile/undotbs1.262.885898975
input datafile file number=00007 name=+DATADG/racdb/datafile/index01.270.886180183
input datafile file number=00004 name=+DATADG/racdb/datafile/undotbs2.264.885899061
input datafile file number=00005 name=+DATADG/racdb/datafile/users.265.885899091
input datafile file number=00009 name=+DATADG/racdb/datafile/test2.301.888744571
channel ORA_DISK_1: starting piece 1 at 22-FEB-16
channel ORA_DISK_1: finished piece 1 at 22-FEB-16
piece handle=/home/oracle/ljk_11quirn3_1_1 tag=ZENGLIANGBK comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:03:06

using channel ORA_DISK_1
backup will be obsolete on date 29-FEB-16
archived logs will not be kept or backed up
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
channel ORA_DISK_1: starting piece 1 at 22-FEB-16
channel ORA_DISK_1: finished piece 1 at 22-FEB-16
piece handle=/home/oracle/ljk_12quirt0_1_1 tag=ZENGLIANGBK comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 22-FEB-16

备份控制文件(一定要有,否则在DG节点应用完增量备份之后,控制文件不会发生改变,跟应用增量之前一样。)

RMAN> backup current controlfile for standby format '/home/oracle/ljk_ctl.bak';

Starting backup at 22-FEB-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=158 instance=RACDB_1 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including standby control file in backup set
channel ORA_DISK_1: starting piece 1 at 22-FEB-16
channel ORA_DISK_1: finished piece 1 at 22-FEB-16
piece handle=/home/oracle/ljk_ctl.bak tag=TAG20160222T151222 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 22-FEB-16

将主库的增量备份和控制文件备份拷贝到DG节点

[oracle@stdby ~]$ ll
total 2137660
drwxr-xr-x 8 oracle oinstall       4096 Sep 22  2011 database
-rw-r----- 1 oracle oinstall 1065369600 Feb 22 15:28 ljk_11quirn3_1_1 -rw-r----- 1 oracle oinstall 18743296 Feb 22 15:28 ljk_12quirt0_1_1 -rw-r----- 1 oracle oinstall 18743296 Feb 22 15:11 ljk_ctl.bak

在DG节点应用增量备份
先把拷贝过来的备份加入到RMAN管理中

RMAN> catalog backuppiece "/home/oracle/ljk_11quirn3_1_1","/home/oracle/ljk_12quirt0_1_1";

cataloged backup piece
backup piece handle=/home/oracle/ljk_11quirn3_1_1 RECID=23 STAMP=904491054 cataloged backup piece backup piece handle=/home/oracle/ljk_12quirt0_1_1 RECID=24 STAMP=904491057

RMAN> catalog start with '/home/oracle';
List of Cataloged Files =======================
File Name: /home/oracle/ljk_ctl.bak

应用增量备份

RMAN> recover database noredo;

Starting recover at 22-FEB-16
released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=10 device type=DISK
RMAN-06900: WARNING: unable to generate V$RMAN_STATUS or V$RMAN_OUTPUT row
RMAN-06901: WARNING: disabling update of the V$RMAN_STATUS and V$RMAN_OUTPUT rows
ORACLE error from target database: 
ORA-19922: there is no parent row with id 0 and level 1

channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/app/oracle/oradata/RACSTD/system.260.885898771
destination for restore of datafile 00002: /u01/app/oracle/oradata/RACSTD/sysaux.261.885898927
destination for restore of datafile 00003: /u01/app/oracle/oradata/RACSTD/undotbs1.262.885898975
destination for restore of datafile 00004: /u01/app/oracle/oradata/RACSTD/undotbs2.264.885899061
destination for restore of datafile 00005: /u01/app/oracle/oradata/RACSTD/users.265.885899091
destination for restore of datafile 00006: /u01/app/oracle/oradata/RACSTD/data01.269.886180153
destination for restore of datafile 00007: /u01/app/oracle/oradata/RACSTD/index01.270.886180183
destination for restore of datafile 00009: /u01/app/oracle/oradata/RACSTD/test2.dbf
channel ORA_DISK_1: reading from backup piece /home/oracle/ljk_11quirn3_1_1
channel ORA_DISK_1: piece handle=/home/oracle/ljk_11quirn3_1_1 tag=ZENGLIANGBK
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:25

Finished recover at 22-FEB-16

应用备份的控制文件


RMAN> RESTORE STANDBY CONTROLFILE FROM '/home/oracle/ljk_ctl.bak';

Starting restore at 22-FEB-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=10 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
output file name=/u01/app/oracle/oradata/RACSTD/control01.ctl
output file name=/u01/app/oracle/oradata/RACSTD/control02.ctl
Finished restore at 22-FEB-16

遇到的问题

1.增量备份开始SCN如果小于当前DG的SCN,则增量备份无法应用。

RMAN> backup incremental from SCN 5954190 DATABASE FORMAT '/home/oracle/xff_%U' TAG 'XIFENFEI';

SCN 5954190大于DG节点SCN 5953349
应用增量备份的时候

RMAN> recover database noredo;

Starting recover at 22-FEB-16
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=22 device type=DISK

Finished recover at 22-FEB-16

2.应用增量备份的时候报错

ORA-19573: cannot obtain exclusive enqueue for datafile 6
出错原因:
日志应用进程已经启动,正在恢复数据库(虽然没有日志)。
解决办法:recover managed standby database cancel
应用增量日志需要DG处于mount状态

RMAN> recover database noredo;

Starting recover at 22-FEB-16
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=22 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/app/oracle/oradata/RACSTD/system.260.885898771
destination for restore of datafile 00002: /u01/app/oracle/oradata/RACSTD/sysaux.261.885898927
destination for restore of datafile 00003: /u01/app/oracle/oradata/RACSTD/undotbs1.262.885898975
destination for restore of datafile 00004: /u01/app/oracle/oradata/RACSTD/undotbs2.264.885899061
destination for restore of datafile 00005: /u01/app/oracle/oradata/RACSTD/users.265.885899091
destination for restore of datafile 00006: /u01/app/oracle/oradata/RACSTD/data01.269.886180153
destination for restore of datafile 00007: /u01/app/oracle/oradata/RACSTD/index01.270.886180183
destination for restore of datafile 00009: /u01/app/oracle/oradata/RACSTD/test2.dbf
channel ORA_DISK_1: reading from backup piece /home/oracle/ljk_11quirn3_1_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 02/22/2016 15:31:13
ORA-19870: error while restoring backup piece /home/oracle/ljk_11quirn3_1_1
ORA-19573: cannot obtain exclusive enqueue for datafile 6

3恢复控制文件时出错

原因:数据库处于mount状态
解决办法:将DG启动到nomout状态

RMAN> RESTORE STANDBY CONTROLFILE FROM '/home/oracle/ljk_ctl.bak';

Starting restore at 22-FEB-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=135 device type=DISK

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 02/22/2016 15:41:48
RMAN-06496: must use the TO clause when the database is mounted or open

4.新归档的日志无法应用

查看归档日志应用情况,发现新生成的归档日志都没有应用。

SQL> select NAME,THREAD#,SEQUENCE#,ARCHIVED,APPLIED from v$archived_log;
NAME                                                  THREAD# SEQUENCE# ARC APPLIED
-------------------------------------------------- ---------- ---------- --- ---------
/u01/app/oracle/archARC_2_0000000254_885898750.arc          2        254 YES NO
/u01/app/oracle/archARC_2_0000000253_885898750.arc          2        253 YES NO
/u01/app/oracle/archARC_2_0000000255_885898750.arc          2        255 YES NO
/u01/app/oracle/archARC_2_0000000256_885898750.arc          2        256 YES NO
/u01/app/oracle/archARC_2_0000000257_885898750.arc          2        257 YES NO
/u01/app/oracle/archARC_1_0000000375_885898750.arc          1        375 YES NO
/u01/app/oracle/archARC_1_0000000376_885898750.arc          1        376 YES NO
/u01/app/oracle/archARC_1_0000000374_885898750.arc          1        374 YES NO
/u01/app/oracle/archARC_1_0000000377_885898750.arc          1        377 YES NO
/u01/app/oracle/archARC_1_0000000378_885898750.arc          1        378 YES NO
/u01/app/oracle/archARC_2_0000000258_885898750.arc          2        258 YES NO

查看alert日志寻找原因:
发现无法找到数据文件test2.301.888744571

Mon Feb 22 17:09:22 2016
alter database recover managed standby database using current logfile disconnect from session Attempt to start background Managed Standby Recovery process (RACSTD) Mon Feb 22 17:09:22 2016 MRP0 started with pid=24, OS id=11362 MRP0: Background Managed Standby Recovery process started (RACSTD) started logmerger process Mon Feb 22 17:09:27 2016 Managed Standby Recovery starting Real Time Apply Mon Feb 22 17:09:27 2016 Errors in file /u01/app/oracle/diag/rdbms/racstd/RACSTD/trace/RACSTD_dbw0_11312.trc: ORA-01157: cannot identify/lock data file 9 - see DBWR trace file ORA-01110: data file 9: '/u01/app/oracle/oradata/RACSTD/test2.301.888744571' ORA-27037: unable to obtain file status Linux-x86_64 Error: 2: No such file or directory Additional information: 3 MRP0: Background Media Recovery terminated with error 1110 Errors in file /u01/app/oracle/diag/rdbms/racstd/RACSTD/trace/RACSTD_pr00_11365.trc: ORA-01110: data file 9: '/u01/app/oracle/oradata/RACSTD/test2.301.888744571' ORA-01157: cannot identify/lock data file 9 - see DBWR trace file ORA-01110: data file 9: '/u01/app/oracle/oradata/RACSTD/test2.301.888744571' Managed Standby Recovery not using Real Time Apply Slave exiting with ORA-1110 exception Errors in file /u01/app/oracle/diag/rdbms/racstd/RACSTD/trace/RACSTD_pr00_11365.trc: ORA-01110: data file 9: '/u01/app/oracle/oradata/RACSTD/test2.301.888744571' ORA-01157: cannot identify/lock data file 9 - see DBWR trace file ORA-01110: data file 9: '/u01/app/oracle/oradata/RACSTD/test2.301.888744571' Recovery Slave PR00 previously exited with exception 1110 MRP0: Background Media Recovery process shutdown (RACSTD) Completed: alter database recover managed standby database using current logfile disconnect from session

查看文件目录和数据文件视图:

[oracle@stdby RACSTD]$ ll
total 5344984
-rw-r----- 1 oracle oinstall 18694144 Feb 22 17:33 control01.ctl -rw-r----- 1 oracle oinstall 18694144 Feb 22 17:33 control02.ctl -rw-r----- 1 oracle oinstall 1677729792 Feb 22 17:26 data01.269.886180153 -rw-r----- 1 oracle oinstall 314580992 Feb 22 17:26 index01.270.886180183 drwxr-x--- 4 oracle oinstall       4096 Aug 24 15:36 RACSTD
-rw-r----- 1 oracle oinstall 52429312 Aug 26 09:57 redo01.log -rw-r----- 1 oracle oinstall 52429312 Aug 26 09:57 redo02.log -rw-r----- 1 oracle oinstall 52429312 Aug 26 09:57 redo03.log -rw-r----- 1 oracle oinstall 52429312 Aug 26 09:57 redo04.log -rw-r----- 1 oracle oinstall 52429312 Feb 22 17:27 redo05.log -rw-r----- 1 oracle oinstall 52429312 Feb 22 17:33 redo06.log -rw-r----- 1 oracle oinstall 52429312 Feb 22 17:27 redo07.log -rw-r----- 1 oracle oinstall 52429312 Feb 22 17:33 redo08.log -rw-r----- 1 oracle oinstall 52429312 Feb 22 17:27 redo09.log -rw-r----- 1 oracle oinstall 52429312 Feb 22 17:27 redo10.log -rw-r----- 1 oracle oinstall 1184899072 Feb 22 17:26 sysaux.261.885898927 -rw-r----- 1 oracle oinstall 775954432 Feb 22 17:26 system.260.885898771 -rw-r----- 1 oracle oinstall 10493952 Feb 22 15:45 test2.dbf -rw-r----- 1 oracle oinstall 671096832 Feb 22 17:26 undotbs1.262.885898975 -rw-r----- 1 oracle oinstall 209723392 Feb 22 17:26 undotbs2.264.885899061 -rw-r----- 1 oracle oinstall 61612032 Feb 22 17:26 users.265.885899091
SQL> select name from v$datafile;

NAME --------------------------------------------------------------------------------
/u01/app/oracle/oradata/RACSTD/system.260.885898771
/u01/app/oracle/oradata/RACSTD/sysaux.261.885898927
/u01/app/oracle/oradata/RACSTD/undotbs1.262.885898975
/u01/app/oracle/oradata/RACSTD/undotbs2.264.885899061
/u01/app/oracle/oradata/RACSTD/users.265.885899091
/u01/app/oracle/oradata/RACSTD/data01.269.886180153
/u01/app/oracle/oradata/RACSTD/index01.270.886180183
/u01/app/oracle/oradata/RACSTD/test2.301.888744571

修改数据文件名称(修改之前备份)

[oracle@stdby RACSTD]$ cp test2.dbf test2.dbf.bak
[oracle@stdby RACSTD]$ mv test2.dbf test2.301.888744571

之后查看归档日志应用情况

SQL> select NAME,THREAD#,SEQUENCE#,ARCHIVED,APPLIED from v$archived_log;

NAME                                                  THREAD# SEQUENCE# ARC APPLIED
-------------------------------------------------- ---------- ---------- --- ---------
/u01/app/oracle/archARC_2_0000000254_885898750.arc          2        254 YES YES
/u01/app/oracle/archARC_2_0000000253_885898750.arc          2        253 YES YES
/u01/app/oracle/archARC_2_0000000255_885898750.arc          2        255 YES YES
/u01/app/oracle/archARC_2_0000000256_885898750.arc          2        256 YES YES
/u01/app/oracle/archARC_2_0000000257_885898750.arc          2        257 YES YES
/u01/app/oracle/archARC_1_0000000375_885898750.arc          1        375 YES YES
/u01/app/oracle/archARC_1_0000000376_885898750.arc          1        376 YES YES
/u01/app/oracle/archARC_1_0000000374_885898750.arc          1        374 YES YES
/u01/app/oracle/archARC_1_0000000377_885898750.arc          1        377 YES YES
/u01/app/oracle/archARC_1_0000000378_885898750.arc          1        378 YES YES
/u01/app/oracle/archARC_2_0000000258_885898750.arc          2        258 YES YES

之后以open read only方式可以正常打开DG节点数据库。

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> alter database open;  

Database altered.

SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.

SQL> select open_mode from v$database; OPEN_MODE --------------------
READ ONLY WITH APPLY

你可能感兴趣的:(Data Guard缺少部分归档日志解决办法)