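A customer's core system consists of one single-instance Oracle Database 11.2.0.3.4 and one Active Data Guard (ADG) standby, each running on its own PC server under Oracle Linux 5.8 x86_64; neither server is attached to external storage. Because the existing hardware is aging, the whole Oracle system (primary and ADG standby) had to be migrated to two newly purchased servers, with no change of version or platform. To minimize downtime, we first ran restore database on both new servers from the most recent RMAN level 0 backup, then restored and recovered all level 1 backups and archived logs taken up to that point on both databases. After the primary was shut down cleanly, we applied the remaining archived logs and the online redo log files to both new databases so that they were fully up to date and consistent with each other. Finally we opened the primary and resumed ADG synchronization. The outage began at midnight and the migration was not complete until 4:20 a.m. We ran into a number of small problems along the way, which are recorded here:
1. RMAN error.
After applying some of the archived logs, RMAN reported the following error: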
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 11/13/2014 00:03:03
ORA-00283: recovery session canceled due to errors
RMAN-11003: failure during parse/execution of SQL statement: alter database recover logfile '/oradata/bak/archivelog/2014_11_12/o1_mf_1_62193_b65oryl5_.arc'
ORA-00283: recovery session canceled due to errors
ORA-19755: could not open change tracking file
ORA-19750: change tracking file: '/u01/app/oracle/block_change_file'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
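The error stack shows that the block change tracking file recorded in the control file does not exist on the new server. Disabling block change tracking with the following SQL lets the database carry on applying archived logs normally:
SQL> alter database disable block change tracking;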
Database altered.
Reference: http://gavinsoorma.com/2009/07/rman-recovery-interrupted-due-to-block-change-tracking-file/
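Once the migration is finished, block change tracking probably needs to be switched back on, otherwise later level 1 backups cannot use it. The statements below are a minimal sketch: the file path is simply the one that appears in the ORA-19750 message above and is assumed to be the desired location on the new server; adjust it if the directory layout differs.

```sql
-- Check whether block change tracking is currently enabled and where its file lives.
select status, filename from v$block_change_tracking;

-- Re-enable it after the database is open.
-- The path is an assumption taken from the error message above.
alter database enable block change tracking
  using file '/u01/app/oracle/block_change_file' reuse;
```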
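2. Checking log synchronization between the primary and the standby after resetlogs.
The following SQL, run on the primary, is commonly used to check log synchronization between the primary and the standby: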
SQL> select dest_id,thread#,max(sequence#) from v$archived_log where resetlogs_change#=936497858 group by dest_id,thread#;
DEST_ID THREAD# MAX(SEQUENCE#)
---------- ---------- --------------
2 1 9
1 1 9
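Because the primary was opened with alter database open resetlogs, the query against v$archived_log must filter on resetlogs_change# to make sure it only looks at archived logs of the current incarnation. The value can be obtained from v$database.resetlogs_change#. Also note that sequence# starts counting again from the beginning after a resetlogs open.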
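To avoid copying the resetlogs_change# value by hand, the lookup can be folded into the check itself. This is a small variation on the query above, run on the primary; it uses only the views already mentioned.

```sql
-- Show the current incarnation's resetlogs SCN and time.
select resetlogs_change#, resetlogs_time from v$database;

-- Highest archived sequence per destination and thread for the current incarnation only.
select dest_id, thread#, max(sequence#) as max_sequence
from   v$archived_log
where  resetlogs_change# = (select resetlogs_change# from v$database)
group  by dest_id, thread#
order  by dest_id, thread#;
```

If the local destination (dest_id 1) and the standby destination (dest_id 2) report the same maximum sequence, log shipping has caught up.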
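3. Handling an active standby logfile.
After stopping the primary, our plan was to copy all of the original online redo logfiles and standby logfiles to the new server and relocate them with alter database rename file ... to .... Unexpectedly, the active standby logfile could not be renamed (the error was ORA-01511: error in renaming log/data files), yet the standby logfile had to end up in the designated directory. Here is what v$logfile showed: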
SQL> select group#,member from v$logfile;
GROUP#
----------
MEMBER
--------------------------------------------------------------------------------
3
/oradata/orcl/REDO03.LOG
2
/oradata/orcl/REDO02.LOG
1
/oradata/orcl/REDO01.LOG
GROUP#
----------
MEMBER
--------------------------------------------------------------------------------
4
/u01/app/oracle/oradata/orcl/sredo01.log
5
/oradata/orcl/sredo02.log
6
/oradata/orcl/sredo03.log
GROUP#
----------
MEMBER
--------------------------------------------------------------------------------
7
/oradata/orcl/sredo04.log
7 rows selected.
Group# 4 was the standby logfile that had been active on the primary, and alter database rename file could not be run against it.
SQL> alter database drop logfile group 4;
alter database drop logfile group 4
*
ERROR at line 1:
ORA-00315: log 4 of thread 1, wrong thread # 0 in header
ORA-00312: online log 4 thread 1: '/u01/app/oracle/oradata/orcl/sredo01.log'
The attempt to drop the group failed.
SQL> alter database add logfile member '/oradata/orcl/sredo01.log' to group 4;
alter database add logfile member '/oradata/orcl/sredo01.log' to group 4
*
ERROR at line 1:
ORA-16161: Cannot mix standby and online redo log file members for group 4
The attempt to add a member failed.
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 4;
ALTER DATABASE CLEAR LOGFILE GROUP 4
*
ERROR at line 1:
ORA-00350: log 4 of instance orcl (thread 1) needs to be archived
ORA-00312: online log 4 thread 1: '/u01/app/oracle/oradata/orcl/sredo01.log'
A plain CLEAR failed because the log had not been archived.
SQL> ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 4;
Database altered.
CLEAR UNARCHIVED succeeded.
Standby logfiles are handled in the same way as online redo logfiles here.
SQL> select group#,thread#,status from v$standby_log;
GROUP# THREAD# STATUS
---------- ---------- ----------
4 1 UNASSIGNED
5 1 UNASSIGNED
6 1 UNASSIGNED
7 1 UNASSIGNED
SQL> select member from v$logfile;
MEMBER
--------------------------------------------------------------------------------
/oradata/orcl/REDO03.LOG
/oradata/orcl/REDO02.LOG
/oradata/orcl/REDO01.LOG
/u01/app/oracle/oradata/orcl/sredo01.log
/oradata/orcl/sredo02.log
/oradata/orcl/sredo03.log
/oradata/orcl/sredo04.log
7 rows selected.
SQL> alter database drop logfile group 4;
Database altered.
The log group was dropped successfully.
SQL> select member from v$logfile;
MEMBER
--------------------------------------------------------------------------------
/oradata/orcl/REDO03.LOG
/oradata/orcl/REDO02.LOG
/oradata/orcl/REDO01.LOG
/oradata/orcl/sredo02.log
/oradata/orcl/sredo03.log
/oradata/orcl/sredo04.log
6 rows selected.
SQL> select group#,thread#,bytes/1024/1024 mb from v$standby_log;
GROUP# THREAD# MB
---------- ---------- ----------
5 1 50
6 1 50
7 1 50
SQL> ALTER DATABASE ADD STANDBY LOGFILE group 4 ('/oradata/orcl/sredo01.log') SIZE 50M;
Database altered.
Group 4 was added back at its new location.
SQL> select member from v$logfile;
MEMBER
--------------------------------------------------------------------------------
/oradata/orcl/REDO03.LOG
/oradata/orcl/REDO02.LOG
/oradata/orcl/REDO01.LOG
/oradata/orcl/sredo01.log
/oradata/orcl/sredo02.log
/oradata/orcl/sredo03.log
/oradata/orcl/sredo04.log
7 rows selected.
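When a rename or drop fails as it did above, it helps to see in one listing which groups are online redo logs, which are standby logs, and whether a standby log is still active or unarchived. The query below is a sketch that joins the two views already used in this section; it was not part of the original procedure.

```sql
-- TYPE is ONLINE or STANDBY; the v$standby_log columns are NULL for online groups.
select f.group#,
       f.type,
       f.member,
       s.thread#,
       s.status,
       s.archived
from   v$logfile f
       left join v$standby_log s on s.group# = f.group#
order  by f.group#;
```

A standby group that is shown as ACTIVE, or with ARCHIVED = NO, is the kind of group that needed CLEAR UNARCHIVED LOGFILE before it could be dropped and recreated in the walkthrough above.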
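Note: the operations above may not be possible on the standby. The workaround is as follows: once the standby logfile relocation is complete on the primary, create a new standby controlfile on the primary in MOUNT state (alter database create standby controlfile as '/tmp/controlf.ctl';), transfer the new standby controlfile and the standby logfiles to the same locations on the standby, recover the standby to a consistent state, open it, and start applying logs. See also the article "Oracle Active Data Guard Adjustment Case [2]": http://blog.itpub.net/23135684/viewspace-1262326/
A Data Guard standby must be in a consistent state before it can be opened read only. Otherwise, run alter database recover managed standby database to apply logs until it reaches a consistent state, and then open it read only.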
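The order of operations for bringing an inconsistent standby to read only with real-time apply can be sketched as follows. This is an illustration of the sequence just described, not a transcript from the migration; it assumes the standby is in MOUNT state and that enough redo has been shipped to make it consistent.

```sql
-- Run on the standby, in MOUNT state.
-- 1. Start managed recovery in the background so the shipped logs are applied.
alter database recover managed standby database disconnect from session;

-- 2. Watch apply progress for the current incarnation.
select thread#, max(sequence#) as last_applied
from   v$archived_log
where  applied = 'YES'
and    resetlogs_change# = (select resetlogs_change# from v$database)
group  by thread#;

-- 3. Once the standby is consistent, stop recovery and open it read only.
alter database recover managed standby database cancel;
alter database open read only;

-- 4. Restart apply with real-time apply (Active Data Guard).
alter database recover managed standby database using current logfile disconnect from session;
```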
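4. The primary and the standby could not synchronize in real time.
The original primary and standby were configured for real-time synchronization in maximum performance mode. After the migration, real-time synchronization no longer worked, although synchronization after each archive log switch was normal.
To begin with, LOG_ARCHIVE_DEST_2 on the primary was correctly configured with the LGWR AFFIRM SYNC attributes, and the standby had been started with the following apply command: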
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
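The redo transport services between the primary and the standby were also working normally.
Running the following SQL on both the primary and the standby gave: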
SQL> select group#,thread#,status,bytes/1024/1024 mb from v$standby_log;
GROUP# THREAD# STATUS MB
---------- ---------- ---------- ----------
4 0 UNASSIGNED 100
5 0 UNASSIGNED 100
6 0 UNASSIGNED 100
7 0 UNASSIGNED 100
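Under normal conditions at least one standby logfile group on the standby is in ACTIVE status. Here none was, so the data could not be synchronized in real time. Looking closely, the thread# is 0 for every group, which may be why the standby could not find a standby logfile to use for real-time synchronization.
Start the standby in MOUNT state and run the following SQL, specifying the thread number for the standby logfiles: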
SQL> startup mount
ORACLE instance started.
Total System Global Area 1.3683E+11 bytes
Fixed Size 2245480 bytes
Variable Size 8858373272 bytes
Database Buffers 1.2778E+11 bytes
Redo Buffers 189480960 bytes
Database mounted.
SQL> ALTER DATABASE ADD STANDBY LOGFILE thread 1 ('/oradata/orcl/sredo05.log') SIZE 100M reuse;
Database altered.
SQL> select group#,thread#,status,bytes/1024/1024 mb from v$standby_log;
GROUP# THREAD# STATUS MB
---------- ---------- ---------- ----------
4 0 UNASSIGNED 100
5 0 UNASSIGNED 100
6 0 UNASSIGNED 100
7 0 UNASSIGNED 100
8 1 UNASSIGNED 100
SQL> alter database drop logfile group 4;
Database altered.
SQL> alter database drop logfile group 5;
Database altered.
SQL> alter database drop logfile group 6;
Database altered.
SQL> alter database drop logfile group 7;
Database altered.
SQL> ALTER DATABASE ADD STANDBY LOGFILE thread 1 group 4 ('/oradata/orcl/sredo01.log') SIZE 100M reuse;
Database altered.
SQL> ALTER DATABASE ADD STANDBY LOGFILE thread 1 group 5 ('/oradata/orcl/sredo02.log') SIZE 100M reuse;
Database altered.
SQL> ALTER DATABASE ADD STANDBY LOGFILE thread 1 group 6 ('/oradata/orcl/sredo03.log') SIZE 100M reuse;
Database altered.
SQL> ALTER DATABASE ADD STANDBY LOGFILE thread 1 group 7 ('/oradata/orcl/sredo04.log') SIZE 100M reuse;
Database altered.
SQL> select group#,thread#,status,bytes/1024/1024 mb from v$standby_log;
GROUP# THREAD# STATUS MB
---------- ---------- ---------- ----------
4 1 UNASSIGNED 100
5 1 UNASSIGNED 100
6 1 UNASSIGNED 100
7 1 UNASSIGNED 100
8 1 ACTIVE 100
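After this, data synchronization returned to normal. The same operation normally needs to be carried out on the primary as well. While the primary and the standby are synchronizing in real time, redo reception keeps switching among two or more standby logfiles.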
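A few queries can confirm that real-time synchronization is actually back. They are a sketch added for verification and were not part of the original session; dest_id 2 is assumed to be the standby destination, matching the LOG_ARCHIVE_DEST_2 setting mentioned above.

```sql
-- On the standby: one standby log group should be ACTIVE with a current sequence#.
select group#, thread#, sequence#, status from v$standby_log;

-- On the standby: RFS should be receiving and MRP0 applying;
-- block# normally keeps advancing between runs when redo arrives in real time.
select process, status, thread#, sequence#, block#
from   v$managed_standby
where  process = 'RFS' or process like 'MRP%';

-- On the primary: recovery_mode reports how the standby is applying redo.
select dest_id, status, recovery_mode
from   v$archive_dest_status
where  dest_id = 2;
```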
--end--