Oracle 11gR2 Data Guard: Switchover and Failover

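1. Switchover: a role change initiated deliberately by the administrator.
2. Failover: a forced role change performed after the primary database has failed.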

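Switchover: a switchover is normally a planned operation. No data is lost, the process is reversible, and the Data Guard configuration stays intact: every physical and logical standby that was part of the configuration keeps working afterwards.
Before switching over to a physical standby, check the following:
1. The network connection between the primary and the standby is working.
2. No active sessions are connected to the database.
3. The primary database is open and the standby database is mounted.
4. The standby database is in ARCHIVELOG mode.
5. If a redo apply delay has been configured, remove it.
6. The initialization parameters on both the primary and the standby are set so that Data Guard keeps working after the roles are swapped.
Switchover order: run the steps on the primary first, then on the standby.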
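
Several of the checks above can be confirmed from SQL*Plus before starting. The following is a minimal sketch rather than a complete checklist. It uses only the standard dynamic views v$database, v$archive_dest and v$session, and the interpretation given in the comments assumes a typical single-standby configuration.

```sql
-- Run on both primary and standby: role, open state and archive mode (items 3 and 4).
SELECT database_role, open_mode, log_mode, switchover_status
  FROM v$database;

-- Run on the primary: a non-zero DELAY_MINS on a standby destination
-- suggests that a redo apply delay is configured (item 5).
SELECT dest_id, destination, status, delay_mins, error
  FROM v$archive_dest
 WHERE target = 'STANDBY';

-- Run on the primary: user sessions that are still connected (item 2).
SELECT username, status, COUNT(*) AS session_count
  FROM v$session
 WHERE type = 'USER'
 GROUP BY username, status;
```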

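On the primary:
Because the primary is open and in use, SWITCHOVER_STATUS in v$database typically shows SESSIONS ACTIVE. The switchover from primary to standby has to be issued while the database is open, so add the WITH SESSION SHUTDOWN clause to the switchover command.
After the command completes, shut the database down, restart it to the MOUNT state so that it can receive redo, and then start redo apply.
The alert log shows what the primary did: it disconnected all sessions (rolling back uncommitted transactions), backed up the control file, switched and archived the log, shipped the redo to the standby, sent the standby an End-Of-Redo marker, converted itself to a standby, and was restarted to MOUNT.
Check the switchover status: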

SQL> select database_role,switchover_status from v$database;

Note: SWITCHOVER_STATUS may show SESSIONS ACTIVE or NOT ALLOWED.
SESSIONS ACTIVE means that sessions are still connected; in that case add WITH SESSION SHUTDOWN:

SQL> alter database commit to switchover to physical standby with session shutdown;
SQL> shutdown immediate;
SQL> startup nomount;
SQL> alter database mount standby database;

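On the standby:
Check whether the standby can become the primary. If SWITCHOVER_STATUS is RECOVERY NEEDED or SWITCHOVER LATENT, all archived logs must be applied before the switchover. If it is SESSIONS ACTIVE, add WITH SESSION SHUTDOWN. Once all redo has been applied, switch the database to the primary role and open it.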

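The alert log shows what the standby did: it stopped the ARCH processes, received the redo from the primary, received the End-Of-Redo marker, applied all remaining redo, cleared the online redo logs so that the database could be opened, converted itself to a primary, and opened the database.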
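
Before issuing the switchover on the standby, it can help to see what the apply and receive processes are doing. This is a hedged sketch using the 11g view v$managed_standby; the process name filters are an assumption about what is of interest here.

```sql
-- Run on the standby: state of the apply (MRP0) and receive (RFS) processes.
SELECT process, status, thread#, sequence#, block#
  FROM v$managed_standby
 WHERE process LIKE 'MRP%' OR process LIKE 'RFS%';

-- Run on the standby: TO PRIMARY (or SESSIONS ACTIVE) indicates it is ready
-- to take over; RECOVERY NEEDED means redo still has to be applied.
SELECT database_role, switchover_status
  FROM v$database;
```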

SQL> select database_role,switchover_status from v$database;
SQL> alter database commit to switchover to primary with session shutdown;
ERROR at line 1:
ORA-16139: media recovery required
SQL> alter database recover managed standby database disconnect from session;
Database altered.
SQL> select database_role,switchover_status from v$database;
DATABASE_ROLE    SWITCHOVER_STATUS
---------------- --------------------
PHYSICAL STANDBY TO PRIMARY
SQL> alter database commit to switchover to primary with session shutdown;
Database altered.
SQL> shutdown immediate;
SQL> startup;

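In this procedure the primary disconnects all sessions, archives its logs, ships the redo to the standby, and sends the End-Of-Redo marker, so a normal switchover loses no data.
After the switchover, archive a log on the new primary to verify that the switchover succeeded and that the standby receives redo normally.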
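
The verification described above might look like the following sketch. Which sequence numbers appear depends on the system, so no expected output is shown; the column aliases are illustrative.

```sql
-- On the new primary: archive the current log so a fresh archived log is shipped.
ALTER SYSTEM ARCHIVE LOG CURRENT;

-- On the new primary: the standby destination should be VALID with an empty ERROR.
SELECT dest_id, status, error
  FROM v$archive_dest
 WHERE target = 'STANDBY';

-- On the new standby: highest sequence received and applied for each thread.
SELECT thread#,
       MAX(sequence#) AS last_received,
       MAX(CASE WHEN applied = 'YES' THEN sequence# END) AS last_applied
  FROM v$archived_log
 GROUP BY thread#;
```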

Start redo apply (on the new standby, i.e. the original primary):

SQL> alter database recover managed standby database using current logfile disconnect from session;

 

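Failover: when the primary is down and cannot be used, the role change is a failover. In maximum performance mode, data may be lost.
On the standby:
In maximum protection or maximum availability mode, the failover can be run directly on the standby. In maximum performance mode, to keep data loss to a minimum, check whether the primary still holds logs that were never shipped to the standby; if so, copy them to the standby manually, register them, and recover. Note that in a RAC environment the archived logs are per thread.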
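
To decide which archived logs still have to be copied over, the standby can be asked what it has. A minimal sketch, assuming the standby is mounted; the result has to be compared by hand with the files left in the primary's archive directory.

```sql
-- On the standby: each row is a range of missing sequences for one thread.
SELECT thread#, low_sequence#, high_sequence#
  FROM v$archive_gap;

-- On the standby: last sequence registered per thread in the current incarnation
-- (in RAC there is one thread per instance, so check every thread).
SELECT thread#, MAX(sequence#) AS last_registered
  FROM v$archived_log
 WHERE resetlogs_change# = (SELECT resetlogs_change# FROM v$database)
 GROUP BY thread#;
```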

1. Stop redo apply

alter database recover managed standby database cancel;

2. Finish applying the redo already received by the standby

alter database recover managed standby database finish force;

3. Switch to the primary role

alter database commit to switchover to primary with session shutdown;

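If an archive gap exists at this step, the command fails with ORA-16139: Switchover: Media recovery required - standby not in limbo.
In a test, this happens when the primary is started before the standby and the role change is attempted before the standby has received all of the redo. In that case, force the role change: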

alter database activate physical standby database;

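4. Restart the database and bring it to the OPEN state

To apply logs that were never shipped (the maximum performance case described earlier), first copy the archived logs from the primary host to the standby host: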
[oracle@testdb dev01]$ scp * [email protected]:/u01/archive/dev01dg

Archived logs can be registered in two ways. RMAN is the simpler one, because it can register many files at once.

RMAN>catalog start with '/u01/archive/dev01';
SYS@dev01dg>alter database register logfile '/u01/archive/dev01dg/arch_e8fe6364_1_712757927_460.dbf';
There are also two ways to apply the archived logs.
SYS@dev01dg>alter database recover managed standby database disconnect from session;
Database altered.
SYS@dev01dg>recover standby database;
ORA-00279: change 2863819 generated at 03/20/2010 21:58:17 needed for thread 1
ORA-00289: suggestion : /u01/archive/dev01dg/arch_e8fe6364_1_712757927_461.dbf
ORA-00280: change 2863819 for thread 1 is in sequence #461

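Once all logs have been applied manually, the standby can fail over to the primary role. Because the standby never received the End-Of-Redo marker from the primary, a direct conversion fails and asks for media recovery. Issue the FINISH command to tell the standby that recovery is complete and that a failover is intended. Never use the FINISH option during a switchover, or the switchover turns into a failover.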

SYS@dev01dg> alter database commit to switchover to primary with session shutdown;
alter database commit to switchover to primary with session shutdown
*
ERROR at line 1:
ORA-16139: media recovery required
SYS@dev01dg> select database_role,switchover_status from v$database;
DATABASE_ROLE    SWITCHOVER_STATUS
---------------- --------------------
PHYSICAL STANDBY NOT ALLOWED
SYS@dev01dg>alter database recover managed standby database finish force;
Database altered.
SYS@dev01dg> select database_role,switchover_status from v$database;
DATABASE_ROLE    SWITCHOVER_STATUS
---------------- --------------------
PHYSICAL STANDBY TO PRIMARY
SYS@dev01dg>alter database commit to switchover to primary with session shutdown;
Database altered.
SYS@dev01dg>alter database open;
Database altered.
SYS@dev01dg> select database_role,switchover_status from v$database;
DATABASE_ROLE    SWITCHOVER_STATUS
---------------- --------------------
PRIMARY          SESSIONS ACTIVE

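After a failover the database has in effect been opened with RESETLOGS. If log_archive_format='arch_%d_%t_%r_%s.dbf', the archived log file names carry a new resetlogs ID and a restarted sequence number, which distinguishes them from the earlier archived logs.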
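
The new incarnation created by the failover can also be seen from the database itself. A small sketch using v$database_incarnation; the RESETLOGS_ID column is the value that %r expands to in log_archive_format.

```sql
-- On the new primary: the row with STATUS = 'CURRENT' is the incarnation
-- created when the database was opened after the failover.
SELECT incarnation#, resetlogs_change#, resetlogs_time, resetlogs_id, status
  FROM v$database_incarnation
 ORDER BY incarnation#;
```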

Supplement: the order of operations and the statements from the official 11g documentation
1. Switch the primary

SELECT SWITCHOVER_STATUS FROM V$DATABASE;
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
shutdown immediate;
startup nomount;
alter database mount standby database;

2. Switch the standby

SELECT SWITCHOVER_STATUS FROM V$DATABASE;
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
ALTER DATABASE OPEN;

3. Start redo apply (on the new standby, i.e. the original primary)

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;



Failover and Reinstate


DGMGRL> show configuration;


Configuration - mecbs_bk


  Protection Mode: MaxPerformance

  Databases:

    phub  - Primary database

    mecbs - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS


[oracle@node1 ~]$ dgmgrl sys/123123@MECBS

DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production


Copyright (c) 2000, 2009, Oracle. All rights reserved.


Welcome to DGMGRL, type "help" for information.

Connected.

DGMGRL> failover to mecbs;

Performing failover NOW, please wait...

Failover succeeded, new primary is "mecbs"


SQL> select database_role,switchover_status,open_mode from gv$database;

DATABASE_ROLE    SWITCHOVER_STATUS    OPEN_MODE
---------------- -------------------- --------------------
PRIMARY          NOT ALLOWED          READ WRITE
PRIMARY          NOT ALLOWED          READ WRITE



DGMGRL> show configuration;


Configuration - mecbs_bk


  Protection Mode: MaxPerformance

  Databases:

    mecbs - Primary database

    phub  - Physical standby database (disabled)

      ORA-16661: the standby database needs to be reinstated


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS


Reinstate:

DGMGRL> reinstate database phub;

Reinstating database "phub", please wait...

Operation requires shutdown of instance "PHUB" on database "phub"

Shutting down instance "PHUB"...

Unable to connect to database

ORA-12514: TNS:listener does not currently know of service requested in connect descriptor


Failed.

Warning: You are no longer connected to ORACLE.


Please complete the following steps and reissue the REINSTATE command:

shut down instance "PHUB" of database "phub"

start up and mount instance "PHUB" of database "phub"


DGMGRL> reinstate database phub;

Reinstating database "phub", please wait...

Error: ORA-16653: failed to reinstate database

Failed.

Reinstatement of database "phub" failed


SQL> select tablespace_name,block_size,status from dba_tablespaces;

TABLESPACE_NAME                BLOCK_SIZE STATUS
------------------------------ ---------- ---------
SYSTEM                               8192 ONLINE
SYSAUX                               8192 ONLINE
UNDOTBS1                             8192 ONLINE
TEMP                                 8192 ONLINE
USERS                                8192 ONLINE
UNDOTBS2                             8192 ONLINE
EXAMPLE                              8192 ONLINE
CRM                                  8192 ONLINE
AIX_TRANS                            8192 READ ONLY

The AIX_TRANS tablespace is read only, which is why the reinstate failed. The trace file shows:

Errors in file /u01/app/oracle/admin/PHUB/diag/rdbms/phub/PHUB/trace/PHUB_rsm0_5310.trc  (incident=44649):

ORA-00600: internal error code, arguments: [krhpfh_03-1204], [fno =], [11], [fhfno =], [9], [], [], [], [], [], [], []

ORA-01110: data file 11: '+DATA/phub/datafile/aix_trans.283.884273723'


Make the AIX_TRANS tablespace read-write again, so that its status returns to ONLINE:

SQL> select tablespace_name,block_size,status from dba_tablespaces;

TABLESPACE_NAME                BLOCK_SIZE STATUS
------------------------------ ---------- ---------
SYSTEM                               8192 ONLINE
SYSAUX                               8192 ONLINE
UNDOTBS1                             8192 ONLINE
TEMP                                 8192 ONLINE
USERS                                8192 ONLINE
UNDOTBS2                             8192 ONLINE
EXAMPLE                              8192 ONLINE
CRM                                  8192 ONLINE
AIX_TRANS                            8192 ONLINE
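
The transcript does not show the statement that changed the tablespace status. One way to take a tablespace out of read-only mode is sketched below; that this is what was run here, and that it was run on the current primary, are assumptions.

```sql
-- Assumed to be run on the open primary: return the tablespace to read-write.
ALTER TABLESPACE aix_trans READ WRITE;

-- Confirm the new status (READ ONLY should now read ONLINE).
SELECT tablespace_name, status
  FROM dba_tablespaces
 WHERE tablespace_name = 'AIX_TRANS';
```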



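Note: after creating the standby and finishing the datafile copy, be sure to enable Flashback Database on the standby; otherwise the reinstate operation cannot be completed. (A database can be reinstated after a failover only if Flashback Database was already enabled on it before the failover.)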
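
Whether Flashback Database is enabled can be checked on each database ahead of time. A minimal sketch; the parameter list is an assumption about the settings usually reviewed together with it.

```sql
-- Run on both primary and standby: YES means Flashback Database is enabled.
SELECT flashback_on FROM v$database;

-- Flashback logs are written to the fast recovery area, so these settings matter.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('db_recovery_file_dest',
                'db_recovery_file_dest_size',
                'db_flashback_retention_target');
```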


SQL> alter system set dg_broker_config_file1='+DATA/phub/dr1phub.dat' scope=both;

System altered.

SQL> alter system set dg_broker_config_file2='+DATA/phub/dr2phub.dat' scope=both;

System altered.

SQL> alter system set dg_broker_start=true scope=both;

System altered.

SQL> alter database flashback on;

Database altered.

DGMGRL> failover to mecbs;

Performing failover NOW, please wait...

Operation requires a connection to instance "MECBS1" on database "mecbs"

Connecting to instance "MECBS1"...

Connected.

Failover succeeded, new primary is "mecbs"


DGMGRL> reinstate database phub;

Reinstating database "phub", please wait...

Reinstatement of database "phub" succeeded

The alert log of phub during the reinstate:


Using STANDBY_ARCHIVE_DEST parameter default value as +RECO

Primary database is in MAXIMUM PERFORMANCE mode

RFS[1]: Assigned to RFS process 8806

RFS[1]: Selected log 9 for thread 2 sequence 6 dbid 1527329870 branch 884049437

Data Guard: Failover target was a Real Time Query standby; attempting to open this standby after reinstatement ...

ALTER DATABASE OPEN READ ONLY

Data Guard Broker initializing...

Data Guard Broker initialization complete

AUDIT_TRAIL initialization parameter is changed to OS, as DB is NOT compatible for database opened with read-only access

Sun Jul 05 22:06:40 2015

Primary database is in MAXIMUM PERFORMANCE mode

RFS[2]: Assigned to RFS process 8808

RFS[2]: Selected log 6 for thread 1 sequence 8 dbid 1527329870 branch 884049437

Sun Jul 05 22:06:40 2015

SMON: enabling cache recovery

Sun Jul 05 22:06:46 2015

Dictionary check beginning

Dictionary check complete

Database Characterset is AL32UTF8

Sun Jul 05 22:06:50 2015

RFS[3]: Assigned to RFS process 8815

RFS[3]: Selected log 5 for thread 1 sequence 7 dbid 1527329870 branch 884049437

Sun Jul 05 22:06:53 2015

RFS[4]: Assigned to RFS process 8817

RFS[4]: Selected log 8 for thread 2 sequence 5 dbid 1527329870 branch 884049437

No Resource Manager plan active

Sun Jul 05 22:06:54 2015

Archived Log entry 33 added for thread 1 sequence 7 ID 0x5c548cc4 dest 1:

Sun Jul 05 22:06:54 2015

Archived Log entry 34 added for thread 2 sequence 5 ID 0x5c548cc4 dest 1:

replication_dependency_tracking turned off (no async multimaster replication found)

Sun Jul 05 22:06:59 2015

Physical standby database opened for read only access.

Sun Jul 05 22:07:08 2015

RFS[3]: Opened log for thread 1 sequence 1 dbid 1527329870 branch 884049437

Archived Log entry 35 added for thread 1 sequence 1 rlc 884049437 ID 0x0 dest 2:

Sun Jul 05 22:07:09 2015

RFS[5]: Assigned to RFS process 8832

RFS[5]: Opened log for thread 1 sequence 2 dbid 1527329870 branch 884049437

Sun Jul 05 22:07:09 2015

Completed: ALTER DATABASE OPEN READ ONLY

Archived Log entry 36 added for thread 1 sequence 2 rlc 884049437 ID 0x5c548cc4 dest 2:

RFS[3]: Opened log for thread 1 sequence 3 dbid 1527329870 branch 884049437

Archived Log entry 37 added for thread 1 sequence 3 rlc 884049437 ID 0x5c548cc4 dest 2:

RFS[5]: Opened log for thread 1 sequence 4 dbid 1527329870 branch 884049437

RFS[3]: Opened log for thread 1 sequence 5 dbid 1527329870 branch 884049437

Archived Log entry 38 added for thread 1 sequence 4 rlc 884049437 ID 0x5c548cc4 dest 2:

Archived Log entry 39 added for thread 1 sequence 5 rlc 884049437 ID 0x5c548cc4 dest 2:

ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='PHUB';

ALTER SYSTEM SET log_archive_format='%t_%s_%r.dbf' SCOPE=SPFILE SID='PHUB';

ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';

ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';

ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';

ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';

ALTER SYSTEM SET fal_server='mecbs' SCOPE=BOTH;

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE

Attempt to start background Managed Standby Recovery process (PHUB)

Sun Jul 05 22:07:15 2015

MRP0 started with pid=45, OS id=8840 

MRP0: Background Managed Standby Recovery process started (PHUB)

Sun Jul 05 22:07:17 2015

db_recovery_file_dest_size of 5120 MB is 10.96% used. This is a

user-specified limit on the amount of space that will be used by this

database for recovery-related files, and does not reflect the amount of

space available in the underlying filesystem or ASM diskgroup.

 started logmerger process

Sun Jul 05 22:07:21 2015

Managed Standby Recovery starting Real Time Apply

Parallel Media Recovery started with 4 slaves

Media Recovery start incarnation depth : 1, target inc# : 7, irscn : 43400013426

Waiting for all non-current ORLs to be archived...

All non-current ORLs have been archived.

Sun Jul 05 22:07:23 2015

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_15.286.884296763

Identified End-Of-Redo (failover) for thread 1 sequence 15 at SCN 0xa.1ad7a672

Resetting standby activation ID 1549317464 (0x5c58b558)

Media Recovery End-Of-Redo indicator encountered

Media Recovery Continuing

Sun Jul 05 22:07:26 2015

RFS[2]: Selected log 5 for thread 1 sequence 9 dbid 1527329870 branch 884049437

Sun Jul 05 22:07:26 2015

Archived Log entry 40 added for thread 1 sequence 8 ID 0x5c548cc4 dest 1:

Sun Jul 05 22:07:36 2015

RFS[1]: Selected log 8 for thread 2 sequence 7 dbid 1527329870 branch 884049437

Sun Jul 05 22:07:36 2015

Archived Log entry 41 added for thread 2 sequence 6 ID 0x5c548cc4 dest 1:

Sun Jul 05 22:07:36 2015

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_1.260.884297229

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_2.280.884297229

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_2_seq_1.289.884296713

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_3.279.884297229

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_2_seq_2.288.884296713

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_4.278.884297233

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_2_seq_3.290.884296713

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_5.291.884297235

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_2_seq_4.287.884296755

Sun Jul 05 22:07:49 2015

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_6.275.884296775

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_2_seq_5.271.884297215

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_7.259.884297213

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_2_seq_6.276.884297255

Media Recovery Log +RECO/phub/archivelog/2015_07_05/thread_1_seq_8.274.884297245

Media Recovery Waiting for thread 1 sequence 9 (in transit)

Recovery of Online Redo Log: Thread 1 Group 5 Seq 9 Reading mem 0

  Mem# 0: +DATA/phub/onlinelog/group_5.260.884293665

  Mem# 1: +RECO/phub/onlinelog/group_5.619.884293671

Media Recovery Waiting for thread 2 sequence 7 (in transit)

Recovery of Online Redo Log: Thread 2 Group 8 Seq 7 Reading mem 0

  Mem# 0: +DATA/phub/onlinelog/group_8.264.884293703

  Mem# 1: +RECO/phub/onlinelog/group_8.617.884293707


  DGMGRL> show configuration;

Configuration - mecbs_bk

  Protection Mode: MaxPerformance

  Databases:

    mecbs - Primary database

    phub  - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:

SUCCESS