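Note: Failover steps and an example are included at the end of this document.
1. Environment check phase
-- Confirm that the latest PSU/Bundle patch is installed: Primary Note for Database Proactive Patch Program (Doc ID 888.1)
-- Installation and configuration check: confirm that the primary and standby run the same database version and that the alert logs show no errors.
-- Run the following queries on both the primary and the standby; they should return no rows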
select * from v$database_block_corruption;
select * from v$nonlogged_block;
-- On the standby, confirm it is in sync (apply lag and transport lag should both be zero, i.e. +00 00:00:00, as in the example output further below)
col source_db_unique_name for a10
col name for a15
col value for a15
col unit for a30
col time_computed for a20
col datum_time for a20
set lines 300
--select * from gv$dataguard_stats where name in ('apply lag','transport lag') order by 1;
select name,value from gv$dataguard_stats where name in ('apply lag','transport lag') order by 1;
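To compare the lag against a threshold, the interval string can be converted to seconds. This is a minimal sketch, assuming VALUE holds a day-to-second interval string such as +00 00:00:00 (as in the example output later in this document); the alias lag_seconds is illustrative only.

```sql
-- Sketch: convert apply/transport lag to seconds (run on the standby).
-- lag_seconds is NULL when VALUE is NULL, i.e. the lag has not been computed.
select name,
       value,
       extract(day    from to_dsinterval(value)) * 86400
     + extract(hour   from to_dsinterval(value)) * 3600
     + extract(minute from to_dsinterval(value)) * 60
     + extract(second from to_dsinterval(value)) as lag_seconds
from   v$dataguard_stats
where  name in ('apply lag','transport lag')
order  by name;
```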
-- Check redo transport status on the primary
col dest_name for a20
col destination for a20
col error for a10
col alternate for a10
col type for a10
col status for a10
col valid_type for a15
col valid_role for a15
set lines 1000
select dest_name,destination,error,alternate,type,status,valid_type,valid_role from v$archive_dest where status <>'INACTIVE';
-- Check the latest archived log generated on the primary
select thread#, max(sequence#) "Last Primary Seq Generated" from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;
-- Check the latest primary archived log received by the standby (on RAC, one row is shown per thread)
select thread#, max(sequence#) "Last Standby Seq Received" from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;
-- Check the latest archived log sequence applied on the standby
select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;
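The two standby-side queries above can also be combined, so that received and applied sequences appear side by side per thread. This is a sketch only; the column aliases last_received and last_applied are illustrative and not part of the original procedure.

```sql
-- Sketch: received vs. applied sequence per thread, run on the standby.
-- A thread where last_applied is behind last_received still has redo to apply.
select val.thread#,
       max(val.sequence#) as last_received,
       max(case when val.applied in ('YES','IN-MEMORY')
                then val.sequence# end) as last_applied
from   v$archived_log val, v$database vdb
where  val.resetlogs_change# = vdb.resetlogs_change#
group  by val.thread#
order  by val.thread#;
```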
-- Check initialization parameters on the primary and standby
log_archive_config : must list the primary and the standby database
(if multiple standby databases exist, all of them must be listed)
fal_server : remote server from which missing archived logs can be fetched
db_unique_name : name that is unique within this configuration
log_archive_dest_n : destination used to send redo/archived logs to the remote database
compatible : must be set to the same value on the primary and standby
If the datafile and redo log file paths differ between the primary and standby, the parameters db_file_name_convert and log_file_name_convert must also be set.
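The parameters listed above can be reviewed in one pass with a query such as the following, run on both the primary and the standby. This is a sketch; log_archive_dest_2 is an assumption based on the dest_id = 2 used elsewhere in this document, so adjust the destination number to your configuration.

```sql
-- Sketch: list the Data Guard related parameters for a side-by-side comparison.
-- log_archive_dest_2 is assumed to be the remote destination.
col name for a30
col value for a100
select name, value
from   v$parameter
where  name in ('log_archive_config', 'fal_server', 'db_unique_name',
                'log_archive_dest_2', 'compatible',
                'db_file_name_convert', 'log_file_name_convert')
order  by name;
```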
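2. Pre-Switchover phase:
-- Confirm that redo and archived log apply on the standby is healthy and there is no gap
Check the latest archived log sequence applied on the standby (this may not include the primary's current sequence; see the view v$archived_log)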
select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;
Check the status of the MRP (Managed Recovery Process)
select inst_id,process from gv$managed_standby where process like 'MRP%';
or
host ps -ef|grep -i mrp|grep -v grep
Check the status of datafiles and tempfiles (on both primary and standby); bring any offline datafile online
SQL> SELECT NAME FROM V$DATAFILE WHERE STATUS='OFFLINE';
SQL> ALTER DATABASE DATAFILE 'datafile-name' ONLINE;
select tf.name filename, bytes, ts.name tablespace from v$tempfile tf, v$tablespace ts where tf.ts#=ts.ts#;
Check the online and standby redo log files
set lines 150
col member for a50
select a.thread#,a.group#,a.bytes,a.blocksize,b.type,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
select s.thread#,s.group#,s.status,s.bytes,l.type,l.member from v$logfile l,v$standby_log s where s.group#=l.group#;
Note: the standby redo log files should be in status UNASSIGNED or ACTIVE.
On the primary, check the archive destination status (whether the standby has a gap)
select status, gap_status from v$archive_dest_status where dest_id = 2;
If an apply delay is configured, switch to real-time (no-delay) apply
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY;
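Before changing the apply mode, it can help to confirm whether a delay is configured at all. This is a minimal sketch run on the primary, assuming the delay was set through the DELAY attribute of a log_archive_dest_n parameter; a delay_mins of 0 means no delay is configured for that destination.

```sql
-- Sketch: show the configured apply delay (in minutes) per active destination.
select dest_id, dest_name, delay_mins
from   v$archive_dest
where  status <> 'INACTIVE'
order  by dest_id;
```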
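3. Switchover phase:
Primary:
Switchover verification (the target standby's db_unique_name followed by VERIFY is appended, as in the example further below)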
alter database switchover to
If the verification above succeeds, proceed with the following Switchover steps:
(1) On the current primary, run
alter database switchover to
Note: once the switchover completes, the current primary is shut down, i.e. it is an idle instance.
(2) On the new primary, run
conn / as sysdba
select database_role,open_mode from v$database; -- Note: right after the switchover the new primary has role PRIMARY and is in MOUNTED state
alter database open;
select database_role,open_mode from v$database;
alter pluggable database all open;
show pdbs
(3) Old primary
ADG mode:
conn / as sysdba
startup;
select database_role,open_mode from v$database; -- Note: after the switchover the old primary is a PHYSICAL STANDBY in READ ONLY state
alter pluggable database all open;
show pdbs
DG mode:
startup mount;
(4) Start the MRP on the new standby
alter database recover managed standby database using current logfile disconnect from session;
select database_role,open_mode from v$database;
4. Post-Switchover phase:
Primary:
alter system archive log current;
select dest_id,error,status from v$archive_dest where dest_id=
select max(sequence#),thread# from v$log_history group by thread#;
select max(sequence#) from v$archived_log where applied='YES' and dest_id=2;
Standby:
select max(sequence#),thread# from v$archived_log group by thread#;
select name,role,instance,thread#,sequence#,action from gv$dataguard_process;
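As an additional sanity check after the role change, the role and switchover status can be read from v$database on each side, and the new standby can be checked for gaps. This is a sketch; these queries supplement the checks listed above and are not part of the original procedure.

```sql
-- Sketch: run on both databases to confirm the roles have swapped.
select db_unique_name, database_role, open_mode, switchover_status
from   v$database;

-- Sketch: run on the new standby; "no rows selected" means no archive gap was detected.
select * from v$archive_gap;
```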
Example:
Primary node 1:
SQL> select * from v$database_block_corruption;
no rows selected
SQL> select * from v$nonlogged_block;
no rows selected
SQL>
Standby node 1:
SQL> select * from v$database_block_corruption;
no rows selected
SQL> select * from v$nonlogged_block;
no rows selected
SQL> col source_db_unique_name for a10
SQL> col name for a15
SQL> col value for a15
SQL> col unit for a30
SQL> col time_computed for a20
SQL> col datum_time for a20
SQL> set lines 300
SQL> --select * from gv$dataguard_stats where name in ('apply lag','transport lag') order by 1;
SQL> select name,value from gv$dataguard_stats where name in ('apply lag','transport lag') order by 1;
NAME VALUE
--------------- ---------------
apply lag +00 00:00:00
transport lag +00 00:00:00
SQL>
Primary node 1:
SQL> col dest_name for a20
SQL> col destination for a20
SQL> col error for a10
SQL> col alternate for a10
SQL> col type for a10
SQL> col status for a10
SQL> col valid_type for a15
SQL> col valid_role for a15
SQL> set lines 1000
SQL> select dest_name,destination,error,alternate,type,status,valid_type,valid_role from v$archive_dest where status <>'INACTIVE';
DEST_NAME DESTINATION ERROR ALTERNATE TYPE STATUS VALID_TYPE VALID_ROLE
-------------------- -------------------- ---------- ---------- ---------- ---------- --------------- ---------------
LOG_ARCHIVE_DEST_1 +ARCHDG NONE PUBLIC VALID ALL_LOGFILES ALL_ROLES
LOG_ARCHIVE_DEST_2 fweidb_std NONE PUBLIC VALID ONLINE_LOGFILE PRIMARY_ROLE
SQL> select thread#, max(sequence#) "Last Primary Seq Generated" from gv$archived_log val, gv$database vdb
2 where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;
THREAD# Last Primary Seq Generated
---------- --------------------------
1 55
SQL>
Standby node 1:
SQL> select thread#, max(sequence#) "Last Standby Seq Received" from gv$archived_log val, gv$database vdb
2 where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;
THREAD# Last Standby Seq Received
---------- -------------------------
1 55
SQL> select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb
2 where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;
THREAD# Last Standby Seq Applied
---------- ------------------------
1 55
SQL> select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb
2 where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;
THREAD# Last Standby Seq Applied
---------- ------------------------
1 55
SQL> host ps -ef|grep -i mrp|grep -v grep
oracle 7022 1 0 14:19 ? 00:00:51 ora_mrp0_fweidb1
SQL>
Primary node 1:
SQL> select status, gap_status from v$archive_dest_status where dest_id = 2;
STATUS GAP_STATUS
---------- ------------------------
VALID NO GAP
SQL> alter database switchover to fweidb_std verify;
Database altered.
SQL> alter database switchover to fweidb_std;
Database altered.
SQL> conn / as sysdba
Connected to an idle instance.
SQL>
Standby node 1 (becomes the new primary):
SQL> conn / as sysdba
ERROR:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 104 Serial number: 33863
Connected.
SQL> select database_role,open_mode from v$database;
DATABASE_ROLE OPEN_MODE
---------------- --------------------
PRIMARY MOUNTED
SQL> alter database open;
Database altered.
SQL> alter pluggable database all open;
Pluggable database altered.
SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB READ WRITE NO
SQL>
Primary node 1 (old primary, restarted as the new standby):
SQL> startup
ORACLE instance started.
Total System Global Area 1862268840 bytes
Fixed Size 9136040 bytes
Variable Size 620756992 bytes
Database Buffers 1224736768 bytes
Redo Buffers 7639040 bytes
Database mounted.
Database opened.
SQL> select database_role,open_mode from v$database;
DATABASE_ROLE OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY READ ONLY
SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB MOUNTED
SQL> alter pluggable database all open;
Pluggable database altered.
SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB READ ONLY NO
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
SQL> select inst_id,process from gv$managed_standby where process like 'MRP%';
INST_ID PROCESS
---------- ---------
1 MRP0
SQL> host ps -ef|grep -i mrp|grep -v grep
oracle 105791 1 1 17:15 ? 00:00:00 ora_mrp0_fweidb1
SQL> select database_role,open_mode from v$database;
DATABASE_ROLE OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY READ ONLY WITH APPLY
SQL>
Standby node 1 (now the new primary):
SQL> select thread#,max(sequence#) from v$archived_log group by thread#;
THREAD# MAX(SEQUENCE#)
---------- --------------
1 57
SQL> alter system switch logfile;
System altered.
SQL> alter system archive log current;
System altered.
SQL> select thread#,max(sequence#) from v$archived_log group by thread#;
THREAD# MAX(SEQUENCE#)
---------- --------------
1 59
SQL>
Primary node 1 (now the new standby):
SQL> select thread#,max(sequence#) from v$archived_log group by thread#;
THREAD# MAX(SEQUENCE#)
---------- --------------
1 59
SQL> select * from gv$archive_gap;
no rows selected
SQL> col value for a18
SQL> col name for a25
SQL> col source_db_unique_name for a10
SQL> set lines 200
SQL> select * from v$dataguard_stats;
SOURCE_DBID SOURCE_DB_ NAME VALUE UNIT TIME_COMPUTED DATUM_TIME CON_ID
----------- ---------- ------------------------- ------------------ ------------------------------ ------------------------------ ------------------------------ ----------
0 transport lag +00 00:00:00 day(2) to second(0) interval 04/14/2023 17:17:29 04/14/2023 17:17:28 0
0 apply lag +00 00:00:00 day(2) to second(0) interval 04/14/2023 17:17:29 04/14/2023 17:17:28 0
0 apply finish time day(2) to second(3) interval 04/14/2023 17:17:29 0
0 estimated startup time 82 second 04/14/2023 17:17:29 0
SQL>
Appendix - ADG disaster-recovery Failover steps (all executed on the standby):
-- Stop redo apply (cancel primary/standby synchronization):
alter database recover managed standby database cancel;
-- If that succeeds, run:
alter database failover to
-- If it fails with an error, run:
alter database activate physical standby database;
-- Open the new primary:
alter database open;
Example:
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
SQL> select database_role,open_mode from v$database;
DATABASE_ROLE OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY MOUNTED
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> alter database open;
Database altered.
SQL> select database_role,open_mode from v$database;
DATABASE_ROLE OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY READ ONLY
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
SQL> alter pluggable database all open;
Pluggable database altered.
SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB READ ONLY NO
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> alter database failover to fweidb_std;
Database altered.
SQL> select database_role,open_mode from v$database;
DATABASE_ROLE OPEN_MODE
---------------- --------------------
PRIMARY MOUNTED
SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED MOUNTED
3 PDB MOUNTED
SQL> alter database open;
Database altered.
SQL> select database_role,open_mode from v$database;
DATABASE_ROLE OPEN_MODE
---------------- --------------------
PRIMARY READ WRITE
SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB READ WRITE NO
SQL>