Oracle 19c Physical Standby Switchover Best Practices

Note: Failover steps and a worked example are included at the end.

1. Environment check phase

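-- Confirm that the latest PSU/Bundle patch (Release Update on 19c) is installed: Primary Note for Database Proactive Patch Program (Doc ID 888.1)

-- Installation and configuration check: confirm that the primary and standby run the same database version and that the alert logs show no errors.

-- Run the following queries on both the primary and the standby; neither should return any rows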
    
select * from v$database_block_corruption;

select * from v$nonlogged_block;
        
-- On the standby, confirm that it is in sync (apply lag and transport lag should both be +00 00:00:00)
    
col source_db_unique_name for a10
col name for a15
col value for a15
col unit for a30
col time_computed for a20
col datum_time for a20
set lines 300
--select * from gv$dataguard_stats where name in ('apply lag','transport lag') order by 1;
select name,value from gv$dataguard_stats where name in ('apply lag','transport lag') order by 1;
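
For a quick pass/fail check, the same view can be filtered down to the metrics that are not in sync. This is only a sketch, assuming a caught-up standby reports both lags as the string '+00 00:00:00' (as in the example output later in this document); under that assumption an empty result means the standby is in sync.

```sql
-- Run on the standby: list only lag metrics that are missing or non-zero.
-- No rows returned = transport lag and apply lag are both +00 00:00:00.
select inst_id, name, value, time_computed, datum_time
from gv$dataguard_stats
where name in ('apply lag','transport lag')
  and (value is null or value <> '+00 00:00:00')
order by inst_id, name;
```

A row with an empty VALUE usually deserves a closer look at redo transport before continuing.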

-- Check redo transport status on the primary

col dest_name for a20
col destination for a20
col error for a10
col alternate for a10
col type for a10
col status for a10
col valid_type for a15
col valid_role for a15
set lines 1000
select dest_name,destination,error,alternate,type,status,valid_type,valid_role from v$archive_dest where status <>'INACTIVE';

-- Check the latest archived log generated on the primary

select thread#, max(sequence#) "Last Primary Seq Generated" from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;

-- Check the latest archived log received on the standby from the primary (on RAC, one row is shown per thread)

select thread#, max(sequence#) "Last Standby Seq Received" from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;

-- Check the latest archived log sequence applied on the standby

select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;  

-- Check initialization parameters on the primary and standby

log_archive_config : must list the primary and the standby database
                     (with multiple standby databases, every standby must be listed)
fal_server         : remote service from which missing archived logs can be fetched
db_unique_name     : unique name of each database within this configuration
log_archive_dest_n : destination used to ship redo and archived logs to the remote database
compatible         : must be set to the same value on the primary and standby

If datafile and redo log file paths differ between the primary and standby, also set db_file_name_convert and log_file_name_convert.
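
The parameters listed above can be pulled in one query and compared side by side. This is a minimal sketch; the column widths are arbitrary, and the regular expression is just one way to keep only the log_archive_dest_n entries that are actually set.

```sql
-- Run on both the primary and the standby, then compare the two outputs.
col name for a30
col value for a80
set lines 200
select name, value
from v$parameter
where name in ('log_archive_config','fal_server','db_unique_name','compatible',
               'db_file_name_convert','log_file_name_convert')
   or (regexp_like(name, '^log_archive_dest_[0-9]+$') and value is not null)
order by name;
```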

2. Pre-Switchover phase:

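-- Confirm that redo and archived logs are being applied on the standby with no gap

Check the latest archived log sequence applied on the standby (this may not include the primary's current sequence; see view v$archived_log)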

select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;
              
Check the status of the MRP (Managed Recovery Process)

select inst_id,process from gv$managed_standby where process like 'MRP%';      

host ps -ef|grep -i mrp|grep -v grep

Check datafile and tempfile status (primary and standby)

SQL> SELECT NAME FROM V$DATAFILE WHERE STATUS='OFFLINE';
SQL> ALTER DATABASE DATAFILE 'datafile-name' ONLINE;

select tf.name filename, bytes, ts.name tablespace from v$tempfile tf, v$tablespace ts where tf.ts#=ts.ts#;

Check the online and standby redo log files

set lines 150
col member for a50
select a.thread#,a.group#,a.bytes,a.blocksize,b.type,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
select s.thread#,s.group#,s.status,s.bytes,l.type,l.member from v$logfile l,v$standby_log s where s.group#=l.group#;

Note: the standby redo log files should show a status of UNASSIGNED or ACTIVE.
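
Group counts and sizes are easier to compare in summarized form. The sketch below is not part of the original checklist; it rests on the general Data Guard guidance that standby redo logs should be the same size as the online redo logs, with at least one more group per thread.

```sql
-- Summarize online vs. standby redo log groups per thread.
-- Standby log groups not assigned to a thread may appear under thread# 0.
select 'ONLINE' log_type, thread#, count(*) group_count, min(bytes) min_bytes, max(bytes) max_bytes
from v$log
group by thread#
union all
select 'STANDBY' log_type, thread#, count(*), min(bytes), max(bytes)
from v$standby_log
group by thread#
order by 2, 1;
```

Run it on both sides, since the current primary needs standby redo logs too once it takes over the standby role.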

On the primary, check the archive destination status (whether the standby has a gap)

select status, gap_status from v$archive_dest_status where dest_id = 2;
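
The gap can also be checked from the standby side. A minimal sketch using v$archive_gap (the example session later in this document queries gv$archive_gap in the same way):

```sql
-- Run on the standby: each row is a range of archived log sequences
-- that the standby is missing. No rows returned = no gap.
select thread#, low_sequence#, high_sequence#
from v$archive_gap;
```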

If an apply delay is configured, switch the standby to apply redo without delay
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY;
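
To find out whether a delay is configured at all, the destination settings on the primary can be inspected first. A sketch, assuming the delay was set through the DELAY attribute of log_archive_dest_n:

```sql
-- Run on the primary: a non-zero delay_mins means a DELAY attribute
-- is set on that archive destination.
select dest_id, dest_name, delay_mins, status
from v$archive_dest
where status <> 'INACTIVE';
```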

3. Switchover phase:

Primary:

Switchover verification

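alter database switchover to target_db_unique_name verify;  -- target_db_unique_name is the standby's db_unique_name, e.g. fweidb_std in the example below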

If the verification succeeds, proceed with the following switchover steps:

(1) On the current primary, run

alter database switchover to target_db_unique_name;  -- same target name as in the verify step

Note: once the switchover completes, the old primary is shut down, i.e. it is an idle instance.

(2) On the new primary, run

conn / as sysdba
select database_role,open_mode from v$database; -- Note: right after the switchover the new primary has role PRIMARY and is in MOUNTED state
alter database open;
select database_role,open_mode from v$database;
alter pluggable database all open;
show pdbs

(3) Old primary

ADG (Active Data Guard) mode:

conn / as sysdba
startup;
select database_role,open_mode from v$database; -- Note: after the switchover the old primary is a physical standby in READ ONLY mode
alter pluggable database all open;
show pdbs

DG mode:

startup mount;

(4) Start the MRP on the new standby

alter database recover managed standby database using current logfile disconnect from session;

select database_role,open_mode from v$database;
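
Once both sides are up, a slightly wider query gives a fuller picture of the new roles. This is a sketch to run on each database; the switchover_status column is an addition not used elsewhere in this document, and its value depends on the role and on the state of redo transport.

```sql
-- Run on both the new primary and the new standby.
col name for a12
col db_unique_name for a20
set lines 200
select name, db_unique_name, database_role, open_mode, switchover_status
from v$database;
```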

4. Post-Switchover phase:

Primary:

alter system archive log current;
select dest_id,error,status from v$archive_dest where dest_id=2; -- 2 is the remote destination (log_archive_dest_2) in this setup
select max(sequence#),thread# from v$log_history group by thread#;
select max(sequence#)  from v$archived_log where applied='YES' and dest_id=2;

Standby:

select max(sequence#),thread# from v$archived_log group by thread#;
select name,role,instance,thread#,sequence#,action from gv$dataguard_process;

Example:


Primary node 1:

SQL> select * from v$database_block_corruption;

no rows selected

SQL> select * from v$nonlogged_block;

no rows selected

SQL>

Standby node 1:

SQL> select * from v$database_block_corruption;

no rows selected

SQL> select * from v$nonlogged_block;

no rows selected

SQL> col source_db_unique_name for a10
SQL> col name for a15
SQL> col value for a15
SQL> col unit for a30
SQL> col time_computed for a20
SQL> col datum_time for a20
SQL> set lines 300
SQL> --select * from gv$dataguard_stats where name in ('apply lag','transport lag') order by 1;
SQL> select name,value from gv$dataguard_stats where name in ('apply lag','transport lag') order by 1;

NAME            VALUE
--------------- ---------------
apply lag       +00 00:00:00
transport lag   +00 00:00:00

SQL>

Primary node 1:

SQL> col dest_name for a20
SQL> col destination for a20
SQL> col error for a10
SQL> col alternate for a10
SQL> col type for a10
SQL> col status for a10
SQL> col valid_type for a15
SQL> col valid_role for a15
SQL> set lines 1000
SQL> select dest_name,destination,error,alternate,type,status,valid_type,valid_role from v$archive_dest where status <>'INACTIVE';

DEST_NAME            DESTINATION          ERROR      ALTERNATE  TYPE       STATUS     VALID_TYPE      VALID_ROLE
-------------------- -------------------- ---------- ---------- ---------- ---------- --------------- ---------------
LOG_ARCHIVE_DEST_1   +ARCHDG                         NONE       PUBLIC     VALID      ALL_LOGFILES    ALL_ROLES
LOG_ARCHIVE_DEST_2   fweidb_std                      NONE       PUBLIC     VALID      ONLINE_LOGFILE  PRIMARY_ROLE

SQL> select thread#, max(sequence#) "Last Primary Seq Generated" from gv$archived_log val, gv$database vdb
  2  where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;

   THREAD# Last Primary Seq Generated
---------- --------------------------
         1                         55

SQL>

Standby node 1:

SQL> select thread#, max(sequence#) "Last Standby Seq Received" from gv$archived_log val, gv$database vdb
  2  where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;

   THREAD# Last Standby Seq Received
---------- -------------------------
         1                        55

SQL> select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb
  2  where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;  

   THREAD# Last Standby Seq Applied
---------- ------------------------
         1                       55

SQL> select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb
  2  where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;

   THREAD# Last Standby Seq Applied
---------- ------------------------
         1                       55

SQL> host ps -ef|grep -i mrp|grep -v grep
oracle     7022      1  0 14:19 ?        00:00:51 ora_mrp0_fweidb1

SQL>

Primary node 1:

SQL> select status, gap_status from v$archive_dest_status where dest_id = 2;

STATUS     GAP_STATUS
---------- ------------------------
VALID      NO GAP

SQL> alter database switchover to fweidb_std verify;

Database altered.

SQL> alter database switchover to fweidb_std;

Database altered.

SQL> conn / as sysdba
Connected to an idle instance.
SQL> 

Original standby node 1 (now the new primary):

SQL> conn / as sysdba
ERROR:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 104 Serial number: 33863


Connected.
SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PRIMARY          MOUNTED

SQL> alter database open;

Database altered.

SQL> alter pluggable database all open;

Pluggable database altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO
SQL> 

Original primary node 1 (now the new standby):

SQL> startup
ORACLE instance started.

Total System Global Area 1862268840 bytes
Fixed Size                  9136040 bytes
Variable Size             620756992 bytes
Database Buffers         1224736768 bytes
Redo Buffers                7639040 bytes
Database mounted.
Database opened.
SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY READ ONLY

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            MOUNTED
SQL> alter pluggable database all open;

Pluggable database altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ ONLY  NO
SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.

SQL> select inst_id,process from gv$managed_standby where process like 'MRP%';

   INST_ID PROCESS
---------- ---------
         1 MRP0

SQL> host ps -ef|grep -i mrp|grep -v grep
oracle   105791      1  1 17:15 ?        00:00:00 ora_mrp0_fweidb1

SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY READ ONLY WITH APPLY

SQL> 

Original standby node 1 (now the new primary):

SQL> select thread#,max(sequence#) from v$archived_log group by thread#;

   THREAD# MAX(SEQUENCE#)
---------- --------------
         1             57

SQL> alter system switch logfile;

System altered.

SQL> alter system archive log current;

System altered.

SQL> select thread#,max(sequence#) from v$archived_log group by thread#;

   THREAD# MAX(SEQUENCE#)
---------- --------------
         1             59

SQL> 

Original primary node 1 (now the new standby):

SQL> select thread#,max(sequence#) from v$archived_log group by thread#;

   THREAD# MAX(SEQUENCE#)
---------- --------------
         1             59

SQL> select * from gv$archive_gap;

no rows selected

SQL> col value for a18
SQL> col name for a25
SQL> col source_db_unique_name for a10
SQL> set lines 200
SQL> select * from v$dataguard_stats;

SOURCE_DBID SOURCE_DB_ NAME                      VALUE              UNIT                           TIME_COMPUTED                  DATUM_TIME                             CON_ID
----------- ---------- ------------------------- ------------------ ------------------------------ ------------------------------ ------------------------------ ----------
          0            transport lag             +00 00:00:00       day(2) to second(0) interval   04/14/2023 17:17:29            04/14/2023 17:17:28                         0
          0            apply lag                 +00 00:00:00       day(2) to second(0) interval   04/14/2023 17:17:29            04/14/2023 17:17:28                         0
          0            apply finish time                            day(2) to second(3) interval   04/14/2023 17:17:29               0
          0            estimated startup time    82                 second                         04/14/2023 17:17:29               0

SQL> 

Appendix - ADG disaster recovery Failover steps (all run on the standby):

-- Stop redo apply (cancel managed recovery):

alter database recover managed standby database cancel;

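-- If the cancel succeeds, run:

alter database failover to target_db_unique_name;  -- the db_unique_name of the standby being promoted, e.g. fweidb_std in the example below

-- If the failover statement returns an error, run: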

alter database activate physical standby database;

-- Open the new primary:

alter database open;

Example:

SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.

SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY MOUNTED

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> alter database open;

Database altered.

SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY READ ONLY

SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.

SQL> alter pluggable database all open;

Pluggable database altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ ONLY  NO
SQL> alter database recover managed standby database cancel;

Database altered.

SQL> alter database failover to fweidb_std;

Database altered.

SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PRIMARY          MOUNTED

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       MOUNTED
         3 PDB                            MOUNTED
SQL> alter database open;

Database altered.

SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PRIMARY          READ WRITE

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO
SQL> 
 
