18c & 19c Physical Standby Switchover Best Practices using SQL*Plus (Doc ID 2485237.1)
APPLIES TO:
Oracle Database - Enterprise Edition - Version 18.3.0.0.0 and later
Information in this document applies to any platform.
GOAL
This document explains the switchover steps for 18c and 19c physical standby databases.
SOLUTION
Prerequisites
Latest PSU/bundle patches
Master Note for Database Proactive Patch Program (Doc ID 756671.1)
Setup/configuration verification
- Primary and standby should be running the same version of RDBMS
- Verify the alert logfiles and make sure there are no errors
- Query v$database_block_corruption and v$nonlogged_block on the primary and the standby and make sure there is no corruption
- Make sure the primary and physical standby configuration is healthy and there are no errors in redo transport or redo apply.
Verify the Physical Standby Database Is Performing Properly
Optionally, use the queries below to check the redo transport and apply status.
On primary
To check the remote redo transport status: if there are any errors, V$ARCHIVE_DEST.ERROR will show the details.
SQL> col DEST_NAME for a20
SQL> col DESTINATION for a25
SQL> col ERROR for a15
SQL> col ALTERNATE for a20
SQL> set lines 1000
SQL> select DEST_NAME,DESTINATION,ERROR,ALTERNATE,TYPE,status,VALID_TYPE,VALID_ROLE from V$ARCHIVE_DEST where STATUS <>'INACTIVE';
To check the last archivelog created at the primary:
SQL> select thread#, max(sequence#) "Last Primary Seq Generated" from gv$archived_log val, gv$database vdb where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;
On Standby:
Using the query below, check the last archivelog received from the primary database (if the database is RAC, a result is displayed for each thread).
Query output is: last archive log sequence received by standby:
SQL> select thread#, max(sequence#) "Last Standby Seq Received" from gv$archived_log val, gv$database vdb where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;
Query output is: last archive log sequence applied by standby:
SQL> select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;
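The two standby queries above can be combined so that the received and applied sequences appear side by side for each thread. This is only a sketch built from the same views and predicates used above; the column aliases (last_received, last_applied, pending_apply) are illustrative names, not part of the original note.

```sql
-- Run on the standby. Compares, per thread, the highest sequence received
-- with the highest sequence applied for the current incarnation.
select r.thread#,
       r.last_received,
       a.last_applied,
       r.last_received - nvl(a.last_applied, 0) as pending_apply
from  (select val.thread#, max(val.sequence#) as last_received
       from   v$archived_log val, v$database vdb
       where  val.resetlogs_change# = vdb.resetlogs_change#
       group  by val.thread#) r
left join
      (select val.thread#, max(val.sequence#) as last_applied
       from   v$archived_log val, v$database vdb
       where  val.resetlogs_change# = vdb.resetlogs_change#
       and    val.applied in ('YES','IN-MEMORY')
       group  by val.thread#) a
on     a.thread# = r.thread#
order  by r.thread#;
```

A pending_apply value that stays at or near zero suggests redo apply is keeping up with the logs received.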
Verify Initialization Parameters
Mainly, the parameters below should be configured correctly:
log_archive_config : should include the primary and standby databases (if multiple standby databases exist, all of them should be included)
fal_server : remote server from which archivelogs can be fetched
db_unique_name : unique name within this configuration
log_archive_dest_n : destination used to ship archive logs/redo to the remote database.
Once switchover completes, the new primary uses its log_archive_dest_n configuration to send archive logs/redo
Ensure 'compatible' is set to the same value at primary and standby
If the file locations differ between primary and standby, use db_file_name_convert and log_file_name_convert for datafiles and redo logfiles respectively
Refer: Set Primary Database Initialization Parameters
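As one possible way to review these parameters in a single pass, the query below reads them from v$parameter. It is a sketch only; the regular expression that selects the log_archive_dest_n entries is my own choice, and the output still has to be compared manually between primary and standby.

```sql
-- Run on both the primary and the standby, then compare the output.
-- Lists the parameters discussed above that currently have a value.
select name, value
from   v$parameter
where  (name in ('db_unique_name', 'compatible', 'log_archive_config',
                 'fal_server', 'db_file_name_convert', 'log_file_name_convert')
        or regexp_like(name, '^log_archive_dest_[0-9]+$'))
and    value is not null
order  by name;
```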
Understand and Test Fallback Options
Check: A.4 Problems Switching Over to a Physical Standby Database
Pre-Switchover
Ensure the prerequisites are completely verified, and follow the guidance below to have a successful switchover.
These steps should be executed before the actual planned outage starts, to make sure there are no issues.
Verify Redo/Archive log apply is good and there are no gaps
Run the query below on the physical standby to check the last archive log sequence received and applied for each thread. This will not include the current sequence, because the SQL extracts its details from v$archived_log.
SQL> select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;
Check the MRP process status (it should be running and applying the logs)
SQL> select * from gv$dataguard_process;
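Since the full view returns every Data Guard process, a narrower query can make the MRP check easier to read. The sketch below assumes the managed recovery process is reported with a name beginning with MRP (for example MRP0); the column list matches the one used later in this note.

```sql
-- Run on the standby. Shows only the managed recovery process rows.
select inst_id, name, role, instance, thread#, sequence#, action
from   gv$dataguard_process
where  name like 'MRP%';
```

If no rows are returned, managed recovery is most likely not running and should be started before proceeding.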
Commands to stop and start the managed recovery process:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
If, for any reason, standby database recovery (MRP) was started with a delay, or the standby is always maintained with a lag, the switchover will spend additional time applying logs to bring the standby in sync.
Before switchover, try to maintain a minimal archive log apply lag, which reduces the total switchover time window.
Verify the apply delay configurations
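One way to look at the current lag and any configured apply delay is sketched below. It uses v$dataguard_stats on the standby and the DELAY_MINS column of v$archive_dest on the primary; what counts as an acceptable lag is site-specific and is not defined by this note.

```sql
-- On the standby: current transport and apply lag.
select name, value, time_computed
from   v$dataguard_stats
where  name in ('transport lag', 'apply lag');
-- On the primary: apply delay (in minutes) configured on each active destination.
select dest_id, dest_name, delay_mins
from   v$archive_dest
where  status <> 'INACTIVE';
```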
If the archive log gap is huge, then:
1) Monitor redo transport services to make sure there is no transport lag (refer: Monitoring Redo Transport Services)
2) The standby can also be recovered using an incremental backup taken from the primary (refer: Restoring and Recovering Files Over the Network)
Check the datafiles and tempfiles status
All datafiles are expected to be online on both the primary and the standby. If any files are offline or otherwise not in online status, restore and recover them so that the standby database files match the primary database files.
If files were taken offline and they need to be online after the switchover, bring them online:
SQL> SELECT NAME FROM V$DATAFILE WHERE STATUS='OFFLINE';
SQL> ALTER DATABASE DATAFILE 'datafile-name' ONLINE;
For Tempfiles:
SQL> select tf.name filename, bytes, ts.name tablespace from v$tempfile tf, v$tablespace ts where tf.ts#=ts.ts#;
If the listed tempfiles are sufficient for the application, no action is needed.
If more tempfiles need to be added, check the primary as well and add the additional files.
Online and standby redo logfile configuration
Online redo logfile (ORL):
set lines 150
col member for a50
select a.thread#,a.group#,a.bytes,a.blocksize,b.type,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
When the above command is executed on the primary, a.status can be INACTIVE, ACTIVE or CURRENT.
On the standby, the expected a.status is UNUSED, CLEARING or CLEARING_CURRENT. If the output shows a different status, the redo logfiles need to be cleared manually.
For Standby redo logfile(SRL):
select s.thread#,s.group#,s.status,s.bytes,l.type,l.member from v$logfile l,v$standby_log s where s.group#=l.group#;
Standby redo logfile status should be UNASSIGNED or ACTIVE.
Command to clear an ORL group (replace group_number with the group# to be cleared):
SQL> ALTER DATABASE CLEAR LOGFILE GROUP group_number;
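When several groups need clearing, the statements can be generated from v$log instead of being typed one by one. The following is a sketch under the assumption that any ORL status other than UNUSED, CLEARING or CLEARING_CURRENT on the standby is a candidate for clearing; the alias clear_cmd is illustrative. Managed recovery has to be stopped before the generated statements are executed.

```sql
-- Run on the standby. Generates one CLEAR LOGFILE statement per ORL group
-- whose status is not one of the expected standby values.
select 'ALTER DATABASE CLEAR LOGFILE GROUP ' || group# || ';' as clear_cmd
from   v$log
where  status not in ('UNUSED', 'CLEARING', 'CLEARING_CURRENT')
order  by thread#, group#;
```

Review the generated statements before running them; the query itself changes nothing.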
If an ORL or SRL needs to be cleared in the standby, the managed recovery process has to be stopped first.
If the ORLs are not cleared before switchover time, the SWITCHOVER command will clear the ORLs and then start the database, but the switchover will take longer to complete.
If the wait exceeds 15 minutes, the session is killed by a timeout. If the switchover is terminated due to the timeout, retry until the switchover is successful.
If the database is configured to use OMF files for redo logfiles, or log_file_name_convert is set, the online redo logfiles are cleared automatically when the managed recovery process is started.
If the file locations are the same on the primary and the standby, configure log_file_name_convert with the same value for both the search string and the replacement string
Example: log_file_name_convert='dummy','dummy'
To manage standby redo logfiles, refer: Managing Standby Redo Logs
Checking the alert logfiles
1) From the primary alert logfile, make sure:
* There are no issues reported for redo transport
* There are no password file issues
* There are no TNS or connection issues
2) From the standby alert logfile, make sure:
* There are no errors related to managed recovery
* Recovery is moving forward by applying the archive logs / redo logs
* There are no TNS or connection issues
* There are no I/O issues or corruption issues
select * from v$database_block_corruption; -- should return no rows
select * from v$nonlogged_block; -- should return no rows
Check the archive log gap and redo apply delay
You must configure the LOG_ARCHIVE_DEST_n and LOG_ARCHIVE_DEST_STATE_n parameters for each standby database so that when a switchover or failover occurs, all standby sites continue to receive redo data from the new primary database.
Execute the command below on the primary database, assuming log_archive_dest_2 is configured for redo shipping:
SQL> SELECT STATUS, GAP_STATUS FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID = 2;
STATUS should be VALID
GAP_STATUS should be NO GAP
If a different result is reported, the switchover should NOT be attempted.
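Where the remote destination is not log_archive_dest_2, or there is more than one standby, a variant that does not hard-code the destination number may help. This is a sketch; filtering on TYPE and STATUS is my own choice for excluding local and unused destinations. The second query uses v$archive_gap on the standby as an additional cross-check.

```sql
-- On the primary: status and gap status of every active non-local destination.
select dest_id, db_unique_name, status, gap_status, error
from   v$archive_dest_status
where  status <> 'INACTIVE'
and    type <> 'LOCAL';
-- On the standby: any archive gap currently detected (no rows means no gap).
select thread#, low_sequence#, high_sequence#
from   v$archive_gap;
```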
If a delay is configured, stop the managed recovery process and then start it again without the delay
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY;
If the delay is not removed, the switchover will take longer.
Refer: Specifying a Time Delay for the Application of Archived Redo Log Files
Switchover:
While doing the switchover, if standby connections need to be maintained without disconnecting, set the parameter STANDBY_DB_PRESERVE_STATES to SESSION or ALL
Verify the switchover
Execute the SQL below on the primary, where target_db_name is the DB_UNIQUE_NAME of the standby that will become the new primary. If the verification is successful, a "Database altered" message is returned.
SQL> ALTER DATABASE SWITCHOVER TO target_db_name VERIFY;
In case of an error, fix the issue and then rerun the switchover verify command.
Switchover steps
If the switchover verify is successful, execute the command to switch over the database.
1) Execute on the current primary (target_db_name is the DB_UNIQUE_NAME of the target standby)
SQL> ALTER DATABASE SWITCHOVER TO target_db_name;
If step 1 is successful, follow step 2 to open the new primary database.
2) Execute on the new primary database
SQL> ALTER DATABASE OPEN;
3) The old primary (now the new standby) should be mounted or opened, depending on the case.
If the standby is an Oracle Active Data Guard physical standby:
SQL> STARTUP;
If the standby is NOT an Oracle Active Data Guard physical standby:
SQL> STARTUP MOUNT;
4) Start redo apply on the new standby
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
Post Switchover
In primary:
Check whether the archivelogs are being transferred to the standby and applied (replace remote_dest_id with the number of the remote destination)
SQL> alter system archive log current;
SQL> select dest_id,error,status from v$archive_dest where dest_id=remote_dest_id;
SQL> select max(sequence#),thread# from v$log_history group by thread#;
If the remote log archive destination is 2, i.e. log_archive_dest_2:
SQL> select max(sequence#) from v$archived_log where applied='YES' and dest_id=2;
In standby:
Verify the archivelog availability and the application of the archivelog files
SQL> select max(sequence#),thread# from v$archived_log group by thread#;
SQL> select name,role,instance,thread#,sequence#,action from gv$dataguard_process;
Additionally, the alert logfiles can be checked to confirm the archivelog transfer and archivelog apply in the standby
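As a final sanity check, the role of each database can be confirmed directly from v$database. This is a small sketch, not part of the original procedure; run it on both sites.

```sql
-- Run on both databases after the switchover.
select name, db_unique_name, database_role, open_mode, switchover_status
from   v$database;
```

The new primary should report DATABASE_ROLE as PRIMARY, and the old primary should report PHYSICAL STANDBY.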