Doc ID: 271448.1
Type: BULLETIN
Modified Date: 11-JUL-2008
Status: PUBLISHED
Best Practices - Oracle Data Guard Switchover & Failover
Introduction
Oracle Data Guard offers two easy-to-use methods to handle planned and unplanned outages of the production site: switchover and failover, respectively. Either can be initiated directly through SQL, through the Data Guard Manager GUI, or through the Data Guard Broker command-line interface (DGMGRL). This article highlights best practices for role transitions using Oracle9i Data Guard and Oracle 10g Data Guard.
Results Summary
Adequate understanding and testing of Data Guard switchover and failover are key to meeting service availability requirements. If a role transition does not have stringent time restrictions, Enterprise Manager is recommended because of its ease of use. With a physical standby database, we have observed switchover and failover times of less than a minute in 9i and 10g using the best practices. With a logical standby database, we have observed switchover and failover times of less than 30 seconds in 10g.
Role Management Best Practices - Details
These best practices were derived from testing on Oracle9i release 9.2.0.3 databases as part of the ongoing studies within the Maximum Availability Architecture (MAA) best practices and recommendations. Additional tests were performed with Oracle Database 10g release 10.1.0.2; the 10g tests also included Enterprise Manager Grid Control. Detailed comparisons of the timings are provided below. Note that in 10g, Data Guard offers the new real-time apply feature and, when the LOG_FILE_NAME_CONVERT database parameter is set, automatically pre-clears the standby database's online redo logs (not the standby redo logs, or SRLs), which further optimizes the role transition.
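As a minimal sketch of the two 10g features mentioned above, the statements below start Redo Apply with real-time apply on a physical standby and set LOG_FILE_NAME_CONVERT. The directory paths are hypothetical placeholders, and real-time apply assumes that standby redo logs are already configured on the standby.

```sql
-- On the 10g physical standby: start managed recovery with real-time apply,
-- so redo is applied from the standby redo logs as it arrives.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;

-- LOG_FILE_NAME_CONVERT is a static parameter, so the change is written to
-- the spfile and takes effect at the next instance restart.
-- Both paths are illustrative assumptions, not values from this article.
ALTER SYSTEM SET LOG_FILE_NAME_CONVERT =
  '/u01/oradata/prod/', '/u01/oradata/stby/'
  SCOPE = SPFILE;
```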
Switchover Best Practices
If the original production database is still accessible, you should always consider a Data Guard switchover. Switchover is a planned operation, switching database roles between the production and standby databases without needing to instantiate any of the databases. In contrast to switchover, a failover requires that the initial production database either be reinstantiated as a new standby database (in 9i), or flashed back (in 10g) and then brought back as a standby database.
Switchover can occur whenever a production database is started, the target standby database is available, and all the archived redo logs are available. It is useful in the following situations:
Scheduled maintenance such as hardware maintenance (e.g. hardware or firmware patches) on the primary server.
Resolution of data failures when the primary database is still open.
Testing and validating the standby resources, as a means to test disaster recovery readiness.
Physical Standby Switchover Best Practices
| Best Practice | 9i | 10g |
| --- | --- | --- |
| Clear the online redo logs for a new standby (following an instantiation or a switchover). | x | |
| Set the LOG_FILE_NAME_CONVERT database parameter. If no log file name conversion is required, still set it to a null string, since this triggers the automatic clearing. | | x |
| Use standby redo logs to reduce redo transfer time for unapplied redo. | x | x |
| Remove any apply delay (this may not be possible in 9i, depending on the delay requirements for the standby). | x | x |
| Use real-time apply. | | x |
| Issue the SWITCHOVER TO PRIMARY command immediately following a successful SWITCHOVER TO STANDBY command, in parallel with the shutdown/startup of the new standby database. This eliminates the serial wait for the new standby database shutdown and startup. | x | x |
| Follow the Oracle9i Media Recovery Best Practices white paper recommendations to obtain the optimal managed recovery apply rate. | x | x |
| Follow a pre-transition checklist (details in the MAA Detailed White Paper): check the status of log transport services; verify that there are no gaps; record the current online redo log thread and sequence number(s) on the primary and on the standby; for a RAC database, ensure that only a single instance is running; end all jobs and sessions on the remaining active production instance; validate the correct SWITCHOVER_STATUS. | x | x |
| Understand the factors that affect physical standby switchover time, test to obtain planned-outage timing estimates, and use a formula to estimate the switchover time. The detailed formula is beyond the scope of this article and may be addressed in a future white paper. The factors are: switchover to standby; switchover to primary; redo generation rate; redo apply rate; transport delay settings; network round-trip time (also known as network latency); primary and standby database shutdown and startup time; application shutdown time. | x | x |
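The preparation steps, pre-transition checklist, and switchover commands for a physical standby can be sketched in SQL roughly as follows. This is an illustrative outline, not the MAA white paper procedure: the log group numbers, file path, and size are hypothetical, and the exact restart steps around the role change differ between 9i and 10g, so check the documentation for your release.

```sql
-- Preparation on the standby -----------------------------------------------
-- 9i: clear an online redo log group ahead of time (group number is assumed).
ALTER DATABASE CLEAR LOGFILE GROUP 1;

-- Add a standby redo log group (group number, path, and size are assumed).
ALTER DATABASE ADD STANDBY LOGFILE GROUP 10
  ('/u01/oradata/stby/srl10.log') SIZE 100M;

-- Restart managed recovery without an apply delay.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY DISCONNECT FROM SESSION;

-- Pre-transition checks ----------------------------------------------------
-- On the primary: look for errors on the log transport destinations.
SELECT DEST_ID, STATUS, ERROR
  FROM V$ARCHIVE_DEST_STATUS
 WHERE STATUS <> 'INACTIVE';

-- On the standby: any row returned here indicates an archive gap.
SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;

-- On the primary: record the current online redo log thread and sequence.
SELECT THREAD#, SEQUENCE# FROM V$LOG WHERE STATUS = 'CURRENT';

-- On the standby: record the latest sequence received per thread.
SELECT THREAD#, MAX(SEQUENCE#) AS LAST_SEQUENCE
  FROM V$LOG_HISTORY
 GROUP BY THREAD#;

-- On each database: validate the switchover status before proceeding.
-- Typically TO STANDBY (or SESSIONS ACTIVE) on the primary.
SELECT SWITCHOVER_STATUS FROM V$DATABASE;

-- Role transition ----------------------------------------------------------
-- On the primary: convert to a physical standby, closing open sessions.
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;

-- On the standby: issue this right after the statement above succeeds,
-- while the old primary is being shut down and restarted as a standby.
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
```

Running the final statement in a second session, rather than waiting for the old primary to restart, is what removes the serial wait described in the table.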
Logical Standby Switchover Best Practices
| Best Practice | 9i | 10g |
| --- | --- | --- |
| Create database links in both directions during the logical standby creation process. | x | |
| Follow a pre-transition checklist (details in the MAA Detailed White Paper): execute a log switch; remove any apply delay; log off all users and end all jobs to reduce the time taken by the COMMIT TO SWITCHOVER TO LOGICAL STANDBY command. | x | x |
| Use real-time apply. | | x |
| Use prepared switchover commands to prebuild the data dictionary: on the primary database, the PREPARE TO SWITCHOVER TO LOGICAL STANDBY command; on the standby database, the PREPARE TO SWITCHOVER TO PRIMARY command. | | x |
| Follow the recommendations outlined in the Oracle9i Data Guard: SQL Apply Best Practices white paper. | x | x |
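For a 10g logical standby, the prepared switchover sequence might look like the sketch below. It assumes a single primary and a single logical standby, and it leaves out the release-specific step for removing an apply delay.

```sql
-- On the logical standby (10g): restart SQL Apply with real-time apply.
ALTER DATABASE STOP LOGICAL STANDBY APPLY;
ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;

-- On the primary: force a log switch as part of the pre-transition checklist.
ALTER SYSTEM ARCHIVE LOG CURRENT;

-- On the primary: prepare for the role change.
ALTER DATABASE PREPARE TO SWITCHOVER TO LOGICAL STANDBY;

-- On the logical standby: prepare to become the primary; this is the step
-- that prebuilds the data dictionary ahead of the transition.
ALTER DATABASE PREPARE TO SWITCHOVER TO PRIMARY;

-- On the primary: confirm the database is ready before committing.
SELECT SWITCHOVER_STATUS FROM V$DATABASE;

-- On the primary: complete the transition to the logical standby role.
ALTER DATABASE COMMIT TO SWITCHOVER TO LOGICAL STANDBY;

-- On the former standby: complete the transition to the primary role.
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

-- On the new logical standby: start SQL Apply again.
ALTER DATABASE START LOGICAL STANDBY APPLY;
```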
Failover Best Practices
Data Guard failover should be used only in an emergency, in response to an unplanned outage such as:
Site disaster
User errors
Widespread data failures
If the original production database is still accessible, you should always consider a Data Guard switchover.
Physical Standby Failover Best Practices
| Best Practice | 9i | 10g |
| --- | --- | --- |
| Clear the online redo logs for a new standby (following an instantiation or a switchover). | x | |
| Use standby redo logs to reduce the amount of data loss. | x | x |
| Use real-time apply. | | x |
| If the RFS processes are still active, most likely because the primary database can still be reached, either shut down the primary database or manually kill the RFS process(es). | x | x |
| Follow the Oracle9i Media Recovery Best Practices white paper recommendations to obtain the optimal managed recovery apply rate. | x | x |
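A hedged sketch of the core physical standby failover statements follows. It assumes standby redo logs are configured and that the primary is truly unavailable; the exact clauses (for example, whether standby redo logs must be skipped) and any restart afterwards depend on the release and configuration.

```sql
-- On the standby: check whether RFS processes are still active. If they are,
-- the primary is probably still reachable and should be shut down first.
SELECT PROCESS, STATUS, THREAD#, SEQUENCE#
  FROM V$MANAGED_STANDBY
 WHERE PROCESS = 'RFS';

-- On the standby: confirm there is no unresolved archive gap.
SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;

-- On the standby: apply all remaining redo, including redo held in the
-- standby redo logs, and end managed recovery.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;

-- On the standby: convert it to the primary role.
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
```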
Logical Standby Failover Best Practices
| Best Practice | 9i | 10g |
| --- | --- | --- |
| Use standby redo logs to reduce data loss. | | x |
| Use real-time apply. | | x |
| Follow the recommendations outlined in the Oracle9i Data Guard: SQL Apply Best Practices white paper. | x | x |
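For a 10g logical standby, a failover could be sketched as below. This is an assumption-laden outline: it presumes that all available redo has already been registered on the standby, and the FINISH APPLY clause should be verified against the SQL reference for your release.

```sql
-- On the logical standby: compare how far SQL Apply has progressed with the
-- newest redo received. Equal values mean all received redo is applied.
SELECT APPLIED_SCN, NEWEST_SCN FROM DBA_LOGSTDBY_PROGRESS;

-- On the logical standby: apply the remaining redo and activate the
-- database in the primary role.
ALTER DATABASE ACTIVATE LOGICAL STANDBY DATABASE FINISH APPLY;
```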
Further Information
Oracle Maximum Availability Architecture White Papers
Note.387266.1 Oracle10g: Data Guard Switchover and Failover Best Practices
Data Guard Documentation
11g Release 1 Data Guard Concepts and Administration
11g Release 1 Data Guard Broker
10g Release 2 Data Guard Concepts and Administration
10g Release 2 Data Guard Broker
10g Release 1 Data Guard Concepts and Administration
10g Release 1 Data Guard Broker
9i Release 2 Data Guard Concepts and Administration
9i Release 2 Data Guard Broker