Data Guard 9i Configuring Transparent Application Failover in a Data Guard Environment [ID 205637.1]


 

Modified 19-OCT-2010     Type BULLETIN     Status PUBLISHED

 

 

PURPOSE
-------

 

When considering 9i Data Guard and the possible failure scenarios, the proper
configuration for redirecting new and existing connections from the failed
instance to the new primary is crucial.  The discussion below covers two
possible failover configurations:  connect time failover and application
failover.

 

SCOPE & APPLICATION
-------------------

 

This document is intended to aid in the configuration of Connect Time Failover
and Transparent Application Failover in a Data Guard environment.

 

Connect Time Failover
---------------------

 

Connect time failover reroutes incoming connections to the instance that has
just become primary.  This type of failover should work in cases where the old
primary node is down, the old primary network is down, the old primary listener
is down, or the old primary instance is now the standby.

 

When the old primary network is down, failover functionality is built into the
basic layer of Oracle Net.  The connection attempt simply waits for the TCP
timeout and then fails over to the next host in the address list.  The TCP
timeout settings therefore determine how quickly failover occurs.  However, the
basic configuration of connect time failover is not sufficient for the
remaining failure scenarios.

 

Consider the following service name:

 

   DGD =
     (DESCRIPTION =
       (ADDRESS_LIST =
         (ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu1)(PORT = 1521))
         (ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu2)(PORT = 1521))
       )
       (CONNECT_DATA =
         (SERVICE_NAME = DGD)
       )
     )

 

Using the above alias, a failover (graceful or forced) will work correctly if
the old primary instance or listener is down.  However, it does not work
correctly for the switchover scenario.  With switchover, the old primary is now
the standby and the old standby is now the primary.  When we issue the
connection to the old primary node, which is now running as a mounted standby,
we receive the following error:

 

   ORA-01033: ORACLE initialization or shutdown in progress

 

This is expected behavior.  Connect time failover is not programmed to fail
over on this error.

 

We can solve this by setting the following parameters in the init.ora files:

   Primary init.ora:  instance_name=DGD_P
   Standby init.ora:  instance_name=DGD_S

 

After a switchover, as the standby and primary databases are brought up, PMON
registers both the service_names and the instance_name with the listener.  The
TNS alias must also be changed so that it requests the instance name of the
primary:

 

   DGD =
     (DESCRIPTION =
       (ADDRESS_LIST =
         (ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu1)(PORT = 1521))
         (ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu2)(PORT = 1521))
       )
       (CONNECT_DATA =
         (INSTANCE_NAME = DGD_P)
         (SERVICE_NAME = DGD)
       )
     )

 

Now when we connect to the old primary/new standby, we get the following error:

 

   ORA-12521: TNS:listener could not resolve INSTANCE_NAME given in connect descriptor

 

At this point the connection fails over to the second host in the address list.
This final connection attempt succeeds because the proper instance_name (DGD_P)
is present.
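
To confirm which role and instance name each node currently has, the instance
can be queried directly.  The following is a minimal sketch, assuming a
SQL*Plus session connected AS SYSDBA on each node (a mounted standby accepts
only SYSDBA connections):

   -- Instance name currently in effect on this node
   SELECT value FROM v$parameter WHERE name = 'instance_name';

   -- Open state of the instance (for example MOUNTED or OPEN)
   SELECT status FROM v$instance;

   -- Current Data Guard role of the database
   SELECT database_role FROM v$database;

After a switchover, the node that reports the PRIMARY role should also report
DGD_P as its instance name.  If the mounted standby still reports DGD_P, the
listener on that node can still resolve the alias and the connection will not
fail over.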

 

Note that the DBA must either maintain two init.ora files to hold the separate
instance_name values, or change the parameter with the ALTER SYSTEM command
once the instance has opened.
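
The ALTER SYSTEM route might look like the following sketch, run on the new
primary.  It assumes that instance_name can be changed on a running instance in
your release, which the first query checks (the change is only possible if
ISSYS_MODIFIABLE is not FALSE).  If it cannot, the separate init.ora files
remain the fallback.

   -- Check whether the parameter can be changed without a restart
   SELECT name, issys_modifiable
     FROM v$parameter
    WHERE name = 'instance_name';

   -- Give the new primary the instance name that the DGD alias requests
   ALTER SYSTEM SET instance_name = 'DGD_P';

   -- Ask PMON to register with the listener now instead of at its next cycle
   ALTER SYSTEM REGISTER;

The old primary, now the standby, needs the corresponding change to DGD_S so
that it no longer registers under the name DGD_P.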

 

Application Failover
--------------------

 

For application failover, all existing connections from the current primary
must fail over to the new primary.  One of the biggest obstacles to overcome is
the lag time before the standby database becomes the primary database.

 

Client connections should continue to retry the failover until the standby has
been opened as the new production database.  This can be configured with an
alias similar to the following:

 

   DGD_TAF =
     (DESCRIPTION =
       (ADDRESS_LIST =
         (LOAD_BALANCE = OFF)
         (FAILOVER = ON)
         (ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu1)(PORT = 1521))
         (ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu2)(PORT = 1521))
       )
       (CONNECT_DATA =
         (SERVICE_NAME = DGD)
         (INSTANCE_NAME = DGD_P)
         (FAILOVER_MODE =
           (TYPE = SESSION)
           (METHOD = BASIC)
           (RETRIES = 180)
           (DELAY = 5)
         )
       )
     )

 

With this alias, TAF will try to fail over to the second node in the
address_list.  If it cannot connect, it will wait five seconds and retry.  It
will retry a total of 180 times, which gives a window of roughly 15 minutes
(180 retries x 5 seconds).  This delay gives the DBA time to perform a
switchover or activate the standby as the new production database.

 

This timing can be adjusted to suit your environment and should be tested
accordingly.
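
When testing, one way to check that sessions connecting through the DGD_TAF
alias actually picked up the failover settings is to query the failover columns
of V$SESSION on the primary.  This is a minimal sketch; the filter on username
is only there to hide background sessions:

   -- Show the TAF settings and failover state of each user session
   SELECT username, failover_type, failover_method, failed_over
     FROM v$session
    WHERE username IS NOT NULL;

Sessions using the alias should show SESSION and BASIC.  FAILED_OVER changes
from NO to YES once a session has been re-established on the new primary.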

 


 

 

 

 

 
