PURPOSE
-------
The purpose of this document is to explain how fast reconfiguration
occurs in an Oracle Real Application Clusters environment.
SCOPE & APPLICATION
-------------------
This document is intended for Oracle Real Application Clusters database
administrators who would like to understand fast reconfiguration, a new
feature in 9i RAC.
Fast Reconfiguration in an Oracle Real Application Clusters Environment
-----------------------------------------------------------------------
Fast Reconfiguration is an enhancement added to 9i Real Application Clusters that
is designed to increase the availability of RAC instances when a reconfiguration
occurs. In Version 8 Oracle Parallel Server (OPS), reconfiguration of open DLM
locks/resources takes place under the following conditions:
1. An instance joins the cluster
2. An instance fails or leaves the cluster
3. A node is halted
In Version 8 this operation could be relatively instantaneous or could be delayed
for several minutes while "lock remastering" takes place. When an instance leaves
the cluster, lock remastering occurs: all open global locks/resources are deleted
from the departing instance, and the locks/resources on all instances are
redistributed evenly across the surviving instances (a rough sketch of this
behavior follows the list below). During this time no lock operations can occur
on the database when freeze_db_for_fast_instance_recovery = true (the default)
is set.

The amount of time that reconfiguration took depended primarily on the number of
open DLM locks/resources (usually higher with fixed locking) and on hardware
resources (memory, interconnect speed, CPUs). Reconfigurations often caused
performance bottlenecks, as there was a perceived hang while reconfiguration was
occurring and the lock database was frozen. These issues have been addressed in
9i RAC, where we attempt to:

1. Decrease the amount of time it takes to complete reconfiguration.
2. Allow some processes to continue work during reconfiguration.
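
To make the contrast concrete, here is a minimal Python sketch of the Version 8
style of remastering described above. It is purely illustrative: the function
name, the data structures, and the round-robin rehash rule are all invented for
this note and are not Oracle internals.

  # Illustrative only -- not Oracle code. In the Version 8 scheme every
  # open resource is dropped and redistributed evenly across survivors,
  # so the cost grows with the total number of open locks/resources.
  def full_remaster(masters, survivors):
      """masters: dict of resource name -> mastering instance.
      survivors: instances remaining after the reconfiguration."""
      new_masters = {}
      for i, resource in enumerate(sorted(masters)):
          # Every resource moves, even those mastered by a survivor.
          new_masters[resource] = survivors[i % len(survivors)]
      return new_masters

  # Instance 3 fails; all six resources are rehashed across 1 and 2.
  before = {"R1": 1, "R2": 1, "R3": 2, "R4": 2, "R5": 3, "R6": 3}
  print(full_remaster(before, survivors=[1, 2]))
  # {'R1': 1, 'R2': 2, 'R3': 1, 'R4': 2, 'R5': 1, 'R6': 2}

Note that R1 through R4 move even though their masters survived; that is
exactly the work lazy remastering (described next) avoids.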
Decreasing Reconfiguration Time in RAC:
There are several ways in which RAC decreases reconfiguration time. The first
change is that in 9i RAC fixed locks have been eliminated. This speeds up startup
time because fixed locks no longer need to be allocated at that time. Secondly,
instead of remastering all locks/resources across all nodes, we use an algorithm
called "lazy remastering" to remaster only a minimal number of locks/resources
during a reconfiguration. For a departing instance, we determine how best to
distribute only the locks/resources from the departing instance and a minimal
number from the surviving instances. For example:
Instances 1, 2, and 3 are currently up and running and master DLM resources:
---------------- ----------------- -----------------
| | | | | |
| Instance 1 | | Instance 2 | | Instance 3 |
| | | | | |
| Masters: | | Masters: | | Masters: |
| R1, R2 | | R3, R4 | | R5, R6 |
| | | | | |
---------------- ----------------- -----------------
Instance 3 crashes. In this example, Instances 1 and 2 keep their own resources
and take over mastering those of the departing instance: Instance 1 inherits
resource 5 and Instance 2 inherits resource 6.
---------------- ----------------- -----------------
| | | | | |
| Instance 1 | | Instance 2 | | |
| | | | | Instance 3: |
| Masters: | | Masters: | | Down |
| R1, R2, R5 | | R3, R4, R6 | | |
| | | | | |
---------------- ----------------- -----------------
So, instead of removing all resources and remastering them evenly across
instances, RAC remasters only the resources necessary (in this case, those owned
by the departing instance), a more efficient means of reconfiguration. A sketch
of this departure case follows.
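
As a minimal Python sketch of the departure case (illustrative only; the
round-robin assignment to survivors is an assumption, since the actual
selection algorithm is internal to Oracle):

  # Illustrative only -- not Oracle code. Lazy remastering on instance
  # failure: survivors keep what they already master; only the departing
  # instance's resources are redistributed.
  from itertools import cycle

  def lazy_remaster_departure(masters, departed, survivors):
      """Reassign only the resources mastered by the departed instance."""
      new_masters = dict(masters)      # survivors keep their resources
      targets = cycle(survivors)       # round-robin is an assumption here
      for resource, owner in sorted(masters.items()):
          if owner == departed:
              new_masters[resource] = next(targets)
      return new_masters

  # Matches the diagram above: Instance 1 inherits R5, Instance 2 R6.
  before = {"R1": 1, "R2": 1, "R3": 2, "R4": 2, "R5": 3, "R6": 3}
  print(lazy_remaster_departure(before, departed=3, survivors=[1, 2]))
  # {'R1': 1, 'R2': 1, 'R3': 2, 'R4': 2, 'R5': 1, 'R6': 2}

Only two of the six resources move, whereas the full remaster above moved
all six.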
When an instance joins the cluster, RAC remasters a limited number of resources
from the other instances to the new instance. For example:
---------------- ----------------- -----------------
| | | | | |
| Instance 1 | | Instance 2 | | |
| | | | | Instance 3: |
| Masters: | | Masters: | | Down |
| R1, R2, R3 | | R4, R5, R6 | | |
| | | | | |
---------------- ----------------- -----------------
Instance 3 joins the cluster. In this example, a minimal number of resources are
remastered to Instance 3 from the other instances; this is much faster than the
8i behavior of redistributing all resources. Instance 3 takes on resources 3 and
6 from Instances 1 and 2:
---------------- ----------------- -----------------
| | | | | |
| Instance 1 | | Instance 2 | | Instance 3 |
| | | | | |
| Masters: | | Masters: | | Masters: |
| R1, R2 | | R4, R5 | | R3, R6 |
| | | | | |
---------------- ----------------- -----------------
These examples show a very even distribution of resources, but they assume that
RAC is not using the primary/secondary node feature, where primary instances
master all of the resources (see Note 76632.1 for more on this). A sketch of
the join case follows.
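
Here is a comparable Python sketch of the join case, again purely illustrative;
the "hand off the last 1/N of each owner's list" rule is invented for the
example and is not the real selection logic.

  # Illustrative only -- not Oracle code. When an instance joins, only a
  # small share of resources migrates from existing masters to the
  # newcomer; everything else stays put.
  def lazy_remaster_join(masters, joiner, instances):
      """Move roughly 1/N of each existing master's resources to joiner."""
      new_masters = dict(masters)
      n = len(instances)               # cluster size after the join
      by_owner = {}
      for resource, owner in sorted(masters.items()):
          by_owner.setdefault(owner, []).append(resource)
      for owner, resources in by_owner.items():
          # Hand off the last 1/N of each owner's list (invented rule).
          for resource in resources[len(resources) - len(resources) // n:]:
              new_masters[resource] = joiner
      return new_masters

  # Matches the diagram above: Instance 3 takes on R3 and R6.
  before = {"R1": 1, "R2": 1, "R3": 1, "R4": 2, "R5": 2, "R6": 2}
  print(lazy_remaster_join(before, joiner=3, instances=[1, 2, 3]))
  # {'R1': 1, 'R2': 1, 'R3': 3, 'R4': 2, 'R5': 2, 'R6': 3}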
What reconfiguration looks like in the alert log:
Wed Apr 25 16:56:14 2001
Reconfiguration started
List of nodes: 1,
Lock DB frozen
one node partition
Communication channels reestablished
Server queues filtered
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Resources and locks cleaned out
Resources remastered 424
606 PCM locks traversed, 0 cancelled, 11 closed
311 PCM resources traversed, 0 cancelled
2989 PCM resources on freelist, 3300 on array, 3300 allocated
set master node info
606 PCM locks traversed, 0 replayed, 11 unopened
Submitted all remote-lock requests
Update rdomain variables
0 write requests issued in 595 PCM resources
0 PIs marked suspect, 0 flush PI msgs
Dwn-cvts replayed, VALBLKs dubious
All grantable locks granted
Wed Apr 25 16:56:14 2001
Reconfiguration complete
What reconfiguration looks like in an LMON trace:
*** 2001-04-25 16:56:14.565
Reconfiguration started
Synchronization timeout interval: 600 sec
List of nodes: 1,
Lock DB frozen
node 1
* kjshashcfg: I'm the only node in the cluster (node 1)
Active Sendback Threshold = 100 %
Communication channels reestablished
Server queues filtered
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Resources and locks cleaned out
Resources remastered 424
606 PCM locks traversed, 0 cancelled, 11 closed
311 PCM resources traversed, 0 cancelled
2989 PCM resources on freelist, 3300 on array, 3300 allocated
set master node info
606 PCM locks traversed, 0 replayed, 11 unopened
Submitted all remote-lock requests
Update rdomain variables
0 write requests issued in 595 PCM resources
0 PIs marked suspect, 0 flush PI msgs
Dwn-cvts replayed, VALBLKs dubious
All grantable locks granted
*** 2001-04-25 16:56:14.775
Reconfiguration complete
What reconfiguration looks like in a DIAG trace:
Reconfiguration starts [incarn=5]
I'm the master node
Reconfiguration completes [incarn=5]
Allowing Some Processes to Resume Work During Reconfiguration:
As previously described, with lazy remastering instances keep many of their
locks/resources during a reconfiguration, whereas in previous versions all
locks/resources were deleted from all instances. Because of this, many processes
can resume active work during a reconfiguration, since their locks/resources do
not have to be moved. A sketch of the idea follows.
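
A hypothetical Python sketch of why this helps (illustrative only; the real
mechanism is internal to the DLM, and the test below is invented):

  # Illustrative only. Because lazy remastering leaves most masters
  # unchanged, only work against a remastered resource needs to wait
  # for the move to complete; everything else can proceed.
  def must_wait(resource, old_masters, new_masters):
      # Hypothetical test: work blocks only if the resource moved.
      return old_masters[resource] != new_masters[resource]

  old = {"R1": 1, "R2": 1, "R5": 3}
  new = {"R1": 1, "R2": 1, "R5": 1}    # only R5 was remastered
  for r in old:
      print(r, "waits" if must_wait(r, old, new) else "continues")
  # R1 continues / R2 continues / R5 waits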
RELATED DOCUMENTS
-----------------
Note 114566.1 - OPS - Reconfiguration and Instance Startup in OPS
Note 139436.1 - Understanding 9i Real Application Clusters Cache Fusion
Note 144152.1 - Understanding 9i Real Application Clusters Cache Fusion Recovery