


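Oracle RAC provides both HA (High Availability) and LB (Load Balancing). The foundation of its high availability is failover: the failure of any single node in the cluster does not interrupt service, because users connected to the failed node are automatically moved to a healthy node. From the user's point of view, the switch goes unnoticed.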

Failover in Oracle 10g RAC falls into three categories:
1. Client-Side Connect Time Failover
2. Client-Side TAF
3. Server-Side TAF





There are many options when configuring Oracle Real Application Clusters (RAC) and Transparent Application Failover (TAF), many variations in hardware infrastructures and networks that support these configurations, and wide variation in customer workflows. ArcGIS supports RAC. TAF configurations vary widely, however, so a blanket statement cannot be made that ArcGIS supports TAF in all possible implementations. 


An important consideration for TAF failover is that ArcGIS connections obtain an in-memory lock that is session specific and managed by Oracle. This lock is used to ensure any schema or state locks are valid. When a failover occurs, the connection made to the surviving node is a new session and therefore results in the loss of the original lock. This can cause unexpected behavior in some editing scenarios. Loss of the original lock also allows modifications to the schema in read-only operations, which can result in unexpected behavior if schema locking is disabled on ArcGIS Server services that reference the data. Therefore, it is very important that thorough testing be done before implementing Oracle TAF with ArcGIS. 

Oracle RAC is in use at several Esri customer sites using geodatabases, and Esri has tested basic RAC and TAF functionality for failover behavior. The results of that testing are described in this article. 

Regardless of the configuration used, Esri strongly recommends that each customer perform thorough testing to ensure that all workflows and applications work as expected during failover scenarios.





Oracle RAC provides clustering and high-availability (HA) for Oracle databases, allowing Oracle relational database management system software on multiple server nodes to manage a single Oracle database, thereby providing a resilient architecture for the database services. This is typically combined with a resilient storage tier and Oracle client configuration to provide failover in the case of a server node failure. TAF is typically part of an Oracle RAC configuration, providing client-side functionality that allows clients to reconnect to surviving databases in the event of a failure of a database instance. 


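Client-Side TAF (Transparent Application Failover)
Most popular application servers today (for example WebLogic and JBoss) open a number of long-lived database connections at startup and reuse them for the whole life of the application. Client-Side Connect Time Failover only acts while a connection is being established, so it does little for the availability of such applications.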

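For that reason, Oracle introduced a new failover mechanism, TAF, starting with version 8.1.5. With TAF, once a connection has been established and the application is running, if an instance fails, the sessions connected to it are moved to a surviving instance without user intervention. This transparency comes with a condition, of course: the user's uncommitted transactions are rolled back. Under Client-Side Connect Time Failover, by contrast, the user program is simply interrupted.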

TAF is also simple to configure: add a FAILOVER_MODE entry to tnsnames.ora on the client. The entry has four sub-parameters to define.

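1. METHOD: defines when connections to other instances are created. There are two possible values, BASIC and PRECONNECT.
BASIC: the connection to another instance is created only once the node failure is detected.
PRECONNECT: connections to the other instances are created in advance, when the initial connection is made.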

Note: with server-side TAF the failover method cannot be set to PRECONNECT, only to BASIC; this holds for both 10g and 11g. If the failover method must be PRECONNECT, client-side TAF is the only option.

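Comparing the two methods: BASIC incurs a delay at failover time, while PRECONNECT avoids the delay but consumes more resources because it keeps redundant connections open.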

2. TYPE: defines how SQL statements are handled when a failure occurs. There are two types: SESSION and SELECT.

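With both types, uncommitted transactions are rolled back automatically. The difference lies in how SELECT statements are handled: with SELECT, a query the user is running is carried over to the new instance.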


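Clearly, to support the SELECT type Oracle must keep more state for each session, including cursors and user context. It needs more resources, which is again a trade of resources for time.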
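The METHOD and TYPE settings described above end up as a FAILOVER_MODE clause inside a tnsnames.ora entry. The following is a minimal sketch that renders such an entry from Python so the layout is easy to see. The alias `GDB_TAF`, the host names `rac1-vip` and `rac2-vip`, the service name `gdb`, and the helper `build_taf_entry` are all hypothetical. RETRIES, DELAY, and BACKUP are taken from the Oracle Net documentation rather than from this text, and the values 20 and 5 are arbitrary placeholders, not recommendations.

```python
def build_taf_entry(alias, hosts, service_name, port=1521,
                    failover_type="SELECT", method="BASIC",
                    retries=20, delay=5, backup=None):
    """Render a tnsnames.ora entry with a client-side FAILOVER_MODE clause.

    All names and default values here are illustrative assumptions.
    """
    failover_type = failover_type.upper()
    method = method.upper()
    if failover_type not in ("NONE", "SESSION", "SELECT"):
        raise ValueError(f"unknown TYPE: {failover_type}")
    if method not in ("BASIC", "PRECONNECT"):
        raise ValueError(f"unknown METHOD: {method}")
    # Oracle Net documentation says a BACKUP net service name should be
    # given with PRECONNECT; this sketch chooses to enforce that.
    if method == "PRECONNECT" and backup is None:
        raise ValueError("PRECONNECT needs a BACKUP net service name")

    lines = [
        f"{alias} =",
        "  (DESCRIPTION =",
        "    (ADDRESS_LIST =",
        "      (LOAD_BALANCE = ON)",
        "      (FAILOVER = ON)",
    ]
    # One ADDRESS per cluster node the client may connect to.
    for host in hosts:
        lines.append(
            f"      (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        )
    lines += [
        "    )",
        "    (CONNECT_DATA =",
        f"      (SERVICE_NAME = {service_name})",
        "      (FAILOVER_MODE =",
        f"        (TYPE = {failover_type})",
        f"        (METHOD = {method})",
    ]
    if backup is not None:
        lines.append(f"        (BACKUP = {backup})")
    lines += [
        f"        (RETRIES = {retries})",
        f"        (DELAY = {delay})",
        "      )",
        "    )",
        "  )",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_taf_entry("GDB_TAF", ["rac1-vip", "rac2-vip"], "gdb"))
```

Running the script prints an entry that can be compared against an existing tnsnames.ora file. The right values depend on the cluster and should come from the DBA.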





Test results  
Esri has found that setting the TAF failover type to Select provides the most highly available behavior when used with ArcGIS. Select allows applications that began fetching rows from a cursor before failover to continue fetching rows after failover. Any active transactions are rolled back at the time of failure because TAF cannot preserve active transactions after failover. 


When using Select failover, connections in ArcGIS for Desktop and ArcGIS Server switch to a surviving node for most simple operations such as zoom, pan, or refresh. There is a delay before the connection resumes, and that delay depends upon the infrastructure supporting the RAC as well as other configuration parameters. 

Testing also found that when using Select failover, connections are switched to a surviving node during simpler non-versioned and versioned edit sessions. 
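When repeating this kind of test, it can help to confirm from the database side which sessions have TAF enabled and which have already failed over. The sketch below is one possible check, not part of Esri's procedure. It reads the FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns of V$SESSION through any Python DB-API cursor. The helper name `failover_status` is hypothetical, and the connecting account is assumed to have SELECT privilege on V$SESSION.

```python
# V$SESSION covers the instance the cursor is connected to.
FAILOVER_STATUS_SQL = """
SELECT username, failover_type, failover_method, failed_over
  FROM v$session
 WHERE username IS NOT NULL
"""


def failover_status(cursor):
    """Return one dict per user session with its TAF settings.

    `cursor` is any DB-API 2.0 cursor opened against the Oracle instance.
    """
    cursor.execute(FAILOVER_STATUS_SQL)
    return [
        {
            "username": username,
            "failover_type": failover_type,
            "failover_method": failover_method,
            # Oracle reports FAILED_OVER as the strings YES or NO.
            "failed_over": failed_over == "YES",
        }
        for username, failover_type, failover_method, failed_over
        in cursor.fetchall()
    ]
```

Calling the function before and after stopping a node shows whether the ArcGIS sessions moved to the surviving instance and whether they still report the expected failover type.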


Larger bulk data loads fail, however, since active or in-progress transactions are automatically rolled back when using Select failover. 
Oracle performs this rollback to keep the data consistent.
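Because TAF rolls back the active transaction, a custom loading script (as opposed to ArcGIS itself) has to restart the unit of work after a failover. The following is a minimal sketch of that pattern, assuming a Python DB-API connection and assuming the driver surfaces the Oracle error code in the exception message. ORA-25402 (transaction must roll back) and ORA-25408 (can not safely replay call) are the codes checked here. The helper `run_transaction` and its parameters are hypothetical.

```python
import time

# Oracle errors that signal a TAF failover interrupted the current work.
RETRYABLE_CODES = ("ORA-25402", "ORA-25408")


def run_transaction(conn, work, attempts=3, pause=1.0):
    """Run work(conn) and commit, retrying the whole unit after a failover.

    Returns the number of the attempt that succeeded. Errors that are not
    failover related, or a failure on the last attempt, are re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            work(conn)
            conn.commit()
            return attempt
        except Exception as exc:
            retryable = any(code in str(exc) for code in RETRYABLE_CODES)
            if not retryable or attempt == attempts:
                raise
            # The session must roll back before it can be used again.
            conn.rollback()
            time.sleep(pause)
```

Retrying the whole unit of work, rather than resuming in the middle, matches the fact that nothing from the interrupted transaction survives the failover. For very large loads it may be more practical to commit in smaller batches so that less work is repeated.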





Other failover types  
TAF offers two other failover types: None and Session. When using either of these failover types, ArcGIS for Desktop connections fail if a node fails, and an error message similar to the following is returned: 
One or more layers failed to draw:
<user>.<layer>: Failure to access the DBMS server [ORA-03114: not connected to ORACLE]

Manual reconnection to the surviving node from ArcGIS for Desktop is required. 
If the primary node fails, ArcGIS for Server connects automatically to the surviving node when the next ArcGIS Server operation is performed, although there is a pause while the failover connection is made.
