That is, fast-start failover (FSFO).
http://docs.oracle.com/cd/B28359_01/server.111/b28295/cli.htm#BABEIIHD
You can enable fast-start failover from any site, including the observer site, while connected to any database in the broker configuration. Enabling fast-start failover does not trigger a failover. Instead, it allows the observer to begin observing the primary and standby databases and initiate a fast-start failover should conditions warrant a failover.
This section describes the steps to enable fast-start failover and start the observer:
· Ensure standby redo logs are configured on the primary and target standby databases.
· Ensure the LogXptMode property is set to SYNC.
· Set the FastStartFailoverTarget configuration property.
· Upgrade the protection mode to MAXAVAILABILITY, if necessary.
· Enable Flashback Database on the primary and target standby databases, if necessary.
· Start the observer.
· Enable fast-start failover.
· Verify the fast-start failover configuration.
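The broker-side steps in the list above can be sketched as a small Python helper that prints the DGMGRL commands in order. This is only an illustration: the function name is hypothetical, the database names DBMAST and DBSALVE are the ones used later in these notes, and the standby redo log, Flashback Database and observer steps are left out because they run in SQL*Plus or in a separate dgmgrl session.

```python
from typing import List


def fsfo_enable_commands(primary: str, standby: str) -> List[str]:
    """Return the DGMGRL commands for the broker-side steps, in order.

    Hypothetical helper: it only builds command text, it does not run dgmgrl.
    """
    commands = []
    # Redo transport must be synchronous on both databases.
    for db in (primary, standby):
        commands.append(f"EDIT DATABASE '{db}' SET PROPERTY LogXptMode='SYNC';")
    # Each database names the other one as its fast-start failover target.
    commands.append(
        f"EDIT DATABASE '{primary}' SET PROPERTY FastStartFailoverTarget='{standby}';"
    )
    commands.append(
        f"EDIT DATABASE '{standby}' SET PROPERTY FastStartFailoverTarget='{primary}';"
    )
    # Raise the protection mode, then enable and verify fast-start failover.
    commands.append("EDIT CONFIGURATION SET PROTECTION MODE AS MAXAVAILABILITY;")
    commands.append("ENABLE FAST_START FAILOVER;")
    commands.append("SHOW FAST_START FAILOVER;")
    return commands


for command in fsfo_enable_commands("DBMAST", "DBSALVE"):
    print(command)
```

The output can be pasted into a dgmgrl session; START OBSERVER is deliberately not in the list because it blocks the session that runs it.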
DG failover operations
A failover can be performed in three ways: manually with SQL, manually through DG Broker, or automatically through DG Broker (fast-start failover).
Flashback Database has to be enabled so that the original primary can be reinstated (REINSTATE) once it has been recovered.
Now shut down the primary with SHUTDOWN IMMEDIATE.
On the standby:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
alter database recover managed standby database disconnect from session;
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
SHUTDOWN IMMEDIATE;
STARTUP;
alter system switch logfile;
SELECT NAME,OPEN_MODE,PROTECTION_MODE,DATABASE_ROLE,SWITCHOVER_STATUS,DB_UNIQUE_NAME FROM V$DATABASE;
Shut down the primary
SQL> shutdown abort
DGMGRL> failover to 'DBSALVE';
Restart the original primary (dg2)
SQL> startup
ORACLE instance started.
ORA-16649: possible failover to another database prevents this database from being opened
SQL> select open_mode,database_role,db_unique_name,flashback_on from v$database;
OPEN_MODE            DATABASE_ROLE    DB_UNIQUE_NAME       FLASHBACK_ON
-------------------- ---------------- -------------------- ------------------
MOUNTED              PRIMARY          DBMAST               YES
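At this point DMON (the DG Broker monitor process) finds two primary databases, so the original primary can only be started in the MOUNT state.
DGMGRL> reinstate database 'DBMAST';
DGMGRL> connect sys/oracle@DBMAST;
Connected.
DGMGRL> show configuration;
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBMAST - Primary database
DBSALVE - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> enable fast_start failover;
Error: ORA-16651: requirements not met for enabling fast-start failover
Flashback Database has to be turned on first. This needs no restart, and the database does not have to be in the MOUNT state.
SQL> alter database flashback on;
Database altered.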
set linesize 1000
col db_unique_name format a15
col open_mode format a20
col flashback_on format a15
col database_role format a20
col dataguard_broker format a20
col protection_mode format a25
col switchover_status format a25
SQL> select open_mode,database_role,db_unique_name,flashback_on from v$database;
OPEN_MODE            DATABASE_ROLE        DB_UNIQUE_NAME  FLASHBACK_ON
-------------------- -------------------- --------------- ---------------
READ WRITE           PRIMARY              DBMAST          YES
On the standby, cancel managed recovery first
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
SQL> alter database flashback on;
Database altered.
SQL> alter database recover managed standby database using current logfile disconnect;
Database altered.
SQL> select flashback_on from v$database;
FLASHBACK_ON
---------------
YES
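Enable fast-start failover again
DGMGRL> connect sys/oracle@dbmast;
Connected.
DGMGRL> show configuration;
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBMAST - Primary database
DBSALVE - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> enable fast_start failover;
Enabled.
DGMGRL> show configuration;
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBMAST - Primary database
Warning: ORA-16819: fast-start failover observer not started
DBSALVE - (*) Physical standby database
Warning: ORA-16819: fast-start failover observer not started
Fast-Start Failover: ENABLED
Configuration Status:
WARNING
DGMGRL> start observer;
Observer started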
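The dgmgrl session that runs the observer has to stay up, so it is best started in the background, for example:
nohup dgmgrl sys/oracle@db1 'start observer' &
By default the observer creates a binary file, fsfo.dat, that stores the connection information for the primary and the standby. The file is created in the directory from which dgmgrl was invoked.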
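Because fsfo.dat lands in whatever directory dgmgrl was started from, it can help to pin the observer to a dedicated directory. The following Python sketch does the same job as the nohup line above, under some assumptions: dgmgrl is on the PATH, the host is Linux, and the function names, the directory argument and the observer.log file name are hypothetical choices made for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def observer_command(connect_string: str) -> List[str]:
    """Build the same command line as the nohup example, as an argv list."""
    return ["dgmgrl", connect_string, "start observer"]


def launch_observer(connect_string: str, work_dir: str) -> subprocess.Popen:
    """Start the observer detached from the terminal, inside work_dir.

    fsfo.dat is then created in work_dir, and the observer output is
    appended to work_dir/observer.log (a file name chosen for this sketch).
    """
    directory = Path(work_dir)
    directory.mkdir(parents=True, exist_ok=True)
    log_file = open(directory / "observer.log", "ab")
    return subprocess.Popen(
        observer_command(connect_string),
        cwd=directory,                 # controls where fsfo.dat is written
        stdin=subprocess.DEVNULL,
        stdout=log_file,
        stderr=subprocess.STDOUT,
        start_new_session=True,        # keep running after the shell exits (POSIX)
    )
```

A call such as launch_observer("sys/oracle@db1", "/home/oracle/observer") would then play the role of the nohup command; the directory shown here is only a placeholder.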
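Open a new window to check
DGMGRL> connect sys/oracle@dbmast;
Connected.
DGMGRL> show configuration;
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBMAST - Primary database
DBSALVE - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS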
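When this check has to be repeated, the SHOW CONFIGURATION output can be reduced to the three things that matter here: the fast-start failover state, the configuration status and any ORA- messages. The parser below is a minimal sketch that assumes the output is in English (an English NLS_LANG for the dgmgrl session) and has the layout shown in these notes; the function name and the returned dictionary keys are made up for the example.

```python
from typing import Dict


def parse_show_configuration(text: str) -> Dict[str, object]:
    """Extract FSFO state, configuration status and ORA- messages."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    result: Dict[str, object] = {"fsfo": None, "status": None, "messages": []}
    for index, line in enumerate(lines):
        if line.startswith("Fast-Start Failover:"):
            result["fsfo"] = line.split(":", 1)[1].strip()
        elif line.startswith("Configuration Status:"):
            # The status value is printed on the line after the label.
            if index + 1 < len(lines):
                result["status"] = lines[index + 1]
        elif line.startswith(("ORA-", "Warning: ORA-", "Error: ORA-")):
            result["messages"].append(line)
    return result


SAMPLE = """
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBMAST - Primary database
Warning: ORA-16819: fast-start failover observer not started
DBSALVE - (*) Physical standby database
Warning: ORA-16819: fast-start failover observer not started
Fast-Start Failover: ENABLED
Configuration Status:
WARNING
"""

info = parse_show_configuration(SAMPLE)
print(info["fsfo"], info["status"], len(info["messages"]))
```

With the sample taken from the state before the observer was started, the parser reports ENABLED, WARNING and two ORA-16819 messages; after the observer is up the same call should report SUCCESS with no messages.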
Check the observer information on the primary; the primary and the standby show the same values.
col FS_FAILOVER_OBSERVER_HOST for a30
select fs_failover_observer_present,fs_failover_observer_host,fs_failover_threshold from v$database;
FS_FAILOVER_OBSERVER_PRESENT FS_FAILOVER_OBSERVER_HOST      FS_FAILOVER_THRESHOLD
---------------------------- ------------------------------ ---------------------
YES                          DB-MASTER                      30
DGMGRL> SHOW FAST_START FAILOVER;
Fast-Start Failover: ENABLED
Threshold: 30 seconds
Target: DBSALVE
Observer: DB-MASTER
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Observer Reconnect: (none)
Observer Override: FALSE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
DGMGRL> show configuration verbose;
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBMAST - Primary database
DBSALVE - (*) Physical standby database
(*) Fast-Start Failover target
Properties:
FastStartFailoverThreshold ='30'
OperationTimeout ='30'
FastStartFailoverLagLimit ='30'
CommunicationTimeout ='180'
ObserverReconnect ='0'
FastStartFailoverAutoReinstate ='TRUE'
FastStartFailoverPmyShutdown ='TRUE'
BystandersFollowRoleChange ='ALL'
ObserverOverride ='FALSE'
ExternalDestination1 =''
ExternalDestination2 =''
PrimaryLostWriteAction ='CONTINUE'
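Fast-Start Failover: ENABLED
Threshold: 30 seconds
Target: DBSALVE
Observer: DB-MASTER
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Observer Reconnect: (none)
Observer Override: FALSE
Configuration Status:
SUCCESS
The related properties can be changed, for example:
edit configuration set property FastStartFailoverThreshold=120;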
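2.8 Verifying automatic failover
As mentioned earlier, a failover occurs in the following situations:
1) Instance Failure
2) Shutdown Abort
3) Offline Datafiles due to I/O error
4) Network disconnection
So here we simulate the primary going down: run shutdown abort on the primary, then check the state of the primary and the standby.
(1) First configure TAF on the client
Add the following entry to tnsnames.ora: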
DBMAST =
  (DESCRIPTION =
    (LOAD_BALANCE = OFF)
    (FAILOVER = ON)
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.200)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.202)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = DBMAST)
      (FAILOVER_MODE =
        (TYPE = select)
        (METHOD = basic)
        (RETRIES = 180)
        (DELAY = 5)
      )
    )
  )
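The intended meaning of these parameters is: wait 5 seconds between attempts, retry up to 180 times, and then connect to the 202 address. Load balancing is turned off.
TYPE=select means that a session with a query in progress can resume fetching after the failover.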
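As a rough check of what RETRIES and DELAY add up to, the retry window can be compared with the FastStartFailoverThreshold of 120 seconds set earlier. This is simple arithmetic under the assumption that each attempt fails immediately, so connect timeouts are ignored; the function name is made up for the example.

```python
def taf_retry_window_seconds(retries: int, delay: int) -> int:
    """Upper bound on the time spent waiting between retry attempts."""
    return retries * delay


window = taf_retry_window_seconds(retries=180, delay=5)
threshold = 120  # FastStartFailoverThreshold from the broker configuration

print(window)               # 900
print(window >= threshold)  # True
```

So the client is allowed to keep retrying for about 15 minutes, which comfortably covers the 2-minute failover threshold.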
FAILOVER_MODE parameter
FAILOVER_MODE must be placed inside the CONNECT_DATA section. It accepts the subparameters listed in the following table:
FAILOVER_MODE Subparameter | Description
BACKUP | Specify a different net service name for backup connections. A backup should be specified when using preconnect to pre-establish connections.
TYPE | Specify the type of failover. Three types of Oracle Net failover functionality are available by default to Oracle Call Interface (OCI) applications:
  · session: Set to failover the session. If a user's connection is lost, a new session is automatically created for the user on the backup. This type of failover does not attempt to recover selects.
  · select: Set to enable users with open cursors to continue fetching on them after failure. However, this mode involves overhead on the client side in normal select operations.
  · none: This is the default. No failover functionality is used. This can also be explicitly specified to prevent failover from happening.
METHOD | Determines how fast failover occurs from the primary node to the backup node:
  · basic: Set to establish connections at failover time. This option requires almost no work on the backup server until failover time.
  · preconnect: Set to pre-established connections. This provides faster failover but requires that the backup instance be able to support all connections from every supported instance.
RETRIES | Specify the number of times to attempt to connect after a failover. If DELAY is specified, RETRIES defaults to five retry attempts. Note: If a callback function is registered, then this subparameter is ignored.
DELAY | Specify the amount of time in seconds to wait between connect attempts. If RETRIES is specified, DELAY defaults to one second. Note: If a callback function is registered, then this subparameter is ignored.
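The allowed values and the RETRIES/DELAY defaults from the table can be written down as a small Python check. It is only a sketch of the rules as the table states them: the function name and the returned dictionary are hypothetical, METHOD is passed through without a default because the table does not give one, and nothing here talks to Oracle Net.

```python
from typing import Dict, Optional

VALID_TYPES = {"session", "select", "none"}
VALID_METHODS = {"basic", "preconnect"}


def resolve_failover_mode(
    type_: str = "none",
    method: Optional[str] = None,
    retries: Optional[int] = None,
    delay: Optional[int] = None,
) -> Dict[str, object]:
    """Validate FAILOVER_MODE values and apply the defaults from the table."""
    if type_ not in VALID_TYPES:
        raise ValueError(f"TYPE must be one of {sorted(VALID_TYPES)}")
    if method is not None and method not in VALID_METHODS:
        raise ValueError(f"METHOD must be one of {sorted(VALID_METHODS)}")
    # If DELAY is specified, RETRIES defaults to five attempts.
    if retries is None and delay is not None:
        retries = 5
    # If RETRIES is specified, DELAY defaults to one second.
    if delay is None and retries is not None:
        delay = 1
    return {"type": type_, "method": method, "retries": retries, "delay": delay}


print(resolve_failover_mode("select", "basic", retries=180, delay=5))
print(resolve_failover_mode("select", "basic", delay=5)["retries"])    # 5
print(resolve_failover_mode("select", "basic", retries=3)["delay"])    # 1
```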
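Failover verification
Simulate a crash of the primary
SQL> shutdown abort;
ORACLE instance shut down.
In the standby window
DGMGRL> connect sys/oracle@dbsalve;
Connected.
DGMGRL> show configuration verbose;
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBMAST - Primary database
DBSALVE - (*) Physical standby database
(*) Fast-Start Failover target
Properties: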
FastStartFailoverThreshold ='120'
OperationTimeout ='30'
FastStartFailoverLagLimit ='30'
CommunicationTimeout ='180'
ObserverReconnect ='0'
FastStartFailoverAutoReinstate ='TRUE'
FastStartFailoverPmyShutdown ='TRUE'
BystandersFollowRoleChange ='ALL'
ObserverOverride ='FALSE'
ExternalDestination1 =''
ExternalDestination2 =''
PrimaryLostWriteAction ='CONTINUE'
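Fast-Start Failover: ENABLED
Threshold: 120 seconds
Target: DBSALVE
Observer: DB-Salve
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Observer Reconnect: (none)
Observer Override: FALSE
Configuration Status:
ORA-01034: ORACLE not available
ORA-16625: cannot reach database "DBMAST"
DGM-17017: unable to determine configuration status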
The broker trace file shows:
04/21/2016 09:35:56
Failed to connect to remote database DBMAST. Error is ORA-01034
Failed to send message to site DBMAST. Error code is ORA-01034.
04/21/2016 09:37:10
FAILOVER TO DBSALVE
Beginning failover to database DBSALVE
Notifying Oracle Clusterware to teardown database for FAILOVER
04/21/2016 09:37:12
Notifying DMON of db close
DMON: Old primary "DBMAST" needs reinstatement
04/21/2016 09:37:21
Protection mode set to MAXIMUM AVAILABILITY
04/21/2016 09:37:23
Deferring associated archivelog destinations of sites permanently disabled due to Failover
Notifying Oracle Clusterware to buildup primary database after FAILOVER
Posting DB_DOWN alert ...
... with reason Data Guard Fast-Start Failover - Primary Disconnected
04/21/2016 09:37:24
Command FAILOVER TO DBSALVE completed
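The timestamps in this trace excerpt give a quick timeline of the failover. The sketch below only subtracts the timestamps exactly as they appear above (month/day/year); the first timestamp is the first connection error shown in the excerpt, which is not necessarily the moment the primary actually went down. The function and variable names are made up for the example.

```python
from datetime import datetime

TRACE_FORMAT = "%m/%d/%Y %H:%M:%S"  # timestamp layout used in the broker trace


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two broker trace timestamps."""
    started = datetime.strptime(start, TRACE_FORMAT)
    ended = datetime.strptime(end, TRACE_FORMAT)
    return (ended - started).total_seconds()


first_error = "04/21/2016 09:35:56"     # first "Failed to connect" entry
failover_start = "04/21/2016 09:37:10"  # "Beginning failover to database DBSALVE"
failover_done = "04/21/2016 09:37:24"   # "Command FAILOVER TO DBSALVE completed"

print(seconds_between(first_error, failover_start))   # 74.0
print(seconds_between(failover_start, failover_done)) # 14.0
```

In this run the failover itself took 14 seconds once it started, and it began 74 seconds after the first connection error recorded in the excerpt.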
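Finally, the observer output
DGMGRL> start observer;
Observer started
09:37:10.65 Thursday, April 21, 2016
Initiating Fast-Start Failover to database "DBSALVE"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "DBSALVE"
09:37:24.91 Thursday, April 21, 2016
What puzzles me is that the application process connected to the standby immediately instead of retrying the primary, so the configuration does not seem to do what I intended. In other words, after the primary went down and while the standby was still waiting out the 2-minute threshold, the application did not retry the primary 180 times. It connected to the standby right away, and only while the standby was still in read-only mode.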
Now bring the original primary back up
SQL> startup mount
ORACLE instance started.
Total System Global Area 446775296 bytes
Fixed Size 2254104 bytes
Variable Size 360712936 bytes
Database Buffers 79691776 bytes
Redo Buffers 4116480 bytes
Database mounted.
col name format a10;
col db_unique_name format a10;
col open_mode format a20;
col protection_mode format a20;
col database_role format a20;
col switchover_status format a20;
SQL> SELECT NAME,OPEN_MODE,PROTECTION_MODE,DATABASE_ROLE,SWITCHOVER_STATUS,DB_UNIQUE_NAME FROM V$DATABASE;
NAME       OPEN_MODE            PROTECTION_MODE      DATABASE_ROLE        SWITCHOVER_STATUS    DB_UNIQUE_
---------- -------------------- -------------------- -------------------- -------------------- ----------
DBMAST     MOUNTED              MAXIMUM AVAILABILITY PRIMARY              NOT ALLOWED          DBMAST
It still has the primary role.
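Output in the observer window
DGMGRL> --start observer;
10:15:12.53 Thursday, April 21, 2016
Initiating reinstatement for database "DBMAST"...
Reinstating database "DBMAST", please wait...
Error: ORA-16653: failed to reinstate database
Failed.
Reinstatement of database "DBMAST" failed
10:15:36.84 Thursday, April 21, 2016
DGMGRL> show configuration verbose;
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBSALVE - Primary database
Warning: ORA-16817: unsynchronized fast-start failover configuration
DBMAST - (*) Physical standby database (disabled)
ORA-16661: the standby database needs to be reinstated
(*) Fast-Start Failover target
Properties: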
FastStartFailoverThreshold ='120'
OperationTimeout ='30'
FastStartFailoverLagLimit ='30'
CommunicationTimeout ='180'
ObserverReconnect ='0'
FastStartFailoverAutoReinstate ='TRUE'
FastStartFailoverPmyShutdown ='TRUE'
BystandersFollowRoleChange ='ALL'
ObserverOverride ='FALSE'
ExternalDestination1 =''
ExternalDestination2 =''
PrimaryLostWriteAction ='CONTINUE'
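Fast-Start Failover: ENABLED
Threshold: 120 seconds
Target: DBMAST
Observer: DB-Salve
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Observer Reconnect: (none)
Observer Override: FALSE
Configuration Status:
WARNING
DGMGRL> show configuration;
Configuration - DG_BROKER_SALVE
Protection Mode: MaxAvailability
Databases:
DBSALVE - Primary database
DBMAST - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
ORA-16610: command "REINSTATE DATABASE DBMAST" in progress
DGM-17017: unable to determine configuration status
To be continued. There are currently three open questions:
Question 1: the client-side TAF configuration does not behave as intended.
Question 2: the original primary cannot be reinstated after it is brought back up.
Question 3: what is the following configuration, and where does it come from?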
Database trace file on the original primary
Fatal NI connect error 12514, connecting to:
(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.0.202)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DBSALVE_DGB)(INSTANCE_NAME=DBSALVE)(CID=(PROGRAM=oracle)(HOST=DB-MASTER)(USER=oracle))))
VERSION INFORMATION:
TNS for Linux: Version 11.2.0.4.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
Time: 21-APR-2016 10:41:18
Tracing not turned on.
Tns error struct:
ns main err code: 12564
TNS-12564: TNS:connection refused
ns secondary err code: 0
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Broker trace file
Failed to connect to remote database DBSALVE. Error is ORA-12514
Failed to send message to site DBSALVE. Error code is ORA-12514.
FAQ: if you accidentally close the window that was running START OBSERVER, run STOP OBSERVER first and then run START OBSERVER in a new window.