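Oracle 11g Data Guard physical standby configuration series
Oracle 11g Data Guard physical standby configuration (1): creating the standby with duplicate
Oracle 11g Data Guard physical standby configuration (2): Active Data Guard testing
Oracle 11g Data Guard physical standby configuration (3): Data Guard broker configuration
Oracle 11g Data Guard physical standby configuration (4): broker snapshot standby testing
Oracle 11g Data Guard physical standby configuration (5): broker switchover testing
Oracle 11g Data Guard physical standby configuration (6): broker fast-start failover testing
Oracle 11g Data Guard configuration: study summary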
http://koumm.blog.51cto.com/703525/1280139
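This article tests fast-start failover with the Oracle 11g Data Guard broker.
Configuring fast-start failover in Oracle 11g Data Guard requires Flashback Database to be enabled on both the primary and the standby; enabling flashback is not covered here.
If a database is restarted into mount state to enable flashback, do not shut down the listeners on the primary or the standby in the process.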
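Since the flashback setup is skipped here, it may help to confirm that Flashback Database is really on for both databases before enabling fast-start failover. The following is a minimal sketch, assuming bash and a local sqlplus login as sysdba; the helper name flashback_is_on is hypothetical, and it only inspects the FLASHBACK_ON value read from v$database.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the output of
#   select flashback_on from v$database;
# on stdin and succeeds only if the value is YES.
flashback_is_on() {
  local value
  # Trim whitespace, drop blank lines, keep the first remaining line.
  value=$(sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' | head -n 1)
  [ "$value" = "YES" ]
}

# Usage (run once on the primary host and once on the standby host):
#   sqlplus -s / as sysdba <<'EOF' | flashback_is_on && echo "flashback is on"
#   set heading off feedback off pagesize 0
#   select flashback_on from v$database;
#   EOF
```

Any value other than YES (for example NO or RESTORE POINT ONLY) makes the helper return a non-zero status.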
1. Check the primary and standby status with dgmgrl
$ dgmgrl sys/oracle
DGMGRL for Linux: Version 11.2.0.3.0 - 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> show configuration;
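Configuration - dgorcldb
Protection Mode: MaxPerformance
Databases:
orcl - Primary database
slave - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
# Fast-start failover is currently disabled.
# Check the fast-start failover settings; DISABLED means it has not been enabled.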
DGMGRL> show fast_start failover
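Fast-Start Failover: DISABLED
Threshold: 30 seconds
Target: (none)
Observer: (none)
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
2. Enable fast-start failover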
DGMGRL> enable fast_start failover;
3. Start the fast-start failover observer
DGMGRL> start observer;
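Note: after start observer, the observer does not run in the background; it stays in the foreground of the DGMGRL session and prints its messages there.
In real use it should be started separately on a server and left running in the background. It must not be stopped, because without it the state of the primary and standby is no longer monitored automatically
and fast-start failover cannot take place.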
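One way to keep the observer running after the terminal is closed is to launch dgmgrl under nohup and send its messages to a log file. This is a minimal sketch, assuming bash; the helper name start_observer_bg, the DGMGRL_BIN override, the connect string sys/oracle@orcl and the log path are illustrative assumptions, not part of the original setup. Note that a password given on the command line is visible in the process list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: starts the observer detached from the terminal.
# $1 = connect string (e.g. sys/oracle@orcl), $2 = log file path.
# DGMGRL_BIN may override the dgmgrl binary (assumption, useful for dry runs).
start_observer_bg() {
  local connect="$1" logfile="${2:-observer.log}"
  # nohup keeps the process alive after logout;
  # -logfile makes dgmgrl write the observer messages to a file.
  nohup "${DGMGRL_BIN:-dgmgrl}" -silent -logfile "$logfile" "$connect" "start observer" \
    >/dev/null 2>&1 &
  echo "$!"   # PID of the background dgmgrl process
}

# Usage:
#   start_observer_bg sys/oracle@orcl /home/oracle/observer.log
# Stop it later from any DGMGRL session with: stop observer;
```

Running the observer on a host other than the primary and standby servers is usually preferable, so that it survives the loss of either database host.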
4. Check the status from another window
$ dgmgrl sys/oracle
DGMGRL> show fast_start failover;
Fast-Start Failover: ENABLED
Threshold: 30 seconds
Target: slave
Observer: master
Lag Limit: 45 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
DGMGRL> show configuration
Configuration - dgorcldb
Protection Mode: MaxPerformance
Databases:
slave - Primary database
orcl - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS
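At this point fast_start failover has been configured through dgmgrl. Next, simulate a failure and watch the failover.
An unexpected database crash can be simulated with shutdown abort.
Note 1: a shutdown immediate on the primary does not trigger fast-start failover.
Note 2: in Oracle 11g Data Guard, fast-start failover does not require the primary and standby to run in maximum availability mode.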
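The threshold and lag limit shown in the show fast_start failover output are broker configuration properties, FastStartFailoverThreshold and FastStartFailoverLagLimit. In maximum performance mode the lag limit bounds how far the standby may trail the primary for an automatic failover to still be allowed. The sketch below, assuming bash, only generates the DGMGRL commands so they can be reviewed first; the helper name fsfo_timing_script and the values passed to it are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints DGMGRL commands that set the two timing
# properties of fast-start failover.
# $1 = threshold in seconds, $2 = lag limit in seconds.
fsfo_timing_script() {
  local threshold="${1:-30}" lag_limit="${2:-30}"
  cat <<EOF
EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = ${threshold};
EDIT CONFIGURATION SET PROPERTY FastStartFailoverLagLimit = ${lag_limit};
EOF
}

# Usage: print the commands for review, then pipe them into DGMGRL.
#   fsfo_timing_script 30 30
#   fsfo_timing_script 30 30 | dgmgrl -silent sys/oracle
```

A longer threshold reduces the chance of a failover on a brief network interruption, at the cost of a longer outage when the primary is really gone.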
5. Simulate an unexpected crash of the primary
1) On the primary:
$ sqlplus / as sysdba
SQL> shutdown abort;
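Observer log, showing the failover from the primary to the standby:
00:18:57.09  Sunday, August 18, 2013
Initiating Fast-Start Failover to database "slave"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "slave"
00:19:02.01  Sunday, August 18, 2013
Primary alert log: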
FSFP started with pid=34, OS id=8169
Sun Aug 18 00:18:24 2013
Shutting down instance (abort)
License high water mark = 16
USER (ospid: 8657): terminating the instance
Instance terminated by USER, pid = 8657
Sun Aug 18 00:18:29 2013
Instance shutdown complete
2) On the standby
Log in to the standby and check its state: the database has now taken over the primary role.
SQL> select open_mode,database_role,db_unique_name from v$database;
OPEN_MODE DATABASE_ROLE DB_UNIQUE_NAME
-------------------- ---------------- ------------------------------
READ WRITE PRIMARY slave
SQL>
Standby alert log during the failover. The whole failover process can be followed in this log.
[oracle@slave trace]$ tail -f alert_slave.log
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
Sat Aug 17 23:58:57 2013
RFS[2]: Assigned to RFS process 29401
RFS[2]: Selected log 5 for thread 1 sequence 22 dbid 1351417842 branch 823637109
Sat Aug 17 23:59:00 2013
Media Recovery Waiting for thread 1 sequence 23 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 23 Reading mem 0
Mem# 0: /u01/app/oracle/oradata/slave/standby_redo04.log
Sat Aug 17 23:59:11 2013
Archived Log entry 17 added for thread 1 sequence 22 ID 0x508c9ff2 dest 1:
Sun Aug 18 00:13:22 2013
db_recovery_file_dest_size of 4122 MB is 1.21% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Sun Aug 18 00:18:28 2013
RFS[1]: Possible network disconnect with primary database
Sun Aug 18 00:18:28 2013
RFS[3]: Assigned to RFS process 29397
RFS[3]: Possible network disconnect with primary database
Sun Aug 18 00:18:29 2013
RFS[2]: Possible network disconnect with primary database
Sun Aug 18 00:19:00 2013
Attempting Fast-Start Failover because the threshold of 30 seconds has elapsed.
Sun Aug 18 00:19:00 2013
Data Guard Broker: Beginning failover
Sun Aug 18 00:19:00 2013
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sun Aug 18 00:19:00 2013
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/oracle/diag/rdbms/slave/slave/trace/slave_mrp0_29392.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 1136886
MRP0: Background Media Recovery process shutdown (slave)
Managed Standby Recovery Canceled (slave)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE
Attempt to do a Terminal Recovery (slave)
Media Recovery Start: Managed Standby Recovery (slave)
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '08/18/2013 00:19:01'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 23 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 1 Group 4 Seq 23 Reading mem 0
Mem# 0: /u01/app/oracle/oradata/slave/standby_redo04.log
Identified End-Of-Redo (failover) for thread 1 sequence 23 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 1136887 time 08/18/2013 00:18:23
Media Recovery Complete (slave)
Terminal Recovery: successful completion
Forcing ARSCN to IRSCN for TR 0:1136887
Attempt to set limbo arscn 0:1136887 irscn 0:1136887
Resetting standby activation ID 1351393266 (0x508c9ff2)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN
ALTER DATABASE SWITCHOVER TO PRIMARY (slave)
Maximum wait for role transition is 15 minutes.
Sun Aug 18 00:19:02 2013
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance slave - Archival Error
ORA-16014: log 4 sequence# 23 not archived, no available destinations
ORA-00312: online log 4 thread 1: '/u01/app/oracle/oradata/slave/standby_redo04.log'
Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/slave/slave/trace/slave_rsm0_29388.trc
Standby terminal recovery start SCN: 1136886
RESETLOGS after complete recovery through change 1136887
Online log /u01/app/oracle/oradata/slave/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/app/oracle/oradata/slave/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/app/oracle/oradata/slave/redo03.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 1136885
Sun Aug 18 00:19:02 2013
Setting recovery target incarnation to 4
Switchover: Complete - Database mounted as primary
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN
ALTER DATABASE OPEN
Data Guard Broker initializing...
Sun Aug 18 00:19:02 2013
Assigning activation ID 1351535899 (0x508ecd1b)
Sun Aug 18 00:19:02 2013
ARC1: Becoming the 'no SRL' ARCH
Thread 1 advanced to log sequence 2 (thread open)
Thread 1 opened at log sequence 2
Current log# 2 seq# 2 mem# 0: /u01/app/oracle/oradata/slave/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Aug 18 00:19:02 2013
SMON: enabling cache recovery
Archiver process freed from errors. No longer stopped
ARC4: LGWR is scheduled to archive destination LOG_ARCHIVE_DEST_2 after log switch
Sun Aug 18 00:19:02 2013
NSA2 started with pid=29, OS id=29504
[29388] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:18267234 end:18267424 diff:190 (1 seconds)
Dictionary check beginning
Sun Aug 18 00:19:03 2013
Error 1034 received logging on to the standby
Error 1034 received logging on to the standby
ARC4: Error 1034 Creating archive log file to 'orcl'
PING[ARC2]: Heartbeat failed to connect to standby 'orcl'. Error is 1034.
Dictionary check complete
Archived Log entry 18 added for thread 1 sequence 1 ID 0x508ecd1b dest 1:
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sun Aug 18 00:19:04 2013
QMNC started with pid=30, OS id=29508
LOGSTDBY: Validating controlfile with logical metadata
Thread 1 advanced to log sequence 3 (LGWR switch)
Current log# 3 seq# 3 mem# 0: /u01/app/oracle/oradata/slave/redo03.log
LOGSTDBY: Validation complete
Archived Log entry 19 added for thread 1 sequence 2 ID 0x508ecd1b dest 1:
Completed: ALTER DATABASE OPEN
ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='slave';
ALTER SYSTEM SET log_archive_format='arch_%r_%t_%s.arc' SCOPE=SPFILE SID='slave';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=5 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM SET db_file_name_convert='/u01/app/oracle/oradata/orcl/','/u01/app/oracle/oradata/slave/' SCOPE=SPFILE;
ALTER SYSTEM SET log_file_name_convert='/u01/app/oracle/oradata/orcl/','/u01/app/oracle/oradata/slave/' SCOPE=SPFILE;
ALTER SYSTEM SET log_archive_dest_state_2='RESET' SCOPE=BOTH;
Failover succeeded. Primary database is now slave.
Sun Aug 18 00:19:08 2013
Starting background process CJQ0
Sun Aug 18 00:19:08 2013
CJQ0 started with pid=35, OS id=29544
Setting Resource Manager plan SCHEDULER[0x318E]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Sun Aug 18 00:19:11 2013
Starting background process VKRM
Sun Aug 18 00:19:11 2013
VKRM started with pid=33, OS id=29548
Sun Aug 18 00:19:17 2013
FSFP started with pid=39, OS id=29568
Sun Aug 18 00:19:25 2013
ARC2: STARTING ARCH PROCESSES
Sun Aug 18 00:19:25 2013
ARC5 started with pid=40, OS id=29572
ARC5: Archival started
ARC2: STARTING ARCH PROCESSES COMPLETE
krsk_srl_archive_int: Enabling archival of deferred physical standby SRLs
Archived Log entry 20 added for thread 1 sequence 23 ID 0x508c9ff2 dest 1:
Sun Aug 18 00:19:43 2013
Shutting down archive processes
ARCH shutting down
ARC5: Archival stopped
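6. Restart the original primary
Note: when the original primary is started again, the roles do not switch back automatically. orcl remains the standby unless you manually perform a switchover to orcl.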
DGMGRL> show configuration
Configuration - dgorcldb
Protection Mode: MaxPerformance
Databases:
slave - Primary database
orcl - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS
DGMGRL>
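To hand the primary role back to orcl, as mentioned in step 6, a broker switchover can be issued once show configuration reports SUCCESS with orcl as the physical standby. This is a minimal sketch, assuming bash; the helper name switch_to and the DGMGRL_BIN override are hypothetical, and the credentials are the ones used earlier in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: asks the broker to switch the primary role to $2.
# $1 = connect string, $2 = db_unique_name of the standby to promote.
# DGMGRL_BIN may override the dgmgrl binary (assumption, useful for dry runs).
switch_to() {
  local connect="$1" target="$2"
  "${DGMGRL_BIN:-dgmgrl}" -silent "$connect" "switchover to ${target}"
}

# Usage:
#   switch_to sys/oracle orcl
```

After the switchover, show configuration should again list orcl as the primary database and slave as the physical standby.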