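This post clarifies how redo logs work in Oracle, specifically in a RAC environment.
First, the setup: a standard 2-node RAC running an administrator-managed database. Note the line Database is administrator managed at the end of the output: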
$ srvctl config database
DB0613_d7h_nrt
$ srvctl config database -db DB0613_d7h_nrt
Database unique name: DB0613_d7h_nrt
Database name: DB0613
Oracle home: /u01/app/oracle/product/19.0.0.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/DB0613_D7H_NRT/PARAMETERFILE/spfile.262.1139391157
Password file: +DATA/DB0613_D7H_NRT/PASSWORD/pwddb0613_d7h_nrt.259.1139390533
Domain: sub07290808380.training.oraclevcn.com
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools:
Disk Groups: RECO,DATA
Mount point paths: /opt/oracle/dcs/commonstore
Services:
Type: RAC
Start concurrency:
Stop concurrency:
OSDBA group: dba
OSOPER group: dbaoper
Database instances: DB06131,DB06132
Configured nodes: mlrac1,mlrac2
CSS critical: no
CPU count: 0
Memory target: 0
Maximum memory: 0
Default network number for database services:
Database is administrator managed
The counterpart of an administrator-managed database is a policy-managed deployment, that is, one based on server pools:
Policy-managed deployment is based on server pools, where database services run within a server pool as singleton or uniform across all of the servers in the server pool.
The confusion came from the following output.
It shows four rows with status CURRENT. A 2-node RAC should have 2 redo threads, so shouldn't there be only 2 CURRENT groups?
SQL> select * from gv$log;
INST_ID GROUP# THREAD# SEQUENCE#      BYTES BLOCKSIZE MEMBERS ARCHIVED STATUS   FIRST_CHANGE# FIRST_TIME        NEXT_CHANGE# NEXT_TIME CON_ID
      2      1       1         3 1073741824       512       1 NO       CURRENT        1603907 13-JUN-23  9295429630892703743 13-JUN-23      0
      2      2       1         2 1073741824       512       1 YES      INACTIVE       1535132 13-JUN-23             1603889 13-JUN-23      0
      2      3       2         3 1073741824       512       1 YES      INACTIVE       1603909 13-JUN-23             1603911 13-JUN-23      0
      2      4       2         4 1073741824       512       1 NO       CURRENT        1603911 13-JUN-23  9295429630892703743                0
      1      1       1         3 1073741824       512       1 NO       CURRENT        1603907 13-JUN-23  9295429630892703743 13-JUN-23      0
      1      2       1         2 1073741824       512       1 YES      INACTIVE       1535132 13-JUN-23             1603889 13-JUN-23      0
      1      3       2         3 1073741824       512       1 YES      INACTIVE       1603909 13-JUN-23             1603911 13-JUN-23      0
      1      4       2         4 1073741824       512       1 NO       CURRENT        1603911 13-JUN-23  9295429630892703743                0
8 rows selected.
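The next output has 8 rows, yet only 4 distinct redo log files. It is easy to conclude from this that the 4 log files are shared by the 2 RAC instances, that is, that each instance has all 4 redo log files.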
SQL> select * from gv$logfile order by inst_id, group#;
INST_ID GROUP# STATUS  TYPE    MEMBER                                                IS_RECOVERY_DEST_FILE CON_ID
      1      1         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_1.257.1139390641 NO                         0
      1      2         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_2.258.1139390641 NO                         0
      1      3         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_3.259.1139391143 NO                         0
      1      4         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_4.260.1139391149 NO                         0
      2      1         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_1.257.1139390641 NO                         0
      2      2         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_2.258.1139390641 NO                         0
      2      3         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_3.259.1139391143 NO                         0
      2      4         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_4.260.1139391149 NO                         0
8 rows selected.
SQL> select distinct member from gv$logfile;
MEMBER
+RECO/DB0613_D7H_NRT/ONLINELOG/group_4.260.1139391149
+RECO/DB0613_D7H_NRT/ONLINELOG/group_2.258.1139390641
+RECO/DB0613_D7H_NRT/ONLINELOG/group_1.257.1139390641
+RECO/DB0613_D7H_NRT/ONLINELOG/group_3.259.1139391143
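The key point is that in a RAC environment you should not use the GV$ views to look at redo logs. For details, see Redo log threads in Real Application Clusters.
In short, in most cases a v$ view is scoped to the instance and the matching gv$ view to the whole cluster; V$SESSION is a typical example. Redo logs are an exception: any RAC instance must be able to see every redo log in order to perform recovery, so v$log already shows the redo logs of all threads. Querying gv$log simply returns that same full list once per instance, each copy tagged with a different INST_ID, which is where the duplicate rows above come from.
So for redo logs, do not use gv$log and gv$logfile; use only the v$ views.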
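To see which instance owns which redo thread while staying with the v$ views, V$THREAD can help. The query below is only a minimal sketch: the column selection is my own choice, and the result depends on your environment.

```sql
-- V$THREAD lists every redo thread of the database, not just the local one.
-- INSTANCE is the name of the instance that owns the thread;
-- GROUPS is the number of online redo log groups assigned to it.
select thread#,
       instance,
       status,
       enabled,
       groups,
       current_group#
  from v$thread
 order by thread#;
```

In the environment described here one would expect two rows, one per instance, each reporting two groups.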
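The output below shows that this 2-node RAC has 4 redo log groups with one file each, 2 per instance. Each thread has exactly one group in CURRENT status. In other words, each instance writes only to its own redo logs, although it can read those of the other instance: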
SQL> select * from v$log;
GROUP# THREAD# SEQUENCE#      BYTES BLOCKSIZE MEMBERS ARCHIVED STATUS   FIRST_CHANGE# FIRST_TIME        NEXT_CHANGE# NEXT_TIME CON_ID
     1       1         3 1073741824       512       1 NO       CURRENT        1603907 13-JUN-23  9295429630892703743 13-JUN-23      0
     2       1         2 1073741824       512       1 YES      INACTIVE       1535132 13-JUN-23             1603889 13-JUN-23      0
     3       2         3 1073741824       512       1 YES      INACTIVE       1603909 13-JUN-23             1603911 13-JUN-23      0
     4       2         4 1073741824       512       1 NO       CURRENT        1603911 13-JUN-23  9295429630892703743                0
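The same conclusion can be read off with a per-thread summary instead of by eye. This is a sketch rather than anything from the referenced documentation; the aliases log_groups and current_groups are names I made up for illustration.

```sql
-- Summarize v$log per redo thread (one thread per instance here).
-- current_groups counts the groups whose status is CURRENT.
select thread#,
       count(*)                                            as log_groups,
       sum(case when status = 'CURRENT' then 1 else 0 end) as current_groups
  from v$log
 group by thread#
 order by thread#;
```

For the v$log rows shown above, this should report 2 groups and 1 CURRENT group for each of threads 1 and 2.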
The following output then confirms that the RAC database as a whole has 4 redo log files:
SQL> select * from v$logfile order by group#;
GROUP# STATUS  TYPE    MEMBER                                                IS_RECOVERY_DEST_FILE CON_ID
     1         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_1.257.1139390641 NO                         0
     2         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_2.258.1139390641 NO                         0
     3         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_3.259.1139391143 NO                         0
     4         ONLINE  +RECO/DB0613_D7H_NRT/ONLINELOG/group_4.260.1139391149 NO                         0
The documentation section Redo Log File Storage in Oracle RAC says the same thing:
For administrator-managed databases, each instance has its own online redo log groups. … To add a redo log group to a specific instance … Each instance must have at least two groups of redo log files. … During database recovery, all enabled instances are checked to see if recovery is needed.
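The quoted passage mentions adding a redo log group to a specific instance. As a hedged sketch only, the statement below uses the THREAD clause of ALTER DATABASE ADD LOGFILE. Group number 5, the +RECO disk group and the 1G size are assumptions chosen to match the existing groups in this environment, not values taken from the documentation.

```sql
-- Add a third online redo log group to redo thread 1.
-- Assumptions: group 5 is unused, +RECO has enough free space, and 1G
-- matches the size of the existing groups (1073741824 bytes in v$log).
alter database add logfile thread 1 group 5 ('+RECO') size 1g;

-- Verify: the new group should be listed under THREAD# 1,
-- with status UNUSED until the thread first switches into it.
select group#, thread#, status
  from v$log
 order by thread#, group#;
```

Try this on a test system first; adding or dropping log groups on a production database deserves the usual change control.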
The 19c RAC technical architecture documentation adds:
All redo log files must be accessible to all instances, each redo log file should be multiplexed as in a single instance. When using ASM with normal redundancy each redo log member is mirrored, and a second Multiplexed member is placed in a different disk group. Each instance must have at least two redo log groups also called a thread.
There should be at least 2 multiplexed controlfiles accessible to all instances, as with redo log files each multiplexed controlfile should be placed in a different disk group.
This matches the output above: a 2-node RAC with 2 redo log files per node, 4 in total. The ASM listing shows the same four files:
ASMCMD> pwd
+RECO/DB0613_D7H_NRT/ONLINELOG
ASMCMD> ls -l
Type Redund Striped Time Sys Name
ONLINELOG UNPROT COARSE JUN 13 09:00:00 Y group_1.257.1139390641
ONLINELOG UNPROT COARSE JUN 13 09:00:00 Y group_2.258.1139390641
ONLINELOG UNPROT COARSE JUN 13 09:00:00 Y group_3.259.1139391143
ONLINELOG UNPROT COARSE JUN 13 09:00:00 Y group_4.260.1139391149