All databases below are version 11.2.0.1.
All online redo logs are 512 MB.
GROUP# THREAD# SEQUENCE# MB
-rw-r----- 1 oracle oinstall 215M 2015-05-26 15:33:11.000000000 +0800 2_56429_815757575.dbf
Another database also uses 512 MB online redo logs.
On that single-instance database, however, the archived logs look like this:
-rw-r----- 1 oracle oinstall 424M 05-29 07:27 log_1_744572_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 425M 05-29 07:47 log_1_744573_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 425M 05-29 07:59 log_1_744574_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 424M 05-29 08:16 log_1_744575_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 426M 05-29 08:29 log_1_744576_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 425M 05-29 08:43 log_1_744577_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 426M 05-29 08:56 log_1_744578_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 425M 05-29 09:10 log_1_744579_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 426M 05-29 09:26 log_1_744580_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 425M 05-29 09:41 log_1_744581_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 425M 05-29 09:57 log_1_744582_769708468_4d561d31.arc
-rw-r----- 1 oracle oinstall 424M 05-29 10:15 log_1_744583_769708468_4d561d31.arc
archive_lag_target=0
Log switch conditions
If a thread requiring a forced log switch is not open, the instance that raised the force SCN will perform the log switch on behalf of the closed thread, and the first available ARCn process in any instance will archive the log file. However, if the thread is open, its instance is prompted to perform these actions itself. This is done by taking a KK instance lock. The LGWR process in each instance holds the KK instance lock on its own thread. The id2 field identifies the thread number. When this lock is taken by another instance, LGWR recognizes that a forced log switch is required.
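Explanation of point 3
This post was last edited by 面条s on 2013-4-22 10:14.
I always knew this mechanism existed to guarantee recoverability, but the details were fuzzy until I read the English text. It means that whenever any instance's redo log is reused, the force SCN recorded by the database is updated to the high SCN of the reused redo log plus 1. If the low SCN of another instance's current log file is lower than that force SCN, that instance must switch logs. The force SCN is recorded in the control file and is visible as v$database.ARCHIVE_CHANGE#. If an instance is closed, the instance that advanced the force SCN performs the switch on its behalf, and the first available ARCn process in any instance archives the log. If the instance is open, it switches and archives by itself. Prompting an instance to switch is done by taking the KK instance lock.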
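Let me translate it. Having read it, I still do not see why RAC switches logs before the redo log is full, haha. Maybe the moderator can answer.
To guarantee recoverability, OPS automatically forces log switches on relatively idle threads. Each time a log is reused, the force SCN recorded in the control file is set to one more than the high SCN of the reused redo log. If the low SCN of the current log of any thread (OPS has multiple threads) lags behind the force SCN, that thread is forced to switch logs, which allows its log to be archived. The new redo in the reused log cannot be used in some recovery scenarios, or on a standby, until that log has been archived. In other words, in some situations redo can only be used for recovery once it has been archived; the primary ships redo to the standby through ARCH or LGWR for recovery.
If the thread that needs a log switch is not open, the instance that raised the force SCN performs the log switch on behalf of the closed thread, and the first available ARCn process in any instance archives that log file. If the thread is open, the instance that owns it does this work itself, which is triggered through the KK lock. The LGWR of each instance holds the KK instance lock on its own thread, and the id2 field identifies the thread number. When another instance takes this lock, LGWR knows that a forced log switch is required.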
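The forced-switch rule described above can be sketched as a small model. This is only an illustration under stated assumptions: the `RedoThread` class and the function names are hypothetical, and the SCN values are made up. It is not how Oracle implements the mechanism.

```python
from dataclasses import dataclass


@dataclass
class RedoThread:
    """Hypothetical model of one redo thread (one per RAC instance)."""
    number: int
    current_low_scn: int  # low SCN of the thread's current online log
    is_open: bool


def force_scn_after_reuse(reused_log_high_scn: int) -> int:
    # Per the text: the force SCN becomes the reused log's high SCN plus 1.
    return reused_log_high_scn + 1


def threads_needing_switch(threads, force_scn):
    # A thread must switch when its current log's low SCN is below the force SCN.
    return [t for t in threads if t.current_low_scn < force_scn]


def who_performs_switch(thread: RedoThread) -> str:
    # Open thread: its own instance is prompted (via the KK lock).
    # Closed thread: the instance that raised the force SCN switches for it.
    return "own instance" if thread.is_open else "instance that raised the force SCN"


# Illustrative values only: thread 1 is busy, thread 2 is nearly idle.
threads = [RedoThread(1, 5000, True), RedoThread(2, 1200, True), RedoThread(3, 900, False)]
force_scn = force_scn_after_reuse(reused_log_high_scn=1999)
for t in threads_needing_switch(threads, force_scn):
    print(t.number, who_performs_switch(t))
```

In this model the busy thread 1 keeps reusing logs, which raises the force SCN and drags the idle threads 2 and 3 into switching, matching the observation that a busy node can cause a quiet node to switch.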
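Regarding point 3: it means your database has force logging mode enabled.
select force_logging from v$database;
Clearly node 1 here generates a lot of redo and switches very frequently, while node 2 rarely switches. So does that mean node 1 causes node 2 to switch?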
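The following is about redo log wastage.
On most platforms, Oracle online redo log files use a standard block of 512 bytes.
The block is 1024 bytes on HPUX and Tru64 Unix, 2048 bytes on SCO UNIX and Reliant UNIX, and 4096 bytes on MVS and MPE/ix. Many of these platforms are no longer common; the actual value can be checked with the following query:
select max(l.lebsz) log_block_size_kccle
from sys.x$kccle l
where l.inst_id = userenv('Instance');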
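When the LGWR background process writes out redo, it does not necessarily fill the last log block. For example, suppose the redo buffer holds 1025 bytes to be written. Since 1025 = 512 + 512 + 1, the write occupies three standard redo blocks: the first two are full and the third uses only 1 byte. Before LGWR completes the write it must release the "redo allocation" latch, and before doing so the SGA variable that indexes the redo buffer is moved to the redo block after the partially filled one. In other words, LGWR skips 511 bytes of space, and this is what we call redo wastage. The redo wastage statistic in the v$sysstat dynamic view shows how much redo has been wasted over the lifetime of the instance.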
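The arithmetic in the example above can be checked with a few lines of Python. This is a minimal sketch that follows the simplified example in the text; it assumes whole redo blocks are available for payload and ignores any per-block header, and the function name is hypothetical.

```python
def blocks_and_wastage(redo_bytes: int, block_size: int = 512):
    """Return (blocks written, bytes skipped in the last block)."""
    blocks = -(-redo_bytes // block_size)          # ceiling division
    wastage = blocks * block_size - redo_bytes     # unused tail of the last block
    return blocks, wastage


print(blocks_and_wastage(1025))        # (3, 511)
print(blocks_and_wastage(1024))        # (2, 0)
print(blocks_and_wastage(1025, 4096))  # (1, 3071)
```

A larger block size makes each small write more expensive in wasted bytes, which is why very frequent small LGWR writes add up.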
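SQL> col name for a25
SQL> select name,value from v$sysstat where name like '%wastage%';

NAME                      VALUE
------------------------- ----------
redo wastage              132032

Why waste this space? In fact, this approach is very beneficial to LGWR's serial I/O pattern. Redo wastage is not a problem or a bug; Oracle does it by design. Excessive redo wastage is not a good sign either, though: it is usually a symptom of LGWR writing too frequently. Since 9i, LGWR overload caused by a too-small hidden parameter _log_io_size has become rare. If you find significant redo wastage in your system, unrestrained use of commit is usually the culprit. Removing unnecessary commit statements and moving commit out of loops both help to reduce redo wastage.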
SQL> set time on;
19:49:45 SQL> alter system switch logfile;
/* switch logs to start from a clean state */

System altered.

19:51:07 SQL> col name for a25
19:51:16 SQL> select name,value from v$sysstat where name in ('redo size','redo wastage');

NAME                      VALUE
------------------------- ----------
redo size                 1418793324
redo wastage              88286544
/* baseline statistics at the start of the demo */

19:51:19 SQL> begin
19:52:10   2  for i in 1..550000 loop
19:52:10   3  insert into tv values(1,'a');
19:52:10   4  commit;
19:52:10   5  end loop;
19:52:10   6  end;
19:52:11   7  /
/* the commit sits inside the loop of this anonymous block, which causes a large amount of redo wastage */

PL/SQL procedure successfully completed.

19:53:07 SQL> select name,value from v$sysstat where name in ('redo size','redo wastage');

NAME                      VALUE
------------------------- ----------
redo size                 1689225404
redo wastage              112011352
/* the frequently committing block generated 1689225404-1418793324=257MB of redo, of which 112011352-88286544=22MB is redo wastage */

19:53:14 SQL> begin
19:53:33   2  for i in 1..550000 loop
19:53:33   3  insert into tv values(1,'a');
19:53:33   4  end loop;
19:53:33   5  commit;
19:53:33   6  end;
19:53:34   7  /
/* in this anonymous block the commit has been moved out of the loop; the data is modified in bulk and committed once at the end */

PL/SQL procedure successfully completed.

19:53:59 SQL> select name,value from v$sysstat where name in ('redo size','redo wastage');

NAME                      VALUE
------------------------- ----------
redo size                 1828546240
redo wastage              112061296
/* the block that commits once generated 1828546240-1689225404=132MB of redo, and its redo wastage is only 112061296-112011352=48k */
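You may wonder: the first block has only 22MB more redo wastage than the second, so why does the total redo differ by so much?
Note that commit itself also generates redo, and quite a lot of it. In the demo above, the commits in the frequently committing block account for about 40% of the redo, (257-132-22)/257=40%, and the average redo per commit is (257-132-22)*1024*1024/550000=196 bytes.
Commit is one of the foundations of transactional ACID. Using it sensibly helps us build robust and reliable applications, while abusing it is bound to end in disaster.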
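The commit overhead estimate can be recomputed from the raw v$sysstat counters shown in the two sessions. The sketch below follows the same arithmetic as the text (redo of the first block, minus redo of the second block, minus the wastage of the first block); the variable names are assumptions. Working in unrounded bytes gives roughly 40% and about 195 bytes per commit, slightly below the 196 bytes obtained from the rounded MB figures.

```python
ITERATIONS = 550_000

# Counter snapshots copied from the sessions above:
# baseline, after commit-in-loop block, after single-commit block.
redo_size = [1418793324, 1689225404, 1828546240]
redo_wastage = [88286544, 112011352, 112061296]

redo_commit_in_loop = redo_size[1] - redo_size[0]
redo_single_commit = redo_size[2] - redo_size[1]
wastage_commit_in_loop = redo_wastage[1] - redo_wastage[0]
wastage_single_commit = redo_wastage[2] - redo_wastage[1]

# Redo attributed to the commits themselves, using the text's formula.
commit_overhead = redo_commit_in_loop - redo_single_commit - wastage_commit_in_loop
commit_share = commit_overhead / redo_commit_in_loop
bytes_per_commit = commit_overhead / ITERATIONS

print(f"redo, commit in loop : {redo_commit_in_loop / 2**20:.1f} MB")
print(f"redo, single commit  : {redo_single_commit / 2**20:.1f} MB")
print(f"wastage, commit in loop: {wastage_commit_in_loop / 2**20:.1f} MB")
print(f"wastage, single commit : {wastage_single_commit / 2**10:.1f} KB")
print(f"commit share: {commit_share:.1%}, per commit: {bytes_per_commit:.0f} bytes")
```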
Archive Logs Are Created With Smaller, Uneven Size Than The Original Redo Logs. Why? [ID 388627.1]
The created archived redologs are (significantly) smaller than the related online redolog file.
CAUSE
There are 2 possible causes for this :
1. Documented and designed behaviour due to explicitly forcing an archive creation before the redolog file is full
SQL> alter system switch logfile;
SQL> alter system archive log current;
RMAN> backup archivelog all;
RMAN> backup database plus archivelog;
2. Undocumented, but designed behaviour :
BUG 9272059 - REDO LOG SWITCH AT 1/8 OF SIZE DUE TO CMT CPU'S
BUG 10354739 - REDOLOGSIZE NOT COMPLETLY USED
BUG 12317474 - FREQUENT REDO LOG SWITCHES GENERATING SMALL SIZED ARCHIVELOGS
BUG 5450861 - ARCHIVE LOGS ARE GENERATED WITH A SMALLER SIZE THAN THE REDO LOG FILES
Explanation :
As per Bug: 5450861 (closed as 'Not a Bug'):
* The archive logs do not have to be even in size. This was decided a very long time ago,
when blank padding the archive logs was stopped, for a very good reason - in order to save disk space.
* The log switch does not occur when a redo log file is 100% full. There is an internal algorithm
that determines the log switch moment. This also has a very good reason - doing the log switch
at the last moment could incur performance problems (for various reasons, out of the scope of this note).
As a result, after the log switch occurs, the archivers are copying only the actual information from the
redo log files. Since the redo logs are not 100% full after the log switch and the archive logs are
not blank padded after the copy operation has finished, this results in uneven, smaller files than
the original redo log files.
There are a number of factors which combine to determine the log
switch frequency. These are the most relevant factors in this case:
a) RDBMS parameter LOG_BUFFER
If this is not explicitly set by the DBA then we use a default;
at instance startup the RDBMS calculates the number of shared redo
strands as ncpus/16, and the size of each strand is 128Kb * ncpus
(where ncpus is the number of CPUs in the system). The log buffer
size is the number of strands multiplied by the strand size.
The calculated or specified size is rounded up to a multiple of the granule size
of a memory segment in the SGA. For 11.2 if
SGA size >= 128GB then granule size is 512MB
64GB <= SGA size < 128GB then granule size is 256MB
32GB <= SGA size < 64GB then granule size is 128MB
16GB <= SGA size < 32GB then granule size is 64MB
8GB <= SGA size < 16GB then granule size is 32MB
1GB <= SGA size < 8GB then granule size is 16MB
SGA size < 1GB then granule size is 4MB
There are some minimums and maximums enforced.
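The default sizing rule in (a) can be sketched as follows. This is a rough model under stated assumptions, not Oracle's actual code: the function names are hypothetical, the "at least one strand" floor is an assumption for small CPU counts, and the minimums and maximums mentioned above are not modelled because the note does not give them.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

# (lower bound of SGA size, granule size) pairs from the 11.2 table above.
GRANULE_TABLE = [
    (128 * GB, 512 * MB),
    (64 * GB, 256 * MB),
    (32 * GB, 128 * MB),
    (16 * GB, 64 * MB),
    (8 * GB, 32 * MB),
    (1 * GB, 16 * MB),
    (0, 4 * MB),
]


def granule_size(sga_bytes: int) -> int:
    for lower_bound, granule in GRANULE_TABLE:
        if sga_bytes >= lower_bound:
            return granule
    raise ValueError("SGA size must be non-negative")


def default_log_buffer(ncpus: int, sga_bytes: int):
    """Return (strands, strand size, log buffer rounded up to a granule multiple)."""
    strands = max(1, ncpus // 16)        # assumption: never fewer than one strand
    strand_size = 128 * KB * ncpus
    raw = strands * strand_size
    granule = granule_size(sga_bytes)
    rounded = -(-raw // granule) * granule   # round up to a granule multiple
    return strands, strand_size, rounded


strands, strand_size, log_buffer = default_log_buffer(ncpus=128, sga_bytes=4 * GB)
print(strands, strand_size // MB, log_buffer // MB)  # 8 16 128
```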
b) System load
Initially only one redo strand is used, ie the number of "active"
redo strands is 1, and all the processes copy their redo into
that one strand. When/if there is contention for that strand then
the number of active redo strands is raised to 2. As contention
for the active strands increases, the number of active strands
increases. The maximum possible number of active redo strands is
the number of strands initially allocated in the log buffer.
(This feature is called "dynamic strands", and there is a hidden
parameter to disable it which then allows processes to use all
the strands from the outset).
c) Log file size
This is the logfile size decided by the DBA when the logfiles are created.
d) The logfile space reservation algorithm
When the RDBMS switches into a new online redo logfile, all the
log buffer redo strand memory is "mapped" to the logfile space.
If the logfile is larger than the log buffer then each strand
will map/reserve its strand size worth of logfile space, and the
remaining logfile space (the "log residue") is still available.
If the logfile is smaller than the log buffer, then the whole
logfile space is divided/mapped/reserved equally among all the
strands, and there is no unreserved space (ie no log residue).
When any process fills a strand such that all the reserved
underlying logfile space for that strand is used, AND there is
no log residue, then a log switch is scheduled.
Example : 128 CPU's so the RDBMS allocates a
log_buffer of size 128Mb containing 8 shared strands of size 16Mb.
It may be a bit larger than 128Mb as it rounds up to an SGA granule boundary.
The logfiles are 100Mb, so when the RDBMS switches into a
new online redo logfile each strand reserves 100Mb/8 = 25600 blocks
and there is no log residue. If there is low system load, only one
of the redo strands will be active/used and when 25600 blocks of
that strand are filled then a log switch will be scheduled - the created
archive logs have a size around 25600 blocks.
With everything else staying the same (128 cpu's and low load),
using a larger logfile would not really reduce the amount of
unfilled space when the log switches are requested, but it would
make that unfilled space less significant as a percentage of the
total logfile space, eg
- with a 100Mb logfile, the log switch happens with 7 x 16Mb
logfile space unfilled (ie the logfile is 10% full when the
log switch is requested)
- with a 1Gb logfile, the log switch would happen with 7 x 16Mb
logfile space unfilled (ie the logfile is 90% full when the
log switch is requested)
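The space reservation algorithm in (d) and the two cases above can be modelled with a short calculation. This is a simplified sketch under assumptions: the function name is hypothetical, the active strands are assumed to fill their reserved space completely, and any log residue is assumed to be consumed before the switch is scheduled. Under this model the 100Mb case comes out at 12.5% full (one eighth of the file, 7 x 12.5Mb unfilled), so the 10% figure above reads as a rounded value.

```python
MB = 1024 * 1024
BLOCK = 512  # redo block size in bytes, as on most platforms


def switch_point(logfile_bytes, strands, strand_bytes, active_strands=1):
    """Estimate how full a logfile is when a log switch is scheduled."""
    log_buffer = strands * strand_bytes
    if logfile_bytes > log_buffer:
        # Each strand reserves its own size; the rest is the log residue.
        reserved = strand_bytes
        residue = logfile_bytes - log_buffer
    else:
        # The whole file is divided equally among the strands; no residue.
        reserved = logfile_bytes // strands
        residue = 0
    filled = active_strands * reserved + residue
    return {
        "reserved_blocks": reserved // BLOCK,
        "unfilled_bytes": logfile_bytes - filled,
        "fraction_full": filled / logfile_bytes,
    }


# 128 CPUs: 8 strands of 16 MB each, low load (one active strand).
small = switch_point(100 * MB, strands=8, strand_bytes=16 * MB)
large = switch_point(1024 * MB, strands=8, strand_bytes=16 * MB)
print(small)
print(large)
```

The unfilled space is about the same order of magnitude in both cases; only its share of the file changes, which is the point the note makes about larger logfiles.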
With a high CPU_COUNT, a low load and a redo log file size smaller than
the redolog buffer, you may see small archived log files because of log switches
at about 1/8 of the defined log file size.
This is because CPU_COUNT defines the number of redo strands (ncpus/16).
With a low load only a single strand may be used. With redo log file size smaller
than the redolog buffer, the log file space is divided over the available strands.
When for instance only a single active strand is used, a log switch can already occur
when that strand is filled.
SOLUTION
Check whether the above matches the behaviour you are seeing; if so, it is expected behaviour.
If the behaviour is different, then open a Service Request with Oracle Support for further analysis.