Checkpoints are a topic where both new DBAs and experienced people struggle to understand what a checkpoint really does underneath. I'm not going to cover the internals in full; plenty of experienced people have written white papers on them, and the one I like most is "http://www.dbafree.net/wp-content/uploads/2011/05/CheckPoints.pdf" by Harald. Today I will try to explain the details behind a message we regularly see in the alert log, such as "Thread 1 advanced to log sequence 72 (LGWR switch)" (an example). What does this statement say? Let's try to debug it and visualize things in a different way, to better understand checkpoints and the basic fundamentals (nowadays most new DBAs don't spend much time with the Oracle documentation).
So, back to the topic (as I'm drifting off topic): what does the message below say?
Beginning log switch checkpoint up to RBA [0x118a.2.10], SCN: 13117648068039
Thread 1 advanced to log sequence 4490 (LGWR switch)
Current log# 2 seq# 4490 mem# 0: +DATA/admin/test/db/redot1g2m1.dbf
Current log# 2 seq# 4490 mem# 1: +RECO/admin/test/mirror/redot1g2m2.dbf
Mon Apr 01 10:54:48 2013
Archived Log entry 15770 added for thread 1 sequence 4489 ID 0x71e82a8e dest 1:
Mon Apr 01 10:59:45 2013
Completed checkpoint up to RBA [0x118a.2.10], SCN: 13117648068039
Many DBAs will say it is a log switch to a different log group performed by LGWR, either because the current group filled up or because someone switched logs manually. (Don't confuse this with the conditions under which LGWR writes the redo buffer: 3 seconds since its last write, buffer one-third full, and so on; see http://docs.oracle.com/cd/E14072_01/server.112/e10713/process.htm#BABJEHBC.) That is the basic view the Oracle documentation provides; let's try to visualize it in a different way.
What do the RBA and SCN signify? The RBA is the redo byte address (log sequence, block number and byte offset, printed in hexadecimal), and the SCN is the starting SCN of the log file.
RBA [0x118a.2.10], SCN: 13117648068039
=> LGWR advanced from sequence 4489 to 4490.
=> It displays the log files currently being written (each member of the current group), followed by a timestamp.
=> An archived log entry: the ARCH process archived sequence 4489 of thread 1 (instance 1 in case of RAC).
=> What does "0x71e82a8e" say? Let me elaborate.
Many DBAs know the concept of physical standby databases (fal_server and fal_client); they rely heavily on this value, the "Activation ID". Below is an example of log files being transmitted to a physical standby database (taken from my old notes).
“ARC0: Transmitting activation ID 1911040654 (converted 0x71e82a8e to decimal)
ARC0: Transmitting activation ID 1911040654 (28911161)
ARC0: Completed archiving log 2 thread 1 sequence 526 Thu Feb 21 11:56:37 2012
LGWR: Transmitting activation ID 1911040654 (28911161)
LGWR: Beginning to archive log 2 thread 1 sequence 528”
So the activation ID is an added tag that changes whenever the database instance is changed/modified (e.g. resetlogs and a new incarnation, or RMAN cloning).
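The conversion mentioned in the log excerpt above (0x71e82a8e to 1911040654) is plain base-16 arithmetic. A minimal sketch to check it; the function name is only for illustration:

```python
def activation_id_to_decimal(hex_id: str) -> int:
    """Convert an activation ID as printed in the alert log
    (for example '0x71e82a8e') to its decimal form."""
    return int(hex_id, 16)


def activation_id_to_hex(decimal_id: int) -> str:
    """Convert the decimal activation ID back to the 0x... form."""
    return format(decimal_id, "#x")


print(activation_id_to_decimal("0x71e82a8e"))
print(activation_id_to_hex(1911040654))
```

This makes it easy to match the hexadecimal ID in an "Archived Log entry" line against the decimal ID in the "Transmitting activation ID" lines.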
=> Completed checkpoint up to RBA [0x118a.2.10], SCN: 13117648068039
Why does it always report the start SCN and not the end SCN of the log?
Check the output of the query below for the log file with status "CURRENT".
select GROUP#,THREAD#,SEQUENCE#,ARCHIVED,STATUS,FIRST_CHANGE#,NEXT_CHANGE# from v$log order by 1,2;
    GROUP#    THREAD#  SEQUENCE# ARC STATUS           FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ---------- --- ---------------- ------------- ------------
         1          1       4489 YES INACTIVE            1.3118E+13   1.3118E+13
         2          1       4490 NO  CURRENT             1.3118E+13   2.8147E+14 -- this one
         3          1       4483 YES INACTIVE            1.3118E+13   1.3118E+13
         4          1       4484 YES INACTIVE            1.3118E+13   1.3118E+13
The same rows for groups 1 and 2 with wider column formatting:
    GROUP#    THREAD#  SEQUENCE# ARC STATUS             FIRST_CHANGE#    NEXT_CHANGE#
---------- ---------- ---------- --- ---------------- --------------- ---------------
         1          1       4489 YES INACTIVE          13117638003248  13117648068039
         2          1       4490 NO  CURRENT           13117648068039 281474976710655
For the CURRENT log, NEXT_CHANGE# is not a real SCN yet. 281474976710655 is 2^48 - 1, the highest possible SCN, and it is only a placeholder that is replaced by the actual end SCN once the log is switched. You can observe this change in your own Oracle database instances.
One more example, from a local Windows database on my laptop:
    GROUP#    THREAD#  SEQUENCE# ARC STATUS             FIRST_CHANGE#    NEXT_CHANGE#
---------- ---------- ---------- --- ---------------- --------------- ---------------
         1          1         71 YES INACTIVE                 2576635         2577002
         2          1         72 NO  CURRENT                  2577002 281474976710655
The NEXT_CHANGE# of the CURRENT log is the same 281474976710655 here, so it is not set per instance or by a parameter: it is a fixed value, the highest possible SCN.
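The value 281474976710655 that shows up as NEXT_CHANGE# of the CURRENT log in both databases can be checked with a few lines of arithmetic. A small sketch; the constant and function names are made up for illustration:

```python
# Highest value that fits in 48 bits; this is the number shown as
# NEXT_CHANGE# for the CURRENT log in the outputs above.
MAX_SCN_48BIT = 2**48 - 1


def is_open_ended(next_change: int) -> bool:
    """True when NEXT_CHANGE# is the placeholder and not a real end SCN."""
    return next_change == MAX_SCN_48BIT


print(MAX_SCN_48BIT, hex(MAX_SCN_48BIT))
print(is_open_ended(281474976710655))   # CURRENT log
print(is_open_ended(13117648068039))    # an already switched log
```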
Because of that, the alert log only reports the start SCN (in the "Beginning" and "Completed" checkpoint messages). So, let's see how the NEXT_CHANGE# value gets updated.
Checkpoint information is externalized through the following x$ tables:
x$activeckpt => information about the active checkpoints (current log file), held by LGWR.
x$kcccp => information about the checkpoint progress. This is an important table; you can query it on a development database if you have one for testing.
x$ckptbuf => the buffer details held by the checkpoint process (metadata about the circular buffer queue, the state of each buffer and which buffers need to be written for the online redo log file).
x$targetrba => tells DBWR up to which RBA (redo byte address) it should write in order to advance the checkpoint.
Beginning log switch checkpoint up to RBA [0x1179.2.10], SCN: 13117480000458
Thread 1 advanced to log sequence 4473 (LGWR switch)
Current log# 1 seq# 4473 mem# 0: +DATA/admin/test/db/redot1g1m1.dbf
Current log# 1 seq# 4473 mem# 1: +RECO/admin/test/mirror/redot1g1m2.dbf
Mon Apr 01 06:00:40 2013
Archived Log entry 15695 added for thread 1 sequence 4472 ID 0x71e82a8e dest 1:
SYS@test> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 4466
Next log sequence to archive 4473
Current log sequence 4473 --- current sequence number
SYS@test> /
   INST_ID CKPT_RBA_SEQ CKPT_RBA_BNO CKPT_RBA_BOF    CKPT_ID     OBJ_ID  CKPT_TYPE --- this is from x$activeckpt
---------- ------------ ------------ ------------ ---------- ---------- ----------
         1         4473            2           16          0          0          3 -- thread checkpoint (you can test it with "alter system switch logfile")
         1   4294967295   4294967295        65535          0          0          7 -- incremental checkpoint (the default)
Note: by default the table maintains/returns 9 entries; as the number of checkpoints increases, the corresponding set of checkpoint RBAs has to be written out by DBWR.
Columns: CKPT_RBA_SEQ (sequence), CKPT_RBA_BNO (block number), CKPT_RBA_BOF (byte offset) and CKPT_TYPE (checkpoint type). I will come back to the checkpoint types at the end of this entry, since people often ask about them.
4294967295 and 65535 are the default maximum values for the sequence, block number and offset (2^32 - 1 and 2^16 - 1). That row is always the "incremental checkpoint": by default Oracle performs incremental checkpoints, that is, DBWR keeps flushing modified buffers from the checkpoint queue a batch at a time. As a result, DBWR has fewer changes left to flush at the log file switch by LGWR.
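To make the two x$activeckpt rows above easier to read, here is a small sketch that labels a row and recognises the "all maximum" values. The type names for 3 and 7 are only the ones observed in this post, not an official list, and the function name is hypothetical:

```python
MAX_UINT32 = 2**32 - 1   # 4294967295, seen in CKPT_RBA_SEQ and CKPT_RBA_BNO
MAX_UINT16 = 2**16 - 1   # 65535, seen in CKPT_RBA_BOF

# Checkpoint types as observed in this post only (assumption, not a full list).
CKPT_TYPES = {3: "thread checkpoint", 7: "incremental checkpoint"}


def describe_active_ckpt(seq: int, bno: int, bof: int, ckpt_type: int) -> str:
    """Return a one-line description of an x$activeckpt row."""
    kind = CKPT_TYPES.get(ckpt_type, f"unknown type {ckpt_type}")
    if (seq, bno, bof) == (MAX_UINT32, MAX_UINT32, MAX_UINT16):
        return f"{kind}: RBA fields hold the maximum placeholder values"
    return f"{kind}: up to RBA {seq}.{bno}.{bof}"


print(describe_active_ckpt(4473, 2, 16, 3))
print(describe_active_ckpt(4294967295, 4294967295, 65535, 7))
```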
Example 2: incremental checkpoint
Beginning log switch checkpoint up to RBA [0x118b.2.10], SCN: 13117652146221
Thread 1 advanced to log sequence 4491 (LGWR switch)
Current log# 3 seq# 4491 mem# 0: +DATA/admin/test/db/redot1g3m1.dbf
Current log# 3 seq# 4491 mem# 1: +RECO/admin/test/mirror/redot1g3m2.dbf
Mon Apr 01 11:14:38 2013
Archived Log entry 15774 added for thread 1 sequence 4490 ID 0x71e82a8e dest 1:
Mon Apr 01 11:20:50 2013
Incremental checkpoint up to RBA [0x118a.7bcc2.0], current log tail at RBA [0x118b.101c8.0]
Mon Apr 01 11:25:15 2013
Completed checkpoint up to RBA [0x118b.2.10], SCN: 13117652146221
Checkpoint start [0x118b.2.10] … incremental checkpoint [0x118a.7bcc2.0], log tail [0x118b.101c8.0] … completed RBA [0x118b.2.10]
SYS@test> select CPLRBA_SEQ,CPLRBA_BNO,CPLRBA_BOF,CPODR_SEQ,CPODR_BNO,CPODR_BOF,CPSDR_SEQ,CPSDR_BNO,CPSDR_ADB from X$KCCCP
  2  ;
CPLRBA_SEQ CPLRBA_BNO CPLRBA_BOF  CPODR_SEQ  CPODR_BNO  CPODR_BOF  CPSDR_SEQ  CPSDR_BNO  CPSDR_ADB
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
      4490     507074          0       4491      66967          0       4491          1          0
      4022     287035          0       4022    1924808          0       4022          1          0
      4026       1673          0       4026      25124          0       4026          1          0
      4021     607307          0       4021     734285          0       4021          1          0
         0          0          0          0          0          0          0          0          0
         0          0          0          0          0          0          0          0          0
6 rows selected.
SYS@test> /
CPLRBA_SEQ CPLRBA_BNO CPLRBA_BOF  CPODR_SEQ  CPODR_BNO  CPODR_BOF  CPSDR_SEQ  CPSDR_BNO  CPSDR_ADB
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
      4490     507074          0       4491      67434          0       4491          1          0
      4022     494449          0       4022    2117189          0       4022          1          0
      4026       3890          0       4026      26632          0       4026          1          0
      4021     608620          0       4021     734628          0       4021          1          0
         0          0          0          0          0          0          0          0          0
         0          0          0          0          0          0          0          0          0
From x$kcccp we get the details of the current checkpoint progress. While an incremental checkpoint is running, the only values we see changing are the block numbers in these columns:
CPLRBA_BNO (Checkpoint Last RBA Block Number) and CPODR_BNO (Checkpoint ODR; I'm not sure what Oracle meant by this abbreviation. It is often read as "on-disk RBA", which would fit the "current log tail" in the alert message; in any case it is the current block number that changes).
That signifies that DBWR is working on an incremental checkpoint, which resulted in the informational message in the alert log.
If we convert the hexadecimal values to decimal, they agree with the checkpoint progress data reported above.
Incremental checkpoint up to RBA [0x118a.7bcc2.0], current log tail at RBA [0x118b.101c8.0]
Mon Apr 01 11:25:15 2013
Completed checkpoint up to RBA [0x118b.2.10], SCN: 13117652146221
In decimal: 4490.507074.0 (incremental checkpoint RBA) and 4491.65992.0 (current log tail)
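The hexadecimal-to-decimal conversion of the RBAs in the alert message can be done with a short helper. This is only a sketch that assumes the bracketed [sequence.block.offset] notation shown in the alert log; the names are made up for illustration:

```python
from typing import NamedTuple


class RBA(NamedTuple):
    """Redo byte address: log sequence, block number, byte offset."""
    seq: int
    block: int
    offset: int


def parse_rba(text: str) -> RBA:
    """Parse an alert-log RBA such as '[0x118a.7bcc2.0]' into decimal parts.

    All three parts are hexadecimal; only the first carries the 0x prefix.
    """
    seq, block, offset = text.strip().strip("[]").split(".")
    return RBA(int(seq, 16), int(block, 16), int(offset, 16))


print(parse_rba("[0x118a.7bcc2.0]"))   # incremental checkpoint RBA
print(parse_rba("[0x118b.101c8.0]"))   # current log tail
print(parse_rba("[0x118b.2.10]"))      # log switch checkpoint RBA
```

The first result lines up with CPLRBA_SEQ and CPLRBA_BNO (4490 and 507074) in the x$kcccp output, and the last one with the sequence.2.16 pattern seen in x$activeckpt.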
Look at the x$targetrba
SYS@test> /
ADDR                   INDX    INST_ID  LOGFILESZ TOTALLOGSZ    LGLOGSZ CUR_EST_RCV_READS ACTUAL_REDO_BLKS TARGET_RBA_SEQ TARGET_RBA_BNO TARGET_RBA_BOF    MIN_LAG     CT_LAG     CI_LAG
---------------- ---------- ---------- ---------- ---------- ---------- ----------------- ---------------- -------------- -------------- -------------- ---------- ---------- ----------
FFFFFFFF7C85AF38          0          1   47574945   67125248    8390656              8892           231087           4482              2             16     554162     554162          0
SYS@test> /
ADDR                   INDX    INST_ID  LOGFILESZ TOTALLOGSZ    LGLOGSZ CUR_EST_RCV_READS ACTUAL_REDO_BLKS TARGET_RBA_SEQ TARGET_RBA_BNO TARGET_RBA_BOF    MIN_LAG     CT_LAG     CI_LAG
---------------- ---------- ---------- ---------- ---------- ---------- ----------------- ---------------- -------------- -------------- -------------- ---------- ---------- ----------
FFFFFFFF7C859F00          0          1   47574945   67125248    8390656              8883           233067           4482              2             16     555352     555352          0
SYS@test> /
ADDR                   INDX    INST_ID  LOGFILESZ TOTALLOGSZ    LGLOGSZ CUR_EST_RCV_READS ACTUAL_REDO_BLKS TARGET_RBA_SEQ TARGET_RBA_BNO TARGET_RBA_BOF    MIN_LAG     CT_LAG     CI_LAG
---------------- ---------- ---------- ---------- ---------- ---------- ----------------- ---------------- -------------- -------------- -------------- ---------- ---------- ----------
FFFFFFFF7C85AF38          0          1   47574945   67125248    8390656              9161           252268           4482              2             16     572719     572719          0
TARGET_RBA_SEQ and TARGET_RBA_BNO did not change across these executions (4482 and 2 every time).
MIN_LAG and CT_LAG refer to the lag with which DBWR writes/flushes buffers to disk from the checkpoint queue (incremental checkpoint). The minimum relates to the minimum amount of instance recovery that would be estimated after an instance crash.
SQL> desc X$KCCCP
 Name                                      Null?    Type
 ----------------------------------------- -------- -----------------
 ADDR                                               RAW(8)
 INDX                                               NUMBER
 INST_ID                                            NUMBER
 CPTNO                                              NUMBER
 CPSTA                                              NUMBER
 CPFLG                                              NUMBER
 CPDRT                                              NUMBER
 CPRDB                                              NUMBER
 CPLRBA_SEQ                                         NUMBER
 CPLRBA_BNO                                         NUMBER
 CPLRBA_BOF                                         NUMBER
 CPODR_SEQ                                          NUMBER
 CPODR_BNO                                          NUMBER
 CPODR_BOF                                          NUMBER
 CPODS                                              VARCHAR2(16)
 CPODT                                              VARCHAR2(20)
 CPODT_I                                            NUMBER
 CPHBT                                              NUMBER
 CPRLS                                              VARCHAR2(16)
 CPRLC                                              NUMBER
 CPMID                                              NUMBER
 CPSDR_SEQ                                          NUMBER
 CPSDR_BNO                                          NUMBER
 CPSDR_ADB                                          NUMBER
CPRLS --- the resetlogs SCN, RESETLOGS_CHANGE# in v$database
CPRLC --- the resetlogs ID (RESETLOGS_ID, as shown in v$database_incarnation)
CPHBT --- the heartbeat. Oracle updates it against the "current" log file at intervals of a little less than 3 seconds, while the SCN moves on with the transaction changes.
Archive log format With Checkpoint information
log_archive_format string test_%t_%s_%r.arc
+RECO/test/archivelog/2013_04_01/thread_1_seq_4494.21709.811600433 ---- how does this naming take place? Let's look at the details from "x$kcccp".
SYS@test> /
ADDR                   INDX    INST_ID      CPTNO      CPSTA      CPFLG      CPDRT      CPRDB CPLRBA_SEQ
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
CPLRBA_BNO CPLRBA_BOF  CPODR_SEQ  CPODR_BNO  CPODR_BOF CPODS            CPODT                   CPODT_I      CPHBT
---------- ---------- ---------- ---------- ---------- ---------------- -------------------- ---------- ----------
CPRLS                 CPRLC      CPMID  CPSDR_SEQ  CPSDR_BNO  CPSDR_ADB
---------------- ---------- ---------- ---------- ---------- ----------
FFFFFFFF7C759198          0          1          1          2          0       8624     333039       4496
    966276          0       4497      32142          0 13117724768875   04/01/2013 13:14:11   811602851  811558292
13057495175924    806639088 1915820108       4497          1          0

FFFFFFFF7C759198          1          1          2          2          0       1776     333039       4027
   2060311          0       4028       3625          0 13117724769008   04/01/2013 13:14:12   811602852  811645935
13057495175924    806639088 1915820108       4028          1          0

FFFFFFFF7C759198          2          1          3          2          0      13626     333039       4031
    698545          0       4031     848492          0 13117724769489   04/01/2013 13:14:12   811602852  811633825
13057495175924    806639088 1915820108       4031          1          0

FFFFFFFF7C759198          3          1          4          2          0      71465     333039       4027
     24334          0       4027    1533790          0 13117724769648   04/01/2013 13:14:12   811602852  811626653
13057495175924    806639088 1915820108       4027          1          0

FFFFFFFF7C759198          4          1          5          0          0          0     333039          0
         0          0          0          0          0 0                                             0          0
0                         0          0          0          0          0

FFFFFFFF7C759198          5          1          6          0          0          0     333039          0
         0          0          0          0          0 0                                             0          0
0                         0          0          0          0          0

6 rows selected.
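Please concentrate on the columns CPODS, which holds an SCN (13117724768875 in the first row), and CPODT, which holds the matching date and time (04/01/2013 13:14:11). Converting the SCN to a timestamp shows that the two agree: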
SYS@test> select scn_to_timestamp(13117724768875) as timestamp from dual;
TIMESTAMP
---------------------------------------------------------------------------
01-APR-13 01.14.11.000000000 PM
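The numeric column CPODT_I next to CPODT looks like the same date and time packed into a single number. The formula below is not from the Oracle documentation or from this post; it is the commonly described internal time encoding (years counted from 1988, every month treated as 31 days), used here as an assumption and checked only against the rows shown above. The function names are hypothetical.

```python
from datetime import datetime


def encode_oracle_time(dt: datetime) -> int:
    """Pack a date and time into one number (assumed internal encoding)."""
    months = (dt.year - 1988) * 12 + (dt.month - 1)
    days = months * 31 + (dt.day - 1)
    return ((days * 24 + dt.hour) * 60 + dt.minute) * 60 + dt.second


def decode_oracle_time(value: int) -> datetime:
    """Reverse of encode_oracle_time."""
    value, second = divmod(value, 60)
    value, minute = divmod(value, 60)
    value, hour = divmod(value, 24)
    value, day0 = divmod(value, 31)
    year0, month0 = divmod(value, 12)
    return datetime(year0 + 1988, month0 + 1, day0 + 1, hour, minute, second)


# CPODT of the first x$kcccp row, in MM/DD/YYYY HH24:MI:SS format.
cpodt = datetime.strptime("04/01/2013 13:14:11", "%m/%d/%Y %H:%M:%S")
print(encode_oracle_time(cpodt))        # compare with CPODT_I of that row
print(decode_oracle_time(811602852))    # CPODT_I of the second row
print(decode_oracle_time(811600433))    # suffix of the archived log name
```

Under this assumption the first row encodes to 811602851, which is exactly its CPODT_I. The last number in the archived log name, 811600433, decodes to 2013-04-01 12:33:53, which is at least consistent with the 2013_04_01 directory in the path.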
Now, Oracle increments the heartbeat roughly every 3 seconds (a little less than 3 seconds); you can see the difference between the columns CPODT_I (time increment) and CPHBT (heartbeat). This feeds the final end SCN: when LGWR switches to a new log file, that SCN is recorded as the end SCN in v$archived_log (it is not predictable before the switch to the new log file happens).
Different types of checkpoints? Let's answer this question.
I see only two types of checkpoints here (incremental, which is the default, and the thread checkpoint that we see in the alert log).
7 --- incremental checkpoint (default)
3 --- thread checkpoint (Log Switch)
There exist other checkpoint types as per the document I referred to earlier; we need to debug further. I will cover this in an upcoming post on the checkpoint queue.
Checkpoint Buffer Queue
The checkpoint buffer queue provides information about the state of the buffers and the number of buffers that need to be flushed out by DBWR. As you can see below, the query returns 401 rows (the output ends at queue 400), with 64 entries in each queue, in different states.
The buffer queues are subject to an internal algorithm (I'm still looking for ways to debug this).
SQL> select queue_num ,count(*) from x$ckptbuf group by queue_num order by 1;
 QUEUE_NUM   COUNT(*)
---------- ----------
       396         64
       397         64
       398         64
       399         64
       400         64
401 rows selected.
SQL> desc x$ckptbuf
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------
 ADDR                                               RAW(8)
 INDX                                               NUMBER
 INST_ID                                            NUMBER
 QUEUE_NUM                                          NUMBER
 SET_NUM                                            NUMBER
 LATCH_NUM                                          NUMBER
 BUF_PTR                                            RAW(8)
 BUF_TS#                                            NUMBER
 BUF_FILE#                                          NUMBER
 BUF_DBARFIL                                        NUMBER
 BUF_DBABLK                                         NUMBER
 BUF_STATE                                          NUMBER
 BUF_COUNT                                          NUMBER
 BUF_RBA_SEQ                                        NUMBER
 BUF_RBA_BNO                                        NUMBER
 BUF_RBA_BOF                                        NUMBER
In a later post I will discuss this buffer view further, including how to map it to redo log vectors, if it turns out to be possible to debug that. Stay tuned, and I hope you enjoyed this one …