An In-Depth Look at Instance Recovery

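When does instance recovery happen? If the database server loses power unexpectedly, instance recovery takes place the next time the database is started. The database performs it automatically, with no intervention from the DBA. There is one precondition: the data files, online redo log files, and control files must not be damaged.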

Let's walk through the whole instance recovery process with an experiment.

1. Before shutting down the database, look at the checkpoint SCNs

SQL> select checkpoint_change# from v$database;

     CHECKPOINT_CHANGE#

       ------------------

           1455180

--The database checkpoint SCN stored in the control file; in effect it is the smallest checkpoint SCN among all the data file headers

SQL> select file#,checkpoint_change# from v$datafile;

        FILE# CHECKPOINT_CHANGE#

        ---------- ------------------

         1            1455180

         2            1455180

         3            1455180

         4            1455180

         5            1455180

         6            1455180

--The data file checkpoint SCNs stored in the control file: when a checkpoint completes, Oracle records each data file's SCN separately in the control file

SQL> select file#,checkpoint_change# from v$datafile_header;

       FILE# CHECKPOINT_CHANGE#

        ---------- ------------------

         1            1455180

         2            1455180

         3            1455180

         4            1455180

         5            1455180

         6            1455180

--The checkpoint SCN in each data file's header

All three checkpoint SCNs match. Next, simulate an abnormal power loss and restart.

2. This command simulates an abnormal power loss

SQL> shutdown abort;

ORACLE instance shut down.

3. Watch the alert log

[oracle@guoyj trace]$ tail -f alert_bxocp.log

Starting background process VKRM

Tue Dec 11 22:54:41 2012

VKRM started with pid=24, OS id=12500

Tue Dec 11 22:58:11 2012

Shutting down instance (abort)

License high water mark = 3

USER (ospid: 12479): terminating the instance

Instance terminated by USER, pid = 12479

Tue Dec 11 22:58:12 2012

Instance shutdown complete

4. Start the database to MOUNT

SQL> startup mount;

ORACLE instance started.

Total System Global Area  839282688 bytes

Fixed Size                  2233000 bytes

Variable Size             524291416 bytes

Database Buffers          310378496 bytes

Redo Buffers                2379776 bytes

Database mounted.

5. Check the checkpoint SCNs again at this point

SQL> select checkpoint_change# from v$database;

    CHECKPOINT_CHANGE#

        ------------------

         1455180

SQL> select file#,checkpoint_change# from v$datafile;

     FILE# CHECKPOINT_CHANGE#

       ---------- ------------------

         1            1455180

         2            1455180

         3            1455180

         4            1455180

         5            1455180

         6            1455180

6 rows selected.

SQL> select file#,checkpoint_change# from v$datafile_header;

     FILE# CHECKPOINT_CHANGE#

      ---------- ------------------

         1            1455180

         2            1455180

         3            1455180

         4            1455180

         5            1455180

         6            1455180

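These are the same checkpoint SCNs as before the simulated power loss. Because the three values agree with each other, no media recovery is needed.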
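The comparison made in steps 1 and 5 can be written down as a small check. The following is only a sketch in Python that works on values copied by hand from the query output above. The function name `files_needing_media_recovery` is made up for illustration, and the rule it encodes (a header SCN that differs from the control file's copy for the same file points to media recovery) is a simplification of what Oracle actually checks at startup.

```python
def files_needing_media_recovery(controlfile_scns, header_scns):
    """Return the file numbers whose header checkpoint SCN differs from
    the checkpoint SCN recorded for that file in the control file.

    Simplified rule for illustration only; not Oracle's real algorithm.
    """
    return sorted(
        file_no
        for file_no, ctl_scn in controlfile_scns.items()
        if header_scns.get(file_no) != ctl_scn
    )


# Values copied from v$datafile and v$datafile_header in this experiment.
controlfile_scns = {file_no: 1455180 for file_no in range(1, 7)}
header_scns = {file_no: 1455180 for file_no in range(1, 7)}

print(files_needing_media_recovery(controlfile_scns, header_scns))  # []
```

An empty list corresponds to the situation in this experiment: every header agrees with the control file, so only instance recovery is left to do.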

Don't open the database just yet; first we take a few dumps.

6. Dump the control file

alter session set events 'immediate trace name CONTROLF level 12';

Partial output:

***************************************************************************

DATABASE ENTRY

***************************************************************************

(size = 316, compat size = 316, section max = 1, section in-use = 1,

  last-recid= 0, old-recno = 0, last-recno = 0)

(extent = 1, blkno = 1, numrecs = 1)

12/07/2012 10:36:14

DB Name "BXOCP"

Database flags = 0x00404000 0x00001000

Controlfile Creation Timestamp  12/07/2012 10:36:15

Incmplt recovery scn: 0x0000.00000000

Resetlogs scn: 0x0000.000f30dc Resetlogs Timestamp  12/07/2012 10:36:16

Prior resetlogs scn: 0x0000.00000001 Prior resetlogs Timestamp  09/17/2011 09:46:04

Redo Version: compatible=0xb200000

#Data files = 6, #Online files = 6

Database checkpoint: Thread=1 scn: 0x0000.0016344c --database checkpoint SCN; 0x16344c is 1455180 in decimal


  Threads: #Enabled=1, #Open=1, Head=1, Tail=1

***************************************************************************

CHECKPOINT PROGRESS RECORDS

***************************************************************************

(size = 8180, compat size = 8180, section max = 11, section in-use = 0,

  last-recid= 0, old-recno = 0, last-recno = 0)

(extent = 1, blkno = 2, numrecs = 11)

THREAD #1 - status:0x2 flags:0x0 dirty:55

low cache rba:(0x13.3.0) on disk rba:(0x13.a6.0)

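-- low cache rba:(0x13.3.0) is the starting point of instance recovery: log sequence 19, block 3, byte 0

--on disk rba:(0x13.a6.0) is the end point of instance recovery: log sequence 19, block 166, byte 0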


on disk scn: 0x0000.0016359c 12/11/2012 22:57:42

resetlogs scn: 0x0000.000f30dc 12/07/2012 10:36:16

heartbeat: 801789080 mount id: 848836772

THREAD #2 - status:0x0 flags:0x0 dirty:0

low cache rba:(0x0.0.0) on disk rba:(0x0.0.0)

on disk scn: 0x0000.00000000 01/01/1988 00:00:00

resetlogs scn: 0x0000.00000000 01/01/1988 00:00:00

heartbeat: 0 mount id: 0

***************************************************************************

DATA FILE RECORDS

***************************************************************************

(size = 520, compat size = 520, section max = 100, section in-use = 6,

  last-recid= 43, old-recno = 0, last-recno = 0)

(extent = 1, blkno = 11, numrecs = 100)

DATA FILE #1:

  name #7: /oradata/bxocp/system01.dbf

creation size=0 block size=8192 status=0xe head=7 tail=7 dup=1

tablespace 0, index=1 krfil=1 prev_file=0

unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00

Checkpoint cnt:121 scn: 0x0000.0016344c 12/11/2012 22:54:36

--data file checkpoint SCN stored in the control file; 0x16344c is 1455180 in decimal

Stop scn: 0xffff.ffffffff 12/11/2012 22:53:05

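--The stop SCN is set to infinity (0xffff.ffffffff), which means the database was not shut down cleanly, so instance recovery is required at the next startup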


   Creation Checkpointed at scn:  0x0000.00000007 09/17/2011 09:46:08

thread:0 rba:(0x0.0.0)
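The hexadecimal SCNs in the dump can be converted to the decimal values shown by the v$ views. The sketch below is in Python and assumes the `0xWWWW.BBBBBBBB` notation is a 16-bit wrap followed by a 32-bit base, which is how these values are commonly read for this Oracle version. The names `parse_scn` and `INFINITE_SCN` are introduced here for illustration only.

```python
# 0xffff.ffffffff, the "infinite" stop SCN seen in the dump above.
INFINITE_SCN = (0xFFFF << 32) | 0xFFFFFFFF


def parse_scn(text):
    """Convert a dump-style SCN such as '0x0000.0016344c' to an integer.

    Assumes the form 0x<wrap>.<base> with a 32-bit base.
    """
    wrap_hex, base_hex = text.split(".")
    # int(..., 16) accepts an optional '0x' prefix.
    return (int(wrap_hex, 16) << 32) | int(base_hex, 16)


print(parse_scn("0x0000.0016344c"))                  # 1455180
print(parse_scn("0x0000.0016359c"))                  # 1455516
print(parse_scn("0xffff.ffffffff") == INFINITE_SCN)  # True
```

The first value is the checkpoint SCN reported by the queries in steps 1 and 5, the second is the on disk scn from the checkpoint progress record, and the third shows how a stop SCN of "infinity" can be recognised.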

7. Dump the data file headers

alter session set events 'immediate trace name file_hdrs level 10';

Part of the data file header output:

V10 STYLE FILE HEADER:

        Compatibility Vsn = 186646528=0xb200000

        Db ID=848459038=0x3292751e, Db Name='BXOCP'

        Activation ID=0=0x0

        Control Seq=2099=0x833, File size=79360=0x13600

        File Number=2, Blksiz=8192, File Type=3 DATA

Tablespace #1 - SYSAUX  rel_fn:2

Creation   at   scn: 0x0000.0000088c 09/17/2011 09:46:16

Backup taken at scn: 0x0000.00000000 01/01/1988 00:00:00 thread:0

reset logs count:0x2fc45da0 scn: 0x0000.000f30dc

prev reset logs count:0x2d6c775c scn: 0x0000.00000001

recovered at 12/11/2012 22:54:36

status:0x4 root dba:0x00000000 chkpt cnt: 121 ctl cnt:120

begin-hot-backup file size: 0

Checkpointed at scn:  0x0000.0016344c 12/11/2012 22:54:36

--checkpoint SCN in the data file header; 0x16344c is 1455180 in decimal

thread:1 rba:(0x13.2.10)

--Redo address 0x13.2.10 => recovery would start at log sequence 19, block 2, byte 16


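Note:

The redo recovery start address taken from the control file:

low cache rba:(0x13.3.0) recovery starts at log sequence 19, block 3, byte 0

The redo recovery start address taken from the data file header:

thread:1 rba:(0x13.2.10) recovery starts at log sequence 19, block 2, byte 16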
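To compare the two addresses it helps to decode them. The sketch below, in Python, reads an RBA as three hexadecimal fields: log sequence, block number, and byte offset. The `RBA` type and the `parse_rba` helper are names invented for this illustration and are not part of any Oracle tool.

```python
from typing import NamedTuple


class RBA(NamedTuple):
    """Redo byte address: log sequence, block number, byte offset."""
    sequence: int
    block: int
    offset: int


def parse_rba(text):
    """Parse a dump-style RBA such as '(0x13.a6.0)'; all fields are hex."""
    seq_hex, block_hex, offset_hex = text.strip("()").split(".")
    return RBA(int(seq_hex, 16), int(block_hex, 16), int(offset_hex, 16))


low_cache = parse_rba("(0x13.3.0)")     # from the control file
file_header = parse_rba("(0x13.2.10)")  # from the data file header
on_disk = parse_rba("(0x13.a6.0)")      # from the control file

print(low_cache)                # RBA(sequence=19, block=3, offset=0)
print(file_header)              # RBA(sequence=19, block=2, offset=16)
print(file_header < low_cache)  # True
```

Because named tuples compare field by field, the last line shows that the address in the file header lies earlier in log sequence 19 than the low cache RBA.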

8. Finally, open the database and watch the alert log alert_bxocp.log to see how the recovery proceeds

[oracle@guoyj trace]$ tail -f alert_bxocp.log

alter database open

Beginning crash recovery of 1 threads

Started redo scan

Completed redo scan

read 81 KB redo, 55 data blocks need recovery

Started redo application at

Thread 1: logseq 19, block 3   --redo position where instance recovery starts: log sequence 19, block 3


Recovery of Online Redo Log: Thread 1 Group 1 Seq 19 Reading mem 0

  Mem# 0: /oradata/bxocp/redo01.log

Completed redo application of 0.06MB

Completed crash recovery at

Thread 1: logseq 19, block 166, scn 1475516  --redo position where instance recovery ends: log sequence 19, block 166


  55 data blocks read, 55 data blocks written, 81 redo k-bytes read

Tue Dec 11 23:46:42 2012

Thread 1 advanced to log sequence 20 (thread open)

Thread 1 opened at log sequence 20

  Current log# 2 seq# 20 mem# 0: /oradata/bxocp/redo02.log

Successful open of redo thread 1

MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set

[12867] Successfully onlined Undo Tablespace 2.

Undo initialization finished serial:0 start:20725234 end:20725294 diff:60 (0 seconds)

Verifying file header compatibility for 11g tablespace encryption..

Verifying 11g file header compatibility for tablespace encryption completed

Tue Dec 11 23:46:42 2012

SMON: enabling cache recovery

SMON: enabling tx recovery

Database Characterset is ZHS16GBK

No Resource Manager plan active

replication_dependency_tracking turned off (no async multimaster replication found)

Starting background process QMNC

Tue Dec 11 23:46:43 2012

QMNC started with pid=21, OS id=13839

Completed: alter database open

Tue Dec 11 23:46:44 2012

Starting background process CJQ0

Tue Dec 11 23:46:44 2012

CJQ0 started with pid=22, OS id=13851

Setting Resource Manager plan SCHEDULER[0x318A]:DEFAULT_MAINTENANCE_PLAN via scheduler window

Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter

Tue Dec 11 23:46:47 2012

Starting background process VKRM

Tue Dec 11 23:46:47 2012

VKRM started with pid=23, OS id=13857

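9. As the alert log shows, instance recovery starts applying redo at the control file's low cache rba:(0x13.3.0), that is log sequence 19, block 3, byte 0, and not at the file header's thread:1 rba:(0x13.2.10)

--Redo address 0x13.2.10 => log sequence 19, block 2, byte 16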
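The block numbers also give a rough cross-check against the "read 81 KB redo" line in the alert log. The Python sketch below assumes a redo log block size of 512 bytes, which is typical on Linux but is not stated anywhere in this experiment. The helper `redo_kb_between` is a name made up for illustration.

```python
REDO_BLOCK_SIZE = 512  # bytes; an assumption, not shown in the dumps above


def redo_kb_between(start_block, end_block, block_size=REDO_BLOCK_SIZE):
    """Approximate KB of redo between two blocks of the same log sequence."""
    return (end_block - start_block) * block_size / 1024


# low cache rba block 0x3 to on disk rba block 0xa6, both in sequence 19.
print(0xA6 - 0x3)                  # 163
print(redo_kb_between(0x3, 0xA6))  # 81.5
```

Under that assumption, 163 blocks come to about 81.5 KB, which is in line with the 81 KB of redo the alert log says was read.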

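10. To sum up instance recovery

(1) The data files, online redo log files, and control files must not be damaged

(2) The database recovers automatically, with no DBA intervention

(3) Recovery needs only the online redo logs, not the archived logs

(4) Instance recovery begins when the database is opened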

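Strictly speaking, this write-up of the instance recovery experiment is not complete: the last step, rollback, is still missing. I leave that one for you to think about!

The three steps of instance recovery: roll forward ---> open the database ---> roll back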
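The order of the three steps can be illustrated with a toy model. The Python sketch below is not how Oracle implements recovery: it assumes each redo record carries both a before image and an after image, and that any one block is changed by a single transaction only. In the real database the before images live in undo segments, whose changes are themselves protected by redo. All names here are hypothetical.

```python
def instance_recovery(datafile, redo_log, committed):
    """Toy model: replay all redo, then undo uncommitted transactions."""
    blocks = dict(datafile)

    # Step 1, roll forward: reapply every change, committed or not.
    for txn, block, before, after in redo_log:
        blocks[block] = after

    # Step 2: in the real sequence the database is opened at this point.

    # Step 3, roll back: walk the changes in reverse order and restore
    # the before image for transactions that never committed.
    for txn, block, before, after in reversed(redo_log):
        if txn not in committed:
            blocks[block] = before

    return blocks


datafile = {"A": 10, "B": 20}  # block contents on disk at crash time
redo_log = [
    ("T1", "A", 10, 11),
    ("T2", "B", 20, 25),
    ("T1", "A", 11, 12),
]

print(instance_recovery(datafile, redo_log, committed={"T1"}))
# {'A': 12, 'B': 20}
```

T1 committed, so its changes to block A survive. T2 did not, so block B is first rolled forward to 25 and then rolled back to 20.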


In fact, the On Disk RBA is not the end point of instance recovery!
