记一次存储block corrupt 导致数据库大恢复

前几天在vmware装了一次Oracle 10g +rac +asm,详见http://itspace.iteye.com/blog/631585。
停止mcrac1节点vip,注意红色字体标示,也将mcrac1节点的asm实例停止了。
引用
[oracle@mcrac1 dbs]$ crs_stop ora.mcrac1.vip -f
Attempting to stop `ora.mcrac1.LISTENER_MCRAC1.lsnr` on member `mcrac1`
Stop of `ora.mcrac1.LISTENER_MCRAC1.lsnr` on member `mcrac1` succeeded.
Attempting to stop `ora.mcrac1.ASM1.asm` on member `mcrac1`
Stop of `ora.mcrac1.ASM1.asm` on member `mcrac1` succeeded.
Attempting to stop `ora.mcrac1.vip` on member `mcrac1`
Stop of `ora.mcrac1.vip` on member `mcrac1` succeeded.

asm实例停止之后,再次在mcrac1节点启动实例的时候意外发生了:
引用
SQL> startup
ASM instance started

Total System Global Area   92274688 bytes
Fixed Size                  1217884 bytes
Variable Size              65890980 bytes
ASM Cache                  25165824 bytes
ORA-15032: not all alterations performed
ORA-15130: diskgroup "DATADG" is being dismounted

进一步后台观察alert日志,发现有Corrupt Block产生,导致ASM实例启动错误:
引用
Sat Apr  3 16:22:17 2010
SQL> ALTER DISKGROUP ALL MOUNT
Sat Apr  3 16:22:17 2010
NOTE: cache registered group DATADG number=1 incarn=0x811fef38
Sat Apr  3 16:22:17 2010
Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
Sat Apr  3 16:22:17 2010
ORA-15186: ASMLIB error function = [asm_open],  error = [1],  mesg = [Operation not permitted]
Sat Apr  3 16:22:17 2010
ORA-15186: ASMLIB error function = [asm_open],  error = [1],  mesg = [Operation not permitted]
Sat Apr  3 16:22:17 2010
ORA-15186: ASMLIB error function = [asm_open],  error = [1],  mesg = [Operation not permitted]
Sat Apr  3 16:22:17 2010
NOTE: Hbeat: instance not first (grp 1)
NOTE: cache opening disk 0 of grp 1: DATADG_0000 path:/dev/raw/raw1
Sat Apr  3 16:22:17 2010
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATADG_0001 path:/dev/raw/raw2
NOTE: cache opening disk 2 of grp 1: DATADG_0002 path:/dev/raw/raw3
NOTE: cache mounting (not first) group 1/0x811FEF38 (DATADG)
Sat Apr  3 16:22:17 2010
kjbdomatt send to node 1
Sat Apr  3 16:22:18 2010
NOTE: attached to recovery domain 1
Sat Apr  3 16:22:19 2010
WARNING: cache failed to read fn=3  indblk=0 from disk(s): 2
ORA-15196: invalid ASM block header [kfc.c:7910] [endian_kfbh] [3] [2147483648] [6 != 1]
NOTE: a corrupted block was dumped to the trace file
System State dumped to trace file /oracle/app/admin/+ASM/bdump/+asm1_lgwr_11493.trc
NOTE: cache initiating offline of disk 2  group 1
WARNING: offlining disk 2.3956219848 (DATADG_0002) with mask 0x3
NOTE: PST update: grp = 1, dsk = 2, mode = 0x6
Sat Apr  3 16:22:19 2010
ERROR: too many offline disks in PST (grp 1)
Sat Apr  3 16:22:20 2010
NOTE: PST not enabling heartbeating (grp 1): group dismounted
Sat Apr  3 16:22:20 2010
NOTE: halting all I/Os to diskgroup DATADG
NOTE: active pin found: 0x0x2427ccd0
NOTE: active pin found: 0x0x2427cc64
Sat Apr  3 16:22:20 2010
WARNING: instance recovery required
Sat Apr  3 16:22:20 2010
NOTE: Hbeat: instance not first (grp 1)
Sat Apr  3 16:22:20 2010
NOTE: cache mounting (not first) (retry) group 1/0x811FEF38 (DATADG)
ERROR: ORA-15130 signalled during mount of diskgroup DATADG
NOTE: cache dismounting group 1/0x811FEF38 (DATADG)
Sat Apr  3 16:22:21 2010
kjbdomdet send to node 1
detach from dom 1, sending detach message to node 1
Sat Apr  3 16:22:21 2010
Dirty detach reconfiguration started (old inc 6, new inc 6)
List of nodes:
0 1
Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE
35 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Sat Apr  3 16:22:21 2010
WARNING: dirty detached from domain 1
Sat Apr  3 16:22:21 2010
NOTE: PST enabling heartbeating (grp 1)
Sat Apr  3 16:22:21 2010
ERROR: diskgroup DATADG was not mounted
Sat Apr  3 16:22:21 2010
WARNING: PST-initiated MANDATORY DISMOUNT of group DATADG not performed - group not mounted
Sat Apr  3 16:22:21 2010
Errors in file /oracle/app/admin/+ASM/bdump/+asm1_b000_11614.trc:
ORA-15001: diskgroup "DATADG" does not exist or is not mounted

后台trace文件有如下提示,从提示中可以看出kfbh.endian已经为0x06了,正确的值应该为1.
引用
*** SERVICE NAME:() 2010-04-03 16:44:53.125
*** SESSION ID:(41.1) 2010-04-03 16:44:53.125
OSM metadata block dump:
kfbh.endian:                          6 ; 0x000: 0x06
kfbh.hard:                          162 ; 0x001: 0xa2
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                12617264 ; 0x004: T=0 NUMB=0xc08630
kfbh.block.obj:                  853426 ; 0x008: TYPE=0x0 NUMB=0xd05b2
kfbh.check:                    67174400 ; 0x00c: 0x04010000
kfbh.fcn.base:                    21267 ; 0x010: 0x00005313
kfbh.fcn.wrap:                        1 ; 0x014: 0x00000001
kfbh.spare1:                      51428 ; 0x018: 0x0000c8e4
kfbh.spare2:                     853426 ; 0x01c: 0x000d05b2


从理论上来讲,mcrac1和mcrac2节点用的是共享存储,一旦mcrac1上发生Corrupt Block,mcrac2节点也用该同样启动错误,但奇怪的是mcrac2节点asm实例启动正常,从alert日志中可以看出。
引用
Sat Apr  3 16:55:19 2010
SQL> ALTER DISKGROUP ALL MOUNT
Sat Apr  3 16:55:19 2010
NOTE: cache registered group DATADG number=1 incarn=0xbb92fa11
Sat Apr  3 16:55:19 2010
Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
Sat Apr  3 16:55:19 2010
NOTE: Hbeat: instance first (grp 1)
Sat Apr  3 16:55:23 2010
NOTE: start heartbeating (grp 1)
NOTE: cache opening disk 0 of grp 1: DATADG_0000 label:VOL1
Sat Apr  3 16:55:23 2010
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATADG_0001 label:VOL2
NOTE: cache opening disk 2 of grp 1: DATADG_0002 label:VOL3
NOTE: cache mounting (first) group 1/0xBB92FA11 (DATADG)
* allocate domain 1, invalid = TRUE
kjbdomatt send to node 0
Sat Apr  3 16:55:23 2010
NOTE: attached to recovery domain 1
Sat Apr  3 16:55:24 2010
NOTE: cache recovered group 1 to fcn 0.3049
Sat Apr  3 16:55:24 2010
NOTE: opening chunk 1 at fcn 0.3049 ABA
NOTE: seq=7 blk=214
Sat Apr  3 16:55:24 2010
NOTE: cache mounting group 1/0xBB92FA11 (DATADG) succeeded
SUCCESS: diskgroup DATADG was mounted
Sat Apr  3 16:55:24 2010
NOTE: recovering COD for group 1/0xbb92fa11 (DATADG)
SUCCESS: completed COD recovery for group 1/0xbb92fa11 (DATADG)

考虑到vmware存在bug,有时候只需要将出问题的虚拟机重启,即可解决问题,于是将mcrac1节点虚拟机重启,重启之后问题依旧。既然mcrac2节点能启动正常,于是在mcrac2节点数据库实例在mount状态下将数据库做全备。
引用
SQL> select dbid from v$database;

      DBID
----------
676148534

引用
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP on;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored

RMAN> backup database format '/ocfs2/backup/%d__%s_%p_%T.bkp';

Starting backup at 03-APR-10
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=+DATADG/dbrac/datafile/system.256.715082291
input datafile fno=00003 name=+DATADG/dbrac/datafile/sysaux.257.715082291
input datafile fno=00002 name=+DATADG/dbrac/datafile/undotbs1.258.715082293
input datafile fno=00005 name=+DATADG/dbrac/datafile/undotbs2.264.715082411
input datafile fno=00004 name=+DATADG/dbrac/datafile/users.259.715082293
channel ORA_DISK_1: starting piece 1 at 03-APR-10
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 04/03/2010 17:26:00
ORA-19566: exceeded limit of 0 corrupt blocks for file

+DATADG/dbrac/datafile/sysaux.257.715082291

不幸的是,3号数据文件即辅助表空间已经发生Corrupt Block。设置参数,使得能备份成功。
引用
RUN
{
  SET MAXCORRUPT FOR DATAFILE 3 TO 100000;
  BACKUP database format '/ocfs2/backup/%d__%s_%p_%T.bkp';
}

查看V$DATABASE_BLOCK_CORRUPTION可以看到3好数据文件上已经有201个Corrupt Block
引用
QL> select * from  V$DATABASE_BLOCK_CORRUPTION;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
         3      30724          1                  0 FRACTURED
         3      30725         52                  0 CORRUPT
         3      30778          3                  0 CORRUPT
         3      30782          1                  0 CORRUPT
         3      30785         14                  0 CORRUPT
         3      30800          1                  0 CORRUPT
...


         3      37341          3                  0 ALL ZERO
         3      37401          8                  0 ALL ZERO
         3      37410          2                  0 ALL ZERO
         3      37413         20                  0 ALL ZERO
         3      37437          1                  0 ALL ZERO
         3      37439          1                  0 ALL ZERO
         3      37441         40                  0 ALL ZERO
         3      37489          1                  0 ALL ZERO
         3      37493         34                  0 ALL ZERO
         3      37528          2                  0 ALL ZERO
         3      37531         11                  0 ALL ZERO

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
         3      37543         42                  0 ALL ZERO
         3      37609         56                  0 ALL ZERO
         3      37681        720                  0 ALL ZERO


201 rows selected.

进一步用dbv校验asm中3号数据文件,Corrupt Block依旧!
引用
[oracle@mcrac2 dbs]$ dbv USERID=sys/oracle blocksize=8192

file='+DATADG/dbrac/datafile/sysaux.257.715082291'

...
Page 31730 is marked corrupt
Corrupt block relative dba: 0x00c07bf2 (file 3, block 31730)
Bad header found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00c0931a
last change scn: 0x0000.00117a68 seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x7a680601
check value in block header: 0xcd36
computed block checksum: 0x0

Page 31731 is marked corrupt
Corrupt block relative dba: 0x00c07bf3 (file 3, block 31731)
Bad header found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00c0931b
last change scn: 0x0000.00117a79 seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x7a790601
check value in block header: 0x9425
computed block checksum: 0x0

校验1号数据文件,所幸的是没有产生任何Corrupt Block。
引用
[oracle@mcrac2 dbs]$ dbv USERID=sys/oracle blocksize=8192

file='+DATADG/dbrac/datafile/SYSTEM.256.715082291'

DBVERIFY: Release 10.2.0.1.0 - Production on Sat Apr 3 17:47:17 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = +DATADG/dbrac/datafile/SYSTEM.256.715082291


DBVERIFY - Verification complete

Total Pages Examined         : 61440
Total Pages Processed (Data) : 36661
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 6827
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 1774
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 16178
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 0 (0.0)


在做好全备的前提下,将两节点的所有实例关闭,重新划盘,重做ASM DISK GROUP.
双节点关闭所有实例,一节点删除asm磁盘

引用
[root@mcrac1 init.d]# oracleasm listdisks
VOL1
VOL2
VOL3

[root@mcrac1 init.d]# /etc/init.d/oracleasm deletedisk VOL1
Removing ASM disk "VOL1": [  OK  ]
[root@mcrac1 init.d]# /etc/init.d/oracleasm deletedisk VOL2
Removing ASM disk "VOL2": [  OK  ]
[root@mcrac1 init.d]# /etc/init.d/oracleasm deletedisk VOL3
Removing ASM disk "VOL3": [  OK  ]


删除磁盘之后二节点查询
引用
[root@mcrac2 init.d]# oracleasm listdisks
VOL1
VOL2
VOL3

[root@mcrac2 init.d]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Cleaning disk "VOL1"
Cleaning disk "VOL2"
Cleaning disk "VOL3"
Scanning system for ASM disks...
[root@mcrac2 init.d]# oracleasm listdisks


一节点创建磁盘
引用
[root@mcrac1 init.d]# oracleasm createdisk VOL1 /dev/sdc1
Writing disk header: done
Instantiating disk: done
[root@mcrac1 init.d]# oracleasm createdisk VOL2 /dev/sdd1
Writing disk header: done
Instantiating disk: done
[root@mcrac1 init.d]# oracleasm createdisk VOL3 /dev/sde1
Writing disk header: done
Instantiating disk: done
[root@mcrac1 init.d]# oracleasm listdisks
VOL1
VOL2
VOL3

二节点扫描磁盘
引用
[root@mcrac2 init.d]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "VOL1"
Instantiating disk "VOL2"
Instantiating disk "VOL3"
[root@mcrac2 init.d]# oracleasm listdisks
VOL1
VOL2
VOL3


一节点重新创建磁盘组
引用
[oracle@mcrac1 ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 10.2.0.1.0 - Production on Sat Apr 3 17:55:59 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount
ASM instance started

Total System Global Area   92274688 bytes
Fixed Size                  1217884 bytes
Variable Size              65890980 bytes
ASM Cache                  25165824 bytes
SQL> create diskgroup datavg external redundancy disk 'ORCL:VOL1','ORCL:VOL2','ORCL:VOL3';

Diskgroup created.
SQL> select name,PATH,STATE from v$asm_disk;

NAME
------------------------------
PATH
--------------------------------------------------------------------------------
STATE
--------
VOL1
ORCL:VOL1
NORMAL

VOL2
ORCL:VOL2
NORMAL

NAME
------------------------------
PATH
--------------------------------------------------------------------------------
STATE
--------

VOL3
ORCL:VOL3
NORMAL

SQL> select name,STATE from v$asm_diskgroup;

NAME                           STATE
------------------------------ -----------
DATAVG                         MOUNTED

后台alert日志显示
引用
NOTE: detached from domain 1
Sat Apr  3 18:01:07 2010
SUCCESS: diskgroup DATAVG was created
NOTE: requesting all-instance PST refresh for group=0
NOTE: cache registered group DATAVG number=1 incarn=0x501d8e37
Sat Apr  3 18:01:07 2010
NOTE: Hbeat: instance first (grp 1)
Sat Apr  3 18:01:12 2010
NOTE: start heartbeating (grp 1)
NOTE: cache opening disk 0 of grp 1: VOL1 label:VOL1
Sat Apr  3 18:01:12 2010
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 1: VOL2 label:VOL2
NOTE: cache opening disk 2 of grp 1: VOL3 label:VOL3
NOTE: cache mounting (first) group 1/0x501D8E37 (DATAVG)
* allocate domain 1, invalid = TRUE
kjbdomatt send to node 1
Sat Apr  3 18:01:12 2010
NOTE: attached to recovery domain 1
Sat Apr  3 18:01:12 2010
NOTE: cache recovered group 1 to fcn 0.0
Sat Apr  3 18:01:12 2010
NOTE: opening chunk 1 at fcn 0.0 ABA
NOTE: seq=2 blk=0
Sat Apr  3 18:01:12 2010
NOTE: cache mounting group 1/0x501D8E37 (DATAVG) succeeded
SUCCESS: diskgroup DATAVG was mounted
Sat Apr  3 18:01:13 2010
NOTE: recovering COD for group 1/0x501d8e37 (DATAVG)
SUCCESS: completed COD recovery for group 1/0x501d8e37 (DATAVG)


但是mount disk时报错
引用
SQL> startup
ASM instance started

Total System Global Area   92274688 bytes
Fixed Size                  1217884 bytes
Variable Size              65890980 bytes
ASM Cache                  25165824 bytes
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup
"DATADG"


后台日志报错
引用
SQL> ALTER DISKGROUP ALL MOUNT
Sat Apr  3 18:07:07 2010
NOTE: cache registered group DATADG number=1 incarn=0x17ce908a
Sat Apr  3 18:07:07 2010
Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
Sat Apr  3 18:07:07 2010
ERROR: no PST quorum in group 1: required 2, found 0
Sat Apr  3 18:07:07 2010
NOTE: cache dismounting group 1/0x17CE908A (DATADG)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATADG was not mounted

经查是参数文件中asm_diskgroups='DATADG',而我重建的disk group为DATAVG,一字之差,使我多走了很多弯路,包括再次重划磁盘,重启主机,修改asm_diskstring为'/dev/sdc1','/dev/sdd1','/dev/sde1',使用dd命令将磁盘分区清空等尝试方法,期间甚至发生了磁盘不可写的情况。
引用
Warning: invalid flag 0x0000 of partition table

将2节点asm实例建好之后,由于没有自动备份至asm磁盘的控制文件和参数文件已经被格式化掉,只能手工编辑参数文件和控制文件。
参数文件内容如下
引用
processes = 150
sessions = 170
__shared_pool_size = 134217728
__large_pool_size = 4194304
__java_pool_size  = 4194304
__streams_pool_size= 0
#spfile= '+DATAVG/dbrac/spfiledbrac.ora'
sga_target = 436207616
control_files = '+DATAVG/dbrac/controlfile/control01.ctl'
db_block_size = 8192
__db_cache_size = 289406976
compatible= 10.2.0.1.0
db_file_multiblock_read_count= 16
#cluster_database= TRUE
cluster_database_instances= 2
db_create_file_dest = '+DATAVG'
thread= 1
instance_number= 1
undo_management= AUTO
dbrac1.undo_tablespace = 'UNDOTBS1'
dbrac2.undo_tablespace = 'UNDOTBS2'
remote_login_passwordfile= 'EXCLUSIVE'
#db_domain =
#dispatchers= '(PROTOCOL=TCP)' '(SERVICE=dbracXDB)'
#remote_listener= LISTENERS_DBRAC
job_queue_processes = 10
background_dump_dest = '/oracle/app/admin/dbrac/bdump'
user_dump_dest= '/oracle/app/admin/dbrac/udump'
core_dump_dest = '/oracle/app/admin/dbrac/cdump'
audit_file_dest = '/oracle/app/admin/dbrac/adump'
db_name= 'dbrac'
open_cursors= 300
pga_aggregate_target =144703488

控制文件内容如下
引用
CREATE CONTROLFILE REUSE DATABASE "DBRAC" RESETLOGS  ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 1968
LOGFILE
  GROUP 1 '+DATAVG/dbrac/datafile/redo01.log'  SIZE 50M,
  GROUP 2 '+DATAVG/dbrac/datafile/redo02.log'  SIZE 50M,
  GROUP 3 '+DATAVG/dbrac/datafile/redo03.log'  SIZE 50M
-- STANDBY LOGFILE
DATAFILE
  '/tmp/system.256.715082291',
  '/tmp/sysaux.257.715082291',
  '/tmp/undotbs1.258.715082293',
  '/tmp/undotbs2.264.715082411',
  '/tmp/users.259.715082293'
CHARACTER SET ZHS16GBK;

将数据库实例启动至nomount。
引用
oracle@mcrac1 dbs]$ sqlplus "/as sysdba"

SQL*Plus: Release 10.2.0.1.0 - Production on Sun Apr 4 09:50:22 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/oracle/app/product/10.2.0/db_2/dbs/tmp.txt'
ORACLE instance started.

Total System Global Area  436207616 bytes
Fixed Size                  1219856 bytes
Variable Size             142607088 bytes
Database Buffers          289406976 bytes
Redo Buffers                2973696 bytes


在rman中恢复数据文件
引用
DECLARE
devtype varchar2(256);
done boolean;
BEGIN
devtype := dbms_backup_restore.DeviceAllocate (type => '',ident => 'testdatafile');
dbms_backup_restore.RestoreSetDatafile;
dbms_backup_restore.RestoreDatafileTo(dfnumber => 1,toname => '/tmp/system.256.715082291');
dbms_backup_restore.RestoreDatafileTo(dfnumber => 2,toname => '/tmp/undotbs1.258.715082293');
dbms_backup_restore.RestoreDatafileTo(dfnumber => 3,toname => '/tmp/sysaux.257.715082291');
dbms_backup_restore.RestoreDatafileTo(dfnumber => 4,toname => '/tmp/users.259.715082293');
dbms_backup_restore.RestoreDatafileTo(dfnumber => 5,toname => '/tmp/undotbs2.264.715082411');
dbms_backup_restore.RestoreBackupPiece(done => done,handle => '/ocfs2/backup/DBRAC__6_1_20100403.bkp', params => null);
dbms_backup_restore.DeviceDeallocate;
END;
/


接着创建控制文件,使用resetlogs打开数据库。
引用
SQL> alter database open resetlogs;

Database altered.

接下来,将数据文件转入asm实例中,再次将实例置为mount状态
引用
RMAN> RUN
2> {
3>   SET MAXCORRUPT FOR DATAFILE 3 TO 100000;
  BACKUP as copy database format  '+DATAVG';
}
4> 5>
executing command: SET MAX CORRUPT

Starting backup at 04-APR-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=153 devtype=DISK
channel ORA_DISK_1: starting datafile copy
input datafile fno=00001 name=/tmp/system.256.715082291
output filename=+DATAVG/dbrac/datafile/system.262.715427575 tag=TAG20100404T095253 recid=1

stamp=715427610
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:45
channel ORA_DISK_1: starting datafile copy
input datafile fno=00003 name=/tmp/sysaux.257.715082291
output filename=+DATAVG/dbrac/datafile/sysaux.263.715427619 tag=TAG20100404T095253 recid=2

stamp=715427635
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile fno=00002 name=/tmp/undotbs1.258.715082293
output filename=+DATAVG/dbrac/datafile/undotbs1.264.715427645 tag=TAG20100404T095253 recid=3

stamp=715427648
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile fno=00005 name=/tmp/undotbs2.264.715082411
output filename=+DATAVG/dbrac/datafile/undotbs2.265.715427651 tag=TAG20100404T095253 recid=4

stamp=715427653
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting datafile copy
copying current control file
output filename=+DATAVG/dbrac/controlfile/backup.266.715427655 tag=TAG20100404T095253 recid=5

stamp=715427656
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
input datafile fno=00004 name=/tmp/users.259.715082293
output filename=+DATAVG/dbrac/datafile/users.267.715427657 tag=TAG20100404T095253 recid=6

stamp=715427657
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 04-APR-10


RMAN> switch database to copy;

datafile 1 switched to datafile copy "+DATAVG/dbrac/datafile/system.262.715427575"
datafile 2 switched to datafile copy "+DATAVG/dbrac/datafile/undotbs1.264.715427645"
datafile 3 switched to datafile copy "+DATAVG/dbrac/datafile/sysaux.263.715427619"
datafile 4 switched to datafile copy "+DATAVG/dbrac/datafile/users.267.715427657"
datafile 5 switched to datafile copy "+DATAVG/dbrac/datafile/undotbs2.265.715427651"

打开的时候报corrupted
引用
RMAN> alter database open;

database opened
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 04/04/2010 09:55:36
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 59584)
ORA-01110: data file 1: '+DATAVG/dbrac/datafile/system.262.715427575'


后台日志显示:
引用
Sun Apr  4 09:55:35 2010
Hex dump of (file 1, block 59584) in trace file

/oracle/app/admin/dbrac/udump/dbrac1_ora_5410.trc
Corrupt block relative dba: 0x0040e8c0 (file 1, block 59584)
Bad header found during buffer read
Data in bad block:
type: 40 format: 2 rdba: 0x00c03fe0
last change scn: 0x0000.0005b923 seq: 0x2 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xb9232802
check value in block header: 0x4ae3
computed block checksum: 0x0
Reread of rdba: 0x0040e8c0 (file 1, block 59584) found same corrupted data


用dbv校验转asm之前的1号数据文件时,没有存在Corrupt Block
引用
[oracle@mcrac1 tmp]$ dbv file=system.256.715082291 blocksize=8192

DBVERIFY: Release 10.2.0.1.0 - Production on Sun Apr 4 10:59:39 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = system.256.715082291


DBVERIFY - Verification complete

Total Pages Examined         : 61440
Total Pages Processed (Data) : 36686
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 6827
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 1774
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 16153
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 1191473 (0.1191473)

而使用rman转换之后就有Corrupt Block。
引用
[oracle@mcrac1 tmp]$ dbv USERID=sys/oracle BLOCKSIZE=8192 file='+DATAVG/dbrac/datafile/system.262.715427575'
DBVERIFY - Verification complete

Total Pages Examined         : 61440
Total Pages Processed (Data) : 36620
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 6801
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 1774
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 16131
Total Pages Marked Corrupt   : 117
Total Pages Influx           : 3
Highest block SCN            : 0 (0.0)

从这里可以判定,虚拟机共享存储有问题,初步估计是划盘时没有预分配空间所致。
以上内容,仅作记录,留待以后研究!
ptw:发这篇博文时,系统提示我,这也要河蟹, 呵呵

你可能感兴趣的:(vmware,oracle,sql,linux,cache)