1:周末遭遇停电,新配置的内网测试数据库active dataguard 环境自动关闭,今早启动主库后发现报错如下
[oracle@db1 ~]$ sqlplus /nolog SQL*Plus: Release 11.2.0.1.0 Production on Mon Jun 17 17:00:57 2013 Copyright (c) 1982, 2009, Oracle. All rights reserved. SQL> conn /as sysdba Connected to an idle instance. SQL> startup ORACLE instance started. Total System Global Area 1.3362E+10 bytes Fixed Size 2217952 bytes Variable Size 6777997344 bytes Database Buffers 6576668672 bytes Redo Buffers 4960256 bytes Database mounted. ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [6], [301353], [301354], [], [], [], [], [], [], []
2:第一反应先recover下database,再尝试拉起数据库,问题依旧
SQL> recover database; Media recovery complete. SQL> alter database open; alter database open * ERROR at line 1: ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [6], [301353], [301354], [], [], [], [], [], [], []
3:于是看alert日志和相关的trace文件,综合判断可能控制文件出现问题
[oracle@db1 ~]$ tail -f alert_db.log Errors in file /u01/app/oracle/diag/rdbms/db1/db/trace/db_ora_32462.trc: ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [6], [301353], [301354], [], [], [], [], [], [], [] Errors in file /u01/app/oracle/diag/rdbms/db1/db/trace/db_ora_32462.trc: ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [6], [301353], [301354], [], [], [], [], [], [], [] ORA-600 signalled during: ALTER DATABASE OPEN... Trace dumping is performing id=[cdmp_20130617170117] Mon Jun 17 17:02:13 2013 Sweep [inc][24153]: completed Sweep [inc2][24153]: completed Mon Jun 17 17:02:38 2013 ALTER DATABASE RECOVER database Media Recovery Start started logmerger process Parallel Media Recovery started with 4 slaves Mon Jun 17 17:02:38 2013 Recovery of Online Redo Log: Thread 1 Group 3 Seq 6 Reading mem 0 Mem# 0: /u01/app/oracle/oradata/DB/onlinelog/o1_mf_3_8vpmjgoq_.log Mem# 1: /u01/app/oracle/flash_recovery_area/DB/onlinelog/o1_mf_3_8vpmkm9x_.log Media Recovery Complete (db) Completed: ALTER DATABASE RECOVER database Mon Jun 17 17:02:54 2013 alter database open Beginning crash recovery of 1 threads parallel recovery started with 3 processes Started redo scan Completed redo scan read 152 KB redo, 0 data blocks need recovery Errors in file /u01/app/oracle/diag/rdbms/db1/db/trace/db_ora_32462.trc (incident=24154): ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [6], [301353], [301354], [], [], [], [], [], [], [] Incident details in: /u01/app/oracle/diag/rdbms/db1/db/incident/incdir_24154/db_ora_32462_i24154.trc Mon Jun 17 17:02:55 2013 Trace dumping is performing id=[cdmp_20130617170255] Aborting crash recovery due to error 600 Errors in file /u01/app/oracle/diag/rdbms/db1/db/trace/db_ora_32462.trc: ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [6], [301353], [301354], [], [], [], [], [], [], [] Errors in file /u01/app/oracle/diag/rdbms/db1/db/trace/db_ora_32462.trc: ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [6], [301353], [301354], [], [], [], [], [], [], [] ORA-600 signalled during: alter database open... Mon Jun 17 17:03:13 2013 Sweep [inc][24154]: completed Sweep [inc2][24154]: completed
4:于是生成控制文件trace控制脚本对控制文件执行恢复
SQL> select open_mode from v$database; OPEN_MODE -------------------- MOUNTED SQL> alter database backup controlfile to trace as '/tmp/1.ctl'; Database altered. [oracle@db1 ~]$ cat /tmp/1.ctl CREATE CONTROLFILE REUSE DATABASE "DB" NORESETLOGS FORCE LOGGING ARCHIVELOG MAXLOGFILES 16 MAXLOGMEMBERS 3 MAXDATAFILES 100 MAXINSTANCES 8 MAXLOGHISTORY 292 LOGFILE GROUP 1 ( '/u01/app/oracle/oradata/DB/onlinelog/o1_mf_1_8vpmdkl9_.log', '/u01/app/oracle/flash_recovery_area/DB/onlinelog/o1_mf_1_8vpmdryt_.log' ) SIZE 512M BLOCKSIZE 512, GROUP 2 ( '/u01/app/oracle/oradata/DB/onlinelog/o1_mf_2_8vpmfqmw_.log', '/u01/app/oracle/flash_recovery_area/DB/onlinelog/o1_mf_2_8vpmhk2z_.log' ) SIZE 512M BLOCKSIZE 512, GROUP 3 ( '/u01/app/oracle/oradata/DB/onlinelog/o1_mf_3_8vpmjgoq_.log', '/u01/app/oracle/flash_recovery_area/DB/onlinelog/o1_mf_3_8vpmkm9x_.log' ) SIZE 512M BLOCKSIZE 512 --STANDBY LOGFILE -- GROUP 4 ( -- '/u01/app/oracle/oradata/DB1/onlinelog/o1_mf_4_8vpq6nvy_.log', -- '/u01/app/oracle/flash_recovery_area/DB1/onlinelog/o1_mf_4_8vpq7wk8_.log' -- ) SIZE 512M BLOCKSIZE 512, -- GROUP 5 ( -- '/u01/app/oracle/oradata/DB1/onlinelog/o1_mf_5_8vpqbh6s_.log', -- '/u01/app/oracle/flash_recovery_area/DB1/onlinelog/o1_mf_5_8vpqcmbj_.log' -- ) SIZE 512M BLOCKSIZE 512, -- GROUP 6 ( -- '/u01/app/oracle/oradata/DB1/onlinelog/o1_mf_6_8vpqf3rz_.log', -- '/u01/app/oracle/flash_recovery_area/DB1/onlinelog/o1_mf_6_8vpqfv5w_.log' -- ) SIZE 512M BLOCKSIZE 512, -- GROUP 7 ( -- '/u01/app/oracle/oradata/DB1/onlinelog/o1_mf_7_8vpqgw0j_.log', -- '/u01/app/oracle/flash_recovery_area/DB1/onlinelog/o1_mf_7_8vpqhcql_.log' -- ) SIZE 512M BLOCKSIZE 512 DATAFILE '/u01/app/oracle/oradata/DB/datafile/o1_mf_system_8vpm8hf3_.dbf', '/u01/app/oracle/oradata/DB/datafile/o1_mf_sysaux_8vpm8hjq_.dbf', '/u01/app/oracle/oradata/DB/datafile/o1_mf_undotbs1_8vpm8hl4_.dbf', '/u01/app/oracle/oradata/DB/datafile/o1_mf_users_8vpm8hn1_.dbf' CHARACTER SET ZHS16GBK; SQL> shutdown immediate ORA-01109: database not open Database dismounted. ORACLE instance shut down. SQL> startup nomount ORACLE instance started. Total System Global Area 1.3362E+10 bytes Fixed Size 2217952 bytes Variable Size 6777997344 bytes Database Buffers 6576668672 bytes Redo Buffers 4960256 bytes SQL> @/tmp/1.ctl; Control file created.
5:重建控制文件后,对数据库进行recover,成功打开数据库
SQL> alter database open; alter database open * ERROR at line 1: ORA-01113: file 1 needs media recovery ORA-01110: data file 1: '/u01/app/oracle/oradata/DB/datafile/o1_mf_system_8vpm8hf3_.dbf' SQL> recover database; Media recovery complete. SQL> alter database open; Database altered. SQL> select open_mode,database_role from v$database; OPEN_MODE DATABASE_ROLE -------------------- ---------------- READ WRITE PRIMARY
6:通过观察alert日志,添加临时表空间数据文件
[oracle@db1 ~]$ tail -f alert_db.log Errors in file /u01/app/oracle/diag/rdbms/db1/db/trace/db_m001_381.trc: ORA-25153: Temporary Tablespace is Empty Mon Jun 17 17:17:25 2013 Errors in file /u01/app/oracle/diag/rdbms/db1/db/trace/db_j007_400.trc: ORA-25153: Temporary Tablespace is Empty Errors in file /u01/app/oracle/diag/rdbms/db1/db/trace/db_j007_400.trc: ORA-12012: error on auto execute of job 12696 ORA-25153: Temporary Tablespace is Empty ORA-06512: at "DBSNMP.BSLN_INTERNAL", line 2073 ORA-06512: at line 1 Mon Jun 17 17:18:20 2013 alter tablespace temp add tempfile Completed: alter tablespace temp add tempfile SQL> alter tablespace temp add tempfile; Tablespace altered.
7:观察physical database,发现出现如下问题
[root@db2 ~]# su - oracle [oracle@db2 ~]$ sqlplus /nolog SQL*Plus: Release 11.2.0.1.0 Production on Mon Jun 17 17:18:50 2013 Copyright (c) 1982, 2009, Oracle. All rights reserved. SQL> conn /as sysdba Connected to an idle instance. SQL> startup ORACLE instance started. Total System Global Area 1.3362E+10 bytes Fixed Size 2217952 bytes Variable Size 6777997344 bytes Database Buffers 6576668672 bytes Redo Buffers 4960256 bytes Database mounted. ORA-10458: standby database requires recovery ORA-01196: file 1 is inconsistent due to a failed media recovery session ORA-01110: data file 1: '/u01/app/oracle/oradata/DB2/datafile/o1_mf_system_08oc7c38_.dbf' SQL> alter database open read only; alter database open read only * ERROR at line 1: ORA-10458: standby database requires recovery ORA-01196: file 1 is inconsistent due to a failed media recovery session ORA-01110: data file 1: '/u01/app/oracle/oradata/DB2/datafile/o1_mf_system_08oc7c38_.dbf'
8:于是关闭physical standby,在主库上启动system表空间的热备份,拷贝相应的文件至从库指定位置,重新同步正常
SQL> shutdown immediate ORA-01109: database not open Database dismounted. ORACLE instance shut down. SQL> alter tablespace system begin backup; Tablespace altered. [oracle@db1 datafile]$ pwd /u01/app/oracle/oradata/DB/datafile [oracle@db1 datafile]$ scp o1_mf_system_8vpm8hf3_.dbf root@db2:/tmp SQL> alter tablespace system end backup; Tablespace altered.