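Today one of our customers had an unexpected power outage, after which the database would not start. The alert log showed the following during startup: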
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.1.0.
System parameters with non-default values:
processes = 150
sga_target = 612368384
control_files = F:\ORADATA\ORCL\CONTROL01.CTL, F:\ORADATA\ORCL\CONTROL02.CTL, F:\ORADATA\ORCL\CONTROL03.CTL
db_block_size = 8192
compatible = 10.2.0.1.0
db_file_multiblock_read_count= 16
db_recovery_file_dest = d:\oracle\product\10.2.0/flash_recovery_area
db_recovery_file_dest_size= 2147483648
undo_management = AUTO
undo_tablespace = UNDOTBS1
remote_login_passwordfile= EXCLUSIVE
db_domain =
dispatchers = (PROTOCOL=TCP) (SERVICE=orclXDB)
job_queue_processes = 10
audit_file_dest = D:\ORACLE\PRODUCT\10.2.0\ADMIN\ORCL\ADUMP
background_dump_dest = D:\ORACLE\PRODUCT\10.2.0\ADMIN\ORCL\BDUMP
user_dump_dest = D:\ORACLE\PRODUCT\10.2.0\ADMIN\ORCL\UDUMP
core_dump_dest = D:\ORACLE\PRODUCT\10.2.0\ADMIN\ORCL\CDUMP
db_name = orcl
open_cursors = 300
pga_aggregate_target = 203423744
PMON started with pid=2, OS id=3476
PSP0 started with pid=4, OS id=1568
MMAN started with pid=6, OS id=1564
DBW0 started with pid=8, OS id=3460
DBW1 started with pid=10, OS id=3464
LGWR started with pid=12, OS id=1596
CKPT started with pid=14, OS id=4052
SMON started with pid=16, OS id=1572
RECO started with pid=18, OS id=3668
CJQ0 started with pid=20, OS id=4008
MMON started with pid=22, OS id=3992
MMNL started with pid=24, OS id=3984
Tue Dec 17 17:32:08 2013
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
Tue Dec 17 17:32:09 2013
ALTER DATABASE MOUNT
Tue Dec 17 17:32:14 2013
Setting recovery target incarnation to 2
Tue Dec 17 17:32:14 2013
Successful mount of redo thread 1, with mount id 1362246041
Tue Dec 17 17:32:14 2013
Database mounted in Exclusive Mode
Completed: ALTER DATABASE MOUNT
Tue Dec 17 17:32:14 2013
ALTER DATABASE OPEN
Tue Dec 17 17:32:15 2013
Beginning crash recovery of 1 threads
parallel recovery started with 15 processes
Tue Dec 17 17:32:15 2013
Started redo scan
Tue Dec 17 17:32:15 2013
Completed redo scan
65573 redo blocks read, 497 data blocks need recovery
Tue Dec 17 17:32:16 2013
Started redo application at
Thread 1: logseq 32195, block 2, scn 708440896
Tue Dec 17 17:32:16 2013
Recovery of Online Redo Log: Thread 1 Group 1 Seq 32195 Reading mem 0
Mem# 0 errs 0: F:\ORADATA\ORCL\REDO01.LOG
Tue Dec 17 17:32:16 2013
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p008_1112.trc:
ORA-00600: internal error code, arguments: [2037], [33735871], [3300893190], [6], [24], [0], [684246207], [100858044]
Tue Dec 17 17:32:16 2013
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p011_4056.trc:
ORA-00600: internal error code, arguments: [2037], [67559644], [3772555782], [6], [1], [0], [688447708], [100731143]
Tue Dec 17 17:32:16 2013
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p013_2884.trc:
ORA-00600: internal error code, arguments: [2037], [8391083], [162243074], [6], [9], [0], [685705643], [34694206]
Tue Dec 17 17:32:16 2013
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p005_416.trc:
ORA-00600: internal error code, arguments: [2037], [12645188], [4081361414], [6], [2], [0], [685503300], [100728832]
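The ORA-00600 entries in the log above can be tallied with a short script, which makes it easy to confirm that the errors shown all share the same first argument. This is a minimal Python sketch and was not part of the original troubleshooting. The names `summarize_ora600` and `first_argument_counts` are hypothetical, and `SAMPLE` is just two entries copied from the log above.

```python
import re
from collections import Counter

# Line that names the trace file written for the error that follows it.
TRACE_RE = re.compile(r"^Errors in file (?P<path>.+):\s*$")
# ORA-00600 line; the bracketed arguments are captured as one string.
ORA600_RE = re.compile(r"^ORA-00600: internal error code, arguments: (?P<args>.*)$")
# One bracketed argument, e.g. [2037].
ARG_RE = re.compile(r"\[([^\]]*)\]")

# Two entries copied from the alert log above (illustrative input only).
SAMPLE = r"""
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p008_1112.trc:
ORA-00600: internal error code, arguments: [2037], [33735871], [3300893190], [6], [24], [0], [684246207], [100858044]
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p011_4056.trc:
ORA-00600: internal error code, arguments: [2037], [67559644], [3772555782], [6], [1], [0], [688447708], [100731143]
"""


def summarize_ora600(log_text):
    """Return a list of (trace_file, [arguments]) for each ORA-00600 line."""
    entries = []
    trace_file = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = TRACE_RE.match(line)
        if match:
            # Remember the trace file; the error line normally follows it.
            trace_file = match.group("path")
            continue
        match = ORA600_RE.match(line)
        if match:
            entries.append((trace_file, ARG_RE.findall(match.group("args"))))
            trace_file = None
    return entries


def first_argument_counts(entries):
    """Count how often each first argument (e.g. '2037') occurs."""
    return Counter(args[0] for _, args in entries if args)


if __name__ == "__main__":
    for trace, args in summarize_ora600(SAMPLE):
        print(trace, args[0])
    print(dict(first_argument_counts(summarize_ora600(SAMPLE))))
```

To run it against a real alert log, read the file into a string and pass it to `summarize_ora600` in place of `SAMPLE`.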
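With so many errors, the one to focus on is ORA-00600 [2037]. There is a Metalink article on it; the symptoms it describes differ somewhat from ours, but the cause it gives is roughly as follows: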
The problem is caused by Bug 4899479 Undo/redo corruption if distributed transactions used.
Details:
Redo/undo corruption and associated dumps and/or ORA-600 errors/memory corruption can
occur from a transaction which uses "in memory undo" pool memory if the transaction also
includes distributed operations. e.g.: Insert using buffered inserts, distributed operation,
subsequent insert can lead to this problem if the first insert uses in-memory undo.
To resolve the corruption, you will need to do a point in time recovery from a good backup.
To prevent the bug from occurring, you may choose either of these options:
Recovery of Online Redo Log: Thread 1 Group 1 Seq 32195 Reading mem 0
Mem# 0 errs 0: F:\ORADATA\ORCL\REDO01.LOG
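Given the two redo recovery lines above, I judged that the redo log file itself was corrupted, so I followed the usual procedure for recovering from a damaged redo log and tried to start the database. The database opened successfully, so we were lucky. It seems that for the same underlying problem, Oracle's error messages can sometimes be misleading.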
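The reasoning above rests on noticing which online redo log was being applied when the first error appeared. A rough Python sketch of that check follows, assuming the alert log format shown earlier. The function name `redo_log_at_first_error` and the `SAMPLE` text are hypothetical. The result only tells you where to look and does not prove that the file is corrupted.

```python
import re

# "Recovery of Online Redo Log: Thread 1 Group 1 Seq 32195 ..."
RECOVERY_RE = re.compile(
    r"Recovery of Online Redo Log: Thread (\d+) Group (\d+) Seq (\d+)"
)
# "Mem# 0 errs 0: F:\ORADATA\ORCL\REDO01.LOG"
MEMBER_RE = re.compile(r"Mem# (\d+) errs (\d+): (.+)")

# Lines copied from the alert log above (illustrative input only).
SAMPLE = r"""
Recovery of Online Redo Log: Thread 1 Group 1 Seq 32195 Reading mem 0
Mem# 0 errs 0: F:\ORADATA\ORCL\REDO01.LOG
Tue Dec 17 17:32:16 2013
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p008_1112.trc:
ORA-00600: internal error code, arguments: [2037], [33735871], [3300893190], [6], [24], [0], [684246207], [100858044]
"""


def redo_log_at_first_error(log_text, error_code="ORA-00600"):
    """Return the redo log being applied when error_code first appears.

    The result is a dict with thread, group, sequence and member files,
    or None if no redo log was being applied before the error (or the
    error never appears).
    """
    current = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = RECOVERY_RE.search(line)
        if match:
            # A new redo log is now being applied; start a fresh record.
            current = {
                "thread": int(match.group(1)),
                "group": int(match.group(2)),
                "sequence": int(match.group(3)),
                "members": [],
            }
            continue
        match = MEMBER_RE.search(line)
        if match and current is not None:
            current["members"].append(match.group(3).strip())
            continue
        if line.startswith(error_code):
            return current
    return None


if __name__ == "__main__":
    print(redo_log_at_first_error(SAMPLE))
```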