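Troubleshooting a shutdown that would not complete
Symptom: on a 10.2.0.4 database, shutdown normal was issued, but the database had still not stopped after a long wait, and new logins were no longer possible.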
Thu Aug 1 11:28:06 2013
Shutting down instance: further logons disabled
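Cause analysis:
The command used was shutdown normal. In this mode no new connections are allowed, and the database closes only after every existing connection has disconnected.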
Normal database shutdown proceeds with the following conditions:
- No new connections are allowed after the statement is issued.
- Before the database is shut down, the database waits for all currently connected users to disconnect from the database.
The database trace files show the same thing. This is the trace file of the process that issued the shutdown command:
oracle@wbdb1[/oracle/admin/zwdb/udump]#more zwdb1_ora_22772.trc
/oracle/admin/zwdb/udump/zwdb1_ora_22772.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /oracle/10.2/db
System name: HP-UX
Node name: wbdb1
Release: B.11.31
Version: U
Machine: ia64
Instance name: zwdb1
Redo thread mounted by this instance: 1
Oracle process number: 7573
Unix process pid: 22772, image: oracle@wbdb1 (TNS V1-V3)
*** ACTION NAME:() 2013-08-01 11:28:06.749
*** MODULE NAME:(sqlplus@wbdb1 (TNS V1-V3)) 2013-08-01 11:28:06.749
*** SERVICE NAME:(SYS$USERS) 2013-08-01 11:28:06.749
*** SESSION ID:(8541.31555) 2013-08-01 11:28:06.749
ksimdel: READY status 5
*** 2013-08-01 11:33:32.148
SHUTDOWN: waiting for logins to complete.
*** 2013-08-01 11:38:38.073
SHUTDOWN: waiting for logins to complete.
*** 2013-08-01 11:43:44.282
SHUTDOWN: waiting for logins to complete.
*** 2013-08-01 11:48:50.338
SHUTDOWN: waiting for logins to complete.
*** 2013-08-01 11:53:56.480
SHUTDOWN: waiting for logins to complete
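The trace above repeats the same wait message roughly every five minutes. A minimal sketch for counting those messages in a trace file follows. The helper name count_shutdown_waits is hypothetical; the message text is taken from the trace above.

```shell
#!/bin/sh
# count_shutdown_waits (hypothetical helper): print how many times a trace
# file reports that shutdown is still waiting for sessions to disconnect.
count_shutdown_waits() {
    # grep -c prints the number of matching lines, or 0 when there are none
    grep -c 'SHUTDOWN: waiting for logins to complete' "$1"
}

# Usage, with the trace file from this incident:
#   count_shutdown_waits /oracle/admin/zwdb/udump/zwdb1_ora_22772.trc
```

A count that keeps growing indicates that at least one session is still connected and the shutdown is making no progress.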
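Approach: new connections were failing and the business was affected, so the problem had to be resolved quickly. According to the documentation, a shutdown command that does not complete successfully within one hour times out and returns. By then, however, the database was already in an abnormal state, so after consulting the business department we recommended a restart. The steps were as follows: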
1. Ran shutdown immediate; the database still did not close.
2. Killed the user sessions with an OS command: ps -ef|grep "LOCAL=NO"|grep -v grep|awk '{print $2}'|xargs kill -9
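The pipeline in step 2 kills every process on the host whose command line contains LOCAL=NO. A more cautious sketch lists the candidate PIDs for review first and limits them to one instance. The helper name list_remote_pids is hypothetical, and the sketch assumes the dedicated server processes appear in ps -ef as oracle<SID> (LOCAL=NO).

```shell
#!/bin/sh
# list_remote_pids (hypothetical helper): read `ps -ef` output on stdin and
# print the PID (second column) of each server process for the given SID
# whose command line is "oracle<SID> (LOCAL=NO)".
list_remote_pids() {
    # Lines containing "awk" are skipped so the filter never lists itself.
    awk -v proc="oracle$1 (LOCAL=NO)" 'index($0, proc) && $0 !~ /awk/ { print $2 }'
}

# Usage: review the list first, then kill.
#   ps -ef | list_remote_pids zwdb1
#   ps -ef | list_remote_pids zwdb1 | xargs kill -9
```

Matching on the full process name keeps sessions of other instances on the same host out of the list.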
3. The shutdown still could not complete. The trace file under udump showed that process 14368 was blocking it, so that process was killed as root.
oracle@wbdb1[/oracle/admin/zwdb/udump]#more zwdb1_ora_14108.trc
/oracle/admin/zwdb/udump/zwdb1_ora_14108.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /oracle/10.2/db
System name: HP-UX
Node name: wbdb1
Release: B.11.31
Version: U
Machine: ia64
Instance name: zwdb1
Redo thread mounted by this instance: 1
Oracle process number: 0
Unix process pid: 14108, image: oracle@wbdb1
*** 2013-08-01 17:29:25.695
Instance termination failed to kill one or more processes
ksuitm_check: OS PID=14368 is still alive
*** 2013-08-01 17:29:25.695
Dumping diagnostic information for oracle@wbdb1 (TNS V1-V3):
OS pid = 14368
loadavg : 0.04 0.07 0.21
Swapinfo :
Avail = 373767.41Mb Used = 55700.04Mb
Swap free = 318067.38Mb Kernel rsvd = 45098.14Mb
Free Mem = 115792.06Mb
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME COMD
3001 Z dsg 14368 14300 1 178 20 e00000123c44e380 0 - Jun 14 ? 0:03 <defunct>
Attaching to program: /oracle/10.2/db/bin/oracle, process 14368
ttrace attach: No such process.
(gdb) (gdb) No stack.
(gdb)
*** 2013-08-01 17:29:27.107
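The blocking PID can be pulled out of such a trace file instead of being read by eye. The sketch below does this; the helper name surviving_pids is hypothetical, and the pattern is based only on the ksuitm_check line shown above.

```shell
#!/bin/sh
# surviving_pids (hypothetical helper): print each OS PID that the trace
# file reports as still alive after instance termination.
surviving_pids() {
    sed -n 's/^ksuitm_check: OS PID=\([0-9][0-9]*\) is still alive.*/\1/p' "$1"
}

# Usage: show state, owner and parent of each reported process.
#   for pid in $(surviving_pids zwdb1_ora_14108.trc); do ps -l -p "$pid"; done
```

In the trace above, the ps line for PID 14368 shows state Z and <defunct>, owner dsg and parent PID 14300. A process in that state has already exited and is waiting for its parent to reap it, so the parent process may be worth inspecting as well.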
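4. The database could now be shut down normally. Following the Metalink recommendation, the database was restarted and the system returned to normal: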
startup restrict
shutdown immediate
startup
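The restart sequence can also be scripted so that it runs the same way every time. This is a minimal sketch: the helper name restart_sql is hypothetical, and the usage line assumes OS authentication as a user allowed to connect as sysdba.

```shell
#!/bin/sh
# restart_sql (hypothetical helper): print the SQL*Plus commands of the
# restart sequence, followed by exit so the session ends afterwards.
restart_sql() {
    cat <<'EOF'
startup restrict
shutdown immediate
startup
exit
EOF
}

# Usage (assumes ORACLE_HOME and ORACLE_SID are set for the instance):
#   restart_sql | sqlplus -s "/ as sysdba"
```

Opening the instance in restricted mode first keeps ordinary users out, so the following shutdown immediate has no user sessions to wait for.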
References:
Database Jobs Do Not Run After a Failed 'Shutdown Immediate' (Doc ID 434690.1)