1. System environment:
ORACLE:9IR2
OS:WINDOWS 2003 SERVER
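2. Problem description:
On-site staff complained that the report data was inaccurate. I traced this to an interruption in the data transfer coming back from Korea. After I contacted the people responsible in Korea, they sent me an error raised by the transfer program on their side: ORA-02049: timeout: distributed transaction waiting for lock.
3. Problem analysis:
I looked up the short description of the ORA-02049 error: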
ORA-02049:timeout:distributed transaction waiting for lock
cause:exceeded INIT.ORA distributed_lock_timeout seconds waiting for lock.
action:treat as a deadlock.
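My first guess was that some change to the system had caused a deadlock, and that Oracle had internally rolled back the changes made by the transfer program.
I checked the alert.log file but found no deadlock messages.
Next I looked at the locks currently held in the system: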
SELECT /*+ rule */
lpad(' ', decode(l.xidusn, 0, 3, 0)) || l.oracle_username User_name,
o.owner,
o.object_name,
o.object_type,
s.sid,
s.serial#
FROM v$locked_object l, dba_objects o, v$session s
WHERE l.object_id = o.object_id
AND l.session_id = s.sid
ORDER BY o.object_id, xidusn DESC;
OWNER  OBJECT_NAME          OBJECT_TYPE  SID  SERIAL#
------ -------------------- ------------ ---- --------
SEC    CONSUMABLESPEC       TABLE          50     3558
SEC    PROCESSGROUP         TABLE          41      796
SEC    PROCESSGROUPHISTORY  TABLE          16     2535
SEC    NCDEFECTHISTORY      TABLE          34    46603
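Comparing this with the reports, the four tables listed above are exactly the ones whose data transfer had stopped.
I looked up sessions 50, 41, 16 and 34. All four had a status of ACTIVE, yet their LOGON_TIME was a little after 11 a.m. the previous day, which is almost the same as the latest timestamp recorded in the four tables. Had these four sessions really been running for the better part of a day?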
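The session check described above could look roughly like the query below. This is a sketch and not the exact statement used at the time; the SIDs come from the lock output, and the choice of columns is my own assumption about what is useful here.

```sql
-- Session details for the four lock holders (SIDs taken from the lock output above).
-- LAST_CALL_ET is the number of seconds since the session last changed state:
-- for an ACTIVE session, how long it has been active.
SELECT s.sid,
       s.serial#,
       s.username,
       s.status,
       s.machine,
       s.program,
       TO_CHAR(s.logon_time, 'YYYY-MM-DD HH24:MI:SS') AS logon_time,
       s.last_call_et
  FROM v$session s
 WHERE s.sid IN (50, 41, 16, 34)
 ORDER BY s.logon_time;
```

A LAST_CALL_ET of many thousands of seconds on an ACTIVE session would be consistent with a call that has been stuck since the previous day.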
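I checked v$session_wait for abnormal wait events but found nothing suspicious.
I contacted the Korean side again and asked them to resend the data so that I could trace it. This time v$session_wait showed an enqueue wait event.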
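To see what kind of lock an enqueue wait is for, the P1 value in v$session_wait can be decoded. The query below is a sketch based on the decoding expression commonly given in Oracle's tuning documentation; it assumes the 9i event name 'enqueue' (later releases split this into separate 'enq: ...' events).

```sql
-- Decode the lock type and requested mode for sessions waiting on an enqueue.
-- The two high-order bytes of P1 hold the lock type (for example TM or TX),
-- the low-order bytes hold the requested mode; P2 and P3 correspond to ID1 and ID2.
SELECT w.sid,
       w.event,
       CHR(BITAND(w.p1, -16777216) / 16777215) ||
       CHR(BITAND(w.p1, 16711680) / 65535)      AS lock_type,
       BITAND(w.p1, 65535)                      AS requested_mode,
       w.p2                                     AS id1,
       w.p3                                     AS id2,
       w.seconds_in_wait
  FROM v$session_wait w
 WHERE w.event = 'enqueue';
```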
A closer look at the lock waits in the system:
SELECT DECODE(request,0,'Holder: ','Waiter: ')|| sid sess, id1, id2, lmode,
request, type
FROM V$LOCK
WHERE (id1, id2, type) IN (SELECT id1, id2, type FROM V$LOCK WHERE request>0)
ORDER BY id1, request;
At this point I could see the transfer program waiting on locks held by the four sessions above.
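The holder/waiter output only shows ID1 and ID2. For TM (table-level) locks, ID1 is the object ID, so it can be mapped back to a table name. The query below is a sketch under that assumption; it only covers TM entries (a TX lock encodes an undo segment and slot in ID1 instead), and the SEC owner filter is taken from the earlier output.

```sql
-- Map TM lock entries back to table names for the SEC schema.
-- CTIME is the time in seconds since the current lock mode was granted.
SELECT DECODE(l.request, 0, 'Holder: ', 'Waiter: ') || l.sid AS sess,
       o.owner,
       o.object_name,
       l.lmode,
       l.request,
       l.ctime
  FROM v$lock l, dba_objects o
 WHERE l.type = 'TM'
   AND l.id1 = o.object_id
   AND o.owner = 'SEC'
 ORDER BY o.object_name, l.request;
```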
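By now the cause was roughly clear. For some reason, the transfer run at around 11 a.m. the previous day had been interrupted unexpectedly, but the table locks it held were never released, so every later transfer ended up waiting on those locks. Once the wait exceeded the value of the distributed_lock_timeout parameter, ORA-02049 was raised. While I was at it, I checked how this parameter is set on the system: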
SQL> show parameter distribut
NAME TYPE VALUE
------------------------------------ ----------- ------------------
distributed_lock_timeout integer 60
Heh... 60 seconds.
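For completeness, the same value can be read from v$parameter, which also shows whether it can be changed on a running instance. As far as I know distributed_lock_timeout is a static parameter, so a change needs an spfile and a restart. Raising it would not have fixed the problem here, since the locks were never going to be released; the value of 300 below is purely hypothetical.

```sql
-- Show the current setting and whether ALTER SYSTEM can change it online.
-- ISSYS_MODIFIABLE = 'FALSE' means the parameter is static.
SELECT name, value, isdefault, issys_modifiable
  FROM v$parameter
 WHERE name = 'distributed_lock_timeout';

-- Hypothetical change, left commented out on purpose:
-- requires the instance to use an spfile and takes effect only after a restart.
-- ALTER SYSTEM SET distributed_lock_timeout = 300 SCOPE = SPFILE;
```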
4. Solution:
The simple and effective fix is of course to KILL these four sessions so that they release the table locks they hold.
SQL> alter system kill session '50,3558';
alter system kill session '50,3558'
ORA-00031: session marked for kill
SQL> alter system kill session '41,796';
alter system kill session '41,796'
ORA-00031: session marked for kill
SQL> alter system kill session '16,2535';
alter system kill session '16,2535'
ORA-00031: session marked for kill
SQL> alter system kill session '34,46603';
alter system kill session '34,46603'
ORA-00031: session marked for kill
The result was not "System altered."; the sessions were only marked for kill. I checked the table locks again and they had not been released.
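A session marked for kill may still be rolling back its transaction. One way to check is a sketch like the following, which assumes the four SIDs above and joins to v$transaction; if USED_UBLK and USED_UREC shrink between two runs, a rollback is in progress, and if they stay the same the session is simply stuck.

```sql
-- Status of the killed sessions and the undo still held by their transactions.
-- The outer join keeps sessions that no longer have an open transaction.
SELECT s.sid,
       s.serial#,
       s.status,
       t.used_ublk,
       t.used_urec
  FROM v$session s, v$transaction t
 WHERE s.taddr = t.addr(+)
   AND s.sid IN (50, 41, 16, 34);
```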
On the Windows platform, that leaves ORAKILL.
First, find the Windows thread IDs that correspond to sessions 50, 41, 16 and 34:
select s.sid, p.spid from v$session s, v$process p where s.sid in (50, 41, 16, 34) and s.paddr = p.addr;
SID SPID
---- -----
50 4496
41 1436
16 6140
34 4360
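With more than a handful of sessions, the command lines can be generated instead of typed by hand. The query below is only a sketch: it assumes that the instance name reported by v$instance is the same as the Oracle SID that orakill expects as its first argument, which should be verified before running the generated commands.

```sql
-- Build one "orakill <sid> <thread>" command line per session to be killed.
-- v$instance returns a single row on a single-instance database,
-- so the join without a condition on i does not multiply rows.
SELECT 'orakill ' || i.instance_name || ' ' || p.spid AS kill_cmd
  FROM v$session s, v$process p, v$instance i
 WHERE s.paddr = p.addr
   AND s.sid IN (50, 41, 16, 34);
```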
Kill them with the ORAKILL utility:
E:\>orakill samsung 4496
Kill of thread id 4496 in instance wiptest successfully signalled.
E:\>orakill samsung 1436
Kill of thread id 1436 in instance wiptest successfully signalled.
E:\>orakill samsung 6140
Kill of thread id 6140 in instance wiptest successfully signalled.
E:\>orakill samsung 4360
Kill of thread id 4360 in instance wiptest successfully signalled.
Checking the table locks again, no session holds a lock on these four tables any more. With that, the problem was solved.
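The final check can be narrowed to the four affected tables. This is a sketch using the owner and table names from the earlier output. Right after the kill it should return no rows, although once the transfer resumes it may briefly show the locks taken by the new, healthy sessions.

```sql
-- Any remaining locks on the four tables whose transfer had stalled.
SELECT o.object_name,
       l.session_id,
       l.oracle_username,
       l.locked_mode
  FROM v$locked_object l, dba_objects o
 WHERE l.object_id = o.object_id
   AND o.owner = 'SEC'
   AND o.object_name IN ('CONSUMABLESPEC',
                         'PROCESSGROUP',
                         'PROCESSGROUPHISTORY',
                         'NCDEFECTHISTORY');
```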
Reposted from:
http://space.itpub.net/12045182/viewspace-614267