Today, during a routine OGG (Oracle GoldenGate) health check, I found a deadlock and set out to investigate it.
Alert log:
< Mon Sep 21 15:42:58 CST 2015
< ORA-00060: Deadlock detected. More info in file /home/oracle/oracle/admin/pg51101/udump/pg51101_ora_13480.trc.
< Mon Sep 21 15:43:00 CST 2015
< ORA-00060: Deadlock detected. More info in file /home/oracle/oracle/admin/pg51101/udump/pg51101_ora_26564.trc.
< Mon Sep 21 15:48:38 CST 2015
< Thread 1 advanced to log sequence 492286 (LGWR switch)
< Current log# 6 seq# 492286 mem# 0:
Trace file:
*** 2015-09-21 15:43:00.424
*** ACTION NAME:(SQL - ) 2015-09-21 15:43:00.424
*** MODULE NAME:(PL/SQL Developer) 2015-09-21 15:43:00.424
*** SERVICE NAME:(SYS$USERS) 2015-09-21 15:43:00.424
*** SESSION ID:(110.64573) 2015-09-21 15:43:00.424
DEADLOCK DETECTED ( ORA-00060 )
[Transaction Deadlock]
The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
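This part tells us exactly when the deadlock occurred and that the error was raised by the session with SESSION ID:(110.64573). It also states that this is not an Oracle error but a problem caused by the application's SQL design, so it should be handed to the developers.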
--------------------------------------------------------------------------------------------------
Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-000b0006-00010d15        46     110     X             32     135           X
TX-00070023-0000954f        32     135     X             46     110           X
session 110: DID 0001-002E-00000059 session 135: DID 0001-0020-00000005
session 135: DID 0001-0020-00000005 session 110: DID 0001-002E-00000059
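In Resource Name, the TX prefix means this is a TX (transaction, row-level) deadlock. TM (table-level) deadlocks are also common and will be analyzed later when time permits. The X in the holds and waits columns means exclusive mode.
Session 110 holds resource TX-000b0006-00010d15 in exclusive mode and now requests TX-00070023-0000954f in exclusive mode, but that resource is already held exclusively by session 135. At the same time, session 135 requests TX-000b0006-00010d15, which session 110 holds.
Each session is waiting for a lock that the other will not release. This is why the trace describes the deadlock as an application design problem rather than an Oracle error.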
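The resource names in the graph can be decoded by hand. The session dump later in this trace prints the same lock as `usn<<16 | slot=70023, sequence=954f`, which suggests that the first hex field packs the undo segment number and the slot, and the second field is the sequence number. A minimal sketch of that decoding, assuming this layout applies to both resources:

```sql
-- Decode the two TX resource names from the deadlock graph.
-- Assumed layout: TX-<usn in the high 16 bits | slot in the low 16 bits>-<sequence>, all in hex.
select 'TX-000b0006-00010d15'                           as resource_name,
       trunc(to_number('000b0006', 'XXXXXXXX') / 65536) as usn,
       mod(to_number('000b0006', 'XXXXXXXX'), 65536)    as slot,
       to_number('00010d15', 'XXXXXXXX')                as sqn
  from dual
union all
select 'TX-00070023-0000954f',
       trunc(to_number('00070023', 'XXXXXXXX') / 65536),
       mod(to_number('00070023', 'XXXXXXXX'), 65536),
       to_number('0000954f', 'XXXXXXXX')
  from dual;
```

By the arithmetic, the first resource is undo segment 11, slot 6, sequence 68885, and the second is undo segment 7, slot 35, sequence 38223. While a transaction is still active, these values can be compared with XIDUSN, XIDSLOT and XIDSQN in v$transaction to find its owner.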
--------------------------------------------------------------------------------------------------
Rows waited on:
Session 135: obj - rowid = 00018894 - AAAYiUAAiAACmlPAAL
  (dictionary objn - 100500, file - 34, block - 682319, slot - 11)
Session 110: obj - rowid = 00018894 - AAAYiUAAiAACf2UAAQ
  (dictionary objn - 100500, file - 34, block - 654740, slot - 16)
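To see which table and rows these entries refer to, the object number and the rowids can be looked up in the database. A sketch, assuming the object turns out to be MPAC.D_METER (the table that both UPDATE statements in this trace touch) and that the rows have not moved since the trace was written:

```sql
-- 1) Which object is dictionary objn 100500?
select owner, object_name, object_type
  from dba_objects
 where object_id = 100500;

-- 2) Break a rowid from the trace into its parts with DBMS_ROWID;
--    the result should match the file/block/slot line printed in the trace.
select dbms_rowid.rowid_relative_fno(chartorowid('AAAYiUAAiAACmlPAAL')) as rel_file,
       dbms_rowid.rowid_block_number(chartorowid('AAAYiUAAiAACmlPAAL')) as block_no,
       dbms_rowid.rowid_row_number(chartorowid('AAAYiUAAiAACmlPAAL'))   as slot_no
  from dual;

-- 3) Fetch the two contended rows (the table name is an assumption, see above).
select rowid, meter_id, status_code
  from mpac.d_meter
 where rowid in (chartorowid('AAAYiUAAiAACmlPAAL'),
                 chartorowid('AAAYiUAAiAACf2UAAQ'));
```

The two rowids differ, so each session is waiting on a different row of the same table, which is the typical shape of a row-level TX deadlock.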
Information on the OTHER waiting sessions:
Session 135:
sid: 135 ser: 47063 audsid: 8202090 user: 59/MPAC
flags: (0x100041) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x8)
pid: 32 O/S info: user: oracle, term: UNKNOWN, ospid: 13480
image: oracle@SGMDS (TNS V1-V3)
O/S info: user: oracle, term: , ospid: 13476, machine: SGMDS
program: replicat@SGMDS (TNS V1-V3)
application name: OGG-REPMOM-OPEN_DATA_SOURCE, hash value=2774042825
Current SQL Statement:
UPDATE "MPAC"."D_METER" SET "STATUS_CODE" = :a1,"SYNC_TIME" = :a2,"SYNC_TYPE" = :a3,"SYNC_ORG_NO" = :a4 WHERE "METER_ID" = :b0
End of information on OTHER waiting sessions.
Current SQL statement for this session:
UPDATE D_METER D SET (D.WH_ID, D.STATUS_CODE) = (SELECT DM.WH_ID, DM.STATUS_CODE FROM D_METER@PRMOCS DM WHERE DM.METER_ID = D.METER_ID) WHERE D.METER_ID = :B1
The two UPDATE statements above are the ones involved in the deadlock.
----- PL/SQL Call Stack -----
object line object
handle number name
0x55bc5d3c8 6 anonymous block
===================================================
PROCESS STATE
-------------
Process global information:
process: 0x55710c5c0, call: 0x541370650, xact: 0x54a73a3d0, curses: 0x55f1518a8, usrses: 0x55f1518a8
----------------------------------------
SO: 0x55710c5c0, type: 2, owner: (nil), flag: INIT/-/-/0x00
(process) Oracle pid=46, calls cur/top: 0x541370650/0x541318368, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 110 0 4
last post received-location: kslpsr
last process to post me: 55513a0e0 1 6
last post sent: 0 0 24
last post sent-location: ksasnd
last process posted by me: 55513a0e0 1 6
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 0x555144050
O/S info: user: oracle, term: UNKNOWN, ospid: 26564
OSD pid info: Unix process pid: 26564, image: oracle@SGMDS
(FOB) flags=2 fib=0x5461e1968 incno=0 pending i/o cnt=0
fname=/ogg/mpac_d02.dbf
fno=34 lblksz=8192 fsiz=2560000
(FOB) flags=2 fib=0x5461dc7b8 incno=0 pending i/o cnt=0
fname=/ogg/mpac_d.dbf
fno=12 lblksz=8192 fsiz=4194302
(FOB) flags=2 fib=0x5461e20c8 incno=0 pending i/o cnt=0
fname=/ogg/undotabs2.dbf
fno=36 lblksz=8192 fsiz=4194302
(FOB) flags=2 fib=0x5461e5878 incno=0 pending i/o cnt=0
fname=/ogg/mpac_d03.dbf
fno=51 lblksz=8192 fsiz=2621440
(FOB) flags=2 fib=0x5461e5fd8 incno=0 pending i/o cnt=0
fname=/ogg/ts_index_9.dbf
fno=53 lblksz=8192 fsiz=262144
(FOB) flags=2 fib=0x5461d9ee0 incno=0 pending i/o cnt=0
fname=/home/oracle/oradata/pg51101/system01.dbf
fno=1 lblksz=8192 fsiz=67840
(FOB) flags=2 fib=0x5461e45f0 incno=0 pending i/o cnt=0
fname=/ogg/mpac03.dbf
fno=46 lblksz=8192 fsiz=2560000
(FOB) flags=2 fib=0x5461e15a0 incno=0 pending i/o cnt=0
fname=/ogg/mpac02.dbf
fno=33 lblksz=8192 fsiz=2560000
(FOB) flags=2 fib=0x5461db8e0 incno=0 pending i/o cnt=0
fname=/ogg/mpac.dbf
fno=8 lblksz=8192 fsiz=4194302
(FOB) flags=2 fib=0x5461e5c28 incno=0 pending i/o cnt=0
fname=/ogg/mpac05.dbf
fno=52 lblksz=8192 fsiz=2621440
(FOB) flags=2 fib=0x5461e54c8 incno=0 pending i/o cnt=0
fname=/ogg/mpac04.dbf
fno=50 lblksz=8192 fsiz=2621440
----------------------------------------
SO: 0x55f1518a8, type: 4, owner: 0x55710c5c0, flag: INIT/-/-/0x00
(session) sid: 110 trans: 0x54a73a3d0, creator: 0x55710c5c0, flag: (100041) USR/- BSY/-/-/-/-/-
DID: 0001-002E-00000059, short-term DID: 0000-0000-00000000
txn branch: 0x54cfe4d18
oct: 6, prv: 0, sql: 0x55abb1f30, psql: 0x55abb1f30, user: 59/MPAC
service name: SYS$USERS
O/S info: user: admin, term: LEN-NA19975982, ospid: 2640:5032, machine: WORKGROUP\LEN-NA19975982
program: plsqldev.exe
application name: PL/SQL Developer, hash value=1190136663
action name: SQL - , hash value=2127054360
last wait for 'enq: TX - row lock contention' wait_time=2.930371 sec, seconds since wait started=5
name|mode=54580006, usn<<16 | slot=70023, sequence=954f
blocking sess=0x0x55712d9e0 seq=2486
Dumping Session Wait History
for 'enq: TX - row lock contention' count=1 wait_time=2.930371 sec
name|mode=54580006, usn<<16 | slot=70023, sequence=954f
for 'SQL*Net message from dblink' count=1 wait_time=0.000151 sec
driver id=28444553, #bytes=1, =0
for 'SQL*Net message to dblink' count=1 wait_time=0.000001 sec
driver id=28444553, #bytes=1, =0
for 'SQL*Net message from dblink' count=1 wait_time=0.006292 sec
driver id=28444553, #bytes=1, =0
for 'SQL*Net message to dblink' count=1 wait_time=0.000001 sec
driver id=28444553, #bytes=1, =0
for 'db file sequential read' count=1 wait_time=0.000019 sec
file#=33, block#=21dbe8, blocks=1
for 'SQL*Net message from dblink' count=1 wait_time=0.000196 sec
driver id=28444553, #bytes=1, =0
for 'SQL*Net message to dblink' count=1 wait_time=0.000000 sec
driver id=28444553, #bytes=1, =0
for 'SQL*Net message from dblink' count=1 wait_time=0.000240 sec
driver id=28444553, #bytes=1, =0
for 'SQL*Net message to dblink' count=1 wait_time=0.000001 sec
driver id=28444553, #bytes=1, =0
Sampled Session History of session 110 serial 64573
---------------------------------------------------
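Causes of deadlocks:
1. Poorly designed business logic that deadlocks under high concurrency
2. Foreign key columns without an index, which can deadlock when the parent key is updated or a parent row is deleted
3. Inefficient SQL that holds locks longer and so makes deadlocks more likely
4. A single UPDATE statement deadlocking, which in most cases happens because an updated column has a bitmap index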
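For cause 2, the usual remedy is to index the foreign key column on the child table. A minimal sketch with hypothetical tables (dl_parent and dl_child are illustrative names, not objects from this system):

```sql
-- Hypothetical parent/child tables, used only for illustration.
create table dl_parent (id int primary key);
create table dl_child  (id int primary key,
                        parent_id int references dl_parent (id));

-- Without an index on dl_child.parent_id, deleting a dl_parent row or
-- updating dl_parent.id needs a table-level lock on dl_child.
create index dl_child_parent_idx on dl_child (parent_id);

-- List the foreign key columns of the child table so they can be
-- compared against its indexed columns.
select c.constraint_name, cc.column_name, cc.position
  from user_constraints c
  join user_cons_columns cc on cc.constraint_name = c.constraint_name
 where c.constraint_type = 'R'
   and c.table_name = 'DL_CHILD'
 order by c.constraint_name, cc.position;
```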
Deadlocks are a common database problem. A database rarely deadlocks for no reason; deadlocks are usually caused by the design of our own applications.
Session 1:
-- Create a test table and insert two rows
18:03:48 SCOTT>create table tab_dl (id int,name varchar2(30));
Table created.
18:05:50 SCOTT>insert into tab_dl values (1,'DeadLock 1');
1 row created.
18:05:56 SCOTT>insert into tab_dl values (2,'DeadLock 2');
1 row created.
18:06:04 SCOTT>commit;
Commit complete.
-- Update the two rows from different sessions
18:06:06 SCOTT>update tab_dl set name='DeadLock 3' where id=1;
1 row updated.
Session 2:
18:07:14 SCOTT>update tab_dl set name='DeadLock 4' where id=2;
1 row updated.
Session 1:
18:07:37 SCOTT>update tab_dl set name='DeadLock 5' where id=2; -- this session waits; when session 2 then updates id=1, Oracle detects the deadlock and rolls back this statement
update tab_dl set name='DeadLock 5' where id=2
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
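Session 2:
18:08:00 SCOTT>update tab_dl set name='DeadLock 6' where id=1; -- this session keeps waiting
-- Conclusion from this experiment: a deadlock occurs when two sessions each block the other's pending change at the same time.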
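One common way to avoid this pattern is to have every transaction touch the rows in the same order. A sketch against the same tab_dl table, assuming both sessions can be changed to update id=1 before id=2:

```sql
-- Session 1
update tab_dl set name='DeadLock 3' where id=1;
update tab_dl set name='DeadLock 5' where id=2;
commit;

-- Session 2 (started while session 1 is still open)
-- The first update simply waits until session 1 commits. Session 2 holds
-- no lock that session 1 needs, so no wait cycle can form.
update tab_dl set name='DeadLock 6' where id=1;
update tab_dl set name='DeadLock 4' where id=2;
commit;
```

With this ordering the second session is blocked for a while, but it is an ordinary lock wait rather than a deadlock, and it ends as soon as the first session commits or rolls back.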
When a deadlock occurs, how do you resolve it? The usual steps are:
Session 1:
1) Run the following SQL to see which tables are locked:
18:09:22 SCOTT>select b.owner,b.object_name,a.session_id,a.locked_mode
from v$locked_object a,dba_objects b
where b.object_id = a.object_id;
OWNER                          OBJECT_NAME          SESSION_ID LOCKED_MODE
------------------------------ -------------------- ---------- -----------
SCOTT                          TAB_DL                      144           3
SCOTT                          TAB_DL                       21           3
2) Find the sessions involved in the deadlock
18:09:24 SCOTT>select b.username,b.sid,b.serial#,logon_time
18:09:40 2 from v$locked_object a,v$session b
18:09:40 3 where a.session_id = b.sid order by b.logon_time;
USERNAME                              SID    SERIAL# LOGON_TIME
------------------------------ ---------- ---------- ------------
SCOTT                                  21         53 27-SEP-13
SCOTT                                 144        369 27-SEP-13
3) Find the blocked session
18:11:09 SCOTT>select * from dba_waiters;
WAITING_SESSION HOLDING_SESSION LOCK_TYPE                  MODE_HELD  MODE_REQUESTED  LOCK_ID1   LOCK_ID2
--------------- --------------- -------------------------- ---------- --------------- ---------- ----------
            144              21 Transaction                Exclusive  Exclusive       655372     1186
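On 10g and later, v$session also exposes the blocker directly, which avoids going through dba_waiters. A sketch, assuming the querying user can read v$session:

```sql
-- Sessions that are currently blocked, with the session blocking each one.
select sid, serial#, blocking_session, event, seconds_in_wait
  from v$session
 where blocking_session is not null
 order by seconds_in_wait desc;
```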
4) Commit or roll back in the blocking session to release the lock, or kill the blocking Oracle session:
ALTER SYSTEM KILL SESSION 'SID,SERIAL#'; -- 21,53 in the example above
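To avoid typing the SID and SERIAL# by hand, the KILL statements can be generated from the views used above. This is only a sketch: it prints the statements without running them, and the output should be reviewed before anything is killed.

```sql
-- Generate one KILL statement per session that is currently blocking another.
select 'alter system kill session ''' || s.sid || ',' || s.serial# || ''' immediate;' as kill_stmt
  from v$session s
 where s.sid in (select w.holding_session from dba_waiters w);
```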
Source: ITPUB blog, link: http://blog.itpub.net/30430420/viewspace-1806005/. If you repost this article, please credit the source; otherwise legal liability will be pursued.