昨天一哥们 QQ 发来消息,说有个 insert ... 语句插入了 30 分钟还没完成,请求帮忙优化, SQL 语句如下:
 
view plaincopy to clipboardprint?
1.  INSERT INTO Temp_CheckSExit101320166  
2.    SELECT KEYCOL,  
3.           CARDNETWORK,  
4.           CARDID,  
5.           EXITSTATION,  
6.           EXITDATE,  
7.           TOTALTOLL,  
8.           PAYMETHOD,  
9.           DEALSTATUS,  
10.          FREEKIND,  
11.          SPLITTOLLINFO,  
12.          ECARDTYPE,  
13.          OWNERNUM,  
14.          0,  
15.          NULL,  
16.          NULL,  
17.          NULL FROMCHECK_SEXIT201108  
18.    WHERE  PAYMETHOD  =  2   
19.      AND  CARDNETWORK  =  3201   
20.      AND SUBSTR(EXITNETWORK, 1, 2) = 32  
21.      AND  ECARDTYPE  =  22   
22.      AND EXITDATE  < = 20110829  
23.      AND  validflag  =  1   
24.      AND  ENDFLAG  =  0   

INSERT INTO Temp_CheckSExit101320166 SELECT KEYCOL, CARDNETWORK, CARDID, EXITSTATION, EXITDATE, TOTALTOLL, PAYMETHOD, DEALSTATUS, FREEKIND, SPLITTOLLINFO, ECARDTYPE, OWNERNUM, 0, NULL, NULL, NULL FROMCHECK_SEXIT201108 WHERE PAYMETHOD = 2 AND CARDNETWORK = 3201 AND SUBSTR(EXITNETWORK, 1, 2) = 32 AND ECARDTYPE = 22 AND EXITDATE <= 20110829 AND validflag = 1 AND ENDFLAG = 0

 
这个 SQL 很简单,就是从另外一个表抽取数据到另外一个表,执行计划很简单,是全表扫描,询问得知表 CHECK_SEXIT201108 只有 0.8G ,跑 30 分钟确实不应该
让他监控等待事件,绝大部分等待是 row cache lock
处理问题的方法:
1. 由于是 insert .... select 没有 sequence ,所以 sequence 因素排除了
2. 让他检查 alert.log 搜寻十分有 ROW CACHE LOCK 关键字,搜索完毕之后没发现有该关键字,所以 ORACLE BUG 基本 排除了
3. 查看 AWR TOP 5 Timed Events ,没有发现 ROW CACHE LOCK ,所以 row cache lock 竞争也排除了
Top 5 Timed Events

Event
Waits
Time(s)
Avg Wait(ms)
% Total Call Time
Wait Class
CPU time
 
9,610
 
74.7
 
db file sequential read
660,851
2,590
4
20.1
User I/O
db file scattered read
284,346
456
2
3.5
User I/O
gc cr multi block request
448,707
112
0
.9
Cluster
gc cr disk read
296,228
97
0
.8
Cluster

SQL ordered by Elapsed Time
·         Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
·         % Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100

Elapsed Time (s)
CPU Time (s)
Executions
Elap per Exec (s)
% Total DB Time
SQL Id
SQL Module
SQL Text
6,083
6,079
3,035
2.00
47.24
0hhmdwwgxbw0r
 
select obj#, type#, flags, ...
3,510
3,491
9,879
0.36
27.25
b52m6vduutr8j
delete from RecycleBin$ ...
3,445
3,377
1
3444.94
26.75
bx1r0wn35qt1p
PL/SQL Developer
begin -- Call the procedure ...
3,088
3,027
0
 
23.97
fxsxz22x0zha3
JDBC Connect Client
BEGIN DBMID.PRCSPLITALL ( :V1,...
3,074
3,019
0
23.87
71dgy4pygdq40
JDBC Connect Client
INSERT INTO Temp_CheckSExit101...
2,109
2,071
0
 
16.37
gyvwadw7ruh03
PL/SQL Developer
INSERT INTO Temp_CheckSExit211...
1,546
1,526
0
 
12.01
326g4yt56tq4z
toad.exe
INSERT INTO Temp_CheckSExit2...
1,247
1,222
2
623.44
9.68
7wsvh5zkkcxwp
TOAD 9.7.0.51
begin DBMID.PRCTRANSDATA('DBMI...
1,231
1,214
2
615.55
9.56
5gswufgngjg8r
DataTransServer.exe
INSERT INTO DBMID.TRANS_EXIT20...
848
848
2
424.16
6.59
c6fzw54pm9ra1
TOAD 9.7.0.51
select * from v$access where o...
620
602
3
206.83
4.82
16514g0b52g4g
TOAD 9.7.0.51
begin DBMID.PRCTRANSDATA('DBMI...
613
596
3
204.29
4.76
b771m0v7ndgz2
TOAD 9.7.0.51
INSERT INTO DBMID.TRANS_Entry2...
507
329
247
2.05
3.93
0pvtkmrrq8usg
 
select file#, block# from seg...
262
199
48
5.46
2.03
8nsfam0acu8tn
JDBC Thin Client
BEGIN dbmid.PRCCOSMRECORD(:1, ...
252
169
15
16.79
1.96
2n67bkp4cv12t
JDBC Thin Client
BEGIN dbmid.PrcGetCardMFlux(:1...
160
151
0
 
1.24
dbgy2zdz33dbv
TOAD 9.7.0.51
SELECT * FROM DBMTC.RAW_EXIT20...
156
156
4
39.07
1.21
cz6qdwvbvf7wm
toad.exe
select * from v$access where o...

 
通过查看 TOP SQL ,我发现 recyclebin 居然没有关闭,这实在是不应该, recyclebin 通常是需要关闭的,于是让该哥们查看表空间使用率
给了他一个脚本,结果跑半天没结果。。。遇到了 library cache lock 由于我不能登录他的数据库,所以建议该哥们关闭回收站,因为我怀疑 insert 的时候由于表空间几乎满了
导致 ORACLE 去清空回收站,从而引发 row cache lock 。果然,当那哥们关闭回收站之后, insert 报错 ORA-01653: unable to extend table.... by 128 in tablespace...
现在已经证明了是回收站引发的 row cache lock ,那个哥们添加了 2 个数据文件之后 insert 很快,最后让那个哥们 purge recyclebin
其实还有个方法可以快速定位是回收站引发的问题,这里就不说了