版本10.2.0.4 RAC 环境
昨天一哥们QQ发来消息,说有个insert ...语句插入了30分钟还没完成,请求帮忙优化,SQL语句如下:
INSERT INTO Temp_CheckSExit101320166 SELECT KEYCOL, CARDNETWORK, CARDID, EXITSTATION, EXITDATE, TOTALTOLL, PAYMETHOD, DEALSTATUS, FREEKIND, SPLITTOLLINFO, ECARDTYPE, OWNERNUM, 0, NULL, NULL, NULL FROMCHECK_SEXIT201108 WHERE PAYMETHOD = 2 AND CARDNETWORK = 3201 AND SUBSTR(EXITNETWORK, 1, 2) = 32 AND ECARDTYPE = 22 AND EXITDATE <= 20110829 AND validflag = 1 AND ENDFLAG = 0
这个SQL很简单,就是从另外一个表抽取数据到另外一个表,执行计划很简单,是全表扫描,询问得知表CHECK_SEXIT201108 只有0.8G,跑30分钟确实不应该
让他监控等待事件,绝大部分等待是row cache lock
处理问题的方法:
1.由于是insert .... select 没有 sequence,所以sequence因素排除了
2.让他检查 alert.log 搜寻十分有 ROW CACHE LOCK关键字,搜索完毕之后没发现有该关键字,所以ORACLE BUG 也基本排除了
3.查看AWR TOP 5 Timed Events,没有发现ROW CACHE LOCK,所以 row cache lock 竞争也排除了
Top 5 Timed Events
Event | Waits | Time(s) | Avg Wait(ms) | % Total Call Time | Wait Class |
---|---|---|---|---|---|
CPU time | 9,610 | 74.7 | |||
db file sequential read | 660,851 | 2,590 | 4 | 20.1 | User I/O |
db file scattered read | 284,346 | 456 | 2 | 3.5 | User I/O |
gc cr multi block request | 448,707 | 112 | 0 | .9 | Cluster |
gc cr disk read | 296,228 | 97 | 0 | .8 | Cluster |
Elapsed Time (s) | CPU Time (s) | Executions | Elap per Exec (s) | % Total DB Time | SQL Id | SQL Module | SQL Text |
---|---|---|---|---|---|---|---|
6,083 | 6,079 | 3,035 | 2.00 | 47.24 | 0hhmdwwgxbw0r | select obj#, type#, flags, ... | |
3,510 | 3,491 | 9,879 | 0.36 | 27.25 | b52m6vduutr8j | delete from RecycleBin$ ... | |
3,445 | 3,377 | 1 | 3444.94 | 26.75 | bx1r0wn35qt1p | PL/SQL Developer | begin -- Call the procedure ... |
3,088 | 3,027 | 0 | 23.97 | fxsxz22x0zha3 | JDBC Connect Client | BEGIN DBMID.PRCSPLITALL ( :V1,... | |
3,074 | 3,019 | 0 | 23.87 | 71dgy4pygdq40 | JDBC Connect Client | INSERT INTO Temp_CheckSExit101... | |
2,109 | 2,071 | 0 | 16.37 | gyvwadw7ruh03 | PL/SQL Developer | INSERT INTO Temp_CheckSExit211... | |
1,546 | 1,526 | 0 | 12.01 | 326g4yt56tq4z | toad.exe | INSERT INTO Temp_CheckSExit2... | |
1,247 | 1,222 | 2 | 623.44 | 9.68 | 7wsvh5zkkcxwp | TOAD 9.7.0.51 | begin DBMID.PRCTRANSDATA('DBMI... |
1,231 | 1,214 | 2 | 615.55 | 9.56 | 5gswufgngjg8r | DataTransServer.exe | INSERT INTO DBMID.TRANS_EXIT20... |
848 | 848 | 2 | 424.16 | 6.59 | c6fzw54pm9ra1 | TOAD 9.7.0.51 | select * from v$access where o... |
620 | 602 | 3 | 206.83 | 4.82 | 16514g0b52g4g | TOAD 9.7.0.51 | begin DBMID.PRCTRANSDATA('DBMI... |
613 | 596 | 3 | 204.29 | 4.76 | b771m0v7ndgz2 | TOAD 9.7.0.51 | INSERT INTO DBMID.TRANS_Entry2... |
507 | 329 | 247 | 2.05 | 3.93 | 0pvtkmrrq8usg | select file#, block# from seg... | |
262 | 199 | 48 | 5.46 | 2.03 | 8nsfam0acu8tn | JDBC Thin Client | BEGIN dbmid.PRCCOSMRECORD(:1, ... |
252 | 169 | 15 | 16.79 | 1.96 | 2n67bkp4cv12t | JDBC Thin Client | BEGIN dbmid.PrcGetCardMFlux(:1... |
160 | 151 | 0 | 1.24 | dbgy2zdz33dbv | TOAD 9.7.0.51 | SELECT * FROM DBMTC.RAW_EXIT20... | |
156 | 156 | 4 | 39.07 | 1.21 | cz6qdwvbvf7wm | toad.exe | select * from v$access where o... |
通过查看TOP SQL,我发现recyclebin居然没有关闭,这实在是不应该,recyclebin通常是需要关闭的,于是让该哥们查看表空间使用率
给了他一个脚本,结果跑半天没结果。。。遇到了library cache lock 由于我不能登录他的数据库,所以建议该哥们关闭回收站,因为我怀疑insert的时候由于表空间几乎满了
导致ORACLE去清空回收站,从而引发row cache lock。果然,当那哥们关闭回收站之后,insert 报错 ORA-01653: unable to extend table.... by 128 in tablespace...
现在已经证明了是回收站引发的row cache lock,那个哥们添加了2个数据文件之后insert 很快,最后让那个哥们purge recyclebin
其实还有个方法可以快速定位是回收站引发的问题,这里就不说了