1.闲着无聊在客户一套较为重要的系统中提取了一个4天的AWR,发现下面这条SQL产生最多的物理读,四天运行了159次。平均每次物理读1G左右。SQL运行时间三分钟左右。
SQL:
SELECT DECODE(B.DH, 'MAIN', :1, B.DH) DH, COUNT(DISTINCT(A.PROCESSID)) VALUE,
B.MC COMPANYNAME FROM FMIS3000.WF_ACTIVITIES A, (SELECT DH, MC FROM FMIS3000.XTDW WHERE 1=1 )
B WHERE A.ACTTYPE = 1 AND TRIM(A.DH) = TRIM(B.DH) GROUP BY B.DH, B.MC;
2.我查看了执行计划:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2953189885
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 191K| 18M| | 300K (1)| 01:10:12 |
| 1 | SORT GROUP BY | | 191K| 18M| 4006M| 300K (1)| 01:10:12 |
|* 2 | HASH JOIN | | 37M| 3516M| | 45650 (1)| 00:10:40 |
| 3 | TABLE ACCESS FULL| XTDW | 528 | 18480 | | 4 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| WF_ACTIVITIES | 7054K| 430M| | 45478 (1)| 00:10:37 |
---------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(TRIM("A"."DH")=TRIM("DH"))
4 - filter("A"."ACTTYPE"=1)
17 rows selected.
3.我注意到有两个全表扫描,于是我查找执行这个SQL时 两个表会返回的数据量
这是一个HASH连接 FMIS3000.XTDW 表为驱动表,可以看到他数据量不大,作为驱动表还是很合适的,走table access full,也比较合理。
SQL> select count(*) from FMIS3000.XTDW ;
COUNT(*)
----------
530
4.这是被驱动表,数据量比较大。我发现这儿应是一个瓶颈。
SQL> select count(*) from FMIS3000.WF_ACTIVITIES where ACTTYPE=1;
COUNT(*)
----------
12078901
5.这个查询返回1.2KW数据,整个表有1.7KW数据。走全表扫描其实并没有错,但是我在想如何解决这个瓶颈
6.接着我返回去看SQL,发现我不需要扫描整个表的数据:
经过查询这个表有26列,我可以在需要的列上面创建索引。
create index id_tex on FMIS3000.WF_ACTIVITIES(ACTTYPE,TRIM(DH),PROCESSID);
在这儿我没有考虑选择性(因为不需要,而且选择性肯定奇差),我推测这样,可以走INDEX FAST FULL SCAN,且不用回表。将会有效提升效率。
建立索引前:
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4n6ajf2fsm2x9, child number 0
-------------------------------------
SELECT DECODE(B.DH, 'MAIN','MAIN', B.DH) DH, COUNT(DISTINCT(A.PROCESSID)) VALUE, B.MC COMPANYNAME FROM
FMIS3000.WF_ACTIVITIES A, (SELECT DH, MC FROM FMIS3000.XTDW WHERE 1=1 ) B WHERE A.ACTTYPE = 1 AND
TRIM(A.DH) = TRIM(B.DH) GROUP BY B.DH, B.MC
Plan hash value: 2953189885
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes |
-----------------------------------------------------------------------------------------------------------------
| 1 | SORT GROUP BY | | 1 | 191K| 475 |00:02:51.39 | 285K| 423K| 138K|
|* 2 | HASH JOIN | | 1 | 37M| 12M|00:01:12.56 | 285K| 285K| 0 |
| 3 | TABLE ACCESS FULL| XTDW | 1 | 528 | 530 |00:00:00.02 | 10 | 8 | 0 |
|* 4 | TABLE ACCESS FULL| WF_ACTIVITIES | 1 | 7054K| 12M|00:00:48.38 | 285K| 285K| 0 |
-----------------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1082 recursive calls
71 db block gets
285137 consistent gets
311743 physical reads
0 redo size
2208 bytes sent via SQL*Net to client
514 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
30 rows processed
建立索引后:
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID g5pavm81dbpdt, child number 0
-------------------------------------
SELECT DECODE(B.DH, 'MAIN', 'MAIN', B.DH) DH, COUNT(DISTINCT(A.PROCESSID)) VALUE, B.MC COMPANYNAME
FROM FMIS3000.WF_ACTIVITIES A, (SELECT DH, MC FROM FMIS3000.XTDW WHERE 1=1 ) B WHERE A.ACTTYPE = 1
AND TRIM(A.DH) = TRIM(B.DH) GROUP BY B.DH, B.MC
Plan hash value: 513252325
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 191K| 18M| | 341K (2)| 01:08:17 |
| 1 | SORT GROUP BY | | 191K| 18M| 4062M| 341K (2)| 01:08:17 |
|* 2 | HASH JOIN | | 37M| 3516M| | 39892 (2)| 00:07:59 |
| 3 | TABLE ACCESS FULL | XTDW | 528 | 18480 | | 4 (0)| 00:00:01 |
|* 4 | INDEX FAST FULL SCAN| TE_DX | 7054K| 430M| | 39524 (2)| 00:07:55 |
----------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1074 recursive calls
51 db block gets
179736 consistent gets
124315 physical reads
0 redo size
2208 bytes sent via SQL*Net to client
514 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
30 rows processed
这个执行计划是真实的
可以很明显的看到,不管是逻辑读和物理读都少了一倍左右。(我在执行前都flush了share pool和buffer cache)
建立索引之后 CBO自动选择了INDEX FAST FULL SCAN ,而且确实没有回表。和预估的结果一样。
总的执行时间减少了一倍,在扫描WF_ACTIVITIES表阶段 时间从48秒减少到了8秒。性能提升了6倍。
但是这个SQL还有个瓶颈是SORT GROUP BY 也就是DISTINCT排序导致的。 我也试验了,如果不要distinct,执行时间还可以减少30多秒,这个和业务相关了,就没有太大办法,而且被驱动表返回结果1.2KW行,hash不可能不消耗时间,这个确实没办法。如果知道业务或许可以改写SQL。