一个超超超烂SQL的优化

最近遇到一个超超超烂的SQL,由于SQL涉及某某税务局的数据,表名字是经过处理的。

原始SQL如下:

select * from (
 select t.zxid,t.gh,t.xm,t.bm,t.fzjgdm,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH) slzl,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='0') 无评价,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='1') 满意,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='2') 较满意,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='3') 一般,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='4') 较不满意,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='5') 不满意
  from xxx t
  WHERE t.yxbz='Y' 
 )where slzl<>0 ; 

不用看执行计划就知道,表test_v会被扫描多次,所以这个SQL必须改写。如何改写这个SQL呢?请看下面分析:

首先,这个SQL语句是拼出来的,我们可以把复杂的问题简单化,所以我先看这个SQL:

select t.zxid,t.gh,t.xm,t.bm,t.fzjgdm,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH) slzl
  from CC_ZXJBXX t
  WHERE t.yxbz='Y';  

执行计划如下:

SQL> select t.zxid,t.gh,t.xm,t.bm,t.fzjgdm,
  2         (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH) slzl
  3    from CC_ZXJBXX t
  4    WHERE t.yxbz='Y';

已选择1956行。

已用时间:  00: 00: 00.37

执行计划
----------------------------------------------------------
Plan hash value: 4286326665

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |  1956 |   105K|    11   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |           |     1 |    22 |            |          |
|*  2 |   TABLE ACCESS FULL| TEST_V    |     4 |    88 |     5   (0)| 00:00:01 |
|*  3 |  TABLE ACCESS FULL | CC_ZXJBXX |  1956 |   105K|    11   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."ZCRYZH"=TO_CHAR(:B1))
   3 - filter("T"."YXBZ"='Y')

上面的SQL语句会返回1956行,根据执行计划可以看到它进行的是filter操作,讲上面的SQL改写,初步改写如下:

SQL> select  t.zxid,t.gh,t.xm,t.bm,t.fzjgdm,count(*) from
  2  CC_ZXJBXX t,test_v a where to_char(t.zxid) = a.ZCRYZH
  3  and t.yxbz='Y' group by  t.zxid,t.gh,t.xm,t.bm,t.fzjgdm;

已选择20行。

已用时间:  00: 00: 00.03

执行计划
----------------------------------------------------------
Plan hash value: 2833546679

---------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           | 32157 |  2418K|    22  (28)| 00:00:01 |
|   1 |  HASH GROUP BY      |           | 32157 |  2418K|    22  (28)| 00:00:01 |
|*  2 |   HASH JOIN         |           | 32157 |  2418K|    17   (6)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| TEST_V    |   411 |  9042 |     5   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| CC_ZXJBXX |  1956 |   105K|    11   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."ZCRYZH"=TO_CHAR("T"."ZXID"))
   4 - filter("T"."YXBZ"='Y')

 

改写之后,SQL只返回了20条数据,但是原始SQL要返回1956条数据,问题出在哪里呢? 问题在于,第一个SQL是filter,它是过滤,并且是count过滤

而第二个SQL是等值jion,第一个SQL里面的count 不管你过滤成功与否,都会返回数据,过滤失败,count=0 即可,所以对第二个SQL应该用 left join 于是SQL 更改如下:

SQL> select  t.zxid,t.gh,t.xm,t.bm,t.fzjgdm,count(*) from
  2  CC_ZXJBXX t,test_v a where to_char(t.zxid) = a.ZCRYZH(+)
  3  and t.yxbz='Y' group by  t.zxid,t.gh,t.xm,t.bm,t.fzjgdm;

已选择1956行。

已用时间:  00: 00: 00.12

执行计划
----------------------------------------------------------
Plan hash value: 3281235561

------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           | 32157 |  2418K|    22  (28)| 00:00:01 |
|   1 |  HASH GROUP BY         |           | 32157 |  2418K|    22  (28)| 00:00:01 |
|*  2 |   HASH JOIN RIGHT OUTER|           | 32157 |  2418K|    17   (6)| 00:00:01 |
|   3 |    TABLE ACCESS FULL   | TEST_V    |   411 |  9042 |     5   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL   | CC_ZXJBXX |  1956 |   105K|    11   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."ZCRYZH"(+)=TO_CHAR("T"."ZXID"))
   4 - filter("T"."YXBZ"='Y')

这样改写之后,返回记录条数是对了,但是SQL逻辑有点点改变,第一个SQL是过滤的,count()如果不匹配会返回0,但是第二个SQL是 join的,如果不匹配,会返回1
一个超超超烂SQL的优化_第1张图片 


一个超超超烂SQL的优化_第2张图片

所以SQL需要改写如下:

select  t.zxid,t.gh,t.xm,t.bm,t.fzjgdm,decode(count(*),1,0,count(*)) from 
CC_ZXJBXX t,test_v a where to_char(t.zxid) = a.ZCRYZH(+)
and t.yxbz='Y' group by  t.zxid,t.gh,t.xm,t.bm,t.fzjgdm order by 1;

改写完了一个,后面的就简单了,所以整体的SQL可以改写如下:

select t.zxid,
       t.gh,
       t.xm,
       t.bm,
       t.fzjgdm,
       decode(count(*), 1, 0, count(*)),
       SUM(DECODE(A.MYD, '0', 1, 0)) 无评价,
       SUM(DECODE(A.MYD, '1', 1, 0)) 满意,
       SUM(DECODE(A.MYD, '2', 1, 0)) 较满意,
       SUM(DECODE(A.MYD, '3', 1, 0)) 一般,
       SUM(DECODE(A.MYD, '4', 1, 0)) 较不满意,
       SUM(DECODE(A.MYD, '5', 1, 0)) 不满意
  from CC_ZXJBXX t, test_v a
 where to_char(t.zxid) = a.ZCRYZH(+)
   and t.yxbz = 'Y'
 group by t.zxid, t.gh, t.xm, t.bm, t.fzjgdm ;

上面的SQL等价于

select t.zxid,t.gh,t.xm,t.bm,t.fzjgdm,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH) slzl,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='0') 无评价,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='1') 满意,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='2') 较满意,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='3') 一般,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='4') 较不满意,
       (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='5') 不满意
  from xxx t
  WHERE t.yxbz='Y';

由于原始SQL要去掉 slzl <>0 所以再包装一下

select * from 
(select t.zxid,
       t.gh,
       t.xm,
       t.bm,
       t.fzjgdm,
       decode(count(*), 1, 0, count(*))slzl,
       SUM(DECODE(A.MYD, '0', 1, 0)) 无评价,
       SUM(DECODE(A.MYD, '1', 1, 0)) 满意,
       SUM(DECODE(A.MYD, '2', 1, 0)) 较满意,
       SUM(DECODE(A.MYD, '3', 1, 0)) 一般,
       SUM(DECODE(A.MYD, '4', 1, 0)) 较不满意,
       SUM(DECODE(A.MYD, '5', 1, 0)) 不满意
  from CC_ZXJBXX t, test_v a
 where to_char(t.zxid) = a.ZCRYZH(+)
   and t.yxbz = 'Y'
 group by t.zxid, t.gh, t.xm, t.bm, t.fzjgdm
) where slzl<>0;


原始SQL的执行计划 以及 更改后的执行计划对比:

SQL> select * from (
  2   select t.zxid,t.gh,t.xm,t.bm,t.fzjgdm,
  3         (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH) slzl,
  4         (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='0') 无评价,
  5         (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='1') 满意,
  6         (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='2') 较满意,
  7         (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='3') 一般,
  8         (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='4') 较不满意,
  9         (select count(a.session_id) from test_v a where to_char(t.zxid) = a.ZCRYZH and a.myd='5') 不满意
 10    from xxx t
 11    WHERE t.yxbz='Y'
 12   ) where slzl<>0;

已选择20行。

已用时间:  00: 00: 00.28

执行计划
----------------------------------------------------------
Plan hash value: 2669170809

------------------------------------------------------------------------------
| Id  | Operation           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |        |  1956 | 76284 |    17   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |        |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL | TEST_V |    16 |    64 |     5   (0)| 00:00:01 |
|   3 |  SORT AGGREGATE     |        |     1 |     6 |            |          |
|*  4 |   TABLE ACCESS FULL | TEST_V |     9 |    54 |     5   (0)| 00:00:01 |
|   5 |  SORT AGGREGATE     |        |     1 |     6 |            |          |
|*  6 |   TABLE ACCESS FULL | TEST_V |     5 |    30 |     5   (0)| 00:00:01 |
|   7 |  SORT AGGREGATE     |        |     1 |     6 |            |          |
|*  8 |   TABLE ACCESS FULL | TEST_V |     1 |     6 |     5   (0)| 00:00:01 |
|   9 |  SORT AGGREGATE     |        |     1 |     6 |            |          |
|* 10 |   TABLE ACCESS FULL | TEST_V |     2 |    12 |     5   (0)| 00:00:01 |
|  11 |  SORT AGGREGATE     |        |     1 |     6 |            |          |
|* 12 |   TABLE ACCESS FULL | TEST_V |     1 |     6 |     5   (0)| 00:00:01 |
|  13 |  SORT AGGREGATE     |        |     1 |     6 |            |          |
|* 14 |   TABLE ACCESS FULL | TEST_V |     1 |     6 |     5   (0)| 00:00:01 |
|* 15 |  FILTER             |        |       |       |            |          |
|* 16 |   TABLE ACCESS FULL | XXX    |  1956 | 76284 |    12   (0)| 00:00:01 |
|  17 |   SORT AGGREGATE    |        |     1 |     4 |            |          |
|* 18 |    TABLE ACCESS FULL| TEST_V |    16 |    64 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."ZCRYZH"=TO_CHAR(:B1))
   4 - filter("A"."ZCRYZH"=TO_CHAR(:B1) AND "A"."MYD"='0')
   6 - filter("A"."ZCRYZH"=TO_CHAR(:B1) AND "A"."MYD"='1')
   8 - filter("A"."ZCRYZH"=TO_CHAR(:B1) AND "A"."MYD"='2')
  10 - filter("A"."ZCRYZH"=TO_CHAR(:B1) AND "A"."MYD"='3')
  12 - filter("A"."ZCRYZH"=TO_CHAR(:B1) AND "A"."MYD"='4')
  14 - filter("A"."ZCRYZH"=TO_CHAR(:B1) AND "A"."MYD"='5')
  15 - filter( (SELECT COUNT(*) FROM "TEST_V" "A" WHERE
              "A"."ZCRYZH"=TO_CHAR(:B1))<>0)
  16 - filter("T"."YXBZ"='Y')
  18 - filter("A"."ZCRYZH"=TO_CHAR(:B1))


统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      33261  consistent gets
          0  physical reads
          0  redo size
       2011  bytes sent via SQL*Net to client
        411  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

原始SQL需要消耗33261个逻辑读,扫描TEST_V表8次,更改之后的SQL

SQL> select * from
  2  (select t.zxid,
  3         t.gh,
  4         t.xm,
  5         t.bm,
  6         t.fzjgdm,
  7         decode(count(*), 1, 0, count(*))slzl,
  8         SUM(DECODE(A.MYD, '0', 1, 0)) 无评价,
  9         SUM(DECODE(A.MYD, '1', 1, 0)) 满意,
 10         SUM(DECODE(A.MYD, '2', 1, 0)) 较满意,
 11         SUM(DECODE(A.MYD, '3', 1, 0)) 一般,
 12         SUM(DECODE(A.MYD, '4', 1, 0)) 较不满意,
 13         SUM(DECODE(A.MYD, '5', 1, 0)) 不满意
 14    from CC_ZXJBXX t, test_v a
 15   where to_char(t.zxid) = a.ZCRYZH(+)
 16     and t.yxbz = 'Y'
 17   group by t.zxid, t.gh, t.xm, t.bm, t.fzjgdm
 18  ) where slzl<>0;

已选择20行。

已用时间:  00: 00: 00.01

执行计划
----------------------------------------------------------
Plan hash value: 1348291491

-------------------------------------------------------------------------------------
| Id  | Operation               | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |           |    98 |  4410 |    18  (12)| 00:00:01 |
|*  1 |  FILTER                 |           |       |       |            |          |
|   2 |   HASH GROUP BY         |           |    98 |  4410 |    18  (12)| 00:00:01 |
|*  3 |    HASH JOIN RIGHT OUTER|           |  1956 | 88020 |    17   (6)| 00:00:01 |
|   4 |     TABLE ACCESS FULL   | TEST_V    |   411 |  2466 |     5   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL   | CC_ZXJBXX |  1956 | 76284 |    11   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(DECODE(COUNT(*),1,0,COUNT(*))<>0)
   3 - access("A"."ZCRYZH"(+)=TO_CHAR("T"."ZXID"))
   5 - filter("T"."YXBZ"='Y')


统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         62  consistent gets
          0  physical reads
          0  redo size
       1958  bytes sent via SQL*Net to client
        411  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

逻辑读降低到62,表TEST_V只扫描1次,其实这个SQL只是一个汇总的SQL,稍微有点开发常识的人都知道,最开始写SQL的那个人绝对是在乱写SQL。SQL的优化到这里还没完,

因为我们已经知道这个SQL的目的其实就是汇总,所以直接抛弃刚才的优化思路,直接重写SQL如下:

SQL>  SELECT *
  2    FROM (SELECT T.ZXID,
  3         T.GH,
  4         T.XM,
  5         T.BM,
  6         T.FZJGDM,
  7         SUM(1) SLZL,
  8         SUM(DECODE(A.MYD, '0', 1, 0)) 无评价,
  9         SUM(DECODE(A.MYD, '1', 1, 0)) 满意,
 10         SUM(DECODE(A.MYD, '2', 1, 0)) 较满意,
 11         SUM(DECODE(A.MYD, '3', 1, 0)) 一般,
 12         SUM(DECODE(A.MYD, '4', 1, 0)) 较不满意,
 13         SUM(DECODE(A.MYD, '5', 1, 0)) 不满意
 14    FROM CC_ZXJBXX T ,test_v A
 15  where A.ZCRYZH=T.ZXID
 16  and  T.YXBZ = 'Y'
 17   GROUP BY T.ZXID, T.GH, T.XM, T.BM, T.FZJGDM) order by 1;

已选择20行。

已用时间:  00: 00: 00.01

执行计划
----------------------------------------------------------
Plan hash value: 1317439664

---------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |   411 | 18495 |    18  (12)| 00:00:01 |
|   1 |  SORT GROUP BY      |           |   411 | 18495 |    18  (12)| 00:00:01 |
|*  2 |   HASH JOIN         |           |   411 | 18495 |    17   (6)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| TEST_V    |   411 |  2466 |     5   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| CC_ZXJBXX |  1956 | 76284 |    11   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T"."ZXID"=TO_NUMBER("A"."ZCRYZH"))
   4 - filter("T"."YXBZ"='Y')


统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         62  consistent gets
          0  physical reads
          0  redo size
       1945  bytes sent via SQL*Net to client
        411  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         20  rows processed


有时候开发人员,为了完成任务,他才不管你SQL写得好不好呢,能完成功能就行了,相信各位DBA深有体会。




 

你可能感兴趣的:(JOIN,sql,filter,table,Access,disk)