exists和in的区别和优化

在oracle中有exists、not exists来进行一些表连接操作,在普通情况下Oracle的执行计划如果使用到semi-join、anti-join方式来进行表连接的话,速度一遍很快,否则很慢。通过使用提示hash_sj hash_aj也可使执行计划走semi-join、anti-join。
下面有两种不同的写法,都会走semi-join
SQL> create table dev.a1 as select * from dba_objects where rownum <= 10000;

Table created.

SQL> create table dev.a2 as 
  2       select * from (select t.*,rownum rd from dba_objects t) p where rd>5000 and rd<= 15000;

Table created.

SQL> create table dev.a3 as 
  2       select * from (select t.*,rownum rd from dba_objects t) p where rd>7500 and rd<= 17500;

Table created.

SQL> analyze table dev.a1 compute statistics;

Table analyzed.

SQL> analyze table dev.a2 compute statistics;

Table analyzed.

SQL> analyze table dev.a3 compute statistics;

Table analyzed.

SQL> set autotrace traceonly;      
SQL> set timing on
SQL> set linesize 177
SQL> /* Formatted on 2013/09/21 14:21:36 (QP5 v5.227.12220.39754) */
SQL>   SELECT a1.owner, COUNT (*)
  2      FROM dev.a1
  3     WHERE    EXISTS
  4                 (SELECT /*+ hash_sj(a1 a2) */
  5                        *
  6                    FROM dev.a2
  7                   WHERE a2.object_id = a1.object_id)
  8           OR EXISTS
  9                 (SELECT /*+ hash_sj(a1 a2) */
 10                        *
 11                    FROM dev.a3
 12                   WHERE a3.object_id = a1.object_id)
 13  GROUP BY a1.owner;

Elapsed: 00:00:00.18

Execution Plan
----------------------------------------------------------
Plan hash value: 3327704572

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     5 |    60 |   124   (2)| 00:00:02 |
|   1 |  HASH GROUP BY        |         |     5 |    60 |   124   (2)| 00:00:02 |
|*  2 |   HASH JOIN SEMI      |         |  5082 | 60984 |   123   (1)| 00:00:02 |
|   3 |    TABLE ACCESS FULL  | A1      | 10000 | 80000 |    40   (0)| 00:00:01 |
|   4 |    VIEW               | VW_SQ_1 | 18422 | 73688 |    82   (0)| 00:00:01 |
|   5 |     UNION-ALL         |         |       |       |            |          |
|   6 |      TABLE ACCESS FULL| A3      |  8422 | 33688 |    38   (0)| 00:00:01 |
|   7 |      TABLE ACCESS FULL| A2      | 10000 | 40000 |    44   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VW_COL_1"="A1"."OBJECT_ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        406  consistent gets
          0  physical reads
          0  redo size
        709  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

SQL> 

SQL> /* Formatted on 2013/09/21 14:23:32 (QP5 v5.227.12220.39754) */
SQL>   SELECT x.owner, COUNT (*)
  2      FROM (SELECT *
  3              FROM dev.a1
  4             WHERE EXISTS
  5                      (SELECT /*+ hash_sj(a1 a2)*/
  6                             *
  7                         FROM dev.a2
  8                        WHERE a2.object_id = a1.object_id)
  9            UNION ALL
 10            SELECT *
 11              FROM dev.a1
 12             WHERE     NOT EXISTS
 13                          (SELECT /*+ hash_aj(a1 p)*/
 14                                 *
 15                             FROM dev.a2 p
 16                            WHERE p.object_id = a1.object_id)
 17                   AND EXISTS
 18                          (SELECT /*+ hash_sj(a1 a3)*/
 19                                 *
 20                             FROM dev.a3
 21                            WHERE a3.object_id = a1.object_id)) x
 22  GROUP BY x.owner;

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 195072970

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     5 |    25 |   210   (2)| 00:00:03 |
|   1 |  HASH GROUP BY           |      |     5 |    25 |   210   (2)| 00:00:03 |
|   2 |   VIEW                   |      |  5108 | 25540 |   209   (2)| 00:00:03 |
|   3 |    UNION-ALL             |      |       |       |            |          |
|*  4 |     HASH JOIN RIGHT SEMI |      |  5082 | 60984 |    85   (2)| 00:00:02 |
|   5 |      TABLE ACCESS FULL   | A2   | 10000 | 40000 |    44   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL   | A1   | 10000 | 80000 |    40   (0)| 00:00:01 |
|*  7 |     HASH JOIN ANTI       |      |    26 |   416 |   124   (2)| 00:00:02 |
|*  8 |      HASH JOIN RIGHT SEMI|      |  2587 | 31044 |    79   (2)| 00:00:01 |
|   9 |       TABLE ACCESS FULL  | A3   |  8422 | 33688 |    38   (0)| 00:00:01 |
|  10 |       TABLE ACCESS FULL  | A1   | 10000 | 80000 |    40   (0)| 00:00:01 |
|  11 |      TABLE ACCESS FULL   | A2   | 10000 | 40000 |    44   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("A2"."OBJECT_ID"="A1"."OBJECT_ID")
   7 - access("P"."OBJECT_ID"="A1"."OBJECT_ID")
   8 - access("A3"."OBJECT_ID"="A1"."OBJECT_ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        686  consistent gets
          0  physical reads
          0  redo size
        709  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

SQL> 
SQL> SELECT a1.owner, COUNT (*)
  2      FROM dev.a1
  3     WHERE    EXISTS
  4                 (SELECT /*+ hash_sj(a1 a2) */
  5                        *
  6                    FROM dev.a2
  7                   WHERE a2.object_id = a1.object_id)
  8           OR EXISTS
  9                 (SELECT /*+ hash_sj(a1 a2) */
 10                        *
 11                    FROM dev.a3
 12                   WHERE a3.object_id = a1.object_id)
 13  GROUP BY a1.owner;

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 3327704572

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     5 |    60 |   124   (2)| 00:00:02 |
|   1 |  HASH GROUP BY        |         |     5 |    60 |   124   (2)| 00:00:02 |
|*  2 |   HASH JOIN SEMI      |         |  5082 | 60984 |   123   (1)| 00:00:02 |
|   3 |    TABLE ACCESS FULL  | A1      | 10000 | 80000 |    40   (0)| 00:00:01 |
|   4 |    VIEW               | VW_SQ_1 | 18422 | 73688 |    82   (0)| 00:00:01 |
|   5 |     UNION-ALL         |         |       |       |            |          |
|   6 |      TABLE ACCESS FULL| A3      |  8422 | 33688 |    38   (0)| 00:00:01 |
|   7 |      TABLE ACCESS FULL| A2      | 10000 | 40000 |    44   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VW_COL_1"="A1"."OBJECT_ID")


Statistics
----------------------------------------------------------
         54  recursive calls
          0  db block gets
        424  consistent gets
          1  physical reads
          0  redo size
        709  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
          5  rows processed

SQL> 
SQL> 
SQL> /* Formatted on 2013/09/21 14:23:32 (QP5 v5.227.12220.39754) */
SQL>   SELECT x.owner, COUNT (*)
  2      FROM (SELECT *
  3              FROM dev.a1
  4             WHERE EXISTS
  5                      (SELECT /*+ hash_sj(a1 a2)*/
  6                             *
  7                         FROM dev.a2
  8                        WHERE a2.object_id = a1.object_id)
  9            UNION ALL
 10            SELECT *
 11              FROM dev.a1
 12             WHERE     NOT EXISTS
 13                          (SELECT /*+ hash_aj(a1 p)*/
 14                                 *
 15                             FROM dev.a2 p
 16                            WHERE p.object_id = a1.object_id)
 17                   AND EXISTS
 18                          (SELECT /*+ hash_sj(a1 a3)*/
 19                                 *
 20                             FROM dev.a3
 21                            WHERE a3.object_id = a1.object_id)) x
 22  GROUP BY x.owner;

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 195072970

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     5 |    25 |   210   (2)| 00:00:03 |
|   1 |  HASH GROUP BY           |      |     5 |    25 |   210   (2)| 00:00:03 |
|   2 |   VIEW                   |      |  5108 | 25540 |   209   (2)| 00:00:03 |
|   3 |    UNION-ALL             |      |       |       |            |          |
|*  4 |     HASH JOIN RIGHT SEMI |      |  5082 | 60984 |    85   (2)| 00:00:02 |
|   5 |      TABLE ACCESS FULL   | A2   | 10000 | 40000 |    44   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL   | A1   | 10000 | 80000 |    40   (0)| 00:00:01 |
|*  7 |     HASH JOIN ANTI       |      |    26 |   416 |   124   (2)| 00:00:02 |
|*  8 |      HASH JOIN RIGHT SEMI|      |  2587 | 31044 |    79   (2)| 00:00:01 |
|   9 |       TABLE ACCESS FULL  | A3   |  8422 | 33688 |    38   (0)| 00:00:01 |
|  10 |       TABLE ACCESS FULL  | A1   | 10000 | 80000 |    40   (0)| 00:00:01 |
|  11 |      TABLE ACCESS FULL   | A2   | 10000 | 40000 |    44   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("A2"."OBJECT_ID"="A1"."OBJECT_ID")
   7 - access("P"."OBJECT_ID"="A1"."OBJECT_ID")
   8 - access("A3"."OBJECT_ID"="A1"."OBJECT_ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        686  consistent gets
          0  physical reads
          0  redo size
        709  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

SQL> 


如果子句中含有group by这种排序操作,使用in来代替exists操作会大大提高效率,示例如下:
修改一下示例表:
SQL> insert into dev.a2 select * from dev.a2;
SQL> commit;
使用exists操作,无论添加什么提示都不能使其走semi-join,耗费大量时间
SQL> set linesize 177
SQL> set timing on
SQL> set autotrace on
SQL> /* Formatted on 2013/09/21 14:40:50 (QP5 v5.227.12220.39754) */
SQL> SELECT COUNT (*)
  2    FROM dev.a1 a
  3   WHERE EXISTS
  4            (SELECT /*+ hash_sj(a b) */
  5                   *
  6               FROM (  SELECT object_id
  7                         FROM dev.a2
  8                     GROUP BY object_id
  9                       HAVING COUNT (*) > 1) b
 10              WHERE a.object_id = b.object_id);

  COUNT(*)
----------
      5000

Elapsed: 00:00:16.13

Execution Plan
----------------------------------------------------------
Plan hash value: 1410689026

--------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     1 |     3 |   220K  (1)| 00:44:12 |
|   1 |  SORT AGGREGATE         |      |     1 |     3 |            |          |
|*  2 |   FILTER                |      |       |       |            |          |
|   3 |    TABLE ACCESS FULL    | A1   | 10000 | 30000 |    40   (0)| 00:00:01 |
|*  4 |    FILTER               |      |       |       |            |          |
|   5 |     SORT GROUP BY NOSORT|      |     1 |     4 |    44   (0)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL  | A2   |     1 |     4 |    44   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter( EXISTS (SELECT 0 FROM "DEV"."A2" "A2" WHERE
              "OBJECT_ID"=:B1 GROUP BY "OBJECT_ID" HAVING COUNT(*)>1))
   4 - filter(COUNT(*)>1)
   6 - filter("OBJECT_ID"=:B1)


Statistics
----------------------------------------------------------
          0  recursive calls
      10000  db block gets
    3190133  consistent gets
          0  physical reads
          0  redo size
        526  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

将exists改为in句型,就可以用到semi-join,效率大大提高

SQL> /* Formatted on 2013/09/21 14:45:40 (QP5 v5.227.12220.39754) */
SQL> SELECT COUNT (*)
  2    FROM dev.a1 a
  3   WHERE object_id IN (  SELECT object_id
  4                           FROM dev.a2 b
  5                       GROUP BY object_id
  6                         HAVING COUNT (*) > 1);

  COUNT(*)
----------
      5000

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 3434654636

-----------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |    16 |    86   (3)| 00:00:02 |
|   1 |  SORT AGGREGATE        |          |     1 |    16 |            |          |
|*  2 |   HASH JOIN RIGHT SEMI |          |   500 |  8000 |    86   (3)| 00:00:02 |
|   3 |    VIEW                | VW_NSO_1 |   500 |  6500 |    45   (3)| 00:00:01 |
|*  4 |     FILTER             |          |       |       |            |          |
|   5 |      HASH GROUP BY     |          |    25 |  2000 |    45   (3)| 00:00:01 |
|   6 |       TABLE ACCESS FULL| A2       | 10000 | 40000 |    44   (0)| 00:00:01 |
|   7 |    TABLE ACCESS FULL   | A1       | 10000 | 30000 |    40   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"="OBJECT_ID")
   4 - filter(COUNT(*)>1)


Statistics
----------------------------------------------------------
          2  recursive calls
          1  db block gets
        454  consistent gets
          0  physical reads
          0  redo size
        526  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

你可能感兴趣的:(exists和in的区别和优化)