connect by递归查询

   对于递归查询,在11g之前只能用start with ... connnect by prior来实现,11g之后可用with as来实现,这里的讨论只基于connect by的形式。
   在10.2.0.2之后对于递归查询,CBO可以有2种选择:connect_by_filtering和no_connect_by_filtering,另外还有一些隐藏参数也影响着递归查询的执行计划。

参数

KSPPINM                             KSPPSTVL   KSPPDESC
----------------------------------- ---------- ------------------------------------------------------------
_optimizer_connect_by_cost_based    TRUE       use cost-based transformation for connect by
_optimizer_connect_by_combine_sw    TRUE       combine no filtering connect by and start with
_optimizer_connect_by_elim_dups     TRUE       allow connect by to eliminate duplicates from input
_optimizer_connect_by_cb_whr_only   FALSE      use cost-based transformation for whr clause in connect

分类

hint:NO_CONNECT_BY_FILTERING/CONNECT_BY_FILTERING

connect_by_filtering


A hierarchical query is typically executed using a plan that starts with the operation CONNECT BY WITH FILTERING, which has two child operations. 
The first child operation implements the START WITH clause 
the second child operation contains a step called CONNECT BY PUMP, implementing the recursive part of the query


CONNECT BY WITH FILTERING通常有2个子节点,如:
----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |       |       |   233 (100)|          |
|*  1 |  CONNECT BY WITH FILTERING|      |       |       |            |          |
|*  2 |   TABLE ACCESS FULL       | T    |    10 |   110 |    77   (2)| 00:00:01 |
|*  3 |   HASH JOIN               |      |    50 |  1200 |   155   (2)| 00:00:02 |
|   4 |    CONNECT BY PUMP        |      |       |       |            |          |
|*  5 |    TABLE ACCESS FULL      | T    | 50000 |   537K|    78   (3)| 00:00:01 |
----------------------------------------------------------------------------------


第一步:根据start with条件扫描表t
第二步:CONNECT BY PUMP接收第一步扫描出的结果集,然后通过自连接的形式进行递归连接,CONNECT BY PUMP再接收连接后的结果集,再通过自连接的形式进行下一次递归连接,直到达到最后一个层级。
note:Operation 5的扫描次数与递归的level有关,扫描次数=max(level)

如:

SELECT /*+gather_plan_statistics CONNECT_BY_FILTERING*/id,
    parent_id,
    sys_connect_by_path(id, '->') scbp
FROM t START WITH parent_id =3
CONNECT BY parent_id = prior id
    AND indicator = 'N';

执行计划:
PLAN_TABLE_OUTPUT
--------------------------------------------------
| Id  | Operation                 | Name | Starts 
--------------------------------------------------
|   0 | SELECT STATEMENT          |      |      1 
|*  1 |  CONNECT BY WITH FILTERING|      |      1 
|*  2 |   TABLE ACCESS FULL       | T    |      1 
|*  3 |   HASH JOIN               |      |      4 
|   4 |    CONNECT BY PUMP        |      |      4 
|*  5 |    TABLE ACCESS FULL      | T    |      4 
--------------------------------------------------
最大level:
SQL> SELECT max(level)
  2  FROM t START WITH parent_id =3
  3  CONNECT BY parent_id = prior id
  4     AND indicator = 'N';
MAX(LEVEL)
----------
         4

no_connect_by_filtering

这是10.2.0.2出的connect by的新算法,这种方式只会全表扫描表1次,具体是如何实现的,我目前还没研究清楚
如:
SELECT /*+gather_plan_statistics NO_CONNECT_BY_FILTERING*/id,
    parent_id,
    sys_connect_by_path(id, '->') scbp
FROM t START WITH parent_id =3
CONNECT BY parent_id = prior id
    AND indicator = 'N';

执行计划:
------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |      |    60 |  1740 |   517  (86)| 00:00:07 |
|*  1 |  CONNECT BY NO FILTERING WITH START-WITH|      |       |       |            |          |
|   2 |   TABLE ACCESS FULL                     | T    |   100K|  1074K|    77   (2)| 00:00:01 |
------------------------------------------------------------------------------------------------

总结

个人认为no_connect_by_filtering比connect by filtering的优势在于避免了多次扫描表,尤其是level比较大的情况。但是如果需要返回的记录数比较少,在关联列创建索引(上述sql的parent_id列),使用connect by filtering会更好。









































来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/31347199/viewspace-2123067/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/31347199/viewspace-2123067/

你可能感兴趣的:(connect by递归查询)