oracle 提供了start with connect by 语法结构可以实现递归查询。
1. 一个简单举例:
SQL> select * from test;
BILL_MONTH DAY_NUMBER MSISDN
-------------------- ---------- --------------------
200803 1 13800
200803 3 13800
200803 2 13800
200803 2 13801
200803 4 13804
200803 5 13804
200803 7 13804
200803 8 13804
200803 6 13802
200803 6 13801
200803 7 13801
200803 8 13801
12 rows selected
SQL>
SQL> select * from test
2 start with day_number=1
3 connect by prior day_number=day_number-1 and prior msisdn= msisdn
4 ;
BILL_MONTH DAY_NUMBER MSISDN
-------------------- ---------- --------------------
200803 1 13800
200803 2 13800
200803 3 13800
SQL>
上面的语句查找出了从1开始,并且day_number 逐渐+1 递增的,并且 msisdn 相同的哪些个数据.
2. start with connect by 语法结构
如上面说看到的 例子, 其语法结构为 start with condition connect by condition (含 prior 关键字)
start with conditon 给出的seed 数据的范围, connect by 后面给出了递归查询的条件,prior 关键字表示父数据,prior 条件表示子数据需要满足父数据的什么条件【要的时候有两种写法:connect by prior day_number=day_number-1 或 connect by day_number=prior day_number-1,前一种写法表示采用自上而下的搜索方式(先找父节点然后找子节点),后一种写法表示采用自下而上的搜索方式(先找叶子节点然后找父节点)】。
在下面的这个start with connect by 结构中,就表示 查找出了从1开始,并且day_number 逐渐+1 递增的,并且 msisdn 相同的那些个数据.
start with day_number=1
connect by prior day_number=day_number-1 and prior msisdn= msisdn
3. 执行计划
对于这个特殊的语法结构,我们来看看它的执行计划。
通过下面的执行计划,我们可以看出,对于简单的访问一个对象的递归查询,实际上oracle 要三次访问要查询的对象。因此,这一个告诉我们,在使用递归查询时,一定要谨慎,因为即使原表数据不多,但是三倍的访问喜爱来,代价也会很大。
SQL> explain plan for
2
2 select * from test
3 --where bill_month='200803'
4 start with day_number=1
5 connect by prior day_number=day_number-1 and prior msisdn= msisdn
6 ;
Explained
SQL> select * from table( dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
|* 1 | CONNECT BY WITH FILTERING| | | | |
|* 2 | FILTER | | | | |
| 3 | TABLE ACCESS FULL | TEST | | | |
| 4 | NESTED LOOPS | | | | |
| 5 | BUFFER SORT | | | | |
| 6 | CONNECT BY PUMP | | | | |
|* 7 | TABLE ACCESS FULL | TEST | | | |
| 8 | TABLE ACCESS FULL | TEST | | | |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TEST"."DAY_NUMBER"=1)
2 - filter("TEST"."DAY_NUMBER"=1)
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
7 - filter("TEST"."MSISDN"=NULL AND "TEST"."DAY_NUMBER"-1=NULL)
Note: rule based optimization
23 rows selected
SQL>
另外,发现了在含有其他条件的递归中,是先处理所有的递归查询,最后才用加入的条件过滤.
请看下面的例子。
和上面的执行计划对比下我们可以知道,加入条件 where bill_month='200803' 后,实际上却是在递归完成后,最后才执行的 1 - filter("TEST"."BILL_MONTH"='200803') 。
所 以,为了确保语句的性能,不要直接加入条件在start with connect by 结构中,而是要想办法将原表的数据控制住。这个可以采用子查询的办法,或者使用临时表等(最好采用临时表,将数据量从本源上控制住;因为从子查询的执行计 划我们可以看到,它每次也都是访问全表,再用条件过滤,要重复三次,不是一次过滤就够了).
--直接加入条件后的执行计划
SQL> explain plan for
2
2 select * from test
3 where bill_month='200803'
4 start with day_number=1
5 connect by prior day_number=day_number-1 and prior msisdn= msisdn
6 ;
Explained
SQL> select * from table( dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
|* 1 | FILTER | | | | |
|* 2 | CONNECT BY WITH FILTERING| | | | |
|* 3 | FILTER | | | | |
| 4 | TABLE ACCESS FULL | TEST | | | |
| 5 | NESTED LOOPS | | | | |
| 6 | BUFFER SORT | | | | |
| 7 | CONNECT BY PUMP | | | | |
|* 8 | TABLE ACCESS FULL | TEST | | | |
| 9 | TABLE ACCESS FULL | TEST | | | |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TEST"."BILL_MONTH"='200803')
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
2 - filter("TEST"."DAY_NUMBER"=1)
3 - filter("TEST"."DAY_NUMBER"=1)
8 - filter("TEST"."MSISDN"=NULL AND "TEST"."DAY_NUMBER"-1=NULL)
Note: rule based optimization
25 rows selected
SQL>
--使用子查询,将过滤条件嵌在子查询中
SQL> explain plan for
2
2 select * from (select * from test
3 where bill_month='200803')
4 start with day_number=1
5 connect by prior day_number=day_number-1 and prior msisdn= msisdn
6 ;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
|* 1 | CONNECT BY WITH FILTERING| | | | |
|* 2 | FILTER | | | | |
|* 3 | TABLE ACCESS FULL | TEST | | | |
| 4 | NESTED LOOPS | | | | |
| 5 | BUFFER SORT | | | | |
| 6 | CONNECT BY PUMP | | | | |
|* 7 | TABLE ACCESS FULL | TEST | | | |
|* 8 | TABLE ACCESS FULL | TEST | | | |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TEST"."DAY_NUMBER"=1)
2 - filter("TEST"."DAY_NUMBER"=1)
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
3 - filter("TEST"."BILL_MONTH"='200803')
7 - filter("TEST"."BILL_MONTH"='200803' AND "TEST"."MSISDN"=NULL AND
"TEST"."DAY_NUMBER"-1=NULL)
8 - filter("TEST"."BILL_MONTH"='200803')
Note: rule based optimization
26 rows selected
SQL>
4. 实际中 递归查询的使用。
问题:
数据库里有字段day_number,msisdn。如何写月度连续3天有记录的手机号?表结构如下:
id bill_month day_number msisdn
1 200803 1 13800000000
2 200803 1 130137.....
3 200803 2 13800000000
4 200803 3 13800000000
..............................
表中3月份连续3天有记录的纪录就是1380000000。请问如何写这样的sql?
解决方案:
SQL> create table test ( bill_month varchar2(20),day_number number ,msisdn varchar2(20));
Table created
SQL> insert into test values ( '200803',1,'13800');
1 row inserted
SQL> insert into test values ( '200803',3,'13800');
1 row inserted
SQL> insert into test values ( '200803',2,'13800');
1 row inserted
SQL> insert into test values ( '200803',2,'13801');
1 row inserted
SQL> insert into test values ( '200803',4,'13804');
1 row inserted
SQL> insert into test values ( '200803',5,'13804');
1 row inserted
SQL> commit;
Commit complete
SQL> select * from test;
BILL_MONTH DAY_NUMBER MSISDN
-------------------- ---------- --------------------
200803 1 13800
200803 3 13800
200803 2 13800
200803 2 13801
200803 4 13804
200803 5 13804
6 rows selected
SQL>
SQL> select distinct msisdn from test a
2 where bill_month='200803'
3 and exists
4 ( select msisdn from test
5 where bill_month='200803' and msisdn=a.msisdn
6 start with day_number=a.day_number
7 connect by prior day_number=day_number-1 and prior msisdn= msisdn
8 group by msisdn
9 having count(*)>=3
10 );
MSISDN
--------------------
13800
SQL> select * from test;
BILL_MONTH DAY_NUMBER MSISDN
-------------------- ---------- --------------------
200803 1 13800
200803 3 13800
200803 2 13800
200803 2 13801
200803 4 13804
200803 5 13804
6 rows selected
SQL> insert into test values ( '200803',7,'13804');
1 row inserted
SQL> insert into test values ( '200803',8,'13804');
1 row inserted
SQL> insert into test values ( '200803',6,'13802');
1 row inserted
SQL> insert into test values ( '200803',6,'13801');
1 row inserted
SQL> insert into test values ( '200803',7,'13801');
1 row inserted
SQL> insert into test values ( '200803',8,'13801');
1 row inserted
SQL> select * from test;
BILL_MONTH DAY_NUMBER MSISDN
-------------------- ---------- --------------------
200803 1 13800
200803 3 13800
200803 2 13800
200803 2 13801
200803 4 13804
200803 5 13804
200803 7 13804
200803 8 13804
200803 6 13802
200803 6 13801
200803 7 13801
200803 8 13801
12 rows selected
SQL> commit;
Commit complete
SQL>
SQL> select distinct msisdn from test a
2 where bill_month='200803'
3 and exists
4 ( select msisdn from test
5 where bill_month='200803' and msisdn=a.msisdn
6 start with day_number=a.day_number
7 connect by prior day_number=day_number-1 and prior msisdn= msisdn
8 group by msisdn
9 having count(*)>=3
10 );
MSISDN
--------------------
13800
13801