很多场景下,部分数据是连续,按照业务要求需要求出这些连续的数据,那么如何实现和优化呢?
1、构造实验数据:
CREATE OR REPLACE VIEW v(proj_id, proj_start, proj_end) AS
SELECT 1 , date '2005-01-01', date '2005-01-02' FROM dual UNION ALL
SELECT 2 , date '2005-01-02', date '2005-01-03' FROM dual UNION ALL
SELECT 3 , date '2005-01-03', date '2005-01-04' FROM dual UNION ALL
SELECT 4 , date '2005-01-04', date '2005-01-05' FROM dual UNION ALL
SELECT 5 , date '2005-01-06', date '2005-01-07' FROM dual UNION ALL
SELECT 6 , date '2005-01-16', date '2005-01-17' FROM dual UNION ALL
SELECT 7 , date '2005-01-17', date '2005-01-18' FROM dual UNION ALL
SELECT 8 , date '2005-01-18', date '2005-01-19' FROM dual UNION ALL
SELECT 9 , date '2005-01-19', date '2005-01-20' FROM dual UNION ALL
SELECT 10, date '2005-01-21', date '2005-01-22' FROM dual UNION ALL
SELECT 11, date '2005-01-26', date '2005-01-27' FROM dual UNION ALL
SELECT 12, date '2005-01-27', date '2005-01-28' FROM dual UNION ALL
SELECT 13, date '2005-01-28', date '2005-01-29' FROM dual UNION ALL
SELECT 14, date '2005-01-29', date '2005-01-30' FROM dual;
2、可以看到有部分工程的时间是连续的,现在要求把这种连续的数据查出来
select * from v;
select v1.proj_id as 工程号,
v1.proj_start as 开始日期,
v1.proj_end as 结束日期
from v v1, v v2
where v1.proj_end = v2.proj_start;
自关联查询的执行计划如下所示:
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 56 (0)| 00:00:01 |
|* 1 | HASH JOIN | | 2 | 60 | 56 (0)| 00:00:01 |
| 2 | VIEW | V | 14 | 294 | 28 (0)| 00:00:01 |
| 3 | UNION-ALL | | | | | |
| 4 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 6 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 8 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 9 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 10 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 11 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 12 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 13 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 14 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 15 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 16 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 17 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 18 | VIEW | V | 14 | 126 | 28 (0)| 00:00:01 |
| 19 | UNION-ALL | | | | | |
| 20 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 21 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 22 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 23 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 24 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 25 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 26 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 27 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 28 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 29 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 30 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 31 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 32 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 33 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("V1"."PROJ_END"="V2"."PROJ_START")
4、第二种方法,使用分析函数:lead() over()函数
select 工程号, 开始日期, 结束日期
from (select proj_id as 工程号,
proj_start as 开始日期,
proj_end as 结束日期,
lead(proj_start) over(order by proj_id) as 下一工程开始日期
from v)
where 结束日期 = 下一工程开始日期;
分析函数查询的执行计划如下所示:
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 420 | 29 (4)| 00:00:01 |
|* 1 | VIEW | | 14 | 420 | 29 (4)| 00:00:01 |
| 2 | WINDOW SORT | | 14 | 294 | 29 (4)| 00:00:01 |
| 3 | VIEW | V | 14 | 294 | 28 (0)| 00:00:01 |
| 4 | UNION-ALL | | | | | |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 6 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 8 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 9 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 10 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 11 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 12 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 13 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 14 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 15 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 16 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 17 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 18 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("结束日期"="下一工程开始日期")
看到这里,铁子们应该高下立判了,使用分析函数可以只访问一次表或视图,而使用自关联的方法查询的话,需要访问两次,查询次数一多的话性能差异就上来了。