概念:子查询展开(Subquery Unnesting)是优化器处理带子查询的目标sql的一种优化手段,它是指优化器不再将目标sql中子查询当作一个独立的处理单元来单独执行,而是将该子查询转换为它自身和外部查询之间等价的表连接。这种等价连接转换要么是将子查询展开(即将该子查询中的表,视图从子查询中拿出来,然后和外部查询中的表,视图做表连接),要么是不拆开但是会把该子查询转换为一个内嵌视图(Inline View)然后再和外部查询中的表,视图做表连接。Oracle 会确保子查询展开所对应的等价连接转换的正确性,即转换后的sql和原sql在语义上一定是等价的。当然不是所有的子查询都能做子查询展开,有些子查询是不能做这种等价表连接转换的,这种情况下oracle就不会对其做子查询展开,也就是说此时oracle还是会将该子查询当作一个独立的处理单元来单独执行。另外,在oracle10g以后版本中,对于那种不拆开子查询但是会把该子查询转换成一个内嵌视图的子查询展开,只有当经过子查询展开后的等价改写sql的成本值小于原sql的成本值时,oracle才会对原sql执行子查询展开
子查询展开通常都会提高原sql的执行效率,因为如果原sql不做子查询展开,那么通常情况下该子查询就会在其执行计划的最后一步才被执行,并且会走filter类型的执行计划,这也就意味着对于外部查询所在结果集的没一条记录,该子查询就会被执行多少次,这种执行方式的执行效率通常情况不会太高,尤其在子查询中包含两个或两个以上表连接时,此时做子查询展开后的执行效率往往会比走filter类型的执行计划高很多。
Oracle 数据库里子查询前where条件如果是如下这些条件之一,那么这种类型的目标sql在满足了一定条件后就可以做子查询展开,
single-row,exists,not exists,in ,not in,any,all。
范例1:
SQL> set lines 200 pagesize 1000 in写法: SELECT t1.cust_last_name, t1.cust_id FROM customers t1 WHERE t1.cust_id IN 4 (SELECT t2.cust_id FROM sales t2 WHERE t2.amount_sold > 700) 5 ; Execution Plan ---------------------------------------------------------- Plan hash value: 2448612695 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7059 | 158K| | 1583 (1)| 00:00:20 | | | |* 1 | HASH JOIN SEMI | | 7059 | 158K| 1360K| 1583 (1)| 00:00:20 | | | | 2 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 704K| | 405 (1)| 00:00:05 | | | | 3 | PARTITION RANGE ALL| | 560K| 5469K| | 526 (2)| 00:00:07 | 1 | 28 | |* 4 | TABLE ACCESS FULL | SALES | 560K| 5469K| | 526 (2)| 00:00:07 | 1 | 28 | ---------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- any等价写法: SELECT t1.cust_last_name, t1.cust_id FROM customers t1 WHERE t1.cust_id = ANY 4 (SELECT t2.cust_id FROM sales t2 WHERE t2.amount_sold > 700) 5 ; Execution Plan ---------------------------------------------------------- Plan hash value: 2448612695 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7059 | 158K| | 1583 (1)| 00:00:20 | | | |* 1 | HASH JOIN SEMI | | 7059 | 158K| 1360K| 1583 (1)| 00:00:20 | | | | 2 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 704K| | 405 (1)| 00:00:05 | | | | 3 | PARTITION RANGE ALL| | 560K| 5469K| | 526 (2)| 00:00:07 | 1 | 28 | |* 4 | TABLE ACCESS FULL | SALES | 560K| 5469K| | 526 (2)| 00:00:07 | 1 | 28 | ---------------------------------------------------------------------------------------------------------- exists等价写法: SELECT t1.cust_last_name, t1.cust_id FROM customers t1 WHERE t1.cust_id = ANY 4 (SELECT t2.cust_id FROM sales t2 WHERE t2.amount_sold > 700) 5 ; Execution Plan ---------------------------------------------------------- Plan hash value: 2448612695 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7059 | 158K| | 1583 (1)| 00:00:20 | | | |* 1 | HASH JOIN SEMI | | 7059 | 158K| 1360K| 1583 (1)| 00:00:20 | | | | 2 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 704K| | 405 (1)| 00:00:05 | | | | 3 | PARTITION RANGE ALL| | 560K| 5469K| | 526 (2)| 00:00:07 | 1 | 28 | |* 4 | TABLE ACCESS FULL | SALES | 560K| 5469K| | 526 (2)| 00:00:07 | 1 | 28 | ----------------------------------------------------------------------------------------------------------
不展开,显然不合理,sales表要执行很多次: SELECT t1.cust_last_name, t1.cust_id FROM customers t1 WHERE t1.cust_id IN (SELECT /*+ no_unnest */ t2.cust_id FROM sales t2 WHERE t2.amount_sold > 700)
子查询展开后,变成hash 半连接:
等价写法:(如果cust_id是唯一键值)可以转换为内连接:
SELECT t1.cust_last_name, t1.cust_id FROM customers t1, sales t2 WHERE t1.cust_id= t2.cust_id AND t2.amount_sold > 700
如果是not in,则会转换为hash 反连接:
SQL> set autot trace SELECT t1.cust_last_name, t1.cust_id FROM customers t1 WHERE t1.cust_id not in 4 (SELECT t2.cust_id FROM sales t2 WHERE t2.amount_sold > 700); Execution Plan ---------------------------------------------------------- Plan hash value: 2850422635 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 48441 | 1088K| | 1583 (1)| 00:00:20 | | | |* 1 | HASH JOIN ANTI | | 48441 | 1088K| 1360K| 1583 (1)| 00:00:20 | | | | 2 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 704K| | 405 (1)| 00:00:05 | | | | 3 | PARTITION RANGE ALL| | 560K| 5469K| | 526 (2)| 00:00:07 | 1 | 28 | |* 4 | TABLE ACCESS FULL | SALES | 560K| 5469K| | 526 (2)| 00:00:07 | 1 | 28 | ----------------------------------------------------------------------------------------------------------
把子查询转换成内联视图:
SELECT t1.cust_last_name, t1.cust_id FROM customers t1 WHERE t1.cust_id NOT IN (SELECT t2.cust_id FROM sales t2, products t3 WHERE t2.prod_id = t3.prod_id and t2.amount_sold > 700) Execution Plan ---------------------------------------------------------- Plan hash value: 1272298339 -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 48441 |1229K| |1665 (1)| 00:00:20 | | | |* 1 | HASH JOIN ANTI | | 48441 |1229K|1360K|1665 (1)| 00:00:20 | | | | 2 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 704K| | 405 (1)| 00:00:05 | | | | 3 | VIEW | VW_NSO_1 | 560K|7110K| | 529 (2)| 00:00:07 | | | |* 4 | HASH JOIN | | 560K|9844K| | 529 (2)| 00:00:07 | | | | 5 | INDEX FULL SCAN | PRODUCTS_PK | 72 | 288 | | 1 (0)| 00:00:01 | | | | 6 | PARTITION RANGE ALL| | 560K|7657K| | 526 (2)| 00:00:07 | 1 | 28 | |* 7 | TABLE ACCESS FULL | SALES | 560K|7657K| | 526 (2)| 00:00:07 | 1 | 28 | --------------------------------------------------------------------------------------------------------------
这里oracle把子查询转换成内联视图 VM_NSO_1,然后再和外部查询中的表customers做hash半连接。
等价:
SELECT t1.cust_last_name, t1.cust_id FROM customers t1, (SELECT t2.cust_id FROM sales t2, products t3 WHERE t2.prod_id = t3.prod_id AND t2.amount_sold > 700) vm_nso_1 WHERE t1.cust_id semi = vm_nso_1.cust_id
子查询是否能够做子查询展开取决于如下两个条件:
子查询展开所对应的等价改写sql和原sql在语义上一定要完全等价的,如果改写后的sql和原sql并不一定能保持语义上的完全等价,这种类型的子查询就不能做子查询展开。
对于不能拆开的子查询但是会把它转换为一个内嵌视图的子查询展开,只有经过子查询展开的等价改写sql成本值小于原sql的成本值。oracle才会对目标sql执行子查询展开。
对于子查询展开的第一种情形(即将子查询展开,把该子查询中的表,视图从子查询中拿出来,然后和外部查询中表,视图做表连接),即使在oracle 10g以后的版本中,oracle也不会考虑子查询展开的成本,即oracle此时会认为这种情形下子查询展开的效率始终比不展开的效率高,这就意味着如果目标sql满足子查询展开的第一种情形。则oracle始终会做子查询展开,而不管经过子查询展开后的等价sql的成本值是否小于原sql的成本值。