Hash join与nested loops join的效率比较


创建测试表

SQL> create table t as select rownum id from dba_objects;


Table created.


SQL> create index idx_t on t(id);


Index created.



SQL> exec dbms_stats.gather_table_stats('sys','t',cascade=>true);


PL/SQL procedure successfully completed.



SQL> create table t1 as select id,'T1' name from t where id<1000;


Table created.



SQL> create index idx_t1 on t1(id);


Index created.



SQL> exec dbms_stats.gather_table_stats('sys','t1',cascade=>true);


PL/SQL procedure successfully completed.


上面的表中t表有7万多行,t1表1000行数据


SQL> set autotrace traceonly;

SQL> select * from t,t1 where t1.id<100 and t.id=t1.id;


99 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 3627535484


----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 98 | 1176 | 5 (20)| 00:00:01 |

|* 1 | HASH JOIN | | 98 | 1176 | 5 (20)| 00:00:01 |

|* 2 | INDEX RANGE SCAN | IDX_T | 99 | 495 | 2 (0)| 00:00:01 |

|* 3 | TABLE ACCESS FULL| T1 | 99 | 693 | 2 (0)| 00:00:01 |

----------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


1 - access("T"."ID"="T1"."ID")

2 - access("T"."ID"<100)

3 - filter("T1"."ID"<100)



Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

13 consistent gets

0 physical reads

0 redo size

2881 bytes sent via SQL*Net to client

590 bytes received via SQL*Net from client

8 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

99 rows processed



从上面看出CBO默认根据统计信息计算使用Hash join方式,逻辑读13

SQL> select /*+ use_nl(t,t1) */ * from t,t1 where t1.id<100 and t.id=t1.id;


99 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 967599162


----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 98 | 1176 | 58 (2)| 00:00:01 |

| 1 | NESTED LOOPS | | 98 | 1176 | 58 (2)| 00:00:01 |

|* 2 | INDEX RANGE SCAN | IDX_T | 99 | 495 | 2 (0)| 00:00:01 |

|* 3 | TABLE ACCESS FULL| T1 | 1 | 7 | 1 (0)| 00:00:01 |

----------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


2 - access("T"."ID"<100)

3 - filter("T1"."ID"<100 AND "T"."ID"="T1"."ID")



Statistics

----------------------------------------------------------

1 recursive calls

0 db block gets

412 consistent gets

0 physical reads

0 redo size

2881 bytes sent via SQL*Net to client

590 bytes received via SQL*Net from client

8 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

99 rows processed

当使用hint强制使用nested join方式逻辑读为412,远远超过了hash join


SQL> select /*+ use_merge(t,t1) */ * from t,t1 where t1.id<100 and t.id=t1.id;


99 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 3760646495


-----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 98 | 1176 | 5 (20)| 00:00:01 |

| 1 | MERGE JOIN | | 98 | 1176 | 5 (20)| 00:00:01 |

|* 2 | INDEX RANGE SCAN | IDX_T | 99 | 495 | 2 (0)| 00:00:01 |

|* 3 | SORT JOIN | | 99 | 693 | 3 (34)| 00:00:01 |

|* 4 | TABLE ACCESS FULL| T1 | 99 | 693 | 2 (0)| 00:00:01 |

-----------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


2 - access("T"."ID"<100)

3 - access("T"."ID"="T1"."ID")

filter("T"."ID"="T1"."ID")

4 - filter("T1"."ID"<100)



Statistics

----------------------------------------------------------

1 recursive calls

0 db block gets

13 consistent gets

0 physical reads

0 redo size

2881 bytes sent via SQL*Net to client

590 bytes received via SQL*Net from client

8 SQL*Net roundtrips to/from client

1 sorts (memory)

0 sorts (disk)

99 rows processed


当使用hint强制使用Merge join方式逻辑读与hash join一致,但是多了一次内存排序,因此最优的还是hash join



继续下面实验:

也可以通过hint改变基数的方式走nested loop join

SQL> select /*+ cardinality(t1 1) */ * from t,t1 where t1.id<100 and t.id=t1.id;



----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 12 | 3 (0)| 00:00:01 |

| 1 | NESTED LOOPS | | 1 | 12 | 3 (0)| 00:00:01 |

|* 2 | TABLE ACCESS FULL| T1 | 1 | 7 | 2 (0)| 00:00:01 |

|* 3 | INDEX RANGE SCAN | IDX_T | 1 | 5 | 1 (0)| 00:00:01 |

----------------------------------------------------------------------------


从这里就可以看出在连接一个小表的时候CBO会选择nested loop join


SQL> delete t1 where id>10;


989 rows deleted.


SQL> commit;


Commit complete.


SQL> select count(*) from t1;


COUNT(*)

----------

10


SQL> select * from t,t1 where t.id=t1.id;


10 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 1557655968


----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 10 | 110 | 12 (0)| 00:00:01 |

| 1 | NESTED LOOPS | | 10 | 110 | 12 (0)| 00:00:01 |

| 2 | TABLE ACCESS FULL| T1 | 10 | 60 | 2 (0)| 00:00:01 |

|* 3 | INDEX RANGE SCAN | IDX_T | 1 | 5 | 1 (0)| 00:00:01 |

----------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


3 - access("T"."ID"="T1"."ID")



Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

14 consistent gets

0 physical reads

0 redo size

804 bytes sent via SQL*Net to client

524 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

10 rows processed


从上面看出当连接一张小表的时候,CBO计算出使用nested loops join方式

SQL> select /*+ use_hash(t,t1) */ * from t,t1 where t.id=t1.id;


10 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 1444793974


---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 10 | 110 | 35 (3)| 00:00:01 |

|* 1 | HASH JOIN | | 10 | 110 | 35 (3)| 00:00:01 |

| 2 | TABLE ACCESS FULL| T1 | 10 | 60 | 2 (0)| 00:00:01 |

| 3 | TABLE ACCESS FULL| T | 73961 | 361K| 32 (0)| 00:00:01 |

---------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


1 - access("T"."ID"="T1"."ID")



Statistics

----------------------------------------------------------

1 recursive calls

0 db block gets

122 consistent gets

0 physical reads

0 redo size

804 bytes sent via SQL*Net to client

524 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

10 rows processed



当我们强制使用hash join 可以看到逻辑读增加到了122,相差100多


SQL> select /*+ use_merge(t,t1) */ * from t,t1 where t.id=t1.id;


10 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 4143939902


-----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 10 | 110 | 169 (2)| 00:00:03 |

| 1 | MERGE JOIN | | 10 | 110 | 169 (2)| 00:00:03 |

| 2 | INDEX FULL SCAN | IDX_T | 73961 | 361K| 166 (1)| 00:00:02 |

|* 3 | SORT JOIN | | 10 | 60 | 3 (34)| 00:00:01 |

| 4 | TABLE ACCESS FULL| T1 | 10 | 60 | 2 (0)| 00:00:01 |

-----------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


3 - access("T"."ID"="T1"."ID")

filter("T"."ID"="T1"."ID")



Statistics

----------------------------------------------------------

1 recursive calls

0 db block gets

7 consistent gets

0 physical reads

0 redo size

804 bytes sent via SQL*Net to client

524 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

1 sorts (memory)

0 sorts (disk)

10 rows processed


使用merge join 逻辑读最低,但是多了一次内存排序,CBO最终还是选择了nested loops join


SQL> drop index idx_t ;


Index dropped.


SQL> exec dbms_stats.gather_table_stats('sys','t',cascade=>true);


PL/SQL procedure successfully completed.


SQL> select * from t,t1 where t.id=t1.id;


20 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 1444793974


---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 20 | 220 | 67 (3)| 00:00:01 |

|* 1 | HASH JOIN | | 20 | 220 | 67 (3)| 00:00:01 |

| 2 | TABLE ACCESS FULL| T1 | 10 | 60 | 2 (0)| 00:00:01 |

| 3 | TABLE ACCESS FULL| T | 147K| 722K| 64 (2)| 00:00:01 |

---------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


1 - access("T"."ID"="T1"."ID")



Statistics

----------------------------------------------------------

135 recursive calls

0 db block gets

250 consistent gets

0 physical reads

0 redo size

1097 bytes sent via SQL*Net to client

535 bytes received via SQL*Net from client

3 SQL*Net roundtrips to/from client

3 sorts (memory)

0 sorts (disk)

20 rows processed


如上当删除索引后cbo会选择hash join


总结:当多表连接的时候,如果是大表之间hash join速度最快,如果其中一个表相对来说比较小,并且关联字段上有索引的时候nested loops 速度比较快,上面实验也证实了如果CBO判断基数不准确会导致错误的执行计划。


你可能感兴趣的:(JOIN,JOIN,JOIN,oracle,优化,hash,merge,nested,loops)