[SQL Tuning] The difference between nested loop and hash join

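The former takes each row retrieved from the driving table and looks it up in the driven table; for this to be efficient the join column should be indexed, and the predicate on the driving table should be indexed as well.
The latter hashes the small table first, then hashes the large table, and matches rows in the small table's hash buckets.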

The relevant excerpts follow:
How the CBO Chooses the Join Method
The optimizer estimates the cost of each join method and chooses the method with the least cost. If a join returns many rows, then the optimizer considers the following three factors:
1. A nested loop join is inefficient when a join returns a large number of rows (typically, more than 10,000 rows is considered large), and the optimizer might choose not to use it. The cost of a nested loop join is calculated by the following formula:
cost= access cost of A + (access cost of B * number of rows from A)
2. If you are using the CBO, then a hash join is the most efficient join when a join returns a large number of rows. The cost of a hash join is calculated by the following formula:
cost= (access cost of A * number of hash partitions of B) + access cost of B
3. If you are using the RBO, then a merge join is the most efficient join when a join returns a large number of rows. The cost of a merge join is calculated by the following formula:
cost= access cost of A + access cost of B +(sort cost of A + sort cost of B)
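
As a rough illustration, the three formulas above can be written as plain functions and compared for one set of inputs. This is only a sketch in Python: the function names and every number are hypothetical, chosen to show the arithmetic, and do not reflect how Oracle actually derives access or sort costs.

```python
def nested_loop_cost(access_a, access_b, rows_a):
    # cost = access cost of A + (access cost of B * number of rows from A)
    return access_a + access_b * rows_a


def hash_join_cost(access_a, access_b, hash_partitions_b):
    # cost = (access cost of A * number of hash partitions of B) + access cost of B
    return access_a * hash_partitions_b + access_b


def merge_join_cost(access_a, access_b, sort_a, sort_b):
    # cost = access cost of A + access cost of B + (sort cost of A + sort cost of B)
    return access_a + access_b + (sort_a + sort_b)


# Hypothetical estimates for a join whose driving table A returns 20,000 rows.
estimates = {
    "nested loop": nested_loop_cost(access_a=100, access_b=3, rows_a=20_000),
    "hash": hash_join_cost(access_a=100, access_b=500, hash_partitions_b=1),
    "merge": merge_join_cost(access_a=100, access_b=500, sort_a=150, sort_b=700),
}

# Pick the method with the least estimated cost, as the optimizer is described as doing.
cheapest = min(estimates, key=estimates.get)
print(estimates, cheapest)
```

With these made-up inputs the per-row term of the nested loop formula dominates, which is the effect the first factor describes for joins that return many rows.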

Nested Loops

Nested Loops (NL) is the most common type of join. NL selects a row from one table, and then looks up matching rows in the second table using the join predicates specified in the WHERE clause. If three or more tables are specified in the FROM clause, NL - having selected a row from table 1 and matching rows from table 2 - can then pick up matching rows from the third and subsequent tables also using Nested Loops. As soon as all tables are joined using the first row from table 1, NL will then proceed to the second row and so on.
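
The row-by-row behaviour described above can be sketched in Python for three tables. This is a simplified model, not Oracle's implementation: the tables, column names, and the dictionary-based "indexes" are all hypothetical stand-ins.

```python
from collections import defaultdict


def build_index(rows, key):
    """Group rows by a key column, standing in for an index on that column."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def nested_loops(t1, t2_index, t3_index):
    """Join three tables by finishing all matches for one table 1 row before the next."""
    for r1 in t1:                                       # pick a row from table 1
        for r2 in t2_index.get(r1["dept_id"], []):      # matching rows in table 2
            for r3 in t3_index.get(r2["loc_id"], []):   # matching rows in table 3
                yield {**r1, **r2, **r3}


# Hypothetical sample data.
employees = [
    {"emp": "ann", "dept_id": 10},
    {"emp": "bob", "dept_id": 20},
    {"emp": "cy", "dept_id": 30},   # no matching department, so it drops out
]
departments = [
    {"dept_id": 10, "dept": "sales", "loc_id": 1},
    {"dept_id": 20, "dept": "ops", "loc_id": 2},
]
locations = [{"loc_id": 1, "city": "Oslo"}, {"loc_id": 2, "city": "Lima"}]

result = list(
    nested_loops(
        employees,
        build_index(departments, "dept_id"),
        build_index(locations, "loc_id"),
    )
)
for row in result:
    print(row["emp"], row["dept"], row["city"])
```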

When to use Nested Loops joins

* On-line applications where the first few rows of data need to be returned quickly for viewing. Provided a query has no DISTINCT, ORDER BY, GROUP BY, CONNECT BY, grouping functions, or analytic functions, Nested Loops can return rows as soon as they are joined, i.e. Oracle does not need to join every row before returning the first ones.
* The second and subsequent tables in the join can all be joined on low cardinality (few rows per key) indexes.
* The first (driving) table in the join has a selective, indexed WHERE condition that is not a join condition, e.g. WHERE t1.KEYCOL = 'VALUE'. Without an indexed selection on the driving table, Oracle will perform a Full Table Scan. NL may still be the best join method, but Sort-Merge and Hash should be considered.
* All tables to be joined by Nested Loops are medium-large (>500 rows). Small tables are usually best joined with a Hash join.
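
The first point in the list, returning the first few rows without joining everything, can be illustrated with a lazy generator. This is a sketch under assumed data: the `probes` list is a hypothetical instrument added only to count how many inner lookups were performed.

```python
from itertools import islice


def nested_loop_join(driving_rows, inner_index, probes):
    """Yield each joined row as soon as it is found."""
    for outer in driving_rows:
        probes.append(outer["id"])                 # record one inner lookup
        for inner in inner_index.get(outer["id"], []):
            yield (outer["id"], inner)


# Hypothetical data: 1000 driving rows, each with one match in the inner "index".
driving = [{"id": i} for i in range(1000)]
inner_index = {i: [f"detail-{i}"] for i in range(1000)}

probes = []
# Ask for the first three joined rows only, as an on-line screen might.
first_three = list(islice(nested_loop_join(driving, inner_index, probes), 3))

print(first_three)
print(len(probes), "inner lookups performed")
```

Only the driving rows needed to produce those three results are processed; the remaining rows are never touched unless the caller asks for more.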

Nested Loops is Oracle's fall-back position for joins. Other join types can be very efficient in special circumstances, but they all have special conditions that must be met. If these conditions are not met, Oracle can always use a Nested Loops join, even if it is chronically inefficient.

Hash

Hash joins were introduced in V7 of the RDBMS, and became stable and reliable in 7.3. Some found that the Cost Based Optimizer was so keen on hash joins that it used them even when it was inappropriate. This behaviour has improved markedly in V8, but some DBAs may still have Hash joins disabled. If this is the case on your database, you can still request a hash join using the USE_HASH hint, or allow hash joins for the entire session with ALTER SESSION SET HASH_JOIN_ENABLED=TRUE. The algorithm for performing a hash join is particularly clever, and worth reading in the Oracle online documentation (Concepts manual). Like Sort-Merge joins, Hash joins are useful for large joins, but only where the join condition is an equi-join.

When to use Hash joins

* At least one side of the join is returning many rows
* Low cardinality indexes are not available on the join keys.
* The join predicates use only equals (=) conditions.
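
A minimal in-memory sketch of the build-and-probe idea may help show why the equals condition matters. It assumes both tables fit in memory and uses a Python dictionary as the hash table; the table and column names are hypothetical, and Oracle's real algorithm (partitioning, spilling to disk) is considerably more involved.

```python
from collections import defaultdict


def hash_join(small, large, key):
    """Equi-join two row lists: build buckets on the small one, probe with the large one."""
    buckets = defaultdict(list)
    for row in small:                       # build phase: hash the small table
        buckets[row[key]].append(row)

    joined = []
    for row in large:                       # probe phase: hash each large-table row
        for match in buckets.get(row[key], []):
            joined.append({**match, **row})
    return joined


# Hypothetical sample data.
departments = [{"dept_id": 10, "dept": "sales"}, {"dept_id": 20, "dept": "ops"}]
employees = [
    {"emp": "ann", "dept_id": 10},
    {"emp": "bob", "dept_id": 20},
    {"emp": "cy", "dept_id": 10},
    {"emp": "dee", "dept_id": 99},          # no matching bucket, so no output row
]

rows = hash_join(departments, employees, "dept_id")
for r in rows:
    print(r["emp"], r["dept"])
```

A bucket lookup can only answer "which rows have exactly this key", so a range predicate such as `<` or `BETWEEN` cannot be served this way, and no index on the join key is needed.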

Source: ITPUB blog, link: http://blog.itpub.net/593324/viewspace-376158/. Please credit the source when reposting; otherwise legal action may be pursued.
