表的连接原理
表的连接方式:
1、循环嵌套连接(nested loop join)
2、哈西连接(hash join)
3、排序合并连接(merge join)
一、循环嵌套连接
A B 循环嵌套连接
方法:
1、扫描其中的一个小表,每读到一条记录,就根据这条记录(连接字段)的值到另外一个表中去查找。
2、另外一个表一般是大表,大表的连接字段上有索引,而且索引用的比较好。
3、扫描小表每读到一个连接字段的指,就到大表上走索引去扫描,以此循环小表,最后的结果合集就是要求查询的结果。
小表――又叫做驱动表,或者外表
大表――又叫做被驱动表,或者内表
CBO法则下,oracle会自动去选择驱动表,被驱动表的连接字段要求有索引而且条件要好,如果驱动表过大,也不适合这种方法。
SQL> conn plsql/plsql
Connected.
SQL> set autotrace trace exp
SQL> set linesize 1000
SQL> select h.hrc_descr,o.org_short_name from org_tab o,hrc_tab h where o.hrc_code=h.hrc_code order by 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 566430324
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 306 | 8 (25)| 00:00:01 |
| 1 | SORT ORDER BY | | 9 | 306 | 8 (25)| 00:00:01 |
|* 2 | HASH JOIN | | 9 | 306 | 7 (15)| 00:00:01 |
| 3 | TABLE ACCESS FULL| ORG_TAB | 9 | 198 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| HRC_TAB | 9 | 108 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."HRC_CODE"="H"."HRC_CODE")
强制循环嵌套连接
SQL> select /*+ use_nl(o h) */ h.hrc_descr,o.org_short_name from org_tab o,hrc_tab h where o.hrc_code=h.hrc_code order by 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 3315346234
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 306 | 13 (8)| 00:00:01 |
| 1 | SORT ORDER BY | | 9 | 306 | 13 (8)| 00:00:01 |
| 2 | NESTED LOOPS | | 9 | 306 | 12 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | ORG_TAB | 9 | 198 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| HRC_TAB | 1 | 12 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_HRC_TAB | 1 | | 0 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("O"."HRC_CODE"="H"."HRC_CODE")
将HRC_TAB的数据加大
SQL> conn plsql/plsql
Connected.
SQL> desc hrc_tab
Name Null? Type
----------------------------------------- -------- ----------------------------
HRC_CODE NOT NULL NUMBER(4)
HRC_DESCR NOT NULL VARCHAR2(20)
SQL> alter table hrc_tab modify hrc_code number(10);
Table altered.
SQL> begin
2 for i in 1..100000 loop
3 insert into hrc_tab values(hrc_org_seq.nextval,'111');
4 end loop;
5 end;
6 /
PL/SQL procedure successfully completed.
SQL> commit;
Commit complete.
SQL> select h.hrc_descr,o.org_short_name from org_tab o,hrc_tab h where o.hrc_code=h.hrc_code order by 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 566430324
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 306 | 8 (25)| 00:00:01 |
| 1 | SORT ORDER BY | | 9 | 306 | 8 (25)| 00:00:01 |
|* 2 | HASH JOIN | | 9 | 306 | 7 (15)| 00:00:01 | --为什么还是hash join
| 3 | TABLE ACCESS FULL| ORG_TAB | 9 | 198 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| HRC_TAB | 9 | 108 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."HRC_CODE"="H"."HRC_CODE")
因为性能数据还是旧的。
分析:
SQL> exec dbms_stats.gather_table_stats(user,'hrc_tab',cascade=>true,estimate_percent=>100);
PL/SQL procedure successfully completed.
索引的结构没有问题,因为插入数据是一条条插入的。
SQL> analyze index PK_HRC_TAB validate structure;
Index analyzed.
SQL> set autotrace off
SQL> select btree_space,height,pct_used,(del_lf_rows/decode(lf_rows,0,1,lf_rows))*100||'%' as delete_pct from index_stats;
BTREE_SPACE HEIGHT PCT_USED DELETE_PCT
----------- ---------- ---------- -----------------------------------------
1503280 2 100 0%
SQL> set autotrace trace exp
SQL> exec dbms_stats.gather_table_stats(user,'org_tab',cascade=>true,estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> select h.hrc_descr,o.org_short_name from org_tab o,hrc_tab h where o.hrc_code=h.hrc_code order by 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 3315346234
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 270 | 13 (8)| 00:00:01 |
| 1 | SORT ORDER BY | | 9 | 270 | 13 (8)| 00:00:01 |
| 2 | NESTED LOOPS | | 9 | 270 | 12 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | ORG_TAB | 9 | 198 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| HRC_TAB | 1 | 8 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_HRC_TAB | 1 | | 0 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("O"."HRC_CODE"="H"."HRC_CODE")
SQL> select /*+ use_hash(o h) */ h.hrc_descr,o.org_short_name from org_tab o,hrc_tab h where o.hrc_code=h.hrc_code order by 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 566430324
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 270 | 64 (10)| 00:00:01 |
| 1 | SORT ORDER BY | | 9 | 270 | 64 (10)| 00:00:01 |
|* 2 | HASH JOIN | | 9 | 270 | 63 (8)| 00:00:01 |
| 3 | TABLE ACCESS FULL| ORG_TAB | 9 | 198 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| HRC_TAB | 100K| 781K| 58 (6)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."HRC_CODE"="H"."HRC_CODE")
此时,hash join代价比nested loop高。
二、哈西连接方式
hash算法――数据结构中讲述的一种算法,需要有一个哈西函数,用哈西函数将集合的值哈西化以后,集合的重复值将会变多,叫做簇值,根据簇值来分簇。
x和y关联
mod(x,10) --> 10个簇
mod(y,10) --> 10个簇
簇间匹配,再簇内匹配。
假设两个数据集合:
S={1,1,1,3,3,4,4,4,4,5,8,8,8,8,10}
B={0,0,1,1,2,2,3,8,9,9,9,10,11}
hash函数还是mod(x,10),将每个子分区用hash算法分出来,匹配后再进行原值的匹配:
分区 B0 B1 B2 B3 B4 B5 B6 B7 B8 B9
值 {0,0,10} {1,1,11} {2,2} {3} null null null null {8} {9,9,9}
--------------------------------------------------------
S0{10} Y
S1{1,1,1} Y
S2 NULL N
S3{3,3} Y
S4{4,4,4,4} N
S5{5} N
S6 NULL N
S7 NULL N
S8{8,8,8,8} Y
S9 NULL N
-------------------------------------------------------------------------------------------------
只有4个簇可以匹配
hash join的过程:
1、存放数据到hash区域――PGA区域
将小表(B)的数据,经过hash算法得到哈西簇(bucket),将原值放到相应的哈西簇中。
2、创建位图向量
将B表的连接字段集合读取到hash簇,oracle记录连接键的唯一值(原值),构建位图向量,这里的位图向量{0,1,2,3,8,9,10,11}
3、超出内存大小的部分(私有PGA区域耗完)会被移动到磁盘的临时表空间,任何需要写入到磁盘的操作都会生成IO的损耗,会影响性能, 所以最理想的情况是在PGA中完成HASH的操作。
4、对散列簇进行排序
为了能充分的利用内存,尽量存储更多的散列簇,oracle按照各个散列簇的大小在内存和磁盘中排序。
5、读取大表数据进行hash运算簇间匹配。如果一致,再进行原值的匹配返回结果。如果没有匹配上,就将S中没有匹配到的写到新的散列簇中。算法是一样的,也就是说在S中,S2,6,7,9被过滤了,同样把B中的位图向量进行过滤,这叫做位图向量的过滤。
hash区域:
SQL> show parameter hash
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
hash_area_size integer 131072
因为PGA是自动管理方式,所以这个大小不需要手工调整的。
使用hash join的场合:
1、确认小表存在,驱动表,因为小表是优先在hash区域构建hash簇的表
2、确认涉及到的表和表的连接键都被分析过了。
3、确定用hash join方式了,不需要在连接键上创建B树索引。
4、如果SQL语句中,有where的过滤条件,此时这个字段不是连接字段,还是需要考虑创建B树索引的。如果过滤条件是连接字段,也要考虑创建B树索引,可以通过索引的搜索将参与运算结果集变小。
5、如果连接键值数值分布不均匀还是要做直方图的。
6、PGA不足的话,要调整PGA
7、hash join适合大表和超大表的连接,返回大型的结果集合。
SQL> conn scott/scott
Connected.
SQL> create table tt as select * from all_objects;
Table created.
SQL> create table tt1 as select * from all_objects;
Table created.
SQL> exec dbms_stats.gather_table_stats(user,'tt',estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user,'tt1',estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> set autotrace trace exp
SQL> set linesize 1000
SQL> select * from tt,tt1 where tt.object_id=tt1.object_id;
Execution Plan
----------------------------------------------------------
Plan hash value: 2918007165
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40909 | 7590K| | 686 (2)| 00:00:09 |
|* 1 | HASH JOIN | | 40909 | 7590K| 4280K| 686 (2)| 00:00:09 |
| 2 | TABLE ACCESS FULL| TT1 | 40910 | 3795K| | 134 (3)| 00:00:02 |
| 3 | TABLE ACCESS FULL| TT | 40909 | 3795K| | 134 (3)| 00:00:02 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TT"."OBJECT_ID"="TT1"."OBJECT_ID")
SQL> create index ind_tt on tt(object_id);
Index created.
SQL> create index ind_tt1 on tt1(object_id);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'tt1',estimate_percent=>100,cascade=>true);
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user,'tt',estimate_percent=>100,cascade=>true);
PL/SQL procedure successfully completed.
SQL> select * from tt,tt1 where tt.object_id=tt1.object_id;
Execution Plan
----------------------------------------------------------
Plan hash value: 2918007165
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40909 | 7590K| | 686 (2)| 00:00:09 |
|* 1 | HASH JOIN | | 40909 | 7590K| 4280K| 686 (2)| 00:00:09 |
| 2 | TABLE ACCESS FULL| TT1 | 40910 | 3795K| | 134 (3)| 00:00:02 |
| 3 | TABLE ACCESS FULL| TT | 40909 | 3795K| | 134 (3)| 00:00:02 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TT"."OBJECT_ID"="TT1"."OBJECT_ID")
此时加一个非连接字段的过滤条件:
SQL> select * from tt,tt1 where tt.object_id=tt1.object_id and tt.object_type='TYPE';
Execution Plan
----------------------------------------------------------
Plan hash value: 3719458023
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1705 | 316K| 268 (3)| 00:00:04 |
|* 1 | HASH JOIN | | 1705 | 316K| 268 (3)| 00:00:04 |
|* 2 | TABLE ACCESS FULL| TT | 1705 | 158K| 133 (2)| 00:00:02 |
| 3 | TABLE ACCESS FULL| TT1 | 40910 | 3795K| 134 (3)| 00:00:02 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TT"."OBJECT_ID"="TT1"."OBJECT_ID")
2 - filter("TT"."OBJECT_TYPE"='TYPE')
SQL> create index ind_tt2 on tt(object_type);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'tt',estimate_percent=>100,cascade=>true);
PL/SQL procedure successfully completed.
SQL> select * from tt,tt1 where tt.object_id=tt1.object_id and tt.object_type='TYPE';
Execution Plan
----------------------------------------------------------
Plan hash value: 712899355
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1249 | 231K| 179 (3)| 00:00:03 |
|* 1 | HASH JOIN | | 1249 | 231K| 179 (3)| 00:00:03 |
| 2 | TABLE ACCESS BY INDEX ROWID| TT | 1249 | 115K| 44 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_TT2 | 1249 | | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | TT1 | 40910 | 3795K| 134 (3)| 00:00:02 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TT"."OBJECT_ID"="TT1"."OBJECT_ID")
3 - access("TT"."OBJECT_TYPE"='TYPE')
此时一个过滤条件落在了连接键上。
SQL> select * from tt,tt1 where tt.object_id=tt1.object_id and tt.object_id=259;
Execution Plan
----------------------------------------------------------
Plan hash value: 2300561305
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 190 | 4 (0)| 00:00:01 |
| 1 | MERGE JOIN CARTESIAN | | 1 | 190 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID | TT | 1 | 95 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_TT | 1 | | 1 (0)| 00:00:01 |
| 4 | BUFFER SORT | | 1 | 95 | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| TT1 | 1 | 95 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IND_TT1 | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("TT"."OBJECT_ID"=259)
6 - access("TT1"."OBJECT_ID"=259)
SQL> drop index ind_tt1;
Index dropped.
SQL> drop index ind_tt;
Index dropped.
SQL> exec dbms_stats.gather_table_stats(user,'tt',estimate_percent=>100,cascade=>true);
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user,'tt1',estimate_percent=>100,cascade=>true);
PL/SQL procedure successfully completed.
SQL> select * from tt,tt1 where tt.object_id=tt1.object_id and tt.object_id=259;
Execution Plan
----------------------------------------------------------
Plan hash value: 3719458023
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 190 | 267 (2)| 00:00:04 |
|* 1 | HASH JOIN | | 1 | 190 | 267 (2)| 00:00:04 |
|* 2 | TABLE ACCESS FULL| TT | 1 | 95 | 133 (2)| 00:00:02 |
|* 3 | TABLE ACCESS FULL| TT1 | 1 | 95 | 133 (2)| 00:00:02 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TT"."OBJECT_ID"="TT1"."OBJECT_ID")
2 - filter("TT"."OBJECT_ID"=259)
3 - filter("TT1"."OBJECT_ID"=259)
三、sort merge join排序合并连接
连接过程:
1、对连接的每个表作全表扫描或者索引全扫――――代价很大
2、对其中一个表或者两个全表扫的结果排序(如果其中的一个表是全索引扫描的方式是不需要排序的,因为索引本身是排序的)――代价最大
3、连接的时候将两个结果集合合并 ――代价不大
SQL> select /*+ use_merge(o h) */ h.hrc_descr,o.org_short_name from org_tab o,hrc_tab h where o.hrc_code=h.hrc_code order by 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 3620910538
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 270 | 394 (3)| 00:00:05 |
| 1 | SORT ORDER BY | | 9 | 270 | 394 (3)| 00:00:05 |
| 2 | MERGE JOIN | | 9 | 270 | 393 (2)| 00:00:05 |
| 3 | TABLE ACCESS BY INDEX ROWID| HRC_TAB | 100K| 781K| 389 (2)| 00:00:05 |
| 4 | INDEX FULL SCAN | PK_HRC_TAB | 100K| | 191 (2)| 00:00:03 |
|* 5 | SORT JOIN | | 9 | 198 | 4 (25)| 00:00:01 |
| 6 | TABLE ACCESS FULL | ORG_TAB | 9 | 198 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("O"."HRC_CODE"="H"."HRC_CODE")
filter("O"."HRC_CODE"="H"."HRC_CODE")
SQL> select h.hrc_descr,o.org_short_name from org_tab o,hrc_tab h where o.hrc_code=h.hrc_code order by 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 3315346234
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 270 | 13 (8)| 00:00:01 |
| 1 | SORT ORDER BY | | 9 | 270 | 13 (8)| 00:00:01 |
| 2 | NESTED LOOPS | | 9 | 270 | 12 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | ORG_TAB | 9 | 198 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| HRC_TAB | 1 | 8 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_HRC_TAB | 1 | | 0 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("O"."HRC_CODE"="H"."HRC_CODE")
merge join使用场合:
1、9i开始,因为其排序和扫描的成本太高,大多被hash join取代。
2、如果源表(或者索引)在存储的时候已经排过序了,或者两个表的连接列都有索引且都是索引全扫的过程,此时性能会高于hash join、循环嵌套,出现的几率不大。
----------------------------------------------------------------
Hint――hint是oracle中的SQL语法,允许用户在SQL语句中插入提示(hint),影响SQL的执行方式。
误区:
1、/*+ index(table_name index_name) */认为这种做法一定能加快执行。
2、hint只要语法没错,一定会生效。
SQL> create table t as select rownum id,object_name from all_objects where rownum<100;
Table created.
SQL> create index t_ind on t(id);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true,estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> set autotrace trace exp
SQL> set linesize 1000
SQL> select * from t where id=100;
Execution Plan
----------------------------------------------------------
Plan hash value: 1376202287
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 18 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T_IND | 1 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=100)
SQL> select count(*) from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T | 99 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------
SQL> select /*+ index(t t_ind)*/ count(*) from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T | 99 | 3 (0)| 00:00:01 | --为什么hint没有生效?
-------------------------------------------------------------------
被忽略的原因:count是聚合函数,聚合函数对于null值是不起作用的。如果强制走索引的话,索引中是不索引NULL值的,NULL肯定不会被计算进去的,需求是计算所有不为null的行,算出的结果中,如果ID为null,name也不为空,此时走索引,忽略掉这种情况。强制走索引回导致错误的结果。所以被忽略。
SQL> select count(comm),count(1) from emp;
COUNT(COMM) COUNT(1)
----------- ----------
4 14
SQL> select /*+ index(t t_ind)*/ count(id) from t; --需求就是不计算id为null的值
Execution Plan
----------------------------------------------------------
Plan hash value: 646498162
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
| 2 | INDEX FULL SCAN| T_IND | 99 | 297 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------
如果此时索引是unique索引,会不会被忽略?
unique索引含义――不索引null值,要求不为null的键值是不重复的。
SQL> drop index t_ind;
Index dropped.
SQL> create unique index t_ind on t(id);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true,estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> select /*+ index(t t_ind)*/ count(*) from t; --走索引计算的是不为null的值,但是需求是所有行只要有一个字段不为null就要计算的。
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T | 99 | 3 (0)| 00:00:01 | --忽略
-------------------------------------------------------------------
如果此时的索引是主键的话,会不会被忽略?
SQL> drop index t_ind;
Index dropped.
SQL> alter table t add constraint t_pk primary key(id);
Table altered.
SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true,estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> select /*+ index(t t_pk)*/ count(*) from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 56794325
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FULL SCAN| T_PK | 99 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------
--没有被忽略,主键不能为NULL的,需求所有行只要有一个字段不为null就要计算的,主键不为NULL,此时计算主键的个数和计算行数是一样的。不会被忽略。
SQL> select count(*) from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 56794325
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FULL SCAN| T_PK | 99 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------
SQL> select /*+ full(t) */ count(*) from t; --全表扫的代价高
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T | 99 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------
第二种情况:如果表使用了别名,提示中也要用别名,否则会被忽略。
SQL> select /*+ full(t) */ count(*) from t a;
Execution Plan
----------------------------------------------------------
Plan hash value: 56794325
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FULL SCAN| T_PK | 99 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------
SQL> select /*+ full(a) */ count(*) from t a;
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T | 99 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------
总结:
1、如果所强制走索引可能导致错误的查询结果的时候,hint会被忽略。
2、如果使用了别名,hint中没有用,也会被忽略。