跳跃式索引(Skip Scan Index)的浅析[转]

跳跃式索引(Skip Scan Index)的浅析

[English]

作者: fuyuncat

来源: www.HelloDBA.com

  

Oracle9i中,有一个新的特性:跳跃式索引(Skip Scan Index)。当表有一个复合索引,而在查询中有除了索引中第一列的其他列作为条件,并且优化器模式为CBO,这时候查询计划就有可能使用到SS。此外,还可以通过使用提示index_ssCBO下)来强制使用SS

举例:

SQL> create table test1 (a number, b char(10), c varchar2(10));

Table created.

SQL> create index test_idx1 on test1(a, b);

Index created.

SQL> set autotrace on

SQL> select /*+index_ss(test1 test_idx1)*/* from test1 a

2 where b ='a';

no rows selected

Execution Plan

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=32)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST1' (Cost=2 Card=1 Bytes=32)

2 1 INDEX (SKIP SCAN) OF 'TEST_IDX1' (NON-UNIQUE)

    
但并不是任何情况下都会使用到SS。在Oracle的官方文档中,除了提到需要CBO,并且对表进行过分析外,还需要保证第一列的distinct value非常小。这一段是从官方文档上摘取的关于SS的一段解释:

 

Index skip scans improve index scans by nonprefix columns since it is often faster to scan index blocks than scanning table data blocks.

 

In this case a composite index is split logically into smaller subindexes. The number of logical subindexes depends on the cardinality of the initial column. Hence it is now possible to use the index even if the leading column is not used in a where clause.

 

Oracle并没有公布过关于SS更多的内部技术细节。但注意上面的这句话:In this case a composite index is split logically into smaller subindexes. The number of logical subindexes depends on the cardinality of the initial column.Oralce会对复合索引进行逻辑划分,分成多个子索引。可以这样理解,Oracle将索引从逻辑上划分为a.num_distinct个子索引,每次对一个子索引进行扫描。因此SS的索引扫描成本为a.num_distinct.

下面做一些试验,看看在什么情况下Oracle采用SS.

首先要保证使用SS的几个必要条件:

·        OptimizerCBO

·        相关表要有正确的统计数据

·        Oracle DB版本为9i以上

下面就是一个使用到SS的特殊条件:第一列的distinct num要足够小。小到什么程度呢?

还是以上面的表为例(省略中间的麻烦步骤,取两个临界值做实验):

取第一列distinct number37

SQL> truncate table test1;

 

Table truncated.

 

SQL> begin

  2  for i in 1..100000 loop

  3  insert into test1 values (mod(i,37), to_char(i), to_char(i));

  4  end loop;p;

  5  commit;

  6  end;

  7  /

 

PL/SQL procedure successfully completed.

 

SQL> analyze table test1 compute statistics;

 

Table analyzed.

 

SQL> set autotrace on explain

SQL> select * from test1

  2  where b = '500';

 

         A B          C

---------- ---------- ----------

        19 500        500

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=37 Card=1 Bytes=17)

   1    0   TABLE ACCESS (FULL) OF 'TEST1' (Cost=37 Card=1 Bytes=17)

 

再取第一列distinct number36

SQL> truncate table test1;

 

Table truncated.

 

SQL> begin

  2  for i in 1..100000 loop

  3  insert into test1 values (mod(i,36), to_char(i), to_char(i));

  4  end loop;

  5  commit;

  6  end;

  7  /

 

PL/SQL procedure successfully completed.

 

SQL> analyze table test1 compute statistics;

 

Table analyzed.

 

SQL> select * from test1 where b = '500';

 

         A B          C

---------- ---------- ----------

        32 500        500

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=1 Bytes=17)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST1' (Cost=12 Card=1 B

          ytes=17)

 

   2    1     INDEX (SKIP SCAN) OF 'TEST_IDX1' (NON-UNIQUE) (Cost=37 C

          ard=1)

 

从上面试验结果看,FTScost37。当第一列distinct number小于这个值时,Oracle选择了SS

 

继续试验:

SQL> select count(*) from test1

  2  where b <= '1';

 

  COUNT(*)

----------

         1

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=1 Bytes=10)

   1    0   SORT (AGGREGATE)

   2    1     INDEX (SKIP SCAN) OF 'TEST_IDX1' (NON-UNIQUE) (Cost=37 C

          ard=1 Bytes=10)

 

注意:在b’10’是比’1’大的最小值(char(10)类型)

SQL> select count(*) from test1

  2  where b <= '10';

 

  COUNT(*)

----------

         2

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=37 Card=1 Bytes=10)

   1    0   SORT (AGGREGATE)

   2    1     TABLE ACCESS (FULL) OF 'TEST1' (Cost=37 Card=773 Bytes=7

          730)

观察结果,这时候影响的因素是cardinality了。第二个查询计划中的cardinality值(773)正是b<=’10’cardinality值:

SQL> set autotrace off

SQL> select 100000*(to_number('31302020202020202020', 'xxxxxxxxxxxxxxxxxxxx')-to

_number('31202020202020202020', 'xxxxxxxxxxxxxxxxxxxx'))/(to_number('39393939392

020202020', 'xxxxxxxxxxxxxxxxxxxx')-to_number('31202020202020202020', 'xxxxxxxxx

xxxxxxxxxxx'))+1 from dual;

 

100000*(TO_NUMBER('31302020202020202020','XXXXXXXXXXXXXXXXXXXX')-TO_NUMBER('3120

 

--------------------------------------------------------------------------------

 

                                                                      772.791768

 

再看一个含有第一列条件的等效的语句:

SQL> set autotrace on explain

SQL> select count(*) from test1

  2  where a>=0

  3  and b <='1';

 

  COUNT(*)

----------

         1

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=1 Bytes=12)

   1    0   SORT (AGGREGATE)

   2    1     INDEX (SKIP SCAN) OF 'TEST_IDX1' (NON-UNIQUE) (Cost=37 C

          ard=1 Bytes=12)

 

再做几个有趣的试验,下面的试验条件是不满足SS的,但是请注意查询返回列队查询计划的影响:

SQL> truncate table test1;

 

Table truncated.

 

SQL> begin

  2  for i in 1..100000 loop

  3  insert into test1 values (i, to_char(i), to_char(i));

  4  end loop;

  5  commit;

  6  end;

  7  /

 

PL/SQL procedure successfully completed.

 

SQL> analyze table test1 compute statistics;

 

Table analyzed.

 

SQL> select * from test1

  2  where b = '500';

 

         A B          C

---------- ---------- ----------

       500 500        500

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=37 Card=1 Bytes=19)

   1    0   TABLE ACCESS (FULL) OF 'TEST1' (Cost=37 Card=1 Bytes=19)

 

改变返回列:

SQL> select count(*) from test1

  2  where b = '500';

 

  COUNT(*)

----------

         1

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=1 Bytes=10)

   1    0   SORT (AGGREGATE)

   2    1     INDEX (FAST FULL SCAN) OF 'TEST_IDX1' (NON-UNIQUE) (Cost

          =34 Card=1 Bytes=10)

 

再改变一种:

SQL> select a from test1

  2  where b = '500';

 

         A

----------

       500

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=1 Bytes=14)

   1    0   INDEX (FAST FULL SCAN) OF 'TEST_IDX1' (NON-UNIQUE) (Cost=3

          4 Card=1 Bytes=14)

 

使用RBO呢?

SQL> select /*+rule*/a from test1

  2  where b = '500';

 

         A

----------

       500

 

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=HINT: RULE

   1    0   TABLE ACCESS (FULL) OF 'TEST1'

 

值得一提的是,上述任何一个例子在8i中执行的话,都不会使用到索引(无论是否符合SS的条件)。

你可能感兴趣的:(Skip Scan Index,跳跃式扫描)