Oracle 优化——位图、函数等索引介绍

一、位图索引

我将使用一个例子,来描述位图索引的存储,并分析它的优点。

Table :Loans 放贷信息

 

ID userId 行业投向 币种 证件类型 还本付息方式 状态
1 1 农业 人民币 身份证 等额本息还款法 已上报
2 2 农业 人民币 身份证 等本还款法 未上报
3 1 工业 人民币 护照 按季计息到期还本法 已上报
4 2 个体 人民币 身份证 等本还款法 已上报
5 5 其他 人民币 身份证 按月计息到期还本法 未上报

 

 

 

 

我对行业投向,和还本付息方式添加了位图索引

create bitmap index index_投向 on loans(行业投向);
create bitmap index index_还本付息方式 on loans(还本付息方式);

那么它会这么对位图索引进行存储:当前列的每一种值,存放在一个块中,通过0和1来标示改rownumber是否存在改值。

行业投向位图索引/还本付息方式  

值/行 第一行 第二行 第三行 第四行
农业  1  1  0  0
工业  0  0  1  0
个体  0  0  0  1
其他  0  0  0  0
值/行 第一行 第二行 第三行 第四行
等额本息还款法  1  0  0  0
等本还款法  0  1  0  0
按季计息到期还本法  0  0  1  0
按月计息到期还本法  0  0  0 1

有图可以看出, 农业、工业、个体都各以一个块来存放 所有列“自己是否为真”。 

所以暂时可以得出:

1、位图索引,必须创建在“仅仅几种值的情况”。 

  如果在低重复度的列上创建位图索引是很恐怖的,他将创建N多个块来存储。不论创建,还是查询,都是不聪明的。

 

2、位图索引,不适合放在常修改的字段列(如状态列)容易发生死锁。   

  位图索引死锁情况举例

--SESSION 1(持有者)

DELETE FROM LOANS WHERE 行业投向='农业' AND status=1;



---SESSION 2(其他会话) 插入带'农业'的记录就立即被阻挡,以下三条语句都会被阻止

insert into loans(Id,投向.....) values (1,'农业',....);

update t set 投向='工业' WHERE id=25;

delete from loans WHERE 行业投向='农业';



--以下是可以进行不受阻碍的

insert into loans(Id,投向.....) values (1,'工业',....);

delete from t where gender='工业' ;

UPDATE T SET status='aa' WHERE ROWID NOT IN ( SELECT ROWID FROM T WHERE 投向='工业' ) ; 



--update只要不更新位图索引所在的列即可

 

3、索引通过 比特位 存储01,来标示真假,占用内存很小,检索效率极高。 

  count(*) where 行业投向 = 农业,效率是很高的, 

当采集平台完成这些金融数据采集后,金融监管部门要对信息进行分析、统计,形成报表。有位图索引效率是很好的。

具体案例

  1 /*

  2 总结:本质原因:其实就是位图索引存放的是0,1的比特位,占字节数特别少。

  3 */

  4 

  5 --位图索引跟踪前准备

  6 drop table t purge;

  7 set autotrace off

  8 create table t as select * from dba_objects;

  9 insert into t select * from t;

 10 insert into t select * from t;

 11 insert into t select * from t;

 12 insert into t select * from t;

 13 insert into t select * from t;

 14 insert into t select * from t;

 15 update t set object_id=rownum;

 16 commit;

 17 

 18 --观察COUNT(*)全表扫描的代价

 19 set autotrace on

 20 set linesize 1000

 21 select count(*) from t;

 22 

 23 

 24 

 25 ------------------------------------------

 26   COUNT(*)

 27 ----------

 28   4684992

 29 执行计划

 30 ----------------------------------------------------------

 31 Plan hash value: 2966233522

 32 

 33 -------------------------------------------------------------------

 34 | Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |

 35 -------------------------------------------------------------------

 36 |   0 | SELECT STATEMENT   |      |     1 | 20420  (11)| 00:04:06 |

 37 |   1 |  SORT AGGREGATE    |      |     1 |            |          |

 38 |   2 |   TABLE ACCESS FULL| T    |   294M| 20420  (11)| 00:04:06 |

 39 -------------------------------------------------------------------

 40 统计信息

 41 ----------------------------------------------------------

 42           0  recursive calls

 43           0  db block gets

 44       66731  consistent gets

 45           0  physical reads

 46           0  redo size

 47         426  bytes sent via SQL*Net to client

 48         415  bytes received via SQL*Net from client

 49           2  SQL*Net roundtrips to/from client

 50           0  sorts (memory)

 51           0  sorts (disk)

 52           1  rows processed

 53           

 54           

 55 

 56 

 57 

 58 --观察COUNT(*)用普通索引的代价

 59 create index idx_t_obj on t(object_id);

 60 alter table T modify object_id not null;

 61 set autotrace on

 62 select count(*) from t;

 63 

 64 

 65 

 66 

 67 

 68   COUNT(*)

 69 ----------

 70   4684992

 71 普通索引的执行计划

 72 ---------------------------------------------------------------------------

 73 | Id  | Operation             | Name      | Rows  | Cost (%CPU)| Time     |

 74 ---------------------------------------------------------------------------

 75 |   0 | SELECT STATEMENT      |           |     1 |  3047   (2)| 00:00:37 |

 76 |   1 |  SORT AGGREGATE       |           |     1 |            |          |

 77 |   2 |   INDEX FAST FULL SCAN| IDX_T_OBJ |  4620K|  3047   (2)| 00:00:37 |

 78 ---------------------------------------------------------------------------

 79 普通索引的统计信息

 80 ----------------------------------------------------------

 81           0  recursive calls

 82           0  db block gets

 83       10998  consistent gets

 84           0  physical reads

 85           0  redo size

 86         426  bytes sent via SQL*Net to client

 87         415  bytes received via SQL*Net from client

 88           2  SQL*Net roundtrips to/from client

 89           0  sorts (memory)

 90           0  sorts (disk)

 91           1  rows processed

 92 

 93 

 94 

 95 

 96 --观察COUNT(*)用位图索引的代价(注意,这里我们特意取了status这个重复度很高的列做索引)

 97 create bitmap index idx_bitm_t_status on t(status);

 98 select count(*) from t;

 99 

100 SQL> select count(*) from t;

101 

102 

103 

104 

105 

106 

107   COUNT(*)

108 ----------

109   4684992

110 

111 位图索引的执行计划

112 -------------------------------------------------------------------------------------------

113 | Id  | Operation                     | Name              | Rows  | Cost (%CPU)| Time     |

114 -------------------------------------------------------------------------------------------

115 |   0 | SELECT STATEMENT              |                   |     1 |   115   (0)| 00:00:02 |

116 |   1 |  SORT AGGREGATE               |                   |     1 |            |          |

117 |   2 |   BITMAP CONVERSION COUNT     |                   |  4620K|   115   (0)| 00:00:02 |

118 |   3 |    BITMAP INDEX FAST FULL SCAN| IDX_BITM_T_STATUS |       |            |          |

119 -------------------------------------------------------------------------------------------

120 位图索引的统计信息

121 ----------------------------------------------------------

122           0  recursive calls

123           0  db block gets

124         125  consistent gets

125           0  physical reads

126           0  redo size

127         426  bytes sent via SQL*Net to client

128         415  bytes received via SQL*Net from client

129           2  SQL*Net roundtrips to/from client

130           0  sorts (memory)

131           0  sorts (disk)

132           1  rows processed

133          

134  
位图索引与普通索引比较以及执行计划

 

 

二、反向索引

 假如 我现在有些ID  100001,100002,100003,100004,100005 ,那么反向索引 ,他的索引创建的就是  100001,200001,300001,400001,500001。 由于序列本身有序,会根据范围放在不同的叶子块

详见:索引,组合索引篇      那么索引就被放在不同的快中,有效的减少了热快争用。

Oracle 优化——位图、函数等索引介绍

再看一下这张图, 最下面就是叶子块  ,100001 和200001 和300001 会放在不同的块中,而一般常常会频繁的访问近期的数据,那么由于他们在不同的块中,在索引进行检索的时候,能够有效的减少资源竞争。

创建反向索引的sql

---反向索引

create index rev_index on t(column) reverse;



---将反向索引转换成普通索引。

alter index rev_index  rebuild noreverse;

2、反向索引,在进行范围查询的时候无效,  

3、反向索引无序了,所以无法走索引排序,

 

三、函数索引

我们现在,有一个场景:有一列数据是有大小写的,但是查询的时候,不需要区分大小写。

那么语句只能这么写    select * from t  where upper(object_name)='T' ;

 

首先有一个常识,就是  走了函数查询,不会走索引。 就像有些查询 列的类型与值类型不匹,会进行值类型函数转换,然后无法进行索引查询

eg: id为varchar类型  而查询语句为:select * from t where id = 1。  由于数据字段为varchar类型,而参数为number 类型,故会进行值类型转换。检索就走了全表扫描。

 

那么如何实现场景需求呢? 只能让函数索引一展身手:

create index idx_func_ojbnam on t( upper(object_name) );

upper()是Oracle内部函数

现在   select * from t  where upper(object_name)='T' ;   这条语句就能走上索引。

请看具体案例

 1 --测函数索引前准备

 2 drop table t purge;

 3 create table t as select * from dba_objects;

 4 create index idx_object_id on t(object_id);

 5 create index idx_object_name on t(object_name);

 6 create index idx_created on t(created);

 7 

 8 

 9 

10 

11 --对列做UPPER操作,无法用到索引

12 set autotrace traceonly

13 set linesize 1000

14 ---以下语句由于列运算,所以走的是全表扫描

15 select * from t  where upper(object_name)='T' ;

16 执行计划

17 --------------------------------------------------------------------------

18 | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

19 --------------------------------------------------------------------------

20 |   0 | SELECT STATEMENT  |      |    12 |  2484 |   293   (1)| 00:00:04 |

21 |*  1 |  TABLE ACCESS FULL| T    |    12 |  2484 |   293   (1)| 00:00:04 |

22 --------------------------------------------------------------------------

23 统计信息

24 ----------------------------------------------------------

25           0  recursive calls

26           0  db block gets

27        1049  consistent gets

28           0  physical reads

29           0  redo size

30        1500  bytes sent via SQL*Net to client

31         415  bytes received via SQL*Net from client

32           2  SQL*Net roundtrips to/from client

33           0  sorts (memory)

34           0  sorts (disk)

35           2  rows processed

36           

37 --去掉列的UPPER操作后立即用索引

38 select * from t  where  object_name='T' ;

39 执行计划

40 ----------------------------------------------------------

41 Plan hash value: 1138138579

42 

43 -----------------------------------------------------------------------------------------------

44 | Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |

45 -----------------------------------------------------------------------------------------------

46 |   0 | SELECT STATEMENT            |                 |     2 |   414 |     4   (0)| 00:00:01 |

47 |   1 |  TABLE ACCESS BY INDEX ROWID| T               |     2 |   414 |     4   (0)| 00:00:01 |

48 |*  2 |   INDEX RANGE SCAN          | IDX_OBJECT_NAME |     2 |       |     3   (0)| 00:00:01 |

49 -----------------------------------------------------------------------------------------------

50 统计信息

51 ----------------------------------------------------------

52           0  recursive calls

53           0  db block gets

54           6  consistent gets

55           0  physical reads

56           0  redo size

57        1506  bytes sent via SQL*Net to client

58         415  bytes received via SQL*Net from client

59           2  SQL*Net roundtrips to/from client

60           0  sorts (memory)

61           0  sorts (disk)

62           2  rows processed      

63           

64               

65 --如果必须用upper的条件,那你想用到索引,就得去建函数索引

66 create index idx_func_ojbnam on t(upper(object_name));

67 --继续执行,终于走索引了。

68 

69 select * from t  where upper(object_name)='T' ;

70 执行计划

71 -----------------------------------------------------------------------------------------------

72 | Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |

73 -----------------------------------------------------------------------------------------------

74 |   0 | SELECT STATEMENT            |                 |   775 |   206K|   152   (0)| 00:00:02 |

75 |   1 |  TABLE ACCESS BY INDEX ROWID| T               |   775 |   206K|   152   (0)| 00:00:02 |

76 |*  2 |   INDEX RANGE SCAN          | IDX_FUNC_OJBNAM |   310 |       |     3   (0)| 00:00:01 |

77 -----------------------------------------------------------------------------------------------

78 统计信息

79 ----------------------------------------------------------

80           0  recursive calls

81           0  db block gets

82           6  consistent gets

83           0  physical reads

84           0  redo size

85        1500  bytes sent via SQL*Net to client

86         415  bytes received via SQL*Net from client

87           2  SQL*Net roundtrips to/from client

88           0  sorts (memory)

89           0  sorts (disk)

90           2  rows processed

91 

92 

93 

94 

95 

96  

97  
场景中使用函数索引的案例实战(可以直接运行)
 1 /*

 2   结论:什么类型就放什么值,否则会发生类型转换,导致系能问题!

 3 (是存放字符的字段就设varchar2类型,是存放数值的字段就设置number类型,是存放日期的字段就设置date类型)

 4  这里的案例宏中

 5  select * from t_col_type where id=6; 用不到索引,要改成select * from t_col_type where id='6';

 6  如果送来的参数无法保证是'6',只能写成select * from t_col_type where to_number(id)=6;并且建to_number(id)的函数索引

 7  方可,这是很无奈的事。

 8   

 9 */

10 

11 

12 --举例说明:

13 drop table t_col_type purge;

14 create table t_col_type(id varchar2(20),col2 varchar2(20),col3 varchar2(20));

15 insert into t_col_type select rownum,'abc','efg' from dual connect by level<=10000;

16 commit;

17 create index idx_id on t_col_type(id);

18 set linesize 1000

19 set autotrace traceonly

20 

21 select * from t_col_type where id=6;

22 

23 执行计划

24 --------------------------------------------------------------------------------

25 | Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |

26 --------------------------------------------------------------------------------

27 |   0 | SELECT STATEMENT  |            |     1 |    36 |     9   (0)| 00:00:01 |

28 |*  1 |  TABLE ACCESS FULL| T_COL_TYPE |     1 |    36 |     9   (0)| 00:00:01 |

29 --------------------------------------------------------------------------------

30    1 - filter(TO_NUMBER("ID")=6)

31 统计信息

32 ----------------------------------------------------------

33           0  recursive calls

34           0  db block gets

35          32  consistent gets

36           0  physical reads

37           0  redo size

38         540  bytes sent via SQL*Net to client

39         415  bytes received via SQL*Net from client

40           2  SQL*Net roundtrips to/from client

41           0  sorts (memory)

42           0  sorts (disk)

43           1  rows processed

44 

45 --实际上只有如下写法才可以用到索引,这个很不应该,是什么类型的取值就设置什么样的字段。

46           

47 select * from t_col_type where id='6';

48 执行计划

49 ------------------------------------------------------------------------------------------

50 | Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |

51 ------------------------------------------------------------------------------------------

52 |   0 | SELECT STATEMENT            |            |     1 |    36 |     2   (0)| 00:00:01 |

53 |   1 |  TABLE ACCESS BY INDEX ROWID| T_COL_TYPE |     1 |    36 |     2   (0)| 00:00:01 |

54 |*  2 |   INDEX RANGE SCAN          | IDX_ID     |     1 |       |     1   (0)| 00:00:01 |

55 ------------------------------------------------------------------------------------------

56    2 - access("ID"='6')

57 统计信息

58 ----------------------------------------------------------

59           0  recursive calls

60           0  db block gets

61           4  consistent gets

62           0  physical reads

63           0544  bytes sent via SQL*Net to client

65         415  bytes received via SQL*Net from client

66           2  SQL*Net roundtrips to/to/from client

67           0  sorts (memory)

68           0  sorts (disk)

69           1  rows processed          

70 

71 create index idx_func_tonumber_id on t_col_type(to_number(id));         

72 select * from t_col_type where to_number(id)=6;        

73lt;span style="color: #008080;">75 ----------------------------------------------------------------------------------------------------

76 | Id  | Operation                   | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |

77t;>| Time     |

77 ----------------------------------------------------------------------------------------------------

78 |   0 | SELECT STATEMENT            |                      |   100 |  4900 |     2   (0)| 00:00:01 |

79 |   1 |  TABLE ACCESS BY INDEX ROWID| T_COL_TYPE           |   100 |  4900 |     2   (0)| 00:00:01 |

80 |*  2 |   INDEX RANGE SCAN          | IDX_FUNC_TONUMBER_ID |    40 |       |     1   (0)| 00:00:01 |

81 ----------------------------------------------------------------------------------------------------

82 2 - access(TO_NUMBER("ID")=6)

83 统计信息

84 ----------------------------------------------------------

85           0  recursive calls

86           0  db block gets

87           4  consistent gets

88           0  physical reads

89           0  redo size

90         540  bytes sent via SQL*Net to client

91         416  bytes received via SQL*Net from client

92           2  SQL*Net roundtrips to/from client

93           0  sorts (memory)

94           0  sorts (disk)

95           1  rows processed  
列类型不匹配导致无法走索引的案例实战

 

2、自定义函数索引

首先创建一个自定义函数让 id-1 的形式创建序列,当然没什么意义。

create or replace function f_minus1(i int)

   return int DETERMINISTIC is

    begin

    return(i-1);

   end;
create index idx_test on test (f_minus1(object_id));
DETERMINISTIC关键字很重要

 

 

四、全文检索

干、、写了四个小时,Google 浏览器崩溃了。

你可能感兴趣的:(oracle)