dbms_stats 包对段表的分析有三个层次:
表自身的分析:包括表中的行数、数据块数、行长等信息。
列的分析:包括列值的重复数、列上的空值、数据在列上的分布情况。
索引的分析:包括索引叶块的数量、索引的深度、索引的聚合因子等。
直方图有时对于CBO非常重要,特别是对于有字段数据非常倾斜的表,做直方图分析尤为重要,比如下面的例子:
SQL> show user USER 为 "U1" SQL> create table t3 as select 1 id,object_name name from test; 表已创建。 SQL> update t3 set id = 99 where rownum = 1; 已更新 1 行。 SQL> commit; 提交完成。 SQL> select id,count(*) from t3 group by id; ID COUNT(*) ---------- ---------- 1 402340 99 1 SQL> create index idx_t3_id on t3(id); 索引已创建。 SQL> exec dbms_stats.gather_table_stats(ownname=>'u1',tabname=>'t3',cascade=>true); PL/SQL 过程已成功完成。
我们创建了一张表,他的id字段数值严重倾斜,除了有一条记录为99外,其他的记录全部是1,然后对表做分析,默认情况下,dbms_stats包会对所有的列做直方图分析,可以从这个视图上查到相关的信息:
SQL> select table_name,column_name,endpoint_number,endpoint_value from user_histograms where table_name = 'T3'; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ------------------------------ --------------- --------------- -------------- T3 ID 0 1 T3 NAME 0 2.4504E+35 T3 ID 1 1 T3 NAME 1 6.2511E+35 SQL> select table_name,num_rows,avg_row_len,to_char(last_analyzed,'yyyy-mm-dd hh24:mi:ss') from user_tables; TABLE_NAME NUM_ROWS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ ------------------------------ ---------- ----------- ------------------- T3 400883 27 2011-12-10 03:41:55
现在来看让id值分布等于1,99的时候,CBO产生的执行计划的情况:
SQL> set autotrace traceonly; SQL> select * from t3 where id = 99; 执行计划 ---------------------------------------------------------- Plan hash value: 3977267011 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 27 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T3 | 1 | 27 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_T3_ID | 1 | | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID"=99) 统计信息 ---------------------------------------------------------- 1 recursive calls 0 db block gets 4 consistent gets 1 physical reads 0 redo size 466 bytes sent via SQL*Net to client 400 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed从这里我们可以看到,CBO根据直方图的信息知道id=99的记录只有一行,所以它选择了索引,这个选择非常正确。
SQL> set autotrace traceonly; SQL> select * from t3 where id = 1; 已选择402340行。 执行计划 ---------------------------------------------------------- Plan hash value: 4161002650 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 400K| 10M| 418 (4)| 00:00:06 | |* 1 | TABLE ACCESS FULL| T3 | 400K| 10M| 418 (4)| 00:00:06 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("ID"=1) 统计信息 ---------------------------------------------------------- 2 recursive calls 0 db block gets 28495 consistent gets 1728 physical reads 0 redo size 12140483 bytes sent via SQL*Net to client 295442 bytes received via SQL*Net from client 26824 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 402340 rows processed同样,CBO从直方图的信息中得到id=1的值几乎与表的记录数一样,所以全表扫描被使用。现在让我们将直方图信息删除,保留表和索引的分析信息:
SQL> exec dbms_stats.delete_column_stats(user,'t3','id'); PL/SQL 过程已成功完成。 SQL> select table_name,column_name,endpoint_number,endpoint_value from user_histograms where table_name = 'T3'; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ------------------------------ --------------- --------------- -------------- T3 NAME 0 2.4504E+35 T3 NAME 1 6.2963E+35 SQL> select table_name,num_rows,avg_row_len,to_char(last_analyzed,'yyyy-mm-dd hh24:mi:ss') from user_tables where table_name = 'T3'; TABLE_NAME NUM_ROWS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ ------------------------------ ---------- ----------- ------------------- T3 403448 27 2011-12-10 04:12:42 SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name = 'T3'; BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED ---------- ----------- ------------- -------------- 2 786 2 10-12月-11可以看见id字段的信息已经被删除,索引和表的信息仍然存在,并且索引中甚至可以找到重复的数值只有2个,但是CBO却无法得到这两个重复的数值的分布情况,所以依然没有办法选出一个正确的执行计划。
SQL> set autotrace traceonly; SQL> select * from t3 where id = 1; 已选择402340行。 执行计划 ---------------------------------------------------------- Plan hash value: 3977267011 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4034 | 106K| 403 (1)| 00:00:05 | | 1 | TABLE ACCESS BY INDEX ROWID| T3 | 4034 | 106K| 403 (1)| 00:00:05 | |* 2 | INDEX RANGE SCAN | IDX_T3_ID | 1614 | | 395 (0)| 00:00:05 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID"=1) 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 28495 consistent gets 1309 physical reads 0 redo size 12140483 bytes sent via SQL*Net to client 295442 bytes received via SQL*Net from client 26824 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 402340 rows processed SQL> select * from t3 where id = 99; 执行计划 ---------------------------------------------------------- Plan hash value: 3977267011 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4034 | 106K| 403 (1)| 00:00:05 | | 1 | TABLE ACCESS BY INDEX ROWID| T3 | 4034 | 106K| 403 (1)| 00:00:05 | |* 2 | INDEX RANGE SCAN | IDX_T3_ID | 1614 | | 395 (0)| 00:00:05 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID"=99) 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 466 bytes sent via SQL*Net to client 400 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
可以看见上面的执行计划已经是不正确的了。