官网网址参考:
https://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_stats.htm#CIHBIEII
https://docs.oracle.com/cd/B12037_01/server.101/b10759/statements_4005.htm#i2150533
https://asktom.oracle.com/pls/asktom/f?p=100:11:::::p11_question_id:5792247321358
https://docs.oracle.com/database/121/TGSQL/tgsql_histo.htm#TGSQL366
查询表上一次收集统计信息的时间:
1
|
select owner,table_name,last_analyzed from dba_tables where owner= 'SCOTT' ;
|
统计信息涉及的视图:
Column statistics appear in the data dictionary views USER_TAB_COLUMNS
, ALL_TAB_COLUMNS
, and DBA_TAB_COLUMNS
. Histograms appear in the data dictionary views USER_TAB_HISTOGRAMS
, DBA_TAB_HISTOGRAMS
, and ALL_TAB_HISTOGRAMS
; USER_PART_HISTOGRAMS
, DBA_PART_HISTOGRAMS
, and ALL_PART_HISTOGRAMS
; and USER_SUBPART_HISTOGRAMS
, DBA_SUBPART_HISTOGRAMS
, and ALL_SUBPART_HISTOGRAMS
.
收集统计信息主要有2种方法:
1. analyze
analyze可以用来收集表,索引,列以及系统的统计信息和直方图,以下为一些典型用法:
1
2
3
4
5
6
7
8
9
|
analyze table scott.emp compute statistics ; --收集所有的统计信息和直方图信息,包括表、列、索引。
analyze table scott.emp compute statistics for table ; --收集emp表的统计信息,不含列、索引统计信息和直方图。
analyze table scott.emp compute statistics for all columns; --收集所有列的统计信息和直方图(超大表较耗资源,因为只要列中有非空值,那么就会收集这个列的统计信息和直方图)。
analyze table scott.emp compute statistics for all indexed columns; --收集所有索引列的统计信息和直方图。
analyze table scott.emp compute statistics for all indexes; --收集所有索引统计信息,不含列的统计信息和直方图。
analyze table scott.emp compute statistics for columns 列1,列2; --收集2个列的统计信息和直方图。
analyze index idx_ename delete statistics ; --删除索引idx_ename的统计信息。
analyze table scott.emp delete statistics ; --删除表t1所有的表,列,索引的统计信息和列直方图。
analyze table scott.emp estimate statistics sample 15 percent for table ; --收集emp表的统计信息,以估算模式采样比例为15%进行收集,不含列、索引统计信息和直方图。
|
从语法可以看出,只有指定列统计信息收集时,才会收集相关列的直方图,此外收集直方图时for子句还可以加size子句,size的取值范围是1-254,默认值是75,表示直方图的buckets的最大数目。而dbms_stats包的size选择则有:数字|auto|repeat|skewonly选项,但analyze的size只能是数字。
关于直方图:
A histogram is a special type of column statistic that provides more detailed information about the data distribution in a table column. A histogram sorts values into "buckets," as you might sort coins into buckets.
从官网解释(参考第四个网址)来看,直方图就是一种特殊的列统计信息,这也与我们上边的推断相符,只有列才有直方图。
这里贴一个Tom Kyte用于查看analyze后统计信息的SQL:(已稍作改进,仅示例,这种格式的SQL不推荐,原SQL较简单参考第三个网址)
1
2
3
4
5
|
select t.num_rows as num_rows_in_table, i.index_name, i.num_rows as num_rows_in_index, co.num_analyzed_cols,ch.histogram_cnt
from ( select num_rows from user_tables where table_name = 'EMP' ) t,
( select index_name,num_rows from user_indexes where table_name = 'EMP' ) i,
( select count (*) as num_analyzed_cols from user_tab_columns where table_name= 'EMP' and num_distinct is not null ) co,
( select count ( distinct column_name) histogram_cnt from user_tab_histograms where table_name = 'EMP' ) ch;
|
需要注意的一点是for table选项在某些版本中并不只收集表统计信息,而是连列和索引的统计信息一块收集了,至于具体哪些版本的表现不同这里不做深究,使用上述SQL可以轻易的测试出你的analyze和dbms_stats语句到底收集了什么统计信息和直方图。
2. 调用dbms_stats包
dbms_stats与analyze的区别是:
analyze收集系统内部对象会报错,而dbms_stats不会
analyze不能正确的收集分区表的统计信息 ,而dbms_stats可以通过指定粒度来实现(granularity)。
analyze不能并行的收集统计信息,而dbms_stats可以(可以加上degree=>4来实现并行度为4的收集)。
Oracle推荐使用dbms_stats来收集统计信息,analyze将会被逐渐抛弃。
dbms_stats中负责收集统计信息的是以下几个存储过程:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
|
GATHER_DATABASE_STATS
--This procedure gathers statistics for all objects in the database.
GATHER_DICTIONARY_STATS
--This procedure gathers statistics for dictionary schemas 'SYS', 'SYSTEM' and schemas of RDBMS components.
GATHER_FIXED_OBJECTS_STATS
--This procedure gathers statistics for all fixed objects (dynamic performance tables).
GATHER_INDEX_STATS
--This procedure gathers index statistics. It attempts to parallelize as much of the work as possible. Restrictions are described in the individual parameters. This operation will not parallelize with certain types of indexes, including cluster indexes, domain indexes, and bitmap join indexes. The granularity and no_invalidate arguments are not relevant to these types of indexes.
GATHER_SCHEMA_STATS
--This procedure gathers statistics for all objects in a schema.
GATHER_SYSTEM_STATS
--This procedure gathers system statistics.
GATHER_TABLE_STATS
--This procedure gathers table and column (and index) statistics. It attempts to parallelize as much of the work as possible, but there are some restrictions as described in the individual parameters.
|
三个常用Procedure用法详解:GATHER_SCHEMA_STATS(两种用法)、GATHER_TABLE_STATS、GATHER_INDEX_STATS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
|
PROCEDURE GATHER_SCHEMA_STATS
Argument Name Type In / Out Default ?
------------------------------ ----------------------- ------ --------
OWNNAME VARCHAR2 IN
ESTIMATE_PERCENT NUMBER IN DEFAULT
BLOCK_SAMPLE BOOLEAN IN DEFAULT
METHOD_OPT VARCHAR2 IN DEFAULT
DEGREE NUMBER IN DEFAULT
GRANULARITY VARCHAR2 IN DEFAULT
CASCADE BOOLEAN IN DEFAULT
STATTAB VARCHAR2 IN DEFAULT
STATID VARCHAR2 IN DEFAULT
OPTIONS VARCHAR2 IN DEFAULT
OBJLIST OBJECTTAB OUT
STATOWN VARCHAR2 IN DEFAULT
NO_INVALIDATE BOOLEAN IN DEFAULT
GATHER_TEMP BOOLEAN IN DEFAULT
GATHER_FIXED BOOLEAN IN DEFAULT
STATTYPE VARCHAR2 IN DEFAULT
FORCE BOOLEAN IN DEFAULT
OBJ_FILTER_LIST OBJECTTAB IN DEFAULT
PROCEDURE GATHER_SCHEMA_STATS
Argument Name Type In / Out Default ?
------------------------------ ----------------------- ------ --------
OWNNAME VARCHAR2 IN
ESTIMATE_PERCENT NUMBER IN DEFAULT
BLOCK_SAMPLE BOOLEAN IN DEFAULT
METHOD_OPT VARCHAR2 IN DEFAULT
DEGREE NUMBER IN DEFAULT
GRANULARITY VARCHAR2 IN DEFAULT
CASCADE BOOLEAN IN DEFAULT
STATTAB VARCHAR2 IN DEFAULT
STATID VARCHAR2 IN DEFAULT
OPTIONS VARCHAR2 IN DEFAULT
STATOWN VARCHAR2 IN DEFAULT
NO_INVALIDATE BOOLEAN IN DEFAULT
GATHER_TEMP BOOLEAN IN DEFAULT
GATHER_FIXED BOOLEAN IN DEFAULT
STATTYPE VARCHAR2 IN DEFAULT
FORCE BOOLEAN IN DEFAULT
OBJ_FILTER_LIST OBJECTTAB IN DEFAULT
PROCEDURE GATHER_TABLE_STATS
Argument Name Type In / Out Default ?
------------------------------ ----------------------- ------ --------
OWNNAME VARCHAR2 IN
TABNAME VARCHAR2 IN
PARTNAME VARCHAR2 IN DEFAULT
ESTIMATE_PERCENT NUMBER IN DEFAULT
BLOCK_SAMPLE BOOLEAN IN DEFAULT
METHOD_OPT VARCHAR2 IN DEFAULT
DEGREE NUMBER IN DEFAULT
GRANULARITY VARCHAR2 IN DEFAULT
CASCADE BOOLEAN IN DEFAULT
STATTAB VARCHAR2 IN DEFAULT
STATID VARCHAR2 IN DEFAULT
STATOWN VARCHAR2 IN DEFAULT
NO_INVALIDATE BOOLEAN IN DEFAULT
STATTYPE VARCHAR2 IN DEFAULT
FORCE BOOLEAN IN DEFAULT
PROCEDURE GATHER_INDEX_STATS
Argument Name Type In / Out Default ?
------------------------------ ----------------------- ------ --------
OWNNAME VARCHAR2 IN
INDNAME VARCHAR2 IN
PARTNAME VARCHAR2 IN DEFAULT
ESTIMATE_PERCENT NUMBER IN DEFAULT
STATTAB VARCHAR2 IN DEFAULT
STATID VARCHAR2 IN DEFAULT
STATOWN VARCHAR2 IN DEFAULT
DEGREE NUMBER IN DEFAULT
GRANULARITY VARCHAR2 IN DEFAULT
NO_INVALIDATE BOOLEAN IN DEFAULT
STATTYPE VARCHAR2 IN DEFAULT
FORCE BOOLEAN IN DEFAULT
|
GATHER_SCHEMA_STATS参数详解:(其他存储过程的参数解释参见官方页面,很多参数description都是通用的)
详见: Table 103-30 GATHER_SCHEMA_STATS Procedure Parameters
一些实际用例:
1
2
3
|
EXEC DBMS_STATS.GATHER_SCHEMA_STATS( 'SCOTT' ,estimate_percent=>80,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,degree=>4, cascade => TRUE );
EXEC DBMS_STATS.GATHER_TABLE_STATS( 'SCOTT' , 'EMP' ,estimate_percent=>80,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,degree=>4, cascade => TRUE );
EXEC DBMS_STATS.GATHER_INDEX_STATS( 'SCOTT' , 'PK_EMP' ,estimate_percent=>80,degree=>4);
|
一些特别提示:
关于执行权限:
当然关于权限还有个取巧的办法,示例如下:
1
2
3
4
5
6
7
8
9
|
conn hr/hr
create or replace procedure gather_stats is
begin
dbms_stats.gather_table_stats( 'HR' , 'EMPLOYEES' );
end gather_stats;
/
grant select on hr.employees to scott;
grant execute on gather_stats to scott;
conn scott/scott exec hr.gather_stats;
|