最近听了老猫讲的 extended stats 统计信息的收集, 感觉很好, 在网上搜了搜, 发现 https://blogs.oracle.com/optimizer/extended-statistics , 有类似范例, 根据这个范例自己做了下实验。 记录实验的过程。
目的:
1. 测试下 dbms_stats 的 extended stats 包的使用效果, 是否可以在选择率低的情况下,解决 执行计划/cardinatlity 基数行不准的问题,
并且是只验证 extended stats 包的使用效果 。 :D:
环境:
1. windows 7 64 位
2. oracle 12.2.0.1 64 位
3. 使用demo程序的 sales history , SH 用户 测试 。
4. 先手动安装下 sales history 范例数据
步骤 :
1. sales 表先插入 50 行,
2. 分析表,获得统计信息
3. 插入大量数据, 让统计信息不准
4. 使用extended status 分析包
5. 对比执行计划
步骤1-2 :
alter session set statistics_level = all ;
set linesize 120
show parameter statistics_level ;
NAME TYPE VALUE
------------------------------------ ---------------------- --------------------
statistics_level string ALL
SQL>
insert into sales select * from sales_02 where rownum <= 50 ;
EXEC DBMS_STATS.GATHER_TABLE_STATS(user,'SALES');
select column_name, num_distinct, histogram from user_tab_columns where table_name = 'SALES' ;
SQL> select column_name, num_distinct, histogram from user_tab_columns where table_name = 'SALES' ;
COLUMN_NAM NUM_DISTINCT HISTOGRAM
---------- ------------ ------------------------------
PROD_ID 1 FREQUENCY
CUST_ID 50 NONE
TIME_ID 2 NONE
CHANNEL_ID 2 FREQUENCY
PROMO_ID 1 FREQUENCY
QUANTITY_S 1 NONE
AMOUNT_SOL 2 FREQUENCY
select num_rows from user_tables where table_name = 'SALES' ;
SQL> select num_rows from user_tables where table_name = 'SALES' ;
NUM_ROWS
----------
50
select count(*) from sales a where amount_sold >
( select avg(amount_sold) from sales b
where b.prod_id = a.prod_id
and channel_id = 3 and promo_id = 999 )
and channel_id = 3 and promo_id = 999 ;
COUNT(*)
----------
309
已用时间: 00: 00: 00.02
SQL>
select * from table(dbms_xplan.display_cursor(NULL,NULL,'allstats last')) ;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 18vj1zs6jut5g, child number 0
-------------------------------------
select count(*) from sales a where amount_sold > ( select
avg(amount_sold) from sales b where b.prod_id = a.prod_id and
channel_id = 3 and promo_id = 999 ) and channel_id = 3 and promo_id =
999
Plan hash value: 1265065521
----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 14 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 14 |
|* 2 | FILTER | | 1 | | 0 |00:00:00.01 | 14 |
|* 3 | TABLE ACCESS FULL | SALES | 1 | 48 | 49 |00:00:00.01 | 7 |
| 4 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 7 |
|* 5 | TABLE ACCESS FULL| SALES | 1 | 48 | 49 |00:00:00.01 | 7 |
----------------------------------------------------------------------------------------
CHANNEL_ID 2 FREQUENCY
PROMO_ID 1 FREQUENCY
NUM_ROWS
----------
50
计算 cardinality
1/2 * 1 * 50 = 25 ,
基数算出来是 25
步骤3:
SQL> select count(*) from sales ;
COUNT(*)
----------
2756579
再次执行sql, 并看执行计划:
select count(*) from sales a where amount_sold >
( select avg(amount_sold) from sales b
where b.prod_id = a.prod_id
and channel_id = 3 and promo_id = 999 )
and channel_id = 3 and promo_id = 999 ;
select * from table(dbms_xplan.display_cursor(NULL,NULL,'allstats last')) ;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
SQL_ID 18vj1zs6jut5g, child number 0
-------------------------------------
select count(*) from sales a where amount_sold > ( select
avg(amount_sold) from sales b where b.prod_id = a.prod_id and
channel_id = 3 and promo_id = 999 ) and channel_id = 3 and promo_id =
999
Plan hash value: 1265065521
----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:01:53.45 | 2336K|
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:01:53.45 | 2336K|
|* 2 | FILTER | | 1 | | 709K|00:01:53.33 | 2336K|
|* 3 | TABLE ACCESS FULL | SALES | 1 | 48 | 1554K|00:00:00.98 | 13431 |
| 4 | SORT AGGREGATE | | 173 | 1 | 173 |00:01:51.30 | 2323K|
|* 5 | TABLE ACCESS FULL| SALES | 173 | 48 | 2131K|00:01:50.93 | 2323K|
----------------------------------------------------------------------------------------
48 VS 2131000 = 50000 倍
SQL> select num_rows from user_tables where table_name = 'SALES' ;
NUM_ROWS
----------
50
使用 extended 包
select dbms_stats.create_extended_stats(ownname => 'SH' , tabname => 'SALES' , extension => '(CHANNEL_ID,PROMO_ID)' ) from dual ;
DBMS_STATS.CREATE_EXTENDED_STATS(OWNNAME=>'SH',TABNAME=>'SALES',EXTENSION=>'(CHANNEL_ID,PROMO_ID)')
------------------------------------------------------------------------------------------------------------------------
SYS_STU7$MLVU9QOBUF89709XS1VC9
已用时间: 00: 00: 01.65
SQL>
SQL>
exec dbms_stats.gather_table_stats(null,'SALES', method_opt => 'for columns SYS_STU7$MLVU9QOBUF89709XS1VC9 size 2' );
select count(*) from sales a where amount_sold >
( select avg(amount_sold) from sales b
where b.prod_id = a.prod_id
and channel_id = 3 and promo_id = 999 )
and channel_id = 3 and promo_id = 999 ;
alter system flush shared_pool;
select * from table(dbms_xplan.display_cursor(NULL,NULL,'allstats last')) ;
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------
SQL_ID 18vj1zs6jut5g, child number 0
-------------------------------------
select count(*) from sales a where amount_sold > ( select
avg(amount_sold) from sales b where b.prod_id = a.prod_id and
channel_id = 3 and promo_id = 999 ) and channel_id = 3 and promo_id =
999
Plan hash value: 4009253081
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.79 | 26898 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.79 | 26898 | | | |
|* 2 | HASH JOIN | | 1 | 94910 | 709K|00:00:01.79 | 26898 | 1316K| 1316K| 1666K (0)|
| 3 | VIEW | VW_SQ_1 | 1 | 72 | 72 |00:00:00.92 | 13431 | | | |
| 4 | HASH GROUP BY | | 1 | 72 | 72 |00:00:00.92 | 13431 | 1106K| 1106K| 2480K (0)|
|* 5 | TABLE ACCESS FULL| SALES | 1 | 1554K| 1554K|00:00:00.48 | 13431 | | | |
|* 6 | TABLE ACCESS FULL | SALES | 1 | 1554K| 1554K|00:00:00.49 | 13431 | | | |
----------------------------------------------------------------------------------------------------------------------
SQL> select num_rows from user_tables where table_name = 'SALES' ;
NUM_ROWS
----------
2756579
EXEC DBMS_STATS.GATHER_TABLE_STATS(user,'SALES');
SQL> select column_name, num_distinct, histogram from user_tab_columns where table_name = 'SALES' ;
COLUMN_NAM NUM_DISTINCT HISTOGRAM
---------- ------------ ------------------------------
PROD_ID 72 FREQUENCY
CUST_ID 7059 NONE
TIME_ID 1460 NONE
CHANNEL_ID 4 FREQUENCY
PROMO_ID 4 FREQUENCY
QUANTITY_S 1 NONE
AMOUNT_SOL 3586 HYBRID
已选择 7 行。
CHANNEL_ID PROMO_ID 各选择一个值,就是:
1/4 * 1/4 * 2756579 = 172286
基数是 172286
estimated rows : 1554000
去掉统计信息
这些都无效,
exec DBMS_STATS.DROP_EXTENDED_STATS(null,'SALES','(CHANNEL_ID,PROMO_ID)');
exec DBMS_STATS.DELETE_TABLE_STATS(ownname => 'SH', tabname => 'SALES');
select ADDRESS, HASH_VALUE from V$SQLAREA where SQL_ID = '18vj1zs6jut5g';
ADDRESS HASH_VALUE
---------------- ----------
000007FF0465AB40 220030127
exec SYS.DBMS_SHARED_POOL.PURGE ('000007FF0465AB40,220030127', 'C');
select count(*) from sales a where amount_sold >
( select avg(amount_sold) from sales b
where b.prod_id = a.prod_id
and channel_id = 3 and promo_id = 999 )
and channel_id = 3 and promo_id = 999 ;
SQL> select column_name, num_distinct, histogram from user_tab_columns where table_name = 'SALES' ;
COLUMN_NAM NUM_DISTINCT HISTOGRAM
---------- ------------ ------------------------------
PROD_ID NONE
CUST_ID NONE
TIME_ID NONE
CHANNEL_ID NONE
PROMO_ID NONE
QUANTITY_S NONE
AMOUNT_SOL NONE
已选择 7 行。
SQL> select num_rows from user_tables where table_name = 'SALES' ;
NUM_ROWS
----------
select * from table(dbms_xplan.display_cursor(NULL,NULL,'allstats last')) ;
Plan hash value: 4009253081
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.67 | 26898 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.67 | 26898 | | | |
|* 2 | HASH JOIN | | 1 | 95637 | 709K|00:00:01.67 | 26898 | 1316K| 1316K| 1583K (0)|
| 3 | VIEW | VW_SQ_1 | 1 | 72 | 72 |00:00:00.87 | 13431 | | | |
| 4 | HASH GROUP BY | | 1 | 72 | 72 |00:00:00.87 | 13431 | 1106K| 1106K| 2480K (0)|
|* 5 | TABLE ACCESS FULL| SALES | 1 | 1566K| 1554K|00:00:00.45 | 13431 | | | |
|* 6 | TABLE ACCESS FULL | SALES | 1 | 1566K| 1554K|00:00:00.45 | 13431 | | | |
----------------------------------------------------------------------------------------------------------------------
最后还是truncate table ,重新建立数据
truncate table sales ;
insert into sales select * from sales_02 where rownum <= 50 ;
EXEC DBMS_STATS.GATHER_TABLE_STATS(user,'SALES');
SQL> select column_name, num_distinct, histogram from user_tab_columns where table_name = 'SALES' ;
COLUMN_NAM NUM_DISTINCT HISTOGRAM
---------- ------------ ------------------------------
PROD_ID 1 FREQUENCY
CUST_ID 50 NONE
TIME_ID 2 NONE
CHANNEL_ID 2 FREQUENCY
PROMO_ID 1 FREQUENCY
QUANTITY_S 1 NONE
AMOUNT_SOL 2 FREQUENCY
已选择 7 行。
SQL> select num_rows from user_tables where table_name = 'SALES' ;
NUM_ROWS
----------
50
select count(*) from sales a where amount_sold >
( select avg(amount_sold) from sales b
where b.prod_id = a.prod_id
and channel_id = 3 and promo_id = 999 )
and channel_id = 3 and promo_id = 999 ;
COUNT(*)
---------
709087
select * from table(dbms_xplan.display_cursor(NULL,NULL,'allstats last')) ;
Plan hash value: 1265065521
----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:01:32.27 | 2339K|
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:01:32.27 | 2339K|
|* 2 | FILTER | | 1 | | 709K|00:01:32.13 | 2339K|
|* 3 | TABLE ACCESS FULL | SALES | 1 | 1 | 1554K|00:00:00.67 | 13571 |
| 4 | SORT AGGREGATE | | 173 | 1 | 173 |00:01:30.73 | 2326K|
|* 5 | TABLE ACCESS FULL| SALES | 173 | 1 | 2131K|00:01:30.44 | 2326K|
----------------------------------------------------------------------------------------
1 VS 2131k 差无数倍
1 - SEL$1
3 - SEL$1 / A@SEL$1
4 - SEL$2
5 - SEL$2 / B@SEL$2
使用 hint
select /*+UNNEST(@"SEL$2")*/ count(*) from sales a where amount_sold >
( select avg(amount_sold) from sales b
where b.prod_id = a.prod_id
and channel_id = 3 and promo_id = 999 )
and channel_id = 3 and promo_id = 999 ;
select * from table(dbms_xplan.display_cursor(NULL,NULL,'allstats last')) ;
Plan hash value: 4009253081
----------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.64 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.64 |
|* 2 | HASH JOIN | | 1 | 2 | 709K|00:00:01.64 |
| 3 | VIEW | VW_SQ_1 | 1 | 1 | 72 |00:00:00.85 |
| 4 | HASH GROUP BY | | 1 | 1 | 72 |00:00:00.85 |
|* 5 | TABLE ACCESS FULL| SALES | 1 | 48 | 1554K|00:00:00.44 |
|* 6 | TABLE ACCESS FULL | SALES | 1 | 48 | 1554K|00:00:00.44 |
----------------------------------------------------------------------------------
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/16993226/viewspace-2147758/,如需转载,请注明出处,否则将追究法律责任。