----------------------Oracle性能优化的基本概念和原理--------------------
一、成本的含义
两层含义——
1、从表面看,是诸如执行计划等工具导致的执行次数
2、SQL语句执行的时候导致的实际的消耗
综上描述,将成本定义为指由优化器执行计算而导致的消耗,
也就是优化器对执行语句所用的时间的最优的估计。
优化器——数学模型,驻留在数据库上的程序、视图、触发器、作业。当执行SQL语句的时候,
会自动调用后台的程序结合视图的数据找出最优的SQL的执行策略。
二、优化器的模式
1、RBO优化器(RBO优化法则)——执行机制很简单,人为的通过既定的规则控制SQL语句的执行,
表的内容对执行计划是没有影响的,从oracle10g版本开始,RBO优化法则被全面淘汰。
2、CBO优化器(CBO优化法则)——执行SQL语句的时候,尝试找到成本最低的数据访问的方法,
而且考虑表和索引的内容。
CBO优化器又叫做基于成本的优化器、基于代价的优化法则、CBO优化器、CBO优化法则、
基于代价的优化器=CBO
参数:
SQL> show parameter optimizer;
NAME TYPE VALUE
------------------------------------ -------------------------------- ------------------------------
optimizer_dynamic_sampling integer 2
optimizer_features_enable string 10.2.0.4
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
optimizer_mode string ALL_ROWS
optimizer_secure_view_merging boolean TRUE
optimizer_mode选项:
ALL_ROWS——用能返回结果集的所有行来表示最短时间内完成语句的执行计划,默认用CBO法则
FIRST_ROWS_n——用能返回结果集的前n行来表示最短时间内完成语句的执行计划,默认用CBO法则
例如:first_rows(20)
CHOOSE_oracle9i数据库的默认值,提供了一种时运行选择方式,
可以在RBO和CBO之间切换,建议在10g版本中不用。
RULE——基于规则的优化法则(RBO)
三、动态采样
在段对象(表、索引、分区)没有作统计分析(收集对象的性能数据)的前提下,
为了使得优化器得到足够的信息,以保证作出正确的执行计划而生的一种技术。
可以看成是分析手段的一个补充,直接从需要分析的对象收集一些数据块(采样)
来获得CBO需要的信息。
案例:
SQL> create table t as select owner,object_type from all_objects;
Table created.
SQL> select count(1) from t;
COUNT(1)
----------
40907
SQL> set linesize 10000
SQL> set autotrace traceonly exp --打开查看执行计划的工具
SQL> select * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 43528 | 1190K| 30 (4)| 00:00:01 | --执行计划估计的结果是有误差的。
| 1 | TABLE ACCESS FULL| T | 43528 | 1190K| 30 (4)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement --这个SQL做了动态采样。
COST值是RBO优化法则中是没有的。
COST=将所有的消耗转化为单块读的次数。
禁用动态采样的话:
SQL> select /*+ dynamic_sampling(t 0) */ * from t; --采样的级别为0,表示禁用动态采样
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10047 | 274K| 29 (0)| 00:00:01 | --误差更大。
| 1 | TABLE ACCESS FULL| T | 10047 | 274K| 29 (0)| 00:00:01 |
--------------------------------------------------------------------------
SQL> conn /as sysdba
Connected.
SQL> show parameter dyn
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 2 --默认的动态采样的级别
系统级别禁用动态采样的话:
SQL> alter system set optimizer_dynamic_sampling=0 scope=both;
System altered.
SQL> show parameter dyn
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 0
SQL> select * from t; --不用加提示,直接禁用
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10047 | 274K| 29 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 10047 | 274K| 29 (0)| 00:00:01 |
--------------------------------------------------------------------------
SQL> alter system set optimizer_dynamic_sampling=2 scope=both; --还原默认值
System altered.
SQL> show parameter dyn
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 2
结论——有动态采样虽然有少许的误差,但是比不采样好好的多,最起码执行计划不会很离谱。
正常的情况:
1、收集统计分析信息
2、当用户执行SQL的时候,会根据收集到的统计分析信息生成执行计划。
3、按照执行计划去执行SQL语句
尝试手工分析(手工收集性能数据):
在SYS下:
SQL> exec dbms_stats.gather_table_stats('scott','t',estimate_percent=>100);
PL/SQL procedure successfully completed.
如果在scott下:
SQL> show user
USER is "SCOTT"
SQL> exec dbms_stats.gather_table_stats(user,'t',estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> select * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40907 | 559K| 30 (4)| 00:00:01 | --没有误差
| 1 | TABLE ACCESS FULL| T | 40907 | 559K| 30 (4)| 00:00:01 |
--------------------------------------------------------------------------
发现此时没有以下的字样:
Note
-----
- dynamic sampling used for this statement
执行计划会不会随着数据的改变而改变:
SQL> delete from t where rownum<=30000;
30000 rows deleted.
SQL> commit;
Commit complete.
SQL> select * from t; --执行计划的得出完全依赖于性能数据,性能数据不变,执行计划就不变。性能数据的改变依赖于统计信息的收集。
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40907 | 559K| 30 (4)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 40907 | 559K| 30 (4)| 00:00:01 |
--------------------------------------------------------------------------
重新分析一次:
SQL> exec dbms_stats.gather_table_stats(user,'t',estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> select * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10907 | 149K| 29 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 10907 | 149K| 29 (0)| 00:00:01 |
--------------------------------------------------------------------------
尝试将统计信息删除:
SQL> exec dbms_stats.delete_table_stats(user,'t'); --删掉性能数据
PL/SQL procedure successfully completed.
SQL> select * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 12185 | 333K| 29 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 12185 | 333K| 29 (0)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement --开始动态采样了。
SQL> delete from t;
10907 rows deleted.
SQL> commit;
Commit complete.
SQL> select * from t; --动态采样是会随着数据的变化而变化的,因为每次执行SQL,发现没有性能数据收集的话,强制动态采样。
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 28 | 29 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 1 | 28 | 29 (0)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
SQL> select /*+ dynamic_sampling(t 0) */ * from t; --禁用
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10047 | 274K| 29 (0)| 00:00:01 | --plan严重错误
| 1 | TABLE ACCESS FULL| T | 10047 | 274K| 29 (0)| 00:00:01 |
--------------------------------------------------------------------------
SQL> truncate table t; --HWM下降了,extent回收。
Table truncated.
SQL> select /*+ dynamic_sampling(t 0) */ * from t; --不用动态采样的话,估计行数的时候是按照extent估算的。
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 82 | 2296 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 82 | 2296 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
SQL> select * from t; --不是按照extent数量估算,而是按照实际来估算。
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 28 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 1 | 28 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
统计分析和动态采样的区别:
1、场景不一样——统计分析的结果会存储在磁盘(存储在数据字典表),
但是动态采样的分析结果是不存储的,每次SQL执行,动态采样都会强制执行
(前提是没有分析数据)。每次执行SQL的时候,
会根据这些分析的结果按照CBO数学模型来组装执行计划,执行计划存储在共享池的库缓存中,
执行计划需要存储在数据字典中的分析信息,这些内容会存储在数据字典高速缓存中。
显然,动态采样不具有共享性,随时变化,加大了资源的消耗。
2、分析范围指标不一样:动态采样靠加大采样的级别来加大分析的范围。
而统计分析是靠参数estimate_percent来加大分析的范围。
exec dbms_stats.gather_table_stats(user,'t',estimate_percent=>100);
3、分析程度不一样——动态采样是采样评估,统计分析是对CBO数学模型中众多的指标进行数学运算,
从本质上说,统计分析的结果会更加准确。
4、动态采样仅仅是一个补救措施(CBO数学模型的一个分支),
而不是得到准确的执行计划的专用方法。
加大采样级别:
SQL> insert into t select owner,object_type from all_objects;
40907 rows created.
SQL> commit;
Commit complete.
SQL> select /*+ dynamic_sampling(t 0) */ * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 82 | 2296 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 82 | 2296 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
SQL> select /*+ dynamic_sampling(t 1) */ * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 44185 | 1208K| 29 (4)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 44185 | 1208K| 29 (4)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
SQL> select /*+ dynamic_sampling(t 2) */ * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 45659 | 1248K| 29 (4)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 45659 | 1248K| 29 (4)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
SQL> select /*+ dynamic_sampling(t 3) */ * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40907 | 1118K| 29 (4)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 40907 | 1118K| 29 (4)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
SQL> select /*+ dynamic_sampling(t 4) */ * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40907 | 1118K| 29 (4)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 40907 | 1118K| 29 (4)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
SQL> select /*+ dynamic_sampling(t 10) */ * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40907 | 1118K| 29 (4)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 40907 | 1118K| 29 (4)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
采样级别越高,越准确,但是给系统带来的损耗也越大!
或者直接将系统参数改为一个较大的值,不用写提示/*+ dynamic_sampling(t n) */,
直接用所调整的采样级别。