一、前言
生产中偶尔会碰到一些sql,有多种执行计划,其中部分情况是统计信息过旧造成的,重新收集下统计信息就行了。但是有些时候重新收集统计信息也解决不了问题,而开发又在嗷嗷叫,没时间让你去慢慢分析原因的时候,这时临时的解决办法是通过spm去固定一个正确的执行计划,等找到真正原因后再解除该spm。
二、解决办法
1. 通过dbms_xplan.display_cursor查看指定sql都有哪些执行计划
SQL> select * from table(dbms_xplan.display_cursor('&sql_id',null,'TYPICAL PEEKED_BINDS'));
Enter value for sql_id: 66a4184u0t6hn old 1: select * from table(dbms_xplan.display_cursor('&sql_id',null,'TYPICAL PEEKED_BINDS')) new 1: select * from table(dbms_xplan.display_cursor('66a4184u0t6hn',null,'TYPICAL PEEKED_BINDS')) SQL_ID 66a4184u0t6hn, child number 0 ------------------------------------- select /*for_test*/ * from test1 where object_id = 1 Plan hash value: 4122059633 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 693 (100)| | |* 1 | TABLE ACCESS FULL| TEST1 | 173K| 15M| 693 (1)| 00:00:09 | --------------------------------------------------------------------------- SQL_ID 66a4184u0t6hn, child number 1 ------------------------------------- select /*for_test*/ * from test1 where object_id = 1 Plan hash value: 2214001748 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 2 (100)| | | 1 | TABLE ACCESS BY INDEX ROWID| TEST1 | 11 | 1056 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST1 | | | 1 (0)| 00:00:01 | -----------------------------------------------------------------------------------------
2. 查询该sql的历史执行情况
SQL> col snap_id for 99999999
SQL> col date_time for a30
SQL> col plan_hash for 9999999999
SQL> col executions for 99999999
SQL> col avg_etime_s heading 'etime/exec' for 9999999.99
SQL> col avg_lio heading 'buffer/exec' for 99999999999
SQL> col avg_pio heading 'diskread/exec' for 99999999999
SQL> col avg_cputime_s heading 'cputim/exec' for 9999999.99
SQL> col avg_row heading 'rows/exec' for 9999999
SQL> select * from(
select distinct
s.snap_id,
to_char(s.begin_interval_time,'mm/dd/yy_hh24mi') || to_char(s.end_interval_time,'_hh24mi') date_time,
sql.plan_hash_value plan_hash,
sql.executions_delta executions,
(sql.elapsed_time_delta/1000000)/decode(sql.executions_delta,null,1,0,1,sql.executions_delta) avg_etime_s,
sql.buffer_gets_delta/decode(sql.executions_delta,null,1,0,1,sql.executions_delta) avg_lio,
sql.disk_reads_delta/decode(sql.executions_delta,null,1,0,1,sql.executions_delta) avg_pio,
(sql.cpu_time_delta/1000000)/decode(sql.executions_delta,null,1,0,1,sql.executions_delta) avg_cputime_s,
sql.rows_processed_total/decode(sql.executions_delta,null,1,0,1,sql.executions_delta) avg_row
from dba_hist_sqlstat sql, dba_hist_snapshot s
where sql.instance_number =(select instance_number from v$instance)
and sql.dbid =(select dbid from v$database)
and s.snap_id = sql.snap_id
and sql_id = trim('&sql_id') order by s.snap_id desc)
where rownum <= 100;
Enter value for sql_id: 66a4184u0t6hn old 16: and sql_id = trim('&sql_id') order by s.snap_id desc) new 16: and sql_id = trim('66a4184u0t6hn') order by s.snap_id desc) SNAP_ID DATE_TIME PLAN_HASH EXECUTIONS etime/exec buffer/exec diskread/exec cputim/exec rows/exec --------- ------------------------------ ----------- ---------- ----------- ------------ ------------- ----------- --------- 39 08/16/19_1500_1600 2214001748 1 .12 25839 2901 .10 173927 39 08/16/19_1500_1600 4122059633 3 .11 13992 847 .11 173927
3. 绑定执行计划
从前两步中可以看到该sql有两条执行计划,假如plan_hash_value为’2214001748’才是对的,而此时数据库选择的是另一条执行计划,我们可以通过执行以下function去将执行计划固定为我们想要的。
SQL> var temp number;
SQL> begin
:temp := dbms_spm.load_plans_from_cursor_cache(sql_id=>'66a4184u0t6hn', plan_hash_value=>2214001748);
end;
/
三、做个实验
1. 准备测试表
实验环境,使用scott账号,并给scott赋予dba权限
SQL> create table test1 as select * from dba_objects;
SQL> insert into test1 select * from test1;
SQL> update test1 set object_id = 1 where rownum < (select count(*) from test1) - 10;
SQL> commit;
SQL> select object_id, count(*) from test1 group by object_id;
OBJECT_ID COUNT(*) ---------- ---------- 1 173927 82112 1 82121 1 82118 1 82119 1 82122 1 82113 1 82114 1 82120 1 82115 1 82116 1 82117 1
2. 创建索引并收集统计信息
SQL> create index idx_test1 on test1(object_id) online;
SQL> begin
dbms_stats.gather_table_stats(ownname => 'SCOTT',
tabname => 'TEST1',
cascade => true,
method_opt => 'for columns object_id size 10',
no_invalidate => false);
end;
/
3. 通过修改优化器模式,模拟同样的sql产生两条不同的执行计划
开启一个窗口A
SQL> set autot trace
SQL> alter session set optimizer_mode = all_rows; // 11g默认的值
SQL> select /*for_test*/ * from test1 where object_id = 1;
Execution Plan ---------------------------------------------------------- Plan hash value: 4122059633 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 173K| 15M| 693 (1)| 00:00:09 | |* 1 | TABLE ACCESS FULL| TEST1 | 173K| 15M| 693 (1)| 00:00:09 | ---------------------------------------------------------------------------
开启另一个窗口B
SQL> set autot trace
SQL> alter session set optimizer_mode = first_rows_10;
SQL> select /*for_test*/ * from test1 where object_id = 1;
Execution Plan ---------------------------------------------------------- Plan hash value: 2214001748 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 11 | 1056 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST1 | 11 | 1056 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST1 | | | 1 (0)| 00:00:01 | -----------------------------------------------------------------------------------------
再开启一个窗口C
SQL> select sql_id, sql_text, optimizer_mode, plan_hash_value, child_number from v$sql where sql_text like 'select /*for_test*/ * from test1%';
SQL_ID SQL_TEXT OPTIMIZER_ PLAN_HASH_VALUE CHILD_NUMBER ------------- ------------------------------------------------------- ---------- --------------- ------------ 66a4184u0t6hn select /*for_test*/ * from test1 where object_id = 1 ALL_ROWS 4122059633 0 66a4184u0t6hn select /*for_test*/ * from test1 where object_id = 1 FIRST_ROWS 2214001748 1
可以看到,因为优化器模式的不同,相同的sql产生了两条截然不同的执行计划
当optimizer_mode = all_rows为全表扫描,当optimizer_mode = first_rows_10为索引扫描
4. 绑定执行计划
再新开一个窗口D,执行
SQL> set autot trace
SQL> select /*for_test*/ * from test1 where object_id = 1;
Execution Plan ---------------------------------------------------------- Plan hash value: 4122059633 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 173K| 15M| 693 (1)| 00:00:09 | |* 1 | TABLE ACCESS FULL| TEST1 | 173K| 15M| 693 (1)| 00:00:09 | ---------------------------------------------------------------------------
可以看到执行计划为全表扫描,跟窗口A一样,这个是正常的
通过执行以下function去将执行计划固定为索引扫描
SQL> var temp number;
SQL> begin
:temp := dbms_spm.load_plans_from_cursor_cache(sql_id=>'66a4184u0t6hn', plan_hash_value=>2214001748);
end;
/
再执行以下sql
SQL> select /*for_test*/ * from test1 where object_id = 1;
Execution Plan ---------------------------------------------------------- Plan hash value: 2214001748 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 11 | 1056 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST1 | 11 | 1056 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST1 | 173K| | 1 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Note ----- - SQL plan baseline "SQL_PLAN_9657urkb9u2tnf24a05ff" used for this statement
可以看到spm已经生效了
四、删除spm
当我们找到sql执行计划突变的原因了,解决问题之后,就可以删除spm了。如何删除spm呢?
新开窗口E
查看当前sql的执行计划基线
SQL> select sql_handle, plan_name, origin from dba_sql_plan_baselines where sql_text like 'select /*for_test*/ * from test1%';
SQL_HANDLE PLAN_NAME ORIGIN ------------------------------ ------------------------------ -------------- SQL_9314fabc969d0b34 SQL_PLAN_9657urkb9u2tnf24a05ff MANUAL-LOAD SQL_9314fabc969d0b34 SQL_PLAN_9657urkb9u2tnfe026eff AUTO-CAPTURE
可以看到该sql有两条PLAN_NAME,一个是系统自动捕获的,一个是我们手工绑定的,反正我们不再需要这个了,统统删除
通过执行以下function去将执行计划基线删除
SQL> var temp number;
SQL> begin
:temp := dbms_spm.drop_sql_plan_baseline(sql_handle=>'SQL_9314fabc969d0b34', plan_name=>NULL);
end;
/
查看当前sql的执行计划基线
SQL> select sql_handle, plan_name, origin from dba_sql_plan_baselines where sql_text like 'select /*for_test*/ * from test1%';
no rows selected
再在窗口D中执行以下sql
SQL> select /*for_test*/ * from test1 where object_id = 1;
Execution Plan ---------------------------------------------------------- Plan hash value: 4122059633 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 173K| 15M| 693 (1)| 00:00:09 | |* 1 | TABLE ACCESS FULL| TEST1 | 173K| 15M| 693 (1)| 00:00:09 | ---------------------------------------------------------------------------
可以看到执行计划又变成默认的全表扫描了
五、说明
文章例子整理于《基于oracle的sql优化》,后面将写另一个场景,就是如果系统里就一个执行计划,但是该执行计划是有问题的,如何去手工生成一个正确的执行计划,然后绑定。