《Pro Oracle SQL》 读书笔记--Chapter 6--6.2 Execution Plans--之五

Using Plan Information for Solving Problems 使用计划信息解决问题 (Page 180)
     Now that you know how to access the various bits of information, what do you do with them?  The plan information, particularly the plan statistics, helps you confirm how the plan is performing.  You can use the information to determine if there are any trouble spots, so you can then adjust the way the SQL is written, add or modify indexes, or even use the data to support a need to update statistics or adjust instance parameter settings. 
     既然你知道如何访问各种信息了,那你用它们能做什么?计划信息,特别是计划统计,帮助你确认计划是如何执行的。你能用这些信息定位问题点,这样你就能调整 你的SQL(书写形式),加或者修改索引,或者通过这些数据去支持需要更新统计信息或者调整实例参数设定(的想法)。
If, for example, there is a missing or sub-optimal index, you can see that in the plan.  Listing 6-14
shows two examples: one shows how to determine an index is missing and the other shows how to
determine if an index is sub-optimal.
    例如,如果某处缺少或者欠优化索引,你能从计划中看出来。列表6-14展示两个例子:一个展示怎么确定缺少一个索引,另一个展示是否一个索引是欠优化的。
Listing 6-14. Using Plan Information to Determine Missing and Sub-Optimal Indexes
列表6-14 使用计划信息确定缺失和欠优化的索引
SQL> -- Example 1: sub-optimal index
SQL>
SQL> select /* KM1 */ job_id, department_id, last_name
  2  from employees
  3  where job_id = 'SA_REP'
  4  and department_id is null ;
 
JOB_ID       DEPARTMENT_ID LAST_NAME
---------- --------------- -------------------------
SA_REP     .               Grant
SQL>
SQL> @pln KM1
 
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
SQL_ID  cdqaq2k8dvvma, child number 0
-------------------------------------
select /* KM1 */ job_id, department_id,
last_name from employees where job_id = 'SA_REP' and department_id is
null

Plan hash value: 1019430118save
 
---------------------------------------------------------------------------------------
| Id  | Operation                                             | Name              | Starts | E-Rows | A-Rows  | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                        |      1  |                |      1       |       4   |
|*  1 |   TABLE ACCESS BY INDEX ROWID      | EMPLOYEES    |      1  |      1        |      1       |       4   |
|*  2 |    INDEX RANGE SCAN                         | EMP_JOB_IX   |      1  |     30        |     30       |       2   |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("DEPARTMENT_ID" IS NULL)
   2 - access("JOB_ID"='SA_REP')
 
SQL> 
SQL> create index emp_job_dept_ix on employees (department_id, job_id) compute statistics ;   (page181)
SQL>
SQL> select /* KM2 */ job_id, department_id, last_name
  2  from employees
  3  where job_id = 'SA_REP'
  4  and department_id is null ;
 
JOB_ID       DEPARTMENT_ID LAST_NAME
---------- --------------- -------------------------
SA_REP     .               Grant
SQL>
SQL> @pln KM2
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
SQL_ID  b4wnf48g9pgzy, child number 0
-------------------------------------
select /* KM2 */ job_id, department_id, last_name from employees where
job_id = 'SA_REP' and department_id is null
 
Plan hash value: 798439539
 --------------------------------------------------------------------------------------------
| Id  | Operation                                           | Name                        | Starts  | E-Rows | A-Rows | Buffers |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                                   |      1   |               |      1      |       2   |
|   1 |  TABLE ACCESS BY INDEX ROWID     | EMPLOYEES             |      1   |      1       |      1      |       2   |
|*  2 |   INDEX RANGE SCAN                       | EMP_JOB_DEPT_IX  |      1   |      1       |      1      |       1   |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("DEPARTMENT_ID" IS NULL AND "JOB_ID"='SA_REP')
       filter("JOB_ID"='SA_REP')
 
SQL> -- Example 2: missing index
SQL>
SQL> select /* KM3 */ last_name, phone_number
  2  from employees
  3  where phone_number = '650.507.9822';                                                  (page182)
LAST_NAME                 PHONE_NUMBER
------------------------- --------------------
Feeney                    650.507.9822
SQL>
SQL> @pln KM3
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------
SQL_ID  8vzwg0vkrjp8r, child number 0
-------------------------------------
select /* KM3 */ last_name, phone_number from employees where
phone_number = '650.507.9822'
 
Plan hash value: 1445457117
 
----------------------------------------------------------------------------
| Id  | Operation                       | Name             | Starts | E-Rows | A-Rows | Buffers |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                        |      1  |               |      1      |       7   |
|*  1 |  TABLE ACCESS FULL  | EMPLOYEES |      1  |      1        |      1     |       7   |
----------------------------------------------------------------------------
 Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("PHONE_NUMBER"='650.507.9822')
 
SQL> column column_name format a22 heading 'Column Name'
SQL> column index_name heading 'Index Name'
SQL> column column_position format 999999999 heading 'Pos#'
SQL> column descend format a5 heading 'Order'
SQL> column column_expression format a40 heading 'Expression'
SQL>
SQL> break on index_name skip 1
SQL>
SQL> -- Check current indexes
SQL>
SQL> select  lower(b.index_name) index_name, b.column_position,
  2             b.descend, lower(b.column_name) column_name
  3  from       all_ind_columns b
  4  where      b.table_owner = 'HR'                                                         (page183)
  5  and        b.table_name = 'EMPLOYEES'
  6  order by b.index_name, b.column_position, b.column_name
  7  /
 
Index Name                           Pos#    Order   Column Name
------------------------------    ----------   --------   ------------------
emp_department_ix                 1        ASC     department_id
 
emp_email_uk                         1         ASC    email
 
emp_emp_id_pk                      1         ASC     employee_id
 
emp_job_dept_ix                     1         ASC     department_id
                                                2         ASC      job_id
 
emp_job_ix                              1         ASC      job_id
 
emp_manager_ix                    1          ASC     manager_id
 
emp_name_ix                         1           ASC    last_name
                                              2           ASC     first_name

SQL> -- Create new index on phone_number
SQL>
SQL> create index emp_phone_ix on employees (phone_number) compute statistics ;
SQL>
SQL> select /* KM4 */ last_name, phone_number
  2  from employees
  3  where phone_number = '650.507.9822';
 
LAST_NAME                 PHONE_NUMBER
------------------------- --------------------
Feeney                    650.507.9822
SQL>
SQL> @pln KM4
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
SQL_ID  3tcqa5jqsyzm0, child number 0
-------------------------------------
select /* KM4 */ last_name, phone_number from employees where
phone_number = '650.507.9822'                                                                 (page184)   
Plan hash value: 1086981517
 
-----------------------------------------------------------------------------------------
| Id  | Operation                                          | Name                   | Starts  | E-Rows  | A-Rows | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                              |      1   |                |      1       |       3   |
|   1 |  TABLE ACCESS BY INDEX ROWID | EMPLOYEES        |      1   |      1        |      1       |       3   |
|*  2 |   INDEX RANGE SCAN                      | EMP_PHONE_IX   |      1   |      1        |      1      |       2   |
-----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("PHONE_NUMBER"='650.507.9822')        
 
    In each of these examples, there are two keys to look for.  I’ve made these examples short and
simple to keep the output easy to view, but regardless of how complex the plan is, the way to spot a
missing or sub-optimal index is to look for 1) a TABLE ACCESS FULL operation with a filter predicate that shows a small  A-Rows value (i.e. small as in comparison to the total rows in the table) and 2) an index scan operation with a large A-Rows value as compared to the parent TABLE ACCESS BY INDEX ROWID A-Rows value.
    在每个例子中,有两个关键点需要注意。我已经使得这些例子短小精悍,让输出容易看,但是不论多复杂的计划,查看缺失和欠调优的索引的方法,1)一个带有过 滤谓词的全表访问操作显示较小的A-Rows值(例如,小是较表的总行数而言),2)带有一个较大A-Rows值的索引扫描操作,(较大是)相比上一级 ROWID索引访问表(操作)的A-Rows值(而言)。
    In the first example, the index chosen is the EMP_JOB_IX index.  The predicate contains two
conditions: one for JOB_ID and one for DEPARTMENT_ID.  Since the index is only a single-column index on JOB_ID, the index operation will return all rowids for rows with a JOB = ‘SA_REP’ and then hand those rowids to the parent step.  The parent TABLE ACCESS BY INDEX ROWID step will retrieve all of these rows (there are 30) and then apply the filter condition for DEPARTMENT_ID IS NULL.  The end result is that only 1 row matches the entire predicate.  So, for this example, 97% of the rows were thrown away (29 out of 30).  
   在第一个例子中,所选择的索引是EMP_JOB_IX。谓词包含两个条件:一个是JOB_ID,一个是DEPARTMENT_ID。由于索引只是在 JOB_ID上的单列索引,索引操作将返回所有满足JOB='SA_REP'的行的row_id,然后将这些rowid提交个上一级(处理)。上一级操作 ROWID索引访问表将检索所有这些行(这里有30行),然后应用过滤条件DEPARTMENT_ID IS NULL。最终的结果是仅有一行匹配整个谓词。这样,对于这个例子而言,97%的行被抛弃(29/30)。
    This example is small, but imagine the additional overhead required if this query was accessing a
very large table.  By simply adding the DEPARTMENT_ID column to the index, the index scan operation
will be able to return rowids that match the entire predicate.  After adding the index, you can see how
both the index scan and the table access step both have an A-Rows value of 1.  In other words, the parent didn’t have to do any work to retrieve rows that it ultimately threw away.
    这个例子较小,但是可以想象,如果这个查询是访问一个非常大的表所需的附加开销(有多大)。通过简单的在DEPARTMENT_ID上加索引,索引扫描操 作将能返回匹配整个谓词的rowid。在添加索引之后,你能看到索引扫描和表访问步骤的A-Row值都为1了。换句话说,父操作不必做任何工作去检索最终将被丢弃的行。
    The second example, where no index exists on the column used in the predicate, the optimizer has
no other choice except to choose a full table scan operation.  But, as you can see from the A-Rows value,  only 1 row was returned.  Once again, the ill effects are minimal in this example since the table is so small.  However, you should always watch out for full table scans that return a very small number of rows.  The bigger the table becomes, the slower the query will become as more and more blocks have to be accessed to compare to the filter condition only to be thrown away.
    在第二个例子中,没有索引存在于谓词使用的列上,优化器除了使用全表扫描外没有其他的选择。但是,你能从A-Rows值看出只有一行返回了。再一次,在这 个例子中病症是最小的,因为表很小。然而,你应该总是关注只返回小量行数的全表扫描。随着表变得越大,查询将变得更慢,因为越来越多的块必须访问,而在筛 选后仅仅被丢弃。
    In both cases, the main thing to watch out for is excess throw away.  The more blocks that have to be accessed to check filter conditions on rows that will ultimately not be included in the result set, the
poorer the performance will become.  You may not even notice it if data volume is low in the
beginning, but the larger the tables become, the more effect accessing unneeded blocks will have on
response time.
    上述两个例子,主要的问题是关注过量的丢弃。必须访问的块越多,用于在行级上检查过滤条件而最终没有包含在结果集中,性能将变的越差。开始你可能没有注意到,如果数据量小的话,但是表变得越大,访问不必要的块的效果将在响应时间上(凸显出来)。
    Another way plan information can help you is by making it easy to spot when statistics might be
out of date.  Listing 6-15 shows an example of how plan information can point out stale statistics.
    解释计划的帮助你的另一种方式是:当统计信息变得过时,容易被察觉。列表6-15展示了一个从计划信息中如何指出陈旧的统计信息的例子。
(Page 185)
Listing 6-15. Using Plan Information to Determine When Statistics May Be Out of Date
SQL> -- Check current column statistics (collected at 100%)
SQL>
SQL> select column_name, num_distinct, density
  2  from user_tab_cols
  3  where table_name = 'MY_OBJECTS' ;
 
Column Name               NUM_DISTINCT                      DENSITY
-------------------               ----------------------                 ---------------
OWNER                            29                                   .03448275862069
OBJECT_NAME                44245                             .00002260142389
SUBOBJECT_NAME         161                                  .00621118012422
OBJECT_ID                     72588                              .00001377638177
DATA_OBJECT_ID          7748                                .00012906556531
OBJECT_TYPE                 44                                    .02272727272727
CREATED                       1418                                 .00070521861777
LAST_DDL_TIME            1480                                 .00067567567568
TIMESTAMP                   1552                                  .00064432989691
STATUS                            1                                       1
TEMPORARY                   2                                      .5
GENERATED                     2                                     .5
SECONDARY                  2                                        .5
NAMESPACE                  21                                      .04761904761905
EDITION_NAME               0                                        0  
 
SQL> -- Execute query for object_type = 'TABLE'
SQL> 
SQL> select /* KM7 */ object_id, object_name
  2  from my_objects
  3* where object_type = 'TABLE';
...
365056 rows selected.
 
SQL> @pln KM7
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
SQL_ID  7xphu2p2m9hdr, child number 0
-------------------------------------
select /* KM7 */ object_id, object_name from my_objects where
object_type = 'TABLE'
 (page 186)
Plan hash value: 2785906523
 
-------------------------------------------------------------------------------------------
| Id  | Operation                                            | Name                      | Starts | E-Rows | A-Rows | Buffers |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                                 |      1   |               |    365K  |   55697 |
|   1 |  TABLE ACCESS BY INDEX ROWID     | MY_OBJECTS           |      1   |   1650    |    365K  |   55697 |
|*  2 |   INDEX RANGE SCAN                      | OBJECT_TYPE_IX      |      1   |   1650    |    365K  |   26588 |
-------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("OBJECT_TYPE"='TABLE')
 
SQL> -- Compare statistic to actual
SQL>
SQL> select num_rows
  2  from dba_tables
  3  where table_name = 'MY_OBJECTS';
 
       NUM_ROWS
---------------
          72588
 
1 row selected.
SQL> select count(*)
  2  from my_objects ;
 
       COUNT(*)
---------------
         434792
 
1 row selected.
 
SQL> -- Update statistics
SQL>
SQL> exec dbms_stats.gather_table_stats(user,'MY_OBJECTS',estimate_percent=>100,
cascade=>true,method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
(page 187)
SQL> select /* KM8 */ object_id, object_name
  2  from my_objects
  3* where object_type = 'TABLE';
...
365056 rows selected.
 
SQL> @pln KM8
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
SQL_ID  2qq7ram92zc85, child number 0
-------------------------------------
select /* KM8 */ object_id, object_name from my_objects where
object_type = 'TABLE'
 
Plan hash value: 2785906523
 
-------------------------------------------------------------------------------------------
| Id  | Operation                                               | Name                      | Starts | E-Rows | A-Rows | Buffers |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                                 |      1   |               |    365K  |   54553 |
|   1 |  TABLE ACCESS BY INDEX ROWID      | MY_OBJECTS            |      1   |   9882    |    365K  |   54553 |
|*  2 |   INDEX RANGE SCAN                         | OBJECT_TYPE_IX     |      1   |   9882    |    365K  |   25444 |
------------------------------------------------------------------------------------------- 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("OBJECT_TYPE"='TABLE')   
 
SQL> -- Collect histogram statistics
SQL>
SQL> exec dbms_stats.gather_table_stats(user,'MY_OBJECTS',estimate_percent=>100,
cascade=>true,method_opt=>'FOR ALL COLUMNS SIZE AUTO');
 
PL/SQL procedure successfully completed.
 
SQL> select /* KM9 */ object_id, object_name
  2  from my_objects
  3* where object_type = 'TABLE';
...
365056 rows selected.
 (page 188)
SQL> @pln KM9
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------
SQL_ID  dbvrtvutuyp6z, child number 0
-------------------------------------
select /* KM9 */ object_id, object_name from my_objects where
object_type = 'TABLE'
 
Plan hash value: 880823944
 
-----------------------------------------------------------------------------
| Id  | Operation                            | Name               | Starts | E-Rows | A-Rows | Buffers |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                          |      1   |              |    365K  |   30000 |
|*  1 |  TABLE ACCESS FULL     | MY_OBJECTS |      1   |    365K  |    365K  |   30000 |
-----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("OBJECT_TYPE"='TABLE')
    In this example, the optimizer initially computed that only 1650 rows would be returned by the
query for OBJECT_TYPE = ‘TABLE’.  This was due to the fact that the statistics had been computed prior to the addition of a few hundred thousand rows.  When the plan was chosen, the optimizer didn’t have the updated information and it selected a plan using an index scan on the object_type index based on the old statistics.  However, in reality, there were over 474,000 total rows in the table and over 365,000 of them matched the filter criteria.  So, you collected statistics and executed the query again.  This time the estimate went up to 9882 rows, but that’s still an incorrect estimate as compared to the actual rows returned. 
    在这个例子中,通过查询OBJECT_TYPE='TABLE',优化器最开始计算只有1650将返回。事实上统计是在几十万行数据加入之前计算的。当计 划选定后,优化器没有最新的信息且它基于过时的统计信息选择使用在object_type上的索引扫描的执行计划。然而,实际上,在表中共有 474,000行且它们中有超过365,000行匹配筛选标准。当你收集统计信息,再次执行查询。这次估计值上升到9882行,但是相比于实际返回的行还 是不正确的估计值。
    What happened?  You collected fresh statistics and even used a 100% estimate, so everything
should be correct, right?  Well, the problem was that you didn’t collect histogram statistics that would
tell the optimizer about the heavy skew in the distribution of values of the object_type column.  You
needed to use a method_opt parameter that would collect histograms.  So, you did the collection again and this time used method_opt=>’FOR ALL COLUMNS SIZE AUTO’.  This setting allows Oracle to properly collect a histogram on the object_type column.  Now when you executed the query, the estimate is right on target and you get full table scan plan instead.  In this case, the full scan operation is the best choice as the query returns nearly 80% of all the rows in the table and a full scan will access fewer blocks than an index scan plan would.
   发生了什么?你收集新的统计信息,甚至使用了100%的估计,这样一切应该是正确的,不是么?问题在于你没有收集柱状统计,那将告诉优化器 object_type列的值的分布已经严重偏离。你应该再次收集(统计信息),这次使用method_opt=>'FOR ALL COLUMNS SIZE AUTO'。这个设定允许Oracle适当的收集在objct_type列上的柱状统计。现在你执行查询,估计值就正中目标了,且你将得到使用全表扫描的 执行计划。在这个示例中,全扫描操作是最佳选择,因为查询返回了整个表中近80%的行,且全表扫描相比索引扫描将访问更少的块。

Summary  总结
    There is a wealth of information contained in plan output for every SQL statement.  In this chapter,
you have reviewed how plan output can be obtained using EXPLAIN PLAN to get only estimated
information or obtained after executing the statement and extracting the plan information from the
library cache using DBMS_XPLAN.  At times, you may only be able to use EXPLAIN PLAN output, particularly if a query is very long-running and it is not easy or possible to wait to execute the query and get its actual execution data.  However, in order to have the best information possible from which to make decisions about indexing, query syntax changes, or the need to update statistics or parameter settings, the use of actual plan execution statistics is the way to go.
    每条SQL语句的计划输出中都包含有价值的信息。在本章中,你学习了怎么通过使用EXPLAIN PLAN获得计划输出的评估信息,或者在语句执行之后,使用DBMS_XPLAN从库缓存中提取(实际的)计划信息。很多时候,可能你只能使用 EXPLAIN PLAN输出,特别是如果一个查询非常的长且它不容易或可能等查询执行(完再去)取得它的实际执行计划。然而为了尽可能的获得最佳的信息,从中做出关于索 引,查询语法改变,或者需要更新统计或参数设定的判断,使用实际的执行计划是必然选择。
    I covered some of the ways you can use plan information to help diagnose and solve performance
problems for a SQL statement.  By carefully reviewing plan output, you can uncover sub-optimal or
missing indexes and determine if statistics are stale and need to be updated.  Utilizing the knowledge
you’ve gained about the various plan operations for accessing and joining data and understanding
how to read and effectively use plan information, you are equipped to not only solve problems quickly
and efficiently when they arise, but to verify the characteristics and performance footprint of any SQL
statement so that you can write well-behaved SQL from the start.
    我讲解了一些方法,使用计划信息帮助诊断和解决SQL语句的性能问题。通过仔细的检查计划输出,你能发现欠优化的或缺失的索引,判断是否统计是否是过时 的,需要更新。利用你所学的各种计划操作(的知识),访问、连接数据,理解如何阅读和有效的使用计划信息,你的装备不仅能快而有效的解决发生的问题,而且 能确认任何SQL语句的特性以及性能“足迹”,这样一开始你就能写出性能优良的SQL语句。

你可能感兴趣的:(oracle,sql,Pro Oracle SQL,Apress)