A Guide to Using the Oracle 10g Automatic Database Diagnostic Monitor (ADDM)

Chapter 1: Introduction to ADDM

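In Oracle9i and earlier, DBAs already had a number of handy performance analysis tools, such as tkprof, sql_trace, statspack, and the 10046 and 10053 trace events. These tools help a DBA locate a performance problem quickly. However, they only report statistics; it is then up to the DBA to work out the tuning from experience.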

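Could the machine itself produce tuning advice from those statistics? Oracle 10g introduces new diagnostic and tuning tools for exactly this: the Automatic Database Diagnostic Monitor (ADDM) and the SQL Tuning Advisor (STA). Used together, they save the DBA a great deal of tuning time and considerably reduce the risk of system downtime. Put simply, the relevant statistics are collected into the Automatic Workload Repository (AWR); ADDM analyzes that data to locate problems, and STA gives tuning advice for the problem statements. For example, suppose a system is short of resources and shows an obvious performance problem. The old approach is to take a statspack snapshot, wait 30 minutes, and take another. Reading the report, you find the 'db file scattered read' event among the top 5 events. Experience says this event is usually caused by a missing index, stale optimizer statistics, or I/O contention from hot tables sharing one data file. You then have to check these causes one by one, for instance by looking at the execution plan of the statement, checking the last_analyzed column of user_tables, and looking for hot blocks, before you can pin down the cause and propose a fix. With ADDM and STA, the advice is produced directly from the collected data, sometimes down to a rewritten statement (there goes the DBA's job!).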
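
As a rough sketch of the manual checks described above (not a complete diagnosis procedure), the two queries below look at the system-wide figures for the wait event and at how fresh the optimizer statistics are. They assume a session with access to the V$ views and to the schema that owns the tables in question.

```sql
-- How much time has the instance spent on multiblock reads since startup?
-- TIME_WAITED is reported in hundredths of a second.
SELECT event, total_waits, time_waited
  FROM v$system_event
 WHERE event = 'db file scattered read';

-- Tables with missing or oldest optimizer statistics come first.
SELECT table_name, num_rows, last_analyzed
  FROM user_tables
 ORDER BY last_analyzed NULLS FIRST;
```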

The problems ADDM can detect and locate include:

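·         Operating system memory paging (page-in/page-out) problems
·         CPU bottlenecks caused by Oracle and non-Oracle workloads
·         Top SQL statements and objects responsible for different kinds of resource load: CPU consumption, I/O bandwidth usage, potential I/O problems, busy RAC interconnect traffic
·         Top SQL statements by PL/SQL and Java execution time
·         Excessive connection management (login/logoff)
·         Excessive hard parsing, caused by a shared pool that is too small, literal SQL, bind size mismatches, or failed parses
·         Excessive soft parsing
·         Excessive index lookups leading to resource contention
·         Excessive wait time caused by user locks (locks taken through the dbms_lock package)
·         Excessive wait time caused by DML locks (for example, a locked table)
·         Excessive wait time caused by pipe puts (for example, writing to a pipe through the dbms_pipe package)
·         Excessive wait time caused by concurrent updates to the same row (row-level lock waits)
·         Excessive wait time caused by a shortage of ITL slots (many transactions working on the same data block)
·         Excessive commits and rollbacks (the log file sync event)
·         I/O performance problems caused by limited disk bandwidth and other potential issues (for example, excessive checkpointing because the log files are too small, MTTR settings, excessive undo activity, and so on)
·         Insufficient disk I/O throughput for DBWR writing data blocks
·         System slowdown because the archiver processes cannot keep up with the rate of redo generation
·         Problems caused by redo log files that are too small
·         Contention caused by extent allocation on disk
·         Contention caused by moving an object's high-water mark
·         Undersized memory: SGA Target, PGA, Buffer Cache, Shared Pool
·         Hot blocks with heavy read/write contention in an instance or a cluster
·         Hot objects with heavy read/write contention in an instance or a cluster
·         Interconnect problems in a RAC environment
·         Lock requests blocked because the LMS processes cannot keep up
·         Instance skew in a RAC environment caused by blocking and contention
·         I/O and CPU problems caused by RMAN
·         Streams and AQ problems
·         Resource manager wait events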
 

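One thing to keep in mind: the data AWR collects is held in memory (in the shared pool) and written to disk periodically by a new background process, MMON. 10g therefore needs a larger shared pool than earlier releases; 15-20% larger is the usual recommendation. In addition, the system parameter STATISTICS_LEVEL must be set to TYPICAL (recommended) or ALL. Because AWR collection depends on the instance-wide value, set it at the system level:

ALTER SYSTEM SET STATISTICS_LEVEL = TYPICAL;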
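
Before relying on AWR data it can help to confirm the current settings. The queries below are a minimal sketch, assuming a user with access to V$PARAMETER and the DBA_HIST views; they show the instance-wide STATISTICS_LEVEL and the snapshot interval and retention AWR is currently using.

```sql
-- Instance-wide value of the parameter AWR depends on.
SELECT name, value
  FROM v$parameter
 WHERE name = 'statistics_level';

-- How often automatic snapshots are taken and how long they are kept.
SELECT snap_interval, retention
  FROM dba_hist_wr_control;
```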

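Chapter 2: Workload Collection and Diagnosis

Oracle 10g provides a graphical interface for this tool through OEM, which makes it very easy to use. This chapter shows how to drive it from sqlplus instead. Nothing has to be installed to use it.

Step 1: Create the test tables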

SQL> CREATE TABLE bigtab AS SELECT rownum as "id", a.* FROM dba_objects a;
Table created.

SQL> create table smalltab as select rownum as "id", a.* FROM dba_tables a;
Table created.

DECLARE
  n NUMBER;
BEGIN
  FOR n IN 1 .. 100 LOOP
    INSERT INTO bigtab
      SELECT rownum as "id", a.* FROM dba_objects a;
    COMMIT;
  END LOOP;
END;
/
PL/SQL procedure successfully completed.

Step 2: Take a workload snapshot

SQL> begin
  2   dbms_workload_repository.create_snapshot('TYPICAL');
  3  end;
  4  /
 
PL/SQL procedure successfully completed.
 

Step 3: Run some heavy operations

DECLARE
  v_var number;
BEGIN
  FOR n IN 1 .. 6 LOOP
    select count(*) into v_var from bigtab b, smalltab a;
  END LOOP;
END;
/
PL/SQL procedure successfully completed.
 

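Step 4: Take another workload snapshot

Note that the two snapshots must be far enough apart, with enough database activity in between (around 30 minutes is the usual recommendation). Otherwise the ADDM report will say: THERE WAS NOT ENOUGH DATABASE TIME FOR ADDM ANALYSIS.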

SQL> begin
  2   dbms_workload_repository.create_snapshot('TYPICAL');
  3  end;
  4  /
 
PL/SQL procedure successfully completed.
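
To check how far apart the snapshots actually are, the interval columns of DBA_HIST_SNAPSHOT can be listed next to the IDs. This is only a sketch; it assumes a single-instance database, so no filter on instance number is applied.

```sql
-- The two most recent snapshots and the interval each one covers.
SELECT snap_id, begin_interval_time, end_interval_time
  FROM (SELECT snap_id, begin_interval_time, end_interval_time
          FROM dba_hist_snapshot
         ORDER BY snap_id DESC)
 WHERE rownum <= 2;
```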
 

Step 5: Create and run a diagnostic task

First get the IDs of the two snapshots:

SQL> select snap_id from
  2  (SELECT * FROM dba_hist_snapshot
  3  ORDER BY snap_id desc)
  4  where rownum <=2;
 
 SNAP_ID
--------
      66
      65
 

Then create the task and execute it.

 DECLARE
  task_name VARCHAR2(30) := 'DEMO_ADDM01';
  task_desc VARCHAR2(30) := 'ADDM Feature Test';
  task_id   NUMBER;
BEGIN
  dbms_advisor.create_task('ADDM', task_id, task_name, task_desc, null);
  dbms_advisor.set_task_parameter(task_name, 'START_SNAPSHOT', 65);
  dbms_advisor.set_task_parameter(task_name, 'END_SNAPSHOT', 66);
  dbms_advisor.set_task_parameter(task_name, 'INSTANCE', 1);
  dbms_advisor.set_task_parameter(task_name, 'DB_ID', 1712582900);
  dbms_advisor.execute_task(task_name);
END;
/

PL/SQL procedure successfully completed.
 

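Here set_task_parameter sets the task parameters. START_SNAPSHOT is the ID of the starting snapshot and END_SNAPSHOT the ID of the ending snapshot. INSTANCE is the instance number: usually 1 for a single instance, and in a RAC environment it can be found by querying the v$instance view. DB_ID is the unique identifier of the database and can be found by querying v$database.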
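
The two lookups mentioned above can be written as follows. The values returned on your system replace the 1712582900 and 1 used in this article's example.

```sql
-- DB_ID parameter: the unique identifier of the database.
SELECT dbid, name FROM v$database;

-- INSTANCE parameter: the number of the instance you are connected to.
SELECT instance_number, instance_name FROM v$instance;
```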

Step 6: View the recommendations

The function dbms_advisor.get_task_report returns the recommendations.

 
SQL> SET LONG 1000000 PAGESIZE 0 LONGCHUNKSIZE 1000
SQL> COLUMN get_clob FORMAT a80
SQL> SELECT dbms_advisor.get_task_report('DEMO_ADDM01', 'TEXT', 'ALL') FROM DUAL;
 
DBMS_ADVISOR.GET_TASK_REPORT('
--------------------------------------------------------------------------------
          DETAILED ADDM REPORT FOR TASK 'DEMO_ADDM01' WITH ID 243
          -------------------------------------------------------
 
              Analysis Period: 23-NOV-2005 from 15:02:27 to 16:06:42
         Database ID/Instance: 1712582900/1
      Database/Instance Names: EDGAR/edgar
                    Host Name: HUANGED
             Database Version: 10.2.0.1.0
               Snapshot Range: from 65 to 66
                Database Time: 1463 seconds
        Average Database Load: .4 active sessions
 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 
FINDING 1: 100% impact (1463 seconds)
-------------------------------------
Significant virtual memory paging was detected on the host operating system.
 
   RECOMMENDATION 1: Host Configuration, 100% benefit (1463 seconds)
      ACTION: Host operating system was experiencing significant paging but no
         particular root cause could be detected. Investigate processes that
         do not belong to this instance running on the host that are consuming
         significant amount of virtual memory. Also consider adding more
         physical memory to the host.
 
FINDING 2: 100% impact (1463 seconds)
-------------------------------------
SQL statements consuming significant database time were found.
 
   RECOMMENDATION 1: SQL Tuning, 68% benefit (998 seconds)
      ACTION: Tune the PL/SQL block with SQL_ID "064wqx7c5b81z". Refer to the
         "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
         and Reference"
         RELEVANT OBJECT: SQL statement with SQL_ID 064wqx7c5b81z
         DECLARE
         v_var number;
         BEGIN
         FOR n IN 1..10000
         LOOP
         select count(*) into v_var from bigtab b, smalltab a;
         END LOOP;
         END;
 
   RECOMMENDATION 2: SQL Tuning, 67% benefit (986 seconds)
      ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
         "fvqfghq71cqns".
         RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
         PLAN_HASH 3281046854
         SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
      RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
         times and had an average elapsed time of 166 seconds.
 
FINDING 3: 69% impact (1002 seconds)
------------------------------------
Time spent on the CPU by the instance was responsible for a substantial part
of database time.
 
   RECOMMENDATION 1: SQL Tuning, 67% benefit (986 seconds)
      ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
         "fvqfghq71cqns".
         RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
         PLAN_HASH 3281046854
         SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
      RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
         times and had an average elapsed time of 166 seconds.
      RATIONALE: Average CPU used per execution was 162 seconds.
 
   RECOMMENDATION 2: SQL Tuning, 2.1% benefit (30 seconds)
      ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
         "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
         and Reference"
         RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
         BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
      RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
         times and had an average elapsed time of 0.26 seconds.
      RATIONALE: Average CPU used per execution was 0.24 seconds.
 
FINDING 4: 2.2% impact (33 seconds)
-----------------------------------
PL/SQL execution consumed significant database time.
 
   RECOMMENDATION 1: SQL Tuning, 2.2% benefit (33 seconds)
      ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
         "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
         and Reference"
         RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
         BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
      RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
         times and had an average elapsed time of 0.26 seconds.
      RATIONALE: Average time spent in PL/SQL execution was 0.26 seconds.
 
 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
          ADDITIONAL INFORMATION
          ----------------------
 
Wait class "Application" was not consuming significant database time.
Wait class "Commit" was not consuming significant database time.
Wait class "Concurrency" was not consuming significant database time.
Wait class "Configuration" was not consuming significant database time.
Wait class "Network" was not consuming significant database time.
Wait class "User I/O" was not consuming significant database time.
Session connect and disconnect calls were not consuming significant database
time.
Hard parsing of SQL statements was not consuming significant database time.
 
The analysis of I/O performance is based on the default assumption that the
average read time for one database block is 10000 micro-seconds.
 
 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
          TERMINOLOGY
          -----------
 
DATABASE TIME: This is the ADDM's measurement of throughput. From the user's
   point of view: this is the total amount of time spent by users waiting for
   a response from the database after issuing a call (not including
   networking). From the database instance point of view: this is the total
   time spent by forground processes waiting for a database resource (e.g.,
   read I/O), running on the CPU and waiting for a free CPU (run-queue). The
   target of ADDM analysis is to reduce this metric as much as possible,
   thereby reducing the instance's response time.
 
AVERAGE DATABASE LOAD: At any given time we can count how many users (also
   called 'Active Sessions') are waiting for an answer from the instance. This
   is the ADDM's measurement for instance load. The 'Average Database Load' is
   the average of the the load measurement taken over the entire analysis
   period. We get this number by dividing the 'Database Time' by the analysis
   period. For example, if the analysis period is 30 minutes and the 'Database
   Time' is 90 minutes, we have an average of 3 users waiting for a response.
 
IMPACT: Each finding has an 'Impact' associated with it. The impact is the
   portion of the 'Database Time' the finding deals with. If we assume that
   the problem described by the finding is completely solved, then the
   'Database Time' will be reduced by the amount of the 'Impact'.
 
BENEFIT: Each recommendation has a 'benefit' associated with it. The ADDM
   analysis estimates that the 'Database Time' can be reduced by the 'benefit'
   amount if all the actions of the recommendation are performed.
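
Besides the text report, the findings of a task can also be read from the advisor dictionary views. The sketch below assumes the task name DEMO_ADDM01 used in this article and a user with access to DBA_ADVISOR_FINDINGS; the second block removes the task once its results are no longer needed.

```sql
-- One row per finding, highest impact first.
-- IMPACT_TYPE describes the unit in which IMPACT is expressed.
SELECT finding_id, type, impact_type, impact, message
  FROM dba_advisor_findings
 WHERE task_name = 'DEMO_ADDM01'
 ORDER BY impact DESC;

-- Clean up the task when you are done with it.
BEGIN
  dbms_advisor.delete_task('DEMO_ADDM01');
END;
/
```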
 

Note:

Steps 5 and 6 can be done in one go by running $ORACLE_HOME/rdbms/admin/addmrpt.sql. The script runs much like the statspack report script:

SQL> @addmrpt
 
Current Instance
~~~~~~~~~~~~~~~~
 
   DB Id    DB Name      Inst Num Instance
----------- ------------ -------- ------------
 1712582900 EDGAR               1 edgar
 
 
Instances in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
   DB Id     Inst Num DB Name      Instance     Host
------------ -------- ------------ ------------ ------------
* 1712582900        1 EDGAR        edgar        HUANGED
 
Using 1712582900 for database Id
Using          1 for instance number
 
 
Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed.  Pressing <return> without
specifying a number lists all completed snapshots.
 
 
 
Listing the last 3 days of Completed Snapshots
 
                                                        Snap
Instance     DB Name        Snap Id    Snap Started    Level
------------ ------------ --------- ------------------ -----
edgar        EDGAR                7 22 Nov 2005 00:00      1
... ...
                                 64 23 Nov 2005 15:02      1
                                 65 23 Nov 2005 16:00      1
                                 66 23 Nov 2005 16:06      1
 
 
Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 65
Begin Snapshot Id specified: 65
 
Enter value for end_snap: 66
End   Snapshot Id specified: 66
 
 
 
Specify the Report Name
~~~~~~~~~~~~~~~~~~~~~~~
The default report file name is addmrpt_1_65_66.txt.  To use this name,
press <return> to continue, otherwise enter an alternative.
 
Enter value for report_name:
 
Using the report name addmrpt_1_65_66.txt
 
 
Running the ADDM analysis on the specified pair of snapshots ...
 
 
Generating the ADDM report for this analysis ...
 
... ...
 

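In a RAC environment you can also run $ORACLE_HOME/rdbms/admin/addmrpti.sql. This script additionally prompts for the DB ID and the instance number.

Chapter 3: Analyzing the Diagnostic Results

As the results above show, an ADDM report is very different from a Statspack report. A Statspack report gives statistics and events, and the DBA derives the tuning advice from them; an ADDM report already contains the tuning advice (gulp, DBAs will be out of work soon!).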

Part 1:

 Analysis Period: 23-NOV-2005 from 15:02:27 to 16:06:42
   Database ID/Instance: 1712582900/1
 Database/Instance Names: EDGAR/edgar
 Host Name: HUANGED
 Database Version: 10.2.0.1.0
 Snapshot Range: from 65 to 66
 Database Time: 1463 seconds
   Average Database Load: .4 active sessions
 

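This part contains basic information: the analysis period, the database and instance IDs and names, the host name, the Oracle version, the snapshot range, the database time consumed, and the average number of active sessions.

Part 2:

Next come the problems ADDM found, each with its recommendations. In our example there are four findings in total; they are explained one by one below. The first finding: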

FINDING 1: 100% impact (1463 seconds)
-------------------------------------
Significant virtual memory paging was detected on the host operating system.
 
 RECOMMENDATION 1: Host Configuration, 100% benefit (1463 seconds)
 ACTION: Host operating system was experiencing significant paging but no
 particular root cause could be detected. Investigate processes that
 do not belong to this instance running on the host that are consuming
 significant amount of virtual memory. Also consider adding more
 physical memory to the host.

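Start with the first line: 100% impact (1463 seconds). This is the time attributed to the problem and its impact on the system. The time is 1463 seconds, the same as the database time for the analysis period (1463 seconds, from Part 1), so the impact on the system is 1463/1463*100 = 100%.

Then the second line: Significant virtual memory paging was detected on the host operating system. This is ADDM's description of the problem: significant virtual memory paging was detected on the operating system.

Next is ADDM's recommendation and its expected effect: Host Configuration, 100% benefit (1463 seconds). In other words, change the host configuration, with an estimated benefit of 100%.

Last comes the concrete action: significant paging was detected on the host operating system, but no particular root cause for the frequent swapping could be found. Examine the processes outside this Oracle instance that consume large amounts of virtual memory, and consider adding physical memory to the host. A note on this result: my instance runs on my own PC alongside many other memory-hungry programs (Word and so on), so heavy paging is only to be expected.

Now the second finding: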

FINDING 2: 100% impact (1463 seconds)
-------------------------------------
SQL statements consuming significant database time were found.
 
 RECOMMENDATION 1: SQL Tuning, 68% benefit (998 seconds)
 ACTION: Tune the PL/SQL block with SQL_ID "064wqx7c5b81z". Refer to the
 "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
 and Reference"
 RELEVANT OBJECT: SQL statement with SQL_ID 064wqx7c5b81z
 DECLARE
 v_var number;
   BEGIN
 FOR n IN 1..10000
 LOOP
 select count(*) into v_var from bigtab b, smalltab a;
 END LOOP;
 END;

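ADDM found SQL statements consuming a large amount of database time, with an impact of 100%. The recommendation is SQL tuning, with an estimated benefit of 68%.

The action is to tune the PL/SQL block ADDM found, whose SQL_ID is "064wqx7c5b81z" (the text can be retrieved with select sql_text from v$sql where sql_id='064wqx7c5b81z';). For how to tune it, see the Tuning PL/SQL Applications chapter of the Oracle document PL/SQL User's Guide and Reference. The block listed under RELEVANT OBJECT is the test block we ran to generate the heavy load.

Below are the other problem statements ADDM found: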

FINDING 3: 69% impact (1002 seconds)
------------------------------------
Time spent on the CPU by the instance was responsible for a substantial part
of database time.
 
 RECOMMENDATION 1: SQL Tuning, 67% benefit (986 seconds)
 ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
 "fvqfghq71cqns".
 RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
 PLAN_HASH 3281046854
 SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
 RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
 times and had an average elapsed time of 166 seconds.
 RATIONALE: Average CPU used per execution was 162 seconds.
 
 RECOMMENDATION 2: SQL Tuning, 2.1% benefit (30 seconds)
 ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
 "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
 and Reference"
 RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
         BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
 RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
 times and had an average elapsed time of 0.26 seconds.
 RATIONALE: Average CPU used per execution was 0.24 seconds.

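This finding says that CPU time consumed by the instance accounted for a large share of database time. Two problem statements were found, so there are two recommendations.

The first recommendation is to tune our test statement, and it gives the rationale: the statement was executed 6 times with an average elapsed time of 166 seconds, and the average CPU time per execution was 162 seconds.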
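
The per-execution averages quoted in the rationale can be cross-checked against V$SQL while the cursor is still in the shared pool. This is a sketch only: V$SQL figures are cumulative since the cursor was loaded, so they will not necessarily match the snapshot period exactly.

```sql
-- ELAPSED_TIME and CPU_TIME are in microseconds; convert to seconds per execution.
SELECT sql_id,
       executions,
       ROUND(elapsed_time / executions / 1000000, 1) AS avg_elapsed_sec,
       ROUND(cpu_time     / executions / 1000000, 1) AS avg_cpu_sec
  FROM v$sql
 WHERE sql_id = 'fvqfghq71cqns'
   AND executions > 0;
```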

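The second recommendation is actually about a system procedure that reads queue information. It consumes few resources, so we can ignore it here.

Now the last finding: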

FINDING 4: 2.2% impact (33 seconds)
-----------------------------------
PL/SQL execution consumed significant database time.
 
 RECOMMENDATION 1: SQL Tuning, 2.2% benefit (33 seconds)
 ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
 "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
 and Reference"
 RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
 BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
 RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
 times and had an average elapsed time of 0.26 seconds.
 RATIONALE: Average time spent in PL/SQL execution was 0.26 seconds.

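In content, this finding is the same as the second recommendation of the previous finding, but it is reported from a different angle. Look at the description: PL/SQL execution consumed significant database time. The rationale shows that the block was executed 125 times, spending 0.26 seconds in PL/SQL on average each time. ADDM's checks are clearly quite thorough.

Part 3:

This part gives some additional information about the analysis: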

 ADDITIONAL INFORMATION
 ----------------------
 
Wait class "Application" was not consuming significant database time.
Wait class "Commit" was not consuming significant database time.
Wait class "Concurrency" was not consuming significant database time.
Wait class "Configuration" was not consuming significant database time.
Wait class "Network" was not consuming significant database time.
Wait class "User I/O" was not consuming significant database time.
Session connect and disconnect calls were not consuming significant database
time.
Hard parsing of SQL statements was not consuming significant database time.
 
The analysis of I/O performance is based on the default assumption that the
average read time for one database block is 10000 micro-seconds.

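This summarizes what the analysis found for each class of wait event: the Application, Commit, Concurrency, Configuration, Network, and User I/O wait classes did not consume significant database time; session connect and disconnect calls did not consume significant database time; hard parsing of SQL statements did not consume significant database time; and the I/O performance analysis is based on the default assumption that reading one database block takes 10000 microseconds on average. (Oracle 10g adds many new wait events, mostly by splitting up events that used to be vague, and groups all events into classes; see the V$SYSTEM_WAIT_CLASS view.)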
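
For a quick look at the wait classes themselves, V$SYSTEM_WAIT_CLASS can be queried directly. The query below is a minimal sketch; the figures are cumulative since instance startup rather than limited to the ADDM analysis period.

```sql
-- TIME_WAITED is in hundredths of a second; show whole seconds per wait class.
SELECT wait_class,
       total_waits,
       ROUND(time_waited / 100) AS seconds_waited
  FROM v$system_wait_class
 ORDER BY time_waited DESC;
```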

Part 4:

This part explains the terms used in the report:

 TERMINOLOGY
 -----------
 
DATABASE TIME: This is the ADDM's measurement of throughput. From the user's
 point of view: this is the total amount of time spent by users waiting for
 a response from the database after issuing a call (not including
 networking). From the database instance point of view: this is the total
 time spent by forground processes waiting for a database resource (e.g.,
 read I/O), running on the CPU and waiting for a free CPU (run-queue). The
 target of ADDM analysis is to reduce this metric as much as possible,
 thereby reducing the instance's response time.
 
AVERAGE DATABASE LOAD: At any given time we can count how many users (also
 called 'Active Sessions') are waiting for an answer from the instance. This
 is the ADDM's measurement for instance load. The 'Average Database Load' is
 the average of the the load measurement taken over the entire analysis
 period. We get this number by dividing the 'Database Time' by the analysis
 period. For example, if the analysis period is 30 minutes and the 'Database
 Time' is 90 minutes, we have an average of 3 users waiting for a response.
 
IMPACT: Each finding has an 'Impact' associated with it. The impact is the
 portion of the 'Database Time' the finding deals with. If we assume that
 the problem described by the finding is completely solved, then the
 'Database Time' will be reduced by the amount of the 'Impact'.
 
BENEFIT: Each recommendation has a 'benefit' associated with it. The ADDM
 analysis estimates that the 'Database Time' can be reduced by the 'benefit'
 amount if all the actions of the recommendation are performed.

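DATABASE TIME: ADDM's measure of throughput. From the user's point of view, it is the total time users spend waiting for a response after issuing a call to the database (excluding network time). From the database instance's point of view, it is the total time foreground processes spend waiting for a database resource (for example, read I/O), running on the CPU, and waiting for a free CPU (run queue). The goal of ADDM analysis is to reduce this number as much as possible, which reduces the instance's response time.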

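AVERAGE DATABASE LOAD: At any point in time we can count how many users (also called "active sessions") are waiting for a response from the instance. This is the measure of instance load. The average database load is that load averaged over the whole analysis period, obtained by dividing "Database Time" by the length of the analysis period. For example, if the analysis period is 30 minutes and the database time is 90 minutes, then on average 3 users were waiting for a response.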
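
The same division can be applied to the header of the sample report: 1463 seconds of database time over an analysis period from 15:02:27 to 16:06:42 (3855 seconds). The query below is just a worked example of that arithmetic; it should come out at roughly 0.4, in line with the "Average Database Load: .4 active sessions" line of the report.

```sql
-- Database time divided by the length of the analysis period, both in seconds.
-- Subtracting two DATE values gives days, so multiply by 86400.
SELECT ROUND(1463 /
             ((TO_DATE('16:06:42', 'HH24:MI:SS') -
               TO_DATE('15:02:27', 'HH24:MI:SS')) * 86400), 1) AS avg_db_load
  FROM dual;
```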

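IMPACT: Every finding has an "impact". The impact is the portion of database time that the finding accounts for. If the problem described by the finding were completely solved, database time would drop by the amount of the impact.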

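BENEFIT: Every recommendation has a "benefit". ADDM estimates that database time can be reduced by the full benefit amount if all the actions of the recommendation are carried out.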

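Chapter 4: Using STA to Tune Statements

ADDM has produced its diagnosis and recommendations. Typically about 90% of performance problems come from the application, and application problems nearly always trace back to problem statements. How do you tune those statements? It used to depend on the DBA's experience; now you can use STA.

Note: for more on using STA, see my other article, "Tuning Statements with the SQL Tuning Advisor (STA)".

STA is very easy to use. The ADDM report pointed to a statement that hurts system performance badly, so let's tune a statement like it:

Step 1: Create and run a tuning task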

SQL> DECLARE
 2 my_task_name VARCHAR2(30);
 3 my_sqltext CLOB;
 4 BEGIN
 5 my_sqltext := 'select a.table_name, b.object_id from bigtab b, smalltab a';
 6 
  7 my_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
 8 sql_text => my_sqltext,
 9 user_name => 'DEMO',
 10 scope => 'COMPREHENSIVE',
 11 time_limit => 60,
 12 task_name => 'TEST_sql_tuning_task',
 13 description => 'Task to tune a query on a specified PRODUCT');
 14 
 15 dbms_sqltune.Execute_tuning_task (task_name => 'TEST_sql_tuning_task');
 16 END;
 17 /
 
PL/SQL procedure successfully completed 

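DBMS_SQLTUNE.CREATE_TUNING_TASK is the function that creates the tuning task. Its parameters: sql_text is the statement to tune, user_name is the user the statement runs as, scope is the tuning scope (LIMITED or COMPREHENSIVE), time_limit is the time limit for the tuning run in seconds, task_name is the name of the tuning task, and description is its description.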

dbms_sqltune.Execute_tuning_task is the procedure that runs the tuning task.
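
Since the ADDM report identifies statements by SQL_ID, a tuning task can also be created from the SQL_ID instead of the statement text. The block below is a sketch under two assumptions: the cursor with SQL_ID fvqfghq71cqns from the ADDM report is still in the shared pool, and the task name ADDM_sql_tuning_task is a hypothetical name picked for this example.

```sql
DECLARE
  l_task VARCHAR2(30);
BEGIN
  -- Create the task from a cursor that is still cached, identified by its SQL_ID.
  l_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(
              sql_id     => 'fvqfghq71cqns',
              scope      => 'COMPREHENSIVE',
              time_limit => 60,
              task_name  => 'ADDM_sql_tuning_task');  -- hypothetical task name
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task);
END;
/

-- Check that the task has finished before asking for the report.
SELECT task_name, status
  FROM dba_advisor_log
 WHERE task_name = 'ADDM_sql_tuning_task';
```

When the results are no longer needed, the task can be removed with DBMS_SQLTUNE.DROP_TUNING_TASK.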

Step 2: View the tuning recommendations

SQL> set long 10000
SQL> set longchunksize 1000
Cannot SET LONGCHUNKSIZE
 
SQL> set linesize 100
SQL> SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('TEST_sql_tuning_task') FROM DUAL;
 
DBMS_SQLTUNE.REPORT_TUNING_TAS
--------------------------------------------------------------------------------
GENERAL INFORMATION SECTION
-------------------------------------------------------------------------------
Tuning Task Name : TEST_sql_tuning_task
Tuning Task Owner : DEMO
Scope : COMPREHENSIVE
Time Limit(seconds)               : 60
Completion Status : COMPLETED
Started at : 11/24/2005 14:46:23
Completed at : 11/24/2005 14:46:24
Number of SQL Restructure Findings: 1
 
-------------------------------------------------------------------------------
Schema Name: DEMO
SQL ID : f5k29d73zdpsd
SQL Text : select a.table_name, b.object_id from bigtab b, smalltab a
 
-------------------------------------------------------------------------------
FINDINGS SECTION (1 finding)
-------------------------------------------------------------------------------
 
1- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
 An expensive cartesian product operation was found at line ID 1 of the
 execution plan.
 
 Recommendation
 --------------
 - Consider removing the disconnected table or view from this statement or
 add a join condition which refers to it.
 
 Rationale
 ---------
 A cartesian product should be avoided whenever possible because it is an
  expensive operation and might produce a large amount of data.
 
-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------
 
1- Original
-----------
Plan hash value: 3479921507
 
--------------------------------------------------------------------------------
| Id | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1474M|    32G|  4316K  (2)| 14:23:21
| 1 | MERGE JOIN CARTESIAN|          |  1474M|    32G|  4316K  (2)| 14:23:21
| 2 | TABLE ACCESS FULL | SMALLTAB |  1223 | 23237 |    11   (0)| 00:00:01
| 3 | BUFFER SORT | | 1205K|  5887K|  4316K  (2)| 14:23:21
| 4 | TABLE ACCESS FULL | BIGTAB | 1205K|  5887K|  3530   (2)| 00:00:43
--------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------
 

The tuning report has three parts. Each is described briefly below.

Part 1:

As in every Oracle report, the first part is always the basic information.

GENERAL INFORMATION SECTION
-------------------------------------------------------------------------------
Tuning Task Name : TEST_sql_tuning_task
Tuning Task Owner : DEMO
Scope : COMPREHENSIVE
Time Limit(seconds)               : 60
Completion Status : COMPLETED
Started at   : 11/24/2005 14:46:23
Completed at : 11/24/2005 14:46:24
Number of SQL Restructure Findings: 1
 
-------------------------------------------------------------------------------
Schema Name: DEMO
SQL ID : f5k29d73zdpsd
SQL Text : select a.table_name, b.object_id from bigtab b, smalltab a

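This information includes: task name, task owner, scope, time limit (these first items are simply the parameters we specified when creating the task), completion status, start time, completion time, and the number of SQL restructure findings. Then come the schema name, the SQL ID, and the SQL text.

Part 2:

This part lists what the advisor found for the statement. In our example there is, of course, only one finding: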

-------------------------------------------------------------------------------
FINDINGS SECTION (1 finding)
-------------------------------------------------------------------------------
 
1- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
 An expensive cartesian product operation was found at line ID 1 of the
 execution plan.
 
 Recommendation
 --------------
 - Consider removing the disconnected table or view from this statement or
 add a join condition which refers to it.
 
 Rationale
 ---------
 A cartesian product should be avoided whenever possible because it is an
 expensive operation and might produce a large amount of data.

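First is the problem found in the execution plan: an expensive Cartesian product operation at line ID 1 of the plan (obviously so, since our statement joins two tables without any join condition, which is exactly a Cartesian product);

Next is the recommendation: consider removing the table or view that does not need to be joined, or add a join condition that refers to it;

Last is the rationale: a Cartesian product is an expensive operation that can produce a very large amount of data, so it should be avoided whenever possible.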
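
To illustrate the second option in the recommendation, the statement could be given a join condition. The condition below is an assumption made for this example, not something the report proposes: it matches each row of SMALLTAB (built from dba_tables) to the rows of BIGTAB (built from dba_objects) that describe the same table. Because BIGTAB holds many copies of dba_objects, each match still appears more than once.

```sql
-- Assumed join condition: same owner, same name, and the object is a table.
SELECT a.table_name, b.object_id
  FROM bigtab b, smalltab a
 WHERE b.owner       = a.owner
   AND b.object_name = a.table_name
   AND b.object_type = 'TABLE';
```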

Part 3:

This part is the execution plan of the SQL statement.

-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------
 
1- Original
-----------
Plan hash value: 3479921507
 
--------------------------------------------------------------------------------
| Id | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1474M|    32G|  4316K  (2)| 14:23:21
| 1 | MERGE JOIN CARTESIAN|          |  1474M|    32G|  4316K  (2)| 14:23:21
| 2 | TABLE ACCESS FULL | SMALLTAB |  1223 | 23237 |    11   (0)| 00:00:01
| 3 | BUFFER SORT | | 1205K|  5887K|  4316K  (2)| 14:23:21
| 4 | TABLE ACCESS FULL | BIGTAB | 1205K|  5887K|  3530   (2)| 00:00:43
--------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------

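What is shown here is the original execution plan, before tuning. If the advisor decides the statement should be rewritten, it also gives the plan for the rewritten statement.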
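
The same kind of plan can be produced outside the advisor with EXPLAIN PLAN and DBMS_XPLAN, which is handy for comparing plans before and after a change. This is a minimal sketch that assumes the session can see the BIGTAB and SMALLTAB test tables and that a plan table is available (10g provides one by default).

```sql
-- Store the optimizer's plan for the original statement without executing it.
EXPLAIN PLAN FOR
  select a.table_name, b.object_id from bigtab b, smalltab a;

-- Format and display the plan that was just stored.
SELECT * FROM TABLE(dbms_xplan.display);
```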

Chapter 5: Summary

Enough said. In a word: simple! If Oracle keeps going like this, everyone in the world will be a DBA, and nobody will need one.

Reposted from: http://www.hellodba.com/reader.php?ID=160&lang=CN
