When problems occur with a system, it is important to perform accurate and timely diagnosis of the problem before making any changes to a system. Oftentimes, a database administrator (DBA) simply looks at the symptoms and immediately starts changing the system to fix those symptoms. However, an accurate diagnosis of the actual problem in the initial stage significantly increases the probability of success in resolving the problem.
With Oracle Database, the statistical data needed for accurate diagnosis of a problem is stored in the Automatic Workload Repository (AWR). The Automatic Database Diagnostic Monitor (ADDM):
Analyzes the AWR data on a regular basis
Diagnoses the root causes of performance problems
Provides recommendations for correcting any problems
Identifies non-problem areas of the system
Because AWR is a repository of historical performance data, ADDM can analyze performance issues after the event, often saving time and resources in reproducing a problem. For information about the AWR, see "Overview of the Automatic Workload Repository".
In most cases, ADDM output should be the first place that a DBA looks when notified of a performance problem. ADDM provides the following benefits:
Automatic performance diagnostic report every hour by default
Problem diagnosis based on decades of tuning expertise
Time-based quantification of problem impacts and recommendation benefits
Identification of root cause, not symptoms
Recommendations for treating the root causes of problems
Identification of non-problem areas of the system
Minimal overhead to the system during the diagnostic process
It is important to realize that tuning is an iterative process, and fixing one problem can cause the bottleneck to shift to another part of the system. Even with the benefit of ADDM analysis, it can take multiple tuning cycles to reach acceptable system performance. ADDM benefits apply beyond production systems; on development and test systems, ADDM can provide an early warning of performance issues.
This section contains the following topics:
ADDM Analysis
Using ADDM with Oracle Real Application Clusters
ADDM Analysis Results
Reviewing ADDM Analysis Results: Example
An ADDM analysis can be performed on a pair of AWR snapshots and a set of instances from the same database. The pair of AWR snapshots define the time period for analysis, and the set of instances define the target for analysis.
If you are using Oracle Real Application Clusters (Oracle RAC), ADDM has three analysis modes:
Database
In Database mode, ADDM analyzes all instances of the database.
Instance
In Instance mode, ADDM analyzes a particular instance of the database.
Partial
In Partial mode, ADDM analyzes a subset of all database instances.
If you are not using Oracle RAC, ADDM can only function in Instance mode because there is only one instance of the database.
An ADDM analysis is performed each time an AWR snapshot is taken and the results are saved in the database. The time period analyzed by ADDM is defined by the last two snapshots (the last hour by default). ADDM will always analyze the specified instance in Instance mode. For non-Oracle RAC or single instance environments, the analysis performed in the Instance mode is the same as a database-wide analysis. If you are using Oracle RAC, ADDM will also analyze the entire database in Database mode, as described in "Using ADDM with Oracle Real Application Clusters". After an ADDM completes its analysis, you can view the results using Oracle Enterprise Manager, or by viewing a report in a SQL*Plus session.
ADDM analysis is performed top down, first identifying symptoms, and then refining them to reach the root causes of performance problems. The goal of the analysis is to reduce a single throughput metric called DB
time
. DB
time
is the cumulative time spent by the database in processing user requests. It includes wait time and CPU time of all non-idle user sessions. DB
time
is displayed in the V$SESS_TIME_MODEL
and V$SYS_TIME_MODEL
views.
See Also:
Oracle Database Reference for information about the V$SESS_TIME_MODEL
and V$SYS_TIME_MODEL
views
"Time Model Statistics" for a discussion of time model statistics and DB
time
Oracle Database Concepts for information about server processes
By reducing DB
time
, the database is able to support more user requests using the same resources, which increases throughput. The problems reported by ADDM are sorted by the amount of DB
time
they are responsible for. System areas that are not responsible for a significant portion of DB
time
are reported as non-problem areas.
The types of problems that ADDM considers include the following:
CPU bottlenecks - Is the system CPU bound by Oracle Database or some other application?
Undersized Memory Structures - Are the Oracle Database memory structures, such as the SGA, PGA, and buffer cache, adequately sized?
I/O capacity issues - Is the I/O subsystem performing as expected?
High load SQL statements - Are there any SQL statements which are consuming excessive system resources?
High load PL/SQL execution and compilation, and high-load Java usage
Oracle RAC specific issues - What are the global cache hot blocks and objects; are there any interconnect latency issues?
Sub-optimal use of Oracle Database by the application - Are there problems with poor connection management, excessive parsing, or application level lock contention?
Database configuration issues - Is there evidence of incorrect sizing of log files, archiving issues, excessive checkpoints, or sub-optimal parameter settings?
Concurrency issues - Are there buffer busy problems?
Hot objects and top SQL for various problem areas
Note:
This is not a comprehensive list of all problem types that ADDM considers in its analysis.
ADDM also documents the non-problem areas of the system. For example, wait event classes that are not significantly impacting the performance of the system are identified and removed from the tuning consideration at an early stage, saving time and effort that would be spent on items that do not impact overall system performance.
If you are using Oracle RAC, you can run ADDM in Database analysis mode to analyze the throughput performance of all instances of the database. In Database mode, ADDM considers DB time as the sum of the database time for all database instances. Using the Database analysis mode enables you to view all findings that are significant to the entire database in a single report, instead of reviewing a separate report for each instance.
The Database mode report includes findings about database resources (such as I/O and interconnect). The report also aggregates findings from the various instances if they are significant to the entire database. For example, if the CPU load on a single instance is high enough to affect the entire database, the finding will appear in the Database mode analysis, which will point to the particular instance responsible for the problem.
See Also:
Oracle Database 2 Day + Real Application Clusters Guide for information about using ADDM with Oracle RAC
In addition to problem diagnostics, ADDM recommends possible solutions. ADDM analysis results are represented as a set of findings. See Example 6-1 for an example of ADDM analysis result. Each ADDM finding can belong to one of the following types:
Problem findings describe the root cause of a database performance problem.
Symptom findings contain information that often lead to one or more problem findings.
Information findings are used for reporting information that are relevant to understanding the performance of the database, but do not constitute a performance problem (such as non-problem areas of the database and the activity of automatic database maintenance).
Warning findings contain information about problems that may affect the completeness or accuracy of the ADDM analysis (such as missing data in the AWR).
Each problem finding is quantified by an impact that is an estimate of the portion of DB
time
caused by the finding's performance issue. A problem finding can be associated with a list of recommendations for reducing the impact of the performance problem. The types of recommendations include:
Hardware changes: adding CPUs or changing the I/O subsystem configuration
Database configuration: changing initialization parameter settings
Schema changes: hash partitioning a table or index, or using automatic segment-space management (ASSM)
Application changes: using the cache option for sequences or using bind variables
Using other advisors: running SQL Tuning Advisor on high-load SQL or running the Segment Advisor on hot objects
A list of recommendations can contain various alternatives for solving the same problem; you do not have to apply all the recommendations to solve a specific problem. Each recommendation has a benefit which is an estimate of the portion of DB
time
that can be saved if the recommendation is implemented. Recommendations are composed of actions and rationales. You must apply all the actions of a recommendation to gain the estimated benefit. The rationales are used for explaining why the set of actions were recommended and to provide additional information to implement the suggested recommendation.
Consider the following section of an ADDM report in Example 6-1.
Example 6-1 Example ADDM Report
FINDING 1: 31% impact (7798 seconds) ------------------------------------ SQL statements were not shared due to the usage of literals. This resulted in additional hard parses which were consuming significant database time. RECOMMENDATION 1: Application Analysis, 31% benefit (7798 seconds) ACTION: Investigate application logic for possible use of bind variables instead of literals. Alternatively, you may set the parameter "cursor_sharing" to "force". RATIONALE: SQL statements with PLAN_HASH_VALUE 3106087033 were found to be using literals. Look in V$SQL for examples of such SQL statements.
In Example 6-1, the finding points to a particular root cause, the usage of literals in SQL statements, which is estimated to have an impact of about 31% of total DB
time
in the analysis period.
The finding has a recommendation associated with it, composed of one action and one rationale. The action specifies a solution to the problem found and is estimated to have a maximum benefit of up to 31% DB
time
in the analysis period. Note that the benefit is given as a portion of the total DB
time
and not as a portion of the finding's impact. The rationale provides additional information on tracking potential SQL statements that were using literals and causing this performance issue. Using the specified plan hash value of SQL statements that could be a problem, a DBA could quickly examine a few sample statements.
When a specific problem has multiple causes, the ADDM may report multiple problem and symptom findings. In this case, the impacts of these multiple findings can contain the same portion of DB
time
. Because the performance issues of findings can overlap, the sum of the impacts of the findings can exceed 100% of DB
time
. For example, if a system performs many reads, then ADDM might report a SQL statement responsible for 50% of DB
time
due to I/O activity as one finding, and an undersized buffer cache responsible for 75% of DB
time
as another finding.
When multiple recommendations are associated with a problem finding, the recommendations may contain alternatives for solving the problem. In this case, the sum of the recommendations' benefits may be higher than the finding's impact.
When appropriate, an ADDM action may have multiple solutions for you to choose from. In the example, the most effective solution is to use bind variables. However, it is often difficult to modify the application. Changing the value of the CURSOR_SHARING
initialization parameter is much easier to implement and can provide significant improvement.