Master Note: How to diagnose Database Performance - FAQ [ID 402983.1]
Modified 16-MAR-2011    Type FAQ    Status PUBLISHED
In this Document
Purpose
Questions and Answers
Investigating a Database Performance Issue
Diagnostics
- AWR reports/Statspack reports
- 10046 Trace
- Querying V$Session_wait
- Finding session id
- System State Dumps
- Errorstack
- PSTACK
- Hanganalyze
Interpreting the Results/Traces
Top Database Performance Issues/Problems and How To Resolve Them
- Library Cache/Shared Pool Latch waits
- High Version Counts
- Log File Sync waits
- Buffer Busy waits/Cache Buffers Chains Latch waits
- WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!
- ORA-60 DEADLOCK DETECTED/enqueue hash chains latch
Applies to:
Oracle Server - Enterprise Edition - Version: 6.0.0.0 and later [Release: 6.0 and later]
Oracle Server - Personal Edition - Version: 7.1.4.0 and later [Release: 7.1.4 and later]
Oracle Server - Standard Edition - Version: 7.0.16.0 and later [Release: 7.0 and later]
Enterprise Manager for RDBMS - Version: 8.1.7.4 and later
Information in this document applies to any platform.
Purpose
This document outlines a number of frequently asked database performance questions.
Questions and Answers
Investigating a Database Performance Issue
To investigate a slow performance problem, begin by deciding which diagnostics to gather. To do this, consider the following questions and take the appropriate action:
Is the performance problem constant or does it occur at certain times of the day?
CONSTANT - Gather an AWR or statspack report for a period of time when the problem occurs (a 1 hour report is usually sufficient). If you have a historic report which covers the same time of day and period when the performance was OK then take that too.
CERTAIN TIMES - Gather an AWR or statspack report for a period of time which covers the problem (for instance, if you have a problem when something is run between 12 and 3 then make sure the report covers either that time or part of that time).
ADDITIONALLY gather an AWR or statspack report for a similar period of time when the problem does not occur, for comparison. Always ensure that you are making a fair comparison (for instance, the same time of day or the same workload) and make sure the duration of the report is the same.
NOTE: As far as possible, statspack reports should cover a minimum of 10 minutes and a maximum of 30 minutes. Longer periods can distort the information, and such reports should be re-gathered using a shorter time period. With AWR a 1 hour report is OK.
Does the problem affect one session, several sessions or all sessions?
ONE SESSION - Gather a 10046 trace for the session.
SEVERAL SESSIONS - Gather a 10046 trace for one or two of the problem sessions.
ALL SESSIONS - Gather AWR or statspack reports.
Does the database actually hang or just appear to hang?
(i.e. do sessions never complete their tasks (HANG or SPIN?) or do they eventually finish (SLOW)?)
HANG - Take some systemstates as well as a statspack report.
SPIN? - See: Document 68738.1 No Response from the Server, Does it Hang or Spin?
SLOW - Gather 10046 traces for a selection of slow sessions.
Is the CPU usage high for one or more sessions when things run slowly?
YES - Take some errorstacks from the suspect CPU process.
(* If unable to gather errorstacks then gather pstack reports)
Diagnostics
AWR reports/Statspack reports
AWR/Statspack reports provide a method for evaluating the relative performance of a database.
In 10G, to check for general performance issues use the Automatic Workload Repository (AWR) and specifically the Automatic Database Diagnostic Monitor (ADDM) tool for assistance.
This is covered in Document 276103.1 PERFORMANCE TUNING USING 10g ADVISORS AND MANAGEABILITY FEATURES.
Note: If uploading reports to support, please ensure that they are in Text format
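As a rough sketch of choosing the report window, the available AWR snapshots can be listed before the report is generated. This assumes 10g or later and a licence for the Diagnostics Pack; the one-day lookback is an arbitrary choice for illustration, not part of the original note.

```sql
-- List AWR snapshots taken in the last 24 hours (the lookback is an assumption).
-- Pick a begin/end snap_id pair that brackets the problem period.
select snap_id,
       instance_number,
       begin_interval_time,
       end_interval_time
  from dba_hist_snapshot
 where end_interval_time > systimestamp - interval '1' day
 order by instance_number, snap_id;

-- Then run the standard report script and choose "text" as the report type:
-- @?/rdbms/admin/awrrpt.sql
```

For a fair comparison, repeat the same selection for a period when performance was acceptable, using a snapshot pair of the same duration.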
For 9i and 8i, statspack, rather than AWR, reports should be gathered.
To gather a statspack report, please refer to Document 94224.1 FAQ- Statspack Complete Reference.
To interpret statspack output refer to:
http://www.oracle.com/technology/deploy/performance/pdf/statspack_tuning_otn_new.pdf
10046 Trace
A 10046 trace gathers tracing information about a session. To trace the current session:
alter session set timed_statistics = true;
alter session set statistics_level=all;
alter session set max_dump_file_size = unlimited;
alter session set events '10046 trace name context forever,level 12';
-- run the statement(s) to be traced --
select * from dual;
exit;
Also see Document 376442.1 Recommended Method for Obtaining 10046 trace for Tuning.
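If the session needs to stay connected after the statements of interest have run, the event can be switched off explicitly instead of exiting. The sketch below also shows one way to find where the trace file was written; which of the two lookups applies depends on the version, so treat both as suggestions to check against your release.

```sql
-- Stop tracing in the current session without disconnecting.
alter session set events '10046 trace name context off';

-- 10g and earlier: directory that holds user session trace files.
select value from v$parameter where name = 'user_dump_dest';

-- 11g and later: full path of the current session's trace file.
-- select value from v$diag_info where name = 'Default Trace File';
```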
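Querying V$Session_wait
The view V$Session_wait can show useful information about what a session is waiting for.
Running the same select against this view several times can indicate whether a session is moving or not (for example, whether seq# is changing between runs).
When wait_time=0 the session is currently waiting; any other value means the session is not waiting (typically it is on CPU):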
set lines 132 pages 999
column event format a30
select sid,event,seq#,p1,p2,p3,wait_time from V$session_wait where SID = &&SID;
select sid,event,seq#,p1,p2,p3,wait_time from V$session_wait where SID = &&SID;
select sid,event,seq#,p1,p2,p3,wait_time from V$session_wait where SID = &&SID;
See: Document 43718.1 VIEW "V$SESSION_WAIT" Reference
** Important **
v$session_wait is often misinterpreted. People often assume a session is waiting because they see an event and seconds_in_wait is rising. It should be remembered that seconds_in_wait only applies to a current wait if wait_time = 0; otherwise it is actually "seconds since the last wait completed". The other column that helps clear up the misinterpretation is state, which will be WAITING if the session is waiting and WAITED% if it is no longer waiting.
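To make that distinction visible in the output, the earlier select can be extended with the state and seconds_in_wait columns. This is a minimal sketch; the status column is an illustrative label added here, not something the view provides.

```sql
set lines 132 pages 999
column event format a30
column status format a12

-- status is derived from wait_time: 0 means the session is in a wait right now.
-- When status is NOT WAITING, seconds_in_wait is the time since the last wait ended.
select sid, event, seq#, state, wait_time, seconds_in_wait,
       case when wait_time = 0 then 'WAITING' else 'NOT WAITING' end as status
  from v$session_wait
 where sid = &&SID;
```

Run it a few times in succession: a session whose status stays WAITING on the same event with the same seq# is stuck in a single wait, while a changing seq# shows it is moving through successive waits.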
Finding session id
This select is useful for finding the current session's identifiers (Oracle PID, operating system SPID and SID) for tracing later:
select p.pid,p.SPID,s.SID
from v$process p,v$session s
where s.paddr = p.addr
and s.audsid = userenv('SESSIONID')
/
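The select above identifies the session it is run from. When the problem session belongs to someone else, a variation along the following lines can locate it by database username. The &USERNAME substitution variable is a hypothetical name chosen for this sketch.

```sql
-- Find SID, SERIAL# and OS process id (SPID) for sessions of a given user.
-- The SPID is the value passed to "oradebug setospid" or to pstack.
select s.sid, s.serial#, s.username, s.program, p.pid, p.spid
  from v$process p, v$session s
 where s.paddr = p.addr
   and s.username = upper('&USERNAME')
 order by s.sid;
```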
System State Dumps
If the database is hung then systemstate dumps need to be gathered to try to determine what is happening. At least 3 dumps should be taken as follows:
Log in to SQL*Plus as a SYSDBA user:
sqlplus "/ as sysdba"
rem -- set trace file size to unlimited:
alter session set max_dump_file_size = unlimited;
alter session set events '10998 trace name context forever, level 1';
alter session set events 'immediate trace name systemstate level 10';
alter session set events 'immediate trace name systemstate level 10';
alter session set events 'immediate trace name systemstate level 10';
or (If using 10G or higher)
sqlplus "/ as sysdba"
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
<< wait 90 seconds >>
oradebug dump systemstate 266
<< wait 90 seconds >>
oradebug dump systemstate 266
quit
For further information refer to:
Document 452358.1 Database Hangs: What to collect for support.
If no connection is possible at all then please refer to the following article, which describes how to collect systemstates in that situation:
Document 121779.1 Taking a SYSTEMSTATE dump when you cannot CONNECT to Oracle.
Errorstack
Errorstack traces are Oracle call stack dumps that can be used to gather stack information for a process. Attach to the process and gather at least 3 errorstacks (9834 below is an example operating system process id; substitute the SPID of the process being examined):
Log in to SQL*Plus:
connect / as sysdba
oradebug setospid 9834
oradebug unlimit
oradebug event 10046 trace name context forever,level 12
oradebug dump errorstack 3
<< wait 1min>>
oradebug dump errorstack 3
<< wait 1min>>
oradebug dump errorstack 3
exit
PSTACK
Pstack is an operating system tool that can be used to gather stack information on some unix platforms. Attach to the process and gather about 10 pstacks while the job is running.
% script pstacks.txt
% /usr/proc/bin/pstack pid
% exit
The PID is the o/s process id of the process to be traced. Repeat the pstack command about 10 times to capture possible stack changes. Further details of pstack are in Document 70609.1 How To Display Information About Processes on SUN Solaris.
PLSQL Profiler
The PL/SQL profiler provides information about PL/SQL code with regard to CPU usage and other resource usage. See Document 243755.1 Implementing and Using the PL/SQL Profiler.
Hanganalyze
Hanganalyze is often gathered for hang situations. Typically systemstates are more useful. The following describes how to gather hanganalyze dumps: Document 175006.1 Steps to generate HANGANALYZE trace files.
Interpreting the Results/Traces
Statspack reports - look at the Top 5 waiters section and work to reduce the time spent in the top waiter first, then regather a statspack report and see what effect that has had. The following assumptions hold true:-
10046 traces - Run the 10046 trace through tkprof and look at the total time spent in SQL, then search back through the tkprof report for the SQL statement which accounts for the largest proportion of that time. Then look at the breakdown of time and wait events for that SQL. Always remember that the 'number of executions' is important: although the total time for a statement may be high, this may be accompanied by an equally high execution count, in which case the time per execution is small. Assume the following:-
Determine the enqueue which is being waited for and address appropriately.
For further assistance see:
@Document 94160.1 Summary of Oracle DATATYPES
Systemstates - These should be sent to Oracle Support Services to interpret.
Hanganalyze - These should be sent to Oracle Support Services to interpret.
Errorstacks - These should be sent to Oracle Support Services to interpret. (Some of the calls on the stack are generic and appear simply because of how an errorstack works, so searching for them on Metalink can lead to incorrect analysis.)
Top Database Performance Issues/Problems and How To Resolve Them
Library Cache/Shared Pool Latch waits
Typically, Library Cache/Shared Pool Latch waits are a contention problem caused by unshared SQL (in the case of the library cache latch) or by exhaustion of space in the shared pool (in the case of the shared pool latch). For the shared pool latch, while new space allocations require the latch, it is typically the repeated freeing AND allocation of space in a shared pool that is too small which causes problems.
Document 62143.1 Understanding and Tuning the Shared Pool
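A quick, non-authoritative way to see whether these latches are actually under pressure is to look at their sleep counts alongside the hard parse rate. The latch names below are those used up to 10g; on 11g and later much of the library cache latch work is done by mutexes, so the 'library cache' row may not appear.

```sql
-- Latch activity since instance startup; a high number of sleeps suggests contention.
select name, gets, misses, sleeps
  from v$latch
 where name in ('shared pool', 'library cache')
 order by sleeps desc;

-- A high proportion of hard parses points towards unshared SQL.
select name, value
  from v$sysstat
 where name in ('parse count (total)', 'parse count (hard)');
```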
High Version Counts
High version counts occur when there are multiple copies of the 'same' statement in the shared pool, but some factor prevents them from being shared, wasting space and causing latch contention.
Document 296377.1 Handling and resolving unshared cursors/large version_counts
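As an illustration, statements with many child cursors can be listed from v$sqlarea. The threshold of 20 is an arbitrary cut-off chosen for this sketch, and the sql_id column assumes 10g or later.

```sql
-- Statements with an unusually high number of child cursors (threshold is an assumption).
select sql_id, version_count, executions,
       substr(sql_text, 1, 60) as sql_text
  from v$sqlarea
 where version_count > 20
 order by version_count desc;

-- For one statement, v$sql_shared_cursor shows a Y/N flag per reason
-- explaining why each child cursor could not be shared:
-- select * from v$sql_shared_cursor where sql_id = '&SQL_ID';
```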
Log File Sync waits
Log file sync waits occur when sessions wait for redo data to be written to disk. Typically this is caused by slow writes or by committing too frequently in the application.
See Document 34592.1 WAITEVENT: "log file sync" Reference Note.
It is recommended that customers experiencing log file sync issues on 10.2.0.3 proactively apply the patch for Bug 5896963.
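To get a first impression of which of the two causes applies, the foreground wait can be compared with the log writer's own write time, together with the commit count. This is only a rule-of-thumb sketch: the figures are cumulative since instance startup, and time_waited and average_wait are reported in hundredths of a second.

```sql
-- If 'log file parallel write' has a high average, the redo writes themselves are slow.
-- If it is low but 'log file sync' is high, look at commit frequency and CPU availability.
select event, total_waits, time_waited, average_wait
  from v$system_event
 where event in ('log file sync', 'log file parallel write');

-- Number of commits issued since startup.
select name, value
  from v$sysstat
 where name = 'user commits';
```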
Buffer Busy waits/Cache Buffers Chains Latch waits
Buffer Busy waits occur when a session wants to access a database block in the buffer cache but cannot because the buffer is "busy". Cache Buffers Chains Latch waits are caused by contention where multiple sessions are waiting to read the same block.
Typical solutions are:-
Further information can be found in:
Document 34405.1 WAITEVENT: "buffer busy waits" Reference Note
Document 42152.1 LATCH: CACHE BUFFERS CHAINS
Document 155971.1 Ext/Pub Resolving Intense and "Random" Buffer Busy Wait Performance Problems:
Document 163424.1 Ext/Pub How To Identify a Hot Block Within The Database Buffer Cache.
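When sessions are seen waiting on buffer busy waits, the file and block numbers reported with the wait can be mapped to a segment. The following is a sketch under the assumption that p1 is the file number and p2 the block number for this event (the meaning of p3 varies by version); &FILE_ID and &BLOCK_ID are hypothetical substitution variables to be filled in from the first query.

```sql
-- Sessions currently waiting on a busy buffer, with the file and block involved.
select sid, event, p1 as file#, p2 as block#, p3
  from v$session_wait
 where event = 'buffer busy waits';

-- Map a file/block pair to the owning segment (can be slow on large databases).
select owner, segment_name, segment_type
  from dba_extents
 where file_id = &FILE_ID
   and &BLOCK_ID between block_id and block_id + blocks - 1;
```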
TX - Document 62354.1 TX Transaction locks - Example wait scenarios
TM - Document 33453.1 REFERENTIAL INTEGRITY AND LOCKING
WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!
This issue occurs when the database detects that a waiter has waited for a resource for longer than a particular threshold. The message "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" appears in the alert log, and trace files and systemstates are dumped.
Typically this is caused by two (or more) incompatible operations being run simultaneously.
Document 278316.1 Potential reasons for "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! "
ORA-60 DEADLOCK DETECTED/enqueue hash chains latch
Refer to Document 62365.1 What to do with "ORA-60 Deadlock Detected" Errors.
The reason 'enqueue hash chains latch waits' are included here is that, typically, during deadlock detection (i.e. the routine Oracle uses to determine if a deadlock actually exists), there is a heavy need for the latch, which can cause issues for other sessions. If there is a problem with this latch, check whether a trace file is generated for the ORA-60 and resolve that issue.
- For RAC, Procwatcher can be used.
Refer to Document 459694.1 Procwatcher: Script to Monitor and Examine Oracle DB and Clusterware Processes
-------------------------------------------------------------------------------------------------------
Blog: http://blog.csdn.net/tianlesoftware