In this Document
Purpose
Troubleshooting Steps
Prevention is the key
Finding High CPU Utilization processes on Windows
Finding High CPU Utilization processes on Unix
What to look for when one process is using the CPU intensively
What process is hogging the CPU?
OS Processes and defunct processes
What to look for when multiple processes are using the CPU intensively
References
Purpose
The purpose of this article is to help diagnose why Oracle processes are consuming high CPU.
Troubleshooting Steps
High CPU utilization may not necessarily point to a problem. It could just mean that the system is being well utilized. However, if CPU usage is consistently high when the load on the system is low, or if system performance is poor together with high CPU usage, you should investigate the reason for the high CPU usage. Also, if one or more processes are consistently hogging CPU at the expense of other processes, those processes should be investigated.
Prevention is the key
Besides collecting diagnostic information in order to solve any problems, there is little or nothing that can be done to stop processes from using a lot of CPU once they have started to do so. On the other hand, a lot can be done to prevent it from happening.
Oracle provides two ways to limit the CPU being used by individual users:
Finding High CPU Utilization processes on Windows
The following notes may help to collect information on which processes on Windows may be utilizing high CPU.
Finding High CPU Utilization processes on Unix
On UNIX systems there are two basic tools capable of evaluating and estimating the CPU usage on the system. These are vmstat and sar. The following articles explain how to use them and other tools in diagnosing high CPU on Unix:
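As a rough illustration of how vmstat output might be summarized, the sketch below averages the CPU columns of a captured vmstat run. It assumes a Linux-style layout in which the column header row contains `us`, `sy`, `id` and `wa`; the column layout differs between platforms, and the function name `summarize_vmstat` and the sample text are hypothetical, not taken from any referenced note.

```python
def summarize_vmstat(text):
    """Average the CPU percentage columns of captured vmstat output.

    Assumes a header row naming the columns (us, sy, id and optionally wa)
    followed by purely numeric sample rows with the same number of columns.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # The column-name row is the one that lists the CPU columns.
    header = next((r for r in rows if "us" in r and "sy" in r and "id" in r), None)
    if header is None:
        raise ValueError("no vmstat column header found")
    # Keep only numeric rows; this also skips repeated header rows.
    samples = [r for r in rows
               if len(r) == len(header) and all(tok.isdigit() for tok in r)]
    if not samples:
        raise ValueError("no vmstat sample rows found")
    result = {}
    for name in ("us", "sy", "id", "wa"):
        if name in header:
            pos = header.index(name)
            result[name] = sum(int(r[pos]) for r in samples) / len(samples)
    return result


# Hypothetical capture of "vmstat 5 2" on a busy Linux host.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 9  0      0 801232  10240 512000    0    0     1     3  900 1500 88  9  3  0  0
 7  0      0 800100  10240 512100    0    0     0     8  950 1600 92  6  2  0  0
"""

cpu = summarize_vmstat(SAMPLE)
print(cpu)
```

A low `id` (idle) average together with a high `us` (user) average suggests that user processes, rather than the kernel or I/O waits, are consuming the CPU. Note that on many platforms the first vmstat sample reports averages since boot, so it may be worth discarding it before averaging.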
What to look for when one process is using the CPU intensively
What process is hogging the CPU?
Background Processes
PMON
The main reasons for PMON high CPU usage are related to specific bugs when cleaning up processes or registering with the listener.
SMON
SMON is responsible for space consolidation and transaction recovery operations which can cause significant overhead if you are using dictionary managed tablespaces.
SMON can bring a database to a halt if a large table with many extents is dropped or truncated and the table exists within a dictionary-managed tablespace. Starting in 9i, locally-managed tablespaces are the default when a tablespace is first created, and beginning in 9i Release 2 (9.2.x), the SYSTEM tablespace can be locally-managed as well.
Excessive space consolidation can consume excessive CPU. See the following note for more information on this and how to troubleshoot.
Recovery operations performed by SMON can also consume high CPU. The following note explains when SMON is doing recovery and what to do about it:
SMON may do transaction recovery in parallel. This may result in considerable CPU consumption. In such cases you may consider disabling parallel recovery:
LGWR & DBWR
These two processes are usually I/O bound, but when there is a problem on the OS, they may "spin" (wait) until the I/O operation completes. This spinning is a CPU operation. Slowness or failures in async I/O operations may also manifest themselves as high CPU consumption.
If LGWR appears to be intermittently taking up 100% CPU and AIO is set up, then the AIO configuration should be rechecked. As a temporary measure, the following parameter may be set to prevent LGWR from spinning:
Jobs (CJQ0, Jn, SNPn)
Job processes run user-defined and system-defined batch-like tasks. High CPU usage by these processes should be investigated in the same way as CPU used by a user process.
Check the views DBA_JOBS_*, DBA_SCHEDULER_*, and DBA_AUTOTASK_* for information on what is being run.
Even on their own these processes may consume a fair amount of CPU, as they are in an infinite loop querying the job queue.
Advanced Queuing (AQ, QMN)
The AQ processes send and receive messages mostly through tables. Excessive CPU utilization may be because the tables need to be purged or reorganized or other issues related to Advanced Queuing.
Parallel Query (Pnn)
Parallel query processes are used specifically in order to do a lot of work and therefore may indeed consume high CPU. However, it is advised to ensure that the system is set up optimally. The parallel query option is best for data warehouse type environments where only a small number of users will be executing queries at any given time.
Oracle (User) process
Parsing large queries, procedure compilation or execution, space management and sorting are examples of operations that are CPU intensive.
In order to collect more information on a process using high CPU, see the following notes that may be of assistance:
An AWR or Statspack report may also be of assistance in diagnosing which activities are using high CPU and what they are doing.
If the problem is found to be a slow running query, then efforts should be made to tune the query so that it avoids consuming high CPU. If it is doing a number of hash joins and full table scans, efforts should be made to add indexes and get the indexes used.
The following notes assist in diagnosing problems with queries and in tuning them.
Real-time SQL monitoring is a new feature in 11g that enables you to monitor the performance of SQL statements while they are executing. See:
Other tracing techniques might prove useful in deciding whether or not to allow a process to continue, and can help with analyzing the reason for high CPU usage.
OS Processes and defunct processes
As OS processes are not related to Oracle, we cannot help in diagnosing the cause behind the CPU usage. Please consult with your OS vendor.
The following notes include information on some known OS issues:
What to look for when multiple processes are using the CPU intensively
Sometimes the CPU is fairly distributed across several processes. The only thing to do in that case is to try to see if they share a common task, like the execution of a particular package or query. We recommend that you take some AWR or Statspack snapshots when the CPU is at its peak and run reports against the snapshots. From there, you can look at the top 5 wait events in order to determine where the time may be used and look at the top queries that are running that may be causing the CPU to spike.
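The idea of looking for a common task can be sketched as a simple grouping step. The example below assumes you have already collected, by whatever means, one record per busy process with its CPU share and the statement it is running; the record layout, the field names `cpu_pct` and `sql_id`, and the sample values are all hypothetical.

```python
from collections import defaultdict


def cpu_by_task(sessions, key="sql_id"):
    """Total the CPU share per task and rank tasks by combined CPU.

    `sessions` is a list of dicts, each with a "cpu_pct" value and an
    optional task identifier stored under `key`.
    """
    totals = defaultdict(lambda: [0.0, 0])  # task -> [cpu total, process count]
    for s in sessions:
        entry = totals[s.get(key) or "(none)"]
        entry[0] += s["cpu_pct"]
        entry[1] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(task, round(cpu, 1), count) for task, (cpu, count) in ranked]


# Hypothetical sample: no single process dominates, but three share a statement.
sessions = [
    {"pid": 4101, "cpu_pct": 12.0, "sql_id": "stmt_A"},
    {"pid": 4102, "cpu_pct": 11.5, "sql_id": "stmt_A"},
    {"pid": 4103, "cpu_pct": 4.0, "sql_id": "stmt_B"},
    {"pid": 4104, "cpu_pct": 13.0, "sql_id": "stmt_A"},
    {"pid": 4105, "cpu_pct": 2.5, "sql_id": None},
]

ranking = cpu_by_task(sessions)
for task, cpu, count in ranking:
    print(f"{task}: {cpu}% across {count} process(es)")
```

In this made-up sample each process looks modest on its own, yet the grouping shows that one statement accounts for most of the CPU, which is the kind of shared task worth tuning first.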
If the problem only occurs when a particular application is active, check to make sure that the application is not overloading the database with more connections than it is set up to handle.
Application architectures that do not use persistent connections (connection pooling, shared servers), and instead connect and disconnect using dedicated servers, consume large amounts of CPU, are unstable in nature, and may be susceptible to "login storms".
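To make the cost difference concrete, the sketch below compares connect-per-request against a small pool of persistent connections. It is a generic illustration only: `SimplePool` and `LogonCounter` are hypothetical names, the "connection" is a placeholder object rather than a real database session, and a production application would normally rely on the pooling facility of its driver or application server instead.

```python
import queue
from contextlib import contextmanager


class LogonCounter:
    """Stand-in for a database driver; counts how many logons are performed."""

    def __init__(self):
        self.logons = 0

    def connect(self):
        self.logons += 1
        return object()  # placeholder for a real connection


class SimplePool:
    """Fixed-size pool that opens its connections once and then reuses them."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse


REQUESTS = 100

# Connect and disconnect for every request: one logon per request.
unpooled = LogonCounter()
for _ in range(REQUESTS):
    unpooled.connect()

# Persistent connections: logons happen only when the pool is created.
pooled = LogonCounter()
pool = SimplePool(pooled.connect, size=4)
for _ in range(REQUESTS):
    with pool.connection() as conn:
        pass  # the request would run its SQL here

print(unpooled.logons, pooled.logons)
```

Because each logon to a dedicated server involves process creation and session setup, keeping the number of logons independent of the request rate is what protects the system during a burst of activity.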