Oracle QMNC process consuming 100% CPU

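Operating system version:
localdomain 2.6.18-53.el5PAE
Database version: 10.2.0.1
The customer reported that the QMNC process was consuming 100% CPU. The cause was determined to be an Oracle bug, number 5484652.
Solution: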
ALTER SYSTEM SET AQ_TM_PROCESSES=5 SCOPE=SPFILE;
Restart the database.

The following is from Metalink:
Bug 5069930: QMNC PROCESS IS SPINNING AND CONSUMING HIGH CPU 

Bug Attributes
Type: B - Defect
Fixed in Product Version: -
Severity: 2 - Severe Loss of Service
Product Version: 10.2.0.1
Status: 96 - Closed, Duplicate Bug
Platform: 226 - Linux x86-64
Created: 28-Feb-2006
Platform Version: -
Updated: 08-Nov-2007
Base Bug: 5484652
Database Version: 10.2.0.1
Affects Platforms: Generic
Product Source: Oracle

Related Products
Line: Oracle Database Products
Family: Oracle Database
Area: Oracle Database
Product: 5 - Oracle Server - Enterprise Edition

Hdr: 5069930 10.2.0.1 RDBMS 10.2.0.1 AQ PRODID-5 PORTID-226 5484652
Abstract: QMNC PROCESS IS SPINNING AND CONSUMING HIGH CPU

*** 02/28/06 09:37 pm ***
TAR:
----

PROBLEM:
--------
QMN process ora_qmnc_<SID> is taking up to 99% of CPU.
Customer is on 10.2.0.1.0.

DIAGNOSTIC ANALYSIS:
--------------------

+ looks like the ora_qmnc_<SID> process is not doing anything.
+ tried to take a 10046 trace using oradebug, but does not dump anything.
+ tried to find out if any SQLs are executing using
DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION, but again no SQLs found
+ errorstack shows, this process is apparently spinning.
+ system call trace (using strace) shows it is spinning on "times(NULL) =
489352606" system call.


WORKAROUND:
-----------
NONE

RELATED BUGS:
-------------

REPRODUCIBILITY:
----------------
Reproduces in the customer's production, development and test environments.

TEST CASE:
----------

STACK TRACE:
------------
SQL> oradebug setospid 8692
Oracle pid: 24, Unix process pid: 8692, image:
[email protected] (QMNC)
SQL> oradebug short_stack
ksdxfstk()+32<-ksdxcb()+1547<-sspuser()+90<-<0x3e13b0c320>
SQL> oradebug short_stack
ksdxfstk()+32<-ksdxcb()+1547<-sspuser()+90<-<0x3e13b0c320>
SQL> oradebug short_stack
ksdxfstk()+32<-ksdxcb()+1547<-sspuser()+90<-<0x3e13b0c320>
SQL> oradebug short_stack
ksdxfstk()+32<-ksdxcb()+1547<-sspuser()+90<-<0x3e13b0c320>

SUPPORTING INFORMATION:
-----------------------
+ looks like bug: 4543871, but this is spinning on a different system call
(times(NULL)).
+ errorstack, strace output is uploaded.

24 HOUR CONTACT INFORMATION FOR P1 BUGS:
----------------------------------------

DIAL-IN INFORMATION:
--------------------

IMPACT DATE:
------------

*** 02/28/06 09:43 pm ***
The uptime of the Linux machine is:

[oracle@ftibprod-db01 ~]$ uptime
14:15:24 up 7 days, 17:58,  1 user,  load average: 2.00, 2.00, 2.00
   
Let me know if you need any further information.

Thank You.
Rijesh
*** 02/28/06 11:35 pm ***
*** 03/02/06 12:09 am ***
*** 03/02/06 01:17 am *** ESCALATED
*** 03/02/06 01:17 am ***
*** 03/02/06 01:20 am *** (CHG: Sta->10 Asg->NEW OWNER OWNER)
*** 03/02/06 01:20 am ***
Correcting platform based on tracefile:
  Release:        2.6.9-22.ELsmp
  Version:        #1 SMP Mon Sep 19 18:00:54 EDT 2005
  Machine:        x86_64

The supplied stacks are not unwound - the only functions
on the stack are those for dumping the stack itself.
Please try to get proper stacks from the spinning process
using "pstack" or "gdb" OS level utilities.
*** 03/02/06 07:23 pm *** (CHG: Sta->16)
*** 03/02/06 07:23 pm ***
*** 03/03/06 02:39 am *** (CHG: Sta->10)
*** 03/03/06 02:39 am ***
From the errorstack tracefile I have:
60029940     (ub4) exptm_kwqmnsgn            43FBA6D7 = 1140565719
                //  Exp time of the first ready task
600299DC     (ub4) failtime_kwqmnsgn         4403D919 = 1141102873
                //  time when ksvcreate of q# failed
600079FC     (ub4) ksusgctm                  4403D91F = 1141102879
                //  current time in seconds

and from the code it seems we are probably spinning trying to start
a slave process but that keeps failing.
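
As a side note to the trace excerpt above, the three hexadecimal values can be decoded with a few lines of Python. This is only a minimal sketch: the field names and hex strings are copied from the trace, while the variable names and the reading of the values as Unix epoch seconds in UTC are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hex values copied from the errorstack trace above.
trace_fields = {
    "exptm_kwqmnsgn": "43FBA6D7",     # expiry time of the first ready task
    "failtime_kwqmnsgn": "4403D919",  # time when ksvcreate last failed
    "ksusgctm": "4403D91F",           # current time in seconds
}

# Convert each hex string to an integer number of seconds.
decoded = {name: int(value, 16) for name, value in trace_fields.items()}

for name, seconds in decoded.items():
    # Assumption: the values are seconds since the Unix epoch, shown in UTC.
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    print(f"{name:18s} {seconds}  {stamp.isoformat()}")

# Seconds between the last failed slave spawn and the moment of the dump.
since_last_failure = decoded["ksusgctm"] - decoded["failtime_kwqmnsgn"]
print("seconds since last failed ksvcreate:", since_last_failure)
```

The decoded integers match the decimal values printed in the trace, and the last failed ksvcreate was only 6 seconds before the dump was taken.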

Can you get:
  Full init.ora parameter settings
  Alert log from startup covering period when QMN spins
  Set event 10852 level 32 to help show QMN actions and status
        code why it cannot spawn a slave.
*** 03/03/06 03:10 am ***
*** 03/03/06 03:10 am *** (CHG: Sta->16)
*** 03/03/06 03:33 am *** (CHG: Sta->10)
*** 03/03/06 03:33 am ***
We need QMN to see the event before it starts to spin, so
it is best to set it in the init.ora or spfile and then bounce the instance.
*** 03/06/06 09:19 pm ***
*** 03/06/06 09:20 pm *** (CHG: Sta->16)
*** 03/07/06 06:17 am ***
*** 03/08/06 12:37 am ***
BDE Screening
~~~~~~~~~~~~~
Testcase
~~~~~~~~~~~~~~~~~

  Files:  BDETC.tar.Z (containing init.ora, setup.sql)

  Steps: 
    Merge the "init.ora" into your pfile used to start the instance so
     that you have the following set:
        event="10852 trace name context forever, level 32"
        aq_tm_processes=10

    sqlplus /nolog @setup
      Creates a user TC with an AQ queue.

    Shutdown the instance.

    Startup the instance (using the included init.ora settings)

    Watch the QMNC trace file, or the CPU use of qmnc
    Notice it starts to spin in the loop

Reproduced
~~~~~~~~~~
  Reproduced in 10.2.0.2
  Reproduced in RDBMS_MAIN_LINUX_060228


Workaround/s
~~~~~~~~~~~~
  Do not set AQ_TM_PROCESSES = 10; use a lower value (e.g. 5).
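
To check whether an instance is exposed to this workaround's condition, one option is to scan a text pfile (for example one produced with CREATE PFILE FROM SPFILE) for the parameter. The sketch below is a hypothetical helper, not an Oracle tool: the function names, the regular expression and the sample text are all assumptions for illustration.

```python
import re

# Matches lines such as "aq_tm_processes=10", "*.aq_tm_processes = 10"
# or "ORCL.aq_tm_processes=10" (an optional "*." or "<SID>." prefix).
_PARAM = re.compile(
    r"(?:\*\.|\w+\.)?aq_tm_processes\s*=\s*(\d+)\s*$", re.IGNORECASE
)

def aq_tm_processes_setting(pfile_text):
    """Return the last aq_tm_processes value found in pfile text, or None."""
    value = None
    for line in pfile_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = _PARAM.match(line)
        if match:
            value = int(match.group(1))
    return value

def hits_spin_condition(pfile_text):
    """True when the parameter is explicitly set to 10, the value the bug names."""
    return aq_tm_processes_setting(pfile_text) == 10

# Sample text modeled on the testcase settings quoted in this bug.
sample = '''
event="10852 trace name context forever, level 32"
aq_tm_processes=10
'''
print(hits_spin_condition(sample))
```

A result of None from aq_tm_processes_setting means the parameter is not set in the file at all, which is the configuration the 10g documentation quoted later in this bug recommends.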

Diagnostic Notes
~~~~~~~~~~~~~~~~
  On startup we set a task of KWQMN_PERSISTENT for each AQ slave
  spawned due to AQ_TM_PROCESSES. If there are 10 of these then this
  is the maximum allowed number of slaves but when a new task is
  added QMNC tries to spawn an extra slave. This fails as ksvcreate()
  returns ksvMAXSPAWNED but it also posts itself and so it tries again
  without sleeping first and so loops.
  eg:
        kwqmnstslv: ksvrv=2
        Couldn't start a new slave
        kwqmnstslv: less than 10s since last failed ksvcreate.
        Couldn't start a new slave
        kwqmnstslv: less than 10s since last failed ksvcreate.

  The fix from bug 4170525 removed messages being written to the alert
  log but it did not tackle this underlying spin. After 10 seconds
  ksvcreate() again tries to spawn a slave but returns ksvMAXSPAWNED
  again and so it spins for a further 10 seconds.
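
The loop described in these notes can be illustrated with a toy model. The sketch below is not Oracle code: the function names, the 1 ms cost per retry and the "sleep on failure" variant are assumptions chosen only to show why retrying without sleeping burns CPU, and the sleeping variant is a contrast case, not the actual fix.

```python
MAX_SLAVES = 10  # maximum number of slaves, as stated in the notes above

def try_spawn(running_slaves):
    """Stand-in for ksvcreate(): refuse once the maximum is reached."""
    return "MAXSPAWNED" if running_slaves >= MAX_SLAVES else "OK"

def coordinator_wakeups(aq_tm_processes, budget_ms, sleep_on_failure):
    """Count coordinator loop iterations within budget_ms of simulated time.

    One persistent slave is assumed per AQ_TM_PROCESSES; a new task then
    arrives and the coordinator tries to spawn one more slave for it.
    """
    running = aq_tm_processes
    iterations = 0
    now_ms = 0
    while now_ms < budget_ms:
        iterations += 1
        if try_spawn(running) == "OK":
            running += 1
            break  # the task has a slave, so the coordinator can go idle
        if sleep_on_failure:
            now_ms += 10_000  # hypothetical: wait 10 s before retrying
        else:
            now_ms += 1  # posted itself: retries almost at once (assumed 1 ms)
    return iterations

print(coordinator_wakeups(10, 10_000, sleep_on_failure=False))  # spinning case
print(coordinator_wakeups(10, 10_000, sleep_on_failure=True))   # contrast case
print(coordinator_wakeups(5, 10_000, sleep_on_failure=False))   # lower setting
```

Under these assumptions the spinning case runs the loop 10000 times in a 10-second window, while both the sleeping variant and the AQ_TM_PROCESSES=5 case run it once.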

*** 03/08/06 12:39 am *** (CHG: Sta->11)
*** 03/08/06 12:39 am *** (CHG: Asg->NEW OWNER OWNER)
*** 03/08/06 12:39 am ***
*** 03/08/06 12:43 am ***
*** 03/08/06 09:56 am *** (CHG: DevPri->2)
*** 03/08/06 09:56 am *** (CHG: Confirmed Flag->Y)
*** 03/08/06 09:56 am *** (CHG: Sta->30)
*** 03/08/06 09:56 am ***
*** 03/08/06 06:24 pm ***
*** 03/08/06 08:16 pm *** (CHG: Sta->16)
*** 03/08/06 08:16 pm ***
*** 03/09/06 12:24 am *** (CHG: Sta->11)
*** 03/09/06 12:26 am *** -> CLOSED
*** 03/09/06 12:26 am ***
*** 03/12/06 11:34 pm ***
*** 03/13/06 09:01 am *** (CHG: Pri->4)
*** 03/13/06 09:01 am ***
*** 03/13/06 09:17 am *** (CHG: DevPri->4)
*** 03/13/06 09:17 am *** (CHG: Pri->2)
*** 03/13/06 11:34 am ***
10G Documentation which talks about the change in Time manager
architecture :

er.102/b14257/componet.htm#sthref345
....
....
2.14.1 "AQ_TM_PROCESSES Parameter No Longer Needed in init.ora"

Prior to Oracle Database 10g, Oracle Streams AQ time manager processes were
controlled by init.ora parameter AQ_TM_PROCESSES, which had to be set to
nonzero to perform time monitoring on queue messages and for processing
messages with delay and expiration properties specified. These processes were
named QMN0-9, and the number of them could be changed using:

ALTER SYSTEM SET AQ_TM_PROCESSES=X


Parameter X ranged from 0 to 10. When X was set to 1 or more, that number of
QMN processes were then started. If the parameter was not specified, or was
set to 0, then queue monitor processes were not started.

Beginning in Oracle Streams AQ 10g Release 1 (10.1), this was changed to a
coordinator-slave architecture, where a coordinator is automatically spawned
if Oracle Streams AQ or Streams is being used in the system. This process,
named QMNC, dynamically spawns slaves depending on the system load. The
slaves, named qXXX, do various background tasks for Oracle Streams AQ or
Streams. Because the number of processes is determined automatically and
tuned constantly, you no longer need to set AQ_TM_PROCESSES.

Even though it is no longer necessary to set AQ_TM_PROCESSES when Oracle
Streams AQ or Streams is used, if you do specify a value, then that value is
taken into account. However, the number of qXXX processes can be different
from what was specified by AQ_TM_PROCESSES.

QMNC only runs when you use queues and create new queues. It affects Streams
Replication and Messaging users.

No separate API is needed to disable or enable the background processes. This
is controlled by setting AQ_TM_PROCESSES to zero or nonzero. Oracle
recommends, however, that you leave the AQ_TM_PROCESSES parameter unspecified
and let the system autotune.

Note:
If you want to disable the Queue Monitor Coordinator, then you must set
AQ_TM_PROCESSES = 0 in your pfile or spfile. Oracle strongly recommends that
you do NOT set AQ_TM_PROCESSES = 0. If you are using Oracle Streams, then
setting this parameter to zero (which Oracle Database respects no matter
what) can cause serious problems.
....
*** 05/25/06 05:36 pm *** (CHG: Pri->3)
*** 05/25/06 05:36 pm ***
*** 05/30/06 10:17 am ***
*** 06/28/06 04:25 pm *** (CHG: DevPri->2)
*** 06/28/06 04:25 pm *** (CHG: Pri->2)
*** 06/28/06 04:25 pm ***
*** 11/16/06 02:45 am ***
*** 11/16/06 07:35 am *** (CHG: Sta->36)
*** 11/16/06 07:35 am ***
*** 11/08/07 04:59 am *** (CHG: Sta->96)
