Cloud Control 13.x: high CPU usage is seen from sql_id 9mpgkshvhabxv, which is run by emagent_SQL_oracle_database
SQL ID : 9mpgkshvhabxv
SQL Text : with latest_run as
(
SELECT all_runs.CON_ID,
all_runs.OWNER,
all_runs.JOB_NAME,
all_runs.STATUS
FROM CDB_SCHEDULER_JOB_RUN_DETAILS all_runs,
(SELECT sub.CON_ID,
sub.OWNER,
sub.JOB_NAME,
MAX(sub.ACTUAL_START_DATE) AS START_DATE
FROM CDB_SCHEDULER_JOB_RUN_DETAILS sub
WHERE sub.job_name in (SELECT job_name FROM
CDB_SCHEDULER_JOB_RUN_DETAILS where status = 'FAILED')
GROUP BY sub.CON_ID,OWNER,JOB_NAME) latest_runs
WHERE all_runs.status = 'FAILED'
AND latest_runs.CON_ID =all_runs.CON_ID
AND latest_runs.OWNER=all_runs.OWNER
AND latest_runs.JOB_NAME=all_runs.JOB_NAME
AND all_runs.ACTUAL_START_DATE= latest_runs.START_DATE
)
SELECT pdb_name, NVL(SUM(broken),0), NVL(SUM(failed),0)
FROM (SELECT c.name pdb_name,
DECODE(broken, 'N', 0, 1) broken,
DECODE(NVL(failures,0), 0, 0, 1) failed
FROM cdb_jobs j, v$containers c
WHERE j.con_id = TO_NUMBER(c.con_id)
UNION ALL
SELECT c.name pdb_name,
DECODE(failed_details.STATE, 'BROKEN', 1, 0) broken,
DECODE(failed_details.STATUS , 'FAILED',
DECODE(failed_details.STATE,'BROKEN',0,'DISABLED',0,1), 0)
failed
FROM v$containers c,
(SELECT all_jobs.CON_ID,
all_jobs.OWNER,
all_jobs.JOB_NAME,
latest_run.STATUS,
all_jobs.STATE
FROM CDB_SCHEDULER_JOBS all_jobs,latest_run
WHERE latest_run.CON_ID=all_jobs.CON_ID
AND latest_run.OWNER=all_jobs.OWNER
AND latest_run.JOB_NAME=all_jobs.JOB_NAME
) failed_details
WHERE failed_details.con_id = TO_NUMBER(c.con_id)
)
GROUP BY pdb_name
The metric "Database Job Status (All Pluggable Databases)" had been disabled only at the CDB level.
It needs to be disabled at both the CDB and PDB levels.
Navigate to "Oracle Database|Cluster Database" -> "Monitoring" -> "Metric and Collection Settings".
Click the "Other Collected Items" tab, locate "Database Job Status (All Pluggable Databases)" there, and disable that metric.
The Oracle Enterprise Manager 13c Agent is consuming 99% of CPU time on 8 CPUs.
After killing the Agent and all DBSNMP sessions, CPU utilization on the host returns to normal.
The issue is caused by the following SQL query:
SQL_TEXT OPEN CURSORS
------------------------------------------------------------ ------------
SELECT /*+ NO_STATEMENT_QUEUING RESULT_CACHE (SYSOBJ=TRUE) * 441
with latest_run as ( SELECT all_runs.CON_ID, all_runs.OWNER 179
This SQL query comes from the metric "Database Job Status (All Pluggable Databases)".
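The open-cursor listing above could be reproduced with a query along the following lines. This is only a sketch: it assumes the listing was taken from v$open_cursor for the DBSNMP monitoring user, and the file name open_cursors.sql is a hypothetical choice.

```shell
#!/bin/sh
# Write the query to a script file (hypothetical name: open_cursors.sql).
# The quoted 'EOF' keeps the shell from expanding v$open_cursor.
cat > open_cursors.sql <<'EOF'
-- Count open cursors per statement for the monitoring user (assumed: DBSNMP)
SELECT sql_text, COUNT(*) AS open_cursors
  FROM v$open_cursor
 WHERE user_name = 'DBSNMP'
 GROUP BY sql_text
 ORDER BY open_cursors DESC;
EOF
echo "Wrote open_cursors.sql"
```

Run the generated file from a SQL*Plus session with @open_cursors.sql.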
As a workaround, disable the metric that issues this SQL query by running the following commands on the OMS host:
cd
$emcli login -username=sysman
$emcli sync
$emcli modify_collection_schedule -targetType="oracle_database" -targetNames="" -collectionName="cdbjob_status" -collectionStatus=Disabled -preview="N"
Once the metric is disabled, bounce (restart) the Agent. This should reduce CPU usage. Monitor the Agent and host CPU for a few days and confirm.
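When several database targets need the workaround, the modify_collection_schedule call can be generated once per target. The following is a minimal dry-run sketch that only prints the commands so they can be reviewed before use. The function name build_disable_cmd and the target names in the usage line are hypothetical; the emcli flags are the ones shown above.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) one emcli command per target name.
build_disable_cmd() {
  # $1 = target name as registered in Enterprise Manager (placeholder value)
  printf 'emcli modify_collection_schedule -targetType="oracle_database" -targetNames="%s" -collectionName="cdbjob_status" -collectionStatus=Disabled -preview="N"\n' "$1"
}

# Print a command for every target name given on the command line.
for target in "$@"; do
  build_disable_cmd "$target"
done
```

For example, saving this as print_disable_cmds.sh and running "sh print_disable_cmds.sh CDB1 CDB2" prints two commands, one per target, which can then be run after emcli login and emcli sync.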
To identify the exact job that is in a failed or broken state and matches the incident alert shown in the EM console for the metrics 'Failed Job Count' and 'Broken Job Count',
run the query below against the dba_jobs view at the container (CDB) level.
Connect to the CDB root:
select * from dba_jobs;
The BROKEN and FAILURES columns in the output show the state of each job, and the JOB and WHAT columns identify it.
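To narrow the output to problem jobs only, the query can be filtered. The sketch below writes a few filter queries to a script file; the file name failed_jobs.sql is hypothetical, and the scheduler queries assume the same views and columns that the metric SQL above reads (CDB_SCHEDULER_JOBS and CDB_SCHEDULER_JOB_RUN_DETAILS).

```shell
#!/bin/sh
# Write the filter queries to a script file (hypothetical name: failed_jobs.sql).
cat > failed_jobs.sql <<'EOF'
-- DBMS_JOB jobs that are broken or have recorded failures
SELECT job, log_user, broken, failures, what
  FROM dba_jobs
 WHERE broken = 'Y' OR NVL(failures, 0) > 0;

-- Scheduler jobs, across containers, currently in BROKEN state
SELECT con_id, owner, job_name, state
  FROM cdb_scheduler_jobs
 WHERE state = 'BROKEN';

-- Scheduler runs that failed (all failed runs, not only the latest per job)
SELECT con_id, owner, job_name, actual_start_date
  FROM cdb_scheduler_job_run_details
 WHERE status = 'FAILED'
 ORDER BY actual_start_date DESC;
EOF
echo "Wrote failed_jobs.sql"
```

Run the generated file from SQL*Plus while connected to the CDB root with @failed_jobs.sql.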
In addition, a Metric Extension can be used to collect the job name of each failed job.
Refer : EM 12c, EM 13c: Enterprise Manager Cloud Control - How to Create a SQL-Based Metric Extension (Doc ID 1455942.1)