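1. Symptom: during peak business hours the database's processing capacity dropped sharply and requests timed out heavily. Screenshot below:
Querying v$active_session_history showed that the sessions were all running ordinary, everyday business statements: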
select t.sql_id,s.SQL_TEXT, count(*)
from v$active_session_history t,v$sqlarea s
where t.SAMPLE_TIME >
to_timestamp('20190306 15:00:00', 'YYYYMMDD hh24:mi:ss')
and
t.SAMPLE_TIME <
to_timestamp('20190306 15:01:00', 'YYYYMMDD hh24:mi:ss')
-- and t.SESSION_TYPE='FOREGROUND'
and t.SQL_ID=s.SQL_ID
group by t.SQL_ID,s.SQL_TEXT
order by count(*) desc;
2. Check the wait events and the associated SQL:
select t.EVENT,t.WAIT_CLASS,t.SQL_ID,t.p1, count(*)
from v$active_session_history t
where t.SAMPLE_TIME >
to_timestamp('20190306 15:00:00', 'YYYYMMDD hh24:mi:ss')
and
t.SAMPLE_TIME <
to_timestamp('20190306 15:01:00', 'YYYYMMDD hh24:mi:ss')
group by t.EVENT,t.WAIT_CLASS,t.SQL_ID,t.p1
order by count(*) desc;
Many of the samples were latch waits, with a latch address (P1) of 33857648584, which is 7E212B7C8 in hexadecimal. Look up the latch type:
select * from v$latch_children where addr like '%7E212B7C8';
select distinct s.kqrstcln latch#, r.cache#, r.parameter name, r.type, r.subordinate#
from v$rowcache r, x$kqrst s
where r.cache# = s.kqrstcid
order by 1, 4, 5;
Everything mapped to child latch 8 is related to dc_users.
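The decimal-to-hexadecimal conversion of the P1 value 33857648584 used in the latch lookup can be double-checked with a short script. This is only a minimal sketch: the function names latch_addr_hex and addr_like_pattern are hypothetical helpers introduced here for illustration, not part of any Oracle tooling.

```python
def latch_addr_hex(p1: int) -> str:
    """Return the uppercase hexadecimal form of a decimal latch address (ASH P1)."""
    return format(p1, "X")


def addr_like_pattern(p1: int) -> str:
    """Build the LIKE pattern for matching ADDR in v$latch_children.

    The leading % absorbs any zero padding in the stored address.
    """
    return "%" + latch_addr_hex(p1)


if __name__ == "__main__":
    p1 = 33857648584  # P1 value taken from the ASH query result
    print(latch_addr_hex(p1))     # 7E212B7C8
    print(addr_like_pattern(p1))  # %7E212B7C8
```

The resulting pattern is the string used in the where clause of the v$latch_children query.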
Check the execution plan of the SQL:
select * from table(dbms_xplan.display_cursor('9cm09btnkgfvc',null,'advanced'));
or
select t.EVENT,t.WAIT_CLASS,t.SQL_ID,t.SQL_PLAN_OPERATION,count(*)
from v$active_session_history t
where t.SAMPLE_TIME >
to_timestamp('20190306 15:00:00', 'YYYYMMDD hh24:mi:ss')
and
t.SAMPLE_TIME <
to_timestamp('20190306 15:01:00', 'YYYYMMDD hh24:mi:ss')
group by t.EVENT,t.WAIT_CLASS,t.SQL_ID,t.SQL_PLAN_OPERATION
order by count(*) desc;
Searching on keywords such as dc_users, concurrency, hash join, and latch: row cache objects turned up the following bug, which matches the symptoms fairly well:
Bug 13902396 - Hash joins cause "row cache objects" latch gets and "shared pool" latch gets (disabled fix) (Doc ID 13902396.8)
Slow Performance with High Waits for 'row cache lock' With Possible Database Hang (Doc ID 2189126.1)
Solution:
On 11.2.0.4, apply patch 13902396 (this patch is not included in any PSU), then set the following event:
event='45053 trace name context forever, level 127'