前2天有客户报一套10.2.0.3的数据库个别服务进程PGA使用量暴涨,疑似内存泄露(memory leak);遂提供on-site service,赶到用户现场时问题进程已经消失,系统内存使用量恢复正常,客户之前除了保留了v$process动态性能视图的信息外未抓取其他有用的诊断信息。 查看保存的v$process视图信息可以看到进程991714的PGA内存使用量达到13个G:
select spid,program,PGA_USED_MEM,PGA_ALLOC_MEM from v$process;
SPID PROGRAM PGA_USED_MEM PGA_ALLOC_MEM
------------------------ ------------------------------------------------ ------------ -------------
991714 oracleBTS@oam_app_a 14427510986 14432001786
oracle@oam_app_a@/oracle/product/10.2.0/dbs $ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 4194304
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited
SQL> select x.ksppinm name,y.ksppstvl value
2 from sys.x$ksppi x, sys.x$ksppcv y
3 where x.inst_id=USERENV('Instance')
4 and y.inst_id = USERENV('Instance')
5 and x.indx = y.indx
6 and x.ksppinm like '%pga%';
pga_aggregate_target
200715200
NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
_pga_max_size
209715200
SQL> select x.ksppinm name,y.ksppstvl value
2 from sys.x$ksppi x, sys.x$ksppcv y
3 where x.inst_id=USERENV('Instance')
4 and y.inst_id = USERENV('Instance')
5 and x.indx = y.indx
6 and x.ksppinm like '%hash_join%';
NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
_hash_join_enabled
TRUE
SQL> show parameter work
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fileio_network_adapters string
workarea_size_policy string AUTO
可以看到该系统使用自动PGA管理且pga_aggregate_target参数值为较小的191M,查询隐藏参数_pga_max_size可发现该参数值也为191M。 从告警日志alert.log中找不到任何信息,单个服务进程PGA使用量达到13G居然没有报ORA-04030错误! 到实例的user_dump目录下ls -ltr了一把有意外收获,找到了该991714进程最近的trace文件:
Dump file /oracle/product/10.2.0/admin/BTS/udump/bts_ora_991714.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /oracle/product/10.2.0
System name: AIX
Node name: oam_app_a
Release: 3
Version: 5
Machine: 0000E5DBD600
Instance name: BTS
Redo thread mounted by this instance: 1
Oracle process number: 46
Unix process pid: 991714, image: oracleBTS@oam_app_a
*** 2011-04-19 18:27:07.766
*** SERVICE NAME:(SYS$USERS) 2011-04-19 18:27:07.733
*** SESSION ID:(248.45987) 2011-04-19 18:27:07.733
WARNING: out of private memory [1]
以上trace文件中唯一有用的信息就是"WARNING: out of private memory [1]"了,但在metalink上搜索"out of private memory"找不到任何有用的信息;因为未发生ORA-04030错误,而用户也没有手动去收集过PGA Heap的堆使用情况,所以未产生任何对该进程PGA内存使用细节描述的转储信息文件,难以从dump信息中获取线索继续探索,线索断裂。 通过查询PGA状态历史记录视图可以发现在之前的2个快照时间窗口内,inused PGA总量为13G,之后PGA内存使用量又恢复正常:
select * from dba_hist_pgastat where name='total PGA inuse' order by value asc
SNAP_ID DBID INSTANCE_NUMBER NAME VALUE
---------- ---------- --------------- ---------------------------------------------------------------- ----------
16048 3731271451 1 total PGA inuse 1.4883E+10
16047 3731271451 1 total PGA inuse 1.4888E+10
查询在以上时间窗口中991714的活动会话记录,可以发现该进程一直处于"cursor: pin S wait on X"中;Oracle 10.2.0.2以后利用mutex来替代latch保护SQL游标,当硬解析发生时会出现"cursor: pin s wait on X"等待。这里可以看到需要硬解析的SQL的ID为"bp7y9fvtqra1h",而且此处SQL_CHILD_NUMBER为-1,说明Oracle共享池中之前没有缓存该SQL游标,该问题进程是首次为该SQL语句执行硬解析。
SNAP_ID DBID INSTANCE_NUMBER SAMPLE_ID SAMPLE_TIME SESSION_ID SESSION_SERIAL# USER_ID SQL_ID SQL_CHILD_NUMBER SQL_PLAN_HASH_VALUE FORCE_MATCHING_SIGNATURE SQL_OPCODE PLSQL_ENTRY_OBJECT_ID PLSQL_ENTRY_SUBPROGRAM_ID PLSQL_OBJECT_ID PLSQL_SUBPROGRAM_ID SERVICE_HASH SESSION_TYPE SESSION_STATE QC_SESSION_ID QC_INSTANCE_ID BLOCKING_SESSION BLOCKING_SESSION_STATUS BLOCKING_SESSION_SERIAL# EVENT EVENT_ID SEQ# P1TEXT P1 P2TEXT P2 P3TEXT P3 WAIT_CLASS WAIT_CLASS_ID WAIT_TIME TIME_WAITED XID CURRENT_OBJ# CURRENT_FILE# CURRENT_BLOCK# PROGRAM MODULE ACTION CLIENT_ID
1 16047 3731271451 1 57716200 19-4月 -11 04.00.35.625 下午 248 45987 25 bp7y9fvtqra1h -1 0 0 0 3427055676 FOREGROUND WAITING UNKNOWN cursor: pin S wait on X 1729366244 1140 idn 4083918896 value 1.14676E+12 where|sleeps 21474836525 Concurrency 3875070507 0 9772 -1 0 0
2 16047 3731271451 1 57716210 19-4月 -11 04.00.45.629 下午 248 45987 25 bp7y9fvtqra1h -1 0 0 0 3427055676 FOREGROUND WAITING UNKNOWN cursor: pin S wait on X 1729366244 2137 idn 4083918896 value 1.14676E+12 where|sleeps 21474837522 Concurrency 3875070507 0 9773 -1 0 0
3 16047 3731271451 1 57716220 19-4月 -11 04.00.55.633 下午 248 45987 25 bp7y9fvtqra1h -1 0 0 0 3427055676 FOREGROUND WAITING UNKNOWN cursor: pin S wait on X 1729366244 3134 idn 4083918896 value 1.14676E+12 where|sleeps 21474838519 Concurrency 3875070507 0 9775 -1 0 0
4 16047 3731271451 1 57716230 19-4月 -11 04.01.05.637 下午 248 45987 25 bp7y9fvtqra1h -1 0 0 0 3427055676 FOREGROUND WAITING UNKNOWN cursor: pin S wait on X 1729366244 4126 idn 4083918896 value 1.14676E+12 where|sleeps 21474839511 Concurrency 3875070507 0 9772 -1 0 0
但无论是从SQL历史记录视图(Workload Repository)还是v$SQL中都无法找到该条语句,可以判断该问题进程最后也没能成功完成对该'bp7y9fvtqra1h'语句的解析!因为没有该SQL的记录,所以也无法了解该SQL的执行计划和workarea工作区的使用状况,线索再次断裂!
SQL> select * from dba_hist_sqltext where sql_id='bp7y9fvtqra1h';
no rows selected
SQL> select * From v$sql where sql_id='bp7y9fvtqra1h';
no rows selected
通过查询进程内存使用总结历史视图dba_hist_process_mem_summary可以发现占用主要内存的分类是Other,猜测这些Other内存是用来解析游标使用的临时内存(call heap).
SQL> select snap_id,category,num_processes,non_zero_allocs,used_total,allocated_total,allocated_max from dba_hist_process_mem_summary where snap_id in (16047,16048,16049) order by allocated_total desc ,snap_id ;
SNAP_ID CATEGORY NUM_PROCESSES NON_ZERO_ALLOCS USED_TOTAL ALLOCATED_TOTAL ALLOCATED_MAX
---------- --------------- ------------- --------------- ---------- --------------- -------------
16047 Other 95 95 1.5062E+10 1.4665E+10
16048 Other 96 96 1.5053E+10 1.4664E+10
16049 Other 94 94 386117660 53794202
16047 Freeable 36 36 0 36372480 3670016
16048 Freeable 33 33 0 32112640 3670016
16047 SQL 78 31 27346656 29785568 28016616
16048 SQL 79 34 26812992 29240400 27885544
16049 Freeable 27 27 0 26738688 3670016
16049 SQL 77 31 591528 1242168 125816
16048 PL/SQL 96 96 272224 601040 68944
16049 PL/SQL 94 94 218104 536592 68944
16047 PL/SQL 95 95 212816 506536 68800
因为证据断裂,无法根据现有的信息为该服务进程过量使用PGA内存来定位Bug。到MOS上搜索找不到类似的Bug,但即使有也很难定论,因为没有heapdump的话即便提交SR,Oracle GCS(Global Customer Service)也不太愿定位到Bug。 那么如果出现以上类似的PGA内存泄露的问题,我们因当如何第一时间收集有用的信息,以供后续诊断呢?我在这里提供一些可选的方案:
- 定期收集系统内PGA/UGA的使用情况,具体可以使用脚本<Script To Monitor RDBMS Session UGA and PGA Current And Maximum Usage>
- 防御式地在系统级别设置4030 dump heapdump 536870917级别的event dump事件,虽然此案例中未发生ORA-04030错误,但不代表下一次也不发生
- 在问题发生时,第一时间使用oradebug PGA_DETAIL_GET命令填充v$process_memory_detail视图,以便了解问题进程PGA内存的使用细节
- 若觉得v$process_memory_detail动态性能视图的信息还不够丰富的话,也可以做systemstate 266和heapdump 536870917级别的dump
虽然以上我们介绍了一些针对PGA内存泄露问题发生时的诊断方法,但可能你还是不了解具体要如何实践,接下来我们通过实践来体会一下,首先我们特意构建一个PGA内存泄露的假象(测试说明,不要用于生产环境!!):
[oracle@rh2 ~]$ wc -l very_large.sql
18526 very_large.sql
/* 这里very_large.sql是我们"精心"构造的一条万分复杂的SQL语句,解析该SQL语句需要消耗大量的PGA内存 !!* /
SQL> @very_large
/* 执行very_large SQL语句开始模拟内存泄露,将持续较长时间 */
SQL> @MEMORY_USAGE_SCRIPT.SQL
Oracle Memory Usage Report: PGA And UGA Memory Usage Per Session
Host........: rh2.oracle.com
Name........: PROD1
Version.....: 11.2.0.2.0
Startup Time: 2011-04-20 19:41:32
Current Time: 2011.04.21-19:57:16
Worst possible value of concurrent PGA + UGA memory usage per session:
SID AND SERIAL# USERNAME OR PROGRAM SUM(VALUE) SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465 SYS 180444688 2011-04-21 19:52:16
152,987 SYS 67781616 2011-04-21 18:40:59
146,1 [email protected] (ARC3) 37598184 2011-04-20 19:41:43
19,5 [email protected] (ARC2) 36484072 2011-04-20 19:41:43
17,7 [email protected] (ARC0) 33141736 2011-04-20 19:41:42
145,1 [email protected] (ARC1) 19837928 2011-04-20 19:41:43
125,7 [email protected] (CJQ0) 15826432 2011-04-20 19:41:50
135,1 [email protected] (LGWR) 13480936 2011-04-20 19:41:33
131,1 [email protected] (LMS0) 11973608 2011-04-20 19:41:33
7,1 [email protected] (LMS1) 11973608 2011-04-20 19:41:33
6,1 [email protected] (LMD0) 11842536 2011-04-20 19:41:33
5,1 [email protected] (DIA0) 10580296 2011-04-20 19:41:33
25,57 SYS 9854112 2011-04-21 19:56:59
10,1 [email protected] (DBW0) 9105992 2011-04-20 19:41:33
136,1 [email protected] (SMON) 8777056 2011-04-20 19:41:33
140,1 [email protected] (MARK) 8565736 2011-04-20 19:41:33
130,1 [email protected] (LMON) 8238120 2011-04-20 19:41:33
138,1 [email protected] (MMON) 7215184 2011-04-20 19:41:33
31,1 [email protected] (SMCO) 7123896 2011-04-20 19:43:52
3,1 [email protected] (DIAG) 6730728 2011-04-20 19:41:33
16,1 [email protected] (RSMN) 5420008 2011-04-20 19:41:35
150,5 [email protected] (Q000) 5001608 2011-04-20 19:41:46
23,1 [email protected] (Q001) 3445984 2011-04-20 19:41:46
22,1 [email protected] (QMNC) 3314960 2011-04-20 19:41:45
12,1 [email protected] (RECO) 3249448 2011-04-20 19:41:33
11,1 [email protected] (CKPT) 3086120 2011-04-20 19:41:33
128,1 [email protected] (DBRM) 2667304 2011-04-20 19:41:33
14,1 [email protected] (MMNL) 2143208 2011-04-20 19:41:33
127,1 [email protected] (GEN0) 2012136 2011-04-20 19:41:33
158,183 SYS 1758344 2011-04-21 07:44:57
143,23 SYS 1692808 2011-04-21 07:45:01
142,1 [email protected] (LCK0) 1299288 2011-04-20 19:41:34
149,1 [email protected] (RCBG) 1160120 2011-04-20 19:41:45
33,59 [email protected] (W000) 963512 2011-04-21 19:55:14
4,1 [email protected] (PING) 898024 2011-04-20 19:41:33
126,1 [email protected] (PSP0) 832488 2011-04-20 19:41:32
13,1 [email protected] (ASMB) 832488 2011-04-20 19:41:33
134,1 [email protected] (MMAN) 832488 2011-04-20 19:41:33
144,1 [email protected] (O000) 832488 2011-04-20 19:41:36
129,1 [email protected] (ACMS) 832488 2011-04-20 19:41:33
133,1 [email protected] (RMS0) 832488 2011-04-20 19:41:33
1,1 [email protected] (PMON) 832488 2011-04-20 19:41:32
9,1 [email protected] (LMHB) 832488 2011-04-20 19:41:33
21,1 [email protected] (GTX0) 832488 2011-04-20 19:41:45
18,3 [email protected] (O001) 832488 2011-04-20 19:41:37
137,1 [email protected] (RBAL) 832488 2011-04-20 19:41:33
2,1 [email protected] (VKTM) 832488 2011-04-20 19:41:33
Worst possible total and average values of concurrent PGA + UGA memory usage:
564679192 bytes (total) and ~6007225 bytes (average), for ~47 sessions.
Approximate value of current PGA + UGA memory usage per session:
SID AND SERIAL# USERNAME OR PROGRAM SUM(VALUE) SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465 SYS 178083824 2011-04-21 19:52:16
146,1 [email protected] (ARC3) 36484072 2011-04-20 19:41:43
19,5 [email protected] (ARC2) 35369960 2011-04-20 19:41:43
17,7 [email protected] (ARC0) 33141736 2011-04-20 19:41:42
145,1 [email protected] (ARC1) 19837928 2011-04-20 19:41:43
135,1 [email protected] (LGWR) 13480936 2011-04-20 19:41:33
7,1 [email protected] (LMS1) 11973608 2011-04-20 19:41:33
131,1 [email protected] (LMS0) 11973608 2011-04-20 19:41:33
6,1 [email protected] (LMD0) 11842536 2011-04-20 19:41:33
5,1 [email protected] (DIA0) 10580296 2011-04-20 19:41:33
10,1 [email protected] (DBW0) 8712776 2011-04-20 19:41:33
140,1 [email protected] (MARK) 8565736 2011-04-20 19:41:33
130,1 [email protected] (LMON) 8238120 2011-04-20 19:41:33
3,1 [email protected] (DIAG) 6730728 2011-04-20 19:41:33
152,987 SYS 6224040 2011-04-21 18:40:59
16,1 [email protected] (RSMN) 5420008 2011-04-20 19:41:35
125,7 [email protected] (CJQ0) 4854824 2011-04-20 19:41:50
25,57 SYS 4738504 2011-04-21 19:56:59
138,1 [email protected] (MMON) 4165448 2011-04-20 19:41:33
136,1 [email protected] (SMON) 3863504 2011-04-20 19:41:33
150,5 [email protected] (Q000) 3108848 2011-04-20 19:41:46
11,1 [email protected] (CKPT) 2561832 2011-04-20 19:41:33
12,1 [email protected] (RECO) 2538120 2011-04-20 19:41:33
31,1 [email protected] (SMCO) 2536376 2011-04-20 19:43:52
128,1 [email protected] (DBRM) 2339768 2011-04-20 19:41:33
23,1 [email protected] (Q001) 2339672 2011-04-20 19:41:46
22,1 [email protected] (QMNC) 2242336 2011-04-20 19:41:45
127,1 [email protected] (GEN0) 2012136 2011-04-20 19:41:33
14,1 [email protected] (MMNL) 1946600 2011-04-20 19:41:33
158,183 SYS 1692856 2011-04-21 07:44:57
143,23 SYS 1561784 2011-04-21 07:45:01
142,1 [email protected] (LCK0) 1299288 2011-04-20 19:41:34
149,1 [email protected] (RCBG) 1160120 2011-04-20 19:41:45
33,59 [email protected] (W000) 963512 2011-04-21 19:55:14
4,1 [email protected] (PING) 898024 2011-04-20 19:41:33
13,1 [email protected] (ASMB) 832488 2011-04-20 19:41:33
134,1 [email protected] (MMAN) 832488 2011-04-20 19:41:33
129,1 [email protected] (ACMS) 832488 2011-04-20 19:41:33
1,1 [email protected] (PMON) 832488 2011-04-20 19:41:32
9,1 [email protected] (LMHB) 832488 2011-04-20 19:41:33
21,1 [email protected] (GTX0) 832488 2011-04-20 19:41:45
18,3 [email protected] (O001) 832488 2011-04-20 19:41:37
137,1 [email protected] (RBAL) 832488 2011-04-20 19:41:33
126,1 [email protected] (PSP0) 832488 2011-04-20 19:41:32
133,1 [email protected] (RMS0) 832488 2011-04-20 19:41:33
2,1 [email protected] (VKTM) 832488 2011-04-20 19:41:33
144,1 [email protected] (O000) 832488 2011-04-20 19:41:36
Current total and average values of concurrent PGA + UGA memory usage:
463473320 bytes (total) and ~4930567 bytes (average), for ~47 sessions.
Maximum value of PGA memory usage per session:
SID AND SERIAL# USERNAME OR PROGRAM VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465 SYS 177212856 2011-04-21 19:52:16
152,987 SYS 57208040 2011-04-21 18:40:59
146,1 [email protected] (ARC3) 37416168 2011-04-20 19:41:43
19,5 [email protected] (ARC2) 36302056 2011-04-20 19:41:43
17,7 [email protected] (ARC0) 32959720 2011-04-20 19:41:42
145,1 [email protected] (ARC1) 19655912 2011-04-20 19:41:43
135,1 [email protected] (LGWR) 13298920 2011-04-20 19:41:33
125,7 [email protected] (CJQ0) 13045176 2011-04-20 19:41:50
131,1 [email protected] (LMS0) 11791592 2011-04-20 19:41:33
7,1 [email protected] (LMS1) 11791592 2011-04-20 19:41:33
6,1 [email protected] (LMD0) 11660520 2011-04-20 19:41:33
5,1 [email protected] (DIA0) 10398280 2011-04-20 19:41:33
10,1 [email protected] (DBW0) 8923976 2011-04-20 19:41:33
140,1 [email protected] (MARK) 8383720 2011-04-20 19:41:33
130,1 [email protected] (LMON) 8056104 2011-04-20 19:41:33
31,1 [email protected] (SMCO) 6876392 2011-04-20 19:43:52
3,1 [email protected] (DIAG) 6548712 2011-04-20 19:41:33
25,57 SYS 6163896 2011-04-21 19:56:59
136,1 [email protected] (SMON) 5893352 2011-04-20 19:41:33
138,1 [email protected] (MMON) 5294872 2011-04-20 19:41:33
16,1 [email protected] (RSMN) 5237992 2011-04-20 19:41:35
150,5 [email protected] (Q000) 3910216 2011-04-20 19:41:46
11,1 [email protected] (CKPT) 2904104 2011-04-20 19:41:33
23,1 [email protected] (Q001) 2551016 2011-04-20 19:41:46
22,1 [email protected] (QMNC) 2485480 2011-04-20 19:41:45
12,1 [email protected] (RECO) 2485480 2011-04-20 19:41:33
128,1 [email protected] (DBRM) 2223336 2011-04-20 19:41:33
14,1 [email protected] (MMNL) 1961192 2011-04-20 19:41:33
127,1 [email protected] (GEN0) 1830120 2011-04-20 19:41:33
158,183 SYS 1510840 2011-04-21 07:44:57
143,23 SYS 1445304 2011-04-21 07:45:01
142,1 [email protected] (LCK0) 1117272 2011-04-20 19:41:34
149,1 [email protected] (RCBG) 912616 2011-04-20 19:41:45
33,59 [email protected] (W000) 716008 2011-04-21 19:55:14
4,1 [email protected] (PING) 716008 2011-04-20 19:41:33
144,1 [email protected] (O000) 650472 2011-04-20 19:41:36
137,1 [email protected] (RBAL) 650472 2011-04-20 19:41:33
134,1 [email protected] (MMAN) 650472 2011-04-20 19:41:33
133,1 [email protected] (RMS0) 650472 2011-04-20 19:41:33
129,1 [email protected] (ACMS) 650472 2011-04-20 19:41:33
126,1 [email protected] (PSP0) 650472 2011-04-20 19:41:32
21,1 [email protected] (GTX0) 650472 2011-04-20 19:41:45
18,3 [email protected] (O001) 650472 2011-04-20 19:41:37
13,1 [email protected] (ASMB) 650472 2011-04-20 19:41:33
9,1 [email protected] (LMHB) 650472 2011-04-20 19:41:33
2,1 [email protected] (VKTM) 650472 2011-04-20 19:41:33
1,1 [email protected] (PMON) 650472 2011-04-20 19:41:32
Worst possible total and average values of concurrent PGA memory usage:
528694504 bytes (total) and ~11248819 bytes (average), for ~47 sessions.
Maximum value of UGA memory usage per session:
SID AND SERIAL# USERNAME OR PROGRAM VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
152,987 SYS 10573576 2011-04-21 18:40:59
25,57 SYS 3690216 2011-04-21 19:56:59
29,465 SYS 3231832 2011-04-21 19:52:16
136,1 [email protected] (SMON) 2883704 2011-04-20 19:41:33
125,7 [email protected] (CJQ0) 2781256 2011-04-20 19:41:50
138,1 [email protected] (MMON) 1920312 2011-04-20 19:41:33
150,5 [email protected] (Q000) 1091392 2011-04-20 19:41:46
23,1 [email protected] (Q001) 894968 2011-04-20 19:41:46
22,1 [email protected] (QMNC) 829480 2011-04-20 19:41:45
12,1 [email protected] (RECO) 763968 2011-04-20 19:41:33
128,1 [email protected] (DBRM) 443968 2011-04-20 19:41:33
158,183 SYS 247504 2011-04-21 07:44:57
149,1 [email protected] (RCBG) 247504 2011-04-20 19:41:45
143,23 SYS 247504 2011-04-21 07:45:01
33,59 [email protected] (W000) 247504 2011-04-21 19:55:14
31,1 [email protected] (SMCO) 247504 2011-04-20 19:43:52
146,1 [email protected] (ARC3) 182016 2011-04-20 19:41:43
145,1 [email protected] (ARC1) 182016 2011-04-20 19:41:43
144,1 [email protected] (O000) 182016 2011-04-20 19:41:36
142,1 [email protected] (LCK0) 182016 2011-04-20 19:41:34
140,1 [email protected] (MARK) 182016 2011-04-20 19:41:33
137,1 [email protected] (RBAL) 182016 2011-04-20 19:41:33
135,1 [email protected] (LGWR) 182016 2011-04-20 19:41:33
134,1 [email protected] (MMAN) 182016 2011-04-20 19:41:33
133,1 [email protected] (RMS0) 182016 2011-04-20 19:41:33
131,1 [email protected] (LMS0) 182016 2011-04-20 19:41:33
130,1 [email protected] (LMON) 182016 2011-04-20 19:41:33
129,1 [email protected] (ACMS) 182016 2011-04-20 19:41:33
127,1 [email protected] (GEN0) 182016 2011-04-20 19:41:33
126,1 [email protected] (PSP0) 182016 2011-04-20 19:41:32
21,1 [email protected] (GTX0) 182016 2011-04-20 19:41:45
19,5 [email protected] (ARC2) 182016 2011-04-20 19:41:43
18,3 [email protected] (O001) 182016 2011-04-20 19:41:37
17,7 [email protected] (ARC0) 182016 2011-04-20 19:41:42
16,1 [email protected] (RSMN) 182016 2011-04-20 19:41:35
14,1 [email protected] (MMNL) 182016 2011-04-20 19:41:33
13,1 [email protected] (ASMB) 182016 2011-04-20 19:41:33
11,1 [email protected] (CKPT) 182016 2011-04-20 19:41:33
10,1 [email protected] (DBW0) 182016 2011-04-20 19:41:33
9,1 [email protected] (LMHB) 182016 2011-04-20 19:41:33
7,1 [email protected] (LMS1) 182016 2011-04-20 19:41:33
6,1 [email protected] (LMD0) 182016 2011-04-20 19:41:33
5,1 [email protected] (DIA0) 182016 2011-04-20 19:41:33
4,1 [email protected] (PING) 182016 2011-04-20 19:41:33
3,1 [email protected] (DIAG) 182016 2011-04-20 19:41:33
2,1 [email protected] (VKTM) 182016 2011-04-20 19:41:33
1,1 [email protected] (PMON) 182016 2011-04-20 19:41:32
Worst possible total and average values of concurrent UGA memory usage:
35984688 bytes (total) and ~765631 bytes (average), for ~47 sessions.
Current value of PGA memory usage per session:
SID AND SERIAL# USERNAME OR PROGRAM VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465 SYS 177802680 2011-04-21 19:52:16
146,1 [email protected] (ARC3) 36302056 2011-04-20 19:41:43
19,5 [email protected] (ARC2) 35187944 2011-04-20 19:41:43
17,7 [email protected] (ARC0) 32959720 2011-04-20 19:41:42
145,1 [email protected] (ARC1) 19655912 2011-04-20 19:41:43
135,1 [email protected] (LGWR) 13298920 2011-04-20 19:41:33
131,1 [email protected] (LMS0) 11791592 2011-04-20 19:41:33
7,1 [email protected] (LMS1) 11791592 2011-04-20 19:41:33
6,1 [email protected] (LMD0) 11660520 2011-04-20 19:41:33
5,1 [email protected] (DIA0) 10398280 2011-04-20 19:41:33
10,1 [email protected] (DBW0) 8530760 2011-04-20 19:41:33
140,1 [email protected] (MARK) 8383720 2011-04-20 19:41:33
130,1 [email protected] (LMON) 8056104 2011-04-20 19:41:33
3,1 [email protected] (DIAG) 6548712 2011-04-20 19:41:33
16,1 [email protected] (RSMN) 5237992 2011-04-20 19:41:35
152,987 SYS 4582632 2011-04-21 18:40:59
125,7 [email protected] (CJQ0) 3935672 2011-04-20 19:41:50
25,57 SYS 3787544 2011-04-21 19:56:59
136,1 [email protected] (SMON) 3140840 2011-04-20 19:41:33
138,1 [email protected] (MMON) 3066648 2011-04-20 19:41:33
150,5 [email protected] (Q000) 2468424 2011-04-20 19:41:46
11,1 [email protected] (CKPT) 2379816 2011-04-20 19:41:33
31,1 [email protected] (SMCO) 2288872 2011-04-20 19:43:52
12,1 [email protected] (RECO) 2223336 2011-04-20 19:41:33
128,1 [email protected] (DBRM) 2092264 2011-04-20 19:41:33
23,1 [email protected] (Q001) 1961192 2011-04-20 19:41:46
22,1 [email protected] (QMNC) 1961192 2011-04-20 19:41:45
127,1 [email protected] (GEN0) 1830120 2011-04-20 19:41:33
14,1 [email protected] (MMNL) 1764584 2011-04-20 19:41:33
158,183 SYS 1510840 2011-04-21 07:44:57
143,23 SYS 1379768 2011-04-21 07:45:01
142,1 [email protected] (LCK0) 1117272 2011-04-20 19:41:34
149,1 [email protected] (RCBG) 912616 2011-04-20 19:41:45
33,59 [email protected] (W000) 716008 2011-04-21 19:55:14
4,1 [email protected] (PING) 716008 2011-04-20 19:41:33
144,1 [email protected] (O000) 650472 2011-04-20 19:41:36
137,1 [email protected] (RBAL) 650472 2011-04-20 19:41:33
134,1 [email protected] (MMAN) 650472 2011-04-20 19:41:33
133,1 [email protected] (RMS0) 650472 2011-04-20 19:41:33
129,1 [email protected] (ACMS) 650472 2011-04-20 19:41:33
126,1 [email protected] (PSP0) 650472 2011-04-20 19:41:32
21,1 [email protected] (GTX0) 650472 2011-04-20 19:41:45
18,3 [email protected] (O001) 650472 2011-04-20 19:41:37
13,1 [email protected] (ASMB) 650472 2011-04-20 19:41:33
9,1 [email protected] (LMHB) 650472 2011-04-20 19:41:33
2,1 [email protected] (VKTM) 650472 2011-04-20 19:41:33
1,1 [email protected] (PMON) 650472 2011-04-20 19:41:32
Current total and average values of concurrent PGA memory usage:
449247816 bytes (total) and ~9558464 bytes (average), for ~47 sessions.
Current value of UGA memory usage per session:
SID AND SERIAL# USERNAME OR PROGRAM VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
152,987 SYS 1641408 2011-04-21 18:40:59
138,1 [email protected] (MMON) 1098800 2011-04-20 19:41:33
25,57 SYS 950960 2011-04-21 19:56:59
125,7 [email protected] (CJQ0) 919152 2011-04-20 19:41:50
136,1 [email protected] (SMON) 722664 2011-04-20 19:41:33
150,5 [email protected] (Q000) 640424 2011-04-20 19:41:46
23,1 [email protected] (Q001) 378480 2011-04-20 19:41:46
12,1 [email protected] (RECO) 314784 2011-04-20 19:41:33
29,465 SYS 281144 2011-04-21 19:52:16
22,1 [email protected] (QMNC) 281144 2011-04-20 19:41:45
149,1 [email protected] (RCBG) 247504 2011-04-20 19:41:45
128,1 [email protected] (DBRM) 247504 2011-04-20 19:41:33
33,59 [email protected] (W000) 247504 2011-04-21 19:55:14
31,1 [email protected] (SMCO) 247504 2011-04-20 19:43:52
158,183 SYS 182016 2011-04-21 07:44:57
146,1 [email protected] (ARC3) 182016 2011-04-20 19:41:43
145,1 [email protected] (ARC1) 182016 2011-04-20 19:41:43
144,1 [email protected] (O000) 182016 2011-04-20 19:41:36
143,23 SYS 182016 2011-04-21 07:45:01
142,1 [email protected] (LCK0) 182016 2011-04-20 19:41:34
140,1 [email protected] (MARK) 182016 2011-04-20 19:41:33
137,1 [email protected] (RBAL) 182016 2011-04-20 19:41:33
135,1 [email protected] (LGWR) 182016 2011-04-20 19:41:33
134,1 [email protected] (MMAN) 182016 2011-04-20 19:41:33
133,1 [email protected] (RMS0) 182016 2011-04-20 19:41:33
131,1 [email protected] (LMS0) 182016 2011-04-20 19:41:33
130,1 [email protected] (LMON) 182016 2011-04-20 19:41:33
129,1 [email protected] (ACMS) 182016 2011-04-20 19:41:33
127,1 [email protected] (GEN0) 182016 2011-04-20 19:41:33
126,1 [email protected] (PSP0) 182016 2011-04-20 19:41:32
21,1 [email protected] (GTX0) 182016 2011-04-20 19:41:45
19,5 [email protected] (ARC2) 182016 2011-04-20 19:41:43
18,3 [email protected] (O001) 182016 2011-04-20 19:41:37
17,7 [email protected] (ARC0) 182016 2011-04-20 19:41:42
16,1 [email protected] (RSMN) 182016 2011-04-20 19:41:35
14,1 [email protected] (MMNL) 182016 2011-04-20 19:41:33
13,1 [email protected] (ASMB) 182016 2011-04-20 19:41:33
11,1 [email protected] (CKPT) 182016 2011-04-20 19:41:33
10,1 [email protected] (DBW0) 182016 2011-04-20 19:41:33
9,1 [email protected] (LMHB) 182016 2011-04-20 19:41:33
7,1 [email protected] (LMS1) 182016 2011-04-20 19:41:33
6,1 [email protected] (LMD0) 182016 2011-04-20 19:41:33
5,1 [email protected] (DIA0) 182016 2011-04-20 19:41:33
4,1 [email protected] (PING) 182016 2011-04-20 19:41:33
3,1 [email protected] (DIAG) 182016 2011-04-20 19:41:33
2,1 [email protected] (VKTM) 182016 2011-04-20 19:41:33
1,1 [email protected] (PMON) 182016 2011-04-20 19:41:32
Current total and average values of concurrent UGA memory usage:
14225504 bytes (total) and ~302670 bytes (average), for ~47 sessions.
Current SGA structure sizings:
Total System Global Area 939495424 bytes
Fixed Size 2232088 bytes
Variable Size 398459112 bytes
Database Buffers 532676608 bytes
Redo Buffers 6127616 bytes
Some initialization parameter values at instance startup:
large_pool_size=0
pga_aggregate_target=0
sga_target=0
shared_pool_size=0
sort_area_size=65536
streams_pool_size=0
Current Time: 2011.04.21-19:57:16
/* 可以从以上输出看到sid,serial=29,465会话的PGA内存使用量异常,达到了170M,
虽然跟以上案例中的PGA泄露情况比较不算什么 */
/* 使用sid和serial定位到具体的操作系统进程号 */
SQL> select spid,pid,PGA_USED_MEM,PGA_MAX_MEM from v$process
2 where addr=(select paddr from v$session where sid=&1 and serial#=&2);
Enter value for 1: 29
Enter value for 2: 465
old 2: where addr=(select paddr from v$session where sid=&1 and serial#=&2)
new 2: where addr=(select paddr from v$session where sid=29 and serial#=465)
SPID PID PGA_USED_MEM PGA_MAX_MEM
------------------------ ---------- ------------ -----------
26932 48 129716228 130034996
1 row selected.
SQL> oradebug setospid 26932;
Oracle pid: 48, Unix process pid: 26932, image: [email protected] (TNS V1-V3)
SQL> oradebug dump heapdump 536870917;
Statement processed.
SQL> oradebug dump processstate 10;
Statement processed.
SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_26932.trc
/* 接下来对堆转储文件进行分析,通过grep可以找出其中较大的SubHEAP子堆 */
[oracle@rh2 ~]$ egrep "HEAP DUMP heap name|Total heap size|Permanent space"
/s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_26932.trc
HEAP DUMP heap name="session heap" desc=0x2ac0b37a67f8
Total heap size = 130840
Permanent space = 62680
HEAP DUMP heap name="Alloc environm" desc=0x2ac0b37ce090
Total heap size = 4040
Permanent space = 4040
HEAP DUMP heap name="PLS UGA hp" desc=0x2ac0b37be7f0
Total heap size = 1992
Permanent space = 1080
HEAP DUMP heap name="koh-kghu sessi" desc=0x2ac0b37cf660
Total heap size = 1128
Permanent space = 80
HEAP DUMP heap name="pga heap" desc=0xb7c8ba0
Total heap size = 2689432
Permanent space = 660560
HEAP DUMP heap name="Alloc environm" desc=0x2ac0b35ba5c8
Total heap size = 1706816
Permanent space = 464
HEAP DUMP heap name="Alloc server h" desc=0x2ac0b35b9000
Total heap size = 1704400
Permanent space = 1694816
HEAP DUMP heap name="diag pga" desc=0x2ac0b32537e0
Total heap size = 65448
Permanent space = 3672
HEAP DUMP heap name="KFK_IO_SUBHEAP" desc=0x2ac0b35eb2b0
Total heap size = 10992
Permanent space = 80
HEAP DUMP heap name="peshm.c:Proces" desc=0x2ac0b35e7ad0
Total heap size = 4000
Permanent space = 80
HEAP DUMP heap name="KSFQ heap" desc=0x2ac0b35c6d70
Total heap size = 3256
Permanent space = 3256
HEAP DUMP heap name="top call heap" desc=0xb7ce3c0
Total heap size =155918560
Permanent space = 448
HEAP DUMP heap name="callheap" desc=0xb7cd4c0
Total heap size =152906784
Permanent space = 80
HEAP DUMP heap name="TCHK^30c42b7a" desc=0x2ac0b378ff48
Total heap size =151414512
Permanent space = 80
HEAP DUMP heap name="kggec.c.kggfa" desc=0x2ac0b4e76ec8
Total heap size = 1016
Permanent space = 736
HEAP DUMP heap name="kxs-heap-c" desc=0x2ac0b37800c0
Total heap size = 1489464
Permanent space = 1485928
HEAP DUMP heap name="top uga heap" desc=0xb7ce5e0
Total heap size = 131024
Permanent space = 80
HEAP DUMP heap name="session heap" desc=0x2ac0b37a67f8
Total heap size = 130840
Permanent space = 62680
HEAP DUMP heap name="Alloc environm" desc=0x2ac0b37ce090
Total heap size = 4040
Permanent space = 4040
HEAP DUMP heap name="PLS UGA hp" desc=0x2ac0b37be7f0
Total heap size = 1992
Permanent space = 1080
HEAP DUMP heap name="koh-kghu sessi" desc=0x2ac0b37cf660
Total heap size = 1128
Permanent space = 80
HEAP DUMP heap name="SQLA^30c42b7a" desc=0x6f4c3ab8
Total heap size = 4919904
Permanent space = 80
HEAP DUMP heap name="KGLH0^30c42b7a" desc=0x6ef44290
Total heap size = 4032
Permanent space = 2648
/* 以上heapdump表明TCHK^30c42b7a子堆占用了PGA中绝大多数的内存,
其中子堆的包含结构为PGA->top call heap -> call heap -> TCHK
*/
/* 接着processstate dump还可以让我们了解该问题进程的最近活动历史,及之前所运行的SQL语句
以便进一步诊断,以下为其调用堆栈 */
ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1900<-sspuser()+112<-__sighandler()<-qcsfccc()+206
<-qcsIsColInFro()+309<-qcsRslvColWithinQbc()+179<-qcsWeakColRslv()+94<-qcsRslvName()+2541<-qcsridn()+105
<-qcsraic()+455<-qcspqbDescendents()+527<-qcspqb()+260<-qcspqbDescendents()+2744<-qcspqb()+260
<-kkmdrv()+182<-opiSem()+1947<-opiprs()+293<-__PGOSF632_kksParseChildCursor()+572<-rpiswu2()+1618
<-kksLoadChild()+5167<-kxsGetRuntimeLock()+2066<-kksfbc()+14527<-kkspsc0()+2025<-kksParseCursor()+144
<-opiosq0()+2027<-kpooprx()+274<-kpoal8()+800<-opiodr()+910<-ttcpip()+2289<-opitsk()+1670<-opiino()+966
<-opiodr()+910<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<-__libc_start_main()+244
<-_start()+36
通过以上获取的信息"call heap-> TCHK"可以从MOS查找到11.2上的”Bug 11782790: ORA-4030 IS GENERATED IN HARD PARSE"和"Bug 12360198: ORA-04030 (TCHK^82665CD8,CHEDEF : QCUATC)";虽然我们这里故意为之而非真实的Bug,也可以看出"TCHK subheap"在11.2中充当SQL解析时的临时调用子堆(parse call subheap)。 to be continued.....