Probe how does your PGA consume

Probe how does your PGA consume

前2天有客户报一套10.2.0.3的数据库个别服务进程PGA使用量暴涨,疑似内存泄露(memory leak);遂提供on-site service,赶到用户现场时问题进程已经消失,系统内存使用量恢复正常,客户之前除了保留了v$process动态性能视图的信息外未抓取其他有用的诊断信息。

查看保存的v$process视图信息可以看到进程991714的PGA内存使用量达到13个G:

select spid,program,PGA_USED_MEM,PGA_ALLOC_MEM from v$process;
SPID                     PROGRAM                                          PGA_USED_MEM PGA_ALLOC_MEM
------------------------ ------------------------------------------------ ------------ -------------
991714                         oracleBTS@oam_app_a             14427510986 14432001786

oracle@oam_app_a@/oracle/product/10.2.0/dbs $ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        4194304
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) unlimited

SQL> select x.ksppinm name,y.ksppstvl value
  2  from sys.x$ksppi x, sys.x$ksppcv y
  3  where x.inst_id=USERENV('Instance')
  4  and y.inst_id = USERENV('Instance')
  5  and x.indx = y.indx
  6  and x.ksppinm like '%pga%';

pga_aggregate_target
200715200

NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
_pga_max_size
209715200

SQL> select x.ksppinm name,y.ksppstvl value
  2  from sys.x$ksppi x, sys.x$ksppcv y
  3  where x.inst_id=USERENV('Instance')
  4  and y.inst_id = USERENV('Instance')
  5  and x.indx = y.indx
  6  and x.ksppinm like '%hash_join%';

NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
_hash_join_enabled
TRUE

SQL> show parameter work

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fileio_network_adapters              string
workarea_size_policy                 string      AUTO

可以看到该系统使用自动PGA管理且pga_aggregate_target参数值为较小的191M,查询隐藏参数_pga_max_size可发现该参数值也为191M。

从告警日志alert.log中找不到任何信息,单个服务进程PGA使用量达到13G居然没有报ORA-04030错误!

到实例的user_dump目录下ls -ltr了一把有意外收获,找到了该991714进程最近的trace文件:

Dump file /oracle/product/10.2.0/admin/BTS/udump/bts_ora_991714.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /oracle/product/10.2.0
System name:    AIX
Node name:      oam_app_a
Release:        3
Version:        5
Machine:        0000E5DBD600
Instance name: BTS
Redo thread mounted by this instance: 1
Oracle process number: 46
Unix process pid: 991714, image: oracleBTS@oam_app_a

*** 2011-04-19 18:27:07.766
*** SERVICE NAME:(SYS$USERS) 2011-04-19 18:27:07.733
*** SESSION ID:(248.45987) 2011-04-19 18:27:07.733
WARNING: out of private memory [1]

以上trace文件中唯一有用的信息就是”WARNING: out of private memory [1]”了,但在metalink上搜索”out of private memory”找不到任何有用的信息;因为未发生ORA-04030错误,而用户也没有手动去收集过PGA Heap的堆使用情况,所以未产生任何对该进程PGA内存使用细节描述的转储信息文件,难以从dump信息中获取线索继续探索,线索断裂。

通过查询PGA状态历史记录视图可以发现在之前的2个快照时间窗口内,inused PGA总量为13G,之后PGA内存使用量又恢复正常:

select * from dba_hist_pgastat where name='total PGA inuse' order by value asc
 SNAP_ID       DBID INSTANCE_NUMBER NAME                                                                  VALUE
---------- ---------- --------------- ---------------------------------------------------------------- ----------
     16048 3731271451               1 total PGA inuse                                                  1.4883E+10
     16047 3731271451               1 total PGA inuse                                                  1.4888E+10

查询在以上时间窗口中991714的活动会话记录,可以发现该进程一直处于”cursor: pin S wait on X”中;Oracle 10.2.0.2以后利用mutex来替代latch保护SQL游标,当硬解析发生时会出现”cursor: pin s wait on X”等待。这里可以看到需要硬解析的SQL的ID为”bp7y9fvtqra1h”,而且此处SQL_CHILD_NUMBER为-1,说明Oracle共享池中之前没有缓存该SQL游标,该问题进程是首次为该SQL语句执行硬解析。

     	SNAP_ID	DBID	INSTANCE_NUMBER	SAMPLE_ID	SAMPLE_TIME	SESSION_ID	SESSION_SERIAL#	USER_ID	SQL_ID	SQL_CHILD_NUMBER	SQL_PLAN_HASH_VALUE	FORCE_MATCHING_SIGNATURE	SQL_OPCODE	PLSQL_ENTRY_OBJECT_ID	PLSQL_ENTRY_SUBPROGRAM_ID	PLSQL_OBJECT_ID	PLSQL_SUBPROGRAM_ID	SERVICE_HASH	SESSION_TYPE	SESSION_STATE	QC_SESSION_ID	QC_INSTANCE_ID	BLOCKING_SESSION	BLOCKING_SESSION_STATUS	BLOCKING_SESSION_SERIAL#	EVENT	EVENT_ID	SEQ#	P1TEXT	P1	P2TEXT	P2	P3TEXT	P3	WAIT_CLASS	WAIT_CLASS_ID	WAIT_TIME	TIME_WAITED	XID	CURRENT_OBJ#	CURRENT_FILE#	CURRENT_BLOCK#	PROGRAM	MODULE	ACTION	CLIENT_ID
1	16047	3731271451	1	57716200	19-4月 -11 04.00.35.625 下午	248	45987	25	bp7y9fvtqra1h	-1	0	0	0					3427055676	FOREGROUND	WAITING				UNKNOWN		cursor: pin S wait on X	1729366244	1140	idn	4083918896	value	1.14676E+12	where|sleeps	21474836525	Concurrency	3875070507	0	9772		-1	0	0
2	16047	3731271451	1	57716210	19-4月 -11 04.00.45.629 下午	248	45987	25	bp7y9fvtqra1h	-1	0	0	0					3427055676	FOREGROUND	WAITING				UNKNOWN		cursor: pin S wait on X	1729366244	2137	idn	4083918896	value	1.14676E+12	where|sleeps	21474837522	Concurrency	3875070507	0	9773		-1	0	0
3	16047	3731271451	1	57716220	19-4月 -11 04.00.55.633 下午	248	45987	25	bp7y9fvtqra1h	-1	0	0	0					3427055676	FOREGROUND	WAITING				UNKNOWN		cursor: pin S wait on X	1729366244	3134	idn	4083918896	value	1.14676E+12	where|sleeps	21474838519	Concurrency	3875070507	0	9775		-1	0	0
4	16047	3731271451	1	57716230	19-4月 -11 04.01.05.637 下午	248	45987	25	bp7y9fvtqra1h	-1	0	0	0					3427055676	FOREGROUND	WAITING				UNKNOWN		cursor: pin S wait on X	1729366244	4126	idn	4083918896	value	1.14676E+12	where|sleeps	21474839511	Concurrency	3875070507	0	9772		-1	0	0

但无论是从SQL历史记录视图(Workload Repository)还是v$SQL中都无法找到该条语句,可以判断该问题进程最后也没能成功完成对该’bp7y9fvtqra1h’语句的解析!因为没有该SQL的记录,所以也无法了解该SQL的执行计划和workarea工作区的使用状况,线索再次断裂!

SQL> select * from dba_hist_sqltext where sql_id='bp7y9fvtqra1h';

no rows selected

SQL> select * From v$sql where sql_id='bp7y9fvtqra1h';

no rows selected

通过查询进程内存使用总结历史视图dba_hist_process_mem_summary可以发现占用主要内存的分类是Other,猜测这些Other内存是用来解析游标使用的临时内存(call heap).

SQL> select snap_id,category,num_processes,non_zero_allocs,used_total,allocated_total,allocated_max from dba_hist_process_mem_summary where snap_id in (16047,16048,16049) order by allocated_total desc ,snap_id ;

   SNAP_ID CATEGORY        NUM_PROCESSES NON_ZERO_ALLOCS USED_TOTAL ALLOCATED_TOTAL ALLOCATED_MAX
---------- --------------- ------------- --------------- ---------- --------------- -------------
     16047 Other                      95              95                 1.5062E+10    1.4665E+10
     16048 Other                      96              96                 1.5053E+10    1.4664E+10
     16049 Other                      94              94                  386117660      53794202
     16047 Freeable                   36              36          0        36372480       3670016
     16048 Freeable                   33              33          0        32112640       3670016
     16047 SQL                        78              31   27346656        29785568      28016616
     16048 SQL                        79              34   26812992        29240400      27885544
     16049 Freeable                   27              27          0        26738688       3670016
     16049 SQL                        77              31     591528         1242168        125816
     16048 PL/SQL                     96              96     272224          601040         68944
     16049 PL/SQL                     94              94     218104          536592         68944
     16047 PL/SQL                     95              95     212816          506536         68800

因为证据断裂,无法根据现有的信息为该服务进程过量使用PGA内存来定位Bug。到MOS上搜索找不到类似的Bug,但即使有也很难定论,因为没有heapdump的话即便提交SR,Oracle GCS(Global Customer Service)也不太愿定位到Bug。

那么如果出现以上类似的PGA内存泄露的问题,我们因当如何第一时间收集有用的信息,以供后续诊断呢?我在这里提供一些可选的方案:

  1. 定期收集系统内PGA/UGA的使用情况,具体可以使用脚本<Script To Monitor RDBMS Session UGA and PGA Current And Maximum Usage>
  2. 防御式地在系统级别设置4030 dump heapdump 536870917级别的event dump事件,虽然此案例中未发生ORA-04030错误,但不代表下一次也不发生
  3. 在问题发生时,第一时间使用oradebug PGA_DETAIL_GET命令填充v$process_memory_detail视图,以便了解问题进程PGA内存的使用细节
  4. 若觉得v$process_memory_detail动态性能视图的信息还不够丰富的话,也可以做systemstate 266和heapdump 536870917级别的dump

虽然以上我们介绍了一些针对PGA内存泄露问题发生时的诊断方法,但可能你还是不了解具体要如何实践,接下来我们通过实践来体会一下,首先我们特意构建一个PGA内存泄露的假象(测试说明,不要用于生产环境!!):

[oracle@rh2 ~]$ wc -l very_large.sql 
18526 very_large.sql

/* 这里very_large.sql是我们"精心"构造的一条万分复杂的SQL语句,解析该SQL语句需要消耗大量的PGA内存 !!* /

SQL> @very_large

/* 执行very_large SQL语句开始模拟内存泄露,将持续较长时间  */

SQL> @MEMORY_USAGE_SCRIPT.SQL

Oracle Memory Usage Report: PGA And UGA Memory Usage Per Session

Host........: rh2.oracle.com

Name........: PROD1

Version.....: 11.2.0.2.0

Startup Time: 2011-04-20 19:41:32

Current Time: 2011.04.21-19:57:16

Worst possible value of concurrent PGA + UGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                              SUM(VALUE) SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465              SYS                                               180444688 2011-04-21 19:52:16
152,987             SYS                                                67781616 2011-04-21 18:40:59
146,1               [email protected] (ARC3)                       37598184 2011-04-20 19:41:43
19,5                [email protected] (ARC2)                       36484072 2011-04-20 19:41:43
17,7                [email protected] (ARC0)                       33141736 2011-04-20 19:41:42
145,1               [email protected] (ARC1)                       19837928 2011-04-20 19:41:43
125,7               [email protected] (CJQ0)                       15826432 2011-04-20 19:41:50
135,1               [email protected] (LGWR)                       13480936 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                       11973608 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                       11973608 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                       11842536 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                       10580296 2011-04-20 19:41:33
25,57               SYS                                                 9854112 2011-04-21 19:56:59
10,1                [email protected] (DBW0)                        9105992 2011-04-20 19:41:33
136,1               [email protected] (SMON)                        8777056 2011-04-20 19:41:33
140,1               [email protected] (MARK)                        8565736 2011-04-20 19:41:33
130,1               [email protected] (LMON)                        8238120 2011-04-20 19:41:33
138,1               [email protected] (MMON)                        7215184 2011-04-20 19:41:33
31,1                [email protected] (SMCO)                        7123896 2011-04-20 19:43:52
3,1                 [email protected] (DIAG)                        6730728 2011-04-20 19:41:33
16,1                [email protected] (RSMN)                        5420008 2011-04-20 19:41:35
150,5               [email protected] (Q000)                        5001608 2011-04-20 19:41:46
23,1                [email protected] (Q001)                        3445984 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                        3314960 2011-04-20 19:41:45
12,1                [email protected] (RECO)                        3249448 2011-04-20 19:41:33
11,1                [email protected] (CKPT)                        3086120 2011-04-20 19:41:33
128,1               [email protected] (DBRM)                        2667304 2011-04-20 19:41:33
14,1                [email protected] (MMNL)                        2143208 2011-04-20 19:41:33
127,1               [email protected] (GEN0)                        2012136 2011-04-20 19:41:33
158,183             SYS                                                 1758344 2011-04-21 07:44:57
143,23              SYS                                                 1692808 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                        1299288 2011-04-20 19:41:34
149,1               [email protected] (RCBG)                        1160120 2011-04-20 19:41:45
33,59               [email protected] (W000)                         963512 2011-04-21 19:55:14
4,1                 [email protected] (PING)                         898024 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         832488 2011-04-20 19:41:32
13,1                [email protected] (ASMB)                         832488 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         832488 2011-04-20 19:41:33
144,1               [email protected] (O000)                         832488 2011-04-20 19:41:36
129,1               [email protected] (ACMS)                         832488 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         832488 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         832488 2011-04-20 19:41:32
9,1                 [email protected] (LMHB)                         832488 2011-04-20 19:41:33
21,1                [email protected] (GTX0)                         832488 2011-04-20 19:41:45
18,3                [email protected] (O001)                         832488 2011-04-20 19:41:37
137,1               [email protected] (RBAL)                         832488 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         832488 2011-04-20 19:41:33

Worst possible total and average values of concurrent PGA + UGA memory usage:

564679192 bytes (total) and ~6007225 bytes (average), for ~47 sessions.

Approximate value of current PGA + UGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                              SUM(VALUE) SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465              SYS                                               178083824 2011-04-21 19:52:16
146,1               [email protected] (ARC3)                       36484072 2011-04-20 19:41:43
19,5                [email protected] (ARC2)                       35369960 2011-04-20 19:41:43
17,7                [email protected] (ARC0)                       33141736 2011-04-20 19:41:42
145,1               [email protected] (ARC1)                       19837928 2011-04-20 19:41:43
135,1               [email protected] (LGWR)                       13480936 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                       11973608 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                       11973608 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                       11842536 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                       10580296 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                        8712776 2011-04-20 19:41:33
140,1               [email protected] (MARK)                        8565736 2011-04-20 19:41:33
130,1               [email protected] (LMON)                        8238120 2011-04-20 19:41:33
3,1                 [email protected] (DIAG)                        6730728 2011-04-20 19:41:33
152,987             SYS                                                 6224040 2011-04-21 18:40:59
16,1                [email protected] (RSMN)                        5420008 2011-04-20 19:41:35
125,7               [email protected] (CJQ0)                        4854824 2011-04-20 19:41:50
25,57               SYS                                                 4738504 2011-04-21 19:56:59
138,1               [email protected] (MMON)                        4165448 2011-04-20 19:41:33
136,1               [email protected] (SMON)                        3863504 2011-04-20 19:41:33
150,5               [email protected] (Q000)                        3108848 2011-04-20 19:41:46
11,1                [email protected] (CKPT)                        2561832 2011-04-20 19:41:33
12,1                [email protected] (RECO)                        2538120 2011-04-20 19:41:33
31,1                [email protected] (SMCO)                        2536376 2011-04-20 19:43:52
128,1               [email protected] (DBRM)                        2339768 2011-04-20 19:41:33
23,1                [email protected] (Q001)                        2339672 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                        2242336 2011-04-20 19:41:45
127,1               [email protected] (GEN0)                        2012136 2011-04-20 19:41:33
14,1                [email protected] (MMNL)                        1946600 2011-04-20 19:41:33
158,183             SYS                                                 1692856 2011-04-21 07:44:57
143,23              SYS                                                 1561784 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                        1299288 2011-04-20 19:41:34
149,1               [email protected] (RCBG)                        1160120 2011-04-20 19:41:45
33,59               [email protected] (W000)                         963512 2011-04-21 19:55:14
4,1                 [email protected] (PING)                         898024 2011-04-20 19:41:33
13,1                [email protected] (ASMB)                         832488 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         832488 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         832488 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         832488 2011-04-20 19:41:32
9,1                 [email protected] (LMHB)                         832488 2011-04-20 19:41:33
21,1                [email protected] (GTX0)                         832488 2011-04-20 19:41:45
18,3                [email protected] (O001)                         832488 2011-04-20 19:41:37
137,1               [email protected] (RBAL)                         832488 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         832488 2011-04-20 19:41:32
133,1               [email protected] (RMS0)                         832488 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         832488 2011-04-20 19:41:33
144,1               [email protected] (O000)                         832488 2011-04-20 19:41:36

Current total and average values of concurrent PGA + UGA memory usage:

463473320 bytes (total) and ~4930567 bytes (average), for ~47 sessions.

Maximum value of PGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                                   VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465              SYS                                               177212856 2011-04-21 19:52:16
152,987             SYS                                                57208040 2011-04-21 18:40:59
146,1               [email protected] (ARC3)                       37416168 2011-04-20 19:41:43
19,5                [email protected] (ARC2)                       36302056 2011-04-20 19:41:43
17,7                [email protected] (ARC0)                       32959720 2011-04-20 19:41:42
145,1               [email protected] (ARC1)                       19655912 2011-04-20 19:41:43
135,1               [email protected] (LGWR)                       13298920 2011-04-20 19:41:33
125,7               [email protected] (CJQ0)                       13045176 2011-04-20 19:41:50
131,1               [email protected] (LMS0)                       11791592 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                       11791592 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                       11660520 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                       10398280 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                        8923976 2011-04-20 19:41:33
140,1               [email protected] (MARK)                        8383720 2011-04-20 19:41:33
130,1               [email protected] (LMON)                        8056104 2011-04-20 19:41:33
31,1                [email protected] (SMCO)                        6876392 2011-04-20 19:43:52
3,1                 [email protected] (DIAG)                        6548712 2011-04-20 19:41:33
25,57               SYS                                                 6163896 2011-04-21 19:56:59
136,1               [email protected] (SMON)                        5893352 2011-04-20 19:41:33
138,1               [email protected] (MMON)                        5294872 2011-04-20 19:41:33
16,1                [email protected] (RSMN)                        5237992 2011-04-20 19:41:35
150,5               [email protected] (Q000)                        3910216 2011-04-20 19:41:46
11,1                [email protected] (CKPT)                        2904104 2011-04-20 19:41:33
23,1                [email protected] (Q001)                        2551016 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                        2485480 2011-04-20 19:41:45
12,1                [email protected] (RECO)                        2485480 2011-04-20 19:41:33
128,1               [email protected] (DBRM)                        2223336 2011-04-20 19:41:33
14,1                [email protected] (MMNL)                        1961192 2011-04-20 19:41:33
127,1               [email protected] (GEN0)                        1830120 2011-04-20 19:41:33
158,183             SYS                                                 1510840 2011-04-21 07:44:57
143,23              SYS                                                 1445304 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                        1117272 2011-04-20 19:41:34
149,1               [email protected] (RCBG)                         912616 2011-04-20 19:41:45
33,59               [email protected] (W000)                         716008 2011-04-21 19:55:14
4,1                 [email protected] (PING)                         716008 2011-04-20 19:41:33
144,1               [email protected] (O000)                         650472 2011-04-20 19:41:36
137,1               [email protected] (RBAL)                         650472 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         650472 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         650472 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         650472 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         650472 2011-04-20 19:41:32
21,1                [email protected] (GTX0)                         650472 2011-04-20 19:41:45
18,3                [email protected] (O001)                         650472 2011-04-20 19:41:37
13,1                [email protected] (ASMB)                         650472 2011-04-20 19:41:33
9,1                 [email protected] (LMHB)                         650472 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         650472 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         650472 2011-04-20 19:41:32

Worst possible total and average values of concurrent PGA memory usage:

528694504 bytes (total) and ~11248819 bytes (average), for ~47 sessions.

Maximum value of UGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                                   VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
152,987             SYS                                                10573576 2011-04-21 18:40:59
25,57               SYS                                                 3690216 2011-04-21 19:56:59
29,465              SYS                                                 3231832 2011-04-21 19:52:16
136,1               [email protected] (SMON)                        2883704 2011-04-20 19:41:33
125,7               [email protected] (CJQ0)                        2781256 2011-04-20 19:41:50
138,1               [email protected] (MMON)                        1920312 2011-04-20 19:41:33
150,5               [email protected] (Q000)                        1091392 2011-04-20 19:41:46
23,1                [email protected] (Q001)                         894968 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                         829480 2011-04-20 19:41:45
12,1                [email protected] (RECO)                         763968 2011-04-20 19:41:33
128,1               [email protected] (DBRM)                         443968 2011-04-20 19:41:33
158,183             SYS                                                  247504 2011-04-21 07:44:57
149,1               [email protected] (RCBG)                         247504 2011-04-20 19:41:45
143,23              SYS                                                  247504 2011-04-21 07:45:01
33,59               [email protected] (W000)                         247504 2011-04-21 19:55:14
31,1                [email protected] (SMCO)                         247504 2011-04-20 19:43:52
146,1               [email protected] (ARC3)                         182016 2011-04-20 19:41:43
145,1               [email protected] (ARC1)                         182016 2011-04-20 19:41:43
144,1               [email protected] (O000)                         182016 2011-04-20 19:41:36
142,1               [email protected] (LCK0)                         182016 2011-04-20 19:41:34
140,1               [email protected] (MARK)                         182016 2011-04-20 19:41:33
137,1               [email protected] (RBAL)                         182016 2011-04-20 19:41:33
135,1               [email protected] (LGWR)                         182016 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         182016 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         182016 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                         182016 2011-04-20 19:41:33
130,1               [email protected] (LMON)                         182016 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         182016 2011-04-20 19:41:33
127,1               [email protected] (GEN0)                         182016 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         182016 2011-04-20 19:41:32
21,1                [email protected] (GTX0)                         182016 2011-04-20 19:41:45
19,5                [email protected] (ARC2)                         182016 2011-04-20 19:41:43
18,3                [email protected] (O001)                         182016 2011-04-20 19:41:37
17,7                [email protected] (ARC0)                         182016 2011-04-20 19:41:42
16,1                [email protected] (RSMN)                         182016 2011-04-20 19:41:35
14,1                [email protected] (MMNL)                         182016 2011-04-20 19:41:33
13,1                [email protected] (ASMB)                         182016 2011-04-20 19:41:33
11,1                [email protected] (CKPT)                         182016 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                         182016 2011-04-20 19:41:33
9,1                 [email protected] (LMHB)                         182016 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                         182016 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                         182016 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                         182016 2011-04-20 19:41:33
4,1                 [email protected] (PING)                         182016 2011-04-20 19:41:33
3,1                 [email protected] (DIAG)                         182016 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         182016 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         182016 2011-04-20 19:41:32

Worst possible total and average values of concurrent UGA memory usage:

35984688 bytes (total) and ~765631 bytes (average), for ~47 sessions.

Current value of PGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                                   VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465              SYS                                               177802680 2011-04-21 19:52:16
146,1               [email protected] (ARC3)                       36302056 2011-04-20 19:41:43
19,5                [email protected] (ARC2)                       35187944 2011-04-20 19:41:43
17,7                [email protected] (ARC0)                       32959720 2011-04-20 19:41:42
145,1               [email protected] (ARC1)                       19655912 2011-04-20 19:41:43
135,1               [email protected] (LGWR)                       13298920 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                       11791592 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                       11791592 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                       11660520 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                       10398280 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                        8530760 2011-04-20 19:41:33
140,1               [email protected] (MARK)                        8383720 2011-04-20 19:41:33
130,1               [email protected] (LMON)                        8056104 2011-04-20 19:41:33
3,1                 [email protected] (DIAG)                        6548712 2011-04-20 19:41:33
16,1                [email protected] (RSMN)                        5237992 2011-04-20 19:41:35
152,987             SYS                                                 4582632 2011-04-21 18:40:59
125,7               [email protected] (CJQ0)                        3935672 2011-04-20 19:41:50
25,57               SYS                                                 3787544 2011-04-21 19:56:59
136,1               [email protected] (SMON)                        3140840 2011-04-20 19:41:33
138,1               [email protected] (MMON)                        3066648 2011-04-20 19:41:33
150,5               [email protected] (Q000)                        2468424 2011-04-20 19:41:46
11,1                [email protected] (CKPT)                        2379816 2011-04-20 19:41:33
31,1                [email protected] (SMCO)                        2288872 2011-04-20 19:43:52
12,1                [email protected] (RECO)                        2223336 2011-04-20 19:41:33
128,1               [email protected] (DBRM)                        2092264 2011-04-20 19:41:33
23,1                [email protected] (Q001)                        1961192 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                        1961192 2011-04-20 19:41:45
127,1               [email protected] (GEN0)                        1830120 2011-04-20 19:41:33
14,1                [email protected] (MMNL)                        1764584 2011-04-20 19:41:33
158,183             SYS                                                 1510840 2011-04-21 07:44:57
143,23              SYS                                                 1379768 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                        1117272 2011-04-20 19:41:34
149,1               [email protected] (RCBG)                         912616 2011-04-20 19:41:45
33,59               [email protected] (W000)                         716008 2011-04-21 19:55:14
4,1                 [email protected] (PING)                         716008 2011-04-20 19:41:33
144,1               [email protected] (O000)                         650472 2011-04-20 19:41:36
137,1               [email protected] (RBAL)                         650472 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         650472 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         650472 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         650472 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         650472 2011-04-20 19:41:32
21,1                [email protected] (GTX0)                         650472 2011-04-20 19:41:45
18,3                [email protected] (O001)                         650472 2011-04-20 19:41:37
13,1                [email protected] (ASMB)                         650472 2011-04-20 19:41:33
9,1                 [email protected] (LMHB)                         650472 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         650472 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         650472 2011-04-20 19:41:32

Current total and average values of concurrent PGA memory usage:

449247816 bytes (total) and ~9558464 bytes (average), for ~47 sessions.

Current value of UGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                                   VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
152,987             SYS                                                 1641408 2011-04-21 18:40:59
138,1               [email protected] (MMON)                        1098800 2011-04-20 19:41:33
25,57               SYS                                                  950960 2011-04-21 19:56:59
125,7               [email protected] (CJQ0)                         919152 2011-04-20 19:41:50
136,1               [email protected] (SMON)                         722664 2011-04-20 19:41:33
150,5               [email protected] (Q000)                         640424 2011-04-20 19:41:46
23,1                [email protected] (Q001)                         378480 2011-04-20 19:41:46
12,1                [email protected] (RECO)                         314784 2011-04-20 19:41:33
29,465              SYS                                                  281144 2011-04-21 19:52:16
22,1                [email protected] (QMNC)                         281144 2011-04-20 19:41:45
149,1               [email protected] (RCBG)                         247504 2011-04-20 19:41:45
128,1               [email protected] (DBRM)                         247504 2011-04-20 19:41:33
33,59               [email protected] (W000)                         247504 2011-04-21 19:55:14
31,1                [email protected] (SMCO)                         247504 2011-04-20 19:43:52
158,183             SYS                                                  182016 2011-04-21 07:44:57
146,1               [email protected] (ARC3)                         182016 2011-04-20 19:41:43
145,1               [email protected] (ARC1)                         182016 2011-04-20 19:41:43
144,1               [email protected] (O000)                         182016 2011-04-20 19:41:36
143,23              SYS                                                  182016 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                         182016 2011-04-20 19:41:34
140,1               [email protected] (MARK)                         182016 2011-04-20 19:41:33
137,1               [email protected] (RBAL)                         182016 2011-04-20 19:41:33
135,1               [email protected] (LGWR)                         182016 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         182016 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         182016 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                         182016 2011-04-20 19:41:33
130,1               [email protected] (LMON)                         182016 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         182016 2011-04-20 19:41:33
127,1               [email protected] (GEN0)                         182016 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         182016 2011-04-20 19:41:32
21,1                [email protected] (GTX0)                         182016 2011-04-20 19:41:45
19,5                [email protected] (ARC2)                         182016 2011-04-20 19:41:43
18,3                [email protected] (O001)                         182016 2011-04-20 19:41:37
17,7                [email protected] (ARC0)                         182016 2011-04-20 19:41:42
16,1                [email protected] (RSMN)                         182016 2011-04-20 19:41:35
14,1                [email protected] (MMNL)                         182016 2011-04-20 19:41:33
13,1                [email protected] (ASMB)                         182016 2011-04-20 19:41:33
11,1                [email protected] (CKPT)                         182016 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                         182016 2011-04-20 19:41:33
9,1                 [email protected] (LMHB)                         182016 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                         182016 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                         182016 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                         182016 2011-04-20 19:41:33
4,1                 [email protected] (PING)                         182016 2011-04-20 19:41:33
3,1                 [email protected] (DIAG)                         182016 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         182016 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         182016 2011-04-20 19:41:32

Current total and average values of concurrent UGA memory usage:

14225504 bytes (total) and ~302670 bytes (average), for ~47 sessions.

Current SGA structure sizings:

Total System Global Area  939495424 bytes
Fixed Size                  2232088 bytes
Variable Size             398459112 bytes
Database Buffers          532676608 bytes
Redo Buffers                6127616 bytes

Some initialization parameter values at instance startup:

large_pool_size=0
pga_aggregate_target=0
sga_target=0
shared_pool_size=0
sort_area_size=65536
streams_pool_size=0

Current Time: 2011.04.21-19:57:16

/* 可以从以上输出看到sid,serial=29,465会话的PGA内存使用量异常,达到了170M,
    虽然跟以上案例中的PGA泄露情况比较不算什么  */

/* 使用sid和serial定位到具体的操作系统进程号 */

SQL> select spid,pid,PGA_USED_MEM,PGA_MAX_MEM from v$process
  2  where addr=(select paddr from v$session where sid=&1 and serial#=&2);
Enter value for 1: 29
Enter value for 2: 465
old   2: where addr=(select paddr from v$session where sid=&1 and serial#=&2)
new   2: where addr=(select paddr from v$session where sid=29 and serial#=465)

SPID                            PID PGA_USED_MEM PGA_MAX_MEM
------------------------ ---------- ------------ -----------
26932                            48    129716228   130034996

1 row selected.

SQL> oradebug setospid 26932;
Oracle pid: 48, Unix process pid: 26932, image: [email protected] (TNS V1-V3)

SQL> oradebug dump heapdump 536870917;
Statement processed.

SQL> oradebug dump processstate 10;
Statement processed.

SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_26932.trc

/* 接下来对堆转储文件进行分析,通过grep可以找出其中较大的SubHEAP子堆  */

[oracle@rh2 ~]$ egrep "HEAP DUMP heap name|Total heap size|Permanent space"
/s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_26932.trc

HEAP DUMP heap name="session heap"  desc=0x2ac0b37a67f8
Total heap size    =   130840
Permanent space    =    62680
HEAP DUMP heap name="Alloc environm"  desc=0x2ac0b37ce090
Total heap size    =     4040
Permanent space    =     4040
HEAP DUMP heap name="PLS UGA hp"  desc=0x2ac0b37be7f0
Total heap size    =     1992
Permanent space    =     1080
HEAP DUMP heap name="koh-kghu sessi"  desc=0x2ac0b37cf660
Total heap size    =     1128
Permanent space    =       80
HEAP DUMP heap name="pga heap"  desc=0xb7c8ba0
Total heap size    =  2689432
Permanent space    =   660560
HEAP DUMP heap name="Alloc environm"  desc=0x2ac0b35ba5c8
Total heap size    =  1706816
Permanent space    =      464
HEAP DUMP heap name="Alloc server h"  desc=0x2ac0b35b9000
Total heap size    =  1704400
Permanent space    =  1694816
HEAP DUMP heap name="diag pga"  desc=0x2ac0b32537e0
Total heap size    =    65448
Permanent space    =     3672
HEAP DUMP heap name="KFK_IO_SUBHEAP"  desc=0x2ac0b35eb2b0
Total heap size    =    10992
Permanent space    =       80
HEAP DUMP heap name="peshm.c:Proces"  desc=0x2ac0b35e7ad0
Total heap size    =     4000
Permanent space    =       80
HEAP DUMP heap name="KSFQ heap"  desc=0x2ac0b35c6d70
Total heap size    =     3256
Permanent space    =     3256
HEAP DUMP heap name="top call heap"  desc=0xb7ce3c0
Total heap size    =155918560
Permanent space    =      448
HEAP DUMP heap name="callheap"  desc=0xb7cd4c0
Total heap size    =152906784
Permanent space    =       80
HEAP DUMP heap name="TCHK^30c42b7a"  desc=0x2ac0b378ff48
Total heap size    =151414512
Permanent space    =       80
HEAP DUMP heap name="kggec.c.kggfa"  desc=0x2ac0b4e76ec8
Total heap size    =     1016
Permanent space    =      736
HEAP DUMP heap name="kxs-heap-c"  desc=0x2ac0b37800c0
Total heap size    =  1489464
Permanent space    =  1485928
HEAP DUMP heap name="top uga heap"  desc=0xb7ce5e0
Total heap size    =   131024
Permanent space    =       80
HEAP DUMP heap name="session heap"  desc=0x2ac0b37a67f8
Total heap size    =   130840
Permanent space    =    62680
HEAP DUMP heap name="Alloc environm"  desc=0x2ac0b37ce090
Total heap size    =     4040
Permanent space    =     4040
HEAP DUMP heap name="PLS UGA hp"  desc=0x2ac0b37be7f0
Total heap size    =     1992
Permanent space    =     1080
HEAP DUMP heap name="koh-kghu sessi"  desc=0x2ac0b37cf660
Total heap size    =     1128
Permanent space    =       80
HEAP DUMP heap name="SQLA^30c42b7a"  desc=0x6f4c3ab8
Total heap size    =  4919904
Permanent space    =       80
HEAP DUMP heap name="KGLH0^30c42b7a"  desc=0x6ef44290
Total heap size    =     4032
Permanent space    =     2648


/* 以上heapdump表明TCHK^30c42b7a子堆占用了PGA中绝大多数的内存,
    其中子堆的包含结构为PGA->top call heap -> call heap  -> TCHK 
*/

/* 接着processstate dump还可以让我们了解该问题进程的最近活动历史,及之前所运行的SQL语句
    以便进一步诊断,以下为其调用堆栈      */

ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1900<-sspuser()+112<-__sighandler()<-qcsfccc()+206
<-qcsIsColInFro()+309<-qcsRslvColWithinQbc()+179<-qcsWeakColRslv()+94<-qcsRslvName()+2541<-qcsridn()+105
<-qcsraic()+455<-qcspqbDescendents()+527<-qcspqb()+260<-qcspqbDescendents()+2744<-qcspqb()+260
<-kkmdrv()+182<-opiSem()+1947<-opiprs()+293<-__PGOSF632_kksParseChildCursor()+572<-rpiswu2()+1618
<-kksLoadChild()+5167<-kxsGetRuntimeLock()+2066<-kksfbc()+14527<-kkspsc0()+2025<-kksParseCursor()+144
<-opiosq0()+2027<-kpooprx()+274<-kpoal8()+800<-opiodr()+910<-ttcpip()+2289<-opitsk()+1670<-opiino()+966
<-opiodr()+910<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<-__libc_start_main()+244
<-_start()+36

通过以上获取的信息"call heap-> TCHK"可以从MOS查找到11.2上的”Bug 11782790: ORA-4030 IS GENERATED IN HARD PARSE"和"Bug 12360198: ORA-04030 (TCHK^82665CD8,CHEDEF : QCUATC)";虽然我们这里故意为之而非真实的Bug,也可以看出"TCHK subheap"在11.2中充当SQL解析时的临时调用子堆(parse call subheap)。

to be continued.....

你可能感兴趣的:(Probe how does your PGA consume)