Oracle Share Pool内部管理机制
SHARE POOL利用堆(HEAP)的内存管理方式管理,在物理上由多个内存区(EXTENT)组成,内存区又由多个不同大小的CHUNK组成。而CHUNK又有可重用和空闲之分,并且它们分别有LRU LIST、FREE LIST、RESERVED LIST串联起来。
堆管理
Shared Pool是利用堆内存管理方式管理的(KGH:Kernel Generic Heap).从Oracle 9i开始,可以有多个最高级堆(TOP-LEVLE HEAP),最高级堆可以分成多个副堆,副堆下面还拥有子堆。堆和副对结构基本相同。从物理上讲,一个堆由多个内存区已link list的形式连接组成。一个内存区物理上使用一个Granule,一个内存区由多个chunk组成,所以chunk是heap的最小内存单位。
Chunk的使用情况可由X$KSMSP内部视图查看。每个堆头上则包含了可使用的chunk列表和已使用的chunk列表。通过dump heap命令可以在trace文件中观察heap和extent的关系。
alter system set events 'immediate trace name heapdump level 2';
案例:
SQL> select name,bytes/1024/1024 from v$sgainfo; NAME BYTES/1024/1024 -------------------------------- --------------- Fixed SGA Size 1.27566528 Redo Buffers 5.59765625 Buffer Cache Size 180 Shared Pool Size 104 Large Pool Size 4 Java Pool Size 4 Streams Pool Size 0 Shared IO Pool Size 0 Granule Size 4 Maximum SGA Size 498.875 Startup overhead in Shared Pool 52 NAME BYTES/1024/1024 -------------------------------- --------------- Free SGA Memory Available 200 12 rows selected. [oracle@node1 ~]$ more /u01/app/oracle/diag/rdbms/prod/prod/trace/prod_ora_14983.trc Trace file /u01/app/oracle/diag/rdbms/prod/prod/trace/prod_ora_14983.trc Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1 System name: Linux Node name: node1 Release: 2.6.18-194.el5 Version: #1 SMP Tue Mar 16 21:52:43 EDT 2010 Machine: i686 Instance name: prod Redo thread mounted by this instance: 1 Oracle process number: 32 Unix process pid: 14983, image: oracle@node1 (TNS V1-V3) *** 2014-07-02 14:58:47.532 *** SESSION ID:(32.758) 2014-07-02 14:58:47.532 *** CLIENT ID:() 2014-07-02 14:58:47.532 *** SERVICE NAME:(SYS$USERS) 2014-07-02 14:58:47.532 *** MODULE NAME:(sqlplus@node1 (TNS V1-V3)) 2014-07-02 14:58:47.532 *** ACTION NAME:() 2014-07-02 14:58:47.532 KGH Latch Directory Information ldir state: 2 last allocated slot: 99 Slot [ 1] Latch: 0x200065ec Index: 1 Flags: 3 State: 2 next: (nil) Slot [ 2] Latch: 0x3e75571c Index: 1 Flags: 3 State: 2 next: (nil) ......(省略) Slot [ 99] Latch: 0x2002616c Index: 1 Flags: 3 State: 2 next: 0x1 ****************************************************** HEAP DUMP heap name="sga heap" desc=0x200010b4 extent sz=0x7ad4 alt=124 het=32767 rec=9 flg=-126 opc=0 parent=(nil) owner=(nil) nex=(nil) xsz=0x0 heap=(nil) fl2=0x60, nex=(nil) ds for latch 1: 0x2002a990 0x2002b5c8 0x2002c200 0x2002ce38 reserved granule count 0 (granule size 4194304) ****************************************************** HEAP DUMP heap name="sga heap(1,0)" desc=0x2002a990 extent sz=0xfc4 alt=124 het=32767 rec=9 flg=-126 opc=0 parent=(nil) owner=(nil) nex=(nil) xsz=0x400000 heap=(nil) fl2=0x20, nex=(nil) latch set 1 of 1 durations enabled for this heap reserved granules for root 0 (granule size 4194304) EXTENT 0 addr=0x3b800000 Chunk 3b800038 sz= 24 R-freeable "reserved stoppe" Chunk 3b800050 sz= 212888 R-free " " Chunk 3b833fe8 sz= 24 R-freeable "reserved stoppe" Chunk 3b834000 sz= 3981312 perm "perm " alo=3290272 EXTENT 1 addr=0x3bc00000 Chunk 3bc00038 sz= 24 R-freeable "reserved stoppe" Chunk 3bc00050 sz= 212888 R-free " " Chunk 3bc33fe8 sz= 24 R-freeable "reserved stoppe" Chunk 3bc34000 sz= 3510272 perm "perm " alo=3510272 Chunk 3bf8d000 sz= 465920 perm "perm " alo=465920 Chunk 3bffec00 sz= 5120 free " " EXTENT 2 addr=0x3c000000 Chunk 3c000038 sz= 24 R-freeable "reserved stoppe" Chunk 3c000050 sz= 212888 R-free " " Chunk 3c033fe8 sz= 24 R-freeable "reserved stoppe" Chunk 3c034000 sz= 3850176 perm "perm " alo=3850176 Chunk 3c3dffc0 sz= 131088 perm "perm " alo=131088 Chunk 3c3fffd0 sz= 48 free " " EXTENT 3 addr=0x3c400000 Chunk 3c400038 sz= 24 R-freeable "reserved stoppe" Chunk 3c400050 sz= 212888 R-free " " Chunk 3c433fe8 sz= 24 R-freeable "reserved stoppe" Chunk 3c434000 sz= 3980368 perm "perm " alo=3980368 Chunk 3c7ffc50 sz= 944 free " " EXTENT 4 addr=0x3c800000 Chunk 3c800038 sz= 24 R-freeable "reserved stoppe" Chunk 3c800050 sz= 212888 R-free " " Chunk 3c833fe8 sz= 24 R-freeable "reserved stoppe" Chunk 3c834000 sz= 3672700 perm "perm " alo=3672700 Chunk 3cbb4a7c sz= 308240 perm "perm " alo=308240 Chunk 3cbffe8c sz= 372 free " "
CHUNK
Chunk是以链条(chain)的方式存在于内存区,每个chunk包含header和body两部分,chunk的状态大体上可以分为free(可立即使用)、recreatable(可再生)、freeable(只有在session或call期间内保存必要的对象状态)、permanent(永久、不可再生)等,其中free和recreatable状态下的chunk可以重复使用。
FREE LIST
Free list主要用于管理空闲的chunk,并且是用bucket管理的。从Oracle 9i开始,一个heap总共255个bucket,bucket所包含的free chunk大小随bucket的编号增加而递增,个bucket下的free chunk已linked list的形式链接。
(1)获得shared pool latch在free list中查找合适大小的空闲chunk。如果在获取shared pool latch时发生了争用,则会出现latch:shared pool等待事件。这时Oracle会一直持有shared pool latch,直到获得所需的内存为止。所在内存碎片化比较严重的shared pool中,进程持有shared pool latch的时间也会相应变长。
(2)如果不存在合适大小的空闲chunk,则在查找到更大的空闲chunk后分割(split)使用,分割后剩下的内存区域则重新登记到free list中。分割内存意味着内存中的碎片开始增多。由于每个cursor所需内存的大小不同,所以shared pool的空闲内存不像buffer cache中的空闲内存一样具有固定大小。
(3)如果检索了free list也没有找到所有合适的空闲chunk,则检索lru list。lru list上的chunk是重建(recreatable)的,而且是当前不使用的(没有处于pin状态)。
(4)如果在lru list上也没有找到合适的chunk,且所请求的内存还没有达到隐含参数_shared_pool_reserved_min_alloc的阈值,则追加分配share pool中剩余的内存空间。
(5)如果以上请求的内存均失败,则出现ORA-4031错误
FREE LISTS: Bucket 0 size=16 Bucket 1 size=20 Bucket 2 size=24 Bucket 3 size=28 Bucket 4 size=32 Bucket 5 size=36 Bucket 6 size=40 Bucket 7 size=44 Bucket 8 size=48 Chunk 3c3fffd0 sz= 48 free " " Chunk 3ebfffd0 sz= 48 free " " Bucket 9 size=52 Bucket 10 size=56 Bucket 11 size=60 Bucket 12 size=64 Bucket 13 size=68 Bucket 14 size=72 Bucket 15 size=76 Bucket 16 size=80 Bucket 17 size=84 Bucket 18 size=88 Chunk 3e7fffa8 sz= 88 free " " Bucket 19 size=92 Bucket 20 size=96 Bucket 21 size=100 Bucket 22 size=104 Bucket 23 size=108 Bucket 24 size=112 Bucket 25 size=116 Bucket 26 size=120 Bucket 27 size=124 ......
LRU LIST
Lru list主要保存着当前未使用而且可以重建的chunk。当会话在free list中找不到空闲的chunk时,在会在lru list中寻找。
UNPINNED RECREATABLE CHUNKS (lru first): Chunk 3b1f4000 sz= 4096 recreate "CCUR^3de2d7f5 " latch=(nil) Chunk 3a649000 sz= 4096 recreate "CCUR^f5a8106a " latch=(nil) Chunk 383b3000 sz= 4096 recreate "CCUR^2b855a04 " latch=(nil) Chunk 383b1000 sz= 4096 recreate "CCUR^f65cc3ed " latch=(nil) Chunk 3ee4c524 sz= 332 recreate "KGLHD " latch=(nil) Chunk 3af76da8 sz= 4096 recreate "SQLA^13df1501 " latch=(nil) Chunk 3a72e000 sz= 4096 recreate "CCUR^13df1501 " latch=(nil) Chunk 39ff96e0 sz= 332 recreate "KGLHD " latch=(nil) Chunk 3a37e740 sz= 4096 recreate "SQLA^7a9f4a60 " latch=(nil) Chunk 38330000 sz= 4096 recreate "CCUR^7a9f4a60 " latch=(nil) Chunk 39f8732c sz= 332 recreate "KGLHD " latch=(nil) Chunk 3ab0dec4 sz= 4096 recreate "SQLA^8005cf3c " latch=(nil) Chunk 3a670000 sz= 4096 recreate "CCUR^8005cf3c " latch=(nil) Chunk 3ed28b1c sz= 1524 recreate "KGLHD " latch=(nil) Chunk 39cb0d20 sz= 332 recreate "KGLHD " latch=(nil) Chunk 3aaeb214 sz= 4096 recreate "SQLA^63c15ff " latch=(nil) Chunk 3a646000 sz= 4096 recreate "CCUR^63c15ff " latch=(nil) Chunk 39ff9a38 sz= 540 recreate "KQR PO " latch=0x3ca28438 Chunk 3ee4c88c sz= 540 recreate "KQR PO " latch=0x3ca28438 Chunk 3ee4caa8 sz= 540 recreate "KQR PO " latch=0x3ca28438 Chunk 39fabd2c sz= 540 recreate "KQR PO " latch=0x3ca28438 Chunk 39fabf48 sz= 540 recreate "KQR PO " latch=0x3ca28438 Chunk 3ee481f8 sz= 540 recreate "KQR PO " latch=0x3ca28438 Chunk 3ee48414 sz= 540 recreate "KQR PO " latch=0x3ca28438 Chunk 3ee48630 sz= 540 recreate "KQR PO " latch=0x3ca28438 Chunk 3a1a1228 sz= 4096 recreate "KGLS^9b4819a8 " latch=(nil) Chunk 39b4a504 sz= 4096 recreate "KGLS^6ccccfe " latch=(nil) Chunk 3abf0fb4 sz= 4096 recreate "KGLS^8db54da " latch=(nil) Chunk 3aa65508 sz= 4096 recreate "KGLS^e2d7481e " latch=(nil) Chunk 3a2839d0 sz= 4096 recreate "KGLS^8dd845ff " latch=(nil) Chunk 3abd4f78 sz= 4096 recreate "KGLS^f79d229a " latch=(nil)
RESERVED FREE LIST
Oracle 设置了保留内存区域,该区域用 Reserved free list管理。其大小由参数shared_pool_reserved_size决定(最小为5000字节,最大不能超过shared pool的50%)。
RESERVED FREE LISTS: Reserved bucket 0 size=16 Chunk 3dfffc14 sz= 980 R-free " " Chunk 3dbffbe0 sz= 1032 R-free " " Reserved bucket 1 size=4400 Reserved bucket 2 size=8204 Reserved bucket 3 size=8460 Reserved bucket 4 size=8464 Reserved bucket 5 size=8468 Reserved bucket 6 size=8472 Reserved bucket 7 size=9296 Reserved bucket 8 size=9300 Reserved bucket 9 size=12320 Reserved bucket 10 size=12324 Reserved bucket 11 size=16396 Reserved bucket 12 size=32780 Reserved bucket 13 size=65548 Chunk 3b800050 sz= 212888 R-free " " Chunk 3bc00050 sz= 212888 R-free " " Chunk 3c000050 sz= 212888 R-free " " Chunk 3c400050 sz= 212888 R-free " " Chunk 3c800050 sz= 212888 R-free " " Chunk 3cc00050 sz= 212888 R-free " " Chunk 3d3cb8b8 sz= 214832 R-free " " Chunk 3d400050 sz= 212888 R-free " " Chunk 3e000050 sz= 212888 R-free " " Chunk 3e400050 sz= 212888 R-free " " Chunk 3e800050 sz= 212888 R-free " " Chunk 3f000050 sz= 212888 R-free " " Reserved bucket 14 size=1990630 Total reserved free space = 2558612 UNPINNED RECREATABLE CHUNKS (lru first): Chunk 3b1f4000 sz= 4096 recreate "CCUR^3de2d7f5 " latch=(nil) Chunk 3a649000 sz= 4096 recreate "CCUR^f5a8106a " latch=(nil) ......
Oracle对进入reserved free list的对象大小有限制,即只有大于_shared_pool_reserved_min_alloc隐含参数阈值(默认值4400)的cursor才能进入到reserved free list。
SHARED POOL的SUB POOL技术
从Oracle 9i开始,shared pool可以分为多个sub pool,其数量受一下几个因素影响:
1、系统的CPU数量,默认情况下,在Oracle中每4个CPU分配一个sub pool,最多不能超过7个。
2、共享池的大小。sub pool的最小容量随着Oracle版本不同而不同。
9i ----- 128M(Minimum subpool size)
10g<10.2.0.3 ---- 256M
10.2.0.3 and higher ---512M
3、隐含参数_kghdisdx_count值
每个sub pool拥有独立的free list 、lru list和shared pool latch。从这个角度来讲,当系统拥有足够的内存和CPU时,将shared pool分为多个sub pool时能有效的减少shared pool latch的争用。
查看sub pool的数量:
kghlushrpool: =1: shared pool subpools =0: java pool SQL> select count(kghluidx) num_pools 2 from x$kghlu 3 where kghlushrpool=1 4 / NUM_POOLS ---------- 1 SQL> show parameter cpu NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ cpu_count integer 1
查看sub pool latch信息:
SQL> col name for a40 SQL> set linesize 120 SQL> r 1 select addr,name,gets,misses,spin_gets 2 from v$latch_children 3* where name ='shared pool' ADDR NAME GETS MISSES SPIN_GETS -------- ---------------------------------------- ---------- ---------- ---------- 200A80FC shared pool 21 0 0 200A8098 shared pool 21 0 0 200A8034 shared pool 21 0 0 200A7FD0 shared pool 21 0 0 200A7F6C shared pool 21 0 0 200A7F08 shared pool 21 0 0 200A7EA4 shared pool 493378 7 0 7 rows selected.
查看sub pool 内存分配:
X$KSMSS([K]ernal [S]torage [M]emory Management [S]GA [S]atistics (lengths of SGA objects)
SQL> select 2 'shared pool(' || nvl(decode(to_char(ksmdsidx),'0','0-unused',ksmdsidx),'Total') 3 || '):' subpool, 4 sum(ksmsslen) bytes,round(sum (ksmsslen)/1048576 ,2 ) mb 5 from x$ksmss 6 where ksmsslen >0 7 group by rollup (ksmdsidx) order by subpool asc; SUBPOOL BYTES MB ------------------------------------------------------ ---------- ---------- shared pool(1): 121639892 116 shared pool(Total): 121639892 116
SQL>
Oracle 硬解析
1、获得shared pool latch ,从free list的bucket 中查找合适大小的free chunk。如果free list中的bucket list过长或者shared pool碎片化严重,那么在多个进程同时请求分配内存时,则会发生shared pool latch的争用。
2、如果不存在大小合适的free chunk,则分割较大的free chunk,分割后的free chunk重新挂载到适当大小的bucket下。如果不存在free chunk,则检索lru list。若在lru list中也不能获得合适大小的bucket,则从shared pool的剩余空闲内存中分配。如果cursor大小大于_shared_pool_reserved_min_alloc隐含参数设定的阀值,那么在reserved list中寻找free chunk。若果以上过程均失败,则出现ORA-4031错误。
3、若找到合适大小的chunk,则对cursor相应的handle(library cache handle)以exclusive 模式获得library cache lock,并创建LCO信息。在创建LCO信息后,library cache lock变换为null模式,然后以exclusive模式获得library cache pin,并创建执行计划等信息。硬解析成功后Oracle增加parse count(hard)统计值。
4、对sql cursor已shared模式获得library cache lock和library cache pin,并执行sql,这个阶段称之为执行阶段。
5、sql cursor执行结束后进入fetch阶段。在fetch阶段,sql cursor将library cache pin变为null模式,并释放library cache pin。
软软解析
由于在软解析过程中需要获得library cache latch,所以在高并发软解析的系统中,依然会出现与latch:library cache相关的等待事件,从而导致性能缓慢。
软软解析(Tom Kety)核心原理是通过设置session_cache_cursors参数将某个会话中常用的sql放入UGA的会话缓存区中,当会话发起相同的sql时,可以快速的从UGA取得cursor的信息,从而减少共享池的争用。当一个cursor被解析3次以上(包括3次)就会被放入到UG会话缓存区中。
案例:shared pool latch 争用
案例1: 业务运行前: 17:07:30 SYS@ prod>select name,GETS,MISSES from v$latch where upper(name) like '%LIBRARY%' OR upper(name) like '%SHARE%'; NAME GETS MISSES ---------------------------------------------------------------- ---------- ---------- test shared non-parent l0 0 0 ksxp shared latch 0 0 kcfis stats shared latch 0 0 shared pool 126676 61 library cache load lock 0 0 shared pool simulator 6576 0 shared pool sim alloc 45 0 Shared B-Tree 302 0 shared server configuration 6 0 shared server info 1 0 运行业务: 17:08:34 SCOTT@ prod>begin 17:08:38 2 for i in 1..100000 loop 17:08:52 3 execute immediate 'insert into t1 values ('||i||')'; 17:09:18 4 end loop; 17:09:26 5 end; 17:09:27 6 / PL/SQL procedure successfully completed. 业务运行后: 17:11:05 SYS@ prod>select name,GETS,MISSES from v$latch where upper(name) like '%LIBRARY%' OR upper(name) like '%SHARE%' NAME GETS MISSES ---------------------------------------------------------------- ---------- ---------- test shared non-parent l0 0 0 ksxp shared latch 0 0 kcfis stats shared latch 0 0 shared pool 4526672 214 library cache load lock 0 0 shared pool simulator 1086437 0 shared pool sim alloc 2048 0 Shared B-Tree 316 0 shared server configuration 6 0 shared server info 1 0 10 rows selected. 17:15:42 SYS@ prod>select sid,event,WAIT_TIME,state from v$session_wait where sid=42 SID EVENT WAIT_TIME STATE ---------- ---------------------------------------------------------------- ---------- ------------------- 42 latch: shared pool -1 WAITED SHORT TIME Elapsed: 00:00:00.08 案例2: 业务运行前: 17:18:35 SYS@ prod>select sid,EVENT,TOTAL_WAITS,AVERAGE_WAIT from v$session_event where sid in (42,46); SID EVENT TOTAL_WAITS AVERAGE_WAIT ---------- ---------------------------------------------------------------- ----------- ------------ 42 Disk file operations I/O 4 .03 42 log file switch (private strand flush incomplete) 1 10.03 42 log file sync 4 1.76 42 db file sequential read 385 .23 42 latch: row cache objects 5 .44 42 latch: shared pool 194 .25 42 SQL*Net message to client 24 0 42 SQL*Net message from client 23 5318.9 42 SQL*Net break/reset to client 2 .08 42 events in waitclass Other 1 0 46 Disk file operations I/O 1 .03 46 db file sequential read 33 .02 46 SQL*Net message to client 13 0 46 SQL*Net message from client 12 79.9 14 rows selected. 运行业务: 17:16:39 SYS@ prod>select sid ,username from v$session where username is not null; SID USERNAME ---------- ------------------------------ 1 SYS 42 SCOTT 46 HR 17:17:22 SCOTT@ prod>begin 17:20:46 2 for i in 1..100000 loop 17:20:52 3 execute immediate 'insert into t1 values ('||i||')'; 17:20:58 4 end loop; 17:21:02 5 end; 17:21:05 6 / PL/SQL procedure successfully completed. 17:17:42 HR@ prod>begin 17:21:16 2 for i in 1..100000 loop 17:21:24 3 execute immediate 'insert into scott.t1 values ('||i||')'; 17:21:49 4 end loop; 17:21:51 5 end; 17:21:52 6 / PL/SQL procedure successfully completed. 业务运行后: 17:22:32 SYS@ prod>select sid,EVENT,TOTAL_WAITS,AVERAGE_WAIT from v$session_event where sid in (42,46); SID EVENT TOTAL_WAITS AVERAGE_WAIT ---------- ---------------------------------------------------------------- ----------- ------------ 42 Disk file operations I/O 4 .03 42 latch: cache buffers chains 16 .18 42 buffer busy waits 2 .15 42 log file switch (private strand flush incomplete) 1 10.03 42 log file sync 4 1.76 42 db file sequential read 413 .21 42 latch: row cache objects 58 .13 42 latch: shared pool 1008 .19 42 library cache: mutex X 123 .33 42 SQL*Net message to client 24 0 42 SQL*Net message from client 24 6044.43 42 SQL*Net break/reset to client 2 .08 42 events in waitclass Other 87 .09 46 Disk file operations I/O 3 .03 46 latch: cache buffers chains 13 .21 46 buffer busy waits 1 .35 46 latch: redo copy 1 1.26 SID EVENT TOTAL_WAITS AVERAGE_WAIT ---------- ---------------------------------------------------------------- ----------- ------------ 46 db file sequential read 38 .02 46 enq: HW - contention 1 .01 46 latch: row cache objects 58 .14 46 row cache lock 1 .08 46 latch: shared pool 666 .17 46 library cache: mutex X 99 .29 46 SQL*Net message to client 13 0 46 SQL*Net message from client 13 2010.63 46 events in waitclass Other 68 .14 26 rows selected. Elapsed: 00:00:00.37 17:22:42 SYS@ prod> 17:22:02 SYS@ prod>select sid,event,WAIT_TIME,state from v$session_wait where sid=42 17:22:25 2 or sid=46; SID EVENT WAIT_TIME STATE ---------- ---------------------------------------------------------------- ---------- ------------------- 42 latch: shared pool -1 WAITED SHORT TIME 46 latch: shared pool -1 WAITED SHORT TIME
----以上部分内容整理自《Oracle DBA实战攻略》 周亮著,感谢作者的辛勤付出!