This post is based mainly on the following MOS note (Doc ID 123754.1):
AIX: Determining Oracle Memory Usage On AIX (Doc ID 123754.1)
Applies to:
Oracle Database - Enterprise Edition - Version 8.1.7.0 and later
IBM AIX on POWER Systems (64-bit)
IBM AIX on POWER Systems (32-bit)
The first program type you should be aware of are the Oracle background processes.
These processes are created when you start a database instance. Common examples of Oracle background processes are log writer (lgwr), database writer (dbw0), system monitor (smon), process monitor (pmon), recovery (reco), and check point (ckpt), but may include others. These processes run with the name ora_ProcessName_SID, where ProcessName is the name of the background process and SID is the value of ORACLE_SID.
For example, the process monitor background process for a database instance named DEV would be "ora_pmon_DEV". The second program type you should be aware of are the Oracle user processes. These processes are created when you start a program which will work with an Oracle database.
Common examples of Oracle user processes are sqlplus, imp, exp, and sqlldr, but may include many others. These user processes are frequently referred to as client processes and are usually named the same as the command you used to start the program.
For example, the sqlplus user process would be named "sqlplus". The third program type you should be aware of are the Oracle server processes.
Server processes work directly with the database instance to carry out the requests from the user processes. Server processes are often referred to as shadow processes, and this is the term that will be used in this article. Shadow processes may be dedicated to a single user process or part of a multi-threaded server (MTS) configuration. The shadow processes are named oracleSID, where SID is the value of ORACLE_SID. For example, any shadow process connected to the database instance "DEV" would be named "oracleDEV".
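## Oracle processes fall into three main types: background processes, server processes (also called foreground processes), and user processes.
The first kind of process to understand is the Oracle background process. Background processes are created when you start the instance. Common ones include lgwr (log writer), dbwr (database writer), smon (system monitor), pmon (process monitor), reco (distributed recovery), and ckpt (checkpoint), along with others not listed here. On Unix these processes are named "ora_ProcessName_SID", where ProcessName is the background process name described above (smon, for example) and SID is the name of the database instance.
For example, the process monitor (pmon) of a database instance named "DEV" appears in the operating system as "ora_pmon_DEV". The second kind of process to understand is the user process. It may be a sqlplus command line, the imp/exp utilities, or a Java program written by a user. User processes are also called client processes, and they are usually named after the client program that started them. For example, a user process started by sqlplus is named "sqlplus".
The third kind is the server process. Server processes work directly with the database instance to carry out requests from user processes. They are often called shadow processes, and that is the term used in this article. A server process is normally dedicated to a single user process, unless MTS (multi-threaded server) is configured. Server processes are named oracleSID, where SID is the name of the database instance. For example, a server process of an instance named "DEV" is named "oracleDEV".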
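The naming rules above can be turned into a small classifier. This is only a sketch: the function name `classify_oracle_process` and the label strings are made up for illustration, and it assumes you pass the bare command name (real `ps` output may append arguments to shadow process names).

```python
import re


def classify_oracle_process(name: str, sid: str) -> str:
    """Classify a process by the naming rules quoted above.

    Hypothetical helper for illustration; `name` is assumed to be the
    bare command name without arguments.
    """
    # Background processes: ora_<ProcessName>_<SID>, e.g. ora_pmon_DEV
    if re.fullmatch(rf"ora_\w+_{re.escape(sid)}", name):
        return "background"
    # Shadow (server) processes: oracle<SID>, e.g. oracleDEV
    if name == f"oracle{sid}":
        return "shadow"
    # Anything else (sqlplus, imp, exp, ...) is treated as a user process
    return "user-or-other"


for proc in ("ora_pmon_DEV", "oracleDEV", "sqlplus"):
    print(proc, "->", classify_oracle_process(proc, "DEV"))
```

Only background and shadow processes map the SGA, so these are the two groups whose reported sizes include shared memory.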
2: Understanding Oracle Memory Usage
UNDERSTANDING ORACLE MEMORY USAGE
---------------------------------
Oracle memory usage can be broken down into 2 basic types, private and shared.
Private memory is used only by a single process. In contrast, shared memory is used by more than 1 process and this is where most confusion over memory usage happens. When determining how much memory Oracle is using, the shared memory segments should only be counted once for all processes sharing a given memory segment. The largest segment of shared memory with Oracle is usually the Shared Global Area (SGA). The SGA is mapped into the virtual address space for all background and shadow processes. Many programs which display memory usage, like "top" or "ps -lf" do not distinguish between shared and private memory and show the SGA usage in each background and shadow process. Subsequently, it may appear as though Oracle is using several times more memory than what is actually installed on the system.
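## Oracle memory usage falls into two broad types: private memory and shared memory. Private memory is used by a single process only. Shared memory, by contrast, can be used by several processes, and this is where most confusion about Oracle memory usage arises. When working out how much memory Oracle is using, a memory segment shared by several processes should be counted only once. For Oracle the largest shared memory segment is usually the SGA. The SGA is mapped into the virtual address space of every background and foreground (server) process; each of them attaches it so that the SGA is always available to it. Many tools, such as top or ps -lf, report memory usage but do not separate the private and shared parts of the total used by each background or server process. If we add up the per-process figures from these tools to get Oracle's total memory usage, the result can be several times, or even dozens of times, the physical memory of the server, because the shared part has been counted once per process.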
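A quick calculation shows how large the overcount can become. The numbers here (50 processes, 10 MB private memory each, a 4096 MB SGA) are invented for illustration, as are the function names.

```python
def correct_total_mb(private_mb: list[int], sga_mb: int) -> int:
    """Private memory of every process, plus the shared SGA counted once."""
    return sum(private_mb) + sga_mb


def naive_total_mb(private_mb: list[int], sga_mb: int) -> int:
    """What you get by summing per-process sizes that each include the SGA."""
    return sum(p + sga_mb for p in private_mb)


# Hypothetical instance: 50 background/shadow processes, 10 MB private each
private = [10] * 50
sga = 4096

print(correct_total_mb(private, sga))  # 4596
print(naive_total_mb(private, sga))    # 205300
```

In this made-up case the naive sum is roughly 45 times the real figure, which is how a tool like top can appear to show more memory in use than the machine has.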
To properly determine how much memory Oracle is using, you must use a tool which separates private and shared memory.
One such tool is "svmon". This program can be located on the IBM AIX CD-ROM as part of the AIX fileset "perfagent.tools". Use the command "smit install_latest" to install this fileset.
For information about using svmon and other tools to determine memory usage, please refer to the AIX "Performance Management Guide" from IBM,
chapter 7 "Monitoring and Tuning Memory Use", and the heading "Determining How Much Memory Is Being Used".
This guide is available online at the following IBM web site... http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/prftungd/2365c71.htm
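To determine Oracle memory usage correctly, we must use a tool that separates the private and shared parts of memory. On AIX, svmon is such a tool. It can be found on the AIX installation media as part of the fileset "perfagent.tools"; install the fileset with smit install_latest. (On other Unix platforms, the pmap tool can serve the same purpose.)
For svmon and other tools used to determine memory usage, see the AIX "Performance Management Guide" from IBM, chapter 7 "Monitoring and Tuning Memory Use", under the heading "Determining How Much Memory Is Being Used".
For svmon usage, the relevant part of that document is quoted below: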
The svmon Command

The svmon command provides a more in-depth analysis of memory usage. It is more informative, but also more intrusive, than the vmstat and ps commands. The svmon command captures a snapshot of the current state of memory. However, it is not a true snapshot because it runs at the user level with interrupts enabled.

To determine whether svmon is installed and available, run the following command:

    # lslpp -lI perfagent.tools

The svmon command can only be executed by the root user. If an interval is used (-i option), statistics will be displayed until the command is killed or until the number of intervals, which can be specified right after the interval, is reached.

You can use four different reports to analyze the displayed information:

  Global (-G)            Displays statistics describing the real memory and paging space in use for the whole system.
  Process (-P)           Displays memory usage statistics for active processes.
  Segment (-S)           Displays memory usage for a specified number of segments or the top ten highest memory-usage processes in descending order.
  Detailed Segment (-D)  Displays detailed information on specified segments.

Additional reports are available in AIX 4.3.3 and later, as follows:

  User (-U)              Displays memory usage statistics for the specified login names. If no list of login names is supplied, memory usage statistics display all defined login names.
  Command (-C)           Displays memory usage statistics for the processes specified by command name.
  Workload Management Class (-W)
                         Displays memory usage statistics for the specified workload management classes. If no classes are supplied, memory usage statistics display all defined classes.

To support 64-bit applications, the output format of the svmon command was modified in AIX 4.3.3 and later.

Additional reports are available in operating system versions later than 4.3.3, as follows:

  Frame (-F)             Displays information about frames. When no frame number is specified, the percentage of used memory is reported. When a frame number is specified, information about that frame is reported.
  Tier (-T)              Displays information about tiers, such as the tier number, the superclass name when the -a flag is used, and the total number of pages in real memory from segments belonging to the tier.

How Much Memory is in Use

To print out global statistics, use the -G flag. In this example, we will repeat it five times at two-second intervals.

    # svmon -G -i 2 5
           m e m o r y              i n  u s e           p i n         p g  s p a c e
     size  inuse  free   pin    work   pers   clnt   work  pers  clnt    size   inuse
    16384  16250   134  2006   10675   2939   2636   2006     0     0   40960   12674
    16384  16254   130  2006   10679   2939   2636   2006     0     0   40960   12676
    16384  16254   130  2006   10679   2939   2636   2006     0     0   40960   12676
    16384  16254   130  2006   10679   2939   2636   2006     0     0   40960   12676
    16384  16254   130  2006   10679   2939   2636   2006     0     0   40960   12676

The columns on the resulting svmon report are described as follows:

  memory    Statistics describing the use of real memory, shown in 4 K pages.
    size    Total size of memory in 4 K pages.
    inuse   Number of pages in RAM that are in use by a process plus the number of persistent pages that belonged to a terminated process and are still resident in RAM. This value is the total size of memory minus the number of pages on the free list.
    free    Number of pages on the free list.
    pin     Number of pages pinned in RAM (a pinned page is a page that is always resident in RAM and cannot be paged out).
  in use    Detailed statistics on the subset of real memory in use, shown in 4 K frames.
    work    Number of working pages in RAM.
    pers    Number of persistent pages in RAM.
    clnt    Number of client pages in RAM (a client page is a remote file page).
  pin       Detailed statistics on the subset of real memory containing pinned pages, shown in 4 K frames.
    work    Number of working pages pinned in RAM.
    pers    Number of persistent pages pinned in RAM.
    clnt    Number of client pages pinned in RAM.
  pg space  Statistics describing the use of paging space, shown in 4 K pages. This data is reported only if the -r flag is not used. The value reported starting with AIX 4.3.2 is the actual number of paging-space pages used (which indicates that these pages were paged out to the paging space). This differs from the vmstat command in that vmstat's avm column shows the virtual memory accessed but not necessarily paged out.
    size    Total size of paging space in 4 K pages.
    inuse   Total number of allocated pages.

In our example, there are 16384 pages of total size of memory. Multiply this number by 4096 to see the total real memory size (64 MB). While 16250 pages are in use, there are 134 pages on the free list and 2006 pages are pinned in RAM. Of the total pages in use, there are 10675 working pages in RAM, 2939 persistent pages in RAM, and 2636 client pages in RAM. The sum of these three parts is equal to the inuse column of the memory part. The pin part divides the pinned memory size into working, persistent and client categories. The sum of them is equal to the pin column of the memory part. There are 40960 pages (160 MB) of total paging space, and 12676 pages are in use. The inuse column of memory is usually greater than the inuse column of pg space because memory for file pages is not freed when a program completes, while paging-space allocation is.
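The arithmetic in the paragraph above can be checked directly. The sketch below hard-codes the first row of the quoted `svmon -G` output; the dictionary keys and the helper name `pages_to_mb` are made up for illustration.

```python
PAGE_BYTES = 4096  # svmon -G reports everything in 4 K pages


def pages_to_mb(pages: int) -> float:
    """Convert a count of 4 K pages to megabytes."""
    return pages * PAGE_BYTES / (1024 * 1024)


# First row of the svmon -G sample quoted above
row = {
    "size": 16384, "inuse": 16250, "free": 134, "pin": 2006,
    "work": 10675, "pers": 2939, "clnt": 2636,
    "pin_work": 2006, "pin_pers": 0, "pin_clnt": 0,
    "pg_size": 40960,
}

print(pages_to_mb(row["size"]))     # 64.0  (real memory, MB)
print(pages_to_mb(row["pg_size"]))  # 160.0 (paging space, MB)

# inuse is total memory minus the free list ...
assert row["size"] - row["free"] == row["inuse"]
# ... and also the sum of working, persistent and client pages
assert row["work"] + row["pers"] + row["clnt"] == row["inuse"]
# the pin breakdown adds up to the pin column
assert row["pin_work"] + row["pin_pers"] + row["pin_clnt"] == row["pin"]
```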
In AIX 4.3.3 and later systems, the output of the same command looks similar to the following:

    # svmon -G -i 2 5
                   size      inuse       free        pin    virtual
    memory        65527      64087       1440       5909      81136
    pg space     131072      55824
                   work       pers       clnt
    pin            5918          0          0
    in use        47554      13838       2695

                   size      inuse       free        pin    virtual
    memory        65527      64091       1436       5909      81137
    pg space     131072      55824
                   work       pers       clnt
    pin            5918          0          0
    in use        47558      13838       2695

                   size      inuse       free        pin    virtual
    memory        65527      64091       1436       5909      81137
    pg space     131072      55824
                   work       pers       clnt
    pin            5918          0          0
    in use        47558      13838       2695

                   size      inuse       free        pin    virtual
    memory        65527      64090       1437       5909      81137
    pg space     131072      55824
                   work       pers       clnt
    pin            5918          0          0
    in use        47558      13837       2695

                   size      inuse       free        pin    virtual
    memory        65527      64168       1359       5912      81206
    pg space     131072      55824
                   work       pers       clnt
    pin            5921          0          0
    in use        47636      13837       2695

The additional output field is the virtual field, which shows the number of pages allocated in the system virtual space.

Who is Using Memory?

The following command displays the memory usage statistics for the top ten processes. If you do not specify a number, it will display all the processes currently running in this system.
    # svmon -Pau 10
      Pid  Command       Inuse   Pin  Pgspace
    15012  maker4X.exe    4783  1174     4781
     2750  X              4353  1178     5544
    15706  dtwm           3257  1174     4003
    17172  dtsession      2986  1174     3827
    21150  dtterm         2941  1174     3697
    17764  aixterm        2862  1174     3644
     2910  dtterm         2813  1174     3705
    19334  dtterm         2813  1174     3704
    13664  dtterm         2804  1174     3706
    17520  aixterm        2801  1174     3619

    Pid: 15012
    Command: maker4X.exe
    Segid  Type  Description          Inuse   Pin  Pgspace  Address Range
     1572  pers  /dev/hd3:62              0     0        0  0..-1
      142  pers  /dev/hd3:51              0     0        0  0..-1
     1bde  pers  /dev/hd3:50              0     0        0  0..-1
      2c1  pers  /dev/hd3:49              1     0        0  0..7
      9ab  pers  /dev/hd2:53289           1     0        0  0..0
      404  work  kernel extension        27    27        0  0..24580
     1d9b  work  lib data                39     0       23  0..607
      909  work  shared library text    864     0        7  0..65535
      5a3  work  sreg[4]                  9     0       12  0..32768
     1096  work  sreg[3]                 32     0       32  0..32783
     1b9d  work  private               1057     1     1219  0..1306 : 65307..65535
     1af8  clnt                         961     0        0  0..1716
        0  work  kernel                1792  1146     3488  0..32767 : 32768..65535
    …

The output is divided into summary and detail sections. The summary section lists the top ten highest memory-usage processes in descending order.

Pid 15012 is the process ID that has the highest memory usage. The Command indicates the command name, in this case maker4X.exe. The Inuse column (total number of pages in real memory from segments that are used by the process) shows 4783 pages (each page is 4 KB). The Pin column (total number of pages pinned from segments that are used by the process) shows 1174 pages. The Pgspace column (total number of paging-space pages that are used by the process) shows 4781 pages.

The detailed section displays information about each segment for each process that is shown in the summary section. This includes the segment ID, the type of the segment, description (a textual description of the segment, including the volume name and i-node of the file for persistent segments), number of pages in RAM, number of pinned pages in RAM, number of pages in paging space, and address range.
The Address Range specifies one range for a persistent or client segment and two ranges for a working segment. The range for a persistent or a client segment takes the form '0..x', where x is the maximum number of virtual pages that have been used. The range field for a working segment can be '0..x : y..65535', where 0..x contains global data and grows upward, and y..65535 contains stack area and grows downward. For the address range, in a working segment, space is allocated starting from both ends and working towards the middle. If the working segment is non-private (kernel or shared library), space is allocated differently. In this example, the segment ID 1b9d is a private working segment; its address range is 0..1306 : 65307..65535. The segment ID 909 is a shared library text working segment; its address range is 0..65535.

A segment can be used by multiple processes. Each page in real memory from such a segment is accounted for in the Inuse field for each process using that segment. Thus, the total for Inuse may exceed the total number of pages in real memory. The same is true for the Pgspace and Pin fields. The sum of Inuse, Pin, and Pgspace of all segments of a process is equal to the numbers in the summary section.
You can use one of the following commands to display the file name associated with the i-node:

  * ncheck -i i-node_number volume_name
  * find file_system_associated_with_lv_name -xdev -inum inode_number -print

To get a similar output in AIX 4.3.3 and later, use the following command:

    # svmon -Put 10
    ------------------------------------------------------------------------------
      Pid  Command       Inuse   Pin   Pgsp  Virtual  64-bit  Mthrd
     2164  X             15535  1461  34577    37869       N      N

    Vsid  Esid  Type  Description            Inuse   Pin   Pgsp  Virtual  Addr Range
    1966     2  work  process private         9984     4  31892    32234  0..32272 : 65309..65535
    4411     d  work  shared library text     3165     0   1264     1315  0..65535
       0     0  work  kernel seg              2044  1455   1370     4170  0..32767 : 65475..65535
    396e     1  pers  code,/dev/hd2:18950      200     0      -        -  0..706
    2ca3     -  work                            32     0      0       32  0..32783
    43d5     -  work                            31     0      6       32  0..32783
    2661     -  work                            29     0      0       29  0..32783
    681f     -  work                            29     0     25       29  0..32783
    356d     f  work  shared library data       18     0     18       24  0..310
    34e8     3  work  shmat/mmap                 2     2      2        4  0..32767
    5c97     -  pers  /dev/hd4:2                 1     0      -        -  0..0
    5575     -  pers  /dev/hd2:19315             0     0      -        -  0..0
    4972     -  pers  /dev/hd2:19316             0     0      -        -  0..5
    4170     -  pers  /dev/hd3:28                0     0      -        -  0..0
    755d     -  pers  /dev/hd9var:94             0     0      -        -  0..0
    6158     -  pers  /dev/hd9var:90             0     0      -        -  0..0
    ------------------------------------------------------------------------------
      Pid  Command       Inuse   Pin   Pgsp  Virtual  64-bit  Mthrd
    25336  austin.ibm.   12466  1456   2797    11638       N      N

    Vsid  Esid  Type  Description            Inuse   Pin   Pgsp  Virtual  Addr Range
    14c3     2  work  process private         5644     1    161     5993  0..6550 : 65293..65535
    4411     d  work  shared library text     3165     0   1264     1315  0..65535
       0     0  work  kernel seg              2044  1455   1370     4170  0..32767 : 65475..65535
    13c5     1  clnt  code                     735     0      -        -  0..4424
     d21     -  pers  /dev/andy:563            603     0      -        -  0..618
     9e6     f  work  shared library data      190     0      2      128  0..3303
     942     -  pers  /dev/cache:16             43     0      -        -  0..42
    2ca3     -  work                            32     0      0       32  0..32783
    49f0     -  clnt                            10     0      -        -  0..471
    1b07     -  pers  /dev/andy:8568             0     0      -        -  0..0
     623     -  pers  /dev/hd2:22539             0     0      -        -  0..1
    2de9     -  clnt                             0     0      -        -  0..0
    1541     5  mmap  mapped to sid 761b         0     0      -        -
    5d15     -  pers  /dev/andy:487              0     0      -        -  0..3
    4513     -  pers  /dev/andy:486              0     0      -        -  0..45
     cc4     4  mmap  mapped to sid 803          0     0      -        -
    242a     -  pers  /dev/andy:485              0     0      -        -  0..0
    …

The Vsid column is the virtual segment ID, and the Esid column is the effective segment ID. The effective segment ID reflects the segment register that is used to access the corresponding pages.

Detailed Information on a Specific Segment ID

The -D option displays detailed memory-usage statistics for segments.

    # svmon -D 404
    Segid: 404
    Type: working
    Description: kernel extension
    Address Range: 0..24580
    Size of page space allocation: 0 pages ( 0.0 Mb)
    Inuse: 28 frames ( 0.1 Mb)
     Page  Frame  Pin  Ref  Mod
    12294   3320  pin  ref  mod
    24580   1052  pin  ref  mod
    12293  52774  pin  ref  mod
    24579  20109  pin  ref  mod
    12292  19494  pin  ref  mod
    12291  52108  pin  ref  mod
    24578  50685  pin  ref  mod
    12290  51024  pin  ref  mod
    24577   1598  pin  ref  mod
    12289  35007  pin  ref  mod
    24576    204  pin  ref  mod
    12288    206  pin  ref  mod
     4112  53007  pin       mod
     4111  53006  pin       mod
     4110  53005  pin       mod
     4109  53004  pin       mod
     4108  53003  pin       mod
     4107  53002  pin       mod
     4106  53001  pin       mod
     4105  53000  pin       mod
     4104  52999  pin       mod
     4103  52998  pin       mod
     4102  52997  pin       mod
     4101  52996  pin       mod
     4100  52995  pin       mod
     4099  52994  pin       mod
     4098  52993  pin       mod
     4097  52992  pin  ref  mod

The detail columns are explained as follows:

  Page   Specifies the index of the page within the segment.
  Frame  Specifies the index of the real memory frame that the page resides in.
  Pin    Specifies a flag indicating whether the page is pinned.
  Ref    Specifies a flag indicating whether the page's reference bit is on.
  Mod    Specifies a flag indicating whether the page is modified.

The size of page space allocation is 0 because all the pages are pinned in real memory.

An example output from AIX 4.3.3 and later is very similar to the following:

    # svmon -D 629 -b
    Segid: 629
    Type: working
    Address Range: 0..77
    Size of page space allocation: 7 pages ( 0.0 Mb)
    Virtual: 11 frames ( 0.0 Mb)
    Inuse: 7 frames ( 0.0 Mb)
    Page  Frame  Pin  Ref  Mod
       0  32304    N    Y    Y
       3  32167    N    Y    Y
       7  32321    N    Y    Y
       8  32320    N    Y    Y
       5  32941    N    Y    Y
       1  48357    N    N    Y
      77  47897    N    N    Y

The -b flag shows the status of the reference and modified bits of all the displayed frames. After it is shown, the reference bit of the frame is reset. When used with the -i flag, it detects which frames are accessed between each interval.

Note: Use this flag with caution because of its performance impacts.

List of Top Memory Usage of Segments

The -S option is used to sort segments by memory usage and to display the memory-usage statistics for the top memory-usage segments. If count is not specified, then a count of 10 is implicit. The following command sorts system and non-system segments by the number of pages in real memory and prints out the top 10 segments of the resulting list.
    # svmon -Sau
    Segid  Type  Description          Inuse   Pin  Pgspace  Address Range
        0  work  kernel                1990  1408     3722  0..32767 : 32768..65535
        1  work  private, pid=4042     1553     1     1497  0..1907 : 65307..65535
     1435  work  private, pid=3006     1391     3     1800  0..4565 : 65309..65535
     11f5  work  private, pid=14248    1049     1     1081  0..1104 : 65307..65535
     11f3  clnt                         991     0        0  0..1716
      681  clnt                         960     0        0  0..1880
      909  work  shared library text    900     0        8  0..65535
      101  work  vmm data               497   496        1  0..27115 : 43464..65535
      a0a  work  shared library data    247     0      718  0..65535
     1bf9  work  private, pid=21094     221     1      320  0..290 : 65277..65535

All output fields are described in the previous examples.

An example output from AIX 4.3.3 and later is similar to the following:

    # svmon -Sut 10
    Vsid  Esid  Type  Description          Inuse   Pin   Pgsp  Virtual  Addr Range
    1966     -  work                        9985     4  31892    32234  0..32272 : 65309..65535
    14c3     -  work                        5644     1    161     5993  0..6550 : 65293..65535
    5453     -  work                        3437     1   2971     4187  0..4141 : 65303..65535
    4411     -  work                        3165     0   1264     1315  0..65535
    5a1e     -  work                        2986     1     13     2994  0..3036 : 65295..65535
    340d     -  work  misc kernel tables    2643     0    993     2645  0..15038 : 63488..65535
    380e     -  work  kernel pinned heap    2183  1055   1416     2936  0..65535
       0     -  work  kernel seg            2044  1455   1370     4170  0..32767 : 65475..65535
    6afb     -  pers  /dev/notes:92         1522     0      -        -  0..10295
    2faa     -  clnt                        1189     0      -        -  0..2324

Correlating svmon and vmstat Outputs

There are some relationships between the svmon and vmstat outputs. The svmon report of AIX 4.3.2 follows (the example is the same with AIX 4.3.3 and later, although the output format is different):

    # svmon -G
           m e m o r y              i n  u s e           p i n         p g  s p a c e
     size  inuse  free   pin    work   pers   clnt   work  pers  clnt    size   inuse
    16384  16254   130  2016   11198   2537   2519   2016     0     0   40960   13392

The vmstat command was run in a separate window while the svmon command was running.
The vmstat report follows:

    # vmstat 5
    kthr     memory             page              faults        cpu
    ----- ----------- ------------------------ ------------ -----------
     r  b   avm   fre  re  pi  po  fr   sr  cy  in    sy  cs us sy id wa
     0  0 13392   130   0   0   0   0    2   0 125   140  36  2  1 97  0
     0  0 13336   199   0   0   0   0    0   0 145 14028  38 11 22 67  0
     0  0 13336   199   0   0   0   0    0   0 141    49  31  1  1 98  0
     0  0 13336   199   0   0   0   0    0   0 142    49  32  1  1 98  0
     0  0 13336   199   0   0   0   0    0   0 145    49  32  1  1 99  0
     0  0 13336   199   0   0   0   0    0   0 163    49  33  1  1 92  6
     0  0 13336   199   0   0   0   0    0   0 142    49  32  0  1 98  0

The global svmon report shows related numbers. The vmstat fre column relates to the svmon memory free column. The number that vmstat reports as Active Virtual Memory (avm) is reported by the svmon command as pg space inuse (13392).

The vmstat avm column provides the same figures as the pg space inuse column of the svmon command except starting with AIX 4.3.2 where Deferred Page Space Allocation is used. In that case, the svmon command shows the number of pages actually paged out to paging space whereas the vmstat command shows the number of virtual pages accessed but not necessarily paged out (see Looking at Paging Space and Virtual Memory).

Correlating svmon and ps Outputs

There are some relationships between the svmon and ps outputs. The svmon report of AIX 4.3.2 follows (the example is the same with AIX 4.3.3 and later, although the output format is different):

    # svmon -P 7226
     Pid  Command  Inuse  Pin  Pgspace
    7226  telnetd    936    1       69

    Pid: 7226
    Command: telnetd
    Segid  Type  Description           Inuse  Pin  Pgspace  Address Range
      828  pers  /dev/hd2:15333            0    0        0  0..0
     1d3e  work  lib data                  0    0       28  0..559
      909  work  shared library text     930    0        8  0..65535
     1cbb  work  sreg[3]                   0    0        1  0..0
     1694  work  private                   6    1       32  0..24 : 65310..65535
     12f6  pers  code,/dev/hd2:69914       0    0        0  0..11

Compare with the ps report, which follows:

    # ps v 7226
     PID  TTY  STAT  TIME  PGIN  SIZE  RSS    LIM  TSIZ  TRS  %CPU  %MEM  COMMAND
    7226    -     A  0:00    51   240   24  32768    33    0   0.0   0.0  telnetd

SIZE refers to the virtual size in KB of the data section of the process (in paging space). This number is equal to the number of working segment pages of the process that have been touched (that is, the number of paging-space pages that have been allocated) times 4. It must be multiplied by 4 because pages are in 4 K units and SIZE is in 1 K units. If some working segment pages are currently paged out, this number is larger than the amount of real memory being used. The SIZE value (240) correlates with the Pgspace number from the svmon command for private (32) plus lib data (28) in 1 K units.

RSS refers to the real memory (resident set) size in KB of the process. This number is equal to the sum of the number of working segment and code segment pages in memory times 4. Remember that code segment pages are shared among all of the currently running instances of the program. If 26 ksh processes are running, only one copy of any given page of the ksh executable program would be in memory, but the ps command would report that code segment size as part of the RSS of each instance of the ksh program. The RSS value (24) correlates with the Inuse numbers from the svmon command for private (6) working-storage segments, for code (0) segments, and for lib data (0) of the process in 1 K units.

TRS refers to the size of the resident set (real memory) of text. This is the number of code segment pages times four. As was noted earlier, this number exaggerates memory use for programs of which multiple instances are running. This does not include the shared text of the process. The TRS value (0) correlates with the number of the svmon pages in the code segment (0) of the Inuse column in 1 K units. The TRS value can be higher than the TSIZ value because other pages, such as the XCOFF header and the loader section, may be included in the code segment.
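The SIZE, RSS and TRS correlations described above can be reproduced from the telnetd segment rows. The sketch below copies the relevant Inuse and Pgspace figures from the quoted `svmon -P 7226` output; the dictionary layout and variable names are invented for illustration.

```python
KB_PER_PAGE = 4  # svmon counts 4 K pages, ps v reports 1 K units

# (Inuse, Pgspace) per segment, from the svmon -P 7226 sample quoted above
telnetd = {
    "lib data": {"inuse": 0, "pgspace": 28},   # work segment 1d3e
    "private":  {"inuse": 6, "pgspace": 32},   # work segment 1694
    "code":     {"inuse": 0, "pgspace": 0},    # pers segment 12f6
}

size_kb = KB_PER_PAGE * (telnetd["lib data"]["pgspace"] + telnetd["private"]["pgspace"])
rss_kb = KB_PER_PAGE * (telnetd["lib data"]["inuse"] + telnetd["private"]["inuse"]
                        + telnetd["code"]["inuse"])
trs_kb = KB_PER_PAGE * telnetd["code"]["inuse"]

print(size_kb, rss_kb, trs_kb)  # 240 24 0, matching SIZE, RSS and TRS from ps v
```

Note that the shared library text segment (909, 930 pages in use) appears in neither SIZE nor RSS, which is why ps alone understates what a process touches and overstates nothing about shared segments.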
The following calculations can be made for the values mentioned:

    SIZE = 4 * Pgspace of (work lib data + work private)
    RSS  = 4 * Inuse of (work lib data + work private + pers code)
    TRS  = 4 * Inuse of (pers code)

Calculating the Minimum Memory Requirement of a Program

To calculate the minimum memory requirement of a program, the formula would be:

    Total memory pages (4 KB units) = T + ( N * ( PD + LD ) ) + F

where:

    T  = Number of pages for text (shared by all users)
    N  = Number of copies of this program running simultaneously
    PD = Number of working segment pages in process private segment
    LD = Number of shared library data pages used by the process
    F  = Number of file pages (shared by all users)

Multiply the result by 4 to obtain the number of kilobytes required. You may want to add in the kernel, kernel extension, and shared library text segment values to this as well even though they are shared by all processes on the system. For example, some applications like CATIA and databases use very large shared library modules. Note that because we have only used statistics from a single snapshot of the process, there is no guarantee that the value we get from the formula will be the correct value for the minimum working set size of a process. To get working set size, one would need to run a tool such as the rmss command or take many snapshots during the life of the process and determine the average values from these snapshots (see Assessing Memory Requirements Through the rmss Command).

If we estimate the minimum memory requirement for the program pacman, shown in Finding Memory-Leaking Programs, the formula would be:

    T  = 2    (Inuse of code,/dev/lv01:12302 of pers)
    PD = 1632 (Inuse of private of work)
    LD = 12   (Inuse of lib data of work)
    F  = 1    (Inuse of /dev/hd2:53289 of pers)

That is: 2 + (N * (1632 + 12)) + 1, equal to 1644 * N + 3 in 4 KB units.
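The minimum-memory formula is easy to evaluate for different values of N. This is a minimal sketch using the pacman figures quoted above; the function name `min_memory_pages` is made up for illustration.

```python
def min_memory_pages(t: int, n: int, pd: int, ld: int, f: int) -> int:
    """Total memory pages (4 KB units) = T + N * (PD + LD) + F."""
    return t + n * (pd + ld) + f


# pacman figures from the text: T=2, PD=1632, LD=12, F=1
for copies in (1, 5):
    pages = min_memory_pages(t=2, n=copies, pd=1632, ld=12, f=1)
    print(copies, "copies:", pages, "pages =", pages * 4, "KB")
# 1 copies: 1647 pages = 6588 KB
# 5 copies: 8223 pages = 32892 KB
```

Only the private and library-data pages scale with N; text and file pages are counted once, which is the same "count shared memory once" principle applied to a single program.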
3: Calculating Memory Used by Oracle Processes
This article will not discuss the use of svmon, except to address a common misunderstanding with its output. The svmon command will associate memory used for buffering persistent file pages (also known as the Unix filesystem buffer cache) with the process that requested the file page. However, the physical memory used to buffer the persistent file pages is not allocated or controlled by Oracle. This type of memory is allocated and controlled exclusively by the AIX operating system. The memory used for this purpose should not be considered when determining how much memory Oracle is using since it is actually AIX that is allocating and controlling the persistent file pages buffer. That does not mean that the memory for buffering persistent file pages should be ignored. It is possible that this type of memory could account for the majority of physical memory used on the system and could lead to unnecessary paging.
This kind of memory can be identified in the output of commands like "svmon -Pau 10" by "pers" in the "Type" field and a disk device in the "Description" field. The AIX vmtune command can be used to modify the amount of physical memory used to buffer persistent file pages, in particular through its minperm, maxperm, and strict_maxperm parameters.
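The identification rule above ("pers" in Type, a disk device in Description) can be applied mechanically once the segment rows are available. The sketch below uses the maker4X.exe segments from the `svmon -Pau 10` sample quoted earlier in this post, already split into fields by hand; the `Segment` type and `is_file_cache` helper are hypothetical, and treating any description containing "/dev/" as a disk device is an assumption.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    seg_type: str
    description: str
    inuse: int  # pages in real memory


# Segments of pid 15012 (maker4X.exe) from the svmon -Pau 10 sample
SEGMENTS = [
    Segment("pers", "/dev/hd3:62", 0),
    Segment("pers", "/dev/hd3:51", 0),
    Segment("pers", "/dev/hd3:50", 0),
    Segment("pers", "/dev/hd3:49", 1),
    Segment("pers", "/dev/hd2:53289", 1),
    Segment("work", "kernel extension", 27),
    Segment("work", "lib data", 39),
    Segment("work", "shared library text", 864),
    Segment("work", "sreg[4]", 9),
    Segment("work", "sreg[3]", 32),
    Segment("work", "private", 1057),
    Segment("clnt", "", 961),
    Segment("work", "kernel", 1792),
]


def is_file_cache(seg: Segment) -> bool:
    """Persistent file pages: Type 'pers' and a disk device in Description."""
    return seg.seg_type == "pers" and "/dev/" in seg.description


total = sum(s.inuse for s in SEGMENTS)
file_cache = sum(s.inuse for s in SEGMENTS if is_file_cache(s))

print(total)               # 4783, the Inuse value in the summary line
print(file_cache)          # 2 pages of AIX-controlled file cache
print(total - file_cache)  # 4781 pages left after excluding it
```

The sample process is not an Oracle process and has almost no file cache attributed to it, but the same filter applies to an Oracle background or shadow process. The remaining figure still contains shared segments, so it is not yet a private-memory total.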
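Note 123754.1 does not go into the use of svmon. svmon attributes the Unix filesystem cache to the processes that requested those file pages. However, the memory used as filesystem cache (from the description above, this means "pers" segments, that is, persistent segments holding file pages) is not allocated or controlled by Oracle. It is allocated and controlled exclusively by the AIX operating system. When calculating how much memory Oracle uses, this memory should not be included. That does not mean it can be ignored, because it may account for a large share of physical memory and so cause unnecessary paging.
This kind of memory can be identified as follows:
Before AIX 4.3.3 we can use "svmon -Pau 10"; on AIX 4.3.3 and later we can use "svmon -Put 10". The command prints output similar to the following (AIX 6.1):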
svmon -Put 10
-------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
5374346 oracle 8106780 17312 0 8083145 Y N N
PageSize Inuse Pin Pgsp Virtual
s 4 KB 56188 0 0 32553
m 64 KB 503162 1082 0 503162
Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual
c128c 70000032 work default shmat/mmap m 4096 0 0 4096
b11331 70000021 work default shmat/mmap m 4096 0 0 4096
3812b8 7000002e work default shmat/mmap m 4096 0 0 4096
b01230 70000013 work default shmat/mmap m 4096 0 0 4096
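The top part of the output is a summary of the memory used by the process (5374346), followed by the details. Wherever the Type column is "pers" and the Description column is a disk device, that memory is the "persistent file pages" memory discussed above.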
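The summary line and the per-page-size lines in the AIX 6.1 output above can be reconciled with a little arithmetic. The sketch below assumes the summary line counts everything in 4 KB units, so each 64 KB page counts as 16; that assumption is inferred from the numbers themselves, not stated in the source. The helper name `to_4k_units` is made up.

```python
KB_PER_BASE_PAGE = 4
BASE_PAGES_PER_64K = 64 // KB_PER_BASE_PAGE  # 16


def to_4k_units(count_4k: int, count_64k: int) -> int:
    """Express a mix of 4 KB and 64 KB pages as a count of 4 KB pages."""
    return count_4k + count_64k * BASE_PAGES_PER_64K


# PageSize rows for pid 5374346 (s = 4 KB, m = 64 KB)
inuse = to_4k_units(56188, 503162)
pin = to_4k_units(0, 1082)
virtual = to_4k_units(32553, 503162)

print(inuse, pin, virtual)  # 8106780 17312 8083145, matching the summary line

inuse_gb = inuse * KB_PER_BASE_PAGE / (1024 * 1024)
print(round(inuse_gb, 1))   # about 30.9 GB
```

Most of that figure comes from the 64 KB "shmat/mmap" segments, which are shared memory such as the SGA, so it must not be added up again for every other Oracle process that maps the same segments.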
For information about using vmtune, please refer to the AIX "Performance Management Guide" from IBM,
chapter 7 "Monitoring and Tuning Memory Use", and the heading "Tuning VMM Page Replacement with the vmtune Command".
This guide is available online at the following IBM web site... http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/prftungd/2365c75.htm
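The well-known AIX tuning tool, the virtual memory optimizer (formerly vmtune, now vmo), can adjust the filesystem memory thresholds such as maxperm, minperm, and strict_maxperm (not covered in detail here). If you are interested, see the document quoted below: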
Tuning VMM Page Replacement with the vmtune Command

The memory management algorithm, discussed in Real-Memory Management, tries to keep the size of the free list and the percentage of real memory occupied by persistent segment pages within specified bounds. These bounds can be altered with the vmtune command, which can only be run by the root user. Changes made by this tool remain in effect until the next reboot of the system. To determine whether the vmtune command is installed and available, run the following command:

    # lslpp -lI bos.adt.samples

Note: The vmtune command is in the samples directory because it is very VMM-implementation dependent. The vmtune code that accompanies each release of the operating system is tailored specifically to the VMM in that release. Running the vmtune command from one release on a different release might result in an operating-system failure. It is also possible that the functions of vmtune may change from release to release. Do not propagate shell scripts or /etc/inittab entries that include the vmtune command to a new release without checking the vmtune documentation for the new release to make sure that the scripts will still have the desired effect.

Executing the vmtune command on AIX 4.3.3 with no options results in the following output:

    # /usr/samples/kernel/vmtune
    vmtune: current values:
      -p       -P        -r          -R         -f       -F        -N         -W
    minperm  maxperm  minpgahead  maxpgahead  minfree  maxfree  pd_npages  maxrandwrt
      52190   208760           2           8      120      128     524288           0

      -M       -w       -k        -c         -b          -B           -u          -l      -d
    maxpin  npswarn  npskill  numclust  numfsbufs  hd_pbuf_cnt  lvm_bufcnt  lrubucket  defps
    209581     4096     1024         1         93           96           9     131072      1

            -s              -n         -S            -h
    sync_release_ilock  nokillroot  v_pinshm  strict_maxperm
                     0           0         0               0

    number of valid memory pages = 261976    maxperm=79.7% of real memory
    maximum pinable=80.0% of real memory     minperm=19.9% of real memory
    number of file memory pages = 19772      numperm=7.5% of real memory

The output shows the current settings for all the parameters.
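The percentages at the bottom of the vmtune output are simply the page counts divided by the number of valid memory pages. A small check, with the values copied from the sample above (the helper name `pct_of_real_memory` is invented):

```python
VALID_MEMORY_PAGES = 261976  # "number of valid memory pages" in the sample


def pct_of_real_memory(pages: int) -> float:
    """Pages as a percentage of valid memory pages, to one decimal place."""
    return round(100 * pages / VALID_MEMORY_PAGES, 1)


print(pct_of_real_memory(208760))  # maxperm  -> 79.7
print(pct_of_real_memory(52190))   # minperm  -> 19.9
print(pct_of_real_memory(19772))   # numperm  -> 7.5 (file memory pages)
print(pct_of_real_memory(209581))  # maxpin   -> 80.0
```

In this sample numperm (7.5%) is well below minperm (19.9%), so file pages occupy only a small share of memory on that system.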
Choosing minfree and maxfree Settings The purpose of the free list is to keep track of real-memory page frames released by terminating processes and to supply page frames to requestors immediately, without forcing them to wait for page steals and the accompanying I/O to complete. The minfree limit specifies the free-list size below which page stealing to replenish the free list is to be started. The maxfree parameter is the size above which stealing will end. The objectives in tuning these limits are to ensure that: * Any activity that has critical response-time objectives can always get the page frames it needs from the free list. * The system does not experience unnecessarily high levels of I/O because of premature stealing of pages to expand the free list. The default value of minfree and maxfree depend on the memory size of the machine. The default value of maxfree is determined by this formula: maxfree = minimum (# of memory pages/128, 128) By default the minfree value is the value of maxfree – 8. However, the difference between minfree and maxfree should always be equal to or greater than maxpgahead. Or in other words, the value of maxfree should always be greater than or equal to minfree plus the size of maxpgahead. The minfree/maxfree values will be different if there is more than one memory pool. Memory pools were introduced in AIX 4.3.3 for MP systems with large amounts of RAM. Each memory pool will have its own minfree/maxfree which are determined by the previous formulas, but the minfree/maxfree values shown by the vmtune command will be the sum of the minfree/maxfree for all memory pools. Remember, that minfree pages in some sense are wasted, because they are available, but not in use. If you have a short list of the programs you want to run fast, you can investigate their memory requirements with the svmon command (see Determining How Much Memory Is Being Used), and set minfree to the size of the largest. 
This technique risks being too conservative because not all of the pages that a process uses are acquired in one burst. At the same time, you might be missing dynamic demands that come from programs not on your list that may lower the average size of the free list when your critical programs run.
A less precise but more comprehensive tool for investigating an appropriate size for minfree is the vmstat command. The following is a portion of the vmstat command output obtained while running a C compilation on an otherwise idle system.
# vmstat 1
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0  3085   118   0   0   0   0    0   0 115    2  19  0  0 99  0
 0  0  3086   117   0   0   0   0    0   0 119  134  24  1  3 96  0
 2  0  3141    55   2   0   6  24   98   0 175  223  60  3  9 54 34
 0  1  3254    57   0   0   6 176  814   0 205  219 110 22 14  0 64
 0  1  3342    59   0   0  42 104  249   0 163  314  57 43 16  0 42
 1  0  3411    78   0   0  49 104  169   0 176  306  51 30 15  0 55
 1  0  3528   160   1   0  10 216  487   0 143  387  54 50 22  0 27
 1  0  3627    94   0   0   0  72  160   0 148  292  79 57  9  0 34
 1  0  3444   327   0   0   0  64  102   0 132  150  41 82  8  0 11
 1  0  3505   251   0   0   0   0    0   0 128  189  50 79 11  0 11
 1  0  3550   206   0   0   0   0    0   0 124  150  22 94  6  0  0
 1  0  3576   180   0   0   0   0    0   0 121  145  30 96  4  0  0
 0  1  3654   100   0   0   0   0    0   0 124  145  28 91  8  0  1
 1  0  3586   208   0   0   0  40   68   0 123  139  24 91  9  0  0
Because the compiler has not been run recently, the code of the compiler itself must be read in. All told, the compiler acquires about 2 MB in about 6 seconds. On this 32 MB system, maxfree is 64 and minfree is 56. The compiler almost instantly drives the free list size below minfree, and several seconds of rapid page-stealing activity take place. Some of the steals require that dirty working segment pages be written to paging space, which shows up in the po column.
If the steals cause the writing of dirty permanent segment pages, that I/O does not appear in the vmstat report (unless you have directed the vmstat command to report on the I/O activity of the physical volumes to which the permanent pages are being written).
This example describes a fork() and exec() environment (not an environment where a process is long lived, such as in a database) and is not intended to suggest that you set minfree to 500 to accommodate large compiles. It suggests how to use the vmstat command to identify situations in which the free list has to be replenished while a program is waiting for space. In this case, about 2 seconds were added to the compiler execution time because there were not enough page frames immediately available. If you observe the page frame consumption of your program, either during initialization or during normal processing, you will soon have an idea of the number of page frames that need to be in the free list to keep the program from waiting for memory.
If we concluded from the example above that minfree needed to be 128, and we had set maxpgahead to 16 to improve sequential performance, we would use the following vmtune command:
# /usr/samples/kernel/vmtune -f 128 -F 144

Tuning Memory Pools

In operating system versions later than AIX 4.3.3, the vmtune -m number_of_memory_pools command allows you to change the number of memory pools that are configured at system boot time. The -m flag is therefore not a dynamic change. The change is written to the kernel file if it is an MP kernel (the change is not allowed on a UP kernel). A value of 0 restores the default number of memory pools.
By default, the vmtune -m command writes to the file /usr/lib/boot/unix_mp, but this can be changed with the command vmtune -U path_to_unix_file. Before changing the kernel file, the vmtune command saves the original file as name_of_original_file.sav.
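The vmstat reading and the -f/-F arithmetic described above can be sketched in shell. This is only an illustration: the function names sample_vmstat, steal_summary and min_maxfree are made up for this example, and the field positions assume the 17-column vmstat layout shown in the article (fre in field 4, fr in field 8), which may differ on other releases.

```shell
#!/bin/sh
# Sketch: summarize a vmstat trace against a given minfree value.
# Assumption: 17-column AIX vmstat layout as shown in the article.

# The 14 sample intervals from the article, used as demo input.
sample_vmstat() {
cat <<'EOF'
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 3085 118 0 0 0 0 0 0 115 2 19 0 0 99 0
0 0 3086 117 0 0 0 0 0 0 119 134 24 1 3 96 0
2 0 3141 55 2 0 6 24 98 0 175 223 60 3 9 54 34
0 1 3254 57 0 0 6 176 814 0 205 219 110 22 14 0 64
0 1 3342 59 0 0 42 104 249 0 163 314 57 43 16 0 42
1 0 3411 78 0 0 49 104 169 0 176 306 51 30 15 0 55
1 0 3528 160 1 0 10 216 487 0 143 387 54 50 22 0 27
1 0 3627 94 0 0 0 72 160 0 148 292 79 57 9 0 34
1 0 3444 327 0 0 0 64 102 0 132 150 41 82 8 0 11
1 0 3505 251 0 0 0 0 0 0 128 189 50 79 11 0 11
1 0 3550 206 0 0 0 0 0 0 124 150 22 94 6 0 0
1 0 3576 180 0 0 0 0 0 0 121 145 30 96 4 0 0
0 1 3654 100 0 0 0 0 0 0 124 145 28 91 8 0 1
1 0 3586 208 0 0 0 40 68 0 123 139 24 91 9 0 0
EOF
}

# Reads vmstat output on stdin; $1 is minfree in pages.
# Counts intervals where the free list is below minfree (fre < minfree)
# and intervals where page replacement freed pages (fr > 0).
steal_summary() {
    awk -v minfree="$1" '
        # Data rows: exactly 17 fields, first field numeric (skips headers)
        NF == 17 && $1 ~ /^[0-9]+$/ {
            n++
            if ($4 < minfree) below++
            if ($8 > 0) stealing++
        }
        END {
            printf "intervals=%d below_minfree=%d stealing=%d\n", n, below, stealing
        }
    '
}

# Smallest maxfree that satisfies maxfree >= minfree + maxpgahead.
# $1 = minfree, $2 = maxpgahead
min_maxfree() {
    echo $(( $1 + $2 ))
}
```

For example, `sample_vmstat | steal_summary 56` summarizes the trace above against minfree=56, and `min_maxfree 128 16` gives the 144 used with -F in the vmtune command. On a live system the input would come from something like `vmstat 1 30` instead of sample_vmstat.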

Tuning lrubucket to Reduce Memory Scanning Overhead

Tuning lrubucket can reduce scanning overhead on large memory systems. In AIX 4.3, a new parameter lrubucket was added. The page-replacement algorithm scans memory frames looking for a free frame. During this scan, reference bits of pages are reset, and if a free frame has not been found, a second scan is done. In the second scan, if the reference bit is still off, the frame will be used for a new page (page replacement).
On large memory systems, there may be too many frames to scan, so memory is now divided up into buckets of frames. The page-replacement algorithm scans the frames in a bucket and then starts over on that bucket for the second scan before moving on to the next bucket. The default number of frames in a bucket is 131072, or 512 MB of RAM. The number of frames is tunable with the command vmtune -l, and the value is in 4 KB frames.

Choosing minperm and maxperm Settings

The operating system takes advantage of the varying requirements for real memory by leaving in memory pages of files that have been read or written. If the file pages are requested again before their page frames are reassigned, this technique saves an I/O operation. These file pages may be from local or remote (for example, NFS) file systems.
The ratio of page frames used for files versus those used for computational (working or program text) segments is loosely controlled by the minperm and maxperm values:
* If the percentage of RAM occupied by file pages rises above maxperm, page-replacement steals only file pages.
* If the percentage of RAM occupied by file pages falls below minperm, page-replacement steals both file and computational pages.
* If the percentage of RAM occupied by file pages is between minperm and maxperm, page-replacement steals only file pages unless the number of file repages is higher than the number of computational repages.
In a particular workload, it might be worthwhile to emphasize the avoidance of file I/O.
In another workload, keeping computational segment pages in memory might be more important. To understand what the ratio is in the untuned state, we use the vmtune command with no arguments.
# /usr/samples/kernel/vmtune
vmtune:  current values:
  -p       -P        -r          -R         -f       -F       -N        -W
minperm  maxperm  minpgahead maxpgahead  minfree  maxfree  pd_npages maxrandwrt
  52190   208760      2          8        120      128     524288        0

  -M      -w      -k      -c        -b         -B           -u        -l    -d
maxpin npswarn npskill numclust numfsbufs hd_pbuf_cnt lvm_bufcnt lrubucket defps
209581    4096    1024       1       93          96          9    131072     1

        -s              -n         -S           -h
sync_release_ilock  nokillroot  v_pinshm  strict_maxperm
        0               0           0            0

number of valid memory pages = 261976   maxperm=79.7% of real memory
maximum pinable=80.0% of real memory    minperm=19.9% of real memory
number of file memory pages = 19772     numperm=7.5% of real memory
The default values are calculated by the following algorithm:
minperm (in pages) = ((number of memory frames) - 1024) * .2
maxperm (in pages) = ((number of memory frames) - 1024) * .8
The numperm value gives the number of file pages in memory, 19772. This is 7.5 percent of real memory.
If we know that our workload makes little use of recently read or written files, we may want to constrain the amount of memory used for that purpose. The following command:
# /usr/samples/kernel/vmtune -p 15 -P 50
sets minperm to 15 percent and maxperm to 50 percent of real memory. This would ensure that the VMM would steal page frames only from file pages when the ratio of file pages to total memory pages exceeded 50 percent. This should reduce the paging to page space with no detrimental effect on the persistent storage. The maxperm value is not a strict limit; it is only considered when the VMM needs to perform page replacement. Because of this, it is usually safe to reduce the maxperm value on most systems.
On the other hand, if our application frequently references a small set of existing files (especially if those files are in an NFS-mounted file system), we might want to allow more space for local caching of the file pages by using the following command:
# /usr/samples/kernel/vmtune -p 30 -P 90
NFS servers that are used mostly for reads and have large amounts of RAM can benefit from increasing the value of maxperm. This allows more pages to reside in RAM so that NFS clients can access them without forcing the NFS server to retrieve the pages from disk again.
Another example would be a program that reads 1.5 GB of sequential file data into the working storage of a system with 2 GB of real memory. You may want to set maxperm to 50 percent or less, because you do not need to keep the file data in memory.

Placing a Hard Limit on Persistent File Cache with strict_maxperm

Starting with AIX 4.3.3, a new vmtune option (-h) called strict_maxperm has been added. This option, when set to 1, places a hard limit on how much memory is used for a persistent file cache by making the maxperm value the upper limit for this file cache. When the upper limit is reached, least recently used (LRU) page replacement is performed on persistent pages.
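The three minperm/maxperm rules quoted above can be written down as a small decision function. This is a sketch of the documented rules only, not of the actual VMM code: the function name steal_policy is invented for illustration, the behavior when numperm is exactly equal to minperm or maxperm is an assumption (treated as "between"), the case where file repages exceed computational repages is assumed to steal both kinds of pages, and strict_maxperm is not modeled.

```shell
#!/bin/sh
# Sketch of the page-steal rule controlled by minperm and maxperm.
# $1 = numperm (% of RAM holding file pages)
# $2 = minperm (%)   $3 = maxperm (%)
# $4 = file repage count   $5 = computational repage count
# Prints "file" or "file+computational".
steal_policy() {
    awk -v np="$1" -v lo="$2" -v hi="$3" -v frp="$4" -v crp="$5" 'BEGIN {
        if (np > hi)
            print "file"                 # above maxperm: only file pages
        else if (np < lo)
            print "file+computational"   # below minperm: both kinds
        else if (frp > crp)
            print "file+computational"   # assumption for the "unless" case
        else
            print "file"                 # between the limits: only file pages
    }'
}
```

With the values from the vmtune output above, `steal_policy 7.5 19.9 79.7 0 0` falls in the "below minperm" branch, which is the situation in which computational pages can also be stolen.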
There is another tool that can be used to measure memory usage, and it is available by default on every AIX system: the "ps v" command, followed by the process ID of the process whose memory usage you want to check.
Please note that there is no dash ( - ) before the "v".
Here is a comparison of the output from the "ps -lf" and "ps v" commands:
# ps -lfp 13288
       F S    UID   PID PPID C PRI NI  ADDR    SZ WCHAN  STIME TTY TIME CMD
  240001 A oracle 13288    1 0  60 20 1ba2f 34032       Nov 03   - 0:06 ora_pmon_DEV
# ps v 13288
  PID TTY STAT TIME PGIN SIZE   RSS   LIM  TSIZ   TRS %CPU %MEM COMMAND
13288   -    A 0:08  225 5616 13904 32768 28420 13512  0.0  1.0 ora_pmon_DEV
The "ps v" fields that are of interest to us are RSS and TRS. RSS is the number of working-segment pages in memory times 4, plus the number of code-segment pages in memory times 4. TRS is just the number of code-segment pages in memory times 4.
Please note that an AIX memory page is 4096 bytes, which is why the number of memory pages must be multiplied by 4 to get the value for RSS and TRS (which are each reported in kilobytes).
For example, if the actual memory used by the code-segment was 2 pages ( 2 pages * 4096 byte page size = 8192 bytes), then the TRS value would be reported as 8 ( 2 {number of pages} * 4 = 8 {kilobytes} ).
Since RSS includes both working-segment and code-segment pages, if we subtract TRS, which is just the code-segment pages, from RSS, we are left with only the working-segment pages, or private memory. In the example above, the oracle pmon background process is using...
13904 (RSS) - 13512 (TRS) = 392
392 * 1024 = 401408 bytes
The correct amount of memory used by the pmon background process is 392 KB (401408 bytes), not 34 MB as reported by the "ps -lf" command.
The value of TRS will be approximately the same for all oracle background processes, since all of these processes are just different invocations of the same $ORACLE_HOME/bin/oracle executable.
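The RSS - TRS subtraction can be scripted so it does not have to be done by hand. The sketch below assumes the "ps v" header shown earlier in this article (it locates the RSS and TRS columns by name from the header line); the function name private_kb is made up for this example.

```shell
#!/bin/sh
# Sketch: private (working-segment) memory in KB for each process
# in `ps v` output read from stdin.
# Assumption: the first line is the header containing RSS and TRS.
private_kb() {
    awk '
        NR == 1 {
            # Remember the position of every column name in the header
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        NF > 0 {
            # RSS and TRS are reported in kilobytes
            print $(col["RSS"]) - $(col["TRS"])
        }
    '
}
```

On AIX this would be used as `ps v 13288 | private_kb`. Feeding it the pmon line from the example reproduces the 392 KB figure.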
To get a good estimate of memory usage for all oracle background processes, sum the private memory for each background process plus the value of TRS for only one of the background processes, plus the size of the SGA.
Note that the private memory obtained from RSS - TRS is not the same thing as the PGA: it contains the PGA plus some other memory that is private to the process. Also, because every Oracle background and server process runs the same oracle executable under $ORACLE_HOME/bin, the code segment (TRS) only needs to be counted once. The total memory used by the Oracle background processes is therefore the private memory of each process, plus TRS counted once, plus the SGA:
(P1.RSS-P1.TRS)+(P2.RSS-P2.TRS)+(P3.RSS-P3.TRS)+…+(Pn.RSS-Pn.TRS)+ TRS + SGA
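The formula above can be evaluated with a short awk wrapper. This is a sketch under stated assumptions: the function name instance_total_kb is invented here, the input is the concatenated "ps v" output of the background processes (header lines may repeat), the SGA size is passed in kilobytes by the caller, and TRS is taken from the first process only, relying on the note's statement that it is approximately the same for all of them.

```shell
#!/bin/sh
# Sketch: sum(RSS - TRS) over all processes + TRS (once) + SGA, in KB.
# stdin: `ps v` output for the background processes (headers may repeat)
# $1:    SGA size in KB (obtained separately, e.g. from the database)
instance_total_kb() {
    awk -v sga="$1" '
        $1 == "PID" {
            # Header line: map column names to positions
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        NF > 0 && ("RSS" in col) {
            rss = $(col["RSS"]); trs = $(col["TRS"])
            private += rss - trs                        # per-process private memory
            if (!seen) { shared = trs; seen = 1 }       # count code segment once
        }
        END { print private + shared + sga }
    '
}
```

A possible way to call it on AIX, assuming the process IDs of the ora_*_DEV processes have been collected into a variable named pids: `for p in $pids; do ps v "$p"; done | instance_total_kb 102400`. The SGA value of 102400 KB here is only a placeholder.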
For information about determining the size of the SGA, please see Note 1008866.6, How to determine SGA Size (7.x, 8.x, 9.x, 10g).
Determining memory usage for shadow processes is a little more complicated, since the amount of memory used can fluctuate greatly from one moment to the next depending on what the user is doing. You should still subtract TRS from RSS to get the private memory used by the shadow process, but remember that this is only a snapshot and the value will change if the process is active. To get a good estimate of the memory used by shadow processes, run the "ps v" command repeatedly at regular intervals while the process is under peak load and take the average. You can then multiply this average by the peak number of expected users to estimate how much memory will be needed on the system.
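The sampling procedure for shadow processes can be sketched as two small shell functions. Both names (sample_private_kb, estimate_shadow_kb) are made up for this example, the sampler assumes the "ps v" column order shown earlier (RSS in field 7, TRS in field 10), and the sample count, interval and user count are values you would choose yourself.

```shell
#!/bin/sh
# Sketch: sample the private memory (RSS - TRS, in KB) of one shadow process.
# $1 = pid, $2 = number of samples, $3 = seconds between samples
# Assumption: AIX `ps v` layout with RSS in field 7 and TRS in field 10.
sample_private_kb() {
    pid=$1; count=$2; interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        ps v "$pid" | awk 'NR == 2 { print $7 - $10 }'
        sleep "$interval"
        i=$((i + 1))
    done
}

# Sketch: average the samples and scale by the expected peak user count.
# stdin: one private-memory sample (KB) per line
# $1:    peak number of expected users
# Prints "<average KB> <estimated total KB>".
estimate_shadow_kb() {
    awk -v users="$1" '
        NF > 0 { sum += $1; n++ }
        END {
            if (n > 0) {
                avg = sum / n
                printf "%d %d\n", avg, avg * users
            }
        }
    '
}
```

Under peak load the two would be combined as, for example, `sample_private_kb 12345 10 30 | estimate_shadow_kb 200`, where 12345 is a hypothetical shadow process ID. The result is a rough planning figure, since it extrapolates from one process to all users.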
For more (generic) information about memory usage, please see Note 17094.1, TECH: Unix Virtual Memory, Paging & Swapping explained.
IV: Some Oracle and AIX memory statistics scripts
1. View PGA and UGA usage for each session
SELECT s.sid, n.name, s.value
FROM v$sesstat s, v$statname n
WHERE s.statistic# = n.statistic#
AND n.name LIKE 'session%memory%'
ORDER BY s.sid, s.statistic#;
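2. Sum of session PGA and UGA memory across the database
-- The pattern matches the current and max statistics for both UGA and PGA.
-- No ORDER BY: the query returns a single aggregate row.
SELECT SUM(s.value)/2
FROM v$sesstat s, v$statname n
WHERE s.statistic# = n.statistic#
AND n.name LIKE 'session%memory%';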
3. View allocated, used, and freeable PGA for each session
SET HEADING ON
COLUMN alme HEADING "Allocated MB" FORMAT 99999D9;
COLUMN usme HEADING "Used MB" FORMAT 99999D9;
COLUMN frme HEADING "Freeable MB" FORMAT 99999D9;
COLUMN mame HEADING "Max MB" FORMAT 99999D9;
COLUMN username FORMAT a15;
COLUMN program FORMAT a22;
COLUMN sid FORMAT a5;
COLUMN spid FORMAT a8;
SET LINESIZE 300;
SET PAGES 100
COMPUTE SUM LABEL 'Total' OF usme alme frme mame ON report
BREAK ON REPORT
-- Turn on HTML markup and start spooling before the query runs
SET MARKUP HTML ON
SPOOL /tmp/pga_useage_output.html
SELECT s.username,
SUBSTR(s.sid,1,5) sid, p.spid, logon_time,
SUBSTR(s.program,1,22) program, p.background, s.process pid_remote,
ROUND(pga_used_mem/1024/1024, 1) usme,
ROUND(pga_alloc_mem/1024/1024, 1) alme,
ROUND(pga_freeable_mem/1024/1024, 1) frme,
ROUND(pga_max_mem/1024/1024, 1) mame
FROM v$session s,v$process p
WHERE p.addr=s.paddr
ORDER BY mame;
-- Close the spool file and restore plain-text output
SPOOL OFF
SET MARKUP HTML OFF
##Oracle MOS provides the following notes with scripts for collecting PGA and UGA usage
“NOTE:239846.1 - Script To Monitor Memory Usage By Database Sessions”
“Script To Monitor RDBMS Session UGA and PGA Current And Maximum Usage Over Time (Doc ID 835254.1)”
“Monitor Oracle Resource Consumption in UNIX (Doc ID 148466.1)”
4. View the combined PGA and SGA usage of the database
SELECT SUM(bytes)/1024/1024 Mbytes
FROM (SELECT bytes
FROM v$sgastat
UNION ALL
SELECT value bytes
FROM v$sesstat s, v$statname n
WHERE n.statistic# = s.statistic#
AND n.name = 'session pga memory' );
Or:
SELECT SUM(bytes)/1024/1024 Mbytes
FROM (SELECT bytes
FROM v$sgastat
UNION ALL
SELECT pga_alloc_mem AS bytes
FROM v$process);