Enabling HugePages on Oracle 11.2.0.3.0: A Detailed Walkthrough

Please credit the source when reposting: http://blog.csdn.net/guoyjoe/article/details/17138391

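When should you use HugePages? As a rule of thumb, if the host has 64 GB of physical memory and the SGA is set to 32 GB or more, enabling HugePages is recommended. The steps are as follows: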

1. Disable AMM (Automatic Memory Management) in Oracle Database 11g by setting both MEMORY_TARGET and MEMORY_MAX_TARGET to 0.

If setting MEMORY_MAX_TARGET to 0 fails, see http://blog.csdn.net/guoyjoe/article/details/12845965

gyj@OCM> show parameter memory_max_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
memory_max_target                    big integer 0
gyj@OCM> show parameter memory_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
memory_target                        big integer 0


2. Use the script provided by My Oracle Support (Doc ID 401749.1) to compute the HugePages setting
[oracle@mydb admin]$ vi /u01/app/oracle/product/11.2.0/rdbms/admin/hugepages_settings.sh

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support 
# http://support.oracle.com

# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support 
(http://support.oracle.com) where it is intended to compute values for 
the recommended HugePages/HugeTLB configuration for the current shared 
memory segments. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and 
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size, 
   as the new SGA will not fit in the previous HugePages configuration, 
   it had better disable the whole HugePages, 
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup 
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m


Press Enter to proceed..."

read

# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi

# Initialize the counter
NUM_PG=0

# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done

RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`

# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for 
HugePages configuration. HugePages can only be used for shared memory segments 
that you can list by command:

    # ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running 
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi

# Finish with results
case $KERN in
    '2.2') echo "Kernel version $KERN is not supported. Exiting." ;;
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
esac

# End

3. Make the hugepages_settings.sh script executable

[oracle@mydb admin]$  chmod +x hugepages_settings.sh


4. Run hugepages_settings.sh; it recommends 1028 hugepages (a page count, not a size in MB)

[oracle@mydb admin]$ ./hugepages_settings.sh

This script is provided by Doc ID 401749.1 from My Oracle Support 
(http://support.oracle.com) where it is intended to compute values for 
the recommended HugePages/HugeTLB configuration for the current shared 
memory segments. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and 
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size, 
   as the new SGA will not fit in the previous HugePages configuration, 
   it had better disable the whole HugePages, 
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup 
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m


Press Enter to proceed...

Recommended setting: vm.nr_hugepages = 1028

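The script recommends 1028 hugepages. Each page is 2 MB on this system (the size is fixed and cannot be tuned here), so 1028 * 2 MB = 2056 MB. In practice the hugepage count is driven by sga_max_size and should be only slightly larger than it: allow at least one page more than SGA_MAX_SIZE needs, but do not allocate far more 2 MB pages than that, because hugepages that Oracle does not use are wasted memory.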

gyj@OCM> show parameter sga_max_size


NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
sga_max_size                         big integer 2G
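
The per-segment arithmetic in hugepages_settings.sh can be sketched as follows. This is a minimal illustration, not part of the Oracle script: the helper name `pages_for_segment` is hypothetical, and it assumes the hugepage size is given in kB, as /proc/meminfo reports it.

```shell
#!/bin/bash
# pages_for_segment BYTES HPG_KB
# Mirrors the loop body of hugepages_settings.sh: integer-divide the segment
# size by the hugepage size, then add one page of headroom.
# (Hypothetical helper name, used only for illustration.)
pages_for_segment() {
    local seg_bytes=$1 hpg_kb=$2
    local min_pg=$(( seg_bytes / (hpg_kb * 1024) ))
    if [ "$min_pg" -gt 0 ]; then
        echo $(( min_pg + 1 ))
    else
        # Segments smaller than one hugepage are ignored by the script
        echo 0
    fi
}

# Example: a single 2 GB segment with 2048 kB hugepages
pages_for_segment $(( 2 * 1024 * 1024 * 1024 )) 2048   # prints 1025
```

The script's total of 1028 is the sum of this calculation over every shared memory segment listed by `ipcs -m`, so an SGA split across several segments picks up one extra page per segment.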


5. Configure HugePages by adding one line to the kernel parameters (vi /etc/sysctl.conf)

vm.nr_hugepages = 1028


6. Apply the kernel parameter change immediately
[root@mydb ~]# sysctl -p

net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
fs.file-max = 6815744
vm.nr_hugepages = 1028
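
After `sysctl -p` it is worth confirming that the kernel actually allocated the full pool, since it can end up with fewer pages than requested when not enough contiguous free memory is available. The sketch below is one way to check; the function name `check_hugepage_pool` is hypothetical, and the optional second argument exists only so the check can be pointed at a file other than /proc/sys/vm/nr_hugepages.

```shell
#!/bin/bash
# check_hugepage_pool WANT [FILE]
# Compares the requested page count with the value the kernel reports.
# FILE defaults to /proc/sys/vm/nr_hugepages.
# (Hypothetical helper name, used only for illustration.)
check_hugepage_pool() {
    local want=$1 file=${2:-/proc/sys/vm/nr_hugepages}
    local have
    have=$(cat "$file") || return 2
    if [ "$have" -lt "$want" ]; then
        echo "short by $(( want - have )) pages"
        return 1
    fi
    echo "ok"
}

# On the database host, as any user:
#   check_hugepage_pool 1028
```

If the pool comes up short, rebooting so that the pages are reserved early, before memory becomes fragmented, is the usual remedy.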

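7. Don't forget /etc/security/limits.conf. The memlock value is in KB and must be larger than sga_max_size; here it is set to 2056000. (Note that 2056 MB is 2056 * 1024 = 2105344 KB, so 2056000 KB is in fact slightly below the 2 GB sga_max_size of 2097152 KB; 2105344 would cover the whole hugepage pool.)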

[root@mydb ~]# vi /etc/security/limits.conf

oracle          soft    memlock 2056000
oracle          hard    memlock 2056000
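
Because memlock is expressed in KB, the value that covers the whole hugepage pool is simply the page count multiplied by the page size in kB. A minimal sketch, with the hypothetical helper name `memlock_kb`:

```shell
#!/bin/bash
# memlock_kb PAGES HPG_KB
# Size of the whole hugepage pool in KB, suitable as a memlock value.
# (Hypothetical helper name, used only for illustration.)
memlock_kb() {
    local pages=$1 hpg_kb=$2
    echo $(( pages * hpg_kb ))
}

# 1028 pages of 2048 kB each
memlock_kb 1028 2048   # prints 2105344
```

For comparison, a 2 GB sga_max_size is 2097152 KB, so a memlock of 2105344 KB sits just above it.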


8. Check that the limit is in effect
[root@mydb ~]# su - oracle
[oracle@mydb ~]$ ulimit -l
2056000

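9. Restart the database. Note: exit the existing oracle session back to root and run su - oracle again, so that the new limit applies to the session that starts the instance.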

sys@OCM> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@mydb ~]$ exit
logout

You have new mail in /var/spool/mail/root
[root@mydb ~]# su -  oracle
[oracle@mydb ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Thu Dec 5 10:56:20 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

sys@OCM> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
sys@OCM> startup
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                  2230072 bytes
Variable Size            1409288392 bytes
Database Buffers          603979776 bytes
Redo Buffers              122388480 bytes
Database mounted.
Database opened.

10. Check the hugepages; they are now in use

[oracle@mydb ~]$ watch -n1 'cat /proc/meminfo |grep -i HugePage'

Every 1.0s: cat /proc/meminfo |grep -i HugePage                                                   Thu Dec  5 11:09:06 2013

HugePages_Total:  1028
HugePages_Free:    869
HugePages_Rsvd:    842
Hugepagesize:     2048 kB

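Notes:
HugePages_Total:  1028    --- 1028 pages in total
HugePages_Free:    869    --- 869 pages free, so 1028-869=159 pages are currently in use, i.e. 159*2M=318M, which is less than sga_target.
HugePages_Rsvd:    842    --- the OS has promised to reserve 842 pages for Oracle, i.e. 842*2M=1684M (1684+318=2002M, roughly SGA_MAX_SIZE)
Hugepagesize:     2048 kB --- each page is 2M; not modifiable here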
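
The arithmetic in these notes can be reproduced directly from /proc/meminfo. The sketch below reads meminfo-style text on standard input and prints the used pages, the used size in MB and the reserved size in MB; the function name `hugepage_usage` is hypothetical.

```shell
#!/bin/bash
# hugepage_usage: read /proc/meminfo-style text on stdin and print
# "used_pages used_mb reserved_mb".
# (Hypothetical helper name, used only for illustration.)
hugepage_usage() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        /^HugePages_Rsvd:/  { rsvd = $2 }
        /^Hugepagesize:/    { size_kb = $2 }
        END {
            used = total - free
            printf "%d %d %d\n", used, used * size_kb / 1024, rsvd * size_kb / 1024
        }'
}

# Sample values taken from the output above
hugepage_usage <<'EOF'
HugePages_Total:  1028
HugePages_Free:    869
HugePages_Rsvd:    842
Hugepagesize:     2048 kB
EOF
# prints: 159 318 1684
```

On a live system the same function can be fed with `hugepage_usage < /proc/meminfo`.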


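Once HugePages are in use, the SGA is pinned in memory by default, so there is no need to set lock_sga. Next, let's look at the parameter pre_page_sga. It defaults to false; here I turn it on.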

sys@OCM> alter system set pre_page_sga=true scope=spfile;

System altered.

sys@OCM> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     TRUE
sga_max_size                         big integer 2G
sga_target                           big integer 1G

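After the instance is restarted (the parameter was changed with scope=spfile), /proc/meminfo shows:

HugePages_Total:  1028    --- 1028 pages in total
HugePages_Free:    548    --- 548 pages free, so 1028-548=480 pages are currently in use, i.e. 480*2M=960M, roughly equal to sga_target: pre_page_sga has taken effect.
HugePages_Rsvd:    521    --- the OS has promised to reserve 521 pages for Oracle, i.e. 521*2M=1042M (think of it as sga_max_size-sga_target)
Hugepagesize:     2048 kB --- each page is 2M; not modifiable here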

Reference from My Oracle Support: USE_LARGE_PAGES To Enable HugePages (Doc ID 1392497.1)

For 11.2.0.2 and further, the Oracle Database Server has added a new parameter that helps managing the hugepages for use by the database. 
The initialization parameter that was added is USE_LARGE_PAGES. 

USE_LARGE_PAGES parameter has these possible values: "true" (default), "only", "false".

1. The default value of "true" preserves the current behavior of trying to use hugepages if they are available on the OS. 

In 11.2.0.2 if there are not enough hugepages, only small pages will be used for SGA memory. This may lead to ORA-4030 errors due to the remaining hugepages going unused and more memory being used by the kernel for page tables. 

In 11.2.0.3 the behavior was changed such that Oracle will now allocate what it can of the SGA in hugepages and if it runs out, it will allocate the rest of the SGA using small pages. With this new behavior additional shared memory segments are an expected side effect. Part of the change is to ensure that each shared memory segment making up the SGA only contains sub-areas with an identical alignment requirement - hence the SGA will spread over more separate SHM segments. In this supported mixed page mode the database will exhaust the available hugepages, before switching to regular sized pages.
 
2. Setting it to "false" means do not use hugepages
 
3. A setting of "only" means do not start up the instance if hugepages cannot be used for the whole memory (to avoid an out-of-memory situation).


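A supplementary note on overcommit for memory requests:

The overcommit mechanism in Linux mainly exists to cope with the impact that abnormally large memory requests could have on the OS itself.
Linux has three overcommit policies, configured through /proc/sys/vm/overcommit_memory. Their meanings are:
0: heuristic policy. An overcommit with serious consequences will be refused, while a minor overcommit is allowed.
1: always allow overcommit. This policy suits applications that cannot tolerate a failed memory allocation, such as some scientific computing applications.
2: never overcommit. In this mode the memory the system will hand out cannot exceed swap + RAM * ratio (/proc/sys/vm/overcommit_ratio, default 50%, adjustable). Once that much has been used up, any further attempt to request memory returns an error, which usually means that no new program can be started.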

[root@mydb vm]# cd /proc/sys/vm
[root@mydb vm]# ls
block_dump                 flush_mmap_pages      min_free_kbytes     overcommit_memory         swappiness
dirty_background_ratio     hugetlb_shm_group     min_slab_ratio      overcommit_ratio          swap_token_timeout
dirty_expire_centisecs     laptop_mode           min_unmapped_ratio  pagecache                 vfs_cache_pressure
dirty_ratio                legacy_va_layout      mmap_min_addr       page-cluster              zone_reclaim_mode
dirty_writeback_centisecs  lowmem_reserve_ratio  nr_hugepages        panic_on_oom
drop_caches                max_map_count         nr_pdflush_threads  percpu_pagelist_fraction
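Suppose the operating system has only 1000M of memory and an application requests 1200M. The OS will promise the 1200M, i.e. the promise is backed by overcommit, and at this point no space has actually been allocated.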
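
For policy 2, the limit described above can be worked out with simple integer arithmetic. The sketch below follows the simplified formula given in this post (swap + RAM * ratio); the kernel's own accounting has further terms, for example memory set aside for hugepages, which are ignored here. The function name `commit_limit_kb` is hypothetical.

```shell
#!/bin/bash
# commit_limit_kb RAM_KB SWAP_KB RATIO_PERCENT
# Simplified commit limit under overcommit_memory=2:
#   swap + RAM * overcommit_ratio / 100
# (Hypothetical helper name; ignores hugepage and other kernel adjustments.)
commit_limit_kb() {
    local ram_kb=$1 swap_kb=$2 ratio=$3
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# 1000 MB of RAM (1024000 kB), no swap, default ratio of 50
commit_limit_kb 1024000 0 50   # prints 512000
```

With these numbers a 1200M request would be refused under policy 2, whereas under policy 0 or 1 it may be promised as in the example above.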
