RAC数据库的 ORA-27123: Unable To Attach To Shared Memory Segment Linux-x86_64 Error: 22: Invalid argument
RAC数据库的场景
由于生产环境的某业务大量写操作,导致出现了如下的错误:
Making DataFactory:com.raqsoft.report.dataset.SQLDataSetFactory failure (dataset named):ds1 Caused:ORA-01034: ORACLE not available ORA-27123: unable to attach to shared memory segment Linux-x86_64 Error: 22: Invalid argument Additional information: 25 Additional information: 1605653 Additional information: -1073741824
环境 | 版本信息 |
---|---|
数据库的版本 | Release 11.2.0.4.0 Production |
操作系统版本 | 2.6.32-696.el6.x86_64 |
查看环境的命令:
[root@newarpdb01 ~]# uname -r
2.6.32-696.el6.x86_64
[root@newarpdb01 ~]# grep memlock /etc/security/limits.conf
# - memlock - max locked-in-memory address space (KB)
[root@newarpdb01 ~]# grep HugePsges /proc/meminfo
[root@newarpdb01 ~]# grep HugePages /proc/meminfo
AnonHugePages: 22540288 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
[root@newarpdb01 ~]# grep Hugepages /proc/meminfo
Hugepagesize: 2048 kB
[root@newarpdb01 ~]# sysctl -a |grep shm
kernel.shmmax = 536870912000
kernel.shmall = 131072000
kernel.shmmni = 4096
kernel.shm_rmid_forced = 0
vm.hugetlb_shm_group = 0
[root@newarpdb01 ~]# uname -r
2.6.32-696.el6.x86_64
基本定位在HugePages
上面
系统进程是通过虚拟地址访问内存,但是CPU必须把它转换程物理内存地址才能真正访问内存。为了提高这个转换效率,CPU会缓存最近的虚拟内存地址和物理内存地址的映射关系,并保存在一个由CPU维护的映射表中。为了尽量提高内存的访问速度,需要在映射表中保存尽量多的映射关系。
而在Linux中,内存都是以页的形式划分的,默认情况下每页是4K,这就意味着如果物理内存很大,则映射表的条目将会非常多,会影响CPU的检索效率。因为内存大小是固定的,为了减少映射表的条目,可采取的办法只有增加页的尺寸。
page table
是操作系统上的虚拟内存系统的数据结构模型,用于存储虚拟地址与物理地址的对应关系。当我们访问内存时,首先访问page table,然后Linux在通过page table的mapping来访问真实物理内存(ram+swap)
解决思路
如何计算nr_hugepages,可以使用如下的shell,hugepages_settings.sh
#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# on Oracle Linux
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com
# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
* For ASM instance, it needs to configure ASMM instead of AMM.
* The 'pga_aggregate_target' is outside the SGA and
you should accommodate this while calculating the overall size.
* In case you changes the DB SGA size,
as the new SGA will not fit in the previous HugePages configuration,
it had better disable the whole HugePages,
start the DB with new SGA size and run the script again.
And make sure that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m
Press Enter to proceed..."
read
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
echo "The hugepages may not be supported in the system where the script is being executed."
exit 1
fi
# Initialize the counter
NUM_PG=0
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
if [ $MIN_PG -gt 0 ]; then
NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
fi
done
RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`
# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
echo "***********"
echo "** ERROR **"
echo "***********"
echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:
# ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
* Oracle Database instance is up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not configured"
exit 1
fi
# Finish with results
case $KERN in
'2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
'2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
'3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
'3.10') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
'4.1') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
'4.14') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
*) echo "Kernel version $KERN is not supported by this script (yet). Exiting." ;;
esac
# End
使用如下,可得:
[root@newarpdb01 u01]# ./hugepages_settings.sh
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
* For ASM instance, it needs to configure ASMM instead of AMM.
* The 'pga_aggregate_target' is outside the SGA and
you should accommodate this while calculating the overall size.
* In case you changes the DB SGA size,
as the new SGA will not fit in the previous HugePages configuration,
it had better disable the whole HugePages,
start the DB with new SGA size and run the script again.
And make sure that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m
Press Enter to proceed...
Recommended setting: vm.nr_hugepages = 117001
修改vm.nr_hugepages参数
# vi /etc/sysctl.conf
修改vm.nr_hugepages参数
vm.nr_hugepages = 117001
执行sysctl –p使配置生效:
# sysctl -p
备注: Hugepages是在分配后就会预留出来的,其大小一定要比服务器上所有实例的SGA总和要大
关于memlock
, 修改内核参数memlock,单位是KB,如果内存是16G,memlock的大小要稍微小于物理内存。计划lock 12GB的内存大小。参数设置为大于SGA是没有坏处的。
然后重启各个实例的RAC的数据库。
RAC数据库无法重启,出现ORA-27100: shared memory realm already exists Linux-x86_64 Error: 17: File exists Additional information: 2097152
查看shared memory
[root@newarpdb01 u01]# su - oracle
[oracle@newarpdb01 ~]$ $ORACLE_HOME/bin/sysresv
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 1441809 grid 640 4096 0
0x00000000 1474578 grid 640 4096 0
0x6d9f45cc 1507347 grid 640 4096 0
0x00000000 1572884 oracle 640 1610612736 102
0x00000000 1605653 oracle 640 243739394048 101
0xb0b15d50 1638422 oracle 640 2097152 102
定位了1605653。占用了
Linux:
% ipcrm shm 1605653
然后 重启某实例的RAC数据库
srvctl stop instance -d newarpdb -i newarpdb2
srvctl start instance -d newarpdb -i newarpdb2
文献参考:
Semaphores and Shared Memory - An Overview (Doc ID 153961.1)
Startup Error ORA-27123: Unable To Attach To Shared Memory Segment Linux-x86_64 Error: 22: Invalid argument (Doc ID 1607545.1)
Configuring HugePages for Oracle on Linux (x86-64)