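It has been a long time since I last set up a midrange (POWER) box or touched AIX. I recently had to put this checklist together again for work, so I am sharing it for reference; hopefully someone still finds it useful ;-)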
RAC and Oracle Clusterware Best Practices and Starter Kit (AIX) (Doc ID 811293.1)
RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent) (Doc ID 810394.1)
OS Configuration Considerations: https://www.oracle.com/technetwork/database/clusterware/overview/rac-aix-system-stability-131022.pdf
IBM Technical Brief: "Managing the Stability and Performance of current Oracle Database versions running AIX on Power Systems including POWER9"
AIX 7.1 TL05 or later (AIX 7.1 TL05 is the recommended minimum AIX level)
AIX 7.1 required packages: //(Doc ID 169706.1)
bos.adt.base
bos.adt.lib
bos.adt.libm
bos.perf.libperfstat
bos.perf.perfstat
bos.perf.proctools
xlC.rte.11.1.0.2 or later
gpfs.base 3.3.0.11 or later (Only for RAC systems that will use GPFS cluster filesystems)
(On AIX 7.x the defaults already match the recommended values)
Parameter            Recommended value
minperm%             3
maxperm%             90
maxclient%           90
strict_maxclient     1
strict_maxperm       0
lru_file_repage      0    ##--> this parameter is deprecated in AIX 7.x
lru_poll_interval    10
minfree              960
maxfree              1088
page_steal_method    1
memory_affinity      1
v_pinshm             0
lgpg_regions         0
lgpg_size            0
maxpin%              90
esid_allocator       1
vmm_klock_mode       2
## How to check:
% vmo -Fa|egrep "minperm%|maxperm%|maxclient%|strict_maxclient|strict_maxperm|lru_file_repage|lru_poll_interval|minfree |maxfree |page_steal_method| memory_affinity|v_pinshm|lgpg_regions|lgpg_size|maxpin%|esid_allocator|vmm_klock_mode"
## How to set:
% vmo -p -o Tunable=Newvalue
% vmo -r -o Tunable=Newvalue (requires a reboot to take effect)
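Comparing the egrep output against the table by eye is error-prone, so here is a minimal sketch that does the comparison. It assumes the output lines have the form "name = value"; check_vmo and RECOMMENDED are names made up for this example, and lru_file_repage is left out because it is deprecated on AIX 7.x.

```shell
# Recommended values copied from the table above, as name=value pairs.
RECOMMENDED="minperm%=3 maxperm%=90 maxclient%=90 strict_maxclient=1 strict_maxperm=0 lru_poll_interval=10 minfree=960 maxfree=1088 page_steal_method=1 memory_affinity=1 v_pinshm=0 lgpg_regions=0 lgpg_size=0 maxpin%=90 esid_allocator=1 vmm_klock_mode=2"

# check_vmo: read "name = value" lines on stdin (assumed vmo output layout)
# and print one line for every tunable that differs from the table.
check_vmo() {
    awk -v rec="$RECOMMENDED" '
        BEGIN {
            # Build a lookup table: want[name] = recommended value
            n = split(rec, pairs, " ")
            for (i = 1; i <= n; i++) {
                split(pairs[i], kv, "=")
                want[kv[1]] = kv[2]
            }
        }
        # Only look at lines shaped like: name = value
        NF >= 3 && $2 == "=" && ($1 in want) {
            if ($3 != want[$1])
                printf "%s: current=%s recommended=%s\n", $1, $3, want[$1]
        }
    '
}
```

On an AIX host it would be used as `vmo -Fa | check_vmo`; no output means every listed tunable matches the table.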
Page sizes set at 64 KB and 16 MB have been shown to benefit Oracle performance by reducing kernel look aside processing to resolve virtual to physical addresses. Oracle 11.2, 12c, 18c and 19c use 64 KB pages for SGA by default. The general recommendation for most Oracle databases on AIX is to utilize 64KB page size and not 16MB page size for the SGA.
Use of AIX LDR_CNTRL ENVIRONMENTAL Settings with Oracle (Doc ID 2066837.1)
How to set:
--> Oracle 11.2.0.4 (Single Instance)
Set for EACH database (oracle) user as well as for the Database TNS listener user:
$ export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K --> oracle user
$ export VMM_CNTRL=vmm_fork_policy=COR --> Listener user
--> Oracle 11g (RAC)
As an Oracle user:
$ srvctl setenv database -d -t "LDR_CNTRL=TEXTPSIZE=64K@DATAPSIZE=64K@STACKPSIZE=64K"
As an Oracle Grid user:
$ srvctl setenv listener -l LISTENER -t "LDR_CNTRL=TEXTPSIZE=64K@DATAPSIZE=64K@STACKPSIZE=64K"
How to check:
$ crsctl stat res ora.LISTENER.lsnr -p |grep USR_ORA_ENV
USR_ORA_ENV=LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K VMM_CNTRL=vmm_fork_policy=COR --> already set
$ ps -ef|grep tnslsnr|grep -v grep |awk '{print $2}' | xargs -I {} svmon -P {} |grep BSS
3aceba 11 work text data BSS heap sm 2885 0 0 2885 --> 'sm' in the output means it is not set; 'm' means it is set
1d0e9d 10 clnt text data BSS heap, s 193 0 - -
Note: export AIXTHREAD_SCOPE=S is only necessary on AIX 5L. A change introduced in AIX 6.1 means the variable no longer needs to be set.
Ensure that the GI and ORACLE owner accounts have the CAP_NUMA_ATTACH, CAP_BYPASS_RAC_VMM, and CAP_PROPAGATE capabilities. This is required per the 11gR2 and above installation guide and it is also required for all pre-11gR2 installations. A check and set example for the grid user follows:
#/usr/bin/lsuser -a capabilities grid
#/usr/bin/chuser capabilities=CAP_NUMA_ATTACH,CAP_BYPASS_RAC_VMM,CAP_PROPAGATE grid
For AIX executing on POWER9 based systems, SMT8 mode typically provides best performance. SMT1 mode on POWER9 based systems will typically result in significant performance degradation with Oracle databases.
SMT8 is the default mode on POWER9 CPUs.
Check command: # smtctl |grep SMT
Virtual processor folding: This is a feature of Power Systems in which unused virtual processors are taken offline until the demand requires that they be activated.
The default is to allow virtual processor folding, and this should not be altered without consulting AIX support. (schedo parameter vpm_fold_policy=2).
For Oracle database environments it is strongly suggested to set schedo parameter vpm_xvcpus to a value of 2 as we have seen AIX incorrectly folding too many processors if the parameter is left at default of 0.
*** Note that this is a critical setting in a RAC environment when using LPARs with processor folding enabled. If this setting is not adjusted, there is a high risk of RAC node evictions under light database workload conditions ***
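In short: processor folding (enabled by default) takes virtual CPUs offline while the server is idle and reactivates them on demand. For Oracle database environments it is strongly recommended to set vpm_xvcpus to 2 or higher so that too many CPUs are not folded; in a RAC environment there is otherwise a high risk of node eviction when the load is low.
How to set: do not change vpm_fold_policy, but set vpm_xvcpus to 2 (keeps 2 virtual processors online beyond those the workload requires);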
# schedo -p -o vpm_xvcpus=2
How to check:
# schedo -a|grep vpm_xvcpus
ASM is used, so PP striping and CIO-related settings are not considered.
For clustered ASM (e.g. RAC) configurations, SCSI reservation must be disabled on all ASM hdisk and hdiskpower devices (e.g. reserve_policy=no_reserve).
#MOS 422075.1, SCSI reservation must be disabled, for:
# 1. SSA, FAStT, or non-MPIO-capable disks -> reserve_lock=no
# 2. ESS, EMC, HDS, CLARiiON, or MPIO-capable disks -> reserve_policy=no_reserve
How to check: lsattr -E -l hdiskX | grep reserve_
How to set: chdev -l hdiskX -a reserve_policy=no_reserve -P
AIO kernel extensions are loaded at system boot (always loaded), AIO servers stay active as long as there are service requests, and the number of AIO servers is dynamically increased or reduced based on demand of the workload. The aio_server_inactivity parameter defines after how many seconds idle time an AIO server will exit. AIO tunables are now based on logical CPU count, and hence it is usually not necessary to tune minservers, maxservers, and maxreqs as in the past.
sb_max >= 1MB (1048576) and must be greater than the maximum tcp or udp send or recv space
tcp_sendspace = 262144
tcp_recvspace = 262144
udp_sendspace = db_block_size * db_file_multiblock_read_count + 4K
udp_recvspace = 10 * (udp_sendspace)
tcp_fastlo = 1 #?? AIX 7+ only. This can be very useful for improving the performance of loopback (bequeath) connections and should be evaluated with application testing.
rfc1323 = 1
// Ephemerals (non-defaults suggested for a large number of connecting hosts or a high degree of parallel query; also to avoid install-time warnings)
// Importance of TCP/UDP Parameters in 11gR2 (Doc ID 1307804.1)
tcp_ephemeral_low = 9000
tcp_ephemeral_high = 65500
udp_ephemeral_low = 9000
udp_ephemeral_high = 65500
How to set:
no -p -o rfc1323=1
no -p -o sb_max=4194304 # (Doc ID 811293.1)
no -p -o udp_sendspace=135168 #((DB_BLOCK_SIZE*DB_FILE_MULTIBLOCK_READ_COUNT)+4 KB), but no lower than 65536;
# Example: 135168 = 8192 * 16 + 4096, with db_block_size=8192 (8K) and db_file_multiblock_read_count=16;
no -p -o udp_recvspace=1351680 # Minimum recommended value is 10x udp_sendspace, parameter value must be less than sb_max;
no -p -o tcp_sendspace=262144
no -p -o tcp_recvspace=262144
no -p -o tcp_ephemeral_low=9000 -o tcp_ephemeral_high=65500
no -p -o udp_ephemeral_low=9000 -o udp_ephemeral_high=65500
no -p -o tcp_fastlo=1 # MOS 2141756.1; older AIX levels have bugs: AIX 6.1 TL9: IV67463, AIX 7.1 TL3 SP3/SP4: IV66228, AIX 7.1 TL4: IV67443
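The UDP buffer sizes depend on the database block size and multiblock read count, so they need recomputing for any other combination. A small sketch of that arithmetic follows; the function names are made up for this example, and it only prints numbers rather than calling no.

```shell
# udp_sendspace DB_BLOCK_SIZE DB_FILE_MULTIBLOCK_READ_COUNT
# (block size * multiblock read count) + 4 KB, but never below 65536.
udp_sendspace() {
    v=$(( $1 * $2 + 4096 ))
    if [ "$v" -lt 65536 ]; then
        v=65536
    fi
    echo "$v"
}

# udp_recvspace UDP_SENDSPACE: the minimum recommended value, 10x sendspace.
udp_recvspace() {
    echo $(( 10 * $1 ))
}

# below_sb_max VALUE SB_MAX: succeeds only if VALUE is strictly below sb_max.
below_sb_max() {
    [ "$1" -lt "$2" ]
}
```

For example, `udp_sendspace 8192 16` gives 135168 and `udp_recvspace 135168` gives 1351680, matching the values used above; `below_sb_max 1351680 4194304` confirms the receive space fits under the sb_max setting.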
Jumbo frames are used to reduce the number of frames to transmit a given volume of network traffic, but they only work if enabled on every hop in the network infrastructure. Jumbo frames help to reduce network and CPU overheads. The use of Jumbo frames between the database RAC nodes in a cluster (RAC interconnect) is strongly recommended.
# How to check
Recommendation for the Real Application Cluster Interconnect and Jumbo Frames (Doc ID 341788.1)
Traceroute: Notice the 9000 packet goes through with no error, while the 9001 fails, this is a correct configuration that supports a message of up to 9000 bytes with no fragmentation:
[node01] $ traceroute -F node02-priv 9000
traceroute to node02-priv (10.x.x.2), 30 hops max, 9000 byte packets
1 node02-priv (10.x.x.2) 0.232 ms 0.176 ms 0.160 ms
[node01] $ traceroute -F node02-priv 9001
traceroute to node02-priv (10.x.x.2), 30 hops max, 9001 byte packets
traceroute: sendto: Message too long
1 traceroute: wrote node02-priv 9001 chars, ret=-1
Ping: With ping we have to account for 28 bytes of overhead per packet (20-byte IP header plus 8-byte ICMP header), so 8972 bytes go through with no errors while 8973 fail. This is a correct configuration that supports a message of up to 9000 bytes with no fragmentation (the -M do option shown is Linux ping syntax):
[node01]$ ping -c 2 -M do -s 8972 node02-priv
PING node02-priv (10.x.x.2) 1472(1500) bytes of data.
1480 bytes from node02-priv (10.x.x.2): icmp_seq=0 ttl=64 time=0.220 ms
1480 bytes from node02-priv (10.x.x.2): icmp_seq=1 ttl=64 time=0.197 ms
[node01]$ ping -c 2 -M do -s 8973 node02-priv
From node02-priv (10.x.x.1) icmp_seq=0 Frag needed and DF set (mtu = 9000)
From node02-priv (10.x.x.1) icmp_seq=0 Frag needed and DF set (mtu = 9000)
--- node02-priv ping statistics ---
0 packets transmitted, 0 received, +2 errors
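The payload size to pass to ping follows directly from the MTU and the 28-byte overhead mentioned above. A one-line sketch, with icmp_payload being a name made up for this example and IPv4 without IP options assumed:

```shell
# icmp_payload MTU: largest ping payload that fits in one unfragmented
# IPv4 packet (20-byte IP header + 8-byte ICMP header = 28 bytes).
icmp_payload() {
    echo $(( $1 - 28 ))
}
```

`icmp_payload 9000` gives 8972, the size that should pass on a jumbo-frame interconnect; one byte more should fail.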
ulimits (smit chuser, or edit /etc/security/limits to create a stanza for the oracle/grid user, and set -1 (unlimited) for everything except core)
oracle:
data = -1
stack = -1
fsize_hard = -1
cpu_hard = -1
data_hard = -1
stack_hard = -1
fsize = -1
nofiles = -1
cpu = -1
rss = -1
grid:
data = -1
stack = -1
fsize_hard = -1
cpu_hard = -1
data_hard = -1
stack_hard = -1
fsize = -1
nofiles = -1
cpu = -1
rss = -1
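Since the two stanzas are identical apart from the user name, they can be generated instead of typed. A minimal sketch, assuming tab-indented attributes; limits_stanza is a name made up for this example, and it only prints to stdout, so /etc/security/limits itself is not touched.

```shell
# limits_stanza USER: print a limits stanza for USER with every limit
# from the list above set to -1 (unlimited); core is deliberately omitted.
limits_stanza() {
    user=$1
    printf '%s:\n' "$user"
    for attr in data stack fsize nofiles cpu rss \
                data_hard stack_hard fsize_hard cpu_hard; do
        printf '\t%s = -1\n' "$attr"
    done
}
```

Run `limits_stanza oracle` and `limits_stanza grid`, review the output, and merge it into the limits file by hand.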
Maximum number of PROCESSES allowed per user (smit chgsys). Set this value to 16384 (16k)
How to check: lsattr -EHl sys0 |egrep "maxuproc|ncargs"
How to set:
chdev -l sys0 -a maxuproc=16384 # no lower than 16384
chdev -l sys0 -a ncargs=128 # AIX 6.1 default = 256; no lower than 128; the default is fine;
AIX : OLSON TZ result in performance and eviction (Doc ID 2215971.1)
The default time zone format on AIX 6.1 and AIX 7 is Olson rather than POSIX; for performance-sensitive applications it may be changed to the POSIX time zone format.
How to check: echo $TZ ; the output should be BEIST-8;
How to set:
smitty chtz_date
"Change Time Zone Using User Inputted Values"
Standard Time ID(only alphabets) = BEIST
Standard Time Offset from CUT([+|-]HH:MM:SS) = -8
#MPIO path failover (Doc ID 560077.1)
Check: lsattr -El fscsi0
Set: chdev -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail -P
#16Gb FC ??
Check: lsattr -El fcs0
Set: chdev -l fcs0 -a num_cmd_elems=1024 -a max_xfer_size=0x200000 -P
max_xfer_size should be increased from default 1MB to 2MB. The default adapter DMA memory size is 16 MB which increases to 128 MB when a non default max_xfer_size is used. Larger DMA size can be important for performance with many concurrent large block I/O’s.
num_cmd_elems might need to be increased if fcstat -e reports a persistent nonzero value for No Command Resource Count. Please verify with your storage provider possible limits and recommended storage best practices before changing num_cmd_elems.