11gR2新特性:Heavy swapping observed on system in last 5 mins

11gR2新特性:Heavy swapping observed on system in last 5 mins.

问题:
今天发现数据库运行缓慢,多个update语句一直持续运行,给数据造成了很大压力。

11gR2新特性:Heavy swapping observed on system in last 5 mins_第1张图片

查看日志发现了下面的错误:
LNS: Standby redo logfile selected for thread 1 sequence 15058 for destination LOG_ARCHIVE_DEST_4
Fri Jan 04 13:46:04 2013
Archived Log entry 74894 added for thread 1 sequence 15057 ID 0xe9b400d dest 1:
Fri Jan 04 13:48:00 2013
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [2.17%] pct of memory swapped out [1.30%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.
Fri Jan 04 13:51:54 2013
LGWR: Standby redo logfile selected to archive thread 1 sequence 15059
LGWR: Standby redo logfile selected for thread 1 sequence 15059 for destination LOG_ARCHIVE_DEST_2
......
ORA-00020: 超出最大进程数 (600)
 ORA-20 errors will not be written to the alert log for
 the next minute. Please look at trace files to see all
 the ORA-20 errors.
 
分析:
在11gR2中DBRM(database resource manager,11gR2中新的后台进程,见《Learning 11g New Background Processes》)会在Alert.log告警日志中反映OS操作系统最近5分钟是否有剧烈的swap活动了

可以观察DBRM后台进程的trace:
oracle     503 21966  0 16:30 pts/4    00:00:00 grep dbrm
oracle    6866     1  0  2012 ?        00:01:01 ora_dbrm_nticket1
[oracle@nticket1 trace]$ pwd
/u01/app/ora11g/diag/rdbms/nticket1/nticket1/trace
[oracle@nticket1 trace]$ ls -ltr *dbrm*
-rw-r----- 1 oracle asmadmin  100447 01-04 13:58 nticket1_dbrm_6866.trm
-rw-r----- 1 oracle asmadmin 1586637 01-04 13:58 nticket1_dbrm_6866.trc

[oracle@nticket1 trace]$ vi nticket1_dbrm_6866.trc
oracle    6866     1  0  2012 ?        00:01:01 ora_dbrm_nticket1
oracle   28938 21966  0 15:44 pts/4    00:00:00 grep dbrm

*** 2013-01-04 01:57:26.942
   1       12571           1       0.20      0.00
     26838         57         37           3          29           0           0

*** 2013-01-04 01:58:26.961
   1       11716           1       0.15      0.00
     31681         58         31           8          11           0           0

*** 2013-01-04 01:59:26.981
   1       11607           0       0.25      0.00
     38195         66         45           8           4           0           0

*** 2013-01-04 02:00:00.014
kgsksysstop: blocking mode (2) timestamp: 1357236000014692
kgsksysstop: successful
kgsksysresume: successful
Heavy swapping observed in last 5 mins:    [pct of total memory][bytes]

*** 2013-01-04 13:48:00.385
total swpin [ 2.17%][1078326K], total swpout [ 1.30%][648770K]
vm stats captured every 30 secs for last 5 mins:
swpin:                 swpout:
[ 0.19%][     96052K]  [ 0.06%][     34155K]
[ 0.23%][    117853K]  [ 0.10%][     54080K]
[ 0.23%][    117220K]  [ 0.20%][    101830K]
[ 0.23%][    116923K]  [ 0.11%][     58593K]
[ 0.19%][     97252K]  [ 0.14%][     73923K]
[ 0.18%][     90623K]  [ 0.11%][     56906K]
[ 0.18%][     93366K]  [ 0.10%][     52694K]
[ 0.21%][    108634K]  [ 0.19%][     94206K]
[ 0.24%][    122591K]  [ 0.11%][     58344K]
[ 0.23%][    117812K]  [ 0.12%][     64039K]
Heavy swapping observed in last 5 mins:    [pct of total memory][bytes]

*** 2013-01-04 13:58:00.731
total swpin [ 2.81%][1393938K], total swpout [ 1.10%][547194K]
vm stats captured every 30 secs for last 5 mins:
swpin:                 swpout:
[ 0.17%][     86056K]  [ 0.16%][     83332K]
[ 0.21%][    106479K]  [ 0.08%][     40506K]
[ 0.30%][    148646K]  [ 0.13%][     65545K]
[ 0.30%][    152775K]  [ 0.11%][     58557K]
[ 0.31%][    154886K]  [ 0.09%][     47802K]
[ 0.29%][    146032K]  [ 0.07%][     37422K]
[ 0.30%][    152861K]  [ 0.06%][     31298K]
[ 0.29%][    147241K]  [ 0.10%][     49991K]
[ 0.30%][    151591K]  [ 0.10%][     53233K]
[ 0.29%][    147371K]  [ 0.16%][     79508K]

可以看到dbrm收集到了短期内的swapin和swapout数据,这样便于我们诊断由swap造成的性能或者hang问题。

 

解决:
1.适当增加processes参数
alter system set processes=800 scope=spfile;
2.修改应用程序写法,优化程序,减少并发。

 

总结:

解决OS 系统严重swap的一些思路:

1.  诊断是否存在内存泄露的进程,解决内存泄露
2.  调优SGA/PGA ,减少oracle对内存的占用
3.  利用  echo 3 > /proc/sys/vm/drop_caches 命令可以暂时释放一些cache的内存
4. 调整系统VM内存管理参数, 例如Linux上sysctl.conf中的以下几个参数

vm.min_free_kbytes   :Raising the value in /proc/sys/vm/min_free_kbytes will cause the system to start reclaiming memory at an earlier time than it would have before.

vm.vfs_cache_pressure :        At the default value of vfs_cache_pressure = 100 the kernel will attempt to reclaim dentries and inodes at a “fair” rate with respect to pagecache and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes.

vm.swappiness  : default 60 ;Apparently /proc/sys/vm/swappiness on Red Hat Linux allows the admin to tune how aggressively the kernel swaps out processes’ memory. Decreasing the  swappiness setting may result in improved Directory performance as the kernel
holds more of the server process in memory longer before swapping it out.

设置以下值,减少out of memory的可能性:

# Oracle-Validated setting for vm.min_free_kbytes is 51200 to avoid OOM killer
vm.min_free_kbytes = 51200
#vm.swappiness = 40
vm.vfs_cache_pressure = 200


参考:
http://www.oracledatabase12g.com/archives/11gr2-dbrm-collect-swap-info.html

 

你可能感兴趣的:(11gR2新特性:Heavy swapping observed on system in last 5 mins)