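Operating system: CentOS 5.8 64-bit
Database version: 11.2.0.1
Problem description:
I recently increased the database servers' memory to 32G and raised MEMORY_MAX_TARGET to 28G and MEMORY_TARGET to 24G. After a short period of running there were no major problems, except that nagios reported high swap usage on both the primary and the standby, which had never happened before the memory was added. Before the change, the servers had 16G of memory, and MEMORY_MAX_TARGET and MEMORY_TARGET were both 11G.
Swap usage before the change:
- 1: Average swap usage on the primary was 7.11%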
- [root@db1 ~]# sar -f /var/log/sa/sa11 -r
- 11:41:01 PM kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
- 11:42:01 PM 93760 16332056 99.43 124792 13621548 7610880 582260 7.11 280
- 11:43:01 PM 87104 16338712 99.47 125084 13622556 7610880 582260 7.11 280
- 11:44:01 PM 79484 16346332 99.52 125384 13624140 7610880 582260 7.11 280
- 11:45:01 PM 77632 16348184 99.53 125684 13625580 7610880 582260 7.11 280
- 11:46:01 PM 62852 16362964 99.62 125936 13634280 7610880 582260 7.11 280
- 11:47:01 PM 58452 16367364 99.64 126484 13635080 7610880 582260 7.11 280
- 11:48:01 PM 61328 16364488 99.63 126856 13639772 7610880 582260 7.11 280
- 11:49:01 PM 64116 16361700 99.61 127040 13630580 7610880 582260 7.11 280
- 11:50:01 PM 65820 16359996 99.60 127268 13625496 7610880 582260 7.11 280
- 11:51:01 PM 59888 16365928 99.64 127148 13588072 7610880 582260 7.11 280
- 11:52:01 PM 84152 16341664 99.49 127412 13589552 7610880 582260 7.11 280
- 11:53:01 PM 102492 16323324 99.38 127716 13590332 7610880 582260 7.11 280
- 11:54:01 PM 96444 16329372 99.41 128076 13602516 7610880 582260 7.11 280
- 11:55:01 PM 88752 16337064 99.46 128408 13607108 7610880 582260 7.11 280
- 11:56:01 PM 78936 16346880 99.52 128708 13608816 7610880 582260 7.11 280
- 11:57:01 PM 57192 16368624 99.65 128936 13609668 7610880 582260 7.11 280
- 11:58:01 PM 64308 16361508 99.61 129192 13611012 7610880 582260 7.11 280
- 11:59:01 PM 62620 16363196 99.62 129476 13612704 7610880 582260 7.11 280
- Average: 94185 16331631 99.43 125388 13559392 7610581 582559 7.11 289
- 2: Swap usage on the standby was 12.27% in the window shown (12.45% average for the day's file)
- 11:41:01 PM kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
- 11:42:01 PM 97084 16328732 99.41 283368 14770184 7187548 1005592 12.27 23924
- 11:43:01 PM 95968 16329848 99.42 283396 14770264 7187548 1005592 12.27 23924
- 11:44:01 PM 96760 16329056 99.41 283440 14770392 7187548 1005592 12.27 23924
- 11:45:01 PM 94872 16330944 99.42 283480 14770532 7187548 1005592 12.27 23924
- 11:46:01 PM 95392 16330424 99.42 283520 14770680 7187548 1005592 12.27 23924
- 11:47:01 PM 90196 16335620 99.45 283568 14776592 7187548 1005592 12.27 23924
- 11:48:01 PM 91524 16334292 99.44 283596 14778728 7187548 1005592 12.27 23924
- 11:49:01 PM 91256 16334560 99.44 283648 14778792 7187552 1005588 12.27 23920
- 11:50:01 PM 92560 16333256 99.44 283712 14778824 7187552 1005588 12.27 23920
- 11:51:01 PM 90748 16335068 99.45 283772 14778912 7187552 1005588 12.27 23920
- 11:52:01 PM 91484 16334332 99.44 283800 14779068 7187552 1005588 12.27 23920
- 11:53:01 PM 89964 16335852 99.45 283844 14779136 7187572 1005568 12.27 23944
- 11:54:01 PM 80092 16345724 99.51 283980 14790212 7187572 1005568 12.27 23944
- 11:55:01 PM 72728 16353088 99.56 284052 14792660 7187572 1005568 12.27 23944
- 11:56:01 PM 72300 16353516 99.56 284092 14792844 7187612 1005528 12.27 23936
- 11:57:01 PM 72264 16353552 99.56 284152 14792908 7187612 1005528 12.27 23936
- 11:58:01 PM 73680 16352136 99.55 284248 14793040 7187612 1005528 12.27 23936
- 11:59:01 PM 73836 16351980 99.55 284300 14793172 7187612 1005528 12.27 23936
- Average: 71240 16354576 99.57 265982 14818731 7172819 1020321 12.45 23450
Swap usage after the change:
- 1: Average swap usage on the primary was 34.18%
- [root@db1 ~]# sar -f /var/log/sa/sa13 -r
- 11:41:01 PM kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
- 11:42:01 PM 189612 32759404 99.42 423500 29933996 5353540 2839600 34.66 633656
- 11:43:01 PM 166404 32782612 99.49 423528 29934144 5353540 2839600 34.66 633656
- 11:44:01 PM 167176 32781840 99.49 423560 29934448 5353544 2839596 34.66 633652
- 11:45:01 PM 105964 32843052 99.68 423100 30125676 5353548 2839592 34.66 513156
- 11:46:01 PM 91348 32857668 99.72 423116 30108604 5353548 2839592 34.66 497332
- 11:47:01 PM 101608 32847408 99.69 423136 30104324 5353552 2839588 34.66 492240
- 11:48:01 PM 119196 32829820 99.64 423180 30104556 5353552 2839588 34.66 492240
- 11:49:01 PM 131556 32817460 99.60 423220 30104752 5353556 2839584 34.66 492236
- 11:50:01 PM 128396 32820620 99.61 423256 30105528 5353560 2839580 34.66 492232
- 11:51:01 PM 134268 32814748 99.59 423292 30107804 5353568 2839572 34.66 492224
- 11:52:01 PM 137028 32811988 99.58 423316 30107968 5353572 2839568 34.66 492220
- 11:53:01 PM 131340 32817676 99.60 423348 30108064 5353572 2839568 34.66 492220
- 11:54:01 PM 135292 32813724 99.59 423452 30119040 5353572 2839568 34.66 492220
- 11:55:01 PM 125736 32823280 99.62 423488 30119164 5353576 2839564 34.66 492216
- 11:56:01 PM 120496 32828520 99.63 423528 30119412 5353580 2839560 34.66 492212
- 11:57:01 PM 125356 32823660 99.62 423568 30119588 5353580 2839560 34.66 492212
- 11:58:01 PM 102808 32846208 99.69 423584 30119912 5353580 2839560 34.66 492212
- 11:59:01 PM 104964 32844052 99.68 423636 30115728 5353584 2839556 34.66 487984
- Average: 170975 32778041 99.48 377625 29129299 5392719 2800421 34.18 1536258
- 2: Swap usage on the standby even reached 100%
- 12:58:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
- 12:59:01 AM 133468 32815548 99.59 382232 30133608 6997136 1196004 14.60 1196004
- 01:00:01 AM 133564 32815452 99.59 382244 30133784 6997136 1196004 14.60 1196004
- 01:01:01 AM 92316 32856700 99.72 358832 24656912 1520804 6672336 81.44 6672332
- 01:02:01 AM 96292 32852724 99.71 313508 25143044 8 8193132 100.00 6223732
- 01:03:01 AM 100724 32848292 99.69 229156 28053604 0 8193140 100.00 3377964
- 01:04:01 AM 94672 32854344 99.71 141536 30348384 84 8193056 100.00 1189700
- 01:05:01 AM 99560 32849456 99.70 118908 31574272 0 8193140 100.00 2012
- 01:06:01 AM 91352 32857664 99.72 96656 31646268 0 8193140 100.00 1644
- 01:07:01 AM 96028 32852988 99.71 90408 31694700 0 8193140 100.00 552
- 01:08:01 AM 93512 32855504 99.72 74632 31747136 0 8193140 100.00 420
- 01:09:01 AM 93272 32855744 99.72 72944 31787152 0 8193140 100.00 452
- 01:10:01 AM 92996 32856020 99.72 71336 31840408 0 8193140 100.00 424
- 01:11:01 AM 96912 32852104 99.71 70420 31867152 0 8193140 100.00 356
- 01:12:01 AM 92136 32856880 99.72 72880 31890820 0 8193140 100.00 352
- 01:13:01 AM 94672 32854344 99.71 66760 31917328 0 8193140 100.00 268
- 01:14:01 AM 96804 32852212 99.71 64648 31940808 0 8193140 100.00 208
- 01:15:01 AM 92628 32856388 99.72 56732 31954208 0 8193140 100.00 124
- 01:16:01 AM 93524 32855492 99.72 55568 31962064 0 8193140 100.00 244
- 01:17:01 AM 96796 32852220 99.71 56176 31957600 0 8193140 100.00 168
- 01:18:02 AM 95520 32853496 99.71 51832 31968804 0 8193140 100.00 204
- 01:19:01 AM 91320 32857696 99.72 51224 31936896 0 8193140 100.00 148
- 01:20:01 AM 93032 32855984 99.72 51716 31962408 0 8193140 100.00 184
- 01:21:01 AM 97792 32851224 99.70 51564 31970344 0 8193140 100.00 216
- 01:22:01 AM 93368 32855648 99.72 50852 31952144 0 8193140 100.00 220
- 01:23:01 AM 90496 32858520 99.73 51452 31888096 0 8193140 100.00 196
- 01:24:01 AM 139524 32809492 99.58 52468 31860700 44 8193096 100.00 844
- 01:25:01 AM 97008 32852008 99.71 52760 31787568 0 8193140 100.00 1024
- 01:26:01 AM 96100 32852916 99.71 52668 31734260 8 8193132 100.00 1092
- 01:27:01 AM 91124 32857892 99.72 52796 31699052 0 8193140 100.00 856
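Problem analysis:
First, a word on what swap is for. At the operating-system level, swap comes into play once physical memory is used up: disk space (the swap partition) is presented as additional virtual memory. In other words, swap should not be touched until memory has run out. But is that really the case? Let's look at the nagios memory monitoring and the memory-related sections of the Oracle AWR report.
The nagios memory graphs and the AWR report make it clear that actual memory usage only reached about 45%, so why was swap being used at all? Stranger still, after swap usage hit 100%, Oracle kept running normally and no alerts were received. If memory had truly been exhausted, connecting to the database should have produced an error like the one shown in the figure below.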
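One way to probe the gap between sar's roughly 99% %memused and the roughly 45% seen in nagios and AWR is to recompute the ratios from the raw sar counters. The sketch below is not part of the original troubleshooting; `sar_mem_summary` is a hypothetical helper name. It reads `sar -r` text on standard input, finds the columns from the header row, and prints swap usage plus the share of RAM held by buffers and page cache for each sample. It assumes the older sysstat layout shown above, where the swap counters appear in the `-r` report.

```shell
#!/bin/sh
# Hypothetical helper: summarize `sar -r` output read from stdin.
sar_mem_summary() {
    awk '
        # Header row: remember which column holds each counter.
        /kbmemfree/ {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Data rows only: a header must have been seen, and the
        # "Average:" row is skipped because its columns are shifted.
        ("kbmemfree" in col) && ("kbswpused" in col) &&
        $1 != "Average:" && NF >= col["kbswpused"] {
            # Timestamp is everything before the first counter column.
            ts = $1
            for (i = 2; i < col["kbmemfree"]; i++) ts = ts " " $i
            mem_total  = $(col["kbmemfree"]) + $(col["kbmemused"])
            swap_total = $(col["kbswpfree"]) + $(col["kbswpused"])
            if (mem_total == 0 || swap_total == 0) next
            cache = $(col["kbbuffers"]) + $(col["kbcached"])
            printf "%s swap_used=%.2f%% cache_share=%.2f%%\n", ts,
                   100 * $(col["kbswpused"]) / swap_total,
                   100 * cache / mem_total
        }
    '
}
```

It could be used as `sar -f /var/log/sa/sa13 -r | sar_mem_summary`. For the first primary sample before the change, 582260 / (7610880 + 582260) comes to about 7.11%, which matches sar's own %swpused column. A large cache share does not by itself mean that memory is free: as far as I know, when MEMORY_TARGET is used on Linux the SGA is backed by files under /dev/shm, and those pages are normally counted as cached.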
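Problem handling:
As a first, simple fix, I ran the commands below on both the primary and the standby to release the swap space. The problem persisted, though: after a while the alert came back.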
[root@db1 ~]# swapoff -a
[root@db1 ~]# swapon -a
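Because `swapoff -a` has to pull every swapped-out page back into RAM, it may be worth checking first that there is room for them. The following is a rough sketch under stated assumptions, not something from the original procedure: `swap_fits_in_ram` is a hypothetical helper, and it treats MemFree + Buffers + Cached from /proc/meminfo as available headroom, which is optimistic when part of the cache is shared memory that cannot be dropped.

```shell
#!/bin/sh
# Hypothetical pre-check before `swapoff -a`.
# Succeeds (exit 0) if the swap currently in use is smaller than the
# rough headroom MemFree + Buffers + Cached; fails (exit 1) otherwise.
swap_fits_in_ram() {
    chk_meminfo="${1:-/proc/meminfo}"   # optional path, mainly for testing
    awk '
        # Lines look like "MemFree:   93760 kB"; strip the colon from the key.
        { gsub(":", "", $1); kb[$1] = $2 }
        END {
            swap_used = kb["SwapTotal"] - kb["SwapFree"]
            headroom  = kb["MemFree"] + kb["Buffers"] + kb["Cached"]
            printf "swap_used_kb=%d headroom_kb=%d\n", swap_used, headroom
            exit (swap_used < headroom ? 0 : 1)
        }
    ' "$chk_meminfo"
}
```

A cautious invocation would then be `swap_fits_in_ram && swapoff -a && swapon -a`, run as root.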
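So, following the articles below, I tuned the kernel parameters: I changed swappiness (vm.swappiness) from its default of 60 to 0, which tells the operating system to avoid using swap as far as possible.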
http://www.linuxvox.com/2009/10/what-is-the-linux-kernel-parameter-vm-swappiness/
http://blog.yannickjaquier.com/linux/linux-hugepages-and-virtual-memory-vm-tuning.html
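The post does not show the exact commands used for the swappiness change, so here is a minimal sketch of one common way to do it. The function names `apply_swappiness` and `persist_swappiness` are hypothetical; `sysctl -w` and the `vm.swappiness` key are standard. The current value can be read with `cat /proc/sys/vm/swappiness`.

```shell
#!/bin/sh
# Apply the value to the running kernel (needs root; lost on reboot).
apply_swappiness() {
    sysctl -w "vm.swappiness=$1"
}

# Persist the value in a sysctl configuration file so it survives reboots.
# The second argument is optional and defaults to /etc/sysctl.conf.
persist_swappiness() {
    swp_value="$1"
    swp_conf="${2:-/etc/sysctl.conf}"
    swp_tmp="$(mktemp)" || return 1
    # Keep every line except an existing vm.swappiness entry,
    # then append the new setting.
    grep -v '^[[:space:]]*vm\.swappiness' "$swp_conf" > "$swp_tmp" || true
    echo "vm.swappiness = ${swp_value}" >> "$swp_tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$swp_tmp" > "$swp_conf" && rm -f "$swp_tmp"
}
```

As root on each of the primary and the standby, this would be run as `apply_swappiness 0 && persist_swappiness 0`. The functions are only defined here, not called, so that nothing changes until they are invoked deliberately.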
Follow-up:
After the change I monitored the servers for a while: swap usage stayed low and showed no sharp growth.