ulimit is a shell built-in on Linux for viewing or limiting the system resources that a user's processes may consume. Its nofile setting (ulimit -n) limits the maximum number of open files. I recently had to monitor resource usage for every user on a system, so I ran a few tests and am recording them here in the hope that they help someone who needs them.
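The conclusion first: nofile is enforced on each process individually, not on a user's total and not per session. The tests: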
###1. Set the nofile limit for user oracle to 7
[root@oracle03 ~]# cat /etc/security/limits.d/99-grid-oracle-limits.conf |grep oracle|grep nofile
oracle soft nofile 7
oracle hard nofile 7
[oracle@oracle03 test]$ ulimit -Hn
7 >>>>> the hard nofile limit for the current user is 7.
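If you want to try the same limit without touching limits.d, one option is to lower nofile inside a subshell, where it only affects that subshell and whatever it starts. This is a minimal sketch: the function name limits_in_subshell is made up for illustration, and it assumes a shell such as bash whose ulimit sets both the soft and the hard value when neither -S nor -H is given.

```shell
#!/bin/bash
# Hypothetical helper: lower nofile to $1 inside a subshell and print the result.
limits_in_subshell() {
    (
        # Without -S or -H, ulimit -n sets the soft and the hard limit together.
        ulimit -n "$1"
        printf 'soft='; ulimit -Sn
        printf 'hard='; ulimit -Hn
    )
}

limits_in_subshell 7   # the calling shell keeps its own limits
```

Because the hard limit is lowered as well, an unprivileged process cannot raise it again, which is why the sketch confines the change to a subshell.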
###2. Prepare a script, "n.sh", that shows the total number of files opened by user oracle and the descriptors held by each process.
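# n.sh: list the open file descriptors of every process owned by oracle
# and print the total. Run as root so that every /proc/<pid>/fd is readable.
cnValue=0
# "ps -o pid" also prints a "PID" header line; that iteration lists nothing,
# and its error is discarded by the 2>/dev/null on the loop.
for p in $(ps -u oracle -o pid); do
echo "pid : $p"
ls -l /proc/$p/fd/ | grep -v "^total "
count=$(ls -l /proc/$p/fd/ | grep -v "^total " | wc -l)
cnValue=$((count + cnValue))
done 2>/dev/null
echo "total open fd: $cnValue"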
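The script counts descriptors by filtering the output of ls -l. As an alternative, here is a rough sketch that counts the symlinks under /proc/<pid>/fd with a glob instead, so nothing has to be parsed. The function name count_fds is my own for illustration, and it assumes a Linux /proc that the caller is allowed to read.

```shell
#!/bin/bash
# Hypothetical helper: print how many descriptors process $1 has open.
count_fds() {
    local n=0 entry
    for entry in /proc/"$1"/fd/*; do
        # Every open descriptor appears as a symlink; an unmatched glob
        # stays a literal string, is not a symlink, and is not counted.
        if [ -L "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

count_fds "$$"   # descriptors held by the current shell
```

For a PID that does not exist, or whose fd directory cannot be read, the function simply prints 0.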
###3. Create 3 files in /home/oracle/test for the test:
[oracle@oracle03 test]$ ls -lt
total 12
-rw-r--r-- 1 oracle oinstall 2 Apr 10 10:37 3.txt
-rw-r--r-- 1 oracle oinstall 2 Apr 10 10:36 2.txt
-rw-r--r-- 1 oracle oinstall 2 Apr 10 10:36 1.txt
Next, use the tail command to simulate opening files and verify the behavior.
[oracle@oracle03 test]$ tail -f *.txt
==> 1.txt <==
1
==> 2.txt <==
2
==> 3.txt <==
3
[root@oracle03 ~]# sh n.sh
pid : PID
pid : 4017
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 0 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 1 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 2 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 6 -> /dev/pts/0
pid : 5935 >>>>> tail holds 7 open descriptors at once (fd 0 to 6)
lrwx------ 1 oracle oinstall 64 Apr 10 11:08 0 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 11:08 1 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 11:08 2 -> /dev/pts/0
lr-x------ 1 oracle oinstall 64 Apr 10 11:08 3 -> /home/oracle/test/1.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:08 4 -> /home/oracle/test/2.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:08 5 -> /home/oracle/test/3.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:08 6 -> anon_inode:inotify
total open fd: 11 ==> user oracle currently has 11 open descriptors in total
[root@oracle03 ~]# ps -ef|grep 5935
oracle 5935 4017 0 11:07 pts/0 00:00:00 tail -f 1.txt 2.txt 3.txt
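>>>> Conclusion: the two oracle processes (4017 and 5935) hold 11 open file descriptors between them, yet neither exceeds the limit of 7 on its own, and tail reports no error. This shows that nofile applies to each process individually.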
[oracle@oracle03 test]$ vi 4.txt >>>>> add a fourth file so that tail needs more than 7 descriptors.
[oracle@oracle03 test]$ tail -f *.txt
==> 1.txt <==
1
==> 2.txt <==
2
==> 3.txt <==
3
==> 4.txt <==
4
tail: inotify cannot be used, reverting to polling: Too many open files
>>>>> Error "Too many open files": tail could not create the anon_inode:inotify descriptor.
[root@oracle03 ~]# sh n.sh
pid : PID
pid : 4017
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 0 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 1 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 2 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 6 -> /dev/pts/0
pid : 6348
lrwx------ 1 oracle oinstall 64 Apr 10 11:15 0 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 11:15 1 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 11:15 2 -> /dev/pts/0
lr-x------ 1 oracle oinstall 64 Apr 10 11:15 3 -> /home/oracle/test/1.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:15 4 -> /home/oracle/test/2.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:15 5 -> /home/oracle/test/3.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:15 6 -> /home/oracle/test/4.txt
total open fd: 11
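>>>> Conclusion: once a process has reached its limit, any attempt to open one more descriptor fails with "Too many open files". The listings also confirm that counting the entries under /proc/<pid>/fd/ is an accurate and reliable way to monitor this.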
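The numbers in the two runs can be checked with a little arithmetic. This is only a sketch of the descriptor layout seen in the listings (stdin, stdout and stderr, one descriptor per followed file, plus one inotify descriptor); the function names are made up for illustration, and other versions of tail may behave differently.

```shell
#!/bin/bash
# Hypothetical helper: descriptors that "tail -f" wants for $1 files,
# following the layout observed in the listings above.
tail_fds_needed() {
    local nfiles=$1
    # 3 standard streams + one per file + 1 inotify descriptor
    echo $((3 + nfiles + 1))
}

# Succeeds when tail following $1 files fits within a nofile limit of $2.
fits_limit() {
    [ "$(tail_fds_needed "$1")" -le "$2" ]
}

tail_fds_needed 3   # three files, as in the first run
tail_fds_needed 4   # four files, as in the second run
```

With a limit of 7, three files fit exactly, while four files need one descriptor more than the limit allows, and the inotify descriptor is the one that cannot be created.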
[oracle@oracle03 test]$ tail -f *.txt & >>>> run in the background
[1] 6640
[oracle@oracle03 test]$ ==> 1.txt <==
1
==> 2.txt <==
2
==> 3.txt <==
3
[oracle@oracle03 test]$ tail -f *.txt & >>>> run a second copy in the background; no error is reported.
[2] 6646
[oracle@oracle03 test]$ ==> 1.txt <==
1
==> 2.txt <==
2
==> 3.txt <==
3
[oracle@oracle03 test]$ jobs
[1]- Running tail -f *.txt &
[2]+ Running tail -f *.txt &
[root@oracle03 ~]# ps -ef|egrep "6640|6646"
oracle 6640 4017 0 11:19 pts/0 00:00:00 tail -f 1.txt 2.txt 3.txt >>>> pts/0, same session
oracle 6646 4017 0 11:19 pts/0 00:00:00 tail -f 1.txt 2.txt 3.txt >>>> pts/0
root 6779 3566 0 11:21 pts/1 00:00:00 grep -E --color=auto 6640|6646
[root@oracle03 ~]# sh n.sh
pid : PID
pid : 4017
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 0 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 1 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 2 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 10:42 6 -> /dev/pts/0
pid : 6640
lrwx------ 1 oracle oinstall 64 Apr 10 11:20 0 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 11:20 1 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 11:20 2 -> /dev/pts/0
lr-x------ 1 oracle oinstall 64 Apr 10 11:20 3 -> /home/oracle/test/1.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:20 4 -> /home/oracle/test/2.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:20 5 -> /home/oracle/test/3.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:20 6 -> anon_inode:inotify
pid : 6646
lrwx------ 1 oracle oinstall 64 Apr 10 11:20 0 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 11:20 1 -> /dev/pts/0
lrwx------ 1 oracle oinstall 64 Apr 10 11:20 2 -> /dev/pts/0
lr-x------ 1 oracle oinstall 64 Apr 10 11:20 3 -> /home/oracle/test/1.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:20 4 -> /home/oracle/test/2.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:20 5 -> /home/oracle/test/3.txt
lr-x------ 1 oracle oinstall 64 Apr 10 11:20 6 -> anon_inode:inotify
total open fd: 18
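>>>> Conclusion: two tail processes started from the same session each hold 7 open descriptors. Neither exceeds the limit, so no error is reported. nofile applies only to each individual process of a user; it is unrelated to the user's total and unrelated to the session.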
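Since the limit belongs to each process, every running process carries its own copy, and Linux exposes it in /proc/<pid>/limits on the "Max open files" line. Here is a small sketch that extracts the soft and hard values for one PID. The function name nofile_limits is hypothetical, and it assumes the usual column layout of that file (name, soft limit, hard limit, units).

```shell
#!/bin/bash
# Hypothetical helper: print "soft hard" nofile limits of process $1.
nofile_limits() {
    # The line looks like: "Max open files  1024  4096  files",
    # so the soft limit is field 4 and the hard limit is field 5.
    awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits" 2>/dev/null
}

nofile_limits "$$"   # limits of the current shell
```

For the current shell the result should agree with ulimit -Sn and ulimit -Hn.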
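Here is my monitoring script. For each user it reports the current number of processes and the largest number of files opened by any single process of that user. The script takes a shortcut: it does not read nofile from /proc/pid/limits, so its limit values are inaccurate when limits were changed while the system was running.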
#!/bin/bash
# Per-user report: process limits, process count, nofile limits, and the
# largest per-process open-file count. Run as root.
userNames=$(awk -F: '{print $1}' /etc/passwd)
for userName in $userNames
do
    # Limits as seen by a fresh bash started for the user through su.
    huValue=$(su "$userName" --shell /bin/bash --command "ulimit -Hu")
    suValue=$(su "$userName" --shell /bin/bash --command "ulimit -Su")
    # wc -l also counts the ps header line, so this is the process count plus one.
    cuValue=$(ps -fu "$userName" | wc -l)
    hnValue=$(su "$userName" --shell /bin/bash --command "ulimit -Hn")
    snValue=$(su "$userName" --shell /bin/bash --command "ulimit -Sn")
    # Track the highest descriptor count among the user's processes.
    cnValue=0
    for p in $(ps -u "$userName" -o pid); do
        count=$(ls -l /proc/$p/fd/ | grep -v "^total " | wc -l)
        if [ "$count" -gt "$cnValue" ]; then
            cnValue=${count}
        fi
    done 2>/dev/null
    echo "$userName|$suValue|$huValue|$cuValue|$snValue|$hnValue|$cnValue"
done
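>>>> Output columns: user name|soft process limit|hard process limit|current process count|soft open-files limit|hard open-files limit|largest number of files opened by any one process. Note that the process count comes from ps piped to wc -l, which also counts the ps header line, so a user with no processes shows 1.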
[root@oracle03 ~]# sh limits_check.sh
root|95697|95697|143|1024|4096|91
bin|4096|95697|1|1024|4096|0
daemon|4096|95697|1|1024|4096|0
adm|4096|95697|1|1024|4096|0
lp|4096|95697|1|1024|4096|0
sync|4096|95697|1|1024|4096|0
shutdown|4096|95697|1|1024|4096|0
halt|4096|95697|1|1024|4096|0
mail|4096|95697|1|1024|4096|0
operator|4096|95697|1|1024|4096|0
games|4096|95697|1|1024|4096|0
ftp|4096|95697|1|1024|4096|0
nobody|4096|95697|1|1024|4096|0
systemd-network|4096|95697|1|1024|4096|0
dbus|4096|95697|2|1024|4096|16
polkitd|4096|95697|2|1024|4096|12
sshd|4096|95697|1|1024|4096|0
postfix|4096|95697|3|1024|4096|12
oracle|16384|65536|79|65536|65536|55
grid|16384|65536|42|65536|65536|170
rpc|4096|95697|2|1024|4096|12
rpcuser|4096|95697|1|1024|4096|0
nfsnobody|4096|95697|1|1024|4096|0
ntp|4096|95697|1|1024|4096|0
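To turn this report into an alert, one possibility is to filter it for users whose busiest process is close to the soft nofile limit. The sketch below reads the pipe-delimited lines on stdin and prints the user names at or above a given percentage. The function name flag_high_fd_usage, the 80 percent threshold and the "demo" line are all made up for illustration; the oracle line is taken from the output above.

```shell
#!/bin/bash
# Hypothetical helper: print users whose largest per-process fd count
# (field 7) is at least $1 percent of the soft nofile limit (field 5).
flag_high_fd_usage() {
    awk -F'|' -v pct="$1" '
        # Skip rows whose soft limit is not a positive number.
        $5 ~ /^[0-9]+$/ && $5 > 0 && $7 * 100 >= $5 * pct { print $1 }
    '
}

# The "demo" line is invented so that the filter has something to flag.
printf '%s\n' \
    'oracle|16384|65536|79|65536|65536|55' \
    'demo|4096|4096|3|10|10|9' | flag_high_fd_usage 80
```

In practice the input would be the output of limits_check.sh piped into the function.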