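Problem 1: Logging in to the database with plsql is slow, taking about 30 s
The troubleshooting went roughly as follows:
1. Logging in locally with sqlplus "/ as sysdba" is fast, whereas connecting through the listener takes about 30 s, so the problem lies in connections that go through the listener.
2. Capturing packets on port 1521 with tcpdump showed that the port received very few database connection requests, yet the reply packets only came back after about 30 s, so the time was being spent inside the listener itself.
3. Tracing the listener process showed that the time was spent in four sleep waits; before each sleep the process opened /etc/resolv.conf in order to resolve an address: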
26676 0.000094 open("/etc/resolv.conf", O_RDONLY) = 21 <0.000037>
26676 0.000085 fstat(21, {st_mode=S_IFREG|0644, st_size=1267, ...}) = 0 <0.000013>
26676 0.000063 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ba737555000 <0.000019>
26676 0.000058 read(21, "### BEGIN INFO\n#\n# Modified_by: "..., 4096) = 1267 <0.000039>
26676 0.000108 read(21, "", 4096) = 0 <0.000016>
26676 0.000046 close(21) = 0 <0.000017>
26676 0.000045 munmap(0x2ba737555000, 4096) = 0 <0.000024>
26676 0.000080 socket(PF_FILE, SOCK_STREAM, 0) = 21 <0.000032>
26676 0.000091 fcntl(21, F_GETFL) = 0x2 (flags O_RDWR) <0.000013>
26676 0.000041 fcntl(21, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000041>
26676 0.000080 connect(21, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0 <0.000020>
26676 0.000080 poll([{fd=21, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 5000) = 1 <0.000019>
26676 0.000061 sendto(21, "\2\0\0\0\r\0\0\0\6\0\0\0hosts\0\0\0", 20, MSG_NOSIGNAL, NULL, 0) = 20 <0.000016>
26676 0.000048 poll([{fd=21, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 <0.000014>
26676 0.000044 recvmsg(21, {msg_name(0)=NULL, msg_iov(1)=[{"hosts\0", 6}], msg_controllen=24, {cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {22}}, msg_flags=0}, 0) = 6 <0.000018>
26676 0.000069 fstat(22, {st_mode=S_IFREG|0600, st_size=217016, ...}) = 0 <0.000017>
26676 0.000061 pread(22, "\1\0\0\0h\0\0\0\2664\0\0\1\0\0\0\2623\272R\0\0\0\0\323"..., 104, 0) = 104 <0.000015>
26676 0.000061 mmap(NULL, 217016, PROT_READ, MAP_SHARED, 22, 0) = 0x2ba737555000 <0.000020>
26676 0.000074 close(22) = 0 <0.000016>
26676 0.000073 close(21) = 0 <0.000021>
26676 0.000076 socket(PF_FILE, SOCK_STREAM, 0) = 21 <0.000027>
26676 0.000074 fcntl(21, F_GETFL) = 0x2 (flags O_RDWR) <0.000039>
26676 0.000075 fcntl(21, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000017>
26676 0.000093 connect(21, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0 <0.000025>
26676 0.000131 poll([{fd=21, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 5000) = 1 <0.000026>
26676 0.000075 writev(21, [{"\2\0\0\0\5\0\0\0\20\0\0\0", 12}, {"YZPC-ZWRZDBMAIN\0", 16}], 2) = 28 <0.000024>
26676 0.000059 poll(
25001 4.385324 <... nanosleep resumed>{5, 0}) = 0 <5.001651>
4. Commenting out the nameserver entries in /etc/resolv.conf, which disables DNS resolution, solved the problem.
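
The trace in step 3 is long, and the interesting lines are the few calls that took seconds rather than microseconds. The sketch below filters such a trace by the duration that strace prints in angle brackets at the end of each line. It assumes the trace was captured with something like `strace -f -r -T -p <listener pid> -o listener.trace`; the function name `slow_calls` and the file name `listener.trace` are made up for illustration and do not come from the original notes.

```shell
#!/bin/sh
# Hypothetical helper: print every strace line whose trailing <seconds>
# duration is at least the given threshold (default: 1 second).
# Usage: slow_calls [threshold_seconds] < trace_file
slow_calls() {
    awk -v min="${1:-1}" '
        # Only lines that end in a duration such as <5.001651> are considered;
        # unfinished calls (for example a bare "poll(") are skipped.
        match($0, /<[0-9]+\.[0-9]+>$/) {
            # Strip the angle brackets to get the duration in seconds.
            d = substr($0, RSTART + 1, RLENGTH - 2)
            if (d + 0 >= min + 0) print
        }'
}
```

Running `slow_calls 1 < listener.trace` on the output above would leave only the `nanosleep resumed` line with its 5.001651 s duration, which points straight at the sleep that follows the resolver activity.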
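Problem 2: Logging in to a Linux server over ssh is slow, taking about 1 minute
1. The usual suspect, /etc/resolv.conf, looked fine, so address resolution was not the cause.
2. The who command showed that the login records for the current users were wrong: the timestamps were not up to date.
3. Searching online suggested that a login-related file might have been locked abnormally.
4. List the processes that hold /var/run/utmp open or locked: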
YZPC-ZWRZAPP1:/var/run# lsof|grep /var/run/utmp
gnome-pty 447 root 5uR REG 8,2 9216 68 /var/run/utmp
gnome-pty 3979 root 5u REG 8,2 9216 68 /var/run/utmp
gnome-pty 10879 root 5u REG 8,2 9216 68 /var/run/utmp
gnome-pty 16207 root 5u REG 8,2 9216 68 /var/run/utmp
gnome-pty 16435 root 5uR REG 8,2 9216 68 /var/run/utmp
gnome-pty 28974 root 5u REG 8,2 9216 68 /var/run/utmp
gnome-pty 30390 root 5uR REG 8,2 9216 68 /var/run/utmp
5. Killing the following processes solved the problem:
YZPC-ZWRZAPP1:/var/run# ps -ef|grep gnome-pty
root 447 1 0 May20 ? 00:00:00 gnome-pty-helper
root 3979 1 0 May23 ? 00:00:00 gnome-pty-helper
root 4532 4050 0 17:04 pts/3 00:00:00 grep gnome-pty
root 10879 1 0 May29 ? 00:00:00 gnome-pty-helper
root 16207 1 0 May29 ? 00:00:00 gnome-pty-helper
root 16435 1 0 May29 ? 00:00:00 gnome-pty-helper
root 28974 1 0 10:27 ? 00:00:00 gnome-pty-helper
root 30390 1 0 11:06 ? 00:00:00 gnome-pty-helper
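
In the lsof output from step 4, the FD column holds the descriptor number, an access mode character (`u` for read/write) and, optionally, a lock character: `5uR` means descriptor 5, open read/write, with a read lock on the whole file, while `5u` has no lock. The sketch below picks out only the rows that carry a lock character, to separate processes that merely have the file open from those that hold a lock on it. The function name `lock_holders` is hypothetical, and the pattern assumes the default lsof column layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read lsof output on stdin and print "PID COMMAND FD"
# for rows whose FD field is <number><mode><lock character>.
lock_holders() {
    awk '$4 ~ /^[0-9]+[rwu-][rRwWuUxXN]$/ { print $2, $1, $4 }'
}
```

A possible invocation is `lsof /var/run/utmp | lock_holders`. Against the listing above it would report PIDs 447, 16435 and 30390, the three entries marked `5uR`.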
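Summary:
1. Use strace or a similar tracing tool on the slow process to find out where the time goes. In the first case, tracing the listener showed that the delay was in the name resolution step.
2. Collect the symptoms of the slowness and work back from them to the cause. In the second case the clue was the abnormal output of the who command, which did not reflect the current state of the logged-in users.
3. In problem 1 the 30 s delay came from name resolution timing out after 30 s; the one-minute delay in problem 2 was no coincidence either.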
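
Before attaching a tracer, a quick way to test the name-resolution hypothesis from point 3 is to time a single lookup. The sketch below is a minimal illustration, not part of the original procedure: it uses `getent hosts`, which resolves through the system's NSS configuration, and reports the elapsed whole seconds. The function name `time_lookup` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print how many whole seconds one host lookup takes.
# The lookup result and its exit status are deliberately ignored; only the
# elapsed time matters here.
time_lookup() {
    start=$(date +%s)
    getent hosts "$1" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}
```

For example, `time_lookup YZPC-ZWRZDBMAIN` uses the host name that appears in the trace of problem 1. A result of several seconds would suggest a resolver timeout; a result of 0 would point the investigation elsewhere.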