Troubleshooting and Fixing Connection Timeouts When Zabbix Monitors Tomcat

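    A recent project required using Zabbix JMX monitoring to collect performance data from Tomcat. Several faults came up along the way, and I am recording them here to share.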

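Environment:

zabbix server:  CentOS 7.2  IP: 10.201.60.11  zabbix_java_gateway  zabbix 3.2

Tomcat server:  CentOS 7.2  IP: 10.201.30.15  tomcat 8  jdk 8

Hardware firewall: Huawei USG6306

The firewall rule between 10.201.60.11 and 10.201.30.15 allows the zabbix server to reach port 10050 on the Tomcat server.

Setting up the zabbix server and zabbix_java_gateway is not covered in detail here. The relevant configuration is as follows: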

The default zabbix_java_gateway.conf can be used as is.

Add the following to zabbix_server.conf:
JavaGateway=10.201.60.11
StartJavaPollers=5
JavaGatewayPort=10052
Then start the related services.
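
Before starting the services, it can help to confirm that the three parameters above are really present and uncommented in the server configuration. The following is a minimal sketch, not part of the original setup: the function name `check_gateway_conf` is hypothetical, and the path `/etc/zabbix/zabbix_server.conf` in the usage comment is an assumption about where the file lives.

```shell
# check_gateway_conf FILE: print the Java gateway settings found in FILE
# and return non-zero if any of the three parameters is missing.
check_gateway_conf() {
    conf=$1
    missing=0
    for key in JavaGateway JavaGatewayPort StartJavaPollers; do
        # Match only uncommented "key=value" lines; keep the last one.
        line=$(grep -E "^${key}=" "$conf" | tail -n 1) || true
        if [ -n "$line" ]; then
            echo "$line"
        else
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (path is an assumption):
#   check_gateway_conf /etc/zabbix/zabbix_server.conf
```

A non-zero return value means at least one of the three lines still has to be added.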

Modify the Tomcat configuration on the Tomcat host

[admin@bund-www-p-030015 bin]$ pwd
/usr/install/tomcat8/bin
[admin@bund-www-p-030015 bin]$ ls
bootstrap.jar       commons-daemon-native.tar.gz  digest.sh         startup.bat           tool-wrapper.sh
catalina.bat        configtest.bat                setclasspath.bat  startup.sh            version.bat
catalina.sh         configtest.sh                 setclasspath.sh   tomcat-juli.jar       version.sh
catalina-tasks.xml  daemon.sh                     shutdown.bat      tomcat-native.tar.gz
commons-daemon.jar  digest.bat                    shutdown.sh       tool-wrapper.bat
[admin@bund-www-p-030015 bin]$
Add the following settings to catalina.sh, normally near the top of the file.
 
CATALINA_OPTS="
-Dcom.sun.management.jmxremote
-Djavax.management.builder.initial=
-Djava.rmi.server.hostname=10.201.30.15
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.port=12345"
Restart Tomcat and test locally with cmdline-jmxclient-0.10.3. The data is returned normally:
[admin@bund-www-p-030015 bin]$ java -jar /tmp/cmdline-jmxclient-0.10.3.jar - 10.201.30.15:12345 java.lang:type=Memory NonHeapMemoryUsage
08/25/2017 13:03:26 +0800 org.archive.jmx.Client NonHeapMemoryUsage:
committed: 82509824
init: 2555904
max: -1
used: 80213592

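Running the same test from the zabbix server produced the error below. The firewall already had a policy allowing port 12345, yet the connection still timed out. A telnet test confirmed that port 12345 itself was reachable, and the Zabbix UI also reported a connection timeout. After some research I learned that when a JVM is monitored over JMX, it opens two more randomly chosen ports besides 12345 to provide the related services. I therefore suspected that the firewall policy was too restrictive, opened all ports, and the connection then succeeded.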

[admin@10-201-60-11 zabbix]$ java -jar /tmp/cmdline-jmxclient-0.10.3.jar - 10.201.30.15:12345 java.lang:type=Memory NonHeapMemoryUsage
Exception in thread "main" java.rmi.ConnectException: Connection refused to host: 10.201.30.15; nested exception is:
     java.net.ConnectException: Connection timed out (Connection timed out)
     at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
     at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
     at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:130)
     at javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source)
     at javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2430)
     at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:308)
     at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
     at org.archive.jmx.Client.execute(Client.java:225)
     at org.archive.jmx.Client.main(Client.java:154)
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
     at java.net.PlainSocketImpl.socketConnect(Native Method)
     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
     at java.net.Socket.connect(Socket.java:589)
     at java.net.Socket.connect(Socket.java:538)
     at java.net.Socket.<init>(Socket.java:434)
     at java.net.Socket.<init>(Socket.java:211)
     at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
     at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
     at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
     ... 9 more

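After the firewall change, access worked, so the firewall ports really were the problem. That left an awkward question: would the firewall have to allow the whole range 12345-65535?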

[admin@10-201-60-11 zabbix]$ java -jar /tmp/cmdline-jmxclient-0.10.3.jar - 10.201.30.15:12345 java.lang:type=Memory NonHeapMemoryUsage
08/25/2017 13:09:47 +0800 org.archive.jmx.Client NonHeapMemoryUsage:
committed: 82509824
init: 2555904
max: -1
used: 80266536
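
The symptom described here, where telnet to 12345 succeeds while the JMX client still times out, can be narrowed down by probing individual ports from the zabbix server. The sketch below is an illustration only: `check_port` is a hypothetical helper, it relies on bash's `/dev/tcp` feature and the coreutils `timeout` command, and it can only test ports you already know about. The randomly chosen port itself would have to be read from the listening sockets on the Tomcat host.

```shell
# check_port HOST PORT: report whether a TCP connection can be opened
# within 3 seconds. A firewall that silently drops packets shows up as
# "closed or filtered" once the timeout expires.
check_port() {
    host=$1
    port=$2
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
        echo "$host:$port open"
    else
        echo "$host:$port closed or filtered"
    fi
}

# Usage from the zabbix server (addresses taken from the article):
#   check_port 10.201.30.15 12345
```

If the registry port reports open but a second port used by the same java process does not, the firewall policy rather than Tomcat is the likely cause.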

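A policy that wide would be far too weak from a security standpoint, so I kept searching for a relevant setting and found that the ports can be fixed explicitly. Add the following to Tomcat's server.xml.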

<Listener className="org.apache.catalina.mbeans.JmxRemoteLifecycleListener" rmiRegistryPortPlatform="12345" rmiServerPortPlatform="12346" />
<Server port="8005" shutdown="SHUTDOWN">
  <Listener className="org.apache.catalina.startup.VersionLoggerListener" />
  <Listener className="org.apache.catalina.core.AprLifecycleListener" SSLEngine="on" />
  <Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener" />
  <Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener" />
  <Listener className="org.apache.catalina.core.ThreadLocalLeakPreventionListener" />
  <Listener className="org.apache.catalina.mbeans.JmxRemoteLifecycleListener" rmiRegistryPortPlatform="12345" rmiServerPortPlatform="12346" />

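After restarting Tomcat, an error along the lines of "port 12345 already bound" appeared. I then remembered that port 12345 had already been set in catalina.sh and was now configured a second time in server.xml. I commented out the 12345 port setting in catalina.sh, started Tomcat again, and checked the listening ports: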

[admin@bund-www-p-030015 conf]$ netstat -nltup
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN      -
tcp6       0      0 :::8080                 :::*                    LISTEN      4926/java
tcp6       0      0 :::26613                :::*                    LISTEN      4926/java
tcp6       0      0 :::22                   :::*                    LISTEN      -
tcp6       0      0 :::12345                :::*                    LISTEN      4926/java
tcp6       0      0 :::12346                :::*                    LISTEN      4926/java
tcp6       0      0 :::10050                :::*                    LISTEN      -
tcp6       0      0 127.0.0.1:8005          :::*                    LISTEN      4926/java
tcp6       0      0 :::8009                 :::*                    LISTEN      4926/java
udp        0      0 0.0.0.0:40318           0.0.0.0:*                           -
udp        0      0 0.0.0.0:2583            0.0.0.0:*                           -
udp        0      0 0.0.0.0:63117           0.0.0.0:*                           -
udp        0      0 0.0.0.0:68              0.0.0.0:*                           -
udp        0      0 10.201.30.15:123        0.0.0.0:*                           -
udp        0      0 127.0.0.1:123           0.0.0.0:*                           -
udp        0      0 0.0.0.0:123             0.0.0.0:*                           -
udp6       0      0 :::37436                :::*                                -
udp6       0      0 fe80::f8db:66ff:fe6:123 :::*                                -
udp6       0      0 ::1:123                 :::*                                -
udp6       0      0 :::123

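   Ports 12345 and 12346 are now listening. I changed the firewall policy to allow only ports 12345-12346 and tested again: the connection works, and the problem is essentially solved.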
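
As an alternative that the author did not use, JDK 8 also accepts the system property `com.sun.management.jmxremote.rmi.port`, which pins the RMI server port from the JVM options alone. The sketch below assumes a hypothetical `bin/setenv.sh` (catalina.sh sources that file if it exists). The variable names `JMX_HOST`, `JMX_REGISTRY_PORT` and `JMX_SERVER_PORT` are introduced here for illustration, and the values are the ones from the article.

```shell
# Hypothetical bin/setenv.sh; catalina.sh sources this file if present.
JMX_HOST=10.201.30.15      # address the Zabbix Java gateway connects to
JMX_REGISTRY_PORT=12345    # RMI registry port
JMX_SERVER_PORT=12346      # RMI server port, fixed instead of random

# Append the JMX options to whatever CATALINA_OPTS already holds.
CATALINA_OPTS="${CATALINA_OPTS:-} \
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=$JMX_REGISTRY_PORT \
-Dcom.sun.management.jmxremote.rmi.port=$JMX_SERVER_PORT \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false \
-Djava.rmi.server.hostname=$JMX_HOST"
export CATALINA_OPTS
```

With this approach the JmxRemoteLifecycleListener entry should not be configured for the same ports at the same time, otherwise the "port already bound" error described earlier would be expected to reappear. The firewall policy stays the same: only 12345-12346 need to be open.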



This article is reposted from tianya1993's 51CTO blog. Original link: http://blog.51cto.com/dreamlinux/1959275. To repost it, please contact the original author.
