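Symptom:
In the Zabbix web UI the JMX availability indicator is red, with the message: cannot resolve 192.168.100.107,192.168.100.108......
My first thought was to edit /etc/zabbix/zabbix_server.conf on the server so that JavaGateway is followed by a single IP, 192.168.100.107, and then restart zabbix-server.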
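
The check described above can be sketched as a small shell helper. This is only an illustration: the function names are hypothetical, and it assumes JavaGateway is meant to hold exactly one address, which is what the resolve error on a comma-separated list suggests. If the file contains several uncommented JavaGateway lines, this sketch simply looks at the last one.

```shell
# Print the JavaGateway value from a zabbix_server.conf-style file.
# Commented lines and other keys such as JavaGatewayPort are not matched.
java_gateway_value() {
    sed -n 's/^[[:space:]]*JavaGateway[[:space:]]*=[[:space:]]*//p' "$1" \
        | sed 's/[[:space:]]*$//' \
        | tail -n 1
}

# Succeed only if the value is one address (not empty, no comma, no space).
java_gateway_is_single() {
    value=$(java_gateway_value "$1")
    case "$value" in
        ''|*,*|*' '*) return 1 ;;
        *) return 0 ;;
    esac
}
```

On the Zabbix server it could be run as: java_gateway_is_single /etc/zabbix/zabbix_server.conf || echo "JavaGateway must be a single address"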

But JMX stayed red, and a new error appeared: “(Connection refused): service:jmx:rmi:///jndi/rmi://192.168.100.107:9999/jmxrmi”
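
Before logging in to the Tomcat host, the refused endpoint can be confirmed from the Zabbix server. The sketch below pulls host and port out of the JMX URL in the error and then tries a plain TCP connection. The function names are hypothetical, and port_open assumes that bash (with /dev/tcp support) and the coreutils timeout command are available.

```shell
# Extract "host port" from a JMX service URL of the form
# service:jmx:rmi:///jndi/rmi://HOST:PORT/jmxrmi
jmx_endpoint() {
    rest=${1##*/jndi/rmi://}   # drop everything up to and including "/jndi/rmi://"
    hostport=${rest%%/*}       # keep the part before the next "/"
    printf '%s %s\n' "${hostport%:*}" "${hostport##*:}"
}

# Succeed if a TCP connection to host $1, port $2 opens within 3 seconds.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

For the URL in the error message, jmx_endpoint yields 192.168.100.107 and 9999; if port_open then fails for that pair, the problem is on the Tomcat side rather than in the Zabbix configuration.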

Solution:

Log in to 192.168.100.107:

[root@web1 ~]# /usr/local/tomcat/bin/shutdown.sh 
Using CATALINA_BASE:   /usr/local/tomcat
Using CATALINA_HOME:   /usr/local/tomcat
Using CATALINA_TMPDIR: /usr/local/tomcat/temp
Using JRE_HOME:        /usr/local/jdk1.8/jre
Using CLASSPATH:       /usr/local/tomcat/bin/bootstrap.jar:/usr/local/tomcat/bin/tomcat-juli.jar
Jul 07, 2019 10:48:46 PM org.apache.catalina.startup.Catalina stopServer
SEVERE: Could not contact [localhost:8005] (base port [8005] and offset [0]). Tomcat may not be running.
Jul 07, 2019 10:48:46 PM org.apache.catalina.startup.Catalina stopServer
SEVERE: Error stopping Catalina
java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at java.net.Socket.connect(Socket.java:538)
    at java.net.Socket.<init>(Socket.java:434)
    at java.net.Socket.<init>(Socket.java:211)
    at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:513)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:390)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:480)

[root@web1 ~]# !net
netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:9999            0.0.0.0:*               LISTEN      9629/java           
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1/systemd           
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      6387/nginx: master  
tcp        0      0 0.0.0.0:20048           0.0.0.0:*               LISTEN      15753/rpc.mountd    
tcp        0      0 0.0.0.0:37457           0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:49366           0.0.0.0:*               LISTEN      14929/rpc.statd     
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      2993/sshd           
tcp        0      0 0.0.0.0:44247           0.0.0.0:*               LISTEN      9629/java           
tcp        0      0 0.0.0.0:34332           0.0.0.0:*               LISTEN      9629/java           
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN      10386/zabbix_agentd 
tcp        0      0 0.0.0.0:10052           0.0.0.0:*               LISTEN      10320/java          
tcp6       0      0 :::43533                :::*                    LISTEN      -                   
tcp6       0      0 :::111                  :::*                    LISTEN      15627/rpcbind       
tcp6       0      0 :::20048                :::*                    LISTEN      15753/rpc.mountd    
tcp6       0      0 :::34298                :::*                    LISTEN      14929/rpc.statd     
tcp6       0      0 :::2049                 :::*                    LISTEN      -                   
tcp6       0      0 :::10050                :::*                    LISTEN      10386/zabbix_agentd 

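shutdown.sh could not stop Tomcat at all: nothing was listening on the shutdown port 8005, yet the java process with PID 9629 still held port 9999. The only option left was to kill it.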
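
Picking the PID out of the netstat listing by eye works, but it can also be scripted. A minimal sketch, assuming the column layout of netstat -lntp shown above (the function name is hypothetical):

```shell
# Read `netstat -lntp` output on stdin and print the PID(s) listening on port $1.
pid_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, owner, "/")                      # "9629/java" -> owner[1] = "9629"
            if (owner[1] ~ /^[0-9]+$/) print owner[1]  # skip "-" (no visible owner)
        }' | sort -u
}
```

Usage on the Tomcat host would be: netstat -lntp | pid_on_port 9999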

[root@web1 ~]# kill 9629
[root@web1 ~]# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1/systemd           
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      6387/nginx: master  
tcp        0      0 0.0.0.0:20048           0.0.0.0:*               LISTEN      15753/rpc.mountd    
tcp        0      0 0.0.0.0:37457           0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:49366           0.0.0.0:*               LISTEN      14929/rpc.statd     
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      2993/sshd           
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN      10386/zabbix_agentd 
tcp        0      0 0.0.0.0:10052           0.0.0.0:*               LISTEN      10320/java          
tcp6       0      0 :::43533                :::*                    LISTEN      -                   
tcp6       0      0 :::111                  :::*                    LISTEN      15627/rpcbind       
tcp6       0      0 :::20048                :::*                    LISTEN      15753/rpc.mountd    
tcp6       0      0 :::34298                :::*                    LISTEN      14929/rpc.statd     
tcp6       0      0 :::2049                 :::*                    LISTEN      -                   
tcp6       0      0 :::10050                :::*                    LISTEN      10386/zabbix_agentd 
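
Here a plain kill (SIGTERM) was enough: the second netstat shows that PID 9629 and its ports are gone. If a process ignored SIGTERM, a helper like the following could wait and then escalate. It is a sketch, not part of the original procedure; the function name and the 10-second default are assumptions.

```shell
# Send SIGTERM to PID $1, wait up to $2 seconds (default 10), then send SIGKILL
# if the process still exists.
stop_pid() {
    pid=$1
    limit=${2:-10}
    kill "$pid" 2>/dev/null || return 0    # nothing to do: process already gone
    while [ "$limit" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        limit=$((limit - 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null         # last resort
    fi
    return 0
}
```

For example: stop_pid 9629 15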

Then start Tomcat again; it is best to restart zabbix-java-gateway and zabbix-agent at the same time.

[root@web1 ~]# /usr/local/tomcat/bin/startup.sh 
Using CATALINA_BASE:   /usr/local/tomcat
Using CATALINA_HOME:   /usr/local/tomcat
Using CATALINA_TMPDIR: /usr/local/tomcat/temp
Using JRE_HOME:        /usr/local/jdk1.8/jre
Using CLASSPATH:       /usr/local/tomcat/bin/bootstrap.jar:/usr/local/tomcat/bin/tomcat-juli.jar
Tomcat started.
[root@web1 ~]# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:41125           0.0.0.0:*               LISTEN      10506/java          
tcp        0      0 0.0.0.0:8009            0.0.0.0:*               LISTEN      10506/java          
tcp        0      0 0.0.0.0:9999            0.0.0.0:*               LISTEN      10506/java          
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1/systemd           
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      10506/java          
tcp        0      0 0.0.0.0:40496           0.0.0.0:*               LISTEN      10506/java          
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      6387/nginx: master  
tcp        0      0 0.0.0.0:20048           0.0.0.0:*               LISTEN      15753/rpc.mountd    
tcp        0      0 0.0.0.0:37457           0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:49366           0.0.0.0:*               LISTEN      14929/rpc.statd     
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      2993/sshd           
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN      10386/zabbix_agentd 
tcp        0      0 0.0.0.0:10052           0.0.0.0:*               LISTEN      10320/java          
tcp6       0      0 :::43533                :::*                    LISTEN      -                   
tcp6       0      0 :::111                  :::*                    LISTEN      15627/rpcbind       
tcp6       0      0 :::20048                :::*                    LISTEN      15753/rpc.mountd    
tcp6       0      0 :::34298                :::*                    LISTEN      14929/rpc.statd     
tcp6       0      0 :::2049                 :::*                    LISTEN      -                   
tcp6       0      0 :::10050                :::*                    LISTEN      10386/zabbix_agentd 

Restart Tomcat the same way on the other affected machines, and the problem should be resolved.
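
To see which machines still need this treatment, the JMX port can be probed on every Tomcat host in turn. The sketch below is deliberately generic: the probe is passed in as a command or function name that takes a host and a port and returns non-zero on failure. Both the function name and that calling convention are assumptions made for this example.

```shell
# Usage: failing_hosts PROBE PORT HOST...
# Runs "PROBE HOST PORT" for every host and prints the hosts where it fails.
failing_hosts() {
    probe=$1
    port=$2
    shift 2
    for host in "$@"; do
        "$probe" "$host" "$port" || printf '%s\n' "$host"
    done
}
```

With any TCP probe of that shape, a call such as failing_hosts my_probe 9999 192.168.100.107 192.168.100.108 lists the hosts whose JMX port is still refusing connections.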

Summary:

When Zabbix cannot reach a host over JMX, first check the configuration in /etc/zabbix/zabbix_server.conf on the server, then check the state of Tomcat on each client.