EsgynDB Troubleshooting: Problem binding to /0.0.0.0:60020 : Address already in use

Symptom

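The cluster hosting EsgynDB has four HBase RegionServers, and one of them went offline for an unspecified reason.
Restarting that RegionServer manually from Cloudera Manager (CDH) failed with the following error: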

2019-08-03 10:40:48,501 ERROR org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: Failed construction of Regionserver: class org.apache.hadoop.hbase.regionserver.HRegionServer
        at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2706)
        at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:64)
        at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2721)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2704)
        ... 5 more
Caused by: java.io.IOException: Problem binding to /0.0.0.0:60020 : Address already in use. To switch ports use the 'hbase.regionserver.port' configuration property.
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.<init>(RSRpcServices.java:946)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.createRpcServices(HRegionServer.java:665)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:541)
        ... 10 more
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.apache.hadoop.hbase.ipc.RpcServer.bind(RpcServer.java:2571)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener.<init>(RpcServer.java:577)
        at org.apache.hadoop.hbase.ipc.RpcServer.<init>(RpcServer.java:2040)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.<init>(RSRpcServices.java:939)
        ... 12 more

Solution

The error message shows that port 60020 is already in use. Run "netstat -tapn | grep 60020" on the affected host to see which process holds the port:

tcp        0     42 172.31.234.7:37590      172.31.234.8:60020      ESTABLISHED 64793/java          
tcp6       0      0 172.31.234.7:48390      172.31.234.8:60020      ESTABLISHED 169495/tm           
tcp6       0      0 172.31.234.7:44362      172.31.234.10:60020     ESTABLISHED 357207/mxosrvr      
tcp6       0      0 172.31.234.7:38362      172.31.234.10:60020     ESTABLISHED 357668/mxosrvr      
tcp6       0      0 172.31.234.7:47100      172.31.234.10:60020     ESTABLISHED 169495/tm           
tcp6       0      0 172.31.234.7:54084      172.31.234.9:60020      ESTABLISHED 357668/mxosrvr      
tcp6       0      0 172.31.234.7:47380      172.31.234.8:60020      TIME_WAIT   -                   
tcp6       0      0 172.31.234.7:54064      172.31.234.9:60020      ESTABLISHED 359387/mxosrvr      
tcp6       0      0 172.31.234.7:55884      172.31.234.10:60020     ESTABLISHED 358441/mxosrvr      
tcp6       0      0 172.31.234.7:55282      172.31.234.9:60020      ESTABLISHED 357207/mxosrvr      
tcp6       0      0 172.31.234.7:44243      172.31.234.10:60020     ESTABLISHED 355979/mxosrvr      
tcp6       0      0 172.31.234.7:60020      172.31.234.9:2181       ESTABLISHED 358163/mxosrvr      
tcp6       0      0 172.31.234.7:33496      172.31.234.9:60020      ESTABLISHED 360563/mxosrvr      
tcp6       0      0 172.31.234.7:48068      172.31.234.8:60020      ESTABLISHED 357668/mxosrvr      
tcp6       0      0 172.31.234.7:47348      172.31.234.8:60020      ESTABLISHED 359387/mxosrvr      
tcp6       0      0 172.31.234.7:58994      172.31.234.10:60020     ESTABLISHED 359033/mxosrvr      
tcp6       0      0 172.31.234.7:53806      172.31.234.9:60020      ESTABLISHED 169495/tm           
tcp6       0      0 172.31.234.7:56668      172.31.234.10:60020     ESTABLISHED 357641/mxosrvr      
tcp6       0      0 172.31.234.7:46186      172.31.234.8:60020      ESTABLISHED 357207/mxosrvr      
tcp6       0      0 172.31.234.7:43650      172.31.234.8:60020      ESTABLISHED 169495/tm           
tcp6       0      0 172.31.234.7:33582      172.31.234.9:60020      ESTABLISHED 361781/mxosrvr      
tcp6       0      0 172.31.234.7:42910      172.31.234.10:60020     ESTABLISHED 359387/mxosrvr      
tcp6       0      0 172.31.234.7:49102      172.31.234.10:60020     ESTABLISHED 359462/mxosrvr
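
A plain grep for 60020 also matches every outbound connection to a RegionServer on another host, so the one relevant line is easy to miss. As a minimal sketch, the helper below (the name local_port_users is hypothetical, not part of any tool) keeps only the lines whose local address, the fourth column of netstat output, ends in the given port:

```shell
# Print only netstat lines whose LOCAL address (column 4) ends in :PORT.
# Reads `netstat -tapn` style output on stdin; lines are passed through unchanged.
local_port_users() {
    port="$1"
    awk -v port="$port" '$4 ~ (":" port "$")'
}

# Usage on a live host (root is needed to see PIDs owned by other users):
#   netstat -tapn | local_port_users 60020
```

Applied to the output above, this should leave only the line where 60020 appears on the local side.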

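One line stands out: "tcp6 0 0 172.31.234.7:60020 172.31.234.9:2181 ESTABLISHED 358163/mxosrvr". Here 60020 is the local port: mxosrvr process 358163 is using it as the source port of its connection to 172.31.234.9:2181 (2181 is the default ZooKeeper client port), so the RegionServer cannot bind to it. In every other line 60020 is only the remote port, that is, a connection to a RegionServer on another host.
Killing this process manually resolved the problem: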

kill -9 358163
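
The original report does not say why mxosrvr ended up on 60020. A likely explanation, offered here as an assumption, is that the kernel handed it out as an ephemeral source port: on many Linux systems the default ephemeral range is 32768 to 60999, which includes 60020. The sketch below checks whether a port falls inside a given range; the helper name port_in_range is hypothetical.

```shell
# Succeeds (exit status 0) if PORT lies within [LOW, HIGH], inclusive.
# Arguments: PORT LOW HIGH
port_in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# On Linux the ephemeral range is exposed as two numbers, e.g. "32768 60999":
#   read low high < /proc/sys/net/ipv4/ip_local_port_range
#   port_in_range 60020 "$low" "$high" && echo "60020 can be assigned as a client port"
```

If the check succeeds on your hosts, the same collision can recur. One option to evaluate is the Linux sysctl net.ipv4.ip_local_reserved_ports, which excludes listed ports from ephemeral allocation; another is moving the RegionServer off 60020 through 'hbase.regionserver.port', as the error message suggests.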
