Problems encountered while learning ZooKeeper (zk)

  • Four-letter-word command is not in the whitelist

is not executed because it is not in the whitelist
The command has to be whitelisted in zoo.cfg (* allows all four-letter commands), and the server restarted:

4lw.commands.whitelist=*
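
To confirm that the whitelist change took effect, a four-letter command can be sent over a plain TCP socket, which is roughly what `echo ruok | nc localhost 2181` does. Below is a minimal Python sketch. The function name `send_four_letter_word` is my own, not part of ZooKeeper, and the host and port are whatever your clientPort setup uses.

```python
import socket


def send_four_letter_word(host, port, command, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply as text."""
    if len(command) != 4:
        raise ValueError("four-letter commands are exactly 4 characters long")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            # The server writes its reply and then closes the connection,
            # so we read until recv() returns an empty bytes object.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Against a healthy server, `send_four_letter_word("localhost", 2181, "ruok")` should come back with `imok`. If the command is still not whitelisted, the reply is the "is not executed because it is not in the whitelist" message shown above.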


  • Since 3.5, ZooKeeper ships an AdminServer, which uses port 8080 by default

Unable to start AdminServer, exiting abnormally
org.apache.zookeeper.server.admin.AdminServer$AdminServerException: Problem starting AdminServer on address 0.0.0.0, port 8080 and command URL /commands
    at org.apache.zookeeper.server.admin.JettyAdminServer.start(JettyAdminServer.java:107)
    at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:138)
    at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:106)
    at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:64)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:128)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
Caused by: java.io.IOException: Failed to bind to /0.0.0.0:8080

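Port 8080 was already taken by another process. This can be fixed by adding the following to zoo.cfg (alternatively, admin.enableServer=false switches the AdminServer off entirely):
admin.serverPort=<an unused port>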
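
To find a value for admin.serverPort, you can check whether a candidate port is taken, or let the operating system pick one. This is a small Python sketch; `is_port_free` and `pick_free_port` are names I chose for illustration. A port reported free can still be grabbed by another process before ZooKeeper starts, so treat the result as a hint.

```python
import socket


def is_port_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to host:port.

    Ports below 1024 usually need root, so they are reported as not free
    for an unprivileged user even when nothing is listening on them.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_free_port(host="0.0.0.0"):
    """Ask the OS for any currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```

For example, `is_port_free(8080)` tells you whether the default AdminServer port is available, and `pick_free_port()` returns a number you could put after admin.serverPort=.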


  • The client keeps getting "connection refused" on startup

Socket error occurred: localhost/127.0.0.1:2181: Connection refused
Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL

It turned out I had forgotten to start zkServer first, so the client could never connect. Start the server, then the client:

[root@localhost bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/dev-environment/zk/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[root@localhost bin]# ./zkCli.sh

  • Cluster fails to start when configured with Alibaba Cloud public IP addresses: the address cannot be bound

2020-04-30 09:53:12,757 [myid:1] - INFO  [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 1600
2020-04-30 09:53:13,266 [myid:1] - INFO  [/47.102.153.155:3888:QuorumCnxManager$Listener@917] - My election bind port: /47.102.153.155:3888
2020-04-30 09:53:13,268 [myid:1] - ERROR [/47.102.153.155:3888:QuorumCnxManager$Listener@947] - Exception while listening
java.net.BindException: Cannot assign requested address (Bind failed)
    at java.net.PlainSocketImpl.socketBind(Native Method)
    at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
    at java.net.ServerSocket.bind(ServerSocket.java:375)
    at java.net.ServerSocket.bind(ServerSocket.java:329)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:919)
2020-04-30 09:53:14,281 [myid:1] - INFO  [/47.102.153.155:3888:QuorumCnxManager$Listener@962] - Leaving listener
2020-04-30 09:53:14,282 [myid:1] - ERROR [/47.102.153.155:3888:QuorumCnxManager$Listener@964] - As I'm leaving the listener thread after 3 errors. I won't be able to participate in leader election any longer: 47.102.153.155:3888. Use zookeeper.electionPortBindRetry property to increase retry count.

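A Baidu search turned up the following:

If the IP after server.x is a public IP, you need to add quorumListenOnAllIPs=true to zoo.cfg, so that the server listens on all local addresses instead of trying to bind the public one. That source does not recommend it, noting that the setting affects the connections used by the ZAB protocol and fast leader election. If the IP after server.x is a private (intranet) IP, the problem does not occur, and this is the recommended setup. For a pseudo-cluster of several instances on a single server, 127.0.0.1 works too.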

It still failed:

java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
    at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1294)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:82)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1131)
2020-04-30 10:01:00,965 [myid:1] - WARN  [SendWorker:1:QuorumCnxManager$SendWorker@1153] - Send worker leaving thread  id 1 my id = 1
2020-04-30 10:01:02,562 [myid:1] - INFO  [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 3200
2020-04-30 10:01:02,571 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@438] - Have smaller server identifier, so dropping the connection: (2, 1)
2020-04-30 10:01:02,574 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@438] - Have smaller server identifier, so dropping the connection: (3, 1)
2020-04-30 10:01:02,575 [myid:1] - INFO  [0.0.0.0/0.0.0.0:3888:QuorumCnxManager$Listener@924] - Received connection request 10.1.1.80:39818
2020-04-30 10:01:02,577 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@679] - Notification: 2 (message format version), 1 (n.leader), 0x5e (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2020-04-30 10:01:02,584 [myid:1] - WARN  [RecvWorker:1:QuorumCnxManager$RecvWorker@1227] - Connection broken for id 1, my id = 1, error =

Oh my... it turned out that myid=1 did not match the server.0=xxx.xxx.xxx entry in zoo.cfg. With myid=1 the entry has to be server.1.
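
A mismatch like this is easy to catch before starting the server by comparing the number in the myid file with the server.N lines in zoo.cfg. The following Python sketch works on the file contents as strings. The function names and the sample addresses are made up for illustration.

```python
import re


def parse_server_ids(zoo_cfg_text):
    """Return the set of N values found in `server.N=...` lines of a zoo.cfg."""
    ids = set()
    for line in zoo_cfg_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # skip commented-out entries
        match = re.match(r"server\.(\d+)\s*=", line)
        if match:
            ids.add(int(match.group(1)))
    return ids


def myid_matches(zoo_cfg_text, myid_text):
    """Return True if the number stored in myid has a matching server.N entry."""
    return int(myid_text.strip()) in parse_server_ids(zoo_cfg_text)


# Hypothetical config reproducing the mistake: entries start at server.0,
# but this node's myid file contains 1.
sample_cfg = """
tickTime=2000
server.0=10.0.0.1:2888:3888
server.2=10.0.0.2:2888:3888
"""
print(myid_matches(sample_cfg, "1\n"))
```

In practice you would read zoo.cfg from the conf directory and myid from the dataDir configured in it, then pass both strings to `myid_matches`.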

After fixing that it still failed, this time because the connection to the other nodes could not be established. So hard...

2020-04-30 10:10:55,012 [myid:1] - WARN  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumCnxManager@685] - Cannot open channel to 2 at election address /101.132.41.111:3888
java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:606)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:656)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:713)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:741)
    at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:910)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1229)

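This was probably because the Alibaba Cloud servers and my virtual machine could not reach each other over the network. I was using two Alibaba Cloud servers plus one local VM, and since the cloud servers and the VM cannot talk to each other, the cluster members could not connect.

I had no choice but to borrow one more server from a friend....

Later, with three Alibaba Cloud servers, I still got java.net.ConnectException: Connection refused (Connection refused)

On the verge of a breakdown...............

A few hours later it suddenly occurred to me that the Alibaba Cloud security group might be blocking the ports zk uses to communicate

I opened these three ports: 2181 (client connections), 2888 (followers connecting to the leader) and 3888 (leader election)

Finally it works.........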
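
A quick way to tell a firewall or security-group problem from a ZooKeeper problem is to test the three ports directly from one of the other nodes. This is a minimal Python sketch; the function names are mine. One caveat: 2888 is only listened on by the current leader, so seeing it closed on a follower, or before an election has finished, is expected.

```python
import socket

# Default ports from this setup; adjust if your zoo.cfg uses other values.
ZK_PORTS = {
    2181: "client connections",
    2888: "followers connecting to the leader",
    3888: "leader election",
}


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_zk_ports(host, ports=ZK_PORTS, timeout=3.0):
    """Return a dict mapping each port to whether it accepted a connection."""
    return {port: is_reachable(host, port, timeout) for port in ports}
```

Running `check_zk_ports("<address of another node>")` from each machine shows which ports are actually reachable between the nodes.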


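Native zk client problem: KeeperErrorCode = NodeExists for /xxx

Before creating a node, always check whether that node (path) already exists. Paths in ZooKeeper are unique, so creating a node at a path that is already taken raises this error. This is also why deleting the version-2 folder under the data directory makes the error go away: it wipes the stored data, including every node that had already been created. That is only reasonable on a throwaway test setup.

Another way is to connect to this ZooKeeper with the client and delete the node with rmr /node (in 3.5 and later, rmr is deprecated in favor of deleteall).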
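
The check-before-create pattern looks roughly like this. It is an illustration in Python, not the native Java client: `InMemoryZk` and `NodeExistsError` are hypothetical stand-ins I made up so the example runs without a server. With the Java client the same shape applies using `ZooKeeper.exists`, `ZooKeeper.create` and `KeeperException.NodeExistsException`. The existence check alone is not enough, because another client may create the node between the check and the create, so the error is still caught.

```python
class NodeExistsError(Exception):
    """Hypothetical stand-in for the client's 'node already exists' error."""


class InMemoryZk:
    """Toy stand-in for a ZooKeeper client. This is NOT a real client API."""

    def __init__(self):
        self._nodes = {"/": b""}

    def exists(self, path):
        return path in self._nodes

    def create(self, path, data=b""):
        if path in self._nodes:
            raise NodeExistsError(path)
        self._nodes[path] = data

    def get(self, path):
        return self._nodes[path]


def create_if_absent(client, path, data=b""):
    """Create the node unless it already exists.

    Returns True if this call created the node, False if it was already there.
    """
    if client.exists(path):
        return False
    try:
        client.create(path, data)
        return True
    except NodeExistsError:
        # Someone else created the node between exists() and create().
        return False
```

Calling `create_if_absent(client, "/xxx", b"a")` twice creates the node the first time and leaves it untouched the second time, instead of failing with NodeExists.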

