Zookeeper CancelledKeyException


As the number of applications on our production cluster grew, the load on ZooKeeper kept increasing, and eventually the ResourceManager hung abnormally.

Latency problem

The first symptom is the warning "fsync-ing the write ahead log in SyncThread:3 took 1606ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide".

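This warning means that syncing the transaction log to disk is slow. It usually points to poor disk performance, or to the log and data directories sharing the same disk. Moving the two directories onto separate disks and adjusting a few parameters, such as the following, may help somewhat.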

tickTime=4000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=20
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=10
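
Since initLimit and syncLimit are counted in ticks, it can help to see what the settings above mean in milliseconds. The following is a minimal sketch, not part of the original setup: the class name ZkTimingSketch and its helper methods are made up for illustration. The session timeout bounds it prints assume that minSessionTimeout and maxSessionTimeout are left unset, in which case ZooKeeper documents defaults of 2 and 20 ticks.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper: converts tick-based zoo.cfg limits to milliseconds. */
final class ZkTimingSketch {

    /** Parses zoo.cfg-style text, which java.util.Properties can read. */
    static Properties parse(String cfg) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(cfg));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    /** Returns the value of a tick-based key multiplied by tickTime (ms). */
    static long ticksToMillis(Properties p, String key) {
        long tickTime = Long.parseLong(p.getProperty("tickTime").trim());
        long ticks = Long.parseLong(p.getProperty(key).trim());
        return tickTime * ticks;
    }

    public static void main(String[] args) {
        // Same values as the configuration shown above.
        Properties p = parse("tickTime=4000\ninitLimit=20\nsyncLimit=10\n");
        long tickTime = Long.parseLong(p.getProperty("tickTime").trim());

        System.out.println("initLimit window (ms): " + ticksToMillis(p, "initLimit"));
        System.out.println("syncLimit window (ms): " + ticksToMillis(p, "syncLimit"));
        // Assumption: minSessionTimeout/maxSessionTimeout are not set,
        // so the documented defaults of 2 and 20 ticks apply.
        System.out.println("min session timeout (ms): " + 2 * tickTime);
        System.out.println("max session timeout (ms): " + 20 * tickTime);
    }
}
```

With these values a follower gets 80 seconds for the initial sync and 40 seconds between a request and its acknowledgement. That gives a slow disk more slack, but it does not make the fsync itself any faster.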

CancelledKeyException

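The second problem is ZooKeeper's CancelledKeyException. This bug is not fixed in the ZooKeeper shipped with CDH 5.11 or any earlier release (5.11 was the latest release when this article was written).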

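When the ZooKeeper ensemble is under heavy load, this bug can cause other services that depend on ZooKeeper (including YARN, Storm, and Kafka) to lose their connection and hang, so the patch has to be applied.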

https://issues.apache.org/jira/browse/ZOOKEEPER-1237

diff -uwp zookeeper-3.4.5/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java.ZK1237 zookeeper-3.4.5/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java
--- zookeeper-3.4.5/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java.ZK1237    2012-09-30 10:53:32.000000000 -0700
+++ zookeeper-3.4.5/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java    2013-08-07 13:20:19.227152865 -0700
@@ -150,7 +150,8 @@ public class NIOServerCnxn extends Serve
                 // We check if write interest here because if it is NOT set,
                 // nothing is queued, so we can try to send the buffer right
                 // away without waking up the selector
-                if ((sk.interestOps() & SelectionKey.OP_WRITE) == 0) {
+                if (sk.isValid() &&
+                    (sk.interestOps() & SelectionKey.OP_WRITE) == 0) {
                     try {
                         sock.write(bb);
                     } catch (IOException e) {
@@ -214,14 +215,18 @@ public class NIOServerCnxn extends Serve

                 return;
             }
-            if (k.isReadable()) {
+            if (k.isValid() && k.isReadable()) {
                 int rc = sock.read(incomingBuffer);
                 if (rc < 0) {
-                    throw new EndOfStreamException(
+                    if (LOG.isDebugEnabled()) {
+                        LOG.debug(
                             "Unable to read additional data from client sessionid 0x"
                             + Long.toHexString(sessionId)
                             + ", likely client has closed socket");
                 }
+                    close();
+                    return;
+                }
                 if (incomingBuffer.remaining() == 0) {
                     boolean isPayload;
                     if (incomingBuffer == lenBuffer) { // start of next request
@@ -242,7 +247,7 @@ public class NIOServerCnxn extends Serve
                     }
                 }
             }
-            if (k.isWritable()) {
+            if (k.isValid() && k.isWritable()) {
                 // ZooLog.logTraceMessage(LOG,
                 // ZooLog.CLIENT_DATA_PACKET_TRACE_MASK
                 // "outgoingBuffers.size() = " +
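
The core of the patch is checking isValid() before touching a SelectionKey. The behaviour it guards against can be reproduced with plain java.nio, without ZooKeeper. The sketch below is only an illustration: the class CancelledKeyDemo and its method names are invented here, and it uses a Pipe instead of a client socket so that it needs no network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

/** Hypothetical demo of the unguarded and guarded checks on a cancelled key. */
final class CancelledKeyDemo {

    /** Mirrors the pre-patch check: interestOps() is called unconditionally. */
    static boolean unguardedCheckThrows(SelectionKey sk) {
        try {
            sk.interestOps();
            return false;
        } catch (CancelledKeyException e) {
            return true;
        }
    }

    /** Mirrors the patched check: isValid() short-circuits before interestOps(). */
    static boolean guardedCheck(SelectionKey sk) {
        return sk.isValid() && (sk.interestOps() & SelectionKey.OP_WRITE) == 0;
    }

    /**
     * Registers a channel, cancels its key, and returns
     * {validBeforeCancel, validAfterCancel, unguardedThrew, guardedResult}.
     */
    static boolean[] run() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.sink().configureBlocking(false);
                SelectionKey sk = pipe.sink().register(selector, 0);
                boolean validBefore = sk.isValid();
                sk.cancel(); // stands in for the connection being closed elsewhere
                boolean validAfter = sk.isValid();
                return new boolean[] {
                    validBefore, validAfter, unguardedCheckThrows(sk), guardedCheck(sk)
                };
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("valid before cancel: " + r[0]);
        System.out.println("valid after cancel:  " + r[1]);
        System.out.println("unguarded interestOps() threw: " + r[2]);
        System.out.println("guarded check result: " + r[3]);
    }
}
```

Once the key is cancelled, the unguarded call throws CancelledKeyException, while the guarded form simply evaluates to false and skips the write. In the server the key is cancelled from another thread, so the guard narrows the window for the exception rather than closing it completely.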

Logs

ZooKeeper log

2017-06-15 03:09:03,098 [myid:3] - WARN  [SyncThread:3:FileTxnLog@321] - fsync-ing the write ahead log in SyncThread:3 took 1506ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-06-15 03:09:05,098 [myid:3] - WARN  [SyncThread:3:FileTxnLog@321] - fsync-ing the write ahead log in SyncThread:3 took 1606ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-06-15 03:09:05,744 [myid:3] - INFO  [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException when processing sessionid:0x35c9bd7493c22c0 type:create cxid:0x2 zxid:0x4f003006ef txntype:-1 reqpath:n/a Error Path:/hive_zookeeper_namespace/zb_ods Error:KeeperErrorCode = NodeExists for /hive_zookeeper_namespace/zb_ods
2017-06-15 03:09:09,952 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x25c9bd75af70091, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2017-06-15 03:09:12,076 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /133.0.91.42:46579 which had sessionid 0x25c9bd75af70091
2017-06-15 03:09:12,076 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x35c9bd7493c22ac, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2017-06-15 03:09:12,076 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /133.0.91.41:53351 which had sessionid 0x35c9bd7493c22ac
2017-06-15 03:09:12,076 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x35c9bd7493c2268, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2017-06-15 03:09:12,075 [myid:3] - INFO  [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException when processing sessionid:0x25c9bd75af70091 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:null Error:KeeperErrorCode = Session moved
2017-06-15 03:09:12,083 [myid:3] - INFO  [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException when processing sessionid:0x35c9bd7493c2268 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:null Error:KeeperErrorCode = Session moved
2017-06-15 03:09:12,083 [myid:3] - INFO  [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@574] - Got user-level KeeperException when processing sessionid:0x35c9bd7493c2268 type:multi cxid:0x7e06 zxid:0x4f003006f2 txntype:-1 reqpath:n/a aborting remaining multi ops. Error Path:null Error:KeeperErrorCode = Session moved
2017-06-15 03:09:12,087 [myid:3] - WARN  [SyncThread:3:FileTxnLog@321] - fsync-ing the write ahead log in SyncThread:3 took 6988ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-06-15 03:09:14,136 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /133.0.91.42:46580 which had sessionid 0x35c9bd7493c2268
2017-06-15 03:09:14,136 [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170)
        at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:634)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2017-06-15 03:09:14,137 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /133.0.91.42:47513
2017-06-15 03:09:14,341 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client attempting to renew session 0x35c9bd7493c2268 at /133.0.91.42:47513
2017-06-15 03:09:14,341 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@595] - Established session 0x35c9bd7493c2268 with negotiated timeout 10000 for client /133.0.91.42:47513
2017-06-15 03:09:14,342 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@863] - got auth packet /133.0.91.42:47513
2017-06-15 03:09:14,342 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@897] - auth success /133.0.91.42:47513
2017-06-15 03:09:14,342 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x35c9bd7493c2268 due to java.io.IOException: Len error 1673753
2017-06-15 03:09:14,342 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /133.0.91.42:47513 which had sessionid 0x35c9bd7493c2268
2017-06-15 03:09:14,341 [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
        at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:634)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2017-06-15 03:09:14,433 [myid:3] - INFO  [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException when processing sessionid:0x35c9bd7493c22c0 type:create cxid:0x5 zxid:0x4f003006fd txntype:-1 reqpath:n/a Error
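
To see how bad the fsync stalls are over a longer period, the durations can be pulled out of a log like the one above. The following is a rough sketch: the class FsyncWarningScan and the regular expression are written for this article and are not part of ZooKeeper. It assumes the warning text has exactly the form shown in the log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical scanner that extracts fsync durations from ZooKeeper log lines. */
final class FsyncWarningScan {

    // Matches the warning as it appears in the log above.
    private static final Pattern FSYNC =
        Pattern.compile("fsync-ing the write ahead log in (\\S+) took (\\d+)ms");

    /** Returns the fsync duration (ms) of every matching line, in order. */
    static List<Long> fsyncMillis(List<String> lines) {
        List<Long> durations = new ArrayList<>();
        for (String line : lines) {
            Matcher m = FSYNC.matcher(line);
            if (m.find()) {
                durations.add(Long.parseLong(m.group(2)));
            }
        }
        return durations;
    }

    public static void main(String[] args) {
        // Three warnings copied from the log above, plus one unrelated line.
        List<String> sample = Arrays.asList(
            "2017-06-15 03:09:03,098 [myid:3] - WARN  [SyncThread:3:FileTxnLog@321] - fsync-ing the write ahead log in SyncThread:3 took 1506ms which will adversely effect operation latency.",
            "2017-06-15 03:09:05,098 [myid:3] - WARN  [SyncThread:3:FileTxnLog@321] - fsync-ing the write ahead log in SyncThread:3 took 1606ms which will adversely effect operation latency.",
            "2017-06-15 03:09:09,952 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception",
            "2017-06-15 03:09:12,087 [myid:3] - WARN  [SyncThread:3:FileTxnLog@321] - fsync-ing the write ahead log in SyncThread:3 took 6988ms which will adversely effect operation latency.");

        List<Long> durations = fsyncMillis(sample);
        long worst = durations.stream().mapToLong(Long::longValue).max().orElse(0L);
        System.out.println("fsync warnings: " + durations.size() + ", worst (ms): " + worst);
    }
}
```

In this excerpt the worst stall is 6988 ms, logged at 03:09:12, within the same few seconds as the closed client connections and the CancelledKeyException.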

 

ResourceManager log:

2017-06-15 03:09:11,943 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2017-06-15 03:09:11,946 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
2017-06-15 03:09:11,947 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a036265681203726d32
2017-06-15 03:09:11,947 INFO org.apache.hadoop.ha.ActiveStandbyElector: But old node has our own data, so don't need to fence it.
2017-06-15 03:09:11,947 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing znode /yarn-leader-election/beh/ActiveBreadCrumb to indicate that the local node is the most recent active...
2017-06-15 03:09:11,953 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x35c9bd7493c2268, likely server has closed socket, closing socket connection and attempting reconnect
2017-06-15 03:09:12,051 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: checking for deactivate...
2017-06-15 03:09:12,361 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server had02.hadoop/133.0.91.42:2181. Will not attempt to authenticate using SASL (unknown error)
2017-06-15 03:09:12,361 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to had02.hadoop/133.0.91.42:2181, initiating session
2017-06-15 03:09:12,387 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server had02.hadoop/133.0.91.42:2181, sessionid = 0x35c9bd7493c2268, negotiated timeout = 10000
2017-06-15 03:09:12,392 WARN org.apache.zookeeper.ClientCnxn: Session 0x35c9bd7493c2268 for server had02.hadoop/133.0.91.42:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
2017-06-15 03:09:12,534 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: checking for deactivate...
2017-06-15 03:09:12,678 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/opt/beh/core/hadoop/etc/hadoop/yarn-site.xml
2017-06-15 03:09:12,681 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=refreshAdminAcls      TARGET=AdminService     RESULT=SUCCESS
2017-06-15 03:09:12,681 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Already in active state
2017-06-15 03:09:12,681 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=refreshQueues TARGET=AdminService     RESULT=SUCCESS
2017-06-15 03:09:12,681 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/opt/beh/core/hadoop/etc/hadoop/yarn-site.xml
2017-06-15 03:09:12,683 INFO org.apache.hadoop.util.HostsFileReader: Setting the includes file to
2017-06-15 03:09:12,683 INFO org.apache.hadoop.util.HostsFileReader: Setting the excludes file to
2017-06-15 03:09:12,683 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
2017-06-15 03:09:12,683 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=refreshNodes  TARGET=AdminService     RESULT=SUCCESS
2017-06-15 03:09:12,683 INFO org.apache.hadoop.conf.Configuration: found resource core-site.xml at file:/opt/beh/core/hadoop/etc/hadoop/core-site.xml
2017-06-15 03:09:12,685 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=refreshSuperUserGroupsConfiguration   TARGET=AdminService     RESULT=SUCCESS
2017-06-15 03:09:12,685 INFO org.apache.hadoop.conf.Configuration: found resource core-site.xml at file:/opt/beh/core/hadoop/etc/hadoop/core-site.xml
2017-06-15 03:09:12,685 INFO org.apache.hadoop.security.Groups: clearing userToGroupsMap cache
2017-06-15 03:09:12,685 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=refreshUserToGroupsMappings   TARGET=AdminService     RESULT=SUCCESS
2017-06-15 03:09:12,685 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=transitionToActive    TARGET=RMHAProtocolService      RESULT=SUCCESS
2017-06-15 03:09:12,876 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server had03.hadoop/133.0.91.43:2181. Will not attempt to authenticate using SASL (unknown error)
2017-06-15 03:09:12,877 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to had03.hadoop/133.0.91.43:2181, initiating session
2017-06-15 03:09:13,535 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1497466013139_0394_000001 (auth:SIMPLE)
2017-06-15 03:09:13,539 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AM registration appattempt_1497466013139_0394_000001
2017-06-15 03:09:13,539 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   IP=133.0.91.43  OPERATION=Register App Master   TARGET=ApplicationMasterService RESULT=SUCCESS  APPID=application_1497466013139_0394    APPATTEMPTID=appattempt_1497466013139_0394_000001
2017-06-15 03:09:14,920 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server had03.hadoop/133.0.91.43:2181, sessionid = 0x35c9bd7493c2268, negotiated timeout = 10000
2017-06-15 03:09:14,927 WARN org.apache.zookeeper.ClientCnxn: Session 0x35c9bd7493c2268 for server had03.hadoop/133.0.91.43:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
2017-06-15 03:09:15,028 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Waiting for zookeeper to be connected, retry no. + 2
2017-06-15 03:09:15,077 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server had01.hadoop/133.0.91.41:2181. Will not attempt to authenticate using SASL (unknown error)
2017-06-15 03:09:15,078 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to had01.hadoop/133.0.91.41:2181, initiating session
2017-06-15 03:09:15,100 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server had01.hadoop/133.0.91.41:2181, sessionid = 0x35c9bd7493c2268, negotiated timeout = 10000
2017-06-15 03:09:15,105 WARN org.apache.zookeeper.ClientCnxn: Session 0x35c9bd7493c2268 for server had01.hadoop/133.0.91.41:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
2017-06-15 03:09:15,588 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 397
2017-06-15 03:09:15,772 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 398
2017-06-15 03:09:16,091 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server had02.hadoop/133.0.91.42:2181. Will not attempt to authenticate using SASL (unknown error)
2017-06-15 03:09:16,092 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to had02.hadoop/133.0.91.42:2181, initiating session
2017-06-15 03:09:16,118 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server had02.hadoop/133.0.91.42:2181, sessionid = 0x35c9bd7493c2268, negotiated timeout = 10000
2017-06-15 03:09:16,124 WARN org.apache.zookeeper.ClientCnxn: Session 0x35c9bd7493c2268 for server had02.hadoop/133.0.91.42:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
2017-06-15 03:09:16,315 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server had03.hadoop/133.0.91.43:2181. Will not attempt to authenticate using SASL (unknown error)
2017-06-15 03:09:16,315 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to had03.hadoop/133.0.91.43:2181, initiating session
2017-06-15 03:09:16,339 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server had03.hadoop/133.0.91.43:2181, sessionid = 0x35c9bd7493c2268, negotiated timeout = 10000
2017-06-15 03:09:16,344 WARN org.apache.zookeeper.ClientCnxn: Session 0x35c9bd7493c2268 for server had03.hadoop/133.0.91.43:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
2017-06-15 03:09:17,176 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server had01.hadoop/133.0.91.41:2181. Will not attempt to authenticate using SASL (unknown error)
2017-06-15 03:09:17,177 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to had01.hadoop/133.0.91.41:2181, initiating session
2017-06-15 03:09:17,201 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server had01.hadoop/133.0.91.41:2181, sessionid = 0x35c9bd7493c2268, negotiated timeout = 10000
2017-06-15 03:09:17,206 WARN org.apache.zookeeper.ClientCnxn: Session 0x35c9bd7493c2268 for server had01.hadoop/133.0.91.41:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
2017-06-15 03:09:17,307 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Waiting for zookeeper to be connected, retry no. + 3
2017-06-15 03:09:17,329 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server had02.hadoop/133.0.91.42:2181. Will not attempt to authenticate using SASL (unknown error)

Reposted from: https://my.oschina.net/yulongblog/blog/1506168
