1、zookeeper session是什么
客户端与服务端的任何交换操作都与会话相关,包括:临时节点的生命周期,客户端的请求顺序执行以及Watcher通知机制。
2、客户端操作
2.1、Zookeeper客户端一次创建会话的过程。
2.2、构造会话请求
long sessId = (seenRwServerBefore) ? sessionId : 0; ConnectRequest conReq = new ConnectRequest(0, lastZxid,sessionTimeout, sessId, sessionPasswd); outgoingQueue.addFirst(new Packet(null, null, conReq,null, null, readOnly));
2.3、处理Response
ConnectResponse conRsp = new ConnectResponse(); conRsp.deserialize(bbia, "connect"); this.sessionId = conRsp.getSessionId(); sendThread.onConnected(conRsp.getTimeOut(),this.sessionId,conRsp.getPasswd(), isRO);
2.3.1、更新客户端Timeout等信息
void onConnected(int _negotiatedSessionTimeout, long _sessionId, byte[] _sessionPasswd, boolean isRO) throws IOException { negotiatedSessionTimeout = _negotiatedSessionTimeout; if (negotiatedSessionTimeout <= 0) { state = States.CLOSED; eventThread.queueEvent(new WatchedEvent( Watcher.Event.EventType.None, Watcher.Event.KeeperState.Expired, null)); eventThread.queueEventOfDeath(); throw new SessionExpiredException( "Unable to reconnect to ZooKeeper service, session 0x" + Long.toHexString(sessionId) + " has expired"); } if (!readOnly && isRO) { LOG.error("Read/write client got connected to read-only server"); } readTimeout = negotiatedSessionTimeout * 2 / 3; connectTimeout = negotiatedSessionTimeout / hostProvider.size(); hostProvider.onConnected(); sessionId = _sessionId; sessionPasswd = _sessionPasswd; state = (isRO) ? States.CONNECTEDREADONLY : States.CONNECTED; seenRwServerBefore |= !isRO; LOG.info("Session establishment complete on server " + clientCnxnSocket.getRemoteSocketAddress() + ", sessionid = 0x" + Long.toHexString(sessionId) + ", negotiated timeout = " + negotiatedSessionTimeout + (isRO ? " (READ-ONLY mode)" : "")); KeeperState eventState = (isRO) ? KeeperState.ConnectedReadOnly : KeeperState.SyncConnected; eventThread.queueEvent(new WatchedEvent( Watcher.Event.EventType.None, eventState, null)); }
2.4、发送Ping请求
客户端当sessionTimeout/3(readTimeout=sessionTimeout * 2 /3 )之内都没有发送通信请求后,会主动发起一个ping请求,以维持会话状态.
if (state.isConnected()) { int timeToNextPing = readTimeout / 2 - clientCnxnSocket.getIdleSend(); if (timeToNextPing <= 0) { sendPing(); clientCnxnSocket.updateLastSend(); } else { if (timeToNextPing < to) { to = timeToNextPing; } } }
3、服务端操作
3.1、服务端会话处理流程示意图
3.2、协商sessionTimeout
2Ticktime <= sessionTimeout <= 20Ticktime(tickTime默认等于2s)
int sessionTimeout = connReq.getTimeOut(); byte passwd[] = connReq.getPasswd(); int minSessionTimeout = getMinSessionTimeout(); if (sessionTimeout < minSessionTimeout) { sessionTimeout = minSessionTimeout; } int maxSessionTimeout = getMaxSessionTimeout(); if (sessionTimeout > maxSessionTimeout) { sessionTimeout = maxSessionTimeout; } cnxn.setSessionTimeout(sessionTimeout);
3.3、判断是否需要创建新会话
long sessionId = connReq.getSessionId(); if (sessionId != 0) { long clientSessionId = connReq.getSessionId(); LOG.info("Client attempting to renew session 0x"+ Long.toHexString(clientSessionId) + " at " + cnxn.getRemoteSocketAddress()); serverCnxnFactory.closeSession(sessionId); cnxn.setSessionId(sessionId); reopenSession(cnxn, sessionId, passwd, sessionTimeout); } else { LOG.info("Client attempting to establish new session at " + cnxn.getRemoteSocketAddress()); createSession(cnxn, passwd, sessionTimeout); }
3.4、创建sessionId
每次客户端创建新会话时,zookeeper都会为其分配一个全局唯一的sessionID。
public static long initializeNextSession(long id) { long nextSid = 0; nextSid = (System.currentTimeMillis() << 24) >> 8; nextSid = nextSid | (id <<56); return nextSid; } synchronized public long createSession(int sessionTimeout) { addSession(nextSessionId, sessionTimeout); return nextSessionId++; }
3.5、注册会话
SessionTracker中维护两个数据结构sessionsWithTimeout和sessionsById,前者根据sessionID保存了所有会话超时时间,后者则根据sessionID保存了会话实体。
HashMap<Long, SessionImpl> sessionsById = new HashMap<Long, SessionImpl>(); ConcurrentHashMap<Long, Integer> sessionsWithTimeout; synchronized public void addSession(long id, int sessionTimeout) { sessionsWithTimeout.put(id, sessionTimeout); if (sessionsById.get(id) == null) { SessionImpl s = new SessionImpl(id, sessionTimeout, 0); sessionsById.put(id, s); if (LOG.isTraceEnabled()) { ZooTrace.logTraceMessage(LOG, ZooTrace.SESSION_TRACE_MASK, "SessionTrackerImpl --- Adding session 0x" + Long.toHexString(id) + " " + sessionTimeout); } } else { if (LOG.isTraceEnabled()) { ZooTrace.logTraceMessage(LOG, ZooTrace.SESSION_TRACE_MASK, "SessionTrackerImpl --- Existing session 0x" + Long.toHexString(id) + " " + sessionTimeout); } } touchSession(id, sessionTimeout); }
3.6、激活会话
3.6.1、分桶策略
将类似的会话放在同一区块中进行管理,以便于不同区块的隔离,相同区块的统一处理。
分配的原则:会话的下一次超时的时间点(ExpirationTime, ExpirationInterval=tickTime 单位:毫秒)。
ExpirationTime = CurrentTime + SessionTimeout
ExpirationTime = (ExpirationTime / ExpirationInterval +1) x ExpirationInterval
以ExpirationTime为key存入Map中,实现了分桶。
3.6.2、激活
客户端会向服务端发送ping请求来保持会话的有效,服务端收到心跳会不断的更新对应的客户端会话。
HashMap<Long, SessionSet> sessionSets = new HashMap<Long, SessionSet>(); private long roundToInterval(long time) { // We give a one interval grace period return (time / expirationInterval + 1) * expirationInterval; } synchronized public boolean touchSession(long sessionId, int timeout) { if (LOG.isTraceEnabled()) { ZooTrace.logTraceMessage(LOG, ZooTrace.CLIENT_PING_TRACE_MASK, "SessionTrackerImpl --- Touch session: 0x" + Long.toHexString(sessionId) + " with timeout " + timeout); } SessionImpl s = sessionsById.get(sessionId); // Return false, if the session doesn't exists or marked as closing if (s == null || s.isClosing()) { return false; } long expireTime = roundToInterval(System.currentTimeMillis() + timeout); if (s.tickTime >= expireTime) { // Nothing needs to be done return true; } SessionSet set = sessionSets.get(s.tickTime); if (set != null) { set.sessions.remove(s); } s.tickTime = expireTime; set = sessionSets.get(s.tickTime); if (set == null) { set = new SessionSet(); sessionSets.put(expireTime, set); } set.sessions.add(s); return true; }
3.7、会话清理
3.7.1、会话超时检测
分桶策略的数据迁移是被动触发,没有做迁移的sessionId会一直保存在之前的桶中,SessionTracker线程会检测到过期(没有迁移的)的会话。
3.7.2、清理过程
1、标记会话为“已关闭”
2、发起“会话关闭”请求
3、收集要清理的临时节点
4、添加“节点删除”事务
5、删除临时节点
6、移除会话
7、关闭NioServerCnxn
synchronized public void run() { try { while (running) { currentTime = System.currentTimeMillis(); if (nextExpirationTime > currentTime) { this.wait(nextExpirationTime - currentTime); continue; } SessionSet set; set = sessionSets.remove(nextExpirationTime); if (set != null) { for (SessionImpl s : set.sessions) { setSessionClosing(s.sessionId); expirer.expire(s); } } nextExpirationTime += expirationInterval; } } catch (InterruptedException e) { LOG.error("Unexpected interruption", e); } LOG.info("SessionTrackerImpl exited loop!"); } synchronized public void setSessionClosing(long sessionId) { if (LOG.isTraceEnabled()) { LOG.info("Session closing: 0x" + Long.toHexString(sessionId)); } SessionImpl s = sessionsById.get(sessionId); if (s == null) { return; } s.isClosing = true; } public void expire(Session session) { long sessionId = session.getSessionId(); LOG.info("Expiring session 0x" + Long.toHexString(sessionId) + ", timeout of " + session.getTimeout() + "ms exceeded"); close(sessionId); } private void close(long sessionId) { submitRequest(null, sessionId, OpCode.closeSession, 0, null, null); }
PrepRequestProcessor.pRequest2Txn(int type, long zxid, Request request, Record record, boolean deserialize) case OpCode.closeSession: // We don't want to do this check since the session expiration thread // queues up this operation without being the session owner. // this request is the last of the session so it should be ok //zks.sessionTracker.checkSession(request.sessionId, request.getOwner()); HashSet<String> es = zks.getZKDatabase() .getEphemerals(request.sessionId); synchronized (zks.outstandingChanges) { for (ChangeRecord c : zks.outstandingChanges) { if (c.stat == null) { // Doing a delete es.remove(c.path); } else if (c.stat.getEphemeralOwner() == request.sessionId) { es.add(c.path); } } for (String path2Delete : es) { addChangeRecord(new ChangeRecord(request.hdr.getZxid(), path2Delete, null, 0, null)); } zks.sessionTracker.setSessionClosing(request.sessionId); } LOG.info("Processed session termination for sessionid: 0x" + Long.toHexString(request.sessionId)); break;
FinalRequestProcessor.processRequest(Request request) if (request.hdr != null && request.hdr.getType() == OpCode.closeSession) { ServerCnxnFactory scxn = zks.getServerCnxnFactory(); // this might be possible since // we might just be playing diffs from the leader if (scxn != null && request.cnxn == null) { // calling this if we have the cnxn results in the client's // close session response being lost - we've already closed // the session/socket here before we can send the closeSession // in the switch block below scxn.closeSession(request.sessionId); return; } }