[ZooKeeper Study Notes, Part 4] Session Re-establishment in the ZooKeeper Client Library

To illustrate the problem, let's first look at a simple example:

 

package com.tom.zookeeper.book;

import com.tom.Host;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.Watcher;

import java.io.IOException;

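public class Master implements Watcher {
    ZooKeeper zk;
    String hostPort;

    Master(String hostPort) {
        this.hostPort = hostPort;
    }

    void startZK() throws IOException {
        System.out.println("Start to create the ZooKeeper instance");
        // The constructor returns immediately; the connection is made asynchronously.
        // 15000 is the requested session timeout in milliseconds.
        zk = new ZooKeeper(hostPort, 15000, this);
        System.out.println("Finish to create the ZooKeeper instance");
    }

    // Called by the client library for connection state changes and watch events.
    @Override
    public void process(WatchedEvent e) {
        System.out.println(e);
    }

    public static void main(String[] args) throws Exception {
        Master m = new Master(Host.HOST);
        m.startZK();
        Thread.sleep(60 * 1000); // keep the process alive for 60s to observe events
    }
}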

 

 

1. ZooKeeper server not started

Running the code above prints:

Start to create the ZooKeeper instance

Finish to create the ZooKeeper instance

Because no session was ever established, the Watcher callback method process is never called.

 

2. ZooKeeper server started

Run the code above and shut down the ZooKeeper server before the program exits. The output is:

Finish to create the ZooKeeper instance
WatchedEvent state:SyncConnected type:None path:null
WatchedEvent state:Disconnected type:None path:null

 

3. Start the ZooKeeper server, run the code, and before the program exits, stop and then restart the server. The output is:

Finish to create the ZooKeeper instance
WatchedEvent state:SyncConnected type:None path:null
WatchedEvent state:Disconnected type:None path:null
WatchedEvent state:SyncConnected type:None path:null

 

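In the third case the client re-establishes its session. While the ZooKeeper server is unavailable the client stays in the CONNECTING state, and once the server becomes available again the client automatically re-establishes the session.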

 

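Suppose the ZooKeeper client's session timeout is 5 seconds and the ZooKeeper server is down for 10 minutes. One would expect the client's session to have timed out long before the server comes back, so why can the session be re-established, and why do the following events occur?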

WatchedEvent state:Disconnected type:None path:null
WatchedEvent state:SyncConnected type:None path:null

 

The official ZooKeeper FAQ answers exactly this question:

What happens to ZK sessions while the cluster is down?

 

Imagine that a client is connected to ZK with a 5 second session timeout, and the administrator brings the entire ZK cluster down for an upgrade. The cluster is down for several minutes, and then is restarted.

In this scenario, the client is able to reconnect and refresh its session. Because session timeouts are tracked by the leader, the session starts counting down again with a fresh timeout when the cluster is restarted. So, as long as the client connects within the first 5 seconds after a leader is elected, it will reconnect without an expiration, and any ephemeral nodes it had prior to the downtime will be maintained.

The same behavior is exhibited when the leader crashes and a new one is elected. In the limit, if the leader is flip-flopping back and forth quickly, sessions will never expire since their timers are getting constantly reset.
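In other words, if the session had not yet timed out when ZooKeeper went down, the session is re-established after ZooKeeper restarts. How much time the re-established session has left before it times out is not spelled out in detail; the FAQ only uses the word "fresh". Does this mean the session's timer starts over from zero, or is the time that had already elapsed while ZooKeeper was still alive counted as well? The wording "starts counting down again with a fresh timeout" reads most naturally as the former, but the FAQ does not say so explicitly.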
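
The FAQ scenario can be sketched as a toy model. This is only an illustration under a stated assumption: it reads "fresh timeout" as a full restart of the countdown when a leader is elected. The class `SessionTimerModel` and its methods are hypothetical names made up for this note and are not part of ZooKeeper.

```java
// Toy model of the FAQ's description; NOT ZooKeeper's real session tracking code.
class SessionTimerModel {
    private final long timeoutMs;
    private long deadlineMs;

    SessionTimerModel(long timeoutMs, long nowMs) {
        this.timeoutMs = timeoutMs;
        this.deadlineMs = nowMs + timeoutMs;
    }

    // A heartbeat or reconnect from the client pushes the deadline out by one timeout.
    void onClientContact(long nowMs) {
        deadlineMs = nowMs + timeoutMs;
    }

    // Assumption: a newly elected leader restarts the countdown with the full timeout.
    void onLeaderElected(long nowMs) {
        deadlineMs = nowMs + timeoutMs;
    }

    // Only a running leader would evaluate this; while the whole cluster is down,
    // nothing is checking the deadline, so the session cannot expire.
    boolean isExpired(long nowMs) {
        return nowMs >= deadlineMs;
    }

    public static void main(String[] args) {
        // 5 second timeout, session created at t=0, cluster down for 10 minutes.
        SessionTimerModel session = new SessionTimerModel(5_000, 0);
        long electedAt = 600_000;
        session.onLeaderElected(electedAt);

        // The client reconnects 2 seconds after the election: still within the fresh timeout.
        System.out.println(session.isExpired(electedAt + 2_000)); // false
        session.onClientContact(electedAt + 2_000);
    }
}
```

Under this reading, the length of the outage does not matter; what matters is whether the client reconnects within one timeout after a leader is elected.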

 

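Summary and open questions:

1. While the ZooKeeper server is unavailable, how long does the client stay in the CONNECTING state?

2. As this example shows, when handling a Disconnected event we should not create a new ZooKeeper instance in an attempt to open a new session. If ZooKeeper can recover, the existing ZooKeeper instance will re-establish the session with the help of the ZooKeeper Client Library; if ZooKeeper cannot recover, creating a new ZooKeeper instance is futile as well.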
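
The rule in point 2 can be written down as a small decision function. This is a minimal sketch, not code from the ZooKeeper library: `DisconnectPolicy`, `State`, and `Action` are hypothetical names, and the enum values only mirror the state names that the client reports through `Watcher.Event.KeeperState` (`SyncConnected`, `Disconnected`, `Expired`). The `EXPIRED` case is not covered by the experiment above; it is included on the assumption that an expired session is the one situation where a new ZooKeeper instance is actually needed.

```java
// Hypothetical policy sketch; a real Watcher would switch on WatchedEvent.getState().
class DisconnectPolicy {
    enum State { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

    enum Action { USE_HANDLE, KEEP_HANDLE_AND_WAIT, CREATE_NEW_HANDLE }

    static Action decide(State state) {
        switch (state) {
            case SYNC_CONNECTED:
                return Action.USE_HANDLE;
            case DISCONNECTED:
                // The client library keeps retrying on its own; do not replace the handle.
                return Action.KEEP_HANDLE_AND_WAIT;
            case EXPIRED:
                // Assumption: the server has declared the session dead, so the old handle is unusable.
                return Action.CREATE_NEW_HANDLE;
            default:
                throw new IllegalArgumentException("unhandled state: " + state);
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> " + decide(s));
        }
    }
}
```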

 

 
 

 

 

 
