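Introduction
I started using ZooKeeper a long time ago and found it very pleasant to work with. Building on ZooKeeper, I even turned a general-purpose file collector that supported neither distributed deployment nor task distribution into a distributed one. Later I read 《从Paxos到Zookeeper分布式一致性原理与实践》 (From Paxos to ZooKeeper: Distributed Consistency Principles and Practice) and then debugged the standalone ZooKeeper source code under several different scenarios. I took no notes at the time, so after a while I had forgotten many of the details. I recently read through the ZooKeeper source again, learned a lot, and am recording it here.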
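Server startup process
1. Parameter parsing
Let's start with how the standalone ZooKeeper server starts. The entry class is org.apache.zookeeper.server.ZooKeeperServerMain, which takes zoo.cfg as its startup argument. The settings in zoo.cfg are parsed into the fields of ServerConfig. Two of them deserve a mention, minSessionTimeout and maxSessionTimeout: when they are not set explicitly, they are derived from tickTime (2 × tickTime and 20 × tickTime respectively).
The ServerConfig properties are listed below.
A few of the important ones:
- clientPortAddress: the address and port on which the server listens for client connections
- dataDir: where the server persists snapshots of the ZooKeeper node data
- logDir: where the server writes the transaction log (the dataLogDir setting in zoo.cfg)
- tickTime: the base time unit from which minSessionTimeout and maxSessionTimeout are derived
- maxClientCnxns: the maximum number of connections a single IP address may open
- listenBacklog: the maximum length of the socket's pending-connection queue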
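To make the relationship between tickTime and the two session timeouts concrete, here is a minimal sketch. It assumes that an unset value is represented as -1 and falls back to 2 × tickTime or 20 × tickTime, and that a timeout requested by a client is clamped into that range. The class and method names are invented for this example and are not ZooKeeper's API.

```java
/** Illustrative helper only; the names here are hypothetical, not ZooKeeper classes. */
final class SessionTimeoutSketch {
    /** -1 stands for "not configured", so the value falls back to 2 * tickTime. */
    static int minSessionTimeout(int tickTime, int configured) {
        return configured == -1 ? tickTime * 2 : configured;
    }

    /** -1 stands for "not configured", so the value falls back to 20 * tickTime. */
    static int maxSessionTimeout(int tickTime, int configured) {
        return configured == -1 ? tickTime * 20 : configured;
    }

    /** Clamp the timeout a client asks for into the [min, max] range. */
    static int negotiate(int requested, int min, int max) {
        return Math.max(min, Math.min(max, requested));
    }

    public static void main(String[] args) {
        int tickTime = 2000; // milliseconds
        int min = minSessionTimeout(tickTime, -1);
        int max = maxSessionTimeout(tickTime, -1);
        System.out.println("min=" + min + " max=" + max
                + " negotiated=" + negotiate(60000, min, max));
    }
}
```

With tickTime set to 2000 ms, the bounds come out as 4000 ms and 40000 ms, so a client asking for 60000 ms would end up with 40000 ms.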
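runFromConfig
runFromConfig starts the ZooKeeper server from the parsed parameters:
- Create the FileTxnSnapLog
FileTxnSnapLog has two important fields, txnLog and snapLog, which point to ZooKeeper's transaction log and to its snapshot (persisted data) files respectively.
- Create the ZooKeeperServer
ZooKeeperServer is the class that represents the server side.
- Create the adminServer
adminServer uses Jetty as its embedded server and listens on port 8080 by default; the commands ZooKeeper supports can be listed at http://ip:8080/commands
- Create the ServerCnxnFactory
ServerCnxnFactory manages the server-side connections. It has two implementations, NIOServerCnxnFactory and NettyServerCnxnFactory.
NIOServerCnxnFactory is the default.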
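The implementation can be switched without code changes: as far as I know, ZooKeeper reads the class name from the zookeeper.serverCnxnFactory system property and instantiates it reflectively. The sketch below imitates that idea with stand-in classes. Everything in it (the interface, the two factories, and the demo.cnxnFactory property name) is hypothetical and only illustrates the pattern.

```java
/** Sketch of "pick the implementation by class name from a system property". */
final class FactorySelectionSketch {
    /** Hypothetical property name, used only by this example. */
    static final String PROPERTY = "demo.cnxnFactory";

    interface CnxnFactory {
        String name();
    }

    static final class NioFactory implements CnxnFactory {
        public String name() { return "nio"; }
    }

    static final class NettyFactory implements CnxnFactory {
        public String name() { return "netty"; }
    }

    /** Falls back to the NIO stand-in when the property is not set. */
    static CnxnFactory create() {
        String className = System.getProperty(PROPERTY, NioFactory.class.getName());
        try {
            return (CnxnFactory) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default: " + create().name());
        System.setProperty(PROPERTY, NettyFactory.class.getName());
        System.out.println("overridden: " + create().name());
    }
}
```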
Let's look at the configure method of NIOServerCnxnFactory:
public void configure(InetSocketAddress addr, int maxcc, int backlog, boolean secure) throws IOException {
    if (secure) {
        throw new UnsupportedOperationException("SSL isn't supported in NIOServerCnxn");
    }
    configureSaslLogin();
    maxClientCnxns = maxcc;
    initMaxCnxns();
    // sessionlessCnxnTimeout is how long a connection may stay open without establishing a session
    sessionlessCnxnTimeout = Integer.getInteger(ZOOKEEPER_NIO_SESSIONLESS_CNXN_TIMEOUT, 10000);
    // We also use the sessionlessCnxnTimeout as expiring interval for
    // cnxnExpiryQueue. These don't need to be the same, but the expiring
    // interval passed into the ExpiryQueue() constructor below should be
    // less than or equal to the timeout.
    // Container that tracks the timeout state of every connection
    cnxnExpiryQueue = new ExpiryQueue<NIOServerCnxn>(sessionlessCnxnTimeout);
    // Thread that closes connections whose timeout has expired
    expirerThread = new ConnectionExpirerThread();
    int numCores = Runtime.getRuntime().availableProcessors();
    // 32 cores sweet spot seems to be 4 selector threads
    // Derive the number of selector threads from the number of available processors
    numSelectorThreads = Integer.getInteger(
        ZOOKEEPER_NIO_NUM_SELECTOR_THREADS,
        Math.max((int) Math.sqrt((float) numCores / 2), 1));
    if (numSelectorThreads < 1) {
        throw new IOException("numSelectorThreads must be at least 1");
    }
    // Number of server-side threads that handle I/O events
    numWorkerThreads = Integer.getInteger(ZOOKEEPER_NIO_NUM_WORKER_THREADS, 2 * numCores);
    workerShutdownTimeoutMS = Long.getLong(ZOOKEEPER_NIO_SHUTDOWN_TIMEOUT, 5000);
    String logMsg = "Configuring NIO connection handler with "
        + (sessionlessCnxnTimeout / 1000) + "s sessionless connection timeout, "
        + numSelectorThreads + " selector thread(s), "
        + (numWorkerThreads > 0 ? numWorkerThreads : "no") + " worker threads, and "
        + (directBufferBytes == 0 ? "gathered writes." : ("" + (directBufferBytes / 1024) + " kB direct buffers."));
    LOG.info(logMsg);
    // Create numSelectorThreads SelectorThread instances
    for (int i = 0; i < numSelectorThreads; ++i) {
        selectorThreads.add(new SelectorThread(i));
    }
    listenBacklog = backlog;
    // Create the server-side ServerSocketChannel and bind it to the given address
    this.ss = ServerSocketChannel.open();
    ss.socket().setReuseAddress(true);
    LOG.info("binding to port {}", addr);
    if (listenBacklog == -1) {
        ss.socket().bind(addr);
    } else {
        ss.socket().bind(addr, listenBacklog);
    }
    // Put the server-side ServerSocketChannel into non-blocking mode
    ss.configureBlocking(false);
    // Create the thread that accepts client connections
    acceptThread = new AcceptThread(ss, addr, selectorThreads);
}
The configure method of NIOServerCnxnFactory above creates three kinds of threads:
- ConnectionExpirerThread
- SelectorThread
- AcceptThread
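The thread-count defaults in configure are easier to judge with a few numbers plugged in. The sketch below copies the two expressions from the method into a standalone class (the class and method names are mine) so they can be evaluated for different core counts.

```java
/** Evaluates the default thread counts used in configure for a given core count. */
final class NioThreadSizingSketch {
    /** Same expression as in configure: floor(sqrt(cores / 2)), but never below 1. */
    static int selectorThreads(int numCores) {
        return Math.max((int) Math.sqrt((float) numCores / 2), 1);
    }

    /** Same expression as in configure: two worker threads per core. */
    static int workerThreads(int numCores) {
        return 2 * numCores;
    }

    public static void main(String[] args) {
        for (int cores : new int[] {1, 4, 8, 32, 64}) {
            System.out.println(cores + " cores -> "
                    + selectorThreads(cores) + " selector thread(s), "
                    + workerThreads(cores) + " worker threads");
        }
    }
}
```

For 32 cores this gives 4 selector threads, which matches the "sweet spot" comment in the source, and for 8 cores it gives 2.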
SelectorThread
Let's look at its constructor:
public SelectorThread(int id) throws IOException {
    super("NIOServerCxnFactory.SelectorThread-" + id);
    this.id = id;
    // Every accepted SocketChannel is added to acceptedQueue
    acceptedQueue = new LinkedBlockingQueue<SocketChannel>();
    // While handling I/O, ZooKeeper changes the interest ops registered on a SelectionKey.
    // Whenever a key needs its interest ops changed, the connection's SelectionKey is put into updateQueue
    updateQueue = new LinkedBlockingQueue<SelectionKey>();
}
AcceptThread
Here is its constructor:
public AcceptThread(ServerSocketChannel ss, InetSocketAddress addr, Set<SelectorThread> selectorThreads) throws IOException {
    // super creates the selector associated with this AcceptThread
    super("NIOServerCxnFactory.AcceptThread:" + addr);
    this.acceptSocket = ss;
    // The server-side ServerSocketChannel registers OP_ACCEPT on the selector, ready to accept client connections
    this.acceptKey = acceptSocket.register(selector, SelectionKey.OP_ACCEPT);
    // Every connection accepted later is assigned one selectorThread picked from selectorThreads;
    // that selectorThread handles all I/O events of the connection
    this.selectorThreads = Collections.unmodifiableList(new ArrayList<SelectorThread>(selectorThreads));
    selectorIterator = this.selectorThreads.iterator();
}
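The selectorIterator kept by the constructor is what later drives the round-robin assignment of connections to selector threads. A minimal sketch of that pattern, using a generic class of my own rather than ZooKeeper's types, could look like this:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hands out the elements of a fixed list in turn, restarting when the end is reached. */
final class RoundRobinSketch<T> {
    private final List<T> targets;
    private Iterator<T> iterator;

    RoundRobinSketch(Collection<T> targets) {
        if (targets.isEmpty()) {
            throw new IllegalArgumentException("need at least one target");
        }
        // Defensive, read-only copy, like the selectorThreads list in AcceptThread
        this.targets = Collections.unmodifiableList(new ArrayList<>(targets));
        this.iterator = this.targets.iterator();
    }

    /** Not thread-safe: meant to be called from a single thread, as the accept thread does. */
    T next() {
        if (!iterator.hasNext()) {
            iterator = targets.iterator(); // wrap around
        }
        return iterator.next();
    }

    public static void main(String[] args) {
        RoundRobinSketch<String> rr =
                new RoundRobinSketch<>(Arrays.asList("selector-0", "selector-1", "selector-2"));
        for (int i = 0; i < 5; i++) {
            System.out.println("connection " + i + " -> " + rr.next());
        }
    }
}
```

Because only one thread ever advances the iterator, no locking is needed, which is presumably why a plain Iterator is enough here.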
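ZooKeeper server startup
The threads created above have not been started yet, and some threads have not even been created.
NIOServerCnxnFactory.startup(ZooKeeperServer, boolean) creates and starts them, which completes the startup of the ZooKeeper server.
NIOServerCnxnFactory.startup source
@Override
public void startup(ZooKeeperServer zks, boolean startServer) throws IOException, InterruptedException {
    // Start the threads that were already created: acceptThread, selectorThread and so on
    start();
    setZooKeeperServer(zks);
    if (startServer) {
        // Data recovery: on startup ZooKeeper restores node data and session information
        // into memory from the persisted snapshot files and the transaction log
        zks.startdata();
        // Start the session tracker, the request processor chain and the other server components
        zks.startup();
    }
}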
start()
public void start() {
    stopped = false;
    if (workerPool == null) {
        // Create the thread pool that handles I/O events
        workerPool = new WorkerService("NIOWorker", numWorkerThreads, false);
    }
    // Start the SelectorThreads
    for (SelectorThread thread : selectorThreads) {
        if (thread.getState() == Thread.State.NEW) {
            thread.start();
        }
    }
    // Start the acceptThread
    // ensure thread is started once and only once
    if (acceptThread.getState() == Thread.State.NEW) {
        acceptThread.start();
    }
    // Start the thread that manages connection timeouts
    if (expirerThread.getState() == Thread.State.NEW) {
        expirerThread.start();
    }
}
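The Thread.State.NEW checks in start() are what make the method safe to call more than once: calling Thread.start() on a thread that has already been started throws IllegalThreadStateException. A small sketch of the guard (the helper name is mine):

```java
/** Shows the "start only if still NEW" guard used in start(). */
final class StartOnceSketch {
    /** Starts the thread if it has never been started; returns whether it did so. */
    static boolean startIfNew(Thread thread) {
        if (thread.getState() == Thread.State.NEW) {
            thread.start();
            return true;
        }
        return false; // already started (running or finished): nothing to do
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> System.out.println("worker running"));
        System.out.println("first call started it: " + startIfNew(worker));
        System.out.println("second call started it: " + startIfNew(worker));
        worker.join();
    }
}
```

Note that this check-then-start is not atomic on its own, so it relies on the guard not being run concurrently for the same thread.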
zookeeperServer.startup
public synchronized void startup() {
    if (sessionTracker == null) {
        // Create the sessionTracker, which manages session timeouts
        createSessionTracker();
    }
    // Start the sessionTracker
    startSessionTracker();
    // Set up the server-side request processor chain.
    // For standalone ZooKeeper the chain has 3 nodes:
    // PrepRequestProcessor --> SyncRequestProcessor --> FinalRequestProcessor
    // PrepRequestProcessor and SyncRequestProcessor each run in their own thread and receive requests through a queue
    setupRequestProcessors();
    // Create and start the RequestThrottler, which limits the number of
    // in-flight requests (rate limiting)
    startRequestThrottler();
    // Register the JMX beans
    registerJMX();
    // Start the JVM pause monitor
    startJvmPauseMonitor();
    // Register the metrics
    registerMetrics();
    // Mark the ZooKeeper server as running
    setState(State.RUNNING);
    requestPathMetricsCollector.start();
    localSessionEnabled = sessionTracker.isLocalSessionsEnabled();
    notifyAll();
}
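The processor chain set up by setupRequestProcessors() is the part of startup() that shapes request handling the most, so a toy version may help. The sketch below is not ZooKeeper's code: the Processor interface, QueuedStage, and FinalStage are simplified stand-ins that only show the idea of stages running in their own threads and passing requests along through queues.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy three-stage chain: prep (thread) -> sync (thread) -> final (caller's thread). */
final class ProcessorChainSketch {
    interface Processor {
        void process(String request);
        void shutdown();
    }

    /** A stage with its own thread: callers only enqueue, the thread forwards to the next stage. */
    static final class QueuedStage extends Thread implements Processor {
        private static final String STOP = new String("stop"); // sentinel, compared by identity
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        private final String label;
        private final Processor next;

        QueuedStage(String label, Processor next) {
            super(label);
            this.label = label;
            this.next = next;
        }

        @Override public void process(String request) { queue.add(request); }

        @Override public void run() {
            try {
                for (String r = queue.take(); r != STOP; r = queue.take()) {
                    next.process(r + " -> " + label);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        /** Drains this stage, waits for its thread, then shuts down the next stage. */
        @Override public void shutdown() {
            queue.add(STOP);
            try {
                join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            next.shutdown();
        }
    }

    /** Last stage: no thread of its own, it just records what reached it. */
    static final class FinalStage implements Processor {
        final List<String> handled = new CopyOnWriteArrayList<>();
        @Override public void process(String request) { handled.add(request + " -> final"); }
        @Override public void shutdown() { }
    }

    /** Builds the chain back to front, pushes the requests through, and returns what arrived. */
    static List<String> handle(String... requests) {
        FinalStage last = new FinalStage();
        QueuedStage sync = new QueuedStage("sync", last);
        QueuedStage prep = new QueuedStage("prep", sync);
        sync.start();
        prep.start();
        for (String request : requests) {
            prep.process(request);
        }
        prep.shutdown();
        return last.handled;
    }

    public static void main(String[] args) {
        handle("create /a", "setData /a").forEach(System.out::println);
    }
}
```

Each stage is a single consumer of a FIFO queue, so requests leave the chain in the order they entered it even though three threads are involved.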
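Starting ContainerManager
ContainerManager manages ZooKeeper's container nodes. A container node acts as a container that holds other nodes. Once all children of a container node have been deleted, ContainerManager, which scans at a fixed check interval, finds these empty container nodes and deletes them.
That is the overall startup process of the ZooKeeper server. Next I'll go through the most central server-side threads in detail:
- acceptThread
- selectorThread
acceptThread
After startup the server listens for client connection requests on port 2181 by default. This is done by acceptThread: it accepts a client connection and then assigns it a selectorThread that will handle it, a typical reactor model. Let's look at the source of acceptThread.run:
// As mentioned in the discussion of the AcceptThread constructor above, the server-side ServerSocketChannel registers OP_ACCEPT on the selector in that constructor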
public void run() {
    try {
        while (!stopped && !acceptSocket.socket().isClosed()) {
            try {
                // select() holds the core logic of acceptThread
                select();
            } catch (RuntimeException e) {
                LOG.warn("Ignoring unexpected runtime exception", e);
            } catch (Exception e) {
                LOG.warn("Ignoring unexpected exception", e);
            }
        }
    } finally {
        closeSelector();
        // This will wake up the selector threads, and tell the
        // worker thread pool to begin shutdown.
        if (!reconfiguring) {
            NIOServerCnxnFactory.this.stop();
        }
        LOG.info("accept thread exitted run method");
    }
}
- acceptThread.select
private void select() {
    try {
        // Wait for a client connection event
        selector.select();
        Iterator<SelectionKey> selectedKeys = selector.selectedKeys().iterator();
        while (!stopped && selectedKeys.hasNext()) {
            SelectionKey key = selectedKeys.next();
            selectedKeys.remove();
            if (!key.isValid()) {
                continue;
            }
            // Connection events are handled by doAccept
            if (key.isAcceptable()) {
                if (!doAccept()) {
                    // If unable to pull a new connection off the accept
                    // queue, pause accepting to give us time to free
                    // up file descriptors and so the accept thread
                    // doesn't spin in a tight loop.
                    pauseAccept(10);
                }
            } else {
                LOG.warn("Unexpected ops in accept select {}", key.readyOps());
            }
        }
    } catch (IOException e) {
        LOG.warn("Ignoring IOException while selecting", e);
    }
}
- acceptThread.doAccept()
private boolean doAccept() {
    boolean accepted = false;
    SocketChannel sc = null;
    try {
        // Get the SocketChannel of the connecting client
        sc = acceptSocket.accept();
        accepted = true;
        if (limitTotalNumberOfCnxns()) {
            throw new IOException("Too many connections max allowed is " + maxCnxns);
        }
        InetAddress ia = sc.socket().getInetAddress();
        // getClientCnxnCount returns the number of connections already open from the same IP
        int cnxncount = getClientCnxnCount(ia);
        // If that IP has already reached the configured per-client maximum, fail the connection with an error
        if (maxClientCnxns > 0 && cnxncount >= maxClientCnxns) {
            throw new IOException("Too many connections from " + ia + " - max is " + maxClientCnxns);
        }
        LOG.debug("Accepted socket connection from {}", sc.socket().getRemoteSocketAddress());
        sc.configureBlocking(false);
        // For an acceptable connection, pick a selectorThread from the list in round-robin order
        // Round-robin assign this connection to a selector thread
        if (!selectorIterator.hasNext()) {
            selectorIterator = selectorThreads.iterator();
        }
        SelectorThread selectorThread = selectorIterator.next();
        // Add the connection to the queue of the selectorThread it was assigned to
        if (!selectorThread.addAcceptedConnection(sc)) {
            throw new IOException("Unable to add connection to selector queue"
                + (stopped ? " (shutdown in progress)" : ""));
        }
        acceptErrorLogger.flush();
    } catch (IOException e) {
        // accept, maxClientCnxns, configureBlocking
        ServerMetrics.getMetrics().CONNECTION_REJECTED.add(1);
        acceptErrorLogger.rateLimitLog("Error accepting new connection: " + e.getMessage());
        fastCloseSock(sc);
    }
    return accepted;
}
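The per-IP check in doAccept can be tried in isolation. The sketch below is a deliberate simplification: it keys on the IP as a plain string and stores only a counter, and it assumes a single calling thread (as with the accept thread). The class and its methods are hypothetical and not how ZooKeeper tracks its connections internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified per-IP connection limiter; a limit of 0 or less means "unlimited". */
final class PerIpLimitSketch {
    private final int maxClientCnxns;
    private final Map<String, Integer> countsByIp = new HashMap<>();

    PerIpLimitSketch(int maxClientCnxns) {
        this.maxClientCnxns = maxClientCnxns;
    }

    /** Mirrors the check in doAccept: reject once the IP has reached the limit. */
    boolean tryAccept(String ip) {
        int current = countsByIp.getOrDefault(ip, 0);
        if (maxClientCnxns > 0 && current >= maxClientCnxns) {
            return false;
        }
        countsByIp.put(ip, current + 1);
        return true;
    }

    /** Call when a connection from this IP is closed. */
    void release(String ip) {
        countsByIp.computeIfPresent(ip, (key, count) -> count > 1 ? count - 1 : null);
    }

    public static void main(String[] args) {
        PerIpLimitSketch limiter = new PerIpLimitSketch(2);
        System.out.println(limiter.tryAccept("10.0.0.1")); // first connection
        System.out.println(limiter.tryAccept("10.0.0.1")); // second connection
        System.out.println(limiter.tryAccept("10.0.0.1")); // over the limit
        limiter.release("10.0.0.1");
        System.out.println(limiter.tryAccept("10.0.0.1")); // a slot is free again
    }
}
```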
Now let's analyze the source of SelectorThread.run:
// SelectorThread.run revolves around three important methods:
// select(), processAcceptedConnections(), processInterestOpsUpdateRequests()
public void run() {
    try {
        while (!stopped) {
            try {
                select();
                processAcceptedConnections();
                processInterestOpsUpdateRequests();
            } catch (RuntimeException e) {
                LOG.warn("Ignoring unexpected runtime exception", e);
            } catch (Exception e) {
                LOG.warn("Ignoring unexpected exception", e);
            }
        }
        // Close connections still pending on the selector. Any others
        // with in-flight work, let drain out of the work queue.
        for (SelectionKey key : selector.keys()) {
            NIOServerCnxn cnxn = (NIOServerCnxn) key.attachment();
            if (cnxn.isSelectable()) {
                cnxn.close(ServerCnxn.DisconnectReason.SERVER_SHUTDOWN);
            }
            cleanupSelectionKey(key);
        }
        SocketChannel accepted;
        while ((accepted = acceptedQueue.poll()) != null) {
            fastCloseSock(accepted);
        }
        updateQueue.clear();
    } finally {
        closeSelector();
        // This will wake up the accept thread and the other selector
        // threads, and tell the worker thread pool to begin shutdown.
        NIOServerCnxnFactory.this.stop();
        LOG.info("selector thread exitted run method");
    }
}
Of the three methods in SelectorThread.run, select(), processAcceptedConnections() and processInterestOpsUpdateRequests(), we cover processAcceptedConnections() first because that makes the rest easier to follow.
Before that, let's go back to selectorThread.addAcceptedConnection(), which AcceptThread.doAccept calls above:
public boolean addAcceptedConnection(SocketChannel accepted) {
    // A connection accepted by acceptThread is added to the selectorThread's acceptedQueue;
    // after that the selectorThread is woken up from selector.select()
    if (stopped || !acceptedQueue.offer(accepted)) {
        return false;
    }
    wakeupSelector();
    return true;
}
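The offer-then-wakeup handoff in addAcceptedConnection can be reproduced with a bare java.nio Selector. In the sketch below a string stands in for the accepted SocketChannel and the class name is mine; the point is only that a thread blocked in select() returns once another thread calls wakeup(), and then finds the item in the queue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** One thread blocks in select(); another enqueues an item and wakes it up. */
final class QueueWakeupSketch {
    static String handOff(String item) {
        try (Selector selector = Selector.open()) {
            Queue<String> acceptedQueue = new ConcurrentLinkedQueue<>();
            String[] received = new String[1];

            Thread selectorThread = new Thread(() -> {
                try {
                    while (received[0] == null) {
                        selector.select();                   // blocks until wakeup()
                        received[0] = acceptedQueue.poll();  // then drain the queue
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            selectorThread.start();

            // Same order as addAcceptedConnection: enqueue first, then wake the selector.
            acceptedQueue.offer(item);
            selector.wakeup();

            selectorThread.join();
            return received[0];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("selector thread received: " + handOff("connection-1"));
    }
}
```

Enqueuing before waking matters: if wakeup() happens before the other thread has entered select(), the next select() call returns immediately, and the item is already in the queue by then.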
- processAcceptedConnections
processAcceptedConnections handles the SocketChannels that were added to acceptedQueue:
private void processAcceptedConnections() {
    SocketChannel accepted;
    // Take SocketChannels off acceptedQueue one by one
    while (!stopped && (accepted = acceptedQueue.poll()) != null) {
        SelectionKey key = null;
        try {
            // Register OP_READ for the SocketChannel on this selectorThread's selector
            key = accepted.register(selector, SelectionKey.OP_READ);
            // Wrap the SocketChannel in an NIOServerCnxn, the connection object managed by the ZooKeeper server
            NIOServerCnxn cnxn = createConnection(accepted, key, this);
            key.attach(cnxn);
            // Hand the new NIOServerCnxn over to the server's bookkeeping; from now on
            // cnxnExpiryQueue is used to check its timeout periodically
            addCnxn(cnxn);
        } catch (IOException e) {
            // register, createConnection
            cleanupSelectionKey(key);
            fastCloseSock(accepted);
        }
    }
}
After processAcceptedConnections has run, the SocketChannel is registered for OP_READ on the selector of the selectorThread it was assigned to. Now let's look at the implementation of select:
private void select() {
    try {
        // Wait for I/O events on the registered channels
        selector.select();
        Set<SelectionKey> selected = selector.selectedKeys();
        ArrayList<SelectionKey> selectedList = new ArrayList<SelectionKey>(selected);
        Collections.shuffle(selectedList);
        Iterator<SelectionKey> selectedKeys = selectedList.iterator();
        while (!stopped && selectedKeys.hasNext()) {
            SelectionKey key = selectedKeys.next();
            selected.remove(key);
            if (!key.isValid()) {
                cleanupSelectionKey(key);
                continue;
            }
            if (key.isReadable() || key.isWritable()) {
                // handleIO handles the event
                handleIO(key);
            } else {
                LOG.warn("Unexpected ops in select {}", key.readyOps());
            }
        }
    } catch (IOException e) {
        LOG.warn("Ignoring IOException while selecting", e);
    }
}
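handleIO implementation
private void handleIO(SelectionKey key) {
    // ZooKeeper wraps the I/O event that occurred on a SocketChannel in an IOWorkRequest task,
    // which is then submitted to the workerPool thread pool for processing
    IOWorkRequest workRequest = new IOWorkRequest(this, key);
    NIOServerCnxn cnxn = (NIOServerCnxn) key.attachment();
    // Stop selecting this key while processing on its
    // connection
    cnxn.disableSelectable();
    // Clear the key's interest set, so the SocketChannel is interested in no events at all.
    // ZooKeeper handles the I/O events of a single connection one at a time, in order.
    // How does it achieve that?
    // Once an I/O event has been picked up for a SocketChannel, its key is set to be interested in nothing,
    // so the selector does not report later events on that channel; they simply stay pending.
    // Only after the current I/O event has been handled are interest ops registered again, and only then can the later events be processed
    key.interestOps(0);
    // Update the point in time at which the connection expires
    touchCnxn(cnxn);
    // Submit the IOWorkRequest task to the I/O worker thread pool
    workerPool.schedule(workRequest);
}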
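The effect of key.interestOps(0) can be observed with plain java.nio. The sketch below uses a Pipe instead of a socket so that it needs no network. It is my own demo, not ZooKeeper code: one byte is written and deliberately left unread, and the selector is polled before pausing, while paused, and after the interest ops are restored.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

/** Shows that a key with an empty interest set is not reported, even with data pending. */
final class InterestOpsPauseSketch {
    /** Returns the number of ready keys {before pausing, while paused, after resuming}. */
    static int[] demo() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.source().configureBlocking(false);
                SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);

                // One unread byte keeps the channel readable for the whole demo.
                pipe.sink().write(ByteBuffer.wrap(new byte[] {42}));

                int before = selector.select(1000);
                selector.selectedKeys().clear();

                key.interestOps(0);                     // what handleIO does
                int paused = selector.selectNow();

                key.interestOps(SelectionKey.OP_READ);  // what the update queue later restores
                int resumed = selector.select(1000);

                return new int[] {before, paused, resumed};
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] ready = demo();
        System.out.println("before=" + ready[0] + " paused=" + ready[1] + " resumed=" + ready[2]);
    }
}
```

While the interest set is empty the selector reports nothing for the key, yet the pending byte is not lost: as soon as OP_READ is restored, the key shows up as ready again.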
As described above, ZooKeeper gets the I/O events on one SocketChannel handled in order by clearing the interest ops of its key. So once an I/O event has been handled, how does the SocketChannel register its interest ops with the selector again?
That is the job of processInterestOpsUpdateRequests.
Let's look at its source:
// After an IOWorkRequest has handled an I/O event, it puts the connection's SelectionKey into updateQueue
private void processInterestOpsUpdateRequests() {
    SelectionKey key;
    // Drain all SelectionKeys from updateQueue
    while (!stopped && (key = updateQueue.poll()) != null) {
        if (!key.isValid()) {
            cleanupSelectionKey(key);
        }
        // Get the NIOServerCnxn connection object from the key's attachment
        NIOServerCnxn cnxn = (NIOServerCnxn) key.attachment();
        if (cnxn.isSelectable()) {
            // Register the interest ops on the key again; cnxn.getInterestOps derives them
            // from the connection's current read and write state
            key.interestOps(cnxn.getInterestOps());
        }
    }
}
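What "derived from the connection's read and write state" might mean can be sketched as a small pure function. The two inputs below are my assumptions about that state (whether reading is currently throttled, and whether responses are still waiting to be written); the class is hypothetical and is not the actual NIOServerCnxn implementation.

```java
import java.nio.channels.SelectionKey;

/** Hypothetical computation of a connection's interest set from its current state. */
final class InterestOpsSketch {
    /**
     * @param readThrottled    assumption: true while the server does not want more requests from this client
     * @param hasPendingWrites assumption: true while responses are still queued for this client
     */
    static int interestOps(boolean readThrottled, boolean hasPendingWrites) {
        int ops = 0;
        if (!readThrottled) {
            ops |= SelectionKey.OP_READ;
        }
        if (hasPendingWrites) {
            ops |= SelectionKey.OP_WRITE;
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.println("idle connection:     " + interestOps(false, false));
        System.out.println("responses queued:    " + interestOps(false, true));
        System.out.println("throttled and quiet: " + interestOps(true, false));
    }
}
```

Registering OP_WRITE only while something is queued avoids the selector waking up constantly, since a socket is writable almost all the time.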
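The above covers how ZooKeeper accepts connections and handles the I/O events on them; the figure below summarizes the process.
The server also starts threads that manage session and connection timeouts. They are not analyzed in detail here; see another article in this series:
ZooKeeper timeout object management implementation: ExpiryQueue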