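A humble foreword: in the previous article we went through the zk server startup source code along with some basic concepts and usage. This time we look at the business logic on the zk server side.
The previous article introduced the zk I/O flow diagram, and we keep following that same diagram here. Diagram first.
The diagram shows that the thread listening for connections is AcceptThread, the thread listening for reads and writes is SelectorThread, and the actual business logic is handled by WorkerService. Last time we also covered how these threads are created and started, so we can locate each thread class, find its run method, and read the source from there. Let's go through them one at a time.
First, a global search for AcceptThread shows that it is an inner class of NIOServerCnxnFactory. Find its run method and start reading from there. The overall logic is much the same as the plain NIO implementation I wrote earlier; zk just wraps it far more elaborately. Code first.
The code below listens for connections and then hands each socketChannel to a SelectorThread. The flow: the selector checks whether any connection is pending, the ready keys are fetched, and each key is checked for an accept event. If it is one, a thread is taken from the selectorThreads collection. (Note: the thread is obtained through an iterator. When the iterator has no next element, a fresh iterator is taken from selectorThreads; otherwise the next element is used. So thread selection is plain round-robin.) Finally the socketChannel is placed on that SelectorThread's acceptedQueue.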
public void run() {
    try {
        // AcceptThread: as long as the server is not stopped and the
        // listening channel is not closed, keep calling select() to
        // accept client connections.
        while (!stopped && !acceptSocket.socket().isClosed()) {
            try {
                select();
            } catch (RuntimeException e) {
                LOG.warn("Ignoring unexpected runtime exception", e);
            } catch (Exception e) {
                LOG.warn("Ignoring unexpected exception", e);
            }
        }
    } finally {
        closeSelector();
        // This will wake up the selector threads, and tell the
        // worker thread pool to begin shutdown.
        if (!reconfiguring) {
            NIOServerCnxnFactory.this.stop();
        }
        LOG.info("accept thread exitted run method");
    }
}
private void select() {
    try {
        // Block on the multiplexer until a connection is pending.
        selector.select();
        // Fetch the ready events through their SelectionKeys.
        Iterator<SelectionKey> selectedKeys = selector.selectedKeys().iterator();
        while (!stopped && selectedKeys.hasNext()) {
            SelectionKey key = selectedKeys.next();
            selectedKeys.remove();
            if (!key.isValid()) {
                continue;
            }
            // Is this an accept (new connection) event?
            if (key.isAcceptable()) {
                // Establish the connection.
                if (!doAccept()) {
                    // If unable to pull a new connection off the accept
                    // queue, pause accepting to give us time to free
                    // up file descriptors and so the accept thread
                    // doesn't spin in a tight loop.
                    pauseAccept(10);
                }
            } else {
                LOG.warn("Unexpected ops in accept select {}", key.readyOps());
            }
        }
    } catch (IOException e) {
        LOG.warn("Ignoring IOException while selecting", e);
    }
}
/**
 * Accept new socket connections. Enforces maximum number of connections
 * per client IP address. Round-robin assigns to selector thread for
 * handling. Returns whether pulled a connection off the accept queue
 * or not. If encounters an error attempts to fast close the socket.
 *
 * @return whether was able to accept a connection or not
 */
private boolean doAccept() {
    boolean accepted = false;
    SocketChannel sc = null;
    try {
        // acceptSocket is the ServerSocketChannel; accept() returns one
        // client connection.
        sc = acceptSocket.accept();
        accepted = true;
        if (limitTotalNumberOfCnxns()) {
            throw new IOException("Too many connections max allowed is " + maxCnxns);
        }
        InetAddress ia = sc.socket().getInetAddress();
        int cnxncount = getClientCnxnCount(ia);
        if (maxClientCnxns > 0 && cnxncount >= maxClientCnxns) {
            throw new IOException("Too many connections from " + ia + " - max is " + maxClientCnxns);
        }
        LOG.debug("Accepted socket connection from {}", sc.socket().getRemoteSocketAddress());
        // Put the client connection into non-blocking mode.
        sc.configureBlocking(false);
        // Round-robin assign this connection to a selector thread:
        // take the next SelectorThread from selectorThreads, wrapping
        // around to a fresh iterator when the current one is exhausted.
        if (!selectorIterator.hasNext()) {
            selectorIterator = selectorThreads.iterator();
        }
        SelectorThread selectorThread = selectorIterator.next();
        // Hand the connection to that selectorThread.
        // sc is the SocketChannel representing one client connection.
        if (!selectorThread.addAcceptedConnection(sc)) {
            throw new IOException("Unable to add connection to selector queue"
                                  + (stopped ? " (shutdown in progress)" : ""));
        }
        acceptErrorLogger.flush();
    } catch (IOException e) {
        // accept, maxClientCnxns, configureBlocking
        ServerMetrics.getMetrics().CONNECTION_REJECTED.add(1);
        acceptErrorLogger.rateLimitLog("Error accepting new connection: " + e.getMessage());
        fastCloseSock(sc);
    }
    return accepted;
}
/**
 * Place new accepted connection onto a queue for adding. Do this
 * so only the selector thread modifies what keys are registered
 * with the selector.
 */
public boolean addAcceptedConnection(SocketChannel accepted) {
    // Add the client connection to this SelectorThread's queue.
    if (stopped || !acceptedQueue.offer(accepted)) {
        return false;
    }
    wakeupSelector();
    return true;
}
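The round-robin hand-off described above can be tried in isolation. What follows is only a minimal sketch under stated assumptions: `RoundRobinSketch` and `FakeSelectorThread` are hypothetical names made up for illustration, connections are plain strings instead of `SocketChannel`s, and there is no real selector to wake up. Only the iterator-reset trick and the "offer onto the thread's queue" step mirror the listing.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

public class RoundRobinSketch {
    /** Hypothetical stand-in for SelectorThread: it only owns a queue of pending connections. */
    public static final class FakeSelectorThread {
        public final int id;
        public final ConcurrentLinkedQueue<String> acceptedQueue = new ConcurrentLinkedQueue<>();

        FakeSelectorThread(int id) {
            this.id = id;
        }

        boolean addAcceptedConnection(String conn) {
            // The real code would also wake up the selector here.
            return acceptedQueue.offer(conn);
        }
    }

    private final List<FakeSelectorThread> selectorThreads = new ArrayList<>();
    private Iterator<FakeSelectorThread> selectorIterator;

    public RoundRobinSketch(int threadCount) {
        for (int i = 0; i < threadCount; i++) {
            selectorThreads.add(new FakeSelectorThread(i));
        }
        selectorIterator = selectorThreads.iterator();
    }

    /** Hands the connection to the next thread and returns that thread's id. */
    public int assign(String conn) {
        // Same trick as doAccept(): start a fresh iterator once the old one is exhausted.
        if (!selectorIterator.hasNext()) {
            selectorIterator = selectorThreads.iterator();
        }
        FakeSelectorThread target = selectorIterator.next();
        target.addAcceptedConnection(conn);
        return target.id;
    }

    public static void main(String[] args) {
        RoundRobinSketch factory = new RoundRobinSketch(3);
        StringBuilder order = new StringBuilder();
        for (int i = 0; i < 7; i++) {
            order.append(factory.assign("conn-" + i));
        }
        System.out.println(order); // prints 0120120
    }
}
```

With three threads and seven connections the ids cycle 0, 1, 2 and start over, which is all "round-robin" means here.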
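Once the connection request has been handled, the next step is handling read and write events. The flow diagram shows that before those events are actually processed they are first dispatched and wrapped, and that happens in the SelectorThread. Code first.
In the code below, once the NIO part is done (the selector has returned its SelectionKeys and the read/write events behind them), handleIO wraps the I/O event in an IOWorkRequest object and submits it to workerService. Before submitting, it also refreshes the connection's expiry time, which is the ExpiryQueue operation shown in the flow diagram.
Note that in the previous step, when the ServerSocketChannel accepted the connection, the socketChannel was not registered on any selector; it was only placed on the acceptedQueue. That is why processAcceptedConnections exists: it takes each socketChannel off the queue, registers it on the selector, wraps it in an NIOServerCnxn object, and attaches that NIOServerCnxn to the SelectionKey, so that the next loop iteration can pick up read and write events on it. It also adds the connection to the ExpiryQueue.
One more point: workerService is not itself a thread pool. It is an ordinary class whose workers field holds the real pools (a list of ExecutorService instances, as the workers.get(...) call in schedule shows). When the IOWorkRequest is submitted, that is, when workerPool.schedule(workRequest); runs, the IOWorkRequest is wrapped once more in a ScheduledWorkRequest, and that object, a Runnable implementation, is what finally gets handed to an executor in workers. (The one-argument call in handleIO is presumably an overload that forwards to the two-argument schedule listed below.)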
/**
 * The main loop for the thread selects() on the connections and
 * dispatches ready I/O work requests, then registers all pending
 * newly accepted connections and updates any interest ops on the
 * queue.
 */
public void run() {
    try {
        while (!stopped) {
            try {
                // Important: the selector detects I/O events and
                // dispatches the matching work.
                select();
                // Handle newly accepted connections:
                // 1. register each new connection on the selector for OP_READ
                // 2. cache the client connection locally and add or
                //    update its entry in cnxnExpiryQueue
                processAcceptedConnections();
                processInterestOpsUpdateRequests();
            } catch (RuntimeException e) {
                LOG.warn("Ignoring unexpected runtime exception", e);
            } catch (Exception e) {
                LOG.warn("Ignoring unexpected exception", e);
            }
        }
        // Close connections still pending on the selector. Any others
        // with in-flight work, let drain out of the work queue.
        for (SelectionKey key : selector.keys()) {
            NIOServerCnxn cnxn = (NIOServerCnxn) key.attachment();
            if (cnxn.isSelectable()) {
                cnxn.close(ServerCnxn.DisconnectReason.SERVER_SHUTDOWN);
            }
            cleanupSelectionKey(key);
        }
        SocketChannel accepted;
        while ((accepted = acceptedQueue.poll()) != null) {
            fastCloseSock(accepted);
        }
        updateQueue.clear();
    } finally {
        closeSelector();
        // This will wake up the accept thread and the other selector
        // threads, and tell the worker thread pool to begin shutdown.
        NIOServerCnxnFactory.this.stop();
        LOG.info("selector thread exitted run method");
    }
}
private void select() {
    try {
        // Call select() on the NIO selector.
        selector.select();
        Set<SelectionKey> selected = selector.selectedKeys();
        ArrayList<SelectionKey> selectedList = new ArrayList<SelectionKey>(selected);
        Collections.shuffle(selectedList);
        Iterator<SelectionKey> selectedKeys = selectedList.iterator();
        while (!stopped && selectedKeys.hasNext()) {
            SelectionKey key = selectedKeys.next();
            selected.remove(key);
            if (!key.isValid()) {
                cleanupSelectionKey(key);
                continue;
            }
            // A read or write event was detected on a client connection.
            if (key.isReadable() || key.isWritable()) {
                // Go handle the I/O.
                handleIO(key);
            } else {
                LOG.warn("Unexpected ops in select {}", key.readyOps());
            }
        }
    } catch (IOException e) {
        LOG.warn("Ignoring IOException while selecting", e);
    }
}
/**
 * Schedule I/O for processing on the connection associated with
 * the given SelectionKey. If a worker thread pool is not being used,
 * I/O is run directly by this thread.
 */
private void handleIO(SelectionKey key) {
    // Wrap the I/O event in an IOWorkRequest.
    IOWorkRequest workRequest = new IOWorkRequest(this, key);
    // Fetch the connection object; NIOServerCnxn wraps the SocketChannel.
    NIOServerCnxn cnxn = (NIOServerCnxn) key.attachment();
    // *** Stop selecting this key while processing on its connection
    cnxn.disableSelectable();
    key.interestOps(0);
    // Refresh the connection's expiry information.
    touchCnxn(cnxn);
    // Submit the I/O event to the worker thread pool.
    workerPool.schedule(workRequest);
}
/**
 * Iterate over the queue of accepted connections that have been
 * assigned to this thread but not yet placed on the selector.
 */
private void processAcceptedConnections() {
    SocketChannel accepted;
    while (!stopped && (accepted = acceptedQueue.poll()) != null) {
        SelectionKey key = null;
        try {
            // Register the new connection for OP_READ events.
            key = accepted.register(selector, SelectionKey.OP_READ);
            // Wrap the SocketChannel in an NIOServerCnxn.
            NIOServerCnxn cnxn = createConnection(accepted, key, this);
            // Attach the NIOServerCnxn to the SelectionKey.
            key.attach(cnxn);
            // Cache the client connection locally and add or update
            // its expiry time in cnxnExpiryQueue (this is per client
            // connection).
            addCnxn(cnxn);
        } catch (IOException e) {
            // register, createConnection
            cleanupSelectionKey(key);
            fastCloseSock(accepted);
        }
    }
}
public void schedule(WorkRequest workRequest, long id) {
    if (stopped) {
        workRequest.cleanup();
        return;
    }
    // Wrap the WorkRequest (an IOWorkRequest on this path) in a Runnable.
    ScheduledWorkRequest scheduledWorkRequest = new ScheduledWorkRequest(workRequest);
    // If we have a worker thread pool, use that; otherwise, do the work
    // directly.
    int size = workers.size();
    if (size > 0) {
        try {
            // make sure to map negative ids as well to [0, size-1]
            // Pick the worker executor for this id.
            int workerNum = ((int) (id % size) + size) % size;
            ExecutorService worker = workers.get(workerNum);
            // Actually submit the task to the worker thread pool.
            worker.execute(scheduledWorkRequest);
        } catch (RejectedExecutionException e) {
            LOG.warn("ExecutorService rejected execution", e);
            workRequest.cleanup();
        }
    } else {
        // When there is no worker thread pool, do the work directly
        // and wait for its completion
        scheduledWorkRequest.run();
    }
}
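The index arithmetic in schedule is easy to skim past, so here is a small sketch of just that part. It assumes nothing about zk beyond the expression `((int) (id % size) + size) % size` from the listing; `WorkerIndexSketch`, `workerIndex` and `runDemo` are hypothetical names, and the single-thread executors are chosen only to make ordering visible, not because the article says how zk builds its workers list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class WorkerIndexSketch {
    /** Same expression as in schedule(): maps any id, negative included, to [0, size-1]. */
    public static int workerIndex(long id, int size) {
        return ((int) (id % size) + size) % size;
    }

    /** Submits three tasks under one id and returns the order in which they ran. */
    public static List<String> runDemo() {
        int size = 2;
        List<ExecutorService> workers = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            // Assumption for the demo: one thread per executor, so tasks run in submission order.
            workers.add(Executors.newSingleThreadExecutor());
        }
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        for (int step = 0; step < 3; step++) {
            final int s = step;
            // The same id always lands on the same executor.
            workers.get(workerIndex(7L, size)).execute(() -> log.add("id7-step" + s));
        }
        try {
            for (ExecutorService w : workers) {
                w.shutdown();
                w.awaitTermination(5, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(workerIndex(5, 4));  // prints 1
        System.out.println(workerIndex(-1, 4)); // prints 3
        System.out.println(workerIndex(-7, 4)); // prints 1
        System.out.println(runDemo());          // prints [id7-step0, id7-step1, id7-step2]
    }
}
```

In Java `%` keeps the sign of the dividend, so `-1 % 4` is `-1`; adding `size` and taking the remainder again is what folds negative ids back into a valid list index.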
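As said above, WorkerService is not a thread. It is just a class with a workers field holding the thread pools, and what gets submitted to those pools, as we saw, is a ScheduledWorkRequest object, which is a Runnable implementation. So the run method we should be reading is the one on ScheduledWorkRequest. Code first.
This is where the real reads and writes of the I/O event start to show up. They operate on the NIOServerCnxn object wrapped earlier, which carries the selectionKey and socketChannel information.
Note that this step also refreshes the connection's expiry time in the ExpiryQueue.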
/**
 * Handles the I/O for each client request.
 */
@Override
public void run() {
    try {
        // Check if stopped while request was on queue
        if (stopped) {
            workRequest.cleanup();
            return;
        }
        // Process the request.
        workRequest.doWork();
    } catch (Exception e) {
        LOG.warn("Unexpected exception", e);
        workRequest.cleanup();
    }
}
}
/**
 * Process the request.
 *
 * @throws InterruptedException
 */
public void doWork() throws InterruptedException {
    if (!key.isValid()) {
        selectorThread.cleanupSelectionKey(key);
        return;
    }
    // Handle read/write I/O.
    if (key.isReadable() || key.isWritable()) {
        // This is the method that performs the actual I/O.
        cnxn.doIO(key);
        // Check if we shutdown or doIO() closed this connection
        if (stopped) {
            cnxn.close(ServerCnxn.DisconnectReason.SERVER_SHUTDOWN);
            return;
        }
        if (!key.isValid()) {
            selectorThread.cleanupSelectionKey(key);
            return;
        }
        // Every request refreshes the connection's expiry time in
        // cnxnExpiryQueue.
        touchCnxn(cnxn);
    }
    // Mark this connection as once again ready for selection
    cnxn.enableSelectable();
    // Push an update request on the queue to resume selecting
    // on the current set of interest ops, which may have changed
    // as a result of the I/O operations we just performed.
    if (!selectorThread.addInterestOpsUpdateRequest(key)) {
        cnxn.close(ServerCnxn.DisconnectReason.CONNECTION_MODE_CHANGED);
    }
}
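Both handleIO and doWork call touchCnxn to push a connection's expiry time forward, but the ExpiryQueue itself is not shown in this article. To make the idea concrete, here is a deliberately simplified sketch of "touch to extend, poll to collect what expired". `SimpleExpiryTracker`, `touch` and `pollExpired` are hypothetical names invented for this example; this is not the zk ExpiryQueue class and makes no claim about how that class stores its entries. Timestamps are passed in explicitly so the behaviour is deterministic.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SimpleExpiryTracker<E> {
    // Element -> absolute deadline in milliseconds.
    private final Map<E, Long> deadlines = new ConcurrentHashMap<>();

    /** Adds the element or pushes its deadline forward to nowMs + timeoutMs. */
    public void touch(E elem, long nowMs, long timeoutMs) {
        deadlines.put(elem, nowMs + timeoutMs);
    }

    /** Removes and returns every element whose deadline is at or before nowMs. */
    public List<E> pollExpired(long nowMs) {
        List<E> expired = new ArrayList<>();
        Iterator<Map.Entry<E, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<E, Long> entry = it.next();
            if (entry.getValue() <= nowMs) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        SimpleExpiryTracker<String> tracker = new SimpleExpiryTracker<>();
        tracker.touch("cnxn-a", 0, 100);   // deadline 100
        tracker.touch("cnxn-b", 0, 100);   // deadline 100
        tracker.touch("cnxn-a", 80, 100);  // activity on a: deadline moves to 180
        System.out.println(tracker.pollExpired(120)); // prints [cnxn-b]
        System.out.println(tracker.pollExpired(180)); // prints [cnxn-a]
    }
}
```

A connection that keeps producing I/O keeps getting touched and so never shows up in the expired list; an idle one eventually does, which is the point of refreshing on every request.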
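To sum up, the code above is a detailed walkthrough of the flow diagram. The actual reads and writes are left for the next article, which is also where the hard part of zk lies; this part is comparatively simple. That's it for now.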