ZooKeeper Usage and Source Code Analysis (Part 2)

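        Preface: the previous article covered the startup source code of the zk server, along with some basic concepts and usage. This time we look at the zk server's business logic.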


        In the previous article we went through the zk I/O flow diagram, and this time we keep working from that same diagram. Here it is first.

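The flow diagram shows that AcceptThread listens for connections, SelectorThread listens for read and write events, and WorkerService handles the actual business logic. Last time we also covered how these threads are created and started, so we can locate each thread object, find its run method, and read the source from there. Let's go through them one by one.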

AcceptThread

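        First, a global search for AcceptThread shows that it is an inner class of NIOServerCnxnFactory. From there we find the run method and start reading. The overall logic is much the same as the NIO implementation I wrote earlier; zk just wraps it far more elaborately. Here is the code.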

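        The code below listens for connections and then binds each socketChannel to a SelectorThread. The flow: the selector checks whether a connection has arrived; the ready events are fetched; each event is checked to see whether it is an accept event; if so, a thread is taken from the selectorThreads collection; finally the socketChannel is bound to that SelectorThread. Note how the thread is chosen: it comes from an iterator. When the iterator has no next element, a fresh iterator is obtained from selectorThreads; otherwise the next element is taken. In other words, thread selection is plain round-robin.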

public void run() {
	try {
		/**
		 * AcceptThread: as long as the server is not stopped and the channel is not closed,
		 * keep calling select() to accept client connections
		 */
		while (!stopped && !acceptSocket.socket().isClosed()) {
			try {
				select();
			} catch (RuntimeException e) {
				LOG.warn("Ignoring unexpected runtime exception", e);
			} catch (Exception e) {
				LOG.warn("Ignoring unexpected exception", e);
			}
		}
	} finally {
		closeSelector();
		// This will wake up the selector threads, and tell the
		// worker thread pool to begin shutdown.
		if (!reconfiguring) {
			NIOServerCnxnFactory.this.stop();
		}
		LOG.info("accept thread exitted run method");
	}
}

private void select() {
	try {
		/**
		 * Use the selector (multiplexer) to check whether any connection has arrived
		 */
		selector.select();
		/**
		 * Get the ready events via their SelectionKeys
		 */
		Iterator<SelectionKey> selectedKeys = selector.selectedKeys().iterator();
		while (!stopped && selectedKeys.hasNext()) {
			SelectionKey key = selectedKeys.next();
			selectedKeys.remove();

			if (!key.isValid()) {
				continue;
			}
			/**
			 * Check whether this is an accept (connection) event
			 */
			if (key.isAcceptable()) {
				/**
				 * Accept the connection
				 */
				if (!doAccept()) {
					// If unable to pull a new connection off the accept
					// queue, pause accepting to give us time to free
					// up file descriptors and so the accept thread
					// doesn't spin in a tight loop.
					pauseAccept(10);
				}
			} else {
				LOG.warn("Unexpected ops in accept select {}", key.readyOps());
			}
		}
	} catch (IOException e) {
		LOG.warn("Ignoring IOException while selecting", e);
	}
}

/**
 * Accept new socket connections. Enforces maximum number of connections
 * per client IP address. Round-robin assigns to selector thread for
 * handling. Returns whether pulled a connection off the accept queue
 * or not. If encounters an error attempts to fast close the socket.
 *
 * @return whether was able to accept a connection or not
 */
private boolean doAccept() {
	boolean accepted = false;
	SocketChannel sc = null;
	try {
		/**
		 * acceptSocket is the ServerSocketChannel; accept() takes one client connection
		 */
		sc = acceptSocket.accept();
		accepted = true;
		if (limitTotalNumberOfCnxns()) {
			throw new IOException("Too many connections max allowed is " + maxCnxns);
		}
		InetAddress ia = sc.socket().getInetAddress();
		int cnxncount = getClientCnxnCount(ia);

		if (maxClientCnxns > 0 && cnxncount >= maxClientCnxns) {
			throw new IOException("Too many connections from " + ia + " - max is " + maxClientCnxns);
		}

		LOG.debug("Accepted socket connection from {}", sc.socket().getRemoteSocketAddress());

		sc.configureBlocking(false); // put this client connection into non-blocking mode

		// Round-robin assign this connection to a selector thread
		/**
		 * Take one SelectorThread from selectorThreads to bind this client connection to
		 */
		if (!selectorIterator.hasNext()) {
			selectorIterator = selectorThreads.iterator();
		}
		SelectorThread selectorThread = selectorIterator.next();
		/**
		 * Bind the connection to that selectorThread.
		 * sc is the SocketChannel representing one client's connection
		 */
		if (!selectorThread.addAcceptedConnection(sc)) {
			throw new IOException("Unable to add connection to selector queue"
								  + (stopped ? " (shutdown in progress)" : ""));
		}
		acceptErrorLogger.flush();
	} catch (IOException e) {
		// accept, maxClientCnxns, configureBlocking
		ServerMetrics.getMetrics().CONNECTION_REJECTED.add(1);
		acceptErrorLogger.rateLimitLog("Error accepting new connection: " + e.getMessage());
		fastCloseSock(sc);
	}
	return accepted;
}

/**
 * Place new accepted connection onto a queue for adding. Do this
 * so only the selector thread modifies what keys are registered
 * with the selector.
 */
public boolean addAcceptedConnection(SocketChannel accepted) {
	/**
	 * Add a client connection to this SelectorThread's queue
	 */
	if (stopped || !acceptedQueue.offer(accepted)) {
		return false;
	}
	wakeupSelector();
	return true;
}
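
The round-robin assignment in doAccept can be isolated into a small sketch. This is only an illustration: `RoundRobinPicker` and `RoundRobinDemo` are hypothetical names, not ZooKeeper classes, and plain strings stand in for the SelectorThread objects.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper (not a ZooKeeper class): hands out items in round-robin order.
class RoundRobinPicker<T> {
    private final List<T> items;
    private Iterator<T> iterator;

    RoundRobinPicker(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("items must not be empty");
        }
        this.items = items;
        this.iterator = items.iterator();
    }

    // Same shape as in doAccept(): start a fresh iterator once the current one is exhausted.
    T next() {
        if (!iterator.hasNext()) {
            iterator = items.iterator();
        }
        return iterator.next();
    }
}

class RoundRobinDemo {
    public static void main(String[] args) {
        RoundRobinPicker<String> picker =
                new RoundRobinPicker<>(Arrays.asList("selector-0", "selector-1", "selector-2"));
        // Five picks wrap around after the third one.
        for (int i = 0; i < 5; i++) {
            System.out.println(picker.next());
        }
    }
}
```

The picker is not thread-safe. In the listing above that is acceptable because doAccept is only called from the accept thread's own loop.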

SelectorThread

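        Once the connection request has been handled, the next step is handling read and write events. According to the flow diagram above, though, those events are first dispatched and wrapped before they are actually processed, and that is the job done inside the SelectorThread. Here is the code.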

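        As the code below shows, once the NIO part is done, that is, once the selector has returned its selectionKeys and the read/write events have been identified, handleIO wraps the I/O event in an IOWorkRequest object and submits it to the workerService. Before submitting the task it also refreshes the connection's expiry time, which is the operation on the ExpiryQueue shown in the flow diagram.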

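        Note that in the previous step, when the ServerSocketChannel accepted the connection, the socketChannel was not registered with any selector; it was only placed on the acceptedQueue. That is why there is a processAcceptedConnections method: it takes each socketChannel off the queue, registers it with the selector, wraps it in an NIOServerCnxn object, and sets that NIOServerCnxn as the SelectionKey's attachment. From the next loop iteration on, the selector can pick up read and write events on that connection. The method also adds the connection to the ExpiryQueue.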

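        One more point: workerService is not itself a thread pool. It is just a class, and its workers attribute is what holds the real thread pool. When the IOWorkRequest built above is submitted to the workerService, that is, when workerPool.schedule(workRequest); runs, the IOWorkRequest is wrapped once more in a ScheduledWorkRequest object, and that object is what gets submitted to the workers thread pool. The task is a Runnable implementation, namely the ScheduledWorkRequest.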

/**
 * The main loop for the thread selects() on the connections and
 * dispatches ready I/O work requests, then registers all pending
 * newly accepted connections and updates any interest ops on the
 * queue.
 */
public void run() {
	try {
		while (!stopped) {
			try {
				/**
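				 * Have the selector detect I/O events and dispatch them
				 * (important)
				 */
				select();
				/**
				 * Handle newly accepted connections:
				 * 1. register each new connection with the selector, listening for OP_READ
				 * 2. cache the connection locally, adding or updating it in cnxnExpiryQueue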
				 */
				processAcceptedConnections();

				processInterestOpsUpdateRequests();
			} catch (RuntimeException e) {
				LOG.warn("Ignoring unexpected runtime exception", e);
			} catch (Exception e) {
				LOG.warn("Ignoring unexpected exception", e);
			}
		}

		// Close connections still pending on the selector. Any others
		// with in-flight work, let drain out of the work queue.
		for (SelectionKey key : selector.keys()) {
			NIOServerCnxn cnxn = (NIOServerCnxn) key.attachment();
			if (cnxn.isSelectable()) {
				cnxn.close(ServerCnxn.DisconnectReason.SERVER_SHUTDOWN);
			}
			cleanupSelectionKey(key);
		}
		SocketChannel accepted;
		while ((accepted = acceptedQueue.poll()) != null) {
			fastCloseSock(accepted);
		}
		updateQueue.clear();
	} finally {
		closeSelector();
		// This will wake up the accept thread and the other selector
		// threads, and tell the worker thread pool to begin shutdown.
		NIOServerCnxnFactory.this.stop();
		LOG.info("selector thread exitted run method");
	}
}

private void select() {
	try {
		/**
		 * Call select() on the NIO selector
		 */
		selector.select();

		Set<SelectionKey> selected = selector.selectedKeys();
		ArrayList<SelectionKey> selectedList = new ArrayList<SelectionKey>(selected);
		Collections.shuffle(selectedList);
		Iterator<SelectionKey> selectedKeys = selectedList.iterator();
		while (!stopped && selectedKeys.hasNext()) {
			SelectionKey key = selectedKeys.next();
			selected.remove(key);

			if (!key.isValid()) {
				cleanupSelectionKey(key);
				continue;
			}
			/**
			 * A read or write event was detected on a client connection
			 */
			if (key.isReadable() || key.isWritable()) {
				/**
				 * Go handle the I/O
				 */
				handleIO(key);
			} else {
				LOG.warn("Unexpected ops in select {}", key.readyOps());
			}
		}
	} catch (IOException e) {
		LOG.warn("Ignoring IOException while selecting", e);
	}
}

/**
 * Schedule I/O for processing on the connection associated with
 * the given SelectionKey. If a worker thread pool is not being used,
 * I/O is run directly by this thread.
 */
private void handleIO(SelectionKey key) {
	/**
	 * Wrap the I/O event in an IOWorkRequest object
	 */
	IOWorkRequest workRequest = new IOWorkRequest(this, key);
	/**
	 * Get the connection for this key;
	 * NIOServerCnxn wraps the SocketChannel
	 */
	NIOServerCnxn cnxn = (NIOServerCnxn) key.attachment();

	// *** Stop selecting this key while processing on its connection
	cnxn.disableSelectable();
	key.interestOps(0);

	// Update the connection's expiry information
	touchCnxn(cnxn);
	/**
	 * Submit the I/O event to the worker thread pool for execution
	 */
	workerPool.schedule(workRequest);
}

/**
 * Iterate over the queue of accepted connections that have been
 * assigned to this thread but not yet placed on the selector.
 */
private void processAcceptedConnections() {
	SocketChannel accepted;
	while (!stopped && (accepted = acceptedQueue.poll()) != null) {
		SelectionKey key = null;
		try {
			/**
			 * Register the new connection for OP_READ events
			 */
			key = accepted.register(selector, SelectionKey.OP_READ);
			/**
			 * Wrap the SocketChannel in an NIOServerCnxn
			 */
			NIOServerCnxn cnxn = createConnection(accepted, key, this);
			key.attach(cnxn); // use the NIOServerCnxn as the SelectionKey's attachment
			/**
			 * Cache the client connection locally and
			 * add or update its expiry time in cnxnExpiryQueue;
			 * this applies per client connection
			 */
			addCnxn(cnxn);
		} catch (IOException e) {
			// register, createConnection
			cleanupSelectionKey(key);
			fastCloseSock(accepted);
		}
	}
}
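
The hand-off between addAcceptedConnection and processAcceptedConnections can be tried out without a network. The following is a minimal sketch under stated assumptions: `HandOffSelector` and `HandOffDemo` are hypothetical names, a `java.nio.channels.Pipe` source channel stands in for the client SocketChannel, and a string stands in for the NIOServerCnxn attachment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for SelectorThread; only the queue hand-off is modelled.
class HandOffSelector {
    private final Selector selector;
    private final Queue<SelectableChannel> acceptedQueue = new LinkedBlockingQueue<>();

    HandOffSelector() throws IOException {
        this.selector = Selector.open();
    }

    // Called by the accepting thread: enqueue only, never register with the selector here.
    boolean addAcceptedConnection(SelectableChannel channel) {
        if (!acceptedQueue.offer(channel)) {
            return false;
        }
        selector.wakeup(); // unblock a pending select() so the queue gets drained soon
        return true;
    }

    // Called by the selector's own thread: register every queued channel for reads.
    int processAcceptedConnections() throws IOException {
        int registered = 0;
        SelectableChannel channel;
        while ((channel = acceptedQueue.poll()) != null) {
            channel.configureBlocking(false);
            SelectionKey key = channel.register(selector, SelectionKey.OP_READ);
            key.attach("connection-" + registered); // stand-in for the NIOServerCnxn
            registered++;
        }
        return registered;
    }

    int registeredKeys() {
        return selector.keys().size();
    }

    void close() throws IOException {
        selector.close();
    }
}

class HandOffDemo {
    // Returns {keys registered before draining the queue, keys registered after}.
    static int[] runDemo() {
        try {
            HandOffSelector selectorThread = new HandOffSelector();
            Pipe pipe = Pipe.open();
            selectorThread.addAcceptedConnection(pipe.source());
            int beforeDrain = selectorThread.registeredKeys();
            selectorThread.processAcceptedConnections();
            int afterDrain = selectorThread.registeredKeys();
            selectorThread.close();
            pipe.source().close();
            pipe.sink().close();
            return new int[] {beforeDrain, afterDrain};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = runDemo();
        System.out.println("before drain: " + counts[0] + ", after drain: " + counts[1]);
    }
}
```

Queuing the channel instead of registering it directly means only the selector's own thread changes which keys are registered, which is the reason given in the Javadoc of addAcceptedConnection above.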

public void schedule(WorkRequest workRequest, long id) {
	if (stopped) {
		workRequest.cleanup();
		return;
	}

	/**
	 * Wrap the WorkRequest (an IOWorkRequest by default) in a Runnable
	 */
	ScheduledWorkRequest scheduledWorkRequest = new ScheduledWorkRequest(workRequest);

	// If we have a worker thread pool, use that; otherwise, do the work
	// directly.
	int size = workers.size();
	if (size > 0) {
		try {
			// make sure to map negative ids as well to [0, size-1]
			/**
			 * Pick the worker executor for this id
			 */
			int workerNum = ((int) (id % size) + size) % size;
			ExecutorService worker = workers.get(workerNum);
			/**
			 * Actually submit the task to the worker thread pool for execution
			 */
			worker.execute(scheduledWorkRequest);
		} catch (RejectedExecutionException e) {
			LOG.warn("ExecutorService rejected execution", e);
			workRequest.cleanup();
		}
	} else {
		// When there is no worker thread pool, do the work directly
		// and wait for its completion
		scheduledWorkRequest.run();
	}
}
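
The index arithmetic in schedule is worth checking on its own, because Java's `%` keeps the sign of the dividend. The sketch below copies only that expression; `WorkerPickDemo` is a hypothetical class, and the three single-thread executors are an assumption made for the demo, not how ZooKeeper necessarily sizes its pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class WorkerPickDemo {
    // Same expression as in schedule(): maps any long id, negatives included, to [0, size-1].
    static int workerIndex(long id, int size) {
        return ((int) (id % size) + size) % size;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption for the demo: one single-thread executor per slot.
        List<ExecutorService> workers = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            workers.add(Executors.newSingleThreadExecutor());
        }

        long[] ids = {0, 4, -1};
        for (long id : ids) {
            int workerNum = workerIndex(id, workers.size());
            workers.get(workerNum).execute(
                    () -> System.out.println("id " + id + " ran on worker " + workerNum));
        }

        for (ExecutorService worker : workers) {
            worker.shutdown();
            worker.awaitTermination(1, TimeUnit.SECONDS);
        }
    }
}
```

Without the extra `+ size` step, an id of -1 would produce index -1 and `workers.get` would throw. With it, the same id always lands on the same slot.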

WorkerService

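        As noted above, WorkerService is not a thread; it is just a class with a workers thread-pool attribute. As we saw, what gets submitted to that pool is a ScheduledWorkRequest, which implements Runnable, so the run method we need to read is the one on ScheduledWorkRequest. Here is the code.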

        This is where the actual I/O reads and writes start to appear. The object being operated on is the NIOServerCnxn wrapped earlier, which carries the selectionKey and socketChannel information.

        Note that this code also does one more thing: it refreshes the connection's expiry time in the ExpiryQueue.

/**
 * Handles the I/O for each client request
 */
@Override
public void run() {
	try {
		// Check if stopped while request was on queue
		if (stopped) {
			workRequest.cleanup();
			return;
		}
		/**
		 * Process the request
		 */
		workRequest.doWork();
	} catch (Exception e) {
		LOG.warn("Unexpected exception", e);
		workRequest.cleanup();
	}
}


/**
 * Process the request
 * @throws InterruptedException
 */
public void doWork() throws InterruptedException {
	if (!key.isValid()) {
		selectorThread.cleanupSelectionKey(key);
		return;
	}
	/**
	 * Handle read/write I/O
	 */
	if (key.isReadable() || key.isWritable()) {
		/**
		 * The method that performs the actual I/O
		 */
		cnxn.doIO(key);

		// Check if we shutdown or doIO() closed this connection
		if (stopped) {
			cnxn.close(ServerCnxn.DisconnectReason.SERVER_SHUTDOWN);
			return;
		}
		if (!key.isValid()) {
			selectorThread.cleanupSelectionKey(key);
			return;
		}
		// Every incoming request refreshes the connection's expiry time in cnxnExpiryQueue
		touchCnxn(cnxn);
	}

	// Mark this connection as once again ready for selection
	cnxn.enableSelectable();
	// Push an update request on the queue to resume selecting
	// on the current set of interest ops, which may have changed
	// as a result of the I/O operations we just performed.
	if (!selectorThread.addInterestOpsUpdateRequest(key)) {
		cnxn.close(ServerCnxn.DisconnectReason.CONNECTION_MODE_CHANGED);
	}
}
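
handleIO clears the key's interest ops before handing it to a worker, and doWork asks the selector thread to restore them afterwards. A rough sketch of that pause-and-resume cycle follows. `InterestOpsDemo` is a hypothetical class, a `Pipe` source channel replaces the client socket, and the sketch always resumes with OP_READ, which is a simplification of the "current set of interest ops" mentioned in the comment above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

class InterestOpsDemo {
    private final Queue<SelectionKey> updateQueue = new LinkedBlockingQueue<>();

    // Selector thread, before dispatching to a worker: stop selecting this key.
    void pauseSelection(SelectionKey key) {
        key.interestOps(0);
    }

    // Worker thread, after its I/O is done: queue a request and wake the selector.
    void requestResume(SelectionKey key) {
        updateQueue.offer(key);
        key.selector().wakeup();
    }

    // Selector thread: apply the queued requests (simplified to OP_READ only).
    void processUpdates() {
        SelectionKey key;
        while ((key = updateQueue.poll()) != null) {
            if (key.isValid()) {
                key.interestOps(SelectionKey.OP_READ);
            }
        }
    }

    // Returns the interest ops {when registered, while paused, after resuming}.
    static int[] runDemo() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);

            InterestOpsDemo demo = new InterestOpsDemo();
            int registered = key.interestOps();
            demo.pauseSelection(key);
            int paused = key.interestOps();
            demo.requestResume(key);
            demo.processUpdates();
            int resumed = key.interestOps();

            pipe.source().close();
            pipe.sink().close();
            return new int[] {registered, paused, resumed};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] ops = runDemo();
        System.out.println("registered=" + ops[0] + " paused=" + ops[1] + " resumed=" + ops[2]);
    }
}
```

While the interest ops are 0 the selector will not report the key again, so two workers never end up processing the same connection at once.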

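        To sum up, the code above is a detailed walk-through of the flow diagram. The actual reads and writes are left for the next article, and that is where the real difficulty of zk lies; the part covered here is comparatively simple. That's it for this one.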
