AbstractServerThread
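Both Acceptor and Processor extend AbstractServerThread, an abstract class that implements Runnable. It provides the startup and shutdown control methods that both subclasses share.
Important fields of AbstractServerThread:
- alive: whether the thread is still running. It is initialized to true and set to false in shutdown().
- shutdownLatch: a CountDownLatch with a count of 1 that indicates whether the thread's shutdown has completed.
- startupLatch: a CountDownLatch with a count of 1 that indicates whether the thread's startup has completed.
- connectionQuotas: records the number of open connections. The close() methods close the connection (looked up by connectionId, or passed in as a SocketChannel) and decrement the count recorded in connectionQuotas.
awaitStartup() and shutdown() call CountDownLatch.await() to block until startup or shutdown has completed. startupComplete() and shutdownComplete() call CountDownLatch.countDown() to wake the blocked thread.
The commonly used methods of AbstractServerThread: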
/**
 * A base class with some helper variables and methods
 */
private[kafka] abstract class AbstractServerThread(connectionQuotas: ConnectionQuotas) extends Runnable with Logging {
  private val startupLatch = new CountDownLatch(1)
  private val shutdownLatch = new CountDownLatch(1)
  private val alive = new AtomicBoolean(true)
  // Abstract method, implemented by Acceptor and Processor
  def wakeup()
  /**
   * Initiates a graceful shutdown by signaling to stop and waiting for the shutdown to complete
   */
  def shutdown(): Unit = {
    alive.set(false) // flip the running flag
    wakeup() // wake this AbstractServerThread so that it notices the flag
    shutdownLatch.await() // block until the shutdown has completed
  }
  /**
   * Wait for the thread to completely start up
   * (blocks until startupComplete() has been called)
   */
  def awaitStartup(): Unit = startupLatch.await
  /**
   * Record that the thread startup is complete
   * (also wakes any thread blocked in awaitStartup())
   */
  protected def startupComplete() = {
    startupLatch.countDown()
  }
  /**
   * Record that the thread shutdown is complete
   * (also wakes the thread blocked in shutdown())
   */
  protected def shutdownComplete() = shutdownLatch.countDown()
  /**
   * Is the server still running?
   */
  protected def isRunning = alive.get
  /**
   * Close the connection identified by `connectionId` and decrement the connection count.
   */
  def close(selector: KSelector, connectionId: String) {
    val channel = selector.channel(connectionId)
    if (channel != null) {
      debug(s"Closing selector connection $connectionId")
      val address = channel.socketAddress
      if (address != null)
        connectionQuotas.dec(address) // update the count recorded in connectionQuotas
      selector.close(connectionId) // close the connection
    }
  }
  /**
   * Close `channel` and decrement the connection count.
   */
  def close(channel: SocketChannel) {
    if (channel != null) {
      debug("Closing connection from " + channel.socket.getRemoteSocketAddress())
      connectionQuotas.dec(channel.socket.getInetAddress)
      swallowError(channel.socket().close())
      swallowError(channel.close())
    }
  }
}
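The latch handshake described above can be tried in isolation. The following is a minimal sketch, not Kafka code: `LatchLifecycleDemo` and `Worker` are hypothetical names, and a short `Thread.sleep` stands in for the blocking select call, which is why no `wakeup()` is needed here.

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicBoolean

object LatchLifecycleDemo {
  // Hypothetical stand-in for AbstractServerThread; only the latch logic is kept.
  class Worker extends Runnable {
    private val startupLatch = new CountDownLatch(1)
    private val shutdownLatch = new CountDownLatch(1)
    private val alive = new AtomicBoolean(true)

    // Blocks the caller until run() has signalled that startup is done.
    def awaitStartup(): Unit = startupLatch.await()

    // Flips the flag, then blocks until run() has left its loop.
    def shutdown(): Unit = {
      alive.set(false)
      shutdownLatch.await()
    }

    def isShutdownComplete: Boolean = shutdownLatch.getCount == 0

    override def run(): Unit = {
      startupLatch.countDown() // plays the role of startupComplete()
      try {
        while (alive.get) {
          Thread.sleep(10) // assumption: stands in for selector.select(timeout)
        }
      } finally {
        shutdownLatch.countDown() // plays the role of shutdownComplete()
      }
    }
  }

  // Starts a worker, waits for startup, shuts it down, and reports the latch state.
  def runDemo(): Boolean = {
    val worker = new Worker
    new Thread(worker, "demo-worker").start()
    worker.awaitStartup()
    worker.shutdown()
    worker.isShutdownComplete
  }

  def main(args: Array[String]): Unit =
    println(s"shutdown complete: ${runDemo()}")
}
```

Because `shutdown()` only returns after the latch has been counted down, the caller can rely on the worker's cleanup having finished by then.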
Acceptor
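The Acceptor accepts connection requests from clients, creates the socket connection, and assigns it to a Processor.
Important fields:
- nioSelector: a Java NIO Selector.
- serverChannel: the ServerSocketChannel used to accept client connections.
Both fields are initialized when the Acceptor is created. The constructor also creates and starts a thread for each Processor that the Acceptor manages.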
/**
 * Thread that accepts and configures new connections. There is one of these per endpoint.
 */
private[kafka] class Acceptor(val endPoint: EndPoint,
                              val sendBufferSize: Int,
                              val recvBufferSize: Int,
                              brokerId: Int,
                              processors: Array[Processor],
                              connectionQuotas: ConnectionQuotas) extends AbstractServerThread(connectionQuotas) with KafkaMetricsGroup {
  // Create the NIO selector
  private val nioSelector = NSelector.open()
  // Create the ServerSocketChannel
  val serverChannel = openServerSocket(endPoint.host, endPoint.port)
  this.synchronized {
    // Create and start one thread for each Processor that belongs to this Acceptor
    processors.foreach { processor =>
      Utils.newThread("kafka-network-thread-%d-%s-%d".format(brokerId, endPoint.protocolType.toString, processor.id), processor, false).start()
    }
  }
Acceptor.run() contains the Acceptor's core logic: it handles OP_ACCEPT events. The code is as follows:
/**
 * Accept loop that checks for new connection attempts
 */
def run() {
  // Register for OP_ACCEPT events
  serverChannel.register(nioSelector, SelectionKey.OP_ACCEPT)
  startupComplete() // mark this thread's startup as complete
  try {
    var currentProcessor = 0
    while (isRunning) { // check the running flag
      try {
        val ready = nioSelector.select(500) // wait for the registered events
        if (ready > 0) {
          val keys = nioSelector.selectedKeys()
          val iter = keys.iterator()
          while (iter.hasNext && isRunning) {
            try {
              val key = iter.next
              iter.remove()
              if (key.isAcceptable) // handle the OP_ACCEPT event with accept()
                accept(key, processors(currentProcessor))
              else // anything other than OP_ACCEPT is an error
                throw new IllegalStateException("Unrecognized key state for acceptor thread.")
              // round robin to the next processor thread
              currentProcessor = (currentProcessor + 1) % processors.length
            } catch {
              case e: Throwable => error("Error while accepting connection", e)
            }
          }
        }
      }
      catch {
        // We catch all the throwables to prevent the acceptor thread from exiting on exceptions due
        // to a select operation on a specific channel or a bad request. We don't want the
        // the broker to stop responding to requests from other clients in these scenarios.
        case e: ControlThrowable => throw e
        case e: Throwable => error("Error occurred", e)
      }
    }
  } finally {
    debug("Closing server socket and selector.")
    swallowError(serverChannel.close())
    swallowError(nioSelector.close())
    shutdownComplete() // mark the thread's shutdown as complete
  }
}
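To see the accept loop and the round-robin assignment in action without a broker, here is a small self-contained sketch built on plain Java NIO. `RoundRobinAcceptDemo` and its parameters are hypothetical; processors are represented only by their index, and the loop is bounded by an attempt counter so that the demo cannot spin forever.

```scala
import java.net.{InetAddress, InetSocketAddress}
import java.nio.channels.{SelectionKey, Selector, ServerSocketChannel, SocketChannel}
import scala.collection.mutable.ArrayBuffer

object RoundRobinAcceptDemo {
  // Accepts `clients` loopback connections and returns the processor index chosen for each.
  def run(clients: Int, processorCount: Int): List[Int] = {
    val loopback = InetAddress.getLoopbackAddress
    val selector = Selector.open()
    val server = ServerSocketChannel.open()
    server.configureBlocking(false)
    server.socket().bind(new InetSocketAddress(loopback, 0)) // port 0 = any free port
    server.register(selector, SelectionKey.OP_ACCEPT)
    val port = server.socket().getLocalPort

    // Blocking connects complete against the listen backlog before accept() is called.
    val clientChannels = (1 to clients).map(_ => SocketChannel.open(new InetSocketAddress(loopback, port)))

    val assigned = ArrayBuffer.empty[Int]
    val accepted = ArrayBuffer.empty[SocketChannel]
    var currentProcessor = 0
    var attempts = 0
    try {
      while (assigned.size < clients && attempts < 50) {
        attempts += 1
        if (selector.select(500L) > 0) {
          val iter = selector.selectedKeys().iterator()
          while (iter.hasNext) {
            val key = iter.next()
            iter.remove()
            if (key.isAcceptable) {
              val channel = key.channel().asInstanceOf[ServerSocketChannel].accept()
              if (channel != null) {
                accepted += channel
                assigned += currentProcessor
                // round robin to the next processor index
                currentProcessor = (currentProcessor + 1) % processorCount
              }
            }
          }
        }
      }
    } finally {
      accepted.foreach(_.close())
      clientChannels.foreach(_.close())
      server.close()
      selector.close()
    }
    assigned.toList
  }

  def main(args: Array[String]): Unit =
    println(run(clients = 5, processorCount = 3))
}
```

Each pass through `select` accepts one pending connection, so five clients spread over three processors are assigned the indices 0, 1, 2, 0, 1.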
Acceptor.accept() handles the OP_ACCEPT event. It creates the SocketChannel, increments the connection count recorded in ConnectionQuotas, and hands the channel to Processor.accept(). The code of accept() is as follows:
/*
 * Accept a new connection
 */
def accept(key: SelectionKey, processor: Processor) {
  val serverSocketChannel = key.channel().asInstanceOf[ServerSocketChannel]
  val socketChannel = serverSocketChannel.accept() // create the SocketChannel
  try {
    // Increment the connection count recorded in connectionQuotas
    connectionQuotas.inc(socketChannel.socket().getInetAddress)
    socketChannel.configureBlocking(false)
    socketChannel.socket().setTcpNoDelay(true)
    socketChannel.socket().setKeepAlive(true)
    socketChannel.socket().setSendBufferSize(sendBufferSize)
    debug("Accepted connection from %s on %s and assigned it to processor %d, sendBufferSize [actual|requested]: [%d|%d] recvBufferSize [actual|requested]: [%d|%d]"
      .format(socketChannel.socket.getRemoteSocketAddress, socketChannel.socket.getLocalSocketAddress, processor.id,
        socketChannel.socket.getSendBufferSize, sendBufferSize,
        socketChannel.socket.getReceiveBufferSize, recvBufferSize))
    // Hand the SocketChannel over to the processor
    processor.accept(socketChannel)
  } catch {
    case e: TooManyConnectionsException =>
      info("Rejected connection from %s, address already has the configured maximum of %d connections.".format(e.ip, e.count))
      close(socketChannel) // close the socketChannel
  }
}
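The interplay between `connectionQuotas.inc`, the rejection, and the `dec` performed inside `close()` can be illustrated with a simplified per-address counter. This is a hypothetical sketch (`SimpleQuotas` and `TooManyConnections` are invented names), and it assumes the count is recorded before the limit is checked, so that the `dec` in the rejection path balances it.

```scala
import java.net.InetAddress
import scala.collection.mutable

object ConnectionQuotaDemo {
  // Hypothetical exception; mirrors the ip/count fields used in the log message above.
  class TooManyConnections(val ip: InetAddress, val count: Int)
    extends RuntimeException(s"Too many connections from $ip (maximum = $count)")

  // Simplified per-address connection counter guarded by a lock.
  class SimpleQuotas(maxPerIp: Int) {
    private val counts = mutable.Map.empty[InetAddress, Int]

    def inc(address: InetAddress): Unit = counts.synchronized {
      val before = counts.getOrElse(address, 0)
      counts.update(address, before + 1) // assumption: count first, then check the limit
      if (before >= maxPerIp) throw new TooManyConnections(address, maxPerIp)
    }

    def dec(address: InetAddress): Unit = counts.synchronized {
      val current = counts.getOrElse(address,
        throw new IllegalArgumentException(s"No connections recorded for $address"))
      if (current == 1) counts.remove(address) else counts.update(address, current - 1)
      ()
    }

    def count(address: InetAddress): Int = counts.synchronized(counts.getOrElse(address, 0))
  }

  // Opens two connections, lets the third be rejected, and returns (rejected, final count).
  def demo(): (Boolean, Int) = {
    val quotas = new SimpleQuotas(maxPerIp = 2)
    val ip = InetAddress.getLoopbackAddress
    quotas.inc(ip)
    quotas.inc(ip)
    val rejected =
      try { quotas.inc(ip); false }
      catch {
        case _: TooManyConnections =>
          quotas.dec(ip) // what close(socketChannel) does in the rejection path
          true
      }
    (rejected, quotas.count(ip))
  }

  def main(args: Array[String]): Unit = println(demo())
}
```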
Processor
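A Processor reads requests and writes responses back; it takes no part in the business logic itself. Its fields, listed below, are initialized when the Processor object is created.
- newConnections: a ConcurrentLinkedQueue[SocketChannel] holding the newly created SocketChannels that this Processor is to handle.
- inflightResponses: holds responses that have been handed off for sending but whose send has not yet completed. It resembles the client's InFlightRequests, with one difference: the client never acknowledges a response from the server, so an entry in inflightResponses is removed as soon as the response has been sent successfully, whereas an entry in InFlightRequests is removed only once the response has been received.
- selector: a KSelector, responsible for managing the network connections.
- requestChannel: the queue through which the Processor and the Handler threads exchange data.
A SocketChannel created in Acceptor.accept() is handed to the Processor through Processor.accept(). When Processor.accept() receives a new SocketChannel, it first puts it on the newConnections queue and then wakes the Processor thread so that it processes the queue. Because newConnections is accessed concurrently by the Acceptor thread and the Processor thread, it is a ConcurrentLinkedQueue. The code of accept() is as follows: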
/**
 * Queue up a new connection for reading
 */
def accept(socketChannel: SocketChannel) {
  // Put the SocketChannel on the newConnections queue
  newConnections.add(socketChannel)
  // wakeup() ends up calling wakeup() on the underlying Java NIO Selector
  wakeup()
}
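The hand-off between the two threads relies on two things: a thread-safe queue and `Selector.wakeup()`, which makes a blocked (or the next) `select` call return immediately. The sketch below demonstrates the pattern with a plain `java.nio.channels.Selector`; `QueueWakeupDemo` is a hypothetical name and strings stand in for SocketChannels.

```scala
import java.nio.channels.Selector
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, TimeUnit}

object QueueWakeupDemo {
  // Returns true if the "processor" thread picked up the queued item well before its select timeout.
  def run(): Boolean = {
    val selector = Selector.open()
    val newConnections = new ConcurrentLinkedQueue[String]() // strings stand in for SocketChannels
    val configured = new ConcurrentLinkedQueue[String]()
    val done = new CountDownLatch(1)

    val processor = new Thread(new Runnable {
      override def run(): Unit = {
        selector.select(10000L) // would block for 10 s without wakeup()
        var next = newConnections.poll()
        while (next != null) { // drain everything queued so far
          configured.add(next)
          next = newConnections.poll()
        }
        done.countDown()
      }
    }, "demo-processor")
    processor.start()

    // "Acceptor" side: enqueue first, then wake the selector.
    newConnections.add("conn-1")
    selector.wakeup()

    val finishedInTime = done.await(5, TimeUnit.SECONDS)
    selector.close()
    finishedInTime && configured.contains("conn-1")
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Enqueueing before calling `wakeup()` matters: if `wakeup()` arrives before the other thread has entered `select`, that `select` returns at once, and the item is already in the queue either way.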
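Processor.run() implements reading from and writing to the network connections. Its flow is as follows:
1) Call startupComplete() to signal that the Processor has finished initializing and to wake the thread that is blocked waiting for it.
2) Process the new SocketChannels in the newConnections queue. Each SocketChannel in the queue is registered for OP_READ on the nioSelector. The SocketChannel is wrapped in a KafkaChannel, which is attached to the SelectionKey, so that when OP_READ fires later the object retrieved from the SelectionKey is a KafkaChannel. The code of configureNewConnections() is as follows: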
/**
 * Register any new connections that have been queued up
 */
private def configureNewConnections() {
  while (!newConnections.isEmpty) { // drain the newConnections queue
    val channel = newConnections.poll()
    try {
      debug(s"Processor $id listening to new connection from ${channel.socket.getRemoteSocketAddress}")
      val localHost = channel.socket().getLocalAddress.getHostAddress
      val localPort = channel.socket().getLocalPort
      val remoteHost = channel.socket().getInetAddress.getHostAddress
      val remotePort = channel.socket().getPort
      // Build the connectionId from localHost, localPort, remoteHost and remotePort
      val connectionId = ConnectionId(localHost, localPort, remoteHost, remotePort).toString
      selector.register(connectionId, channel) // register for OP_READ
    } catch {
      // We explicitly catch all non fatal exceptions and close the socket to avoid a socket leak. The other
      // throwables will be caught in processor and logged as uncaught exceptions.
      case NonFatal(e) =>
        // need to close the channel here to avoid a socket leak.
        close(channel)
        error(s"Processor $id closed connection from ${channel.getRemoteAddress}", e)
    }
  }
}
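The connection id built here is later parsed back (with `ConnectionId.fromString`) when a disconnected connection is cleaned up. The sketch below shows one way such an id could round-trip. The `localHost:localPort-remoteHost:remotePort` layout is an assumption made for illustration, and `ConnId` and `parseHostPort` are hypothetical names, not the Kafka classes.

```scala
import scala.util.Try

object ConnectionIdDemo {
  // Hypothetical connection id; the textual layout is an assumption of this sketch.
  final case class ConnId(localHost: String, localPort: Int, remoteHost: String, remotePort: Int) {
    override def toString: String = s"$localHost:$localPort-$remoteHost:$remotePort"
  }

  // Splits "host:port" at the last colon; returns None if the port is missing or not a number.
  private def parseHostPort(s: String): Option[(String, Int)] = {
    val idx = s.lastIndexOf(':')
    if (idx <= 0 || idx == s.length - 1) None
    else Try(s.substring(idx + 1).toInt).toOption.map(port => (s.substring(0, idx), port))
  }

  // Returns None instead of throwing when the string does not have the expected shape.
  def fromString(s: String): Option[ConnId] = s.split("-") match {
    case Array(local, remote) =>
      parseHostPort(local).flatMap { case (lh, lp) =>
        parseHostPort(remote).map { case (rh, rp) => ConnId(lh, lp, rh, rp) }
      }
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    val id = ConnId("10.0.0.1", 9092, "10.0.0.7", 51234)
    println(id)                      // 10.0.0.1:9092-10.0.0.7:51234
    println(fromString(id.toString)) // Some(10.0.0.1:9092-10.0.0.7:51234)
  }
}
```

Returning an `Option` lets the caller decide how to react to a malformed id, for example with `getOrElse` and an `IllegalStateException`.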
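3) Fetch this Processor's responseQueue from the RequestChannel and process the responses buffered in it.
If the response is a SendAction, it has to be sent to the client: the Processor looks up the corresponding KafkaChannel, registers OP_WRITE for it, and points the KafkaChannel.send field at the Response to be sent. The response, now taken off the responseQueue, is placed in inflightResponses. Once a complete response has been sent, the connection's OP_WRITE registration is cancelled.
If the response is a NoOpAction, there is currently nothing to send on that connection, so OP_READ is registered for the KafkaChannel again to let it continue reading requests.
If the response is a CloseConnectionAction, the corresponding connection is closed.
The code of processNewResponses() is as follows: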
private def processNewResponses() {
  /*
   * The RequestChannel maps each Processor's id to its own responseQueue;
   * fetch the next response from that queue.
   */
  var curr = requestChannel.receiveResponse(id)
  while (curr != null) {
    try {
      curr.responseAction match {
        // No response needs to be sent to the client
        case RequestChannel.NoOpAction =>
          // There is no response to send to the client, we need to read more pipelined requests
          // that are sitting in the server's socket buffer
          curr.request.updateRequestMetrics
          trace("Socket server received empty response to send, registering for read: " + curr)
          // Register for OP_READ again
          selector.unmute(curr.request.connectionId)
        // The response has to be sent to the client
        case RequestChannel.SendAction =>
          // Calls KSelector.send() and buffers the response in inflightResponses
          sendResponse(curr)
        case RequestChannel.CloseConnectionAction =>
          curr.request.updateRequestMetrics
          trace("Closing socket connection actively according to the response code.")
          close(selector, curr.request.connectionId)
      }
    } finally {
      curr = requestChannel.receiveResponse(id) // continue with the rest of the responseQueue
    }
  }
}
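The shape of this method, a non-blocking drain loop around a pattern match on the response action, can be reproduced in a few lines. The sketch below is illustrative only: the action objects and `Response` class are hypothetical simplifications, and each branch merely records what the Processor would do.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ListBuffer

object ResponseDispatchDemo {
  // Hypothetical, simplified counterparts of the three response actions.
  sealed trait ResponseAction
  case object SendAction extends ResponseAction
  case object NoOpAction extends ResponseAction
  case object CloseConnectionAction extends ResponseAction

  final case class Response(connectionId: String, action: ResponseAction)

  // Drains the queue without blocking and records the step taken for each response.
  def drain(responseQueue: LinkedBlockingQueue[Response]): List[String] = {
    val log = ListBuffer.empty[String]
    var curr = responseQueue.poll() // returns null when the queue is empty
    while (curr != null) {
      try {
        curr.action match {
          case NoOpAction            => log += s"unmute ${curr.connectionId}"
          case SendAction            => log += s"send to ${curr.connectionId}"
          case CloseConnectionAction => log += s"close ${curr.connectionId}"
        }
      } finally {
        curr = responseQueue.poll() // always advance, even if a branch throws
      }
    }
    log.toList
  }

  def main(args: Array[String]): Unit = {
    val queue = new LinkedBlockingQueue[Response]()
    queue.add(Response("c1", SendAction))
    queue.add(Response("c2", NoOpAction))
    queue.add(Response("c3", CloseConnectionAction))
    println(drain(queue))
  }
}
```

Advancing in the `finally` block means one failing response cannot cause the same element to be retried forever.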
4) Call the Processor's poll() method to read requests and send responses. Internally, poll() calls KSelector.poll().
private def poll() {
  try selector.poll(300)
  catch {
    case e @ (_: IllegalStateException | _: IOException) =>
      error(s"Closing processor $id due to illegal state or IO exception")
      swallow(closeAll())
      shutdownComplete()
      throw e
  }
}
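Each call to KSelector.poll() places the requests that were read, the responses that were sent successfully, and the connections that were closed into completedReceives, completedSends, and disconnected respectively. The next steps process these collections.
5) Call processCompletedReceives() to process KSelector.completedReceives. It iterates over completedReceives, wraps each NetworkReceive together with the Processor id and the authentication information in a RequestChannel.Request object, and puts that object on RequestChannel.requestQueue, where it waits for a Handler thread. It then cancels the OP_READ registration of the corresponding KafkaChannel, so that no further request is read from this connection until the response has been sent.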
private def processCompletedReceives() {
  // Iterate over completedReceives
  selector.completedReceives.asScala.foreach { receive =>
    try {
      // Look up the corresponding KafkaChannel
      val channel = selector.channel(receive.source)
      // Create the session object for the KafkaChannel; it is used for authorization
      val session = RequestChannel.Session(new KafkaPrincipal(KafkaPrincipal.USER_TYPE, channel.principal.getName),
        channel.socketAddress)
      // Wrap the NetworkReceive, the Processor id and the authentication information in a RequestChannel.Request
      val req = RequestChannel.Request(processor = id, connectionId = receive.source, session = session, buffer = receive.payload, startTimeMs = time.milliseconds, securityProtocol = protocol)
      // Put the RequestChannel.Request on RequestChannel.requestQueue to await processing
      requestChannel.sendRequest(req)
      // Cancel the OP_READ registration; the connection stops reading data
      selector.mute(receive.source)
    } catch {
      case e @ (_: InvalidRequestException | _: SchemaException) =>
        // note that even though we got an exception, we can assume that receive.source is valid. Issues with constructing a valid receive object were handled earlier
        error(s"Closing socket for ${receive.source} because of error", e)
        close(selector, receive.source)
    }
  }
}
6) Call processCompletedSends() to process KSelector.completedSends. It first removes the corresponding Response from inflightResponses, and then registers OP_READ for the connection again so that data can be read from it.
private def processCompletedSends() {
  // Iterate over completedSends
  selector.completedSends.asScala.foreach { send =>
    // The response has gone out, so remove it from inflightResponses
    val resp = inflightResponses.remove(send.destination).getOrElse {
      throw new IllegalStateException(s"Send for ${send.destination} completed, but not in `inflightResponses`")
    }
    resp.request.updateRequestMetrics()
    selector.unmute(send.destination) // register OP_READ again so that the connection can be read from
  }
}
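7) Call processDisconnected() to process KSelector.disconnected. It first removes any Response belonging to the connection from inflightResponses, and then decrements the corresponding count in ConnectionQuotas, making room for new connections.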
private def processDisconnected() {
  // Iterate over selector.disconnected
  selector.disconnected.asScala.foreach { connectionId =>
    val remoteHost = ConnectionId.fromString(connectionId).getOrElse {
      throw new IllegalStateException(s"connectionId has unexpected format: $connectionId")
    }.remoteHost
    // Remove the connection's Response from inflightResponses
    inflightResponses.remove(connectionId).foreach(_.request.updateRequestMetrics())
    // the channel has been closed by the selector but the quotas still need to be updated
    connectionQuotas.dec(InetAddress.getByName(remoteHost))
  }
}
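Steps 5 to 7 together form a small per-connection cycle: a received request mutes the connection, the completed send unmutes it, and a disconnect discards any pending state. The following sketch models only that bookkeeping. `ConnectionTracker` and its method names are hypothetical, and plain strings stand in for responses.

```scala
import scala.collection.mutable

object MuteCycleDemo {
  // Hypothetical bookkeeping for the mute/unmute cycle; no real sockets are involved.
  class ConnectionTracker {
    private val muted = mutable.Set.empty[String]
    private val inflightResponses = mutable.Map.empty[String, String]

    // Step 5: a complete request arrived, stop reading from this connection.
    def onRequestReceived(connectionId: String): Unit = { muted += connectionId; () }

    // Step 3 (SendAction): remember the response until its send completes.
    def onResponseQueued(connectionId: String, response: String): Unit =
      inflightResponses.update(connectionId, response)

    // Step 6: the send completed, forget the response and resume reading.
    def onSendCompleted(connectionId: String): String = {
      val resp = inflightResponses.remove(connectionId).getOrElse(
        throw new IllegalStateException(s"Send for $connectionId completed, but nothing was in flight"))
      muted -= connectionId
      resp
    }

    // Step 7: the connection is gone, drop whatever was pending for it.
    def onDisconnected(connectionId: String): Unit = {
      inflightResponses.remove(connectionId)
      muted -= connectionId
      ()
    }

    def canRead(connectionId: String): Boolean = !muted.contains(connectionId)
    def inflightCount: Int = inflightResponses.size
  }

  def main(args: Array[String]): Unit = {
    val tracker = new ConnectionTracker
    tracker.onRequestReceived("c1")
    println(tracker.canRead("c1")) // false while the request is being handled
    tracker.onResponseQueued("c1", "resp-1")
    tracker.onSendCompleted("c1")
    println(tracker.canRead("c1")) // true again once the response is out
  }
}
```

A consequence of this cycle is that each connection has at most one request being handled at any time, which keeps responses in the same order as the requests.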
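8) When SocketServer.shutdown() is called to shut down the whole SocketServer, the alive field is set to false and the loop above ends. The Processor then calls closeAll(), which closes every connection it manages and decrements the counts recorded in ConnectionQuotas, and finally calls shutdownComplete() to mark the shutdown as finished and wake the thread that is blocked waiting for this Processor to stop.
The code of run() is as follows: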
override def run() {
  startupComplete() // 1. signal that initialization has finished and wake the thread waiting for this Processor
  while (isRunning) { // check the running state held in the alive field
    try {
      // setup any new connections that have been queued up
      // 2. each SocketChannel in the newConnections queue is registered for OP_READ on the nioSelector
      configureNewConnections()
      // register any new responses for writing
      // 3. fetch this Processor's responseQueue from the RequestChannel and process the buffered responses
      processNewResponses()
      // 4. read requests and send responses; poll() calls KSelector.poll() internally
      poll()
      // 5. process KSelector.completedReceives
      processCompletedReceives()
      // 6. process KSelector.completedSends
      processCompletedSends()
      // 7. process KSelector.disconnected
      processDisconnected()
    } catch {
      // We catch all the throwables here to prevent the processor thread from exiting. We do this because
      // letting a processor exit might cause a bigger impact on the broker. Usually the exceptions thrown would
      // be either associated with a specific socket channel or a bad request. We just ignore the bad socket channel
      // or request. This behavior might need to be reviewed if we see an exception that need the entire broker to stop.
      case e: ControlThrowable => throw e
      case e: Throwable =>
        error("Processor got uncaught exception.", e)
    }
  }
  debug("Closing selector - processor " + id)
  swallowError(closeAll()) // 8. close all connections managed by this Processor
  shutdownComplete() // mark the shutdown as complete and wake the waiting thread
}