Spark定义了一个特质ListenerBus,可以接收事件并且将事件提交到对应事件的监听器,其代码如下:
private[spark] trait ListenerBus[L <: AnyRef, E] extends Logging {
private[spark] val listeners = new CopyOnWriteArrayList[L]
final def addListener(listener: L): Unit = {
listeners.add(listener)
}
final def removeListener(listener: L): Unit = {
listeners.remove(listener)
}
final def postToAll(event: E): Unit = {
val iter = listeners.iterator
while (iter.hasNext) {
val listener = iter.next()
try {
doPostEvent(listener, event)
} catch {
case NonFatal(e) =>
logError(s"Listener ${Utils.getFormattedClassName(listener)} threw an exception", e)
}
}
}
protected def doPostEvent(listener: L, event: E): Unit
private[spark] def findListenersByClass[T <: L : ClassTag](): Seq[T] = {
val c = implicitly[ClassTag[T]].runtimeClass
listeners.asScala.filter(_.getClass == c).map(_.asInstanceOf[T]).toSeq
}
}
ListenerBus是个泛型特质,其泛型参数为[L <: AnyRef, E],其中L是代表监听器的泛型参数,可以看到ListenerBus支持任何类型的监听器,E是代表事件的泛型参数。ListenerBus中各个成员的作用如下:
ListenerBus的类继承体系:
SparkListenerBus也有如下两种实现:
private[spark] trait SparkListenerBus
extends ListenerBus[SparkListenerInterface, SparkListenerEvent] {
protected override def doPostEvent(
listener: SparkListenerInterface,
event: SparkListenerEvent): Unit = {
event match {
case stageSubmitted: SparkListenerStageSubmitted =>
listener.onStageSubmitted(stageSubmitted)
case stageCompleted: SparkListenerStageCompleted =>
listener.onStageCompleted(stageCompleted)
case jobStart: SparkListenerJobStart =>
listener.onJobStart(jobStart)
case jobEnd: SparkListenerJobEnd =>
listener.onJobEnd(jobEnd)
case taskStart: SparkListenerTaskStart =>
listener.onTaskStart(taskStart)
case taskGettingResult: SparkListenerTaskGettingResult =>
listener.onTaskGettingResult(taskGettingResult)
case taskEnd: SparkListenerTaskEnd =>
listener.onTaskEnd(taskEnd)
case environmentUpdate: SparkListenerEnvironmentUpdate =>
listener.onEnvironmentUpdate(environmentUpdate)
case blockManagerAdded: SparkListenerBlockManagerAdded =>
listener.onBlockManagerAdded(blockManagerAdded)
case blockManagerRemoved: SparkListenerBlockManagerRemoved =>
listener.onBlockManagerRemoved(blockManagerRemoved)
case unpersistRDD: SparkListenerUnpersistRDD =>
listener.onUnpersistRDD(unpersistRDD)
case applicationStart: SparkListenerApplicationStart =>
listener.onApplicationStart(applicationStart)
case applicationEnd: SparkListenerApplicationEnd =>
listener.onApplicationEnd(applicationEnd)
case metricsUpdate: SparkListenerExecutorMetricsUpdate =>
listener.onExecutorMetricsUpdate(metricsUpdate)
case executorAdded: SparkListenerExecutorAdded =>
listener.onExecutorAdded(executorAdded)
case executorRemoved: SparkListenerExecutorRemoved =>
listener.onExecutorRemoved(executorRemoved)
case blockUpdated: SparkListenerBlockUpdated =>
listener.onBlockUpdated(blockUpdated)
case logStart: SparkListenerLogStart => // ignore event log metadata
case _ => listener.onOtherEvent(event)
}
}
}
SparkListenerBus已经实现了ListenerBus的doPostEvent方法,通过对SparkListenerEvent事件的匹配,执行SparkListenerInterface监听器的相应方法。
SparkListenerEvent其实也是个特质,列出的SparkListenerSageSubmitted、SparkListenerStageCompleted等都是继承了SparkListenerEvent特质的样例类。
@DeveloperApi
@JsonTypeInfo(use = JsonTypeInfo.Id.CLASS, include = JsonTypeInfo.As.PROPERTY, property = "Event")
trait SparkListenerEvent {
protected[spark] def logEvent: Boolean = true
}
@DeveloperApi
case class SparkListenerStageSubmitted(stageInfo: StageInfo, properties: Properties = null)
extends SparkListenerEvent
@DeveloperApi
case class SparkListenerStageCompleted(stageInfo: StageInfo) extends SparkListenerEvent
@DeveloperApi
case class SparkListenerTaskStart(stageId: Int, stageAttemptId: Int, taskInfo: TaskInfo)
extends SparkListenerEvent
LiveListenerBus继承了SparkListenerBus,并实现了将事件异步投递给监听器,达到实时刷新UI界面数据的效果。LiveListenerBus主要由以下部分组成:
3.1 异步事件处理线程
listenerThread用于异步处理eventQueue中的事件
private val listenerThread = new Thread(name) {
setDaemon(true)
override def run(): Unit = Utils.tryOrStopSparkContext(sparkContext) {
LiveListenerBus.withinListenerThread.withValue(true) {
while (true) {
eventLock.acquire() //获取信号量
self.synchronized {
processingEvent = true
}
try {
val event = eventQueue.poll
if (event == null) {
if (!stopped.get) {
throw new IllegalStateException("Polling `null` from eventQueue means" +
" the listener bus has been stopped. So `stopped` must be true")
}
return
}
postToAll(event) //事件处理
} finally {
self.synchronized {
processingEvent = false
}
}
}
}
}
}
listenerThread的工作步骤如下:
3.2 LiveListenerBus的消息投递
在解释了异步线程listenerThread的工作内容后,还有一个要点没有解释:eventQueue中的事件是如何放进去的呢?由于eventQueue定义在LiveListenerBus中,因此ListenerBus和SparkListenerBus中没有操纵eventQueue的方法,要将事件放入eventQueue,只能依靠LiveListenerBus自己,其post方法就是为此目的而创建
def post(event: SparkListenerEvent): Unit = {
if (stopped.get) {
logError(s"$name has already stopped! Dropping event $event")
return
}
val eventAdded = eventQueue.offer(event) //向eventQueue中添加事件
if (eventAdded) {
eventLock.release()
} else {
onDropEvent(event)
droppedEventsCounter.incrementAndGet()
}
//打印删除事件数的日志
val droppedEvents = droppedEventsCounter.get
if (droppedEvents > 0) {
if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
if (droppedEventsCounter.compareAndSet(droppedEvents, 0)) {
val prevLastReportTimestamp = lastReportTimestamp
lastReportTimestamp = System.currentTimeMillis()
logWarning(s"Dropped $droppedEvents SparkListenerEvents since " +
new java.util.Date(prevLastReportTimestamp))
}
}
}
}
其运行步骤如下:
3.3 LiveListenerBus与监听器
与LiveListenerBus配合使用的监听器,并非是父类SparkListenerBus的类型参数SparkListenerInterface,而是继承自SparkListenerInterface的SparkListener及其子类。SparkListener虽然实现了SparkListenerInterface中的每个方法,但是其实都是空实现,具体的实现需要交给子类去完成。
LiveListenerBus的工作流程图:
图中的DAGScheduler、SparkContext、BlockManagerMasterEndpoint、DriverEndpoint及LocalSchedulerBackend都是LiveListenerBus的事件来源,它们都是通过调用LiveListenerBus的post方法将消息交给异步线程listenerThread进行处理的。