Spark Streaming Source Code Walkthrough: Driver Fault-Tolerance Safety
Overview
Driver fault tolerance operates at three levels (a minimal application setup that enables these recovery paths is sketched after the list):
1. Data level: ReceivedBlockTracker manages the metadata of the blocks received by the Spark Streaming application.
2. Logical level: DStream.
3. Job scheduling level: JobGenerator, which tracks how far batch scheduling has progressed.
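This installment focuses on the data level. Before diving into the tracker itself, here is a minimal sketch of how an application turns these recovery paths on. The app name and checkpoint path are hypothetical; the config key and the StreamingContext.getOrCreate API are standard Spark Streaming.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WalRecoveryApp {
  // Hypothetical path; in production use a fault-tolerant FS such as HDFS.
  val checkpointDir = "hdfs:///tmp/streaming-checkpoint"

  def createContext(): StreamingContext = {
    val conf = new SparkConf()
      .setAppName("WalRecoveryApp")
      // Ask receivers to also write received data to a write-ahead log.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")
    val ssc = new StreamingContext(conf, Seconds(10))
    // Providing a checkpoint directory is what lets ReceivedBlockTracker
    // keep (and later recover) its own WAL of metadata events.
    ssc.checkpoint(checkpointDir)
    // ... define the DStream graph here ...
    ssc
  }

  def main(args: Array[String]): Unit = {
    // After a driver failure, getOrCreate rebuilds the context from the
    // checkpoint, and the tracker state is recovered by replaying the WAL.
    val ssc = StreamingContext.getOrCreate(checkpointDir, () => createContext())
    ssc.start()
    ssc.awaitTermination()
  }
}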
Source Code Analysis
Let's start with ReceivedBlockTracker (ReceivedBlockTracker.scala, lines 55-71):
/**
 * Class that keep track of all the received blocks, and allocate them to batches
 * when required. All actions taken by this class can be saved to a write ahead log
 * (if a checkpoint directory has been provided), so that the state of the tracker
 * (received blocks and block-to-batch allocations) can be recovered after driver failure.
 *
 * Note that when any instance of this class is created with a checkpoint directory,
 * it will try reading events from logs in the directory.
 */
private[streaming] class ReceivedBlockTracker(
    conf: SparkConf,
    hadoopConf: Configuration,
    streamIds: Seq[Int],
    clock: Clock,
    recoverFromWriteAheadLog: Boolean,
    checkpointDirOption: Option[String])
  extends Logging
The recoverFromWriteAheadLog parameter is clear evidence that the tracker uses a write-ahead log (WAL). The importance of ReceivedBlockTracker is also apparent from its class comment: every action it takes can be logged and replayed after a driver failure.
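For context, this tracker is constructed by ReceiverTracker. The following is a paraphrased sketch from memory of ReceiverTracker.scala, not an exact quote (expressions may differ across Spark versions); the point is that recoverFromWriteAheadLog is wired to whether a checkpoint is present:

// Paraphrased sketch of how ReceiverTracker builds the tracker.
private val receivedBlockTracker = new ReceivedBlockTracker(
  ssc.sparkContext.conf,
  ssc.sparkContext.hadoopConfiguration,
  receiverInputStreamIds,      // ids of all receiver input streams
  ssc.scheduler.clock,
  ssc.isCheckpointPresent,     // recoverFromWriteAheadLog
  Option(ssc.checkpointDir)    // checkpointDirOption
)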
Next, let's look at how ReceivedBlockTracker receives and records block metadata. The source of ReceivedBlockTracker.addBlock is as follows (ReceivedBlockTracker.scala, lines 87-106):
def addBlock(receivedBlockInfo: ReceivedBlockInfo): Boolean = {
  try {
    val writeResult = writeToLog(BlockAdditionEvent(receivedBlockInfo))
    if (writeResult) {
      synchronized {
        getReceivedBlockQueue(receivedBlockInfo.streamId) += receivedBlockInfo
      }
      logDebug(s"Stream ${receivedBlockInfo.streamId} received " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId}")
    } else {
      logDebug(s"Failed to acknowledge stream ${receivedBlockInfo.streamId} receiving " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId} in the Write Ahead Log.")
    }
    writeResult
  } catch {
    case NonFatal(e) =>
      logError(s"Error adding block $receivedBlockInfo", e)
      false
  }
}
On receiving block metadata, ReceivedBlockTracker first persists it to the WAL for fault tolerance; only if that write succeeds is the metadata added to the in-memory queue.
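This ordering is the essence of the design: durable log first, in-memory state second. A minimal self-contained sketch of the same pattern, using toy types rather than Spark's package-private classes:

import scala.collection.mutable
import scala.util.control.NonFatal

// Toy stand-ins, for illustration only -- not Spark's actual classes.
case class BlockInfo(streamId: Int, blockId: String)

class ToyTracker(writeToLog: BlockInfo => Boolean) {
  private val queues = mutable.Map.empty[Int, mutable.Queue[BlockInfo]]

  def addBlock(info: BlockInfo): Boolean = try {
    // Persist first: if the driver dies right after this line, the block
    // is still recoverable from the log even though memory was never updated.
    val ok = writeToLog(info)
    if (ok) synchronized {
      queues.getOrElseUpdate(info.streamId, mutable.Queue.empty) += info
    }
    ok
  } catch {
    case NonFatal(_) => false
  }
}

The converse ordering (memory first, log second) could leave a block in memory that a recovered driver has no record of, which would break the tracker's accounting.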
Next, let's examine the source of allocateBlocksToBatch (ReceivedBlockTracker.scala, lines 112-134):
def allocateBlocksToBatch(batchTime: Time): Unit = synchronized {
  if (lastAllocatedBatchTime == null || batchTime > lastAllocatedBatchTime) {
    val streamIdToBlocks = streamIds.map { streamId =>
      (streamId, getReceivedBlockQueue(streamId).dequeueAll(x => true))
    }.toMap
    val allocatedBlocks = AllocatedBlocks(streamIdToBlocks)
    if (writeToLog(BatchAllocationEvent(batchTime, allocatedBlocks))) {
      timeToAllocatedBlocks.put(batchTime, allocatedBlocks)
      lastAllocatedBatchTime = batchTime
    } else {
      logInfo(s"Possibly processed batch $batchTime needs to be processed again in WAL recovery")
    }
  } else {
    // This situation occurs when:
    // 1. WAL is ended with BatchAllocationEvent, but without BatchCleanupEvent,
    // possibly processed batch job or half-processed batch job need to be processed again,
    // so the batchTime will be equal to lastAllocatedBatchTime.
    // 2. Slow checkpointing makes recovered batch time older than WAL recovered
    // lastAllocatedBatchTime.
    // This situation will only occurs in recovery time.
    logInfo(s"Possibly processed batch $batchTime needs to be processed again in WAL recovery")
  }
}
Here, writeToLog is again the WAL write. allocatedBlocks is the batch of metadata dequeued for the given batch time, which is then handed to the corresponding job. In other words, the block-to-batch assignment is logged to the WAL before any job consumes the metadata, so after a job failure the recovered driver knows exactly which blocks had been assigned to which batch.
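To see why logging each transition before applying it is sufficient, here is a minimal self-contained sketch of WAL replay at driver restart. The types are toys mirroring BlockAdditionEvent, BatchAllocationEvent, and BatchCleanupEvent; the real recovery logic lives in ReceivedBlockTracker and handles the corner cases noted in the comment above.

import scala.collection.mutable

case class BlockInfo(streamId: Int, blockId: String)
sealed trait TrackerEvent
case class BlockAdded(info: BlockInfo) extends TrackerEvent
case class BatchAllocated(batchTime: Long, blocks: Map[Int, Seq[BlockInfo]]) extends TrackerEvent
case class BatchCleaned(batchTimes: Seq[Long]) extends TrackerEvent

object WalReplay {
  // Replaying the log in order rebuilds both the pending queues and the
  // batchTime -> blocks map that the failed driver had.
  def recover(log: Seq[TrackerEvent]): Map[Long, Map[Int, Seq[BlockInfo]]] = {
    val pending = mutable.Map.empty[Int, mutable.Queue[BlockInfo]]
    val allocated = mutable.Map.empty[Long, Map[Int, Seq[BlockInfo]]]
    log.foreach {
      case BlockAdded(info) =>
        pending.getOrElseUpdate(info.streamId, mutable.Queue.empty) += info
      case BatchAllocated(time, blocks) =>
        pending.clear() // mirrors dequeueAll(x => true) on every stream
        allocated(time) = blocks
      case BatchCleaned(times) =>
        times.foreach(allocated.remove) // old batches fall out of memory
    }
    allocated.toMap
  }
}

A log that ends with a BatchAllocationEvent but no matching cleanup is precisely case 1 in the source comment above: replay reproduces the allocation, and the possibly half-processed batch is simply processed again.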
To be continued.