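BroadcastManager stores configuration information and serialized data such as RDDs, Jobs, and ShuffleDependency objects locally. For fault tolerance, this data may also be replicated to other nodes. BroadcastManager is created as follows.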
val broadcastManager = new BroadcastManager(isDriver, conf, securityManager)
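Besides the three member properties defined by its constructor (isDriver, conf, and securityManager), BroadcastManager has three more internal members, all of which appear in the listings below: initialized, a flag recording whether BroadcastManager has been initialized; broadcastFactory, the broadcast factory instance; and nextBroadcastId, an AtomicLong that supplies the identifier of the next broadcast object.
BroadcastManager calls its own initialize method while it is being constructed. Once initialize returns, BroadcastManager is ready for use. The implementation of initialize is shown in Listing 1.
Listing 1 Initialization of BroadcastManager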
private def initialize() {
  synchronized {
    if (!initialized) {
      broadcastFactory = new TorrentBroadcastFactory
      broadcastFactory.initialize(isDriver, conf, securityManager)
      initialized = true
    }
  }
}
According to Listing 1, initialize first checks whether BroadcastManager has already been initialized, which guarantees that initialization happens only once. It then creates a TorrentBroadcastFactory as the broadcast factory instance and calls that factory's initialize method [1]. Finally, it marks BroadcastManager itself as initialized.
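The initialize-once guard in Listing 1 can be tried in isolation. The following is a minimal sketch, not Spark code: the class OnceInitialized and its initCount counter are hypothetical names introduced only to make the effect of the guard observable.

```scala
// Hypothetical stand-in for BroadcastManager; not Spark's actual class.
class OnceInitialized {
  private var initialized = false

  // Counts how often the guarded body really ran (for illustration only).
  var initCount = 0

  def initialize(): Unit = synchronized {
    if (!initialized) {
      initCount += 1      // stands in for creating and initializing the factory
      initialized = true
    }
  }
}

object OnceInitializedDemo {
  def main(args: Array[String]): Unit = {
    val manager = new OnceInitialized
    manager.initialize()
    manager.initialize()  // second call is a no-op because the flag is already set
    println(manager.initCount)
  }
}
```

Because the flag is checked and set inside the same synchronized block, concurrent callers cannot both run the guarded body.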
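Note: TorrentBroadcastFactory implements the BroadcastFactory trait. In Spark 1.x.x, BroadcastManager's initialize method created the broadcastFactory instance through Java reflection, and the implementation class of BroadcastFactory could be chosen with the configuration property spark.broadcast.factory, which defaulted to org.apache.spark.broadcast.TorrentBroadcastFactory. Since Spark 2.0.0 this property is no longer available, and broadcastFactory is always a TorrentBroadcastFactory.
BroadcastManager provides three methods, shown in Listing 2.
Listing 2 The three methods of BroadcastManager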
def stop() {
  broadcastFactory.stop()
}
private val nextBroadcastId = new AtomicLong(0)
def newBroadcast[T: ClassTag](value_ : T, isLocal: Boolean): Broadcast[T] = {
  broadcastFactory.newBroadcast[T](value_, isLocal, nextBroadcastId.getAndIncrement())
}
def unbroadcast(id: Long, removeFromDriver: Boolean, blocking: Boolean) {
  broadcastFactory.unbroadcast(id, removeFromDriver, blocking)
}
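Listing 2 shows that each of the three BroadcastManager methods delegates to the corresponding method of TorrentBroadcastFactory. The implementations of those three methods are shown in Listing 3.
Listing 3 Methods provided by TorrentBroadcastFactory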
override def newBroadcast[T: ClassTag](value_ : T, isLocal: Boolean, id: Long): Broadcast[T] = {
  new TorrentBroadcast[T](value_, id)
}
override def stop() { }
override def unbroadcast(id: Long, removeFromDriver: Boolean, blocking: Boolean) {
  TorrentBroadcast.unpersist(id, removeFromDriver, blocking)
}
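Of the three methods in Listing 3, stop is an empty implementation, so we only look at newBroadcast and unbroadcast.
According to Listing 3, TorrentBroadcastFactory's newBroadcast method creates a TorrentBroadcast instance, whose purpose is to broadcast the value it wraps. On the surface this is just a constructor call, but it does considerably more: constructing a TorrentBroadcast calls writeBlocks to store the value, while the _value attribute is read lazily through readBroadcastBlock only when it is needed. Among the attributes of a TorrentBroadcast object is broadcastId, whose type BroadcastBlockId is defined as follows: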
case class BroadcastBlockId(broadcastId: Long, field: String = "") extends BlockId {
  override def name: String = "broadcast_" + broadcastId + (if (field == "") "" else "_" + field)
}
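The broadcastId field here is the id produced by incrementing BroadcastManager's atomic variable nextBroadcastId.
As just mentioned, writeBlocks is called while the TorrentBroadcast instance is being constructed. Its implementation is shown in Listing 4.
Listing 4 Implementation of writeBlocks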
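To make the naming rule concrete, here is a small standalone sketch. DemoBroadcastBlockId is a hypothetical copy of the case class above without the BlockId parent, and BlockIdNamingDemo with its newId and pieceId helpers is likewise made up for illustration; only the name expression and the AtomicLong usage are taken from the listings.

```scala
import java.util.concurrent.atomic.AtomicLong

// Standalone copy of the naming rule; the real class extends Spark's BlockId.
case class DemoBroadcastBlockId(broadcastId: Long, field: String = "") {
  def name: String =
    "broadcast_" + broadcastId + (if (field == "") "" else "_" + field)
}

object BlockIdNamingDemo {
  // Mirrors the role of nextBroadcastId in BroadcastManager (Listing 2).
  private val nextBroadcastId = new AtomicLong(0)

  // Hypothetical helper: hands out consecutive ids.
  def newId(): Long = nextBroadcastId.getAndIncrement()

  // Hypothetical helper: id of the i-th piece of a broadcast, as built in writeBlocks.
  def pieceId(id: Long, index: Int): DemoBroadcastBlockId =
    DemoBroadcastBlockId(id, "piece" + index)

  def main(args: Array[String]): Unit = {
    println(DemoBroadcastBlockId(0).name)  // broadcast_0
    println(pieceId(0, 3).name)            // broadcast_0_piece3
  }
}
```

The whole object and each of its pieces therefore share the same numeric id and differ only in the field suffix.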
private def writeBlocks(value: T): Int = {
  import StorageLevel._
  val blockManager = SparkEnv.get.blockManager
  if (!blockManager.putSingle(broadcastId, value, MEMORY_AND_DISK, tellMaster = false)) {
    throw new SparkException(s"Failed to store $broadcastId in BlockManager")
  }
  val blocks =
    TorrentBroadcast.blockifyObject(value, blockSize, SparkEnv.get.serializer, compressionCodec)
  if (checksumEnabled) {
    checksums = new Array[Int](blocks.length)
  }
  blocks.zipWithIndex.foreach { case (block, i) =>
    if (checksumEnabled) {
      checksums(i) = calcChecksum(block)
    }
    val pieceId = BroadcastBlockId(id, "piece" + i)
    val bytes = new ChunkedByteBuffer(block.duplicate())
    if (!blockManager.putBytes(pieceId, bytes, MEMORY_AND_DISK_SER, tellMaster = true)) {
      throw new SparkException(s"Failed to store $pieceId of $broadcastId in local BlockManager")
    }
  }
  blocks.length
}
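According to Listing 4, writeBlocks runs as follows. It first obtains the BlockManager of the current SparkEnv and calls putSingle to store the broadcast object as a single value in the local storage system at MEMORY_AND_DISK level, without reporting it to the master (tellMaster = false). It then calls TorrentBroadcast.blockifyObject, which uses the serializer and compressionCodec to turn the object into a sequence of blocks of at most blockSize bytes. If checksumEnabled is true, an array is allocated to hold one checksum per block. For each block, writeBlocks computes the checksum when checksums are enabled, builds the piece id BroadcastBlockId(id, "piece" + i), and calls putBytes to store the piece in the local storage system at MEMORY_AND_DISK_SER level, this time reporting it to the master (tellMaster = true). Finally, it returns the number of blocks.
Following the analysis above, Figure 1 gives a more intuitive view of how a broadcast object is written.
As mentioned earlier, readBroadcastBlock is called only when the _value attribute of a TorrentBroadcast instance is actually needed. Its implementation is shown in Listing 5.
Listing 5 Implementation of readBroadcastBlock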
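The chunk-and-checksum part of writeBlocks can be illustrated without Spark. The sketch below is an assumption-laden simplification: BlockifyDemo, blockify, and checksum are hypothetical names, the input is an already serialized byte array, compression is left out, and Adler-32 is used for the checksum on the assumption that it matches what calcChecksum does.

```scala
import java.nio.ByteBuffer
import java.util.zip.Adler32

object BlockifyDemo {
  // Split already-serialized bytes into chunks of at most blockSize bytes.
  def blockify(bytes: Array[Byte], blockSize: Int): Array[ByteBuffer] =
    bytes.grouped(blockSize).map(chunk => ByteBuffer.wrap(chunk)).toArray

  // Adler-32 checksum of a block; works on a duplicate so the caller's
  // buffer position is left untouched.
  def checksum(block: ByteBuffer): Int = {
    val dup = block.duplicate()
    val arr = new Array[Byte](dup.remaining())
    dup.get(arr)
    val adler = new Adler32()
    adler.update(arr, 0, arr.length)
    adler.getValue.toInt
  }

  def main(args: Array[String]): Unit = {
    val data = Array.tabulate[Byte](10)(i => i.toByte)
    val blocks = blockify(data, 4)          // chunks of 4, 4, and 2 bytes
    val checksums = blocks.map(checksum)    // one checksum per piece
    blocks.zip(checksums).zipWithIndex.foreach { case ((block, sum), i) =>
      println(s"piece$i: ${block.remaining()} bytes, checksum $sum")
    }
  }
}
```

Keeping one checksum per piece is what later allows a reader to validate each remotely fetched piece on its own.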
private def readBroadcastBlock(): T = Utils.tryOrIOException {
  TorrentBroadcast.synchronized {
    setConf(SparkEnv.get.conf)
    val blockManager = SparkEnv.get.blockManager
    blockManager.getLocalValues(broadcastId) match {
      case Some(blockResult) =>
        if (blockResult.data.hasNext) {
          val x = blockResult.data.next().asInstanceOf[T]
          releaseLock(broadcastId)
          x
        } else {
          throw new SparkException(s"Failed to get locally stored broadcast data: $broadcastId")
        }
      case None =>
        logInfo("Started reading broadcast variable " + id)
        val startTimeMs = System.currentTimeMillis()
        val blocks = readBlocks().flatMap(_.getChunks())
        logInfo("Reading broadcast variable " + id + " took" + Utils.getUsedTimeMs(startTimeMs))
        val obj = TorrentBroadcast.unBlockifyObject[T](
          blocks, SparkEnv.get.serializer, compressionCodec)
        val storageLevel = StorageLevel.MEMORY_AND_DISK
        if (!blockManager.putSingle(broadcastId, obj, storageLevel, tellMaster = false)) {
          throw new SparkException(s"Failed to store $broadcastId in BlockManager")
        }
        obj
    }
  }
}
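According to Listing 5, readBroadcastBlock runs as follows. The whole body is synchronized on the TorrentBroadcast companion object. It obtains the BlockManager of the current SparkEnv and calls getLocalValues to look for the broadcast object in the local storage system. If the object is found, its value is taken from the returned iterator, releaseLock is called for the block, and the value is returned; an empty iterator results in a SparkException. If nothing is found locally, readBlocks is called to fetch the broadcast pieces from the storage systems of the Driver and Executors, unBlockifyObject reassembles the pieces into the original object, and putSingle stores that object in the local storage system at MEMORY_AND_DISK level so that later reads on this node can be served locally. The object is then returned.
As noted above, readBlocks fetches blocks from the storage systems of the Driver and Executors. Its implementation is shown in Listing 6.
Listing 6 Implementation of readBlocks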
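The look-locally-first, fetch-and-cache-otherwise shape of readBroadcastBlock can be sketched with a plain map in place of the BlockManager. Everything here is hypothetical: CachedReader, its fetchRemote parameter, and the remoteFetches counter exist only for this illustration and are not part of Spark.

```scala
import scala.collection.mutable

// Hypothetical reader: a mutable map plays the role of the local BlockManager,
// and fetchRemote stands in for readBlocks plus unBlockifyObject.
class CachedReader[T](fetchRemote: () => T) {
  private val localStore = mutable.Map.empty[String, T]

  // Number of times the remote path was taken (for illustration only).
  var remoteFetches = 0

  def read(blockId: String): T = synchronized {
    localStore.get(blockId) match {
      case Some(value) =>
        value                            // local hit: no remote traffic
      case None =>
        remoteFetches += 1
        val value = fetchRemote()        // rebuild the object from remote pieces
        localStore.put(blockId, value)   // cache it so later reads stay local
        value
    }
  }
}

object CachedReaderDemo {
  def main(args: Array[String]): Unit = {
    val reader = new CachedReader[String](() => "payload")
    println(reader.read("broadcast_0"))
    println(reader.read("broadcast_0"))
    println(reader.remoteFetches)
  }
}
```

In this sketch the second read is served from the map, which is the same effect the putSingle call in Listing 5 is meant to achieve for later reads on the same node.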
private def readBlocks(): Array[ChunkedByteBuffer] = {
  val blocks = new Array[ChunkedByteBuffer](numBlocks)
  val bm = SparkEnv.get.blockManager
  for (pid <- Random.shuffle(Seq.range(0, numBlocks))) {
    val pieceId = BroadcastBlockId(id, "piece" + pid)
    logDebug(s"Reading piece $pieceId of $broadcastId")
    bm.getLocalBytes(pieceId) match {
      case Some(block) =>
        blocks(pid) = block
        releaseLock(pieceId)
      case None =>
        bm.getRemoteBytes(pieceId) match {
          case Some(b) =>
            if (checksumEnabled) {
              val sum = calcChecksum(b.chunks(0))
              if (sum != checksums(pid)) {
                throw new SparkException(s"corrupt remote block $pieceId of $broadcastId:" +
                  s" $sum != ${checksums(pid)}")
              }
            }
            // We found the block from remote executors/driver's BlockManager, so put the block
            // in this executor's BlockManager.
            if (!bm.putBytes(pieceId, b, StorageLevel.MEMORY_AND_DISK_SER, tellMaster = true)) {
              throw new SparkException(
                s"Failed to store $pieceId of $broadcastId in local BlockManager")
            }
            blocks(pid) = b
          case None =>
            throw new SparkException(s"Failed to get $pieceId of $broadcastId")
        }
    }
  }
  blocks
}
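According to Listing 6, readBlocks runs as follows. It allocates the array blocks to hold numBlocks pieces and obtains the BlockManager of the current SparkEnv. It then visits the piece indices in random order (Random.shuffle). For each index it builds the piece id BroadcastBlockId(id, "piece" + pid) and first tries getLocalBytes; on a local hit the piece is placed into blocks and releaseLock is called for it. Otherwise it calls getRemoteBytes to fetch the piece from a remote node. If checksumEnabled is true, the checksum of the fetched piece is compared with checksums(pid), and a mismatch raises a SparkException. The fetched piece is then stored locally with putBytes at MEMORY_AND_DISK_SER level, with tellMaster = true, and placed into blocks. If the piece can be found neither locally nor remotely, a SparkException is thrown. Finally, blocks is returned.
Following the analysis above, Figure 2 gives a more intuitive view of how a broadcast object is read.
According to Listing 3, TorrentBroadcastFactory's unbroadcast method actually calls TorrentBroadcast's unpersist method to unpersist the broadcast object identified by id. The implementation of unpersist is shown in Listing 7.
Listing 7 Unpersisting a broadcast object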
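The piece-fetching loop of readBlocks can be sketched with two maps standing in for the local and remote storage. This is a simplified illustration under stated assumptions: ReadPiecesDemo, readPieces, and the map-based stores are hypothetical, pieces are plain byte arrays instead of ChunkedByteBuffer, Adler-32 is assumed as the checksum, and IllegalStateException replaces SparkException.

```scala
import java.util.zip.Adler32
import scala.collection.mutable
import scala.util.Random

object ReadPiecesDemo {
  // Adler-32 checksum of a piece (assumed to mirror calcChecksum).
  def checksum(bytes: Array[Byte]): Int = {
    val adler = new Adler32()
    adler.update(bytes, 0, bytes.length)
    adler.getValue.toInt
  }

  // local and remote are hypothetical stand-ins for getLocalBytes/getRemoteBytes.
  def readPieces(
      numBlocks: Int,
      local: mutable.Map[Int, Array[Byte]],
      remote: Map[Int, Array[Byte]],
      checksums: Array[Int]): Array[Array[Byte]] = {
    val blocks = new Array[Array[Byte]](numBlocks)
    // Visit the pieces in random order, as Listing 6 does.
    for (pid <- Random.shuffle(Seq.range(0, numBlocks))) {
      local.get(pid) match {
        case Some(piece) =>
          blocks(pid) = piece
        case None =>
          remote.get(pid) match {
            case Some(piece) =>
              // Verify the remotely fetched piece before accepting it.
              if (checksum(piece) != checksums(pid)) {
                throw new IllegalStateException(s"corrupt remote piece $pid")
              }
              local.put(pid, piece)  // keep a local copy of the fetched piece
              blocks(pid) = piece
            case None =>
              throw new IllegalStateException(s"Failed to get piece $pid")
          }
      }
    }
    blocks
  }

  def main(args: Array[String]): Unit = {
    val pieces = Array("ab", "cd", "e").map(_.getBytes("UTF-8"))
    val local = mutable.Map(0 -> pieces(0))
    val remote = Map(1 -> pieces(1), 2 -> pieces(2))
    val out = readPieces(3, local, remote, pieces.map(checksum))
    println(new String(out.flatten, "UTF-8"))
  }
}
```

Although the pieces are fetched in random order, each one is written to its own index in the result array, so the reassembled bytes keep their original order.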
def unpersist(id: Long, removeFromDriver: Boolean, blocking: Boolean): Unit = {
  logDebug(s"Unpersisting TorrentBroadcast $id")
  SparkEnv.get.blockManager.master.removeBroadcast(id, removeFromDriver, blocking)
}
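Listing 7 shows that TorrentBroadcast's unpersist method actually calls the removeBroadcast method of BlockManagerMaster, a subcomponent of BlockManager, to unpersist the broadcast object. For details on BlockManagerMaster, see Section 6.8 of the book 《Spark内核设计的艺术》.
[1] After reading the source code of TorrentBroadcastFactory, the author found that its initialize method is actually an empty implementation, so it is not covered here.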