Spark 1.6 Source Code Walkthrough: BlockManager

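The previous posts covered BlockManager's main components; now let's look at how BlockManager itself is implemented.

dropFromMemory: called when the memory store is full; it evicts a block from memory.

reportBlockStatus: reports a Block's status to BlockManagerMaster

putSingle: writes a Block made of a single object into the storage system

putBytes: writes a Block of serialized bytes into the storage system

doPut: performs the actual write for putSingle and putBytes

replicate: replicates a block to other nodes

getDiskWriter: creates a DiskBlockObjectWriter, which is used to write out Spark's intermediate computation results

getBlockData: gets local Block data

doGetRemote: gets Block data from a remote node

get: gets Block data

Evicting from memory: dropFromMemory

When memory runs short, some of it may have to be freed. dropFromMemory does exactly that.

It is implemented at line 1020 of BlockManager: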

  // Called when the memory store is full, to evict a block.
  // If the block's storage level allows disk storage, the block is written to disk;
  // otherwise it is discarded for good.
 def dropFromMemory(
      blockId: BlockId,
      data: () => Either[Array[Any], ByteBuffer]): Option[BlockStatus] = {

    logInfo(s"Dropping block $blockId from memory")
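    // Look up the BlockInfo for this block id.
    // private val blockInfo = new TimeStampedHashMap[BlockId, BlockInfo]
    val info = blockInfo.get(blockId).orNull

    // If a BlockInfo exists
    if (info != null) {
      // Take the lock: other threads may be operating on this block
      info.synchronized {
        // waitForReady decides whether the block can be dropped: it returns once the write
        // has finished, and waits if another thread is still writing the block.
        // The check is there for consistency, since this may change in the future.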
        if (!info.waitForReady()) {
          logWarning(s"Block $blockId was marked as failure. Nothing to drop")
          return None
        } else if (blockInfo.get(blockId).isEmpty) {
          logWarning(s"Block $blockId was already dropped.")
          return None
        }
        // Flag: whether the block's status was updated
        var blockIsUpdated = false
        // The block's storage level
        val level = info.level

        // Write to disk if the level allows it and the disk store does not already hold the block
        if (level.useDisk && !diskStore.contains(blockId)) {
          logInfo(s"Writing block $blockId to disk")
          data() match {
            case Left(elements) =>
              diskStore.putArray(blockId, elements, level, returnValues = false)
            case Right(bytes) =>
              diskStore.putBytes(blockId, bytes, level)
          }
          // The block is now on disk, so its status has changed
          blockIsUpdated = true
        }

        // Drop the block from memory, recording the size of the dropped block
        val droppedMemorySize =
          if (memoryStore.contains(blockId)) memoryStore.getSize(blockId) else 0L
        val blockIsRemoved = memoryStore.remove(blockId)
        if (blockIsRemoved) {
          blockIsUpdated = true
        } else {
          logWarning(s"Block $blockId could not be dropped from memory as it does not exist")
        }
        // Get the block's status after the update. For example, if the block was dropped from
        // memory to disk, this returns the new storage level and the updated memory and disk sizes.
        val status = getCurrentBlockStatus(blockId, info)
        if (info.tellMaster) {
          // Report the block's updated status to BlockManagerMaster
          reportBlockStatus(blockId, info, status, droppedMemorySize)
        }
        // Remove the block from blockInfo if its level does not allow disk storage
        if (!level.useDisk) {
          // The block is completely gone from this node; forget it so we can put() it again later.
          blockInfo.remove(blockId)
        }
        // Return the block's status if anything changed
        if (blockIsUpdated) {
          return Some(status)
        }
      }
    }
    None
  }
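
To make the order of operations easier to see, here is a minimal sketch of the same decision flow. It is only a toy model: the `memory`, `disk` and `blockInfo` maps and the `Level` case class are made-up stand-ins for MemoryStore, DiskStore and the BlockInfo map, and the locking and master reporting are left out.

```scala
import scala.collection.mutable

// Hypothetical, simplified model: plain maps stand in for MemoryStore, DiskStore and blockInfo.
object DropSketch {
  final case class Level(useMemory: Boolean, useDisk: Boolean)

  val memory = mutable.Map.empty[String, Array[Byte]]
  val disk = mutable.Map.empty[String, Array[Byte]]
  val blockInfo = mutable.Map.empty[String, Level]

  /** Returns true if the block's state changed (spilled to disk and/or removed from memory). */
  def dropFromMemory(id: String): Boolean = blockInfo.get(id) match {
    case None => false // unknown block: nothing to drop
    case Some(level) =>
      var updated = false
      // Spill to disk first, so the data is never lost between the two steps.
      if (level.useDisk && !disk.contains(id)) {
        memory.get(id).foreach { bytes => disk(id) = bytes; updated = true }
      }
      if (memory.remove(id).isDefined) updated = true
      // A block that cannot live on disk is gone for good, so forget its metadata too.
      if (!level.useDisk) blockInfo.remove(id)
      updated
  }

  def main(args: Array[String]): Unit = {
    blockInfo("a") = Level(useMemory = true, useDisk = true)
    memory("a") = Array[Byte](1, 2, 3)
    blockInfo("b") = Level(useMemory = true, useDisk = false)
    memory("b") = Array[Byte](4)

    dropFromMemory("a")
    dropFromMemory("b")
    println(s"a on disk: ${disk.contains("a")}, b still tracked: ${blockInfo.contains("b")}")
  }
}
```

Block `a` survives on disk while block `b` disappears completely, which is why the real method only removes the entry from blockInfo when the level does not use disk.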

Status reporting: reportBlockStatus

The implementation is at line 342 of BlockManager:

reportBlockStatus reports a Block's status to BlockManagerMasterEndpoint and, when needed, re-registers the BlockManager.

1. It calls tryToReportBlockStatus, which calls master.updateBlockInfo to send an UpdateBlockInfo message to BlockManagerMasterEndpoint.

2. If the BlockManager is not registered with BlockManagerMasterEndpoint, it calls asyncReregister, which calls reregister. reregister in turn calls the master's registerBlockManager and then reportAllBlocks, and reportAllBlocks also uses tryToReportBlockStatus to report each block.

  /**
   * Tell the master about the current storage status of a block. This will send a block update
   * message reflecting the current status, *not* the desired storage level in its block info.
   * For example, a block with MEMORY_AND_DISK set might have fallen out to be only on disk.
   *
   * droppedMemorySize exists to account for when the block is dropped from memory to disk (so
   * it is still valid). This ensures that update in master will compensate for the increase in
   * memory on slave.
   */
  private def reportBlockStatus(
      blockId: BlockId,
      info: BlockInfo,
      status: BlockStatus,
      droppedMemorySize: Long = 0L): Unit = {
    val needReregister = !tryToReportBlockStatus(blockId, info, status, droppedMemorySize)
    if (needReregister) {
      logInfo(s"Got told to re-register updating block $blockId")
      // Re-registering will report our new block for free.
      asyncReregister()
    }
    logDebug(s"Told master about block $blockId")
  }
  /**
   * Actually send a UpdateBlockInfo message. Returns the master's response,
   * which will be true if the block was successfully recorded and false if
   * the slave needs to re-register.
   */
  private def tryToReportBlockStatus(
      blockId: BlockId,
      info: BlockInfo,
      status: BlockStatus,
      droppedMemorySize: Long = 0L): Boolean = {
    if (info.tellMaster) {
      val storageLevel = status.storageLevel
      val inMemSize = Math.max(status.memSize, droppedMemorySize)
      val inExternalBlockStoreSize = status.externalBlockStoreSize
      val onDiskSize = status.diskSize
      master.updateBlockInfo(
        blockManagerId, blockId, storageLevel, inMemSize, onDiskSize, inExternalBlockStoreSize)
    } else {
      true
    }
  }
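
The "report, and re-register if the master does not know us" handshake can be sketched in a few lines. Everything here (`ToyMaster`, `ToyBlockManager`, the string ids) is invented for illustration, and unlike asyncReregister the re-registration below runs synchronously.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for the master endpoint and a BlockManager; not Spark's real classes.
object ReportSketch {
  final class ToyMaster {
    val registered = mutable.Set.empty[String]
    val blocks = mutable.Map.empty[(String, String), Long]

    def register(managerId: String): Unit = registered += managerId

    /** Same contract as updateBlockInfo: false means "unknown BlockManager, re-register". */
    def updateBlockInfo(managerId: String, blockId: String, size: Long): Boolean =
      if (registered(managerId)) { blocks((managerId, blockId)) = size; true } else false
  }

  final class ToyBlockManager(id: String, master: ToyMaster) {
    val local = mutable.Map.empty[String, Long]

    def reportBlockStatus(blockId: String): Unit = {
      val needReregister = !master.updateBlockInfo(id, blockId, local(blockId))
      // Re-registering reports every local block, including the one we just tried.
      if (needReregister) reregister()
    }

    private def reregister(): Unit = {
      master.register(id)
      local.foreach { case (b, size) => master.updateBlockInfo(id, b, size) }
    }
  }

  def main(args: Array[String]): Unit = {
    val master = new ToyMaster
    val bm = new ToyBlockManager("exec-1", master)
    bm.local("rdd_0_0") = 128L
    bm.local("rdd_0_1") = 256L
    bm.reportBlockStatus("rdd_0_0") // the master has never seen exec-1
    println(master.blocks.size)     // both blocks end up recorded
  }
}
```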

Writing a single-object block: putSingle

BlockManager, line 998:

putSingle writes a Block made of a single object into the storage system. The write is ultimately performed by doPut.

  def putSingle(
      blockId: BlockId,
      value: Any,
      level: StorageLevel,
      tellMaster: Boolean = true): Seq[(BlockId, BlockStatus)] = {
    // Delegate to putIterator
    putIterator(blockId, Iterator(value), level, tellMaster)
  }
  
  def putIterator(
      blockId: BlockId,
      values: Iterator[Any],
      level: StorageLevel,
      tellMaster: Boolean = true,
      effectiveStorageLevel: Option[StorageLevel] = None): Seq[(BlockId, BlockStatus)] = {
    require(values != null, "Values is null")
    // Delegate to doPut
    doPut(blockId, IteratorValues(values), level, tellMaster, effectiveStorageLevel)
  }

Writing a block of serialized bytes: putBytes

BlockManager, line 701

This also ends up calling doPut.

  /**
   * Put a new block of serialized bytes to the block manager.
   * Return a list of blocks updated as a result of this put.
   */
  def putBytes(
      blockId: BlockId,
      bytes: ByteBuffer,
      level: StorageLevel,
      tellMaster: Boolean = true,
      effectiveStorageLevel: Option[StorageLevel] = None): Seq[(BlockId, BlockStatus)] = {
    require(bytes != null, "Bytes is null")
    doPut(blockId, ByteBufferValues(bytes), level, tellMaster, effectiveStorageLevel)
  }
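
Both entry points only wrap their input and hand it to doPut, which later pattern-matches on the wrapper. The sketch below re-creates that idea with case classes named after the ones used in the excerpts (`IteratorValues`, `ArrayValues`, `ByteBufferValues`); the helper functions and `describe` are my own simplification, not the real signatures.

```scala
import java.nio.ByteBuffer

// Illustrative re-creation of the wrapper types that doPut pattern-matches on.
object BlockValuesSketch {
  sealed trait BlockValues
  final case class IteratorValues(iterator: Iterator[Any]) extends BlockValues
  final case class ArrayValues(buffer: Array[Any]) extends BlockValues
  final case class ByteBufferValues(buffer: ByteBuffer) extends BlockValues

  // Hypothetical helpers named after the public entry points; they only do the wrapping.
  def putSingle(value: Any): BlockValues = putIterator(Iterator(value))

  def putIterator(values: Iterator[Any]): BlockValues = {
    require(values != null, "Values is null")
    IteratorValues(values)
  }

  def putBytes(bytes: ByteBuffer): BlockValues = {
    require(bytes != null, "Bytes is null")
    ByteBufferValues(bytes)
  }

  /** What a store-side dispatcher gets to see after the wrapping. */
  def describe(data: BlockValues): String = data match {
    case IteratorValues(_)   => "iterator of deserialized values"
    case ArrayValues(arr)    => s"array of ${arr.length} values"
    case ByteBufferValues(b) => s"${b.remaining()} serialized bytes"
  }

  def main(args: Array[String]): Unit = {
    println(describe(putSingle("hello")))
    println(describe(putBytes(ByteBuffer.wrap(Array[Byte](1, 2, 3)))))
  }
}
```

Because the trait is sealed, the compiler can check that every kind of input is handled in the match.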

The actual write: doPut

BlockManager, line 701:

[Figure 1: image from the original post; not available here]

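See the figure for details. Haha, I'm just following a book here, I couldn't have written this on my own. The book is 《深入理解spark核心思想和源码分析》, but it covers version 1.2, which felt too old to me, so I read the 1.6 source instead (even though 2.0 came out the year before last), going through the book page by page and clicking through the source step by step. I'm writing this blog to make it stick, haha.

That said, the English comments in the source are very well written, and the book's author drew on them a lot too.

I'll write up the detailed process some other time. For now there are brief explanations in the code.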

  /**
   * Put the given block according to the given level in one of the block stores, replicating
   * the values if necessary.
   *
   * The effective storage level refers to the level according to which the block will actually be
   * handled. This allows the caller to specify an alternate behavior of doPut while preserving
   * the original level specified by the user.
   */
  // In short: store the block according to its storage level, and also replicate it to other
  // machines when the replication factor is greater than 1.
  private def doPut(
      blockId: BlockId,
      data: BlockValues,
      level: StorageLevel,
      tellMaster: Boolean = true,
      effectiveStorageLevel: Option[StorageLevel] = None)
    : Seq[(BlockId, BlockStatus)] = {

    require(blockId != null, "BlockId is null")
    require(level != null && level.isValid, "StorageLevel is null or invalid")
    effectiveStorageLevel.foreach { level =>
      require(level != null && level.isValid, "Effective StorageLevel is null or invalid")
    }

    // The result that is eventually returned
    val updatedBlocks = new ArrayBuffer[(BlockId, BlockStatus)]

    /* Remember the block's storage level so that we can correctly drop it to disk if it needs
     * to be dropped right after it got put into memory. Note, however, that other threads will
     * not be able to get() this block until we call markReady on its BlockInfo. */
    val putBlockInfo = {
      // Create a new BlockInfo
      val tinfo = new BlockInfo(level, tellMaster)
      // If blockInfo already holds a BlockInfo for this block id,
      // putIfAbsent returns the existing one
      val oldBlockOpt = blockInfo.putIfAbsent(blockId, tinfo)
      if (oldBlockOpt.isDefined) {
        // waitForReady tells us whether the existing block is usable
        if (oldBlockOpt.get.waitForReady()) {
          // Usable means the block already exists
          logWarning(s"Block $blockId already exists on this machine; not re-adding it")
          return updatedBlocks
        }
        // Otherwise reuse the BlockInfo that is already in the map
        oldBlockOpt.get
      } else {
        // No previous entry: use the newly created BlockInfo
        tinfo
      }
      }
    }

    val startTimeMs = System.currentTimeMillis

    /* If we're storing values and we need to replicate the data, we'll want access to the values,
     * but because our put will read the whole iterator, there will be no values left. For the
     * case where the put serializes data, we'll remember the bytes, above; but for the case where
     * it doesn't, such as deserialized storage, let's rely on the put returning an Iterator.       */
    // Values left over after the put, kept in case the block still has to be replicated
    var valuesAfterPut: Iterator[Any] = null

    // Likewise for the serialized bytes after the put
    var bytesAfterPut: ByteBuffer = null

    // Size of the block in bytes
    var size = 0L

    // The level that is actually used to store the data
    val putLevel = effectiveStorageLevel.getOrElse(level)

    // If we are storing bytes, start replication before storing locally.
    // This is faster because the data is already serialized and ready to send.
    val replicationFuture = data match {
      // If the block data is already serialized, start replicating right away
      case b: ByteBufferValues if putLevel.replication > 1 =>
        // duplicate() does not copy the bytes, it only creates a wrapper
        val bufferView = b.buffer.duplicate()
        Future {
          // futureExecutionContext is a cached thread pool; it runs the block replication
          replicate(blockId, bufferView, putLevel)
        }(futureExecutionContext)
      case _ => null
    }
    // Writing the block requires holding its lock
    putBlockInfo.synchronized {
      logTrace("Put for block %s took %s to get into synchronized block"
        .format(blockId, Utils.getUsedTimeMs(startTimeMs)))
      
      var marked = false
      try {
        // returnValues - Whether to return the values put
        // Pick the BlockStore that matches the storage level
        val (returnValues, blockStore: BlockStore) = {
          if (putLevel.useMemory) {
            // Memory store
            (true, memoryStore)
          } else if (putLevel.useOffHeap) {
            // External block store, e.g. Tachyon
            (false, externalBlockStore)
          } else if (putLevel.useDisk) {
            // Disk store
            (putLevel.replication > 1, diskStore)
          } else {
            assert(putLevel == StorageLevel.NONE)
            throw new BlockException(
              blockId, s"Attempted to put block $blockId without specifying storage level!")
          }
        }

        // Put the block into the chosen BlockStore
        val result = data match {
          case IteratorValues(iterator) =>
            blockStore.putIterator(blockId, iterator, putLevel, returnValues)
          case ArrayValues(array) =>
            blockStore.putArray(blockId, array, putLevel, returnValues)
          case ByteBufferValues(bytes) =>
            bytes.rewind()
            blockStore.putBytes(blockId, bytes, putLevel)
        }
        // Size of the block in bytes
        size = result.size
        result.data match {
          case Left (newIterator) if putLevel.useMemory => valuesAfterPut = newIterator
          case Right (newBytes) => bytesAfterPut = newBytes
          case _ =>
        }

        // Blocks dropped from memory because of this put also go into updatedBlocks
        if (putLevel.useMemory) {
          result.droppedBlocks.foreach { updatedBlocks += _ }
        }
        // Current status of the block
        val putBlockStatus = getCurrentBlockStatus(blockId, putBlockInfo)
        if (putBlockStatus.storageLevel != StorageLevel.NONE) {
          // Now that the block is in either the memory, externalBlockStore, or disk store,
          // let other threads read it, and tell the master about it.
          marked = true
          // Only after this call can other threads see the block
          putBlockInfo.markReady(size)
          if (tellMaster) {
            // Report to BlockManagerMaster
            reportBlockStatus(blockId, putBlockInfo, putBlockStatus)
          }
          // Add this block to updatedBlocks
          updatedBlocks += ((blockId, putBlockStatus))
        }
      } finally {
        // If we failed in putting the block to memory/disk, notify other possible readers
        // that it has failed, and then remove it from the block info map.
        if (!marked) {
          // Note that the remove must happen before markFailure otherwise another thread
          // could've inserted a new BlockInfo before we remove it.
          blockInfo.remove(blockId)
          putBlockInfo.markFailure()
          logWarning(s"Putting block $blockId failed")
        }
      }
    }
    logDebug("Put block %s locally took %s".format(blockId, Utils.getUsedTimeMs(startTimeMs)))

    // Either we're storing bytes and we asynchronously started replication, or we're storing
    // values and need to serialize and replicate them now:
    if (putLevel.replication > 1) {
      data match {
        case ByteBufferValues(bytes) =>
          // Wait for the replicationFuture that was started earlier, before the local put
          if (replicationFuture != null) {
            Await.ready(replicationFuture, Duration.Inf)
          }
        case _ =>
          val remoteStartTime = System.currentTimeMillis
          // Serialize the block if not already done
          if (bytesAfterPut == null) {
            if (valuesAfterPut == null) {
              throw new SparkException(
                "Underlying put returned neither an Iterator nor bytes! This shouldn't happen.")
            }
            
            // Serialize the values
            bytesAfterPut = dataSerialize(blockId, valuesAfterPut)
          }
          // Replicate the serialized bytes
          replicate(blockId, bytesAfterPut, putLevel)
          logDebug("Put block %s remotely took %s"
            .format(blockId, Utils.getUsedTimeMs(remoteStartTime)))
      }
    }

    BlockManager.dispose(bytesAfterPut)

    if (putLevel.replication > 1) {
      logDebug("Putting block %s with replication took %s"
        .format(blockId, Utils.getUsedTimeMs(startTimeMs)))
    } else {
      logDebug("Putting block %s without replication took %s"
        .format(blockId, Utils.getUsedTimeMs(startTimeMs)))
    }
    // Return the statuses of the blocks updated by this put
    updatedBlocks
  }
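
The "start replicating before the local write" trick relies on ByteBuffer.duplicate(): the duplicate shares the bytes but keeps its own position and limit, so a background thread can consume the view without disturbing the original. Here is a small self-contained sketch of that idea. `fakeReplicate` is a made-up stand-in for replicate, and the thread pool only imitates futureExecutionContext.

```scala
import java.nio.ByteBuffer
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ReplicateFirstSketch {
  /** Hypothetical stand-in for replicate(): consumes the buffer and returns the byte count. */
  def fakeReplicate(view: ByteBuffer): Int = {
    var n = 0
    while (view.hasRemaining) { view.get(); n += 1 }
    n
  }

  /** Returns (bytes seen by the local put, bytes seen by the replication task). */
  def run(): (Int, Int) = {
    val pool = Executors.newCachedThreadPool()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val bytes = ByteBuffer.wrap("block-data".getBytes("UTF-8"))
      // duplicate() shares the underlying bytes but has its own position and limit.
      val view = bytes.duplicate()
      val replication = Future { fakeReplicate(view) }

      // The "local put" works on the original buffer; the view's cursor does not affect it.
      val localSize = bytes.remaining()

      // Like doPut, wait for the replication to finish before returning.
      val replicated = Await.result(replication, Duration.Inf)
      (localSize, replicated)
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Both numbers come out as 10, the length of "block-data", no matter which of the two readers runs first.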

Block replication: replicate

  /**
   * Replicate block to another node. Note that this is a blocking call that returns after
   * the block has been replicated.
   */
  private def replicate(blockId: BlockId, data: ByteBuffer, level: StorageLevel): Unit = {
    // Maximum number of replication failures tolerated
    val maxReplicationFailures = conf.getInt("spark.storage.maxReplicationFailures", 1)
    // Number of peers we need to replicate to
    val numPeersToReplicateTo = level.replication - 1
    // BlockManagers that can hold a replica (each Executor and the Driver has one BlockManager)
    val peersForReplication = new ArrayBuffer[BlockManagerId]
    // BlockManagers that already hold a replica
    val peersReplicatedTo = new ArrayBuffer[BlockManagerId]
    // BlockManagers to which replication has failed
    val peersFailedToReplicateTo = new ArrayBuffer[BlockManagerId]
    val tLevel = StorageLevel(
      level.useDisk, level.useMemory, level.useOffHeap, level.deserialized, 1)
    val startTime = System.currentTimeMillis
    // Random generator seeded with the block id's hash code
    val random = new Random(blockId.hashCode)
    // Whether the most recent replication attempt failed
    var replicationFailed = false
    // Number of failures so far
    var failures = 0
    // Whether replication is finished
    var done = false

    // Get cached list of peers
    // For fault tolerance a replica must not be placed on the current BlockManager,
    // so getPeers returns the ids of the other BlockManagers
    peersForReplication ++= getPeers(forceFetch = false)

    // Get a random peer. Note that this selection of a peer is deterministic on the block id.
    // So assuming the list of peers does not change and no replication failures,
    // if there are multiple attempts in the same node to replicate the same block,
    // the same set of peers will be selected.
    // Picks a random BlockManagerId
    def getRandomPeer(): Option[BlockManagerId] = {
      // If replication had failed, then force update the cached list of peers and remove the peers
      // that have been already used
      // If replication failed, fetch the BlockManagerIds again and remove from peersForReplication
      // those that are already in peersReplicatedTo or peersFailedToReplicateTo
      if (replicationFailed) {
        peersForReplication.clear()
        peersForReplication ++= getPeers(forceFetch = true)
        peersForReplication --= peersReplicatedTo
        peersForReplication --= peersFailedToReplicateTo
      }
      if (!peersForReplication.isEmpty) {
        // random is seeded with the block id's hash code, so repeated attempts on the same node
        // to replicate the same block pick the same peers
        Some(peersForReplication(random.nextInt(peersForReplication.size)))
      } else {
        None
      }
    }

    // One by one choose a random peer and try uploading the block to it
    // If replication fails (e.g., target peer is down), force the list of cached peers
    // to be re-fetched from driver and then pick another random peer for replication. Also
    // temporarily black list the peer for which replication failed.
    //
    // This selection of a peer and replication is continued in a loop until one of the
    // following 3 conditions is fulfilled:
    // (i) specified number of peers have been replicated to
    // (ii) too many failures in replicating to peers
    // (iii) no peer left to replicate to
    //
    // In short: keep picking peers with getRandomPeer and uploading to them until enough
    // replicas exist, too many attempts have failed, or no BlockManager is left to try.
    //
    while (!done) {
      getRandomPeer() match {
        case Some(peer) =>
          try {
            val onePeerStartTime = System.currentTimeMillis
            data.rewind()
            logTrace(s"Trying to replicate $blockId of ${data.limit()} bytes to $peer")
            blockTransferService.uploadBlockSync(
              peer.host, peer.port, peer.executorId, blockId, new NioManagedBuffer(data), tLevel)
            logTrace(s"Replicated $blockId of ${data.limit()} bytes to $peer in %s ms"
              .format(System.currentTimeMillis - onePeerStartTime))
            peersReplicatedTo += peer
            peersForReplication -= peer
            replicationFailed = false
            if (peersReplicatedTo.size == numPeersToReplicateTo) {
              done = true  // specified number of peers have been replicated to
            }
          } catch {
            case e: Exception =>
              logWarning(s"Failed to replicate $blockId to $peer, failure #$failures", e)
              failures += 1
              replicationFailed = true
              peersFailedToReplicateTo += peer
              if (failures > maxReplicationFailures) { // too many failures in replcating to peers
                done = true
              }
          }
        case None => // no peer left to replicate to
          done = true
      }
    }
    val timeTakeMs = (System.currentTimeMillis - startTime)
    logDebug(s"Replicating $blockId of ${data.limit()} bytes to " +
      s"${peersReplicatedTo.size} peer(s) took $timeTakeMs ms")
    if (peersReplicatedTo.size < numPeersToReplicateTo) {
      logWarning(s"Block $blockId replicated to only " +
        s"${peersReplicatedTo.size} peer(s) instead of $numPeersToReplicateTo peers")
    }
  }
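
The peer-selection loop is easier to follow once the logging and bookkeeping are stripped away. Below is a simplified sketch under a few assumptions of mine: peers are plain strings, `upload` is a hypothetical callback that returns false on failure, and a peer is removed from the candidate list after any attempt instead of re-fetching the peer list from the driver as the real code does.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object PeerSelectionSketch {
  /**
   * Simplified replicate loop. `allPeers` plays the role of getPeers and `upload`
   * is a hypothetical callback that returns false when the transfer fails.
   */
  def replicateTo(
      blockId: String,
      allPeers: Seq[String],
      replicas: Int,
      maxFailures: Int,
      upload: String => Boolean): Seq[String] = {
    val random = new Random(blockId.hashCode) // same block id => same sequence of choices
    val candidates = allPeers.toBuffer
    val done = ArrayBuffer.empty[String]
    var failures = 0
    var finished = false

    while (!finished) {
      if (candidates.isEmpty) {
        finished = true // (iii) no peer left to replicate to
      } else {
        val peer = candidates(random.nextInt(candidates.size))
        candidates -= peer // never try the same peer twice
        if (upload(peer)) {
          done += peer
          if (done.size == replicas) finished = true // (i) enough replicas
        } else {
          failures += 1
          if (failures > maxFailures) finished = true // (ii) too many failures
        }
      }
    }
    done.toList
  }

  def main(args: Array[String]): Unit = {
    val peers = Seq("exec-1", "exec-2", "exec-3", "exec-4")
    val first = replicateTo("rdd_3_7", peers, replicas = 2, maxFailures = 1, _ => true)
    val second = replicateTo("rdd_3_7", peers, replicas = 2, maxFailures = 1, _ => true)
    println(first == second) // the choice is deterministic for a fixed peer list
  }
}
```

Seeding the generator with the block id is what makes a retry on the same node land on the same peers, as long as the peer list has not changed.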
  

Creating a DiskBlockObjectWriter: getDiskWriter

  def getDiskWriter(
      blockId: BlockId,
      file: File,
      serializerInstance: SerializerInstance,
      bufferSize: Int,
      writeMetrics: ShuffleWriteMetrics): DiskBlockObjectWriter = {
    val compressStream: OutputStream => OutputStream = wrapForCompression(blockId, _)
    val syncWrites = conf.getBoolean("spark.shuffle.sync", false)
    // DiskBlockObjectWriter writes out the intermediate results computed by Spark tasks
    new DiskBlockObjectWriter(file, serializerInstance, bufferSize, compressStream,
      syncWrites, writeMetrics, blockId)
  }

Getting local Block data: getBlockData

  override def getBlockData(blockId: BlockId): ManagedBuffer = {
    if (blockId.isShuffle) {
      // Shuffle block (ShuffleMapTask output): the intermediate results of multiple partitions
      // are written into the same file
      shuffleManager.shuffleBlockResolver.getBlockData(blockId.asInstanceOf[ShuffleBlockId])
    } else {
      // Otherwise (e.g. for a ResultTask) use doGetLocal to read the local data
      val blockBytesOpt = doGetLocal(blockId, asBlockResult = false)
        .asInstanceOf[Option[ByteBuffer]]
      if (blockBytesOpt.isDefined) {
        val buffer = blockBytesOpt.get
        new NioManagedBuffer(buffer)
      } else {
        throw new BlockNotFoundException(blockId.toString)
      }
    }
  }

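Reading a block from the local stores: doGetLocal

  // Reads a block from the local stores (shuffle blocks take the shuffleBlockResolver path above)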
  private def doGetLocal(blockId: BlockId, asBlockResult: Boolean): Option[Any] = {
    // Look up the block's BlockInfo
    val info = blockInfo.get(blockId).orNull
    if (info != null) {
      info.synchronized {
        // Double check to make sure the block is still there. There is a small chance that the
        // block has been removed by removeBlock (which also synchronizes on the blockInfo object).
        // Note that this only checks metadata tracking. If user intentionally deleted the block
        // on disk or from off heap storage without using removeBlock, this conditional check will
        // still pass but eventually we will get an exception because we can't find the block.
        // In short: check the metadata a second time, because removeBlock may have removed the
        // block in the meantime. Data deleted behind BlockManager's back still passes this check
        // and only fails later with an exception.
        if (blockInfo.get(blockId).isEmpty) {
          logWarning(s"Block $blockId had been removed")
          return None
        }

        // If another thread is still writing this block, wait for it
        if (!info.waitForReady()) {
          // If we get here, the block write failed.
          logWarning(s"Block $blockId was marked as failure.")
          return None
        }
        // The block's storage level
        val level = info.level
        logDebug(s"Level for block $blockId is $level")

        // Look for the block in memory
        if (level.useMemory) {
          logDebug(s"Getting block $blockId from memory")
          val result = if (asBlockResult) {
            memoryStore.getValues(blockId).map(new BlockResult(_, DataReadMethod.Memory, info.size))
          } else {
            memoryStore.getBytes(blockId)
          }
          result match {
            case Some(values) =>
              return result
            case None =>
              logDebug(s"Block $blockId not found in memory")
          }
        }

        // Look for the block in the ExternalBlockStore (off-heap)
        if (level.useOffHeap) {
          logDebug(s"Getting block $blockId from ExternalBlockStore")
          if (externalBlockStore.contains(blockId)) {
            val result = if (asBlockResult) {
              externalBlockStore.getValues(blockId)
                .map(new BlockResult(_, DataReadMethod.Memory, info.size))
            } else {
              externalBlockStore.getBytes(blockId)
            }
            result match {
              case Some(values) =>
                return result
              case None =>
                logDebug(s"Block $blockId not found in ExternalBlockStore")
            }
          }
        }

        // Look for the block on disk
        if (level.useDisk) {
          logDebug(s"Getting block $blockId from disk")
          val bytes: ByteBuffer = diskStore.getBytes(blockId) match {
            case Some(b) => b
            case None =>
              throw new BlockException(
                blockId, s"Block $blockId not found on disk, though it should be")
          }
          assert(0 == bytes.position())

          if (!level.useMemory) {
            // If the block shouldn't be stored in memory, we can just return it
            if (asBlockResult) {
              return Some(new BlockResult(dataDeserialize(blockId, bytes), DataReadMethod.Disk,
                info.size))
            } else {
              return Some(bytes)
            }
          } else {
            // (less important code omitted here)
          }
        }
      }
    } else {
      logDebug(s"Block $blockId not registered locally")
    }
    None
  }
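
dropFromMemory, doPut and doGetLocal all lean on BlockInfo's waitForReady / markReady / markFailure protocol. A rough sketch of how such a protocol can be built on wait and notifyAll is shown below. `ToyBlockInfo` is my own toy class, and the real BlockInfo may differ in its details.

```scala
// Toy stand-in for BlockInfo's readiness protocol; not Spark's actual class.
object ReadySketch {
  final class ToyBlockInfo {
    private var pending = true
    private var failed = false

    /** Blocks while a write is in progress; true means the block is usable. */
    def waitForReady(): Boolean = synchronized {
      while (pending) wait()
      !failed
    }

    /** Called by the writer once the block is safely stored. */
    def markReady(): Unit = synchronized {
      pending = false
      notifyAll()
    }

    /** Called by the writer if the put failed. */
    def markFailure(): Unit = synchronized {
      pending = false
      failed = true
      notifyAll()
    }
  }

  def main(args: Array[String]): Unit = {
    val info = new ToyBlockInfo
    val writer = new Thread(new Runnable {
      def run(): Unit = { Thread.sleep(50); info.markReady() }
    })
    writer.start()
    println(info.waitForReady()) // the reader blocks here until the writer is done
    writer.join()
  }
}
```

This is why a reader that arrives in the middle of a put simply waits, and why a failed put makes every waiting reader return None instead of reading half-written data.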

Getting remote Block data: doGetRemote

  private def doGetRemote(blockId: BlockId, asBlockResult: Boolean): Option[Any] = {
    require(blockId != null, "BlockId is null")
    // Send a getLocations message to BlockManagerMaster to find the BlockManagerIds holding the block,
    // then shuffle them so that we don't always read from the same BlockManager
    val locations = Random.shuffle(master.getLocations(blockId))
    var numFetchFailures = 0
    for (loc <- locations) {
      logDebug(s"Getting remote block $blockId from $loc")
      val data = try {
        // Fetch the block data remotely through blockTransferService
        blockTransferService.fetchBlockSync(
          loc.host, loc.port, loc.executorId, blockId.toString).nioByteBuffer()
      } catch {
        case NonFatal(e) =>
          numFetchFailures += 1
          if (numFetchFailures == locations.size) {
            // An exception is thrown while fetching this block from all locations
            throw new BlockFetchException(s"Failed to fetch block from" +
              s" ${locations.size} locations. Most recent failure cause:", e)
          } else {
            // This location failed, so we retry fetch from a different one by returning null here
            logWarning(s"Failed to fetch remote block $blockId " +
              s"from $loc (failed attempt $numFetchFailures)", e)
            null
          }
      }
      
      if (data != null) {
        if (asBlockResult) {
          // Deserialize the data and return it
          return Some(new BlockResult(
            dataDeserialize(blockId, data),
            DataReadMethod.Network,
            data.limit()))
        } else {
          return Some(data)
        }
      }
      logDebug(s"The value of block $blockId is null")
    }
    logDebug(s"Block $blockId not found")
    None
  }
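
The fallback behaviour (try the locations in random order, and only throw once every location has failed) can be sketched on its own like this. `fetch` is a hypothetical stand-in for blockTransferService.fetchBlockSync, and a plain RuntimeException replaces BlockFetchException.

```scala
import scala.util.Random
import scala.util.control.NonFatal

object RemoteFetchSketch {
  /**
   * `fetch` is a hypothetical stand-in for fetchBlockSync:
   * it either returns the bytes or throws.
   */
  def getRemote(locations: Seq[String], fetch: String => Array[Byte]): Option[Array[Byte]] = {
    val shuffled = Random.shuffle(locations) // spread reads across the replicas
    var failures = 0
    var result: Option[Array[Byte]] = None
    val it = shuffled.iterator
    while (result.isEmpty && it.hasNext) {
      val loc = it.next()
      try {
        result = Some(fetch(loc))
      } catch {
        case NonFatal(e) =>
          failures += 1
          // Only give up loudly once every known location has failed.
          if (failures == shuffled.size) {
            throw new RuntimeException(
              s"Failed to fetch block from ${shuffled.size} locations", e)
          }
      }
    }
    result // None when no locations are known
  }

  def main(args: Array[String]): Unit = {
    val locations = Seq("exec-1", "exec-2", "exec-3")
    val data = getRemote(
      locations,
      loc => if (loc == "exec-1") throw new java.io.IOException("down") else Array[Byte](1, 2))
    println(data.map(_.length)) // one dead replica does not stop the fetch
  }
}
```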

Getting Block data: get

  def get(blockId: BlockId): Option[BlockResult] = {
    // Check locally first
    val local = getLocal(blockId)
    if (local.isDefined) {
      logInfo(s"Found block $blockId locally")
      return local
    }
    // Not found locally, so fetch it from a remote node
    val remote = getRemote(blockId)
    if (remote.isDefined) {
      logInfo(s"Found block $blockId remotely")
      return remote
    }
    None
  }

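Serializing to a stream: dataSerializeStream

Compression

Writing to disk: DiskBlockObjectWriter

Shuffle block index manager: IndexShuffleBlockManager

Shuffle memory manager: ShuffleMemoryManager

These are huge topics... I'm not going to write them up for now.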
