9.1 Spark Memory Management: MemoryManager

Image source for this chapter: https://www.ibm.com/developerworks/cn/analytics/library/ba-cn-apache-spark-memory-management/index.html

1. Why Spark manages memory itself

As the earlier code showed, an RDD is internally a sequence of Partitions, and a Partition is in turn backed by a Block.
Spark relies on the BlockManager on each Executor to manage the storage level of the actual data. To trade space for speed, the typical strategy is to cache as much data in memory as possible while steering clear of the JVM's garbage-collection pitfalls. Spark therefore splits memory management out into a dedicated service: every SparkEnv starts a MemoryManager that manages memory in a unified way.

Figure: the JVM perspective

According to the introduction from Duke University, Spark started this effort mainly because JVM parameters are hard to tune across Spark's different workload types (offline analytics, streaming, machine learning, SQL), where performance can differ by as much as 2x.
Spark 1.6 therefore set out, as planned, to build a memory-management layer on top of the JVM.

Figure: Spark's memory-management perspective

Under this memory-management model, Spark relies on the JVM's G1 garbage collector to implement a memory-management module that adapts to different workloads.

2. The internal structure of MemoryManager

The MemoryManager mainly manages execution memory and storage memory. Initializing it requires:

  • numCores: the number of cores allocated locally
  • storageMemory: the number of bytes of memory for storage
  • onHeapExecutionMemory: the number of bytes of memory for execution; this is on-heap, because off-heap memory uses Tachyon's memory space and is unrelated to the JVM.
/**
 * An abstract memory manager that enforces how memory is shared between execution and storage.
 *
 * In this context, execution memory refers to that used for computation in shuffles, joins,
 * sorts and aggregations, while storage memory refers to that used for caching and propagating
 * internal data across the cluster. There exists one MemoryManager per JVM.
 */
private[spark] abstract class MemoryManager(
    conf: SparkConf,
    numCores: Int,
    storageMemory: Long,
    onHeapExecutionMemory: Long) 
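
For context, this is roughly how SparkEnv chooses the concrete implementation in Spark 1.6: a minimal sketch assuming SparkEnv's surrounding code (conf and numUsableCores in scope), with the real config key spark.memory.useLegacyMode switching back to the pre-1.6 static model.

  // Sketch of MemoryManager construction in SparkEnv (Spark 1.6, simplified).
  val useLegacyMemoryManager = conf.getBoolean("spark.memory.useLegacyMode", false)
  val memoryManager: MemoryManager =
    if (useLegacyMemoryManager) {
      new StaticMemoryManager(conf, numUsableCores)  // fixed storage/execution split
    } else {
      UnifiedMemoryManager(conf, numUsableCores)     // regions can borrow from each other
    }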

The MemoryManager maintains the corresponding memory pools:

 // -- Methods related to memory allocation policies and bookkeeping ------------------------------

  @GuardedBy("this")
  protected val storageMemoryPool = new StorageMemoryPool(this)
  @GuardedBy("this")
  protected val onHeapExecutionMemoryPool = new ExecutionMemoryPool(this, "on-heap execution")
  @GuardedBy("this")
  protected val offHeapExecutionMemoryPool = new ExecutionMemoryPool(this, "off-heap execution")

  storageMemoryPool.incrementPoolSize(storageMemory)
  onHeapExecutionMemoryPool.incrementPoolSize(onHeapExecutionMemory)
  offHeapExecutionMemoryPool.incrementPoolSize(conf.getSizeAsBytes("spark.memory.offHeap.size", 0))
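
To make the pool bookkeeping concrete, here is a simplified sketch of what such a pool tracks: a hypothetical class modeled on Spark's MemoryPool, not the real StorageMemoryPool or ExecutionMemoryPool (which add eviction and per-task accounting on top).

  // Simplified, hypothetical memory-pool bookkeeping, guarded by the manager's lock.
  class SketchMemoryPool(lock: Object) {
    private var _poolSize: Long = 0L    // capacity granted to this pool
    private var _memoryUsed: Long = 0L  // bytes currently handed out

    def memoryFree: Long = lock.synchronized { _poolSize - _memoryUsed }

    def incrementPoolSize(delta: Long): Unit = lock.synchronized {
      require(delta >= 0, "pool can only grow here")
      _poolSize += delta
    }

    def acquire(numBytes: Long): Boolean = lock.synchronized {
      if (numBytes <= _poolSize - _memoryUsed) { _memoryUsed += numBytes; true }
      else false
    }

    def release(numBytes: Long): Unit = lock.synchronized {
      _memoryUsed = math.max(0L, _memoryUsed - numBytes)
    }
  }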

3. Core methods of memory management

Figure: the memory layout

In the newer unified memory manager model, the earlier storage + execution split is subdivided further.

After this subdivision, a new concept appears inside storage: unroll. For anyone who has worked on the Linux kernel, this mechanism is easy to understand: it is essentially the buddy algorithm for memory pages, implemented on the JVM.

The concept is a lot like going out to a restaurant: we may not know exactly how many people are coming, but we can claim seats based on a rough estimate, and those seats must be contiguous, since my friends and I have to sit at the same table! Unroll is just such a model for reserving contiguous memory, and it improves execution efficiency: it takes data blocks that may be scattered on disk (in practice, the ext4 filesystem under HDFS tries to write contiguously) and lays them out flat in one contiguous region of memory.
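
As a rough sketch of how the unified model sizes these regions in Spark 1.6 (the config keys spark.memory.fraction and spark.memory.storageFraction and their 0.75 / 0.5 defaults are from the 1.6 documentation; the arithmetic here is simplified):

  // How the unified model carves up the heap (Spark 1.6 defaults, simplified).
  val systemMemory = Runtime.getRuntime.maxMemory
  val reservedMemory = 300L * 1024 * 1024            // reserved for Spark internals
  val usableMemory = systemMemory - reservedMemory
  val maxMemory = (usableMemory * 0.75).toLong       // spark.memory.fraction
  val storageRegionSize = (maxMemory * 0.5).toLong   // spark.memory.storageFraction
  // execution gets the rest, and either side may borrow free space from the other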

The following three methods acquire memory from the corresponding pools:

  /**
   * Acquire N bytes of memory to cache the given block, evicting existing ones if necessary.
   * Blocks evicted in the process, if any, are added to `evictedBlocks`.
   * @return whether all N bytes were successfully granted.
   */
  def acquireStorageMemory(
      blockId: BlockId,
      numBytes: Long,
      evictedBlocks: mutable.Buffer[(BlockId, BlockStatus)]): Boolean

  /**
   * Acquire N bytes of memory to unroll the given block, evicting existing ones if necessary.
   *
   * This extra method allows subclasses to differentiate behavior between acquiring storage
   * memory and acquiring unroll memory. For instance, the memory management model in Spark
   * 1.5 and before places a limit on the amount of space that can be freed from unrolling.
   * Blocks evicted in the process, if any, are added to `evictedBlocks`.
   *
   * @return whether all N bytes were successfully granted.
   */
  def acquireUnrollMemory(
      blockId: BlockId,
      numBytes: Long,
      evictedBlocks: mutable.Buffer[(BlockId, BlockStatus)]): Boolean

  /**
   * Try to acquire up to `numBytes` of execution memory for the current task and return the
   * number of bytes obtained, or 0 if none can be allocated.
   *
   * This call may block until there is enough free memory in some situations, to make sure each
   * task has a chance to ramp up to at least 1 / 2N of the total memory pool (where N is the # of
   * active tasks) before it is forced to spill. This can happen if the number of tasks increase
   * but an older task had a lot of memory already.
   */
  private[memory]
  def acquireExecutionMemory(
      numBytes: Long,
      taskAttemptId: Long,
      memoryMode: MemoryMode): Long
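
A quick worked example of the 1/2N guarantee described in the doc comment above (the sizes are hypothetical):

  // Per-task bounds enforced by acquireExecutionMemory (hypothetical numbers).
  val poolSize = 1024L * 1024 * 1024                // 1 GiB execution pool
  val numActiveTasks = 4                            // N
  val maxPerTask = poolSize / numActiveTasks        // 1/N  = 256 MiB cap
  val minPerTask = poolSize / (2 * numActiveTasks)  // 1/2N = 128 MiB floor before spilling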

Matching the acquire methods, there are methods to query how much memory is in use and to release memory that was previously acquired:

  /**
   * Execution memory currently in use, in bytes.
   */
  final def executionMemoryUsed: Long = synchronized {
    onHeapExecutionMemoryPool.memoryUsed + offHeapExecutionMemoryPool.memoryUsed
  }

  /**
   * Release N bytes of storage memory.
   */
  def releaseStorageMemory(numBytes: Long): Unit = synchronized {
    storageMemoryPool.releaseMemory(numBytes)
  }
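
Putting acquire and release together, a caller such as the MemoryStore pairs them roughly like this; a hypothetical usage sketch (blockId, sizeInBytes, and evictedBlocks are assumed to be in scope), not code from Spark itself:

  // Hypothetical caller-side pairing of acquire and release.
  if (memoryManager.acquireStorageMemory(blockId, sizeInBytes, evictedBlocks)) {
    // enough memory was granted (possibly after evicting other blocks): cache the block
  } else {
    // the grant failed even after eviction: fall back to disk or skip caching
  }
  // later, when the cached block is dropped:
  memoryManager.releaseStorageMemory(sizeInBytes)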

Spark's memory management also has a concept very similar to one in kernel memory management: the memory page. In the kernel, memory pages are managed with the buddy algorithm; Spark is much the same, allocating contiguous runs of 2^n pages to the blocks that need them. The default page size in the Linux kernel is typically 4K, while in Spark it is 1M.

  /**
   * The default page size, in bytes.
   *
   * If user didn't explicitly set "spark.buffer.pageSize", we figure out the default value
   * by looking at the number of cores available to the process, and the total amount of memory,
   * and then divide it by a factor of safety.
   */
  val pageSizeBytes: Long = {
    val minPageSize = 1L * 1024 * 1024   // 1MB
    val maxPageSize = 64L * minPageSize  // 64MB
    val cores = if (numCores > 0) numCores else Runtime.getRuntime.availableProcessors()
    // Because of rounding to next power of 2, we may have safetyFactor as 8 in worst case
    val safetyFactor = 16
    val maxTungstenMemory: Long = tungstenMemoryMode match {
      case MemoryMode.ON_HEAP => onHeapExecutionMemoryPool.poolSize
      case MemoryMode.OFF_HEAP => offHeapExecutionMemoryPool.poolSize
    }
    val size = ByteArrayMethods.nextPowerOf2(maxTungstenMemory / cores / safetyFactor)
    val default = math.min(maxPageSize, math.max(minPageSize, size))
    conf.getSizeAsBytes("spark.buffer.pageSize", default)
  }
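
To see how the rounding and clamping play out, here is a hypothetical worked example: 2 GiB of Tungsten execution memory and 8 cores.

  // Worked example for the default page size (hypothetical numbers).
  val maxTungstenMemory = 2L * 1024 * 1024 * 1024  // 2 GiB execution pool
  val cores = 8
  val size = maxTungstenMemory / cores / 16        // safetyFactor = 16, giving 16 MiB
  // nextPowerOf2(16 MiB) = 16 MiB, already within [1 MiB, 64 MiB], so
  // pageSizeBytes defaults to 16 MiB unless spark.buffer.pageSize is set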
