An abstract memory manager that enforces how memory is shared between execution and storage. In this context, execution memory refers to that used for computation in shuffles, joins, sorts and aggregations, while storage memory refers to that used for caching and propagating internal data across the cluster. There exists one MemoryManager per JVM.
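In other words, execution memory is the memory used for computation in shuffles, joins, sorts and aggregations, while storage memory is the memory used to cache data and to propagate internal data across the cluster (broadcast variables, for example). Each JVM has exactly one MemoryManager.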
private[spark] abstract class MemoryManager(
    conf: SparkConf,
    numCores: Int,
    onHeapStorageMemory: Long,    // size of the on-heap memory used for storage
    onHeapExecutionMemory: Long   // size of the on-heap memory used for execution
  ) extends Logging {
// -- Methods related to memory allocation policies and bookkeeping ------------------------------
protected val onHeapStorageMemoryPool = new StorageMemoryPool(this, MemoryMode.ON_HEAP)
protected val offHeapStorageMemoryPool = new StorageMemoryPool(this, MemoryMode.OFF_HEAP)
protected val onHeapExecutionMemoryPool = new ExecutionMemoryPool(this, MemoryMode.ON_HEAP)
protected val offHeapExecutionMemoryPool = new ExecutionMemoryPool(this, MemoryMode.OFF_HEAP)
protected[this] val maxOffHeapMemory = conf.get(MEMORY_OFFHEAP_SIZE)
protected[this] val offHeapStorageMemory =
  (maxOffHeapMemory * conf.getDouble("spark.memory.storageFraction", 0.5)).toLong
offHeapExecutionMemoryPool.incrementPoolSize(maxOffHeapMemory - offHeapStorageMemory)
private[spark] val MEMORY_OFFHEAP_SIZE = ConfigBuilder("spark.memory.offHeap.size")
  .doc("The absolute amount of memory in bytes which can be used for off-heap allocation. " +
    "This setting has no impact on heap memory usage, so if your executors' total memory " +
    "consumption must fit within some hard limit then be sure to shrink your JVM heap size " +
    "accordingly. This must be set to a positive value when spark.memory.offHeap.enabled=true.")
  .checkValue(_ >= 0, "The off-heap memory size must not be negative")
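conf.getDouble("spark.memory.storageFraction", 0.5): the fraction of off-heap memory used for storage defaults to 0.5. Note also that off-heap memory is split into exactly two parts, one for storage and one for execution, and together they account for all of the configured off-heap memory.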
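The split described above can be reproduced without a SparkConf. The following is a minimal sketch that only mirrors the arithmetic of the excerpt; `OffHeapSplit` and `split` are hypothetical names for illustration, not part of Spark.

```scala
// Hypothetical helper, not Spark API: mirrors the off-heap arithmetic shown above.
object OffHeapSplit {
  /** Returns (storageBytes, executionBytes) for a given off-heap budget. */
  def split(maxOffHeapMemory: Long, storageFraction: Double = 0.5): (Long, Long) = {
    // Storage gets the configured fraction, truncated to whole bytes.
    val storage = (maxOffHeapMemory * storageFraction).toLong
    // Execution gets everything that is left, so the two always sum to the total.
    val execution = maxOffHeapMemory - storage
    (storage, execution)
  }
}
```

Because execution is computed as the remainder, no byte of the off-heap budget is left unassigned, even when the fraction does not divide the total evenly.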
val useLegacyMemoryManager = conf.getBoolean("spark.memory.useLegacyMode", false)
val memoryManager: MemoryManager =
  if (useLegacyMemoryManager) {
    new StaticMemoryManager(conf, numUsableCores)
  } else {
    UnifiedMemoryManager(conf, numUsableCores)
  }
A MemoryManager that enforces a soft boundary between execution and storage such that either side can borrow memory from the other.
The region shared between execution and storage is a fraction of (the total heap space - 300MB) configurable through spark.memory.fraction (default 0.6). The position of the boundary within this space is further determined by spark.memory.storageFraction (default 0.5). This means the size of the storage region is 0.6 * 0.5 = 0.3 of the heap space by default.
Storage can borrow as much execution memory as is free until execution reclaims its space. When this happens, cached blocks will be evicted from memory until sufficient borrowed memory is released to satisfy the execution memory request. Similarly, execution can borrow as much storage memory as is free. However, execution memory is never evicted by storage due to the complexities involved in implementing this. The implication is that attempts to cache blocks may fail if execution has already eaten up most of the storage space, in which case the new blocks will be evicted immediately according to their respective storage levels.
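The borrowing rules above can be illustrated with a toy model. This is a simplified sketch, not Spark's implementation: `SoftBoundaryModel` and its methods are hypothetical names, and the model ignores per-task limits, block granularity, and the fact that storage may evict its own older blocks to make room.

```scala
// Toy model of the soft boundary. All names are hypothetical, not Spark API.
final class SoftBoundaryModel(total: Long, storageRegion: Long) {
  require(total >= 0 && storageRegion >= 0 && storageRegion <= total)

  private var storageUsed = 0L
  private var executionUsed = 0L

  def free: Long = total - storageUsed - executionUsed
  def storage: Long = storageUsed
  def execution: Long = executionUsed

  /** Storage may use any free memory but never evicts execution. */
  def acquireStorage(bytes: Long): Boolean =
    if (bytes <= free) { storageUsed += bytes; true } else false

  /** Execution uses free memory first, then evicts cached data that storage
    * holds beyond its own region. Returns the number of bytes granted. */
  def acquireExecution(bytes: Long): Long = {
    val evictable = math.max(0L, storageUsed - storageRegion)
    val granted = math.min(bytes, free + evictable)
    val toEvict = math.max(0L, granted - free) // only evict what free memory cannot cover
    storageUsed -= toEvict
    executionUsed += granted
    granted
  }
}
```

With `total = 1000` and `storageRegion = 500`, caching 800 bytes succeeds by borrowing from the execution side. A later execution request for 400 bytes takes the 200 free bytes and evicts 200 cached bytes. Once storage is back down to its 500-byte region, execution cannot evict any further, and a new caching attempt fails because nothing is free.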
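To restate: the region shared by execution and storage is a fraction of (total heap memory - 300MB), configured through spark.memory.fraction (default 0.6). The position of the boundary inside that region is set by spark.memory.storageFraction (default 0.5). By default, then, the storage region is 0.6 * 0.5 = 0.3 of the heap space that remains after the 300MB reservation.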
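To make the 0.6 * 0.5 = 0.3 figure concrete, here is a small sketch that computes the storage region for a given heap size. `StorageRegionSketch` is a hypothetical name, and the 300MB reservation is assumed to be 300 * 1024 * 1024 bytes.

```scala
// Hypothetical helper, not Spark API. Assumes the reservation is 300 MiB.
object StorageRegionSketch {
  val ReservedBytes: Long = 300L * 1024 * 1024

  /** Storage region size in bytes for a given JVM heap size. */
  def storageRegion(
      heapBytes: Long,
      memoryFraction: Double = 0.6,   // spark.memory.fraction
      storageFraction: Double = 0.5   // spark.memory.storageFraction
  ): Long = {
    // Region shared by execution and storage.
    val shared = ((heapBytes - ReservedBytes) * memoryFraction).toLong
    // Part of the shared region on the storage side of the boundary.
    (shared * storageFraction).toLong
  }
}
```

For a 4 GiB heap this yields about 1.11 GiB, roughly 28% of the whole heap rather than a full 30%, because the reservation is subtracted before the fractions are applied. The gap widens for smaller heaps.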
/**
 * Return the total amount of memory shared between execution and storage, in bytes.
 */
private def getMaxMemory(conf: SparkConf): Long = {
  val systemMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
  val reservedMemory = conf.getLong("spark.testing.reservedMemory",
    if (conf.contains("spark.testing")) 0 else RESERVED_SYSTEM_MEMORY_BYTES)
  // The heap must be at least 1.5x the reserved memory
  val minSystemMemory = (reservedMemory * 1.5).ceil.toLong
  if (systemMemory < minSystemMemory) {
    throw new IllegalArgumentException(s"System memory $systemMemory must " +
      s"be at least $minSystemMemory. Please increase heap size using the --driver-memory " +
      s"option or spark.driver.memory in Spark configuration.")
  }
  // SPARK-12759 Check executor memory to fail fast if memory is insufficient
  if (conf.contains("spark.executor.memory")) {
    val executorMemory = conf.getSizeAsBytes("spark.executor.memory")
    if (executorMemory < minSystemMemory) {
      throw new IllegalArgumentException(s"Executor memory $executorMemory must be at least " +
        s"$minSystemMemory. Please increase executor memory using the " +
        s"--executor-memory option or spark.executor.memory in Spark configuration.")
    }
  }
  // Only the heap left after the reservation is shared by execution and storage
  val usableMemory = systemMemory - reservedMemory
  val memoryFraction = conf.getDouble("spark.memory.fraction", 0.6)
  (usableMemory * memoryFraction).toLong
}
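The fail-fast check in getMaxMemory can be tried out in isolation. The sketch below replaces SparkConf with plain parameters; `MaxMemorySketch` is a hypothetical name, and the default reservation of 300 * 1024 * 1024 bytes is an assumption standing in for RESERVED_SYSTEM_MEMORY_BYTES.

```scala
// Hypothetical stand-in for getMaxMemory that takes plain values instead of a SparkConf.
object MaxMemorySketch {
  // Assumed value of the reservation (300 MiB).
  val ReservedBytes: Long = 300L * 1024 * 1024

  def maxMemory(
      systemMemory: Long,
      reserved: Long = ReservedBytes,
      memoryFraction: Double = 0.6
  ): Long = {
    // Same threshold as the excerpt: the heap must be at least 1.5x the reservation.
    val minSystemMemory = (reserved * 1.5).ceil.toLong
    // require throws IllegalArgumentException, like the explicit throw in the excerpt.
    require(systemMemory >= minSystemMemory,
      s"System memory $systemMemory must be at least $minSystemMemory")
    ((systemMemory - reserved) * memoryFraction).toLong
  }
}
```

Under these assumptions the minimum accepted heap is 471859200 bytes (450 MiB), so a 400 MiB heap is rejected, while a 1 GiB heap leaves about 434 MiB for execution and storage combined.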