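Driver: Each running Spark job has one Driver process, which is the job's main process. It parses the job, builds Stages, and schedules Tasks onto Executors. It contains the DAGScheduler and the TaskScheduler.
Executor: Where the job actually runs. A cluster normally has many Executors; each one receives Launch Task commands from the Driver, and a single Executor can run one or more Tasks.
Application: Carries its own memory and CPU requirements. It queues in the Master and is eventually dispatched to Workers for execution. Starting an app means walking the Workers to find available CPU cores, then launching executors on those Workers.
Worker: One is started on each slave machine, with a default or configured number of CPU cores and amount of memory; it tracks the remaining resources by adding and subtracting in memory. The Worker is also responsible for launching the local executor backend, i.e. the process that does the execution.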
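The in-memory add/subtract bookkeeping described for the Worker can be pictured with a small sketch. WorkerLedger and its method names are hypothetical and are not Spark's WorkerInfo; the sketch only assumes that free resources are the configured totals minus what has been handed out.

```scala
// Hypothetical stand-in for per-worker resource bookkeeping; not Spark's WorkerInfo.
final class WorkerLedger(val cores: Int, val memoryMB: Int) {
  private var coresUsed = 0
  private var memoryUsed = 0

  // Free resources are derived: configured total minus what is in use.
  def coresFree: Int = cores - coresUsed
  def memoryFree: Int = memoryMB - memoryUsed

  /** Reserve resources for a new executor; returns false if they do not fit. */
  def reserve(c: Int, m: Int): Boolean =
    if (c <= coresFree && m <= memoryFree) {
      coresUsed += c
      memoryUsed += m
      true
    } else {
      false
    }

  /** Give resources back when an executor exits. */
  def release(c: Int, m: Int): Unit = {
    coresUsed = math.max(0, coresUsed - c)
    memoryUsed = math.max(0, memoryUsed - m)
  }
}

object WorkerLedgerDemo {
  def main(args: Array[String]): Unit = {
    val w = new WorkerLedger(cores = 10, memoryMB = 8192)
    println(w.reserve(2, 1024)) // true
    println(w.coresFree)        // 8
    println(w.reserve(9, 1024)) // false: only 8 cores are free
    w.release(2, 1024)
    println(w.coresFree)        // 10
  }
}
```

The scheduling code that follows reads exactly these two derived values, coresFree and memoryFree, when deciding whether a worker can host a driver or an executor.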
case RegisterApplication(description) => {
  if (state == RecoveryState.STANDBY) {
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, sender)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    sender ! RegisteredApplication(app.id, masterUrl)
  }
}
/**
 * Schedule the currently available resources among waiting apps. This method will be called
 * every time a new app joins or resource availability changes.
 */
private def schedule(): Unit = {
  if (state != RecoveryState.ALIVE) { return }
  // Drivers take strict precedence over executors
  val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers: workers are visited in random order
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
    for (driver <- waitingDrivers) { // several drivers may be waiting to start
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) { // does this worker have enough free memory and cores?
        launchDriver(worker, driver) // start the driver on this worker
        waitingDrivers -= driver
      }
    }
  }
}
/**
 * Schedule and launch executors on workers
 */
private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  for (app <- waitingApps if app.coresLeft > 0) {
    val coresPerExecutor: Option[Int] = app.desc.coresPerExecutor
    // Filter out workers that don't have enough resources to launch an executor
    val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE) // keep only workers that are alive
      .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB && // enough free memory for one executor of this app
        worker.coresFree >= coresPerExecutor.getOrElse(1)) // enough free cores for one executor (1 core if unspecified)
    val assignedCores = Master.scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps) // walk the workers and decide how many cores each one gives; details below
    // Now that we've decided how many cores to allocate on each worker, let's allocate them
    for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
      allocateWorkerResourceToExecutors(
        app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
    }
  }
}
The call Master.scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps) above resolves to the following method:
def scheduleExecutorsOnWorkers(
    app: ApplicationInfo,
    usableWorkers: Array[WorkerInfo],
    spreadOutApps: Boolean): Array[Int] = {
  // If the number of cores per executor is not specified, then we can just schedule
  // 1 core at a time since we expect a single executor to be launched on each worker
  val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(1)
  val memoryPerExecutor = app.desc.memoryPerExecutorMB
  val numUsable = usableWorkers.length
  val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker (one slot per usable worker)
  val assignedMemory = new Array[Int](numUsable) // Amount of memory to give to each worker
  var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum) // the smaller of what the app still needs and what the usable workers have free; see Note 1
  var freeWorkers = (0 until numUsable).toIndexedSeq
  def canLaunchExecutor(pos: Int): Boolean = {
    usableWorkers(pos).coresFree - assignedCores(pos) >= coresPerExecutor &&
      usableWorkers(pos).memoryFree - assignedMemory(pos) >= memoryPerExecutor
  }
  while (coresToAssign >= coresPerExecutor && freeWorkers.nonEmpty) { // keep making passes over the workers while cores remain to be assigned
    freeWorkers = freeWorkers.filter(canLaunchExecutor)
    freeWorkers.foreach { pos =>
      var keepScheduling = true
      while (keepScheduling && canLaunchExecutor(pos) && coresToAssign >= coresPerExecutor) {
        coresToAssign -= coresPerExecutor
        assignedCores(pos) += coresPerExecutor
        // If cores per executor is not set, we are assigning 1 core at a time
        // without actually meaning to launch 1 executor for each core assigned
        if (app.desc.coresPerExecutor.isDefined) {
          assignedMemory(pos) += memoryPerExecutor
        }
        // Spreading out an application means spreading out its executors across as
        // many workers as possible. If we are not spreading out, then we should keep
        // scheduling executors on this worker until we use all of its resources.
        // Otherwise, just move on to the next worker.
        if (spreadOutApps) { // spreadOutApps = true (the default): take one executor's worth here, then move on
          keepScheduling = false
        }
        // With spreadOutApps = false, keepScheduling stays true, so this worker keeps
        // receiving cores until the app's demand or the worker's resources run out.
      }
    }
  }
  assignedCores
}
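Back in startExecutorsOnWorkers, each worker's share of cores is then handed to allocateWorkerResourceToExecutors:
for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
  allocateWorkerResourceToExecutors(
    app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
}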
/**
 * Allocate a worker's resources to one or more executors.
 * @param app the info of the application which the executors belong to
 * @param assignedCores number of cores on this worker for this application
 * @param coresPerExecutor number of cores per executor
 * @param worker the worker info
 */
private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    assignedCores: Int,
    coresPerExecutor: Option[Int],
    worker: WorkerInfo): Unit = {
  // If the number of cores per executor is specified, we divide the cores assigned
  // to this worker evenly among the executors with no remainder.
  // Otherwise, we launch a single executor that grabs all the assignedCores on this worker.
  val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1) // number of executors to start on this worker; see Note 2
  val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
  for (i <- 1 to numExecutors) {
    val exec = app.addExecutor(worker, coresToAssign)
    launchExecutor(worker, exec) // start the executor on the worker
    app.state = ApplicationState.RUNNING
  }
}
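To make the two branches of numExecutors and coresToAssign concrete, here is a minimal standalone sketch that applies the same two expressions outside of Spark. The object ExecutorSplit and the method split are names made up for this illustration.

```scala
object ExecutorSplit {
  /**
   * Returns (number of executors, cores per executor) for one worker, using the
   * same two expressions as allocateWorkerResourceToExecutors.
   */
  def split(assignedCores: Int, coresPerExecutor: Option[Int]): (Int, Int) = {
    val numExecutors = coresPerExecutor.map(assignedCores / _).getOrElse(1)
    val coresEach = coresPerExecutor.getOrElse(assignedCores)
    (numExecutors, coresEach)
  }

  def main(args: Array[String]): Unit = {
    println(split(6, Some(2))) // (3,2): three executors with 2 cores each
    println(split(6, None))    // (1,6): one executor that takes all 6 cores
    println(split(4, Some(2))) // (2,2)
  }
}
```

When coresPerExecutor is set, assignedCores is always built up in steps of that size, which is why the division leaves no remainder.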
--executor-cores NUM Number of cores per executor. (Default: 1 in YARN mode, or all available cores on the worker in standalone mode)
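Note 1: var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum) takes the smaller of the two values, which gives two cases.
Case 1, the app's remaining demand is the smaller value. Example: 5 workers with 10 free cores each. Command: --executor-cores 2 --total-executor-cores 6. Here coresToAssign = min(6, 50) = 6, so with spreadOutApps = true three workers are assigned 2 cores each and start one executor each.
PS: with spreadOutApps = false, all 6 cores are assigned on the first worker. Given enough memory, that worker then starts 6 / 2 = 3 executors of 2 cores each (numExecutors = assignedCores / coresPerExecutor), not a single executor.
Case 2, the free cores on the workers are the smaller value. Example: 5 workers, 2 of them with 3 free cores each and 3 of them with 1 free core each. Command: --executor-cores 2 --total-executor-cores 30
The 5 workers have 2 x 3 + 3 x 1 = 9 free cores in total, well below the requested 30. Because usableWorkers only keeps workers with at least coresPerExecutor = 2 free cores, the sum that actually enters the min is 2 x 3 = 6. Each of the 2 workers that meet the 2-core minimum starts one executor (its third core cannot fit another one), for 2 executors in total.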
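Note 2: val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
Example: 3 workers, 1 with 10 free cores and 2 with 3 free cores each. Command: --executor-cores 2 --total-executor-cores 8
Here the scheduler walks the workers in rounds. After the first round every worker has been assigned one executor's worth of cores, 6 cores in total, which is less than --total-executor-cores 8, so it makes another pass over the workers that can still fit an executor. Worker 1 still has 8 free cores (the other two have only 1 left, below coresPerExecutor), so it is assigned another 2 cores and the total reaches the requested 8. Worker 1 therefore starts 4 / 2 = 2 executors and the other two workers start 1 executor each, for 4 executors in total.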
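The three examples in Notes 1 and 2 can be replayed with a small cores-only simulation of the assignment loop. This is a sketch under stated assumptions and not the Spark implementation: memory is ignored, --executor-cores is assumed to be set, and CoreAssignmentSketch, assign, and executorsPerWorker are names invented for this illustration.

```scala
object CoreAssignmentSketch {
  /**
   * Cores-only re-creation of the loop in scheduleExecutorsOnWorkers.
   * freeCores(i) is the free core count of worker i; the result is the number
   * of cores assigned to each worker. Memory limits are ignored (assumption).
   */
  def assign(
      freeCores: Array[Int],
      coresLeft: Int,
      coresPerExecutor: Int,
      spreadOut: Boolean): Array[Int] = {
    val assigned = new Array[Int](freeCores.length)
    def canLaunch(pos: Int): Boolean =
      freeCores(pos) - assigned(pos) >= coresPerExecutor

    // Mirrors the usableWorkers filter: only workers that fit one executor count.
    var free = freeCores.indices.filter(canLaunch)
    var coresToAssign = math.min(coresLeft, free.map(i => freeCores(i)).sum)

    while (coresToAssign >= coresPerExecutor && free.nonEmpty) {
      free = free.filter(canLaunch)
      free.foreach { pos =>
        var keepScheduling = true
        while (keepScheduling && canLaunch(pos) && coresToAssign >= coresPerExecutor) {
          coresToAssign -= coresPerExecutor
          assigned(pos) += coresPerExecutor
          // Spreading out: one executor's worth per worker per pass.
          if (spreadOut) keepScheduling = false
        }
      }
    }
    assigned
  }

  /** Executors started on each worker, given its assigned cores. */
  def executorsPerWorker(assigned: Array[Int], coresPerExecutor: Int): Array[Int] =
    assigned.map(_ / coresPerExecutor)

  def main(args: Array[String]): Unit = {
    // Note 1, case 1: 5 workers x 10 cores, --executor-cores 2 --total-executor-cores 6
    println(assign(Array(10, 10, 10, 10, 10), 6, 2, spreadOut = true).mkString(","))  // 2,2,2,0,0
    println(assign(Array(10, 10, 10, 10, 10), 6, 2, spreadOut = false).mkString(",")) // 6,0,0,0,0
    // Note 1, case 2: free cores 3,3,1,1,1, --executor-cores 2 --total-executor-cores 30
    println(assign(Array(3, 3, 1, 1, 1), 30, 2, spreadOut = true).mkString(","))      // 2,2,0,0,0
    // Note 2: free cores 10,3,3, --executor-cores 2 --total-executor-cores 8
    val note2 = assign(Array(10, 3, 3), 8, 2, spreadOut = true)
    println(note2.mkString(","))                        // 4,2,2
    println(executorsPerWorker(note2, 2).mkString(",")) // 2,1,1
  }
}
```

Changing spreadOut or the free-core arrays is a quick way to check other configurations by hand before comparing against a real cluster.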