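Preparation before the Executor runs a Task:
Before looking at how the Executor runs a Task, first look at an important class: CoarseGrainedExecutorBackend.
CoarseGrainedExecutorBackend is a coarse-grained ExecutorBackend. It is the process in which the Executor lives, and it is the side that communicates with the Driver. Its responsibilities are:
In its onStart method, it sends the Executor registration request to the Driver.
It receives the registration result returned by the Driver and then creates the Executor. The Executor is the object that actually processes Tasks, and it always does so through a thread pool.
It receives the LaunchTask message that originates from the Driver's TaskScheduler and starts launching and computing the Task.
How the Executor runs a Task:
When CoarseGrainedExecutorBackend receives the RegisteredExecutor message from the Driver, it creates the Executor. When it receives a LaunchTask message from the Driver, it starts running the Task: it first deserializes the TaskDescription that was sent, then calls launchTask to hand the Task to the Executor. Inside launchTask, a TaskRunner is created (TaskRunner implements the Runnable interface), the TaskRunner is recorded in an in-memory cache of running tasks, and it is submitted to the thread pool through the pool's execute method, which starts the Task.
Source code analysis of how the Executor runs a Task: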
The onStart method of CoarseGrainedExecutorBackend: this method runs when the CoarseGrainedExecutorBackend endpoint is started, and it registers the Executor with the Driver.
  override def onStart() {
    logInfo("Connecting to driver: " + driverUrl)
    rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      driver = Some(ref)
      // Send the Executor registration request to the Driver
      ref.ask[RegisterExecutorResponse](
        RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
    }(ThreadUtils.sameThread).onComplete {
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      case Success(msg) => Utils.tryLogNonFatalError {
        Option(self).foreach(_.send(msg)) // msg must be RegisterExecutorResponse
      }
      case Failure(e) => {
        logError(s"Cannot register with driver: $driverUrl", e)
        System.exit(1)
      }
    }(ThreadUtils.sameThread)
  }
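The shape of onStart (look up the Driver asynchronously, chain the registration request with flatMap, then react to success or failure in onComplete) can be tried out with plain Scala Futures. The following is only a minimal sketch: DriverRef, lookupDriver, register and the response types are hypothetical stand-ins invented for illustration, not Spark's RPC API.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object RegistrationSketch {
  // Hypothetical stand-ins for the registration responses; not Spark's types.
  sealed trait RegisterResponse
  case class Registered(hostname: String) extends RegisterResponse
  case class RegisterFailed(message: String) extends RegisterResponse

  // Hypothetical driver reference; ask() replies asynchronously, like ref.ask.
  final class DriverRef(knownHosts: Set[String]) {
    def ask(executorId: String, host: String): Future[RegisterResponse] = Future {
      if (knownHosts.contains(host)) Registered(host)
      else RegisterFailed(s"unknown host $host")
    }
  }

  // Plays the role of asyncSetupEndpointRefByURI: the lookup itself may fail.
  def lookupDriver(url: String): Future[DriverRef] =
    if (url.startsWith("spark://")) Future.successful(new DriverRef(Set("worker-1")))
    else Future.failed(new IllegalArgumentException(s"bad driver url: $url"))

  // Chain lookup and registration, then describe what the backend would do next.
  def register(url: String, executorId: String, host: String): Future[String] = {
    val outcome = Promise[String]()
    lookupDriver(url).flatMap(ref => ref.ask(executorId, host)).onComplete {
      // On success the real backend sends the response to itself.
      case Success(msg) => outcome.success(s"forward to self: $msg")
      // On failure the real backend logs the error and exits.
      case Failure(e) => outcome.success(s"exit(1): ${e.getMessage}")
    }
    outcome.future
  }

  def main(args: Array[String]): Unit = {
    println(Await.result(register("spark://driver:7077", "exec-1", "worker-1"), 5.seconds))
    println(Await.result(register("http://oops", "exec-1", "worker-1"), 5.seconds))
  }
}
```

Note that a rejected registration still arrives on the Success branch as an ordinary response message; only a failure of the lookup or of the request itself reaches the Failure branch.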
The receive method of CoarseGrainedExecutorBackend: this method handles the various messages sent to the backend.
  override def receive: PartialFunction[Any, Unit] = {
    // The Driver reports that the Executor registered successfully, so create the Executor object.
    case RegisteredExecutor(hostname) =>
      logInfo("Successfully registered with driver")
      executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
    // The Driver reports that registration failed, so the process exits.
    case RegisterExecutorFailed(message) =>
      logError("Slave registration failed: " + message)
      System.exit(1)
    // LaunchTask from the Driver asks the Executor to start running a Task.
    case LaunchTask(data) =>
      if (executor == null) {
        logError("Received LaunchTask command but executor was null")
        System.exit(1)
      } else {
        // First deserialize the TaskDescription that was sent over
        val taskDesc = ser.deserialize[TaskDescription](data.value)
        logInfo("Got assigned task " + taskDesc.taskId)
        // Call the executor's launchTask method to start running the Task.
        // this: the ExecutorBackend, taskId: the Task's ID, attemptNumber: which attempt this is,
        // taskDesc.name: the Task's name, taskDesc.serializedTask: the serialized Task bytes
        executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
          taskDesc.name, taskDesc.serializedTask)
      }
    case KillTask(taskId, _, interruptThread) =>
      if (executor == null) {
        logError("Received KillTask command but executor was null")
        System.exit(1)
      } else {
        executor.killTask(taskId, interruptThread)
      }
    case StopExecutor =>
      logInfo("Driver commanded a shutdown")
      // Cannot shutdown here because an ack may need to be sent back to the caller. So send
      // a message to self to actually do the shutdown.
      self.send(Shutdown)
    case Shutdown =>
      executor.stop()
      stop()
      rpcEnv.shutdown()
  }
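The receive method is an ordinary Scala PartialFunction whose behaviour depends on whether the Executor has been created yet. A minimal sketch of that pattern follows; the message classes and the Backend class here are simplified, hypothetical versions written for illustration, and the sketch records an error instead of exiting the process.

```scala
import scala.collection.mutable.ArrayBuffer

object ReceiveSketch {
  // Hypothetical, simplified messages that mirror the ones handled above.
  sealed trait Message
  case class RegisteredExecutor(hostname: String) extends Message
  case class LaunchTask(taskId: Long) extends Message
  case class KillTask(taskId: Long) extends Message

  final class Backend {
    // Stays None until the driver confirms registration.
    private var executorHost: Option[String] = None
    // Records what the backend did, so the behaviour can be inspected.
    val log = ArrayBuffer.empty[String]

    val receive: PartialFunction[Any, Unit] = {
      case RegisteredExecutor(host) =>
        executorHost = Some(host)
        log += s"executor created on $host"
      case LaunchTask(id) =>
        executorHost match {
          case Some(_) => log += s"launch task $id"
          // The real backend logs an error and exits here.
          case None => log += s"error: task $id arrived before registration"
        }
      case KillTask(id) =>
        log += s"kill task $id"
    }
  }

  // Feed messages to a fresh backend and return what it recorded.
  def replay(inbox: Seq[Any]): List[String] = {
    val backend = new Backend
    // isDefinedAt lets the caller skip messages the handler does not cover.
    inbox.foreach(m => if (backend.receive.isDefinedAt(m)) backend.receive(m))
    backend.log.toList
  }

  def main(args: Array[String]): Unit =
    replay(Seq(LaunchTask(1L), RegisteredExecutor("worker-1"), LaunchTask(2L), "unknown"))
      .foreach(println)
}
```

The point of the null check in the real code is the same as the None branch here: a LaunchTask that arrives before RegisteredExecutor has nothing to run on.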
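The launchTask method of Executor: this method creates a TaskRunner for each Task, records the TaskRunner in an in-memory cache of running tasks, and then submits it to the thread pool, where a worker thread executes it.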
  def launchTask(
      context: ExecutorBackend,
      taskId: Long,
      attemptNumber: Int,
      taskName: String,
      serializedTask: ByteBuffer): Unit = {
    // Create one TaskRunner per Task; TaskRunner implements Java's Runnable interface
    val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
      serializedTask)
    // Record the TaskRunner in the in-memory cache of running tasks
    runningTasks.put(taskId, tr)
    // The Executor holds a Java thread pool. The Task is wrapped in a TaskRunner and
    // handed to the pool, which runs it on one of its worker threads
    threadPool.execute(tr)
  }
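The pattern in launchTask (wrap the work in a Runnable, register it in a concurrent map, submit it to a pool, and remove it from the map when it finishes) can be reproduced with java.util.concurrent alone. This is a rough sketch under stated assumptions: MiniExecutor, its TaskRunner and runAll are hypothetical names, and a fixed pool of two threads is used here only to keep the example small, which is not a statement about how Spark sizes its pool.

```scala
import java.util.concurrent.{ConcurrentHashMap, CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicLong

object LaunchTaskSketch {
  // Hypothetical miniature executor; the pool size is an assumption for the demo.
  final class MiniExecutor(threads: Int) {
    private val threadPool = Executors.newFixedThreadPool(threads)
    // Tasks that have been submitted and have not finished yet.
    val runningTasks = new ConcurrentHashMap[Long, Runnable]()
    val completed = new AtomicLong(0)

    // One runner per task; the body plays the role of the real task logic.
    final class TaskRunner(taskId: Long, body: () => Unit, done: CountDownLatch)
        extends Runnable {
      override def run(): Unit =
        try body()
        finally {
          // Always deregister, whether the body succeeded or threw.
          runningTasks.remove(taskId)
          completed.incrementAndGet()
          done.countDown()
        }
    }

    def launchTask(taskId: Long, done: CountDownLatch)(body: () => Unit): Unit = {
      val tr = new TaskRunner(taskId, body, done)
      runningTasks.put(taskId, tr) // register first
      threadPool.execute(tr)       // then hand over to a worker thread
    }

    def shutdown(): Unit = threadPool.shutdown()
  }

  // Launch n short tasks and report (completed count, tasks still registered).
  def runAll(n: Int): (Long, Int) = {
    val exec = new MiniExecutor(threads = 2)
    val done = new CountDownLatch(n)
    (1 to n).foreach(i => exec.launchTask(i.toLong, done)(() => Thread.sleep(10)))
    done.await(5, TimeUnit.SECONDS)
    exec.shutdown()
    (exec.completed.get(), exec.runningTasks.size())
  }

  def main(args: Array[String]): Unit = println(runAll(5))
}
```

Keeping the map lets other code look a task up by ID while it is running, which is what makes a later kill request for a specific task possible.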
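TaskRunner implements the Runnable interface, and all of the logic for executing a Task lives in its run method. Each time a Task arrives, a TaskRunner object is created for it and submitted to the thread pool, and one of the pool's threads executes it. Below is a walkthrough of the source of the run method.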
  override def run(): Unit = {
    // Create a memory manager for this Task
    val taskMemoryManager = new TaskMemoryManager(env.memoryManager, taskId)
    // Record when deserialization starts
    val deserializeStartTime = System.currentTimeMillis()
    Thread.currentThread.setContextClassLoader(replClassLoader)
    // Create a serializer instance, used to deserialize the Task data
    val ser = env.closureSerializer.newInstance()
    logInfo(s"Running $taskName (TID $taskId)")
    // Report the Task's current state to the Driver
    execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)
    var taskStart: Long = 0
    startGCTime = computeTotalGcTime()
    try {
      // Split the serialized Task data into its dependencies and the Task bytes
      val (taskFiles, taskJars, taskBytes) = Task.deserializeWithDependencies(serializedTask)
      // Fetch over the network the files, resources and jars the Task depends on
      updateDependencies(taskFiles, taskJars)
      // Deserialize the Task itself.
      // The class loader is needed because the class is loaded dynamically (via reflection)
      // in order to create the Task object
      task = ser.deserialize[Task[Any]](taskBytes, Thread.currentThread.getContextClassLoader)
      task.setTaskMemoryManager(taskMemoryManager)
      // If the Task was killed before it was deserialized, quit now; otherwise keep going
      if (killed) {
        throw new TaskKilledException
      }
      logDebug("Task " + taskId + "'s epoch is " + task.epoch)
      env.mapOutputTracker.updateEpoch(task.epoch)
      // Record when the Task starts running
      taskStart = System.currentTimeMillis()
      var threwException = true
      // value is the Task's result. For a ShuffleMapTask it is a MapStatus: the shuffle output
      // is written to shuffle files, and the MapStatus describes where that output lives and
      // how large it is. For a ResultTask it is the result of the function applied to the partition.
      // Further down it is serialized and sent back through the CoarseGrainedExecutorBackend
      val (value, accumUpdates) = try {
        // The core call that executes the Task; it is covered in the source further down
        val res = task.run(
          taskAttemptId = taskId,
          attemptNumber = attemptNumber,
          metricsSystem = env.metricsSystem)
        threwException = false
        res
      } finally {
        // Memory is released whether the Task succeeded or failed
        val freedMemory = taskMemoryManager.cleanUpAllAllocatedMemory()
        // If anything still had to be freed, the Task leaked memory: throw or log an error
        if (freedMemory > 0) {
          val errMsg = s"Managed memory leak detected; size = $freedMemory bytes, TID = $taskId"
          if (conf.getBoolean("spark.unsafe.exceptionOnMemoryLeak", false) && !threwException) {
            throw new SparkException(errMsg)
          } else {
            logError(errMsg)
          }
        }
      }
      // Record when the Task finishes
      val taskFinish = System.currentTimeMillis()
      // If the task has been killed, let's fail it.
      if (task.killed) {
        throw new TaskKilledException
      }
      // Create a serializer for the Task's result
      val resultSer = env.serializer.newInstance()
      // Record when result serialization starts
      val beforeSerialization = System.currentTimeMillis()
      // Serialize the result, because it will be sent back to the Driver
      val valueBytes = resultSer.serialize(value)
      // Record when result serialization ends
      val afterSerialization = System.currentTimeMillis()
      // Fill in the Task's runtime metrics; these are what the Spark UI displays
      for (m <- task.metrics) {
        m.setExecutorDeserializeTime(
          (taskStart - deserializeStartTime) + task.executorDeserializeTime)
        m.setExecutorRunTime((taskFinish - taskStart) - task.executorDeserializeTime)
        m.setJvmGCTime(computeTotalGcTime() - startGCTime)
        m.setResultSerializationTime(afterSerialization - beforeSerialization)
        m.updateAccumulators()
      }
      // A TaskResult that holds the Task's result together with the accumulator updates
      val directResult = new DirectTaskResult(valueBytes, accumUpdates, task.metrics.orNull)
      // Serialize the TaskResult
      val serializedDirectResult = ser.serialize(directResult)
      // Size of the serialized TaskResult
      val resultSize = serializedDirectResult.limit
      // directSend = sending directly back to the driver
      val serializedResult: ByteBuffer = {
        // If the serialized result exceeds maxResultSize (configurable, 1g by default),
        // drop it and send back only an IndirectTaskResult with the block id and size
        if (maxResultSize > 0 && resultSize > maxResultSize) {
          logWarning(s"Finished $taskName (TID $taskId). Result is larger than maxResultSize " +
            s"(${Utils.bytesToString(resultSize)} > ${Utils.bytesToString(maxResultSize)}), " +
            s"dropping it.")
          ser.serialize(new IndirectTaskResult[Any](TaskResultBlockId(taskId), resultSize))
        // If the result is within maxResultSize but too large for a single message frame,
        // it is not sent to the Driver directly; it is stored in the BlockManager and the
        // Driver fetches it from there
        } else if (resultSize >= akkaFrameSize - AkkaUtils.reservedSizeBytes) {
          val blockId = TaskResultBlockId(taskId)
          env.blockManager.putBytes(
            blockId, serializedDirectResult, StorageLevel.MEMORY_AND_DISK_SER)
          logInfo(
            s"Finished $taskName (TID $taskId). $resultSize bytes result sent via BlockManager)")
          ser.serialize(new IndirectTaskResult[Any](blockId, resultSize))
        // Otherwise the result is small enough to be returned to the Driver directly
        } else {
          logInfo(s"Finished $taskName (TID $taskId). $resultSize bytes result sent to driver")
          serializedDirectResult
        }
      }
      // Report the Task's result and final state. The call goes to the ExecutorBackend that was
      // passed to launchTask (the CoarseGrainedExecutorBackend), which forwards it to the Driver
      execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult)
    // The handlers below cover the different exceptions a Task can hit;
    // each kind of exception is reported differently
    } catch {
      case ffe: FetchFailedException =>
        val reason = ffe.toTaskEndReason
        execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
      case _: TaskKilledException | _: InterruptedException if task.killed =>
        logInfo(s"Executor killed $taskName (TID $taskId)")
        execBackend.statusUpdate(taskId, TaskState.KILLED, ser.serialize(TaskKilled))
      case cDE: CommitDeniedException =>
        val reason = cDE.toTaskEndReason
        execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
      case t: Throwable =>
        // Attempt to exit cleanly by informing the driver of our failure.
        // If anything goes wrong (or this was a fatal exception), we will delegate to
        // the default uncaught exception handler, which will terminate the Executor.
        logError(s"Exception in $taskName (TID $taskId)", t)
        val metrics: Option[TaskMetrics] = Option(task).flatMap { task =>
          task.metrics.map { m =>
            m.setExecutorRunTime(System.currentTimeMillis() - taskStart)
            m.setJvmGCTime(computeTotalGcTime() - startGCTime)
            m.updateAccumulators()
            m
          }
        }
        val serializedTaskEndReason = {
          try {
            ser.serialize(new ExceptionFailure(t, metrics))
          } catch {
            case _: NotSerializableException =>
              // t is not serializable so just send the stacktrace
              ser.serialize(new ExceptionFailure(t, metrics, false))
          }
        }
        execBackend.statusUpdate(taskId, TaskState.FAILED, serializedTaskEndReason)
        // Don't forcibly exit unless the exception was inherently fatal, to avoid
        // stopping other tasks unnecessarily.
        if (Utils.isFatalError(t)) {
          SparkUncaughtExceptionHandler.uncaughtException(t)
        }
    } finally {
      // Once the Task is done, remove it from the runningTasks map
      runningTasks.remove(taskId)
    }
  }
} // end of the TaskRunner class
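The three-way decision near the end of run (drop the result, send it through the BlockManager, or send it directly) depends only on three numbers, so it can be isolated as a pure function. The sketch below is an illustration, not Spark code: Route and route are hypothetical names, frameLimit stands for the value of akkaFrameSize minus the reserved bytes, and the sizes used in main are arbitrary example values.

```scala
object ResultRoutingSketch {
  // Hypothetical labels for the three branches in run().
  sealed trait Route
  case object Direct extends Route          // serialized result goes straight to the driver
  case object ViaBlockManager extends Route // stored locally, driver fetches by block id
  case object Dropped extends Route         // too large, only id and size are reported

  // Same comparisons as in run(): a maxResultSize of 0 disables the upper limit.
  def route(resultSize: Long, maxResultSize: Long, frameLimit: Long): Route =
    if (maxResultSize > 0 && resultSize > maxResultSize) Dropped
    else if (resultSize >= frameLimit) ViaBlockManager
    else Direct

  def main(args: Array[String]): Unit = {
    // Example values only, chosen to exercise each branch.
    val maxResultSize = 1024L * 1024 * 1024 // 1 GiB
    val frameLimit = 100L * 1024 * 1024     // 100 MiB
    println(route(4L * 1024, maxResultSize, frameLimit))
    println(route(500L * 1024 * 1024, maxResultSize, frameLimit))
    println(route(2L * 1024 * 1024 * 1024, maxResultSize, frameLimit))
  }
}
```

The order of the checks matters: the upper limit is tested first, so an oversized result is never written to the BlockManager.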
The updateDependencies method of Executor: this method fetches, over the network, the files, resources and jars that the Task depends on.
  private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]) {
    // Build the Hadoop configuration (lazily, only if something has to be fetched)
    lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
    // Synchronized block: the CoarseGrainedExecutorBackend process runs several threads,
    // one per Task, and they all share these maps and the class loader, so the update
    // is synchronized to avoid thread-safety problems
    synchronized {
      // Iterate over the files to fetch
      for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {
        logInfo("Fetching " + name + " with timestamp " + timestamp)
        // Fetch file with useCache mode, close cache for local mode.
        // Utils.fetchFile pulls the dependency file over the network
        Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
          env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
        currentFiles(name) = timestamp
      }
      // Iterate over the jars to fetch
      for ((name, timestamp) <- newJars) {
        val localName = name.split("/").last
        val currentTimeStamp = currentJars.get(name)
          .orElse(currentJars.get(localName))
          .getOrElse(-1L)
        // Fetch only if the local copy is missing or older than the requested timestamp
        if (currentTimeStamp < timestamp) {
          logInfo("Fetching " + name + " with timestamp " + timestamp)
          // Utils.fetchFile pulls the jar over the network
          Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
            env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
          currentJars(name) = timestamp
          // Add it to our class loader
          val url = new File(SparkFiles.getRootDirectory(), localName).toURI.toURL
          if (!urlClassLoader.getURLs().contains(url)) {
            logInfo("Adding " + url + " to class loader")
            urlClassLoader.addURL(url)
          }
        }
      }
    }
  }
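The core rule in updateDependencies is "fetch only when the requested timestamp is newer than the one already held", applied under a lock because several task threads may call it at once. A minimal sketch of that rule follows; DependencyCache and its members are hypothetical, and appending to a buffer stands in for the real network fetch.

```scala
import scala.collection.mutable

object DependencySketch {
  // Hypothetical cache of dependency name -> timestamp of the copy held locally.
  final class DependencyCache {
    private val current = mutable.HashMap.empty[String, Long]
    // Records every simulated fetch, in order.
    val fetched = mutable.ArrayBuffer.empty[String]

    // Synchronized because several task threads may request dependencies at once.
    def update(requested: Map[String, Long]): Unit = synchronized {
      // A missing entry counts as timestamp -1, so it is always fetched.
      for ((name, timestamp) <- requested if current.getOrElse(name, -1L) < timestamp) {
        fetched += s"$name@$timestamp" // stands in for the real download
        current(name) = timestamp
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val cache = new DependencyCache
    cache.update(Map("app.jar" -> 100L)) // first request: fetched
    cache.update(Map("app.jar" -> 100L)) // same timestamp: skipped
    cache.update(Map("app.jar" -> 200L)) // newer timestamp: fetched again
    cache.fetched.foreach(println)
  }
}
```

With this rule, the many tasks of one job that all list the same jar trigger a single download rather than one per task.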
The run method in Task: this is the last step of the preparation needed before the Task's own logic executes.
  final def run(
      taskAttemptId: Long,
      attemptNumber: Int,
      metricsSystem: MetricsSystem)
    : (T, AccumulatorUpdates) = {
    // Create the TaskContext, the execution context that holds the data the Task needs.
    // stageId: the Stage it belongs to, partitionId: the partition it processes,
    // attemptNumber: which attempt this is, taskMemoryManager: its memory manager,
    // metricsSystem: the metrics system, internalAccumulators: the internal accumulators
    context = new TaskContextImpl(
      stageId,
      partitionId,
      taskAttemptId,
      attemptNumber,
      taskMemoryManager,
      metricsSystem,
      internalAccumulators,
      runningLocally = false)
    TaskContext.setTaskContext(context)
    context.taskMetrics.setHostname(Utils.localHostName())
    context.taskMetrics.setAccumulatorsUpdater(context.collectInternalAccumulators)
    taskThread = Thread.currentThread()
    if (_killed) {
      kill(interruptThread = false)
    }
    try {
      // Call runTask. It is an abstract method, so the actual logic lives in the subclasses.
      // Task has two subclasses, ShuffleMapTask and ResultTask; to see what a Task really
      // does, read runTask in those two classes
      (runTask(context), context.collectAccumulators())
    } finally {
      context.markTaskCompleted()
      try {
        Utils.tryLogNonFatalError {
          // Release memory used by this thread for unrolling blocks
          SparkEnv.get.blockManager.memoryStore.releaseUnrollMemoryForThisTask()
          // Notify any tasks waiting for execution memory to be freed to wake up and try to
          // acquire memory again. This makes impossible the scenario where a task sleeps forever
          // because there are no other tasks left to notify it. Since this is safe to do but may
          // not be strictly necessary, we should revisit whether we can remove this in the future.
          val memoryManager = SparkEnv.get.memoryManager
          memoryManager.synchronized { memoryManager.notifyAll() }
        }
      } finally {
        TaskContext.unset()
      }
    }
  }
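Task.run follows the template-method pattern: a final run sets up a per-thread context, delegates to an abstract runTask, and always clears the context afterwards. The sketch below shows only that structure; Context, CurrentContext, MiniTask and the two subclasses are hypothetical names made up for illustration and are far simpler than Spark's TaskContext, ShuffleMapTask and ResultTask.

```scala
object TaskRunSketch {
  // Hypothetical, much reduced execution context.
  final case class Context(stageId: Int, partitionId: Int)

  // Thread-local holder, playing the role of TaskContext.setTaskContext / unset.
  object CurrentContext {
    private val holder = new ThreadLocal[Context]
    def get: Option[Context] = Option(holder.get())
    def set(c: Context): Unit = holder.set(c)
    def unset(): Unit = holder.remove()
  }

  abstract class MiniTask[T](stageId: Int, partitionId: Int) {
    // Fixed skeleton: set the context, run the subclass logic, always clean up.
    final def run(): T = {
      val context = Context(stageId, partitionId)
      CurrentContext.set(context)
      try runTask(context)
      finally CurrentContext.unset()
    }

    // The only part subclasses provide.
    protected def runTask(context: Context): T
  }

  // Two subclasses with different result types, loosely like the two Task kinds.
  final class SumTask(data: Seq[Int]) extends MiniTask[Int](1, 0) {
    protected def runTask(context: Context): Int = data.sum
  }

  final class DescribeTask extends MiniTask[String](0, 3) {
    protected def runTask(context: Context): String =
      s"stage ${context.stageId}, partition ${context.partitionId}, " +
        s"context visible: ${CurrentContext.get.isDefined}"
  }

  def main(args: Array[String]): Unit = {
    println(new SumTask(Seq(1, 2, 3)).run())
    println(new DescribeTask().run())
    println(CurrentContext.get) // None again once run() has returned
  }
}
```

Making run final keeps the setup and cleanup in one place, so a subclass cannot forget to clear the context even when its runTask throws.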