Spark Core (15): How an Executor Runs a Task, Principles and Source Code Analysis (Part 1)

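  1. Preparation before an Executor runs a Task
    1. Before looking at how an Executor runs a Task, we need to look at one important class: CoarseGrainedExecutorBackend.
    2. Its onStart method is called when the backend process is created.
    3. It is the coarse-grained implementation of ExecutorBackend.
    4. It sends the Executor registration request to the Driver.
    5. It is a communication endpoint that exchanges messages with the Driver in both directions.
    6. It is also the name of the process that hosts the Executor. The Executor is the object that actually processes Tasks, and it does so on a thread pool.
    7. It receives the registration response returned by the Driver and then creates the Executor.
    8. It receives the LaunchTask messages sent by the TaskScheduler and starts launching and computing the Task.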
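  2. How an Executor runs a Task
    1. When CoarseGrainedExecutorBackend receives the RegisteredExecutor message from the Driver, it creates the Executor.
    2. When it later receives a LaunchTask message from the Driver, it starts running the Task: it first deserializes the TaskDescription that was sent, then calls launchTask to hand the Task to the Executor.
    3. launchTask creates a TaskRunner, which implements the Runnable interface, records it in the in-memory cache of running tasks, and submits it to the thread pool; the pool's execute method then starts the Task.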
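
The message flow described above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyBackend` and `ToyExecutor` are hypothetical stand-ins, the messages are simplified (the real LaunchTask carries a serialized TaskDescription), and nothing here uses Spark's RPC layer.

```scala
object BackendSketch {
  // Simplified, hypothetical versions of the two messages discussed above.
  sealed trait Message
  case class RegisteredExecutor(hostname: String) extends Message
  case class LaunchTask(taskId: Long, payload: String) extends Message

  // Hypothetical stand-in for Executor: it only records which tasks were launched.
  class ToyExecutor(val hostname: String) {
    val launched = scala.collection.mutable.ArrayBuffer.empty[Long]
    def launchTask(taskId: Long, payload: String): Unit = launched += taskId
  }

  // Hypothetical stand-in for CoarseGrainedExecutorBackend.
  class ToyBackend {
    var executor: Option[ToyExecutor] = None

    val receive: PartialFunction[Message, Unit] = {
      // Registration succeeded: only now is the executor created.
      case RegisteredExecutor(hostname) =>
        executor = Some(new ToyExecutor(hostname))

      // A task arrives: hand it to the executor, which must already exist.
      case LaunchTask(taskId, payload) =>
        executor match {
          case Some(e) => e.launchTask(taskId, payload)
          case None =>
            throw new IllegalStateException("Received LaunchTask but executor was null")
        }
    }
  }
}
```

Sending `RegisteredExecutor("host-a")` and then `LaunchTask(7L, "x")` to `receive` leaves task 7 recorded in the toy executor; sending `LaunchTask` first fails, which mirrors the null check in the real backend.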
  3. Source code analysis of how an Executor runs a Task

    1. The onStart method of CoarseGrainedExecutorBackend: it runs when CoarseGrainedExecutorBackend is created and registers the Executor with the Driver.

      override def onStart() {
          logInfo("Connecting to driver: " + driverUrl)
          rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
            // This is a very fast action so we can use "ThreadUtils.sameThread"
            driver = Some(ref)
            // Send the Executor registration request to the Driver
            ref.ask[RegisterExecutorResponse](
              RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
          }(ThreadUtils.sameThread).onComplete {
            // This is a very fast action so we can use "ThreadUtils.sameThread"
            case Success(msg) => Utils.tryLogNonFatalError {
              Option(self).foreach(_.send(msg)) // msg must be RegisterExecutorResponse
            }
            case Failure(e) => {
              logError(s"Cannot register with driver: $driverUrl", e)
              System.exit(1)
            }
          }(ThreadUtils.sameThread)
      }
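
The shape of onStart (resolve the driver asynchronously, ask it to register, then forward the reply to the backend itself) can be sketched with plain Scala futures. This is a minimal sketch, not Spark's RPC API: `ToyDriverRef`, `setupDriverRef` and the `inbox` promise are hypothetical names introduced for illustration, and the real code logs and exits the process on failure instead of failing a promise.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object RegistrationSketch {
  // Hypothetical stand-ins for the registration responses.
  sealed trait RegisterResponse
  case class Registered(hostname: String) extends RegisterResponse
  case class RegisterFailed(reason: String) extends RegisterResponse

  // Hypothetical driver endpoint that answers a registration request asynchronously.
  class ToyDriverRef(knownHosts: Set[String]) {
    def ask(executorId: String, host: String): Future[RegisterResponse] = Future {
      if (knownHosts.contains(host)) Registered(host)
      else RegisterFailed(s"unknown host $host")
    }
  }

  // Plays the role of looking up the driver endpoint by URL.
  def setupDriverRef(knownHosts: Set[String]): Future[ToyDriverRef] =
    Future(new ToyDriverRef(knownHosts))

  // Resolve the driver, ask to register, and forward whatever comes back.
  def register(executorId: String, host: String, knownHosts: Set[String]): RegisterResponse = {
    val inbox = Promise[RegisterResponse]() // plays the role of self.send(msg)
    setupDriverRef(knownHosts)
      .flatMap(ref => ref.ask(executorId, host))
      .onComplete {
        case Success(msg) => inbox.success(msg)
        case Failure(e)   => inbox.failure(e)
      }
    // Blocking here is only for the demo; the real backend stays asynchronous.
    Await.result(inbox.future, 5.seconds)
  }
}
```

Note that a rejected registration still arrives as a successful reply (a message), while the failure branch covers the case where the request itself could not be completed.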
    2. The receive method of CoarseGrainedExecutorBackend: it handles the messages the backend receives.

      override def receive: PartialFunction[Any, Unit] = {
          // The Driver reports that registration succeeded, so the Executor object is created.
          case RegisteredExecutor(hostname) =>
            logInfo("Successfully registered with driver")
            executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
      
          // The Driver reports that registration failed, so the process exits.
          case RegisterExecutorFailed(message) =>
            logError("Slave registration failed: " + message)
            System.exit(1)
      
          // A LaunchTask message from the Driver asks the Executor to start running a Task.
          case LaunchTask(data) =>
            if (executor == null) {
              logError("Received LaunchTask command but executor was null")
              System.exit(1)
            } else {
              // First deserialize the TaskDescription that was sent.
              val taskDesc = ser.deserialize[TaskDescription](data.value)
              logInfo("Got assigned task " + taskDesc.taskId)
              // Call the executor's launchTask method to start running the Task.
              // this: the ExecutorBackend, taskId: the task's ID, attemptNumber: which attempt this is,
              // taskDesc.name: the task's name, taskDesc.serializedTask: the serialized Task bytes
              executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
                taskDesc.name, taskDesc.serializedTask)
            }
      
          case KillTask(taskId, _, interruptThread) =>
            if (executor == null) {
              logError("Received KillTask command but executor was null")
              System.exit(1)
            } else {
              executor.killTask(taskId, interruptThread)
            }
      
          case StopExecutor =>
            logInfo("Driver commanded a shutdown")
            // Cannot shutdown here because an ack may need to be sent back to the caller. So send
            // a message to self to actually do the shutdown.
            self.send(Shutdown)
      
          case Shutdown =>
            executor.stop()
            stop()
            rpcEnv.shutdown()
        }
    3. The launchTask method of Executor: it creates a TaskRunner for each Task, puts the TaskRunner in the in-memory cache, and then submits it to the thread pool to be run by a pool thread.

      def launchTask(
            context: ExecutorBackend,
            taskId: Long,
            attemptNumber: Int,
            taskName: String,
            serializedTask: ByteBuffer): Unit = {
          // Create one TaskRunner per Task; TaskRunner implements Java's Runnable interface
          val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
            serializedTask)
          // Put the TaskRunner in the in-memory cache of running tasks
          runningTasks.put(taskId, tr)
      
          // Executor holds a Java thread pool. The Task, wrapped in its TaskRunner, is submitted
          // to the pool and starts running as soon as a pool thread is available.
          threadPool.execute(tr)
      }
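
The pattern in launchTask (wrap the work in a Runnable, track it in a concurrent map, submit it to a pool, and remove it when it finishes) can be sketched on its own. This is a minimal sketch with hypothetical names (`ToyExecutor`, `ToyTaskRunner`, `completed`); the fixed-size pool is an assumption made for the demo and is not a statement about how Spark configures its pool.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicLong

object LaunchSketch {
  class ToyExecutor(poolSize: Int) {
    // Assumption for the demo: a fixed-size pool.
    private val threadPool = Executors.newFixedThreadPool(poolSize)
    // Tasks that were launched and have not finished yet, keyed by task ID.
    val runningTasks = new ConcurrentHashMap[Long, Runnable]()
    val completed = new AtomicLong(0)

    // One runner per task; the pool calls run() on one of its threads.
    class ToyTaskRunner(taskId: Long, body: () => Unit) extends Runnable {
      override def run(): Unit =
        try body()
        finally {
          // Always untrack the task, whether the body succeeded or threw.
          runningTasks.remove(taskId)
          completed.incrementAndGet()
        }
    }

    def launchTask(taskId: Long)(body: () => Unit): Unit = {
      val tr = new ToyTaskRunner(taskId, body)
      runningTasks.put(taskId, tr) // track first
      threadPool.execute(tr)       // then hand over to the pool
    }

    // Stop accepting work and wait for the submitted tasks to finish.
    def shutdownAndWait(): Boolean = {
      threadPool.shutdown()
      threadPool.awaitTermination(10, TimeUnit.SECONDS)
    }
  }
}
```

Tracking the runner before submitting it means a kill request that arrives right after launch can still find the task in the map.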
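    4. TaskRunner implements the Runnable interface, and the code that runs a Task lives in its run method. A TaskRunner is created for every incoming Task and is submitted to the thread pool, where a pool thread executes it. The source of run is analyzed below.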

      override def run(): Unit = {
      
            // Allocate a memory manager for the Task
            val taskMemoryManager = new TaskMemoryManager(env.memoryManager, taskId)
            // Record when deserialization starts
            val deserializeStartTime = System.currentTimeMillis()
            Thread.currentThread.setContextClassLoader(replClassLoader)
            // Create a serializer instance, used to deserialize the Task data
            val ser = env.closureSerializer.newInstance()
            logInfo(s"Running $taskName (TID $taskId)")
      
            // Report the Task's current state to the Driver
            execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)
            var taskStart: Long = 0
            startGCTime = computeTotalGcTime()
      
            try {
              // Deserialize the serialized Task data into its dependencies and the task bytes
              val (taskFiles, taskJars, taskBytes) = Task.deserializeWithDependencies(serializedTask)
              // Fetch the files, resources and jars the Task depends on over the network,
              // e.g. Hadoop configuration files
              updateDependencies(taskFiles, taskJars)
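              // Deserialize the Task itself.
              // The class loader is passed in because deserialization loads classes dynamically
              // via reflection and instantiates them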
              task = ser.deserialize[Task[Any]](taskBytes, Thread.currentThread.getContextClassLoader)
              task.setTaskMemoryManager(taskMemoryManager)
      
              // If the task was killed before it was deserialized, quit now; otherwise keep going
              if (killed) {
                throw new TaskKilledException
              }
      
              logDebug("Task " + taskId + "'s epoch is " + task.epoch)
              env.mapOutputTracker.updateEpoch(task.epoch)
      
              // Record the Task start time
              taskStart = System.currentTimeMillis()
              var threwException = true
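              // value: for a ShuffleMapTask this is a MapStatus. The shuffle output is persisted to
              // the corresponding shuffle files, and MapStatus wraps their location and the size of
              // the output. (For a ResultTask, value is the result of the user function.)
              // It is serialized below and handed to the Executor's CoarseGrainedExecutorBackend.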
              val (value, accumUpdates) = try {
                // The core call that runs the Task; its source is covered further below
                val res = task.run(
                  taskAttemptId = taskId,
                  attemptNumber = attemptNumber,
                  metricsSystem = env.metricsSystem)
                threwException = false
                res
              } finally {
                // Release the Task's memory whether it succeeded or failed
                val freedMemory = taskMemoryManager.cleanUpAllAllocatedMemory()
                // Check for a memory leak; depending on configuration, throw an exception or log it
                if (freedMemory > 0) {
                  val errMsg = s"Managed memory leak detected; size = $freedMemory bytes, TID = $taskId"
                  if (conf.getBoolean("spark.unsafe.exceptionOnMemoryLeak", false) && !threwException) {
                    throw new SparkException(errMsg)
                  } else {
                    logError(errMsg)
                  }
                }
              }
      
              // Record the Task finish time
              val taskFinish = System.currentTimeMillis()
      
              // If the task has been killed, let's fail it.
              if (task.killed) {
                throw new TaskKilledException
              }
      
              // Create a serializer instance for the Task's result
              val resultSer = env.serializer.newInstance()
              // Record when result serialization starts
              val beforeSerialization = System.currentTimeMillis()
              // Serialize the Task's result, because it will be returned to the Driver
              val valueBytes = resultSer.serialize(value)
              // Record when result serialization finishes
              val afterSerialization = System.currentTimeMillis()
      
              // Set the Task's runtime metrics, which are displayed in the Spark UI
              for (m <- task.metrics) {
      
                m.setExecutorDeserializeTime(
                  (taskStart - deserializeStartTime) + task.executorDeserializeTime)
                m.setExecutorRunTime((taskFinish - taskStart) - task.executorDeserializeTime)
                m.setJvmGCTime(computeTotalGcTime() - startGCTime)
                m.setResultSerializationTime(afterSerialization - beforeSerialization)
                m.updateAccumulators()
              }
              // A DirectTaskResult holding the Task result and the accumulator updates
              val directResult = new DirectTaskResult(valueBytes, accumUpdates, task.metrics.orNull)
              // Serialize the DirectTaskResult
              val serializedDirectResult = ser.serialize(directResult)
              // Size of the serialized DirectTaskResult
              val resultSize = serializedDirectResult.limit
      
              // directSend = sending directly back to the driver
              val serializedResult: ByteBuffer = {
                // If the serialized result exceeds the maximum result size (configurable,
                // 1 GB by default), drop it
      
                if (maxResultSize > 0 && resultSize > maxResultSize) {
                  logWarning(s"Finished $taskName (TID $taskId). Result is larger than maxResultSize " +
                    s"(${Utils.bytesToString(resultSize)} > ${Utils.bytesToString(maxResultSize)}), " +
                    s"dropping it.")
                  ser.serialize(new IndirectTaskResult[Any](TaskResultBlockId(taskId), resultSize))
                // If the result reaches the direct-send threshold but stays within the maximum (1 GB),
                // it is not sent to the Driver directly; the Driver fetches it through the BlockManager
                } else if (resultSize >= akkaFrameSize - AkkaUtils.reservedSizeBytes) {
                  val blockId = TaskResultBlockId(taskId)
                  env.blockManager.putBytes(
                    blockId, serializedDirectResult, StorageLevel.MEMORY_AND_DISK_SER)
                  logInfo(
                    s"Finished $taskName (TID $taskId). $resultSize bytes result sent via BlockManager)")
                  ser.serialize(new IndirectTaskResult[Any](blockId, resultSize))
                // Below the threshold: send it straight back to the Driver
                } else {
                  logInfo(s"Finished $taskName (TID $taskId). $resultSize bytes result sent to driver")
                  serializedDirectResult
                }
              }
              // Send the Task's result and final state to the Driver. The call goes through
              // execBackend, i.e. the CoarseGrainedExecutorBackend that hosts this Executor
              // and was passed in when the task was launched.
              execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult)
      
            // The cases below catch the different exceptions a task may raise
            // and handle each kind differently
            } catch {
              case ffe: FetchFailedException =>
                val reason = ffe.toTaskEndReason
                execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
      
              case _: TaskKilledException | _: InterruptedException if task.killed =>
                logInfo(s"Executor killed $taskName (TID $taskId)")
                execBackend.statusUpdate(taskId, TaskState.KILLED, ser.serialize(TaskKilled))
      
              case cDE: CommitDeniedException =>
                val reason = cDE.toTaskEndReason
                execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
      
              case t: Throwable =>
                // Attempt to exit cleanly by informing the driver of our failure.
                // If anything goes wrong (or this was a fatal exception), we will delegate to
                // the default uncaught exception handler, which will terminate the Executor.
                logError(s"Exception in $taskName (TID $taskId)", t)
      
                val metrics: Option[TaskMetrics] = Option(task).flatMap { task =>
                  task.metrics.map { m =>
                    m.setExecutorRunTime(System.currentTimeMillis() - taskStart)
                    m.setJvmGCTime(computeTotalGcTime() - startGCTime)
                    m.updateAccumulators()
                    m
                  }
                }
                val serializedTaskEndReason = {
                  try {
                    ser.serialize(new ExceptionFailure(t, metrics))
                  } catch {
                    case _: NotSerializableException =>
                      // t is not serializable so just send the stacktrace
                      ser.serialize(new ExceptionFailure(t, metrics, false))
                  }
                }
                execBackend.statusUpdate(taskId, TaskState.FAILED, serializedTaskEndReason)
      
                // Don't forcibly exit unless the exception was inherently fatal, to avoid
                // stopping other tasks unnecessarily.
                if (Utils.isFatalError(t)) {
                  SparkUncaughtExceptionHandler.uncaughtException(t)
                }
      
            } finally {
              // Once the Task is done, remove it from runningTasks
              runningTasks.remove(taskId)
            }
          }
        }
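
The three-way decision that run makes on the serialized result size can be isolated as a pure function. This is a minimal sketch: `route` and the `Route` values are hypothetical names, and `maxDirectSize` stands for the expression `akkaFrameSize - AkkaUtils.reservedSizeBytes` in the listing above. The numbers used in the usage note are illustrative only.

```scala
object ResultRoutingSketch {
  sealed trait Route
  case object Dropped extends Route         // too large: only a small indirect stub is sent
  case object ViaBlockManager extends Route // stored in the BlockManager, driver fetches it
  case object Direct extends Route          // small enough to travel in the status update

  // Same branch order as the listing: the maximum is checked first,
  // then the direct-send threshold.
  def route(resultSize: Long, maxResultSize: Long, maxDirectSize: Long): Route =
    if (maxResultSize > 0 && resultSize > maxResultSize) Dropped
    else if (resultSize >= maxDirectSize) ViaBlockManager
    else Direct
}
```

With an illustrative maximum of 1000 bytes and a direct-send threshold of 100 bytes, a 50-byte result goes `Direct`, a 100-byte result goes `ViaBlockManager`, and a 1001-byte result is `Dropped`. A maximum of 0 disables the first check, so even a very large result is routed through the BlockManager.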
    5. The updateDependencies method of Executor: it fetches, over the network, the files, resources and jars the Task depends on, e.g. Hadoop configuration files.

       private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]) {
      
          // Build the Hadoop configuration (lazily, only if something needs fetching)
          lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
      
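          // Synchronized block: the CoarseGrainedExecutorBackend process runs several threads,
          // one per Task, and they all access the same shared state, which would cause
          // thread-safety problems, so the updates are wrapped in a synchronized block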
          synchronized {
            // Iterate over the files to fetch
            for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {
              logInfo("Fetching " + name + " with timestamp " + timestamp)
              // Fetch file with useCache mode, close cache for local mode.
              // Utils.fetchFile fetches the dependency file over the network
              Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
                env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
      
              currentFiles(name) = timestamp
            }
            // Iterate over the jars to fetch
            for ((name, timestamp) <- newJars) {
      
              val localName = name.split("/").last
              val currentTimeStamp = currentJars.get(name)
                .orElse(currentJars.get(localName))
                .getOrElse(-1L)
              // Fetch only if the copy we already have is older than the requested timestamp
              if (currentTimeStamp < timestamp) {
                logInfo("Fetching " + name + " with timestamp " + timestamp)
                // Utils.fetchFile fetches the jar over the network
                Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
                  env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
                currentJars(name) = timestamp
                // Add it to our class loader
                val url = new File(SparkFiles.getRootDirectory(), localName).toURI.toURL
                if (!urlClassLoader.getURLs().contains(url)) {
                  logInfo("Adding " + url + " to class loader")
                  urlClassLoader.addURL(url)
                }
              }
            }
          }
        }
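
The timestamp check at the heart of updateDependencies can be sketched without any file fetching. This is a minimal sketch: `DependencyTracker` and `update` are hypothetical names, and instead of downloading anything the method just returns the names it would have fetched.

```scala
import scala.collection.mutable

object DependencySketch {
  class DependencyTracker {
    // Name -> timestamp of the version we already hold.
    private val current = mutable.HashMap.empty[String, Long]

    // Returns the names that need fetching (sorted for a stable result) and
    // records their new timestamps. Synchronized because several task threads
    // may call this concurrently on the same tracker.
    def update(incoming: Map[String, Long]): List[String] = synchronized {
      val toFetch = incoming.toList.filter { case (name, ts) =>
        current.getOrElse(name, -1L) < ts // unknown names count as timestamp -1
      }
      toFetch.foreach { case (name, ts) => current(name) = ts }
      toFetch.map(_._1).sorted
    }
  }
}
```

Calling `update` twice with the same map fetches everything the first time and nothing the second time; bumping one timestamp causes only that entry to be fetched again.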
    6. The run method of Task: the last piece of preparation before the Task's own logic runs.

       final def run(
          taskAttemptId: Long,
          attemptNumber: Int,
          metricsSystem: MetricsSystem)
        : (T, AccumulatorUpdates) = {
      
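          // Create the TaskContext, the execution context that wraps the data a Task needs.
          // stageId: the Stage it belongs to, partitionId: the partition it processes,
          // attemptNumber: which attempt this is, taskMemoryManager: its memory manager,
          // metricsSystem: the metrics system, internalAccumulators: the internal accumulators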
          context = new TaskContextImpl(
            stageId,
            partitionId,
            taskAttemptId,
            attemptNumber,
            taskMemoryManager,
            metricsSystem,
            internalAccumulators,
            runningLocally = false)
          TaskContext.setTaskContext(context)
          context.taskMetrics.setHostname(Utils.localHostName())
          context.taskMetrics.setAccumulatorsUpdater(context.collectInternalAccumulators)
          taskThread = Thread.currentThread()
          if (_killed) {
            kill(interruptThread = false)
          }
          try {
      
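            // Call runTask. It is an abstract method, so the actual logic lives in the subclasses.
            // Task has two subclasses, ShuffleMapTask and ResultTask; to see what a Task really
            // executes, read the runTask implementation in those two classes.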
            (runTask(context), context.collectAccumulators())
          } finally {
            context.markTaskCompleted()
            try {
              Utils.tryLogNonFatalError {
                // Release memory used by this thread for unrolling blocks
                SparkEnv.get.blockManager.memoryStore.releaseUnrollMemoryForThisTask()
                // Notify any tasks waiting for execution memory to be freed to wake up and try to
                // acquire memory again. This makes impossible the scenario where a task sleeps forever
                // because there are no other tasks left to notify it. Since this is safe to do but may
                // not be strictly necessary, we should revisit whether we can remove this in the future.
                val memoryManager = SparkEnv.get.memoryManager
                memoryManager.synchronized { memoryManager.notifyAll() }
              }
            } finally {
              TaskContext.unset()
            }
          }
        }
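
The structure of Task.run is a template method: a final run sets up a per-thread context, delegates to an abstract runTask, and always cleans up in finally. A minimal sketch of that structure follows; `ToyTask`, `ToyContext` and `SumTask` are hypothetical names, and the context is reduced to a partition ID and a completion flag.

```scala
object TaskTemplateSketch {
  // Much-reduced, hypothetical stand-in for TaskContext.
  class ToyContext(val partitionId: Int) {
    @volatile var completed = false
  }

  object ToyContext {
    // One context per thread, like TaskContext.setTaskContext / unset.
    private val holder = new ThreadLocal[ToyContext]
    def get: Option[ToyContext] = Option(holder.get)
    def set(c: ToyContext): Unit = holder.set(c)
    def unset(): Unit = holder.remove()
  }

  abstract class ToyTask[T](partitionId: Int) {
    // Final: subclasses cannot skip the setup and cleanup.
    final def run(): T = {
      val context = new ToyContext(partitionId)
      ToyContext.set(context)
      try runTask(context)
      finally {
        context.completed = true // like context.markTaskCompleted()
        ToyContext.unset()       // never leak the context to the next task on this thread
      }
    }

    // The only part a concrete task has to provide.
    protected def runTask(context: ToyContext): T
  }

  // A concrete task, playing the role of ShuffleMapTask or ResultTask.
  class SumTask(pid: Int, data: Seq[Int]) extends ToyTask[Int](pid) {
    protected def runTask(context: ToyContext): Int = data.sum
  }
}
```

Because pool threads are reused across tasks, clearing the thread-local in finally matters: without it, the next task on the same thread could observe a stale context.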