Spark Runtime Internals
1. Writing a WordCount example
[Example] WordCount code
val conf = new SparkConf() // create a SparkConf object
conf.setAppName("Wow,My First Spark App!") // set the application name
conf.setMaster("local") // run locally here; the analysis below, however, assumes standalone mode
val sc = new SparkContext(conf)
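The listing above stops at creating the SparkContext. For completeness, the rest of a minimal WordCount might look like the following sketch; the input path and the printing of results are illustrative assumptions, not part of the original listing.
// a minimal sketch of the remaining WordCount logic; "input.txt" is a hypothetical path
val lines = sc.textFile("input.txt")
val counts = lines.flatMap(_.split(" "))
.map(word => (word, 1))
.reduceByKey(_ + _)
counts.collect().foreach(println)
sc.stop()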
2. val sc = new SparkContext(conf) is where we step through the gateway into SparkContext!
new SparkContext creates a SparkContext instance. In class SparkContext(config: SparkConf) { ... }, every statement in the class body other than method definitions is executed when the instance is constructed, as the small illustration below shows. We skip over the earlier parts of the file (imports, auxiliary constructors, field and method definitions, and so on) and pick up at line 522 of SparkContext.scala.
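A tiny stand-alone illustration of this Scala behavior (not Spark code, just a made-up Demo class):
// statements in a Scala class body run when the instance is constructed
class Demo(config: String) {
println(s"constructing with $config") // runs as soon as new Demo(...) is evaluated
val upper = config.toUpperCase // field initializers run here as well
def show(): Unit = println(upper) // method bodies do NOT run until called
}
new Demo("hello") // prints: constructing with hello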
[Source] SparkContext.scala, lines 522-536
// Create and start the scheduler
val (sched, ts) = SparkContext.createTaskScheduler(this, master) // line 522
_schedulerBackend = sched
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)
// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
// constructor
_taskScheduler.start()
_applicationId = _taskScheduler.applicationId()
_applicationAttemptId = taskScheduler.applicationAttemptId()
_conf.set("spark.app.id", _applicationId)
_ui.foreach(_.setAppId(_applicationId))
_env.blockManager.initialize(_applicationId)
3. In SparkContext.scala, Ctrl+click on createTaskScheduler to jump to its definition at line 2592 of SparkContext.scala. Since we are analyzing standalone mode, we focus on the SPARK_REGEX(sparkUrl) case at line 2629.
[Source] SparkContext.scala, lines 2592-2634
/**
* Create a task scheduler based on a given master URL.
* Return a 2-tuple of the scheduler backend and the task scheduler.
*/
private def createTaskScheduler(
sc: SparkContext,
master: String): (SchedulerBackend, TaskScheduler) = {
import SparkMasterRegex._
......
// line 2629
case SPARK_REGEX(sparkUrl) =>
val scheduler = new TaskSchedulerImpl(sc)
val masterUrls = sparkUrl.split(",").map("spark://" + _)
val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
scheduler.initialize(backend)
(backend, scheduler)
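The SPARK_REGEX pattern used above lives in the SparkMasterRegex object; assuming it is essentially """spark://(.*)""".r, matching and expanding a comma-separated standalone master URL works as in this small sketch (the hosts are made up):
// a minimal sketch of how a standalone master URL is matched and expanded;
// the regex is assumed to mirror SparkMasterRegex.SPARK_REGEX
val SPARK_REGEX = """spark://(.*)""".r
"spark://host1:7077,host2:7077" match {
case SPARK_REGEX(sparkUrl) =>
val masterUrls = sparkUrl.split(",").map("spark://" + _)
println(masterUrls.mkString(", ")) // spark://host1:7077, spark://host2:7077
case _ =>
println("not a standalone master URL")
}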
4. When SparkContext is instantiated, createTaskScheduler creates a TaskSchedulerImpl (SparkContext.scala, line 2630) and a SparkDeploySchedulerBackend (SparkContext.scala, line 2632). The TaskSchedulerImpl instance is assigned to scheduler, the SparkDeploySchedulerBackend instance to backend, and the tuple (backend, scheduler) is returned to the caller shown in [Source] SparkContext.scala, lines 522-536, where the two values become _schedulerBackend and _taskScheduler. Since _taskScheduler is the TaskSchedulerImpl instance, calling _taskScheduler.start() invokes TaskSchedulerImpl's start() method.
[Source] SparkContext.scala, lines 522-536
// Create and start the scheduler
val (sched, ts) = SparkContext.createTaskScheduler(this, master) // line 522
_schedulerBackend = sched
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)
// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
// constructor
_taskScheduler.start()
5. _taskScheduler.start()
This invokes TaskSchedulerImpl's start() method, which calls backend.start(). backend was assigned in SparkContext.scala, lines 2629-2634; it is the SparkDeploySchedulerBackend instance, so backend.start() calls SparkDeploySchedulerBackend's start method.
val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
scheduler.initialize(backend)
[Source] TaskSchedulerImpl.scala, lines 143-154
override def start() {
backend.start()
if (!isLocal && conf.getBoolean("spark.speculation", false)) {
logInfo("Starting speculative execution thread")
speculationScheduler.scheduleAtFixedRate(new Runnable {
override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
checkSpeculatableTasks()
}
}, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
}
}
6. backend.start() means SparkDeploySchedulerBackend's start() runs. At lines 93-94 of SparkDeploySchedulerBackend.scala it defines
val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
as part of the appDesc description. The Command specifies that the entry class for the Executors launched on behalf of the current application is CoarseGrainedExecutorBackend. The appDesc is then passed as a parameter to AppClient: an AppClient object is created and its start method is called.
[Source] SparkDeploySchedulerBackend.scala, lines 52-98
override def start() {
super.start()
launcherBackend.connect()
// The endpoint for executors to talk to us
val driverUrl = rpcEnv.uriOf(SparkEnv.driverActorSystemName,
RpcAddress(sc.conf.get("spark.driver.host"), sc.conf.get("spark.driver.port").toInt),
CoarseGrainedSchedulerBackend.ENDPOINT_NAME)
val args = Seq(
"--driver-url", driverUrl,
"--executor-id", "{{EXECUTOR_ID}}",
"--hostname", "{{HOSTNAME}}",
"--cores", "{{CORES}}",
"--app-id", "{{APP_ID}}",
"--worker-url", "{{WORKER_URL}}")
val extraJavaOpts = sc.conf.getOption("spark.executor.extraJavaOptions")
.map(Utils.splitCommandString).getOrElse(Seq.empty)
val classPathEntries = sc.conf.getOption("spark.executor.extraClassPath")
.map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
val libraryPathEntries = sc.conf.getOption("spark.executor.extraLibraryPath")
.map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
// When testing, expose the parent class path to the child. This is processed by
// compute-classpath.{cmd,sh} and makes all needed jars available to child processes
// when the assembly is built with the "*-provided" profiles enabled.
val testingClassPath =
if (sys.props.contains("spark.testing")) {
sys.props("java.class.path").split(java.io.File.pathSeparator).toSeq
} else {
Nil
}
// Start executors with a few necessary configs for registering with the scheduler
val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
val javaOpts = sparkJavaOpts ++ extraJavaOpts
val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("")
val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory,
command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
client = new AppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
client.start()
launcherBackend.setState(SparkAppHandle.State.SUBMITTED)
waitForRegistration()
launcherBackend.setState(SparkAppHandle.State.RUNNING)
}
7. AppClient's start method creates the ClientEndpoint.
[Source] AppClient.scala, lines 281-284
def start() {
// Just launch an rpcEndpoint; it will call back into the listener.
endpoint.set(rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv)))
}
8. The ClientEndpoint's onStart method is then invoked.
[Source] AppClient.scala, lines 85-94
override def onStart(): Unit = {
try {
registerWithMaster(1)
} catch {
case e: Exception =>
logWarning("Failed to connectto master", e)
markDisconnected()
stop()
}
}
9. When ClientEndpoint starts, it calls registerWithMaster, which in turn calls tryRegisterAllMasters to register the current application with the Master.
[Source] AppClient.scala, lines 125-142
/**
* Register with all masters asynchronously. It will call `registerWithMaster` every
* REGISTRATION_TIMEOUT_SECONDS seconds until exceeding REGISTRATION_RETRIES times.
* Once we connect to a master successfully, all scheduling work and Futures will be cancelled.
*
* nthRetry means this is the nth attempt to register with master.
*/
private def registerWithMaster(nthRetry: Int) {
registerMasterFutures.set(tryRegisterAllMasters())
registrationRetryTimer.set(registrationRetryThread.scheduleAtFixedRate(new Runnable {
override def run(): Unit = {
Utils.tryOrExit {
if (registered.get) {
registerMasterFutures.get.foreach(_.cancel(true))
registerMasterThreadPool.shutdownNow()
} else if (nthRetry >= REGISTRATION_RETRIES) {
markDead("All masters are unresponsive! Giving up.")
} else {
registerMasterFutures.get.foreach(_.cancel(true))
registerWithMaster(nthRetry + 1)
}
}
}
}, REGISTRATION_TIMEOUT_SECONDS, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS))
}
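The retry mechanics above are plain ScheduledExecutorService scheduling: submit the registration attempts, periodically check whether registration succeeded, cancel everything on success, and give up after a fixed number of retries. A stripped-down illustration of the same pattern (not Spark's actual thread pools; the delay and retry limit are made up) is:
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

// a stripped-down sketch of the retry pattern used by registerWithMaster
val registered = new AtomicBoolean(false)
val retryThread = Executors.newSingleThreadScheduledExecutor()
val maxRetries = 3

def tryRegister(attempt: Int): Unit = {
println(s"attempt $attempt: sending RegisterApplication ...")
if (attempt >= 2) registered.set(true) // simulate: the second attempt succeeds
}

def registerWithRetry(attempt: Int): Unit = {
tryRegister(attempt)
retryThread.schedule(new Runnable {
override def run(): Unit = {
if (registered.get) retryThread.shutdownNow() // success: stop retrying
else if (attempt >= maxRetries) println("giving up") // markDead in the real code
else registerWithRetry(attempt + 1) // retry, like nthRetry + 1
}
}, 20, TimeUnit.SECONDS)
}

registerWithRetry(1)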
10. tryRegisterAllMasters sends a RegisterApplication(appDescription, self) message to register with the Master:
val masterRef =
rpcEnv.setupEndpointRef(Master.SYSTEM_NAME, masterAddress, Master.ENDPOINT_NAME)
masterRef.send(RegisterApplication(appDescription, self))
[Source] AppClient.scala, lines 99-116
/**
* Register with all masters asynchronously and returns an array `Future`s for cancellation.
*/
private def tryRegisterAllMasters(): Array[JFuture[_]] = {
for (masterAddress <- masterRpcAddresses) yield {
registerMasterThreadPool.submit(new Runnable {
override def run(): Unit = try {
if (registered.get) {
return
}
logInfo("Connecting tomaster " + masterAddress.toSparkURL + "...")
val masterRef =
rpcEnv.setupEndpointRef(Master.SYSTEM_NAME, masterAddress, Master.ENDPOINT_NAME)
masterRef.send(RegisterApplication(appDescription, self))
} catch {
case ie: InterruptedException => // Cancelled
case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
}
})
}
}
11. When the Master receives the RegisterApplication message, it checks whether the application can be run; if so, it generates an application ID for it and allocates compute resources through schedule(). How the resources are allocated is determined by the application's run mode and configuration such as memory and cores. schedule() performs the resource scheduling.
[Source] Master.scala, lines 244-257
case RegisterApplication(description, driver) => {
// TODO Prevent repeated registrationsfrom some driver
if (state == RecoveryState.STANDBY) {
// ignore, don't send response
} else {
logInfo("Registering app "+ description.name)
val app =createApplication(description, driver)
registerApplication(app)
logInfo("Registered app " +description.name + " with ID " + app.id)
persistenceEngine.addApplication(app)
driver.send(RegisteredApplication(app.id,self))
schedule()
}
}
12. The Master performs resource scheduling in schedule(): it first launches any waiting drivers on workers via launchDriver(worker, driver), then starts executors on the workers.
[Source] Master.scala, lines 701-708
/**
* Schedule the currently available resources among waiting apps. This method will be called
* every time a new app joins or resource availability changes.
*/
private def schedule(): Unit = {
if (state != RecoveryState.ALIVE) { return }
// Drivers take strict precedence over executors
val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers
for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
for (driver <- waitingDrivers) {
if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
launchDriver(worker, driver)
waitingDrivers -= driver
}
}
}
startExecutorsOnWorkers()
}
13. As part of schedule(), the Master then launches executors on the workers.
[Source] Master.scala, lines 655-676
/**
* Schedule and launch executors on workers
*/
private def startExecutorsOnWorkers(): Unit = {
// Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
// in the queue, then the second app, etc.
for (app <- waitingApps if app.coresLeft > 0) {
val coresPerExecutor: Option[Int] = app.desc.coresPerExecutor
// Filter out workers that don't have enough resources to launch an executor
val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
.filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
worker.coresFree >= coresPerExecutor.getOrElse(1))
.sortBy(_.coresFree).reverse
val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)
// Now that we've decided how many cores to allocate on each worker, let's allocate them
for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
allocateWorkerResourceToExecutors(
app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
}
}
}
14. Once the Master has decided how many cores to allocate on each worker, it allocates that worker's resources to executors and launches them.
[Source] Master.scala, lines 684-699
private def allocateWorkerResourceToExecutors(
app: ApplicationInfo,
assignedCores: Int,
coresPerExecutor: Option[Int],
worker: WorkerInfo): Unit = {
// If the number of cores per executor is specified, we divide the cores assigned
// to this worker evenly among the executors with no remainder.
// Otherwise, we launch a single executor that grabs all the assignedCores on this worker.
val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
for (i <- 1 to numExecutors) {
val exec = app.addExecutor(worker, coresToAssign)
launchExecutor(worker, exec)
app.state = ApplicationState.RUNNING
}
}
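As a quick worked example with made-up numbers: if assignedCores is 12 and spark.executor.cores is 4, this launches 12 / 4 = 3 executors with 4 cores each; if spark.executor.cores is not set, it launches one executor that grabs all 12 cores. A small sketch of just that arithmetic:
// a minimal sketch of the allocation arithmetic above; the numbers are illustrative
val assignedCores = 12
def plan(coresPerExecutor: Option[Int]): (Int, Int) = {
val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
(numExecutors, coresToAssign)
}
println(plan(Some(4))) // (3, 4)  -> three executors with 4 cores each
println(plan(None))    // (1, 12) -> one executor with all 12 cores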
15. The Master launches the executor: launchExecutor sends a LaunchExecutor message to the Worker and an ExecutorAdded message to the application's driver.
[Source] Master.scala, lines 720-727
private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): Unit = {
logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
worker.addExecutor(exec)
worker.endpoint.send(LaunchExecutor(masterUrl,
exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
exec.application.driver.send(
ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
}
16. The Worker receives the LaunchExecutor message and first creates an ExecutorRunner.
[Source] Worker.scala, lines 431-487
case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
if (masterUrl != activeMasterUrl) {
logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
} else {
try {
logInfo("Asked to launchexecutor %s/%d for %s".format(appId, execId, appDesc.name))
// Create the executor's workingdirectory
val executorDir = new File(workDir,appId + "/" + execId)
if (!executorDir.mkdirs()) {
throw newIOException("Failed to create directory " + executorDir)
}
// Create local dirs for theexecutor. These are passed to the executor via the
// SPARK_EXECUTOR_DIRS environmentvariable, and deleted by the Worker when the
// application finishes.
val appLocalDirs =appDirectories.get(appId).getOrElse {
Utils.getOrCreateLocalRootDirs(conf).map{ dir =>
val appDir =Utils.createDirectory(dir, namePrefix = "executor")
Utils.chmod700(appDir)
appDir.getAbsolutePath()
}.toSeq
}
appDirectories(appId) = appLocalDirs
val manager = new ExecutorRunner(
appId,
execId,
appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
cores_,
memory_,
self,
workerId,
host,
webUi.boundPort,
publicAddress,
sparkHome,
executorDir,
workerUri,
conf,
appLocalDirs, ExecutorState.RUNNING)
executors(appId + "/" + execId) = manager
manager.start()
coresUsed += cores_
memoryUsed += memory_
sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
} catch {
case e: Exception => {
logError(s"Failed to launchexecutor $appId/$execId for ${appDesc.name}.", e)
if (executors.contains(appId +"/" + execId)) {
executors(appId + "/"+ execId).kill()
executors -= appId +"/" + execId
}
sendToMaster(ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
Some(e.toString), None))
}
}
}
17. The Worker assigns the new ExecutorRunner to manager and then calls manager.start().
[Source] ExecutorRunner.scala, lines 67-80
private[worker] def start() {
workerThread = new Thread("ExecutorRunner for " + fullId) {
override def run() { fetchAndRunExecutor() }
}
workerThread.start()
......
}
18. ExecutorRunner's start method spawns a worker thread that calls fetchAndRunExecutor; fetchAndRunExecutor downloads the executor command described in the ApplicationDescription and runs the executor.
[Source] ExecutorRunner.scala, lines 132-186
/**
* Download and run the executor described in our ApplicationDescription
*/
private def fetchAndRunExecutor() {
try {
// Launch the process
val builder = CommandUtils.buildProcessBuilder(appDesc.command, new SecurityManager(conf),
memory, sparkHome.getAbsolutePath, substituteVariables)
val command = builder.command()
val formattedCommand = command.asScala.mkString("\"", "\" \"", "\"")
logInfo(s"Launch command: $formattedCommand")
builder.directory(executorDir)
builder.environment.put("SPARK_EXECUTOR_DIRS", appLocalDirs.mkString(File.pathSeparator))
// In case we are running this from within the Spark Shell, avoid creating a "scala"
// parent process for the executor command
builder.environment.put("SPARK_LAUNCH_WITH_SCALA", "0")
// Add webUI log urls
val baseUrl =
s"http://$publicAddress:$webUiPort/logPage/?appId=$appId&executorId=$execId&logType="
builder.environment.put("SPARK_LOG_URL_STDERR",s"${baseUrl}stderr")
builder.environment.put("SPARK_LOG_URL_STDOUT",s"${baseUrl}stdout")
process = builder.start()
val header = "Spark ExecutorCommand: %s\n%s\n\n".format(
formattedCommand, "=" *40)
// Redirect its stdout and stderr tofiles
val stdout = new File(executorDir,"stdout")
stdoutAppender =FileAppender(process.getInputStream, stdout, conf)
val stderr = new File(executorDir,"stderr")
Files.write(header, stderr, UTF_8)
stderrAppender = FileAppender(process.getErrorStream, stderr, conf)
// Wait for it to exit; executor may exit with code 0 (when driver instructs it to shutdown)
// or with nonzero exit code
val exitCode = process.waitFor()
state = ExecutorState.EXITED
val message = "Command exitedwith code " + exitCode
worker.send(ExecutorStateChanged(appId, execId, state, Some(message),Some(exitCode)))
} catch {
case interrupted: InterruptedException => {
logInfo("Runner thread for executor " + fullId + " interrupted")
state = ExecutorState.KILLED
killProcess(None)
}
case e: Exception => {
logError("Error runningexecutor", e)
state = ExecutorState.FAILED
killProcess(Some(e.toString))
}
}
}
ExecutorRunner internally uses a Thread to build a ProcessBuilder and launch a separate JVM process. The main class this new JVM loads on startup is the class named in the Command created earlier and handed down through the ApplicationDescription and ClientEndpoint, namely CoarseGrainedExecutorBackend. When the JVM started by the ProcessBuilder comes up, it loads CoarseGrainedExecutorBackend and invokes its main method, and main instantiates the CoarseGrainedExecutorBackend message loop (RPC endpoint) itself.
Supplementary note:
ExecutorRunner.scala, line 138
val builder = CommandUtils.buildProcessBuilder(appDesc.command, new SecurityManager(conf),
memory, sparkHome.getAbsolutePath, substituteVariables)
This directly calls the buildProcessBuilder method of the CommandUtils object, which assembles the launch command together with its Spark classpath information.
[Source] CommandUtils.scala, lines 35-58
/**
* Build a ProcessBuilder based on the given parameters.
* The `env` argument is exposed for testing.
*/
def buildProcessBuilder(
command: Command,
securityMgr: SecurityManager,
memory: Int,
sparkHome: String,
substituteArguments: String => String,
classPaths: Seq[String] = Seq[String](),
env: Map[String, String] = sys.env): ProcessBuilder = {
val localCommand = buildLocalCommand(
command, securityMgr, substituteArguments, classPaths, env)
val commandSeq = buildCommandSeq(localCommand, memory, sparkHome)
val builder = new ProcessBuilder(commandSeq: _*)
val environment = builder.environment()
for ((key, value) <-localCommand.environment) {
environment.put(key, value)
}
builder
}
At line 52 of CommandUtils.scala a new ProcessBuilder instance is created and assigned to builder. ProcessBuilder is a JDK class; its constructor takes the command as a varargs array of strings, and this.command is a string list containing the program and its arguments. The builder returned by buildProcessBuilder is assigned to builder at line 138 of ExecutorRunner.scala.
[Source] ProcessBuilder.java, lines 204-219
/**
* Constructs a process builder with the specified operating
* system program and arguments. This is a convenience
* constructor that sets the process builder's command to a string
* list containing the same strings as the {@code command}
* array, in the same order. It is not checked whether
* {@code command} corresponds to a valid operating system
* command.
*
* @param command a string array containing the program and its arguments
*/
public ProcessBuilder(String... command) {
this.command = new ArrayList<>(command.length);
for (String arg : command)
this.command.add(arg);
}
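To make the mechanism concrete, launching a child JVM through ProcessBuilder from Scala might look like the sketch below. The classpath, arguments, working directory and URLs are illustrative assumptions, not Spark's actual command line (the real command is assembled by CommandUtils as shown above):
import java.io.File

// a minimal sketch: start a separate JVM the way ExecutorRunner does via ProcessBuilder;
// every concrete value below is made up for illustration
val javaBin = new File(new File(System.getProperty("java.home"), "bin"), "java").getAbsolutePath
val cmd = Seq(javaBin, "-Xmx1g", "-cp", "/opt/spark/lib/*",
"org.apache.spark.executor.CoarseGrainedExecutorBackend",
"--driver-url", "spark://CoarseGrainedScheduler@driver-host:50000",
"--executor-id", "1", "--hostname", "worker-host", "--cores", "2", "--app-id", "app-1")
val builder = new ProcessBuilder(cmd: _*)
builder.directory(new File("/tmp/executor-workdir")) // analogous to executorDir
builder.environment().put("SPARK_LOG_URL_STDOUT", "http://worker-host:8081/logPage/") // like the env vars above
val process = builder.start() // forks a brand-new JVM process
val exitCode = process.waitFor() // fetchAndRunExecutor also redirects stdout/stderr before waiting
println(s"child JVM exited with code $exitCode")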
Starting the ProcessBuilder instance
[Source] ExecutorRunner.scala, line 156
process = builder.start()
ProcessBuilder's start() method is then called; under the hood it delegates to ProcessImpl.start, which forks a brand-new operating-system process (here, a new JVM), not merely a new thread inside the current JVM.
[Source] ProcessBuilder.java, lines 1004-1054
public Process start() throws IOException {
// Must convert to array first -- a malicious user-supplied
// list might try to circumvent the security check.
String[] cmdarray = command.toArray(new String[command.size()]);
cmdarray = cmdarray.clone();
for (String arg : cmdarray)
if (arg == null)
throw new NullPointerException();
// Throws IndexOutOfBoundsException if command is empty
String prog = cmdarray[0];
SecurityManager security = System.getSecurityManager();
if (security != null)
security.checkExec(prog);
String dir = directory == null ? null : directory.toString();
for (int i = 1; i < cmdarray.length; i++) {
if (cmdarray[i].indexOf('\u0000') >= 0) {
throw new IOException("invalid null character in command");
}
}
try {
return ProcessImpl.start(cmdarray,
environment,
dir,
redirects,
redirectErrorStream);
} catch (IOException | IllegalArgumentException e) {
String exceptionInfo = ": " + e.getMessage();
Throwable cause = e;
if ((e instanceof IOException) && security != null) {
// Can not disclose the fail reason for read-protected files.
try {
security.checkRead(prog);
} catch (SecurityException se) {
exceptionInfo = "";
cause = se;
}
}
// It's much easier for us to create a high-quality error
// message than the low-level C code which found the problem.
throw new IOException(
"Cannot run program \"" + prog + "\""
+ (dir == null ? "" : " (in directory \"" + dir + "\")")
+ exceptionInfo,
cause);
}
}
Let's walk through the flow of the command once more:
- SparkDeploySchedulerBackend.scala, line 87 defines the command:
val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
- The Command data structure is shown below. It defines mainClass, which here is org.apache.spark.executor.CoarseGrainedExecutorBackend, along with arguments, environment, classPathEntries, libraryPathEntries and javaOpts.
[Source] Command.scala, lines 22-29
private[spark] case class Command(
mainClass: String,
arguments: Seq[String],
environment: Map[String, String],
classPathEntries: Seq[String],
libraryPathEntries: Seq[String],
javaOpts: Seq[String]) {
}
- The Command is then passed all the way along the chain: SparkDeploySchedulerBackend -> AppClient -> ClientEndpoint -> tryRegisterAllMasters -> Master -> Worker -> ProcessBuilder -> CoarseGrainedExecutorBackend.
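For intuition, building a Command by hand (using the case class definition shown above) might look like the sketch below; every concrete value is illustrative rather than what Spark actually fills in:
// a minimal sketch of constructing a Command; relies on the case class shown above
val command = Command(
mainClass = "org.apache.spark.executor.CoarseGrainedExecutorBackend",
arguments = Seq("--driver-url", "spark://CoarseGrainedScheduler@driver-host:50000",
"--executor-id", "{{EXECUTOR_ID}}", "--cores", "{{CORES}}"),
environment = Map("SPARK_USER" -> "hadoop"),
classPathEntries = Seq("/opt/spark/conf"),
libraryPathEntries = Seq.empty,
javaOpts = Seq("-Xms1g", "-Xmx1g"))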
19. When CoarseGrainedExecutorBackend is instantiated, its onStart callback sends a RegisterExecutor message to the DriverEndpoint to register the current CoarseGrainedExecutorBackend.
[Source] CoarseGrainedExecutorBackend.scala, lines 55-72
override def onStart() {
logInfo("Connecting to driver: " + driverUrl)
rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
// This is a very fast action so we can use "ThreadUtils.sameThread"
driver = Some(ref)
ref.ask[RegisterExecutorResponse](
RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
}(ThreadUtils.sameThread).onComplete {
// This is a very fast action so we can use "ThreadUtils.sameThread"
case Success(msg) => Utils.tryLogNonFatalError {
Option(self).foreach(_.send(msg)) // msg must be RegisterExecutorResponse
}
case Failure(e) => {
logError(s"Cannot registerwith driver: $driverUrl", e)
System.exit(1)
}
}(ThreadUtils.sameThread)
}