Spark Runtime Internals: Tracing the Mechanism That Runs a Spark Application End to End

Spark Runtime Internals

1. A WordCount example

[Example] WordCount code

val conf = new SparkConf()                    // create the SparkConf object
conf.setAppName("Wow, My First Spark App!")   // set the application name
conf.setMaster("local")                       // run locally here; the analysis below assumes standalone mode

val sc = new SparkContext(conf)
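
The listing above only builds the SparkConf and the SparkContext; for completeness, here is a minimal sketch of the remaining word-count logic (the input path is a placeholder):

val counts = sc.textFile("README.md")          // placeholder input path
  .flatMap(_.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)
counts.collect().foreach(println)
sc.stop()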

 

2. val sc = new SparkContext(conf): this is where we step through the gateway into SparkContext!

new SparkContext creates a SparkContext instance. In class SparkContext(config: SparkConf) {}, every statement in the class body other than method definitions is executed during construction. The earlier statements (imports, auxiliary constructors, field definitions, methods, and so on) are skipped here; we pick up the trail at line 522 of SparkContext.scala.

[Source] SparkContext.scala, lines 522-536

// Create and start the scheduler
val (sched, ts) = SparkContext.createTaskScheduler(this, master)   // line 522

_schedulerBackend = sched
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)

// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
// constructor
_taskScheduler.start()

_applicationId = _taskScheduler.applicationId()
_applicationAttemptId = taskScheduler.applicationAttemptId()
_conf.set("spark.app.id", _applicationId)
_ui.foreach(_.setAppId(_applicationId))
_env.blockManager.initialize(_applicationId)

3. In SparkContext.scala, Ctrl+click createTaskScheduler to jump to its definition at line 2592. Since this walkthrough uses standalone mode, focus on the SPARK_REGEX(sparkUrl) case at line 2629; a short illustrative sketch follows the listing.

 

[Source] SparkContext.scala, lines 2592-2634 (excerpt; the standalone case is at lines 2629-2634)

 

/**
 * Create a task scheduler based on a given master URL.
 * Return a 2-tuple of the scheduler backend and the task scheduler.
 */
private def createTaskScheduler(
    sc: SparkContext,
    master: String): (SchedulerBackend, TaskScheduler) = {
  import SparkMasterRegex._

  // ...... (the surrounding master match { ... } and the other cases are elided)

  case SPARK_REGEX(sparkUrl) =>                                      // line 2629
    val scheduler = new TaskSchedulerImpl(sc)
    val masterUrls = sparkUrl.split(",").map("spark://" + _)
    val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
    scheduler.initialize(backend)
    (backend, scheduler)
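
To illustrate the standalone case above (this is only a sketch, not Spark's actual SparkMasterRegex object), a master URL such as spark://host1:7077,host2:7077 is matched and then expanded into one spark:// URL per master:

val SPARK_REGEX = """spark://(.*)""".r          // assumed shape of the real pattern
"spark://host1:7077,host2:7077" match {
  case SPARK_REGEX(sparkUrl) =>
    val masterUrls = sparkUrl.split(",").map("spark://" + _)
    // masterUrls: Array("spark://host1:7077", "spark://host2:7077")
  case _ => // not a standalone URL
}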

 

 

 

 

4. When SparkContext is instantiated, it calls createTaskScheduler, which creates a TaskSchedulerImpl (SparkContext.scala, line 2630) and a SparkDeploySchedulerBackend (line 2632). The TaskSchedulerImpl instance is assigned to scheduler and the SparkDeploySchedulerBackend instance to backend; the tuple (backend, scheduler) is returned to lines 522-536 of SparkContext.scala, where the two become _schedulerBackend and _taskScheduler. Since _taskScheduler is the TaskSchedulerImpl instance, calling _taskScheduler.start() runs TaskSchedulerImpl's start() method.

[Source] SparkContext.scala, lines 522-536 (repeated for reference)

// Create and start the scheduler
val (sched, ts) = SparkContext.createTaskScheduler(this, master)   // line 522

_schedulerBackend = sched
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)

// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
// constructor
_taskScheduler.start()

 

5. _taskScheduler.start()

This runs TaskSchedulerImpl's start() method, which calls backend.start(). backend is the value assigned in SparkContext.scala, lines 2629-2634: a SparkDeploySchedulerBackend instance. So backend.start() invokes SparkDeploySchedulerBackend's start method.

val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
scheduler.initialize(backend)

[Source] TaskSchedulerImpl.scala, lines 143-154

override def start() {
  backend.start()

  if (!isLocal && conf.getBoolean("spark.speculation", false)) {
    logInfo("Starting speculative execution thread")
    speculationScheduler.scheduleAtFixedRate(new Runnable {
      override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
        checkSpeculatableTasks()
      }
    }, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
  }
}
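
As an aside, the speculative-execution branch in start() above only fires when spark.speculation is enabled (it is off by default). A minimal sketch of turning it on; the standalone master URL is a placeholder:

val conf = new SparkConf()
  .setAppName("Wow, My First Spark App!")
  .setMaster("spark://master:7077")   // placeholder master URL
  .set("spark.speculation", "true")   // enable speculative re-launch of slow tasks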

 

6. backend.start() runs SparkDeploySchedulerBackend's start() method. Lines 93-94 of SparkDeploySchedulerBackend.scala define the command:

val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
  args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)

The Command that goes into the application description (appDesc) specifies that the entry class of the executor process launched for this application is CoarseGrainedExecutorBackend. appDesc is then passed as a parameter to AppClient; an AppClient object is created and its start method is called.

 

[Source] SparkDeploySchedulerBackend.scala, lines 52-98

 

override def start() {
  super.start()
  launcherBackend.connect()

  // The endpoint for executors to talk to us
  val driverUrl = rpcEnv.uriOf(SparkEnv.driverActorSystemName,
    RpcAddress(sc.conf.get("spark.driver.host"), sc.conf.get("spark.driver.port").toInt),
    CoarseGrainedSchedulerBackend.ENDPOINT_NAME)
  val args = Seq(
    "--driver-url", driverUrl,
    "--executor-id", "{{EXECUTOR_ID}}",
    "--hostname", "{{HOSTNAME}}",
    "--cores", "{{CORES}}",
    "--app-id", "{{APP_ID}}",
    "--worker-url", "{{WORKER_URL}}")
  val extraJavaOpts = sc.conf.getOption("spark.executor.extraJavaOptions")
    .map(Utils.splitCommandString).getOrElse(Seq.empty)
  val classPathEntries = sc.conf.getOption("spark.executor.extraClassPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
  val libraryPathEntries = sc.conf.getOption("spark.executor.extraLibraryPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)

  // When testing, expose the parent class path to the child. This is processed by
  // compute-classpath.{cmd,sh} and makes all needed jars available to child processes
  // when the assembly is built with the "*-provided" profiles enabled.
  val testingClassPath =
    if (sys.props.contains("spark.testing")) {
      sys.props("java.class.path").split(java.io.File.pathSeparator).toSeq
    } else {
      Nil
    }

  // Start executors with a few necessary configs for registering with the scheduler
  val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
  val javaOpts = sparkJavaOpts ++ extraJavaOpts
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)

  val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("")
  val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory,
    command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
  client = new AppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  client.start()

  launcherBackend.setState(SparkAppHandle.State.SUBMITTED)
  waitForRegistration()
  launcherBackend.setState(SparkAppHandle.State.RUNNING)
}
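
Note that the "{{EXECUTOR_ID}}", "{{CORES}}", etc. arguments above are template placeholders; the Worker fills in the concrete values just before launching the process (via ExecutorRunner's substituteVariables). A tiny sketch of that substitution idea, not the actual implementation, with placeholder values:

def substituteVariables(arg: String): String = arg match {
  case "{{EXECUTOR_ID}}" => "0"             // placeholder value
  case "{{HOSTNAME}}"    => "worker-host"   // placeholder value
  case "{{CORES}}"       => "2"             // placeholder value
  case other             => other           // anything else passes through untouched
}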

 

 

7. The start method of the AppClient object creates a ClientEndpoint.

[Source] AppClient.scala, lines 281-284

 

def start() {
  // Just launch an rpcEndpoint; it will call back into the listener.
  endpoint.set(rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv)))
}
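
For intuition: ClientEndpoint (like Master and Worker) is an RPC endpoint registered with the RpcEnv. setupEndpoint triggers its onStart callback, and subsequent messages are dispatched to its receive handler. A schematic of that life cycle (this is NOT Spark's actual private[spark] RpcEndpoint trait, just its shape, with made-up names):

trait SchematicEndpoint {
  def onStart(): Unit                       // called once, right after the endpoint is registered
  def receive: PartialFunction[Any, Unit]   // then each incoming one-way message is dispatched here
}

class SchematicClientEndpoint extends SchematicEndpoint {
  override def onStart(): Unit = println("register the application with the Master")
  override def receive: PartialFunction[Any, Unit] = {
    case "RegisteredApplication" => println("Master acknowledged the registration")
  }
}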

8. Registering the endpoint then triggers ClientEndpoint's onStart method.

[Source] AppClient.scala, lines 85-94

 

override def onStart(): Unit = {
  try {
    registerWithMaster(1)
  } catch {
    case e: Exception =>
      logWarning("Failed to connect to master", e)
      markDisconnected()
      stop()
  }
}

 

 

 

9. Once started, ClientEndpoint calls registerWithMaster, which in turn calls tryRegisterAllMasters to register the current application with the Master.

[Source] AppClient.scala, lines 125-142

 

/**
 * Register with all masters asynchronously. It will call `registerWithMaster` every
 * REGISTRATION_TIMEOUT_SECONDS seconds until exceeding REGISTRATION_RETRIES times.
 * Once we connect to a master successfully, all scheduling work and Futures will be cancelled.
 *
 * nthRetry means this is the nth attempt to register with master.
 */
private def registerWithMaster(nthRetry: Int) {
  registerMasterFutures.set(tryRegisterAllMasters())

  registrationRetryTimer.set(registrationRetryThread.scheduleAtFixedRate(new Runnable {
    override def run(): Unit = {
      Utils.tryOrExit {
        if (registered.get) {
          registerMasterFutures.get.foreach(_.cancel(true))
          registerMasterThreadPool.shutdownNow()
        } else if (nthRetry >= REGISTRATION_RETRIES) {
          markDead("All masters are unresponsive! Giving up.")
        } else {
          registerMasterFutures.get.foreach(_.cancel(true))
          registerWithMaster(nthRetry + 1)
        }
      }
    }
  }, REGISTRATION_TIMEOUT_SECONDS, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS))
}

 

10. tryRegisterAllMasters sends a RegisterApplication(appDescription, self) message to each Master to register the application:

val masterRef =
  rpcEnv.setupEndpointRef(Master.SYSTEM_NAME, masterAddress, Master.ENDPOINT_NAME)
masterRef.send(RegisterApplication(appDescription, self))

 

[Source] AppClient.scala, lines 99-116

 

/**
 * Register with all masters asynchronously and returns an array `Future`s for cancellation.
 */
private def tryRegisterAllMasters(): Array[JFuture[_]] = {
  for (masterAddress <- masterRpcAddresses) yield {
    registerMasterThreadPool.submit(new Runnable {
      override def run(): Unit = try {
        if (registered.get) {
          return
        }
        logInfo("Connecting to master " + masterAddress.toSparkURL + "...")
        val masterRef =
          rpcEnv.setupEndpointRef(Master.SYSTEM_NAME, masterAddress, Master.ENDPOINT_NAME)
        masterRef.send(RegisterApplication(appDescription, self))
      } catch {
        case ie: InterruptedException => // Cancelled
        case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
      }
    })
  }
}

 

 

12. When the Master receives the RegisterApplication message and is able to run the program, it registers the application, generates an application ID for it, and allocates compute resources through schedule(). How resources are allocated depends on the application's run mode and on configuration such as memory and cores.

 

[Source] Master.scala, lines 244-257

 

case RegisterApplication(description, driver) => {
  // TODO Prevent repeated registrations from some driver
  if (state == RecoveryState.STANDBY) {
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, driver)
    registerApplication(app)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    persistenceEngine.addApplication(app)
    driver.send(RegisteredApplication(app.id, self))
    schedule()
  }
}

 

 

13. The Master performs resource scheduling in schedule(): it first launches any waiting drivers on workers via launchDriver(worker, driver) (this applies to applications submitted in cluster deploy mode), and then starts executors on the workers.

[Source] Master.scala, lines 701-708

 

/**
 * Schedule the currently available resources among waiting apps. This method will be called
 * every time a new app joins or resource availability changes.
 */
private def schedule(): Unit = {
  if (state != RecoveryState.ALIVE) { return }
  // Drivers take strict precedence over executors
  val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
    for (driver <- waitingDrivers) {
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        waitingDrivers -= driver
      }
    }
  }
  startExecutorsOnWorkers()
}

 

14. Still inside schedule(), startExecutorsOnWorkers() launches executors on the workers. A short note on the spreadOutApps flag follows the listing below.

[Source] Master.scala, lines 655-676

/**
 * Schedule and launch executors on workers
 */
private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  for (app <- waitingApps if app.coresLeft > 0) {
    val coresPerExecutor: Option[Int] = app.desc.coresPerExecutor
    // Filter out workers that don't have enough resources to launch an executor
    val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
      .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
        worker.coresFree >= coresPerExecutor.getOrElse(1))
      .sortBy(_.coresFree).reverse
    val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)

    // Now that we've decided how many cores to allocate on each worker, let's allocate them
    for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
      allocateWorkerResourceToExecutors(
        app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
    }
  }
}
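
The spreadOutApps flag passed to scheduleExecutorsOnWorkers above comes from the master-side setting spark.deploy.spreadOut (true by default): when true, an application's executors are spread across as many workers as possible; when false, they are packed onto as few workers as possible. A minimal sketch of flipping it in the Master's configuration (illustration only):

val masterConf = new SparkConf().set("spark.deploy.spreadOut", "false")  // pack executors onto fewer workers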

 

15. Once the Master has decided how many cores to assign on each worker, it allocates those cores to executors on that worker. A worked example of the arithmetic follows the listing below.

[Source] Master.scala, lines 684-699

 

private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    assignedCores: Int,
    coresPerExecutor: Option[Int],
    worker: WorkerInfo): Unit = {
  // If the number of cores per executor is specified, we divide the cores assigned
  // to this worker evenly among the executors with no remainder.
  // Otherwise, we launch a single executor that grabs all the assignedCores on this worker.
  val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
  val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
  for (i <- 1 to numExecutors) {
    val exec = app.addExecutor(worker, coresToAssign)
    launchExecutor(worker, exec)
    app.state = ApplicationState.RUNNING
  }
}
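
A worked example of the two lines above (the numbers are illustrative): with 6 cores assigned on this worker and spark.executor.cores = 2, the worker gets 3 executors with 2 cores each; without spark.executor.cores, it gets a single executor holding all 6 cores.

val assignedCores = 6
val coresPerExecutor: Option[Int] = Some(2)                               // i.e. spark.executor.cores = 2
val numExecutors  = coresPerExecutor.map(assignedCores / _).getOrElse(1)  // 3 executors
val coresToAssign = coresPerExecutor.getOrElse(assignedCores)             // 2 cores each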

16. The Master tells the Worker to launch the executor.

 

[Source] Master.scala, lines 720-727

 

private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): Unit = {
  logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
  worker.addExecutor(exec)
  worker.endpoint.send(LaunchExecutor(masterUrl,
    exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
  exec.application.driver.send(
    ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
}

 

 

17. When the Worker receives the LaunchExecutor message, it first creates an ExecutorRunner.

 

【源代码】worker.scala文件:第431487

case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
  if (masterUrl != activeMasterUrl) {
    logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
  } else {
    try {
      logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))

      // Create the executor's working directory
      val executorDir = new File(workDir, appId + "/" + execId)
      if (!executorDir.mkdirs()) {
        throw new IOException("Failed to create directory " + executorDir)
      }

      // Create local dirs for the executor. These are passed to the executor via the
      // SPARK_EXECUTOR_DIRS environment variable, and deleted by the Worker when the
      // application finishes.
      val appLocalDirs = appDirectories.get(appId).getOrElse {
        Utils.getOrCreateLocalRootDirs(conf).map { dir =>
          val appDir = Utils.createDirectory(dir, namePrefix = "executor")
          Utils.chmod700(appDir)
          appDir.getAbsolutePath()
        }.toSeq
      }
      appDirectories(appId) = appLocalDirs
      val manager = new ExecutorRunner(
        appId,
        execId,
        appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
        cores_,
        memory_,
        self,
        workerId,
        host,
        webUi.boundPort,
        publicAddress,
        sparkHome,
        executorDir,
        workerUri,
        conf,
        appLocalDirs, ExecutorState.RUNNING)
      executors(appId + "/" + execId) = manager
      manager.start()
      coresUsed += cores_
      memoryUsed += memory_
      sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
    } catch {
      case e: Exception => {
        logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
        if (executors.contains(appId + "/" + execId)) {
          executors(appId + "/" + execId).kill()
          executors -= appId + "/" + execId
        }
        sendToMaster(ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
          Some(e.toString), None))
      }
    }
  }

 

 

 

18. The Worker assigns the new ExecutorRunner to manager and then calls manager.start().

【源代码】ExecutorRunner.scala文件:第6780

 

private[worker] def start() {
  workerThread = new Thread("ExecutorRunner for " + fullId) {
    override def run() { fetchAndRunExecutor() }
  }
  workerThread.start()
  // (shutdown-hook registration omitted)
}
 
 

 

 

19. ExecutorRunner's start method runs fetchAndRunExecutor on the worker thread; fetchAndRunExecutor downloads and runs the executor described in the ApplicationDescription.

【源代码】ExecutorRunner.scala文件:第132186

 

/**
 * Download and run the executor described in our ApplicationDescription
 */
private def fetchAndRunExecutor() {
  try {
    // Launch the process
    val builder = CommandUtils.buildProcessBuilder(appDesc.command, new SecurityManager(conf),
      memory, sparkHome.getAbsolutePath, substituteVariables)
    val command = builder.command()
    val formattedCommand = command.asScala.mkString("\"", "\" \"", "\"")
    logInfo(s"Launch command: $formattedCommand")

    builder.directory(executorDir)
    builder.environment.put("SPARK_EXECUTOR_DIRS", appLocalDirs.mkString(File.pathSeparator))
    // In case we are running this from within the Spark Shell, avoid creating a "scala"
    // parent process for the executor command
    builder.environment.put("SPARK_LAUNCH_WITH_SCALA", "0")

    // Add webUI log urls
    val baseUrl =
      s"http://$publicAddress:$webUiPort/logPage/?appId=$appId&executorId=$execId&logType="
    builder.environment.put("SPARK_LOG_URL_STDERR", s"${baseUrl}stderr")
    builder.environment.put("SPARK_LOG_URL_STDOUT", s"${baseUrl}stdout")

    process = builder.start()
    val header = "Spark Executor Command: %s\n%s\n\n".format(
      formattedCommand, "=" * 40)

    // Redirect its stdout and stderr to files
    val stdout = new File(executorDir, "stdout")
    stdoutAppender = FileAppender(process.getInputStream, stdout, conf)

    val stderr = new File(executorDir, "stderr")
    Files.write(header, stderr, UTF_8)
    stderrAppender = FileAppender(process.getErrorStream, stderr, conf)

    // Wait for it to exit; executor may exit with code 0 (when driver instructs it to shutdown)
    // or with nonzero exit code
    val exitCode = process.waitFor()
    state = ExecutorState.EXITED
    val message = "Command exited with code " + exitCode
    worker.send(ExecutorStateChanged(appId, execId, state, Some(message), Some(exitCode)))
  } catch {
    case interrupted: InterruptedException => {
      logInfo("Runner thread for executor " + fullId + " interrupted")
      state = ExecutorState.KILLED
      killProcess(None)
    }
    case e: Exception => {
      logError("Error running executor", e)
      state = ExecutorState.FAILED
      killProcess(Some(e.toString))
    }
  }
}

 

Inside ExecutorRunner, a thread builds a ProcessBuilder to launch a separate JVM process. The main class loaded by that JVM is the one named by the Command created back when the ClientEndpoint was set up, namely CoarseGrainedExecutorBackend. When the JVM launched via ProcessBuilder starts, it loads CoarseGrainedExecutorBackend and invokes its main method, which instantiates CoarseGrainedExecutorBackend itself as a message loop.
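
To make the mechanism concrete, here is a minimal, self-contained sketch (not Spark code; the class name and arguments are placeholders) of launching a separate JVM with ProcessBuilder, which is the same pattern fetchAndRunExecutor uses:

import scala.collection.JavaConverters._

object LaunchJvmSketch {
  def main(args: Array[String]): Unit = {
    // reuse the parent's classpath; in Spark the classpath and JVM options come from the Command
    val cmd = Seq("java", "-cp", sys.props("java.class.path"), "com.example.SomeMainClass")
    val builder = new ProcessBuilder(cmd.asJava)
    builder.redirectErrorStream(true)      // merge the child's stderr into its stdout
    val process = builder.start()          // fork the new JVM process
    val exitCode = process.waitFor()       // block until the child exits, as ExecutorRunner does
    println(s"child JVM exited with code $exitCode")
  }
}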

 

Supplementary note:

ExecutorRunner.scala, line 138:

 

val builder = CommandUtils.buildProcessBuilder(appDesc.command, new SecurityManager(conf),
  memory, sparkHome.getAbsolutePath, substituteVariables)

 

This directly calls buildProcessBuilder on the CommandUtils singleton object, which assembles the command together with the Spark classpath information.

【源代码】CommandUtils.scala文件:第3558

 

/**
 * Build a ProcessBuilder based on the given parameters.
 * The `env` argument is exposed for testing.
 */
def buildProcessBuilder(
    command: Command,
    securityMgr: SecurityManager,
    memory: Int,
    sparkHome: String,
    substituteArguments: String => String,
    classPaths: Seq[String] = Seq[String](),
    env: Map[String, String] = sys.env): ProcessBuilder = {
  val localCommand = buildLocalCommand(
    command, securityMgr, substituteArguments, classPaths, env)
  val commandSeq = buildCommandSeq(localCommand, memory, sparkHome)
  val builder = new ProcessBuilder(commandSeq: _*)
  val environment = builder.environment()
  for ((key, value) <- localCommand.environment) {
    environment.put(key, value)
  }
  builder
}

 

 

At line 52 of CommandUtils.scala, a new ProcessBuilder instance is created and assigned to builder. ProcessBuilder is written in Java; its constructor takes the command as varargs, and this.command is a list of strings containing the program and its arguments. The builder returned by buildProcessBuilder is the one assigned at line 138 of ExecutorRunner.scala.

 

[Source] ProcessBuilder.java, lines 204-219

 

/**
 * Constructs a process builder with the specified operating
 * system program and arguments. This is a convenience
 * constructor that sets the process builder's command to a string
 * list containing the same strings as the {@code command}
 * array, in the same order. It is not checked whether
 * {@code command} corresponds to a valid operating system
 * command.
 *
 * @param command a string array containing the program and its arguments
 */
public ProcessBuilder(String... command) {
    this.command = new ArrayList<>(command.length);
    for (String arg : command)
        this.command.add(arg);
}

 

 

 

Starting the ProcessBuilder instance:

[Source] ExecutorRunner.scala, line 156

process = builder.start()

 

ProcessBuilder's start() method is then invoked. Internally it delegates to ProcessImpl.start, which forks a new operating-system process (the new executor JVM), not merely a new thread inside the current JVM.

[Source] ProcessBuilder.java, lines 1004-1054

 

public Process start() throws IOException {
    // Must convert to array first -- a malicious user-supplied
    // list might try to circumvent the security check.
    String[] cmdarray = command.toArray(new String[command.size()]);
    cmdarray = cmdarray.clone();

    for (String arg : cmdarray)
        if (arg == null)
            throw new NullPointerException();
    // Throws IndexOutOfBoundsException if command is empty
    String prog = cmdarray[0];

    SecurityManager security = System.getSecurityManager();
    if (security != null)
        security.checkExec(prog);

    String dir = directory == null ? null : directory.toString();

    for (int i = 1; i < cmdarray.length; i++) {
        if (cmdarray[i].indexOf('\u0000') >= 0) {
            throw new IOException("invalid null character in command");
        }
    }

    try {
        return ProcessImpl.start(cmdarray,
                                 environment,
                                 dir,
                                 redirects,
                                 redirectErrorStream);
    } catch (IOException | IllegalArgumentException e) {
        String exceptionInfo = ": " + e.getMessage();
        Throwable cause = e;
        if ((e instanceof IOException) && security != null) {
            // Can not disclose the fail reason for read-protected files.
            try {
                security.checkRead(prog);
            } catch (SecurityException se) {
                exceptionInfo = "";
                cause = se;
            }
        }
        // It's much easier for us to create a high-quality error
        // message than the low-level C code which found the problem.
        throw new IOException(
            "Cannot run program \"" + prog + "\""
            + (dir == null ? "" : " (in directory \"" + dir + "\")")
            + exceptionInfo,
            cause);
    }
}

 

To recap the flow of the command from end to end:

- SparkDeploySchedulerBackend.scala, line 87, defines the command:

val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
  args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)

- Command's data structure is shown below. It defines mainClass, which here is org.apache.spark.executor.CoarseGrainedExecutorBackend, along with arguments, environment, classPathEntries, libraryPathEntries, and javaOpts (see the sketch after this list for an illustrative instance).

[Source] Command.scala, lines 22-29

 

private[spark] case class Command(
    mainClass: String,
    arguments: Seq[String],
    environment: Map[String, String],
    classPathEntries: Seq[String],
    libraryPathEntries: Seq[String],
    javaOpts: Seq[String]) {
}

- The Command is then handed along the whole chain: SparkDeploySchedulerBackend -> AppClient -> ClientEndpoint -> tryRegisterAllMasters -> Master -> Worker -> ProcessBuilder -> CoarseGrainedExecutorBackend.
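
For illustration only (all values are made up, and the {{...}} placeholders are substituted by the Worker later), an instance of Command assembled for the executor backend might look roughly like this:

val command = Command(
  mainClass = "org.apache.spark.executor.CoarseGrainedExecutorBackend",
  arguments = Seq(
    "--driver-url", "spark://CoarseGrainedScheduler@driver-host:50000",  // placeholder driver URL
    "--executor-id", "{{EXECUTOR_ID}}",
    "--hostname", "{{HOSTNAME}}",
    "--cores", "{{CORES}}",
    "--app-id", "{{APP_ID}}",
    "--worker-url", "{{WORKER_URL}}"),
  environment = Map.empty,
  classPathEntries = Seq.empty,
  libraryPathEntries = Seq.empty,
  javaOpts = Seq("-Xms1g", "-Xmx1g"))                                    // placeholder JVM options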

 

20. When CoarseGrainedExecutorBackend is instantiated, its onStart callback sends RegisterExecutor to the DriverEndpoint to register the current CoarseGrainedExecutorBackend.

 

 

[Source] CoarseGrainedExecutorBackend.scala, lines 55-72

 

override def onStart() {
  logInfo("Connecting to driver: " + driverUrl)
  rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    driver = Some(ref)
    ref.ask[RegisterExecutorResponse](
      RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
  }(ThreadUtils.sameThread).onComplete {
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    case Success(msg) => Utils.tryLogNonFatalError {
      Option(self).foreach(_.send(msg))  // msg must be RegisterExecutorResponse
    }
    case Failure(e) => {
      logError(s"Cannot register with driver: $driverUrl", e)
      System.exit(1)
    }
  }(ThreadUtils.sameThread)
}

 

                               

 
