5. Worker Source Code Analysis

Compared with the Master, there is relatively little Worker code to analyze: the whole Worker.scala file is under 600 lines. However, we also need to study the DriverRunner and ExecutorRunner code.


[Figure 1: Worker source code analysis diagram]


1. The Worker's logic is concentrated in the receiveWithLogging method, which handles messages sent by other components, such as the LaunchDriver message sent by the Master (covered in the Master source code analysis). The code is as follows:

    /**
     * Launch a Driver
     */
    case LaunchDriver(driverId, driverDesc) => {
      logInfo(s"Asked to launch driver $driverId")
      // Create a DriverRunner object, which internally creates a Java Thread
      val driver = new DriverRunner(
        conf,
        driverId,
        workDir,
        sparkHome,
        driverDesc.copy(command = Worker.maybeUpdateSSLSettings(driverDesc.command, conf)),
        self,
        akkaUrl)
      // Add the new DriverRunner to the drivers cache, a HashMap; this shows that
      // a worker node can run multiple Drivers if it has enough memory
      drivers(driverId) = driver
      // Start the driver
      driver.start()
      // Add the driver's cores to the worker's used-core count
      coresUsed += driverDesc.cores
      // Add the driver's memory to the worker's used-memory count
      memoryUsed += driverDesc.mem
    }
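The bookkeeping pattern in this handler can be sketched in plain Java: each launched driver is cached in a map keyed by its id, and the worker's used-core and used-memory counters grow by the driver's requested resources. The class and field names below are illustrative, not Spark's.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch (hypothetical names) of the Worker's bookkeeping pattern.
public class WorkerBookkeeping {
    static class DriverDesc {
        final int cores;
        final int mem; // MB
        DriverDesc(int cores, int mem) { this.cores = cores; this.mem = mem; }
    }

    final Map<String, DriverDesc> drivers = new HashMap<>();
    int coresUsed = 0;
    int memoryUsed = 0;

    // Mirrors the LaunchDriver handler: cache the driver, then account for its resources
    void launchDriver(String driverId, DriverDesc desc) {
        drivers.put(driverId, desc);   // drivers(driverId) = driver
        coresUsed += desc.cores;       // coresUsed += driverDesc.cores
        memoryUsed += desc.mem;        // memoryUsed += driverDesc.mem
    }

    public static void main(String[] args) {
        WorkerBookkeeping w = new WorkerBookkeeping();
        w.launchDriver("driver-1", new DriverDesc(2, 1024));
        w.launchDriver("driver-2", new DriverDesc(1, 512));
        System.out.println(w.drivers.size() + " " + w.coresUsed + " " + w.memoryUsed);
        // prints "2 3 1536"
    }
}
```

Because drivers is a plain map, nothing in the handler itself limits how many drivers one worker runs; only the Master's resource scheduling does.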
The code above creates a DriverRunner object, then calls driver.start(). DriverRunner's start method spawns a Java thread, and inside that thread a ProcessBuilder is created and used to launch the actual Driver process. The code is as follows:

    /** Starts a thread to run and manage the driver. */
    /**
     * Start the driver; internally this creates a Java thread
     */
    def start() = {
      new Thread("DriverRunner for " + driverId) {
        override def run() {
          try {
            // Create the Driver's working directory
            val driverDir = createWorkingDirectory()
            // Download the user's uploaded jar into the working directory
            val localJarFilename = downloadUserJar(driverDir)

            def substituteVariables(argument: String): String = argument match {
              case "{{WORKER_URL}}" => workerUrl
              case "{{USER_JAR}}" => localJarFilename
              case other => other
            }

            // TODO: If we add ability to submit multiple jars they should also be added here
            // Create a ProcessBuilder that wraps the command used to launch the Driver
            val builder = CommandUtils.buildProcessBuilder(driverDesc.command, driverDesc.mem,
              sparkHome.getAbsolutePath, substituteVariables)
            // Actually launch the Driver
            launchDriver(builder, driverDir, driverDesc.supervise)
          }
          catch {
            case e: Exception => finalException = Some(e)
          }

          // Determine the Driver's final state
          val state =
            if (killed) {
              DriverState.KILLED
            } else if (finalException.isDefined) {
              DriverState.ERROR
            } else {
              finalExitCode match {
                case Some(0) => DriverState.FINISHED
                case _ => DriverState.FAILED
              }
            }
          finalState = Some(state)

          // Finally, the worker is sent a DriverStateChanged message carrying the state
          worker ! DriverStateChanged(driverId, state, finalException)
        }
      }.start()
    }
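The substituteVariables helper above implements a simple placeholder-substitution pattern: any command-line argument that is a known placeholder is replaced by its concrete value, and everything else passes through unchanged. A runnable sketch in plain Java (names and values here are illustrative, not Spark's):

```java
import java.util.function.UnaryOperator;

// Sketch of the substituteVariables pattern: map known placeholders to values,
// pass every other argument through unchanged.
public class PlaceholderDemo {
    static UnaryOperator<String> substitutions(String workerUrl, String userJar) {
        return arg -> {
            switch (arg) {
                case "{{WORKER_URL}}": return workerUrl;
                case "{{USER_JAR}}":   return userJar;
                default:               return arg;  // pass through unchanged
            }
        };
    }

    public static void main(String[] args) {
        UnaryOperator<String> sub = substitutions("spark://worker:7078", "/work/app.jar");
        System.out.println(sub.apply("{{USER_JAR}}"));   // prints "/work/app.jar"
        System.out.println(sub.apply("--verbose"));      // prints "--verbose"
    }
}
```

This lets the Master describe the launch command abstractly ({{USER_JAR}}) while each worker fills in paths that only it knows at launch time.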

The createWorkingDirectory() method creates the Driver's working directory, which holds the user jar and the log output. The code is as follows:

    /**
     * Creates the working directory for this driver.
     * Will throw an exception if there are errors preparing the directory.
     *
     * Create the Driver's working directory and return it as a File object
     */
    private def createWorkingDirectory(): File = {
      val driverDir = new File(workDir, driverId)
      if (!driverDir.exists() && !driverDir.mkdirs()) {
        throw new IOException("Failed to create directory " + driverDir)
      }
      driverDir
    }
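The same pattern works verbatim in plain Java, since File.mkdirs() returns false when the directory could not be created, and that is only an error if the directory does not already exist. A self-contained sketch (the paths and driver id are illustrative):

```java
import java.io.File;
import java.io.IOException;

// Runnable sketch of the createWorkingDirectory pattern.
public class WorkingDirDemo {
    static File createWorkingDirectory(File workDir, String driverId) throws IOException {
        File driverDir = new File(workDir, driverId);
        // mkdirs() returning false is only fatal when the directory is truly absent
        if (!driverDir.exists() && !driverDir.mkdirs()) {
            throw new IOException("Failed to create directory " + driverDir);
        }
        return driverDir;
    }

    public static void main(String[] args) throws IOException {
        File workDir = new File(System.getProperty("java.io.tmpdir"), "spark-work-demo");
        File dir = createWorkingDirectory(workDir, "driver-20240101-0001");
        System.out.println(dir.isDirectory());  // prints "true"
    }
}
```

Checking exists() before treating mkdirs() == false as an error makes the call idempotent: a restarted worker can reuse a directory left over from an earlier run.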

The downloadUserJar() method then copies the user's application jar into that working directory and returns the local file name, which substituteVariables plugs in for the {{USER_JAR}} placeholder in the launch command.

The code that launches the Driver is as follows:

    /**
     * Launch the Driver
     */
    private def launchDriver(builder: ProcessBuilder, baseDir: File, supervise: Boolean) {
      // Set the builder's working directory
      builder.directory(baseDir)
      // Define how to initialize the process
      def initialize(process: Process) = {
        // Redirect stdout and stderr to files
        val stdout = new File(baseDir, "stdout")
        CommandUtils.redirectStream(process.getInputStream, stdout)
        val stderr = new File(baseDir, "stderr")
        val header = "Launch Command: %s\n%s\n\n".format(
          builder.command.mkString("\"", "\" \"", "\""), "=" * 40)
        Files.append(header, stderr, UTF_8)
        CommandUtils.redirectStream(process.getErrorStream, stderr)
      }
      // Run the command
      runCommandWithRetry(ProcessBuilderLike(builder), initialize, supervise)
    }
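What launchDriver does with the ProcessBuilder can be shown with a self-contained sketch: set a working directory and send the child process's stdout and stderr to files. The command ("echo") and paths below are illustrative, and the sketch uses ProcessBuilder's built-in file redirection where Spark's CommandUtils.redirectStream copies the streams in a background thread.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Sketch of launchDriver's ProcessBuilder usage: working directory plus
// stdout/stderr redirected to files in that directory.
public class LaunchDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        File baseDir = Files.createTempDirectory("driver-demo").toFile();

        ProcessBuilder builder = new ProcessBuilder("echo", "hello driver");
        builder.directory(baseDir);                           // builder.directory(baseDir)
        builder.redirectOutput(new File(baseDir, "stdout"));  // stdout -> file
        builder.redirectError(new File(baseDir, "stderr"));   // stderr -> file

        Process process = builder.start();
        int exitCode = process.waitFor();
        String out = new String(
            Files.readAllBytes(new File(baseDir, "stdout").toPath())).trim();
        System.out.println(exitCode + " " + out);  // prints "0 hello driver"
    }
}
```

Redirecting to files rather than inheriting the worker's console is what makes the stdout/stderr links on the web UI possible: the logs survive on disk under the driver's working directory.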
 

Next comes the runCommandWithRetry method, which starts the process, logs the final launch command, and, when supervise is enabled, retries failed launches with exponential back-off. The code is as follows:

    /**
     * Run the command
     */
    private[deploy] def runCommandWithRetry(command: ProcessBuilderLike, initialize: Process => Unit,
        supervise: Boolean) {
      // Time to wait between submission retries.
      var waitSeconds = 1
      // A run of this many seconds resets the exponential back-off.
      val successfulRunDuration = 5
      var keepTrying = !killed
      while (keepTrying) {
        logInfo("Launch Command: " + command.command.mkString("\"", "\" \"", "\""))
        synchronized {
          if (killed) { return }
          process = Some(command.start())
          initialize(process.get)
        }
        val processStart = clock.getTimeMillis()
        val exitCode = process.get.waitFor()
        if (clock.getTimeMillis() - processStart > successfulRunDuration * 1000) {
          waitSeconds = 1
        }
        if (supervise && exitCode != 0 && !killed) {
          logInfo(s"Command exited with status $exitCode, re-launching after $waitSeconds s.")
          sleeper.sleep(waitSeconds)
          waitSeconds = waitSeconds * 2 // exponential back-off
        }
        keepTrying = supervise && exitCode != 0 && !killed
        finalExitCode = Some(exitCode)
      }
    }
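The back-off logic above doubles the wait after each failed attempt and resets it to one second after any run that lasted longer than successfulRunDuration. A minimal sketch, with the process simulated by an array of (exitCode, runSeconds) pairs purely for illustration:

```java
// Sketch of the exponential back-off in runCommandWithRetry, with simulated runs.
public class BackoffDemo {
    public static void main(String[] args) {
        int[][] runs = { {1, 0}, {1, 0}, {1, 10}, {1, 0}, {0, 0} }; // {exitCode, runSeconds}
        int waitSeconds = 1;
        final int successfulRunDuration = 5;
        StringBuilder waits = new StringBuilder();

        for (int[] run : runs) {
            int exitCode = run[0];
            int runSeconds = run[1];
            if (runSeconds > successfulRunDuration) {
                waitSeconds = 1;                 // a long run resets the back-off
            }
            if (exitCode != 0) {
                waits.append(waitSeconds).append(" ");
                waitSeconds *= 2;                // exponential back-off
            }
        }
        // Waits before each retry: 1, 2, then reset by the 10s run, then 1, 2
        System.out.println(waits.toString().trim());  // prints "1 2 1 2"
    }
}
```

The reset on a long run is the key design choice: a driver that crashes after running healthily for a while is retried quickly, while one that crashes immediately backs off harder each time.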

2. LaunchExecutor: create an ExecutorRunner object and launch the Executor process
The Worker's LaunchExecutor handler is as follows:

    /**
     * Launch an ExecutorRunner
     */
    case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
      // Check that masterUrl is the active master
      if (masterUrl != activeMasterUrl) {
        logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
      } else {
        try {
          logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))
          // Create the Executor's working directory
          val executorDir = new File(workDir, appId + "/" + execId)
          if (!executorDir.mkdirs()) {
            throw new IOException("Failed to create directory " + executorDir)
          }
          // Create local dirs for the executor. These are passed to the executor via the
          // SPARK_LOCAL_DIRS environment variable, and deleted by the Worker when the
          // application finishes.
          val appLocalDirs = appDirectories.get(appId).getOrElse {
            Utils.getOrCreateLocalRootDirs(conf).map { dir =>
              Utils.createDirectory(dir).getAbsolutePath()
            }.toSeq
          }
          // Cache the Executor's directory info; the cache is a HashMap
          appDirectories(appId) = appLocalDirs
          // Create an ExecutorRunner object; like DriverRunner, it internally creates a Java thread
          val manager = new ExecutorRunner(
            appId,
            execId,
            appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
            cores_,
            memory_,
            self,
            workerId,
            host,
            webUi.boundPort,
            publicAddress,
            sparkHome,
            executorDir,
            akkaUrl,
            conf,
            appLocalDirs, ExecutorState.LOADING)
          // Add the Executor to the executors cache
          executors(appId + "/" + execId) = manager
          // Start the executor
          manager.start()
          // Add the executor's cores to the used-core count
          coresUsed += cores_
          // Add the executor's memory to the used-memory count
          memoryUsed += memory_
          // Send an ExecutorStateChanged message through the master's proxy object
          master ! ExecutorStateChanged(appId, execId, manager.state, None, None)
        } catch {
          case e: Exception => {
            logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
            if (executors.contains(appId + "/" + execId)) {
              executors(appId + "/" + execId).kill()
              executors -= appId + "/" + execId
            }
            master ! ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
              Some(e.toString), None)
          }
        }
      }
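Note the error-handling pattern above: the executor is registered under the composite key appId + "/" + execId, and if the launch fails, the catch block kills it and removes the cached entry so the map never holds a dead executor. A minimal sketch with illustrative class and method names:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the register-then-rollback pattern in the LaunchExecutor handler.
public class ExecutorRegistry {
    final Map<String, String> executors = new HashMap<>(); // fullId -> state

    void launch(String appId, int execId, boolean failLaunch) {
        String fullId = appId + "/" + execId;
        executors.put(fullId, "LOADING");  // executors(appId + "/" + execId) = manager
        try {
            if (failLaunch) {
                throw new RuntimeException("simulated launch failure");
            }
            executors.put(fullId, "RUNNING");
        } catch (RuntimeException e) {
            // mirror the catch block: kill the runner and drop the cached entry
            executors.remove(fullId);
        }
    }

    public static void main(String[] args) {
        ExecutorRegistry r = new ExecutorRegistry();
        r.launch("app-1", 0, false);
        r.launch("app-1", 1, true);
        System.out.println(r.executors.size() + " " + r.executors.get("app-1/0"));
        // prints "1 RUNNING"
    }
}
```

In the real handler the rollback is paired with an ExecutorStateChanged(FAILED) message, so the Master's view of the executor stays consistent with the worker's cache.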

The most important step above is creating the ExecutorRunner object and starting it via its start method. The source is as follows:

    /**
     * Start the executor
     */
    def start() {
      // Create a Java thread whose run method executes fetchAndRunExecutor
      workerThread = new Thread("ExecutorRunner for " + fullId) {
        override def run() { fetchAndRunExecutor() }
      }
      // Start the workerThread
      workerThread.start()
      // Shutdown hook that kills actors on shutdown.
      shutdownHook = new Thread() {
        override def run() {
          killProcess(Some("Worker shutting down"))
        }
      }
      Runtime.getRuntime.addShutdownHook(shutdownHook)
    }
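The shutdown hook registered here is a plain java.lang mechanism: a Thread passed to Runtime.addShutdownHook runs when the JVM exits, which the Worker uses to kill the executor process rather than orphan it. A self-contained sketch (the printed message stands in for killProcess):

```java
// Sketch of the shutdown-hook pattern in ExecutorRunner.start.
public class ShutdownHookDemo {
    public static void main(String[] args) {
        Thread shutdownHook = new Thread(() -> {
            // In the Worker this calls killProcess(Some("Worker shutting down"))
            System.out.println("Worker shutting down");
        });
        Runtime.getRuntime().addShutdownHook(shutdownHook);
        System.out.println("worker running");
        // On normal JVM exit the hook fires, printing "Worker shutting down"
    }
}
```

Without the hook, a worker killed with Ctrl-C would leave executor JVMs running with no parent to supervise them.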

Starting workerThread invokes fetchAndRunExecutor, which creates the Executor process. The source is as follows:

    /**
     * Download and run the executor described in our ApplicationDescription
     */
    def fetchAndRunExecutor() {
      try {
        // Launch the process
        // Create a ProcessBuilder that will launch the executor process
        val builder = CommandUtils.buildProcessBuilder(appDesc.command, memory,
          sparkHome.getAbsolutePath, substituteVariables)
        val command = builder.command()
        logInfo("Launch command: " + command.mkString("\"", "\" \"", "\""))
        // Set the executor's working directory
        builder.directory(executorDir)
        builder.environment.put("SPARK_LOCAL_DIRS", appLocalDirs.mkString(","))
        // In case we are running this from within the Spark Shell, avoid creating a "scala"
        // parent process for the executor command
        builder.environment.put("SPARK_LAUNCH_WITH_SCALA", "0")
        // Add webUI log urls
        val baseUrl =
          s"http://$publicAddress:$webUiPort/logPage/?appId=$appId&executorId=$execId&logType="
        builder.environment.put("SPARK_LOG_URL_STDERR", s"${baseUrl}stderr")
        builder.environment.put("SPARK_LOG_URL_STDOUT", s"${baseUrl}stdout")
        // Launch the process via the builder
        process = builder.start()
        val header = "Spark Executor Command: %s\n%s\n\n".format(
          command.mkString("\"", "\" \"", "\""), "=" * 40)
        // Redirect its stdout and stderr to files
        val stdout = new File(executorDir, "stdout")
        stdoutAppender = FileAppender(process.getInputStream, stdout, conf)
        val stderr = new File(executorDir, "stderr")
        Files.write(header, stderr, UTF_8)
        stderrAppender = FileAppender(process.getErrorStream, stderr, conf)
        // Wait for it to exit; executor may exit with code 0 (when driver instructs it to shutdown)
        // or with nonzero exit code
        val exitCode = process.waitFor()
        state = ExecutorState.EXITED
        val message = "Command exited with code " + exitCode
        // The worker sends an ExecutorStateChanged message
        worker ! ExecutorStateChanged(appId, execId, state, Some(message), Some(exitCode))
      } catch {
        case interrupted: InterruptedException => {
          logInfo("Runner thread for executor " + fullId + " interrupted")
          state = ExecutorState.KILLED
          killProcess(None)
        }
        case e: Exception => {
          logError("Error running executor", e)
          state = ExecutorState.FAILED
          killProcess(Some(e.toString))
        }
      }
    }

This method creates the ProcessBuilder and then launches the Executor process according to the information in the ApplicationDescription.
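The environment-variable handoff in fetchAndRunExecutor (SPARK_LOCAL_DIRS, the log URLs, and so on) works because the parent sets variables on the ProcessBuilder's environment map, and the child process reads them at startup. A sketch with an illustrative variable name and a child that simply echoes the variable back:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Sketch of the parent-to-child environment handoff used in fetchAndRunExecutor.
public class EnvHandoffDemo {
    public static void main(String[] args) throws Exception {
        ProcessBuilder builder = new ProcessBuilder(
            "sh", "-c", "echo \"$DEMO_LOCAL_DIRS\"");
        // parent sets the variable, analogous to builder.environment.put("SPARK_LOCAL_DIRS", ...)
        builder.environment().put("DEMO_LOCAL_DIRS", "/tmp/dir1,/tmp/dir2");

        Process process = builder.start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line = r.readLine();
            int exitCode = process.waitFor();  // block until the child exits
            System.out.println(exitCode + " " + line);  // prints "0 /tmp/dir1,/tmp/dir2"
        }
    }
}
```

Passing configuration through the environment keeps the executor's launch command generic while still letting each worker hand its own local directories and log URLs to the child JVM.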

That covers all of the Worker source code. Programs written in Scala really are compact!
