Spark: A Walkthrough of the spark-submit Job Submission Flow

  • spark-submit
  • spark-class
  • org.apache.spark.launcher.Main
  • SparkSubmitCommandBuilder class
    • Constructor
    • buildCommand
    • buildSparkSubmitCommand
  • OptionParser
  • SparkSubmit Object
    • runMain method
  • YarnClusterApplication
  • ClientArguments
  • Client
    • run
    • submitApplication
    • createContainerLaunchContext
    • setupLaunchEnv
    • populateClasspath



spark-submit is a shell script in the bin directory of the Spark home. When Spark is installed, SPARK_HOME (the path of the Spark installation) is typically configured as a system environment variable.

# If ${SPARK_HOME} is empty, source find-spark-home to locate the Spark installation
if [ -z "${SPARK_HOME}" ]; then
  source "$(dirname "$0")"/find-spark-home
fi

# disable randomized hash for string in Python 3.3+
export PYTHONHASHSEED=0

# Run ${SPARK_HOME}/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"
# "$@" expands to the full list of arguments passed to spark-submit
exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"

As shown above, control is handed over to the spark-class shell script.


spark-class is also a shell script.

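if [ -z "${SPARK_HOME}" ]; then
  source "$(dirname "$0")"/find-spark-home
fi

# Load the environment configuration.
# The goal is to find the environment file under conf, load the environment variables it defines, and so determine the cluster run mode.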
. "${SPARK_HOME}"/bin/

# Find the java binary
if [ -n "${JAVA_HOME}" ]; then
  RUNNER="${JAVA_HOME}/bin/java"
else
  if [ "$(command -v java)" ]; then
    RUNNER="java"
  else
    echo "JAVA_HOME is not set" >&2
    exit 1
  fi
fi

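# Find Spark jars.
if [ -d "${SPARK_HOME}/jars" ]; then
  SPARK_JARS_DIR="${SPARK_HOME}/jars"
else
  SPARK_JARS_DIR="${SPARK_HOME}/assembly/target/scala-$SPARK_SCALA_VERSION/jars"
fi

if [ ! -d "$SPARK_JARS_DIR" ] && [ -z "$SPARK_TESTING$SPARK_SQL_TESTING" ]; then
  echo "Failed to find Spark jars directory ($SPARK_JARS_DIR)." 1>&2
  echo "You need to build Spark with the target \"package\" before running this program." 1>&2
  exit 1
else
  # Set LAUNCH_CLASSPATH, the classpath used to run the launcher, to $SPARK_JARS_DIR/*
  LAUNCH_CLASSPATH="$SPARK_JARS_DIR/*"
fi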

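# Add the launcher build dir to the classpath if requested.
if [ -n "$SPARK_PREPEND_CLASSES" ]; then
  LAUNCH_CLASSPATH="${SPARK_HOME}/launcher/target/scala-$SPARK_SCALA_VERSION/classes:$LAUNCH_CLASSPATH"
fi

# For tests
if [[ -n "$SPARK_TESTING" ]]; then
  unset YARN_CONF_DIR
  unset HADOOP_CONF_DIR
fi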

# The launcher library will print arguments separated by a NULL character, to allow arguments with
# characters that would be otherwise interpreted by the shell. Read that in a while loop, populating
# an array that will be used to exec the final command.
# The exit code of the launcher is appended to the output, so the parent shell removes it from the
# command array and checks the value to see if the launcher succeeded.
# Assemble the launcher command: java -Xmx128m -cp $LAUNCH_CLASSPATH org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --master xx --deploy-mode cluster ...
build_command() {
  "$RUNNER" -Xmx128m -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"
  # Append the launcher's exit code (0 on success), followed by a NUL character
  printf "%d\0" $?
}

# Turn off posix mode since it does not allow process substitution
# (process substitution is what captures the Java launcher's output below)
set +o posix
CMD=()
while IFS= read -d '' -r ARG; do
  CMD+=("$ARG")
done < <(build_command "$@")

# CMD now holds the command printed by the launcher, with the launcher's exit code as its last element
# (the echo below is a debug line added by the author)
echo "yyb cmd ${CMD[*]}"
COUNT=${#CMD[@]}
LAST=$((COUNT - 1))
# On success this value is 0
LAUNCHER_EXIT_CODE=${CMD[$LAST]}

# Certain JVM failures result in errors being printed to stdout (instead of stderr), which causes
# the code that parses the output of the launcher to get confused. In those cases, check if the
# exit code is an integer, and if it's not, handle it as a special error case.
if ! [[ $LAUNCHER_EXIT_CODE =~ ^[0-9]+$ ]]; then
  echo "${CMD[@]}" | head -n-1 1>&2
  exit 1
fi

if [ $LAUNCHER_EXIT_CODE != 0 ]; then
  exit $LAUNCHER_EXIT_CODE
fi

# Remove the exit code that was appended above
CMD=("${CMD[@]:0:$LAST}")
# Note that two commands are involved in total. The first is the launcher command:
#   java -Xmx128m -cp $LAUNCH_CLASSPATH org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --master xx --deploy-mode cluster ... --jars xxx.jar,yyy.jar user-jar.jar
# Its output is the second one, the fully assembled command held in ${CMD[@]}, for example:
#   {JAVA_HOME}/bin/java [java_opt] -cp xx/xxx;yy/yy/*...  org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode cluster --conf xxx=xxx --jars jar1,jar2 --class xxxx ... other spark-submit options, the user's jar, and the user's application arguments
# exec then runs that assembled command
exec "${CMD[@]}"
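
The handshake between the launcher and the shell can be sketched in plain Java. This is only an illustration under stated assumptions: the class NulProtocolSketch and its encode/decode methods are made up for this post and are not part of Spark. encode plays the role of the launcher plus the printf in build_command, and decode plays the role of the while/read loop and the exit-code checks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark: mimics the launcher/shell handshake.
final class NulProtocolSketch {

  // Launcher side: each argument is followed by a NUL, then the exit code and a final NUL.
  static String encode(List<String> cmd, int exitCode) {
    StringBuilder out = new StringBuilder();
    for (String arg : cmd) {
      out.append(arg).append('\0');
    }
    out.append(exitCode).append('\0');
    return out.toString();
  }

  // Shell side: split on NUL; the last token is the exit code, the rest is the command.
  static List<String> decode(String stream) {
    List<String> tokens = new ArrayList<>();
    int start = 0;
    for (int i = 0; i < stream.length(); i++) {
      if (stream.charAt(i) == '\0') {
        tokens.add(stream.substring(start, i));
        start = i + 1;
      }
    }
    if (tokens.isEmpty()) {
      throw new IllegalStateException("launcher produced no output");
    }
    String last = tokens.remove(tokens.size() - 1);
    if (!last.matches("[0-9]+")) {
      // Corresponds to the "exit code is not an integer" special case in the script.
      throw new IllegalStateException("launcher output is not a valid command");
    }
    int exitCode = Integer.parseInt(last);
    if (exitCode != 0) {
      throw new IllegalStateException("launcher failed with exit code " + exitCode);
    }
    return tokens;
  }

  public static void main(String[] args) {
    // The path with a space is a made-up value; it survives because NUL is the only separator.
    List<String> cmd = Arrays.asList(
        "java", "-cp", "/opt/my jars/*", "org.apache.spark.deploy.SparkSubmit");
    System.out.println(decode(encode(cmd, 0)));
  }
}
```

Using NUL as the separator is what lets arguments containing spaces or quotes pass through the shell unchanged, since a NUL cannot appear inside an argument.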

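The environment-loading script that spark-class sources from ${SPARK_HOME}/bin has one job: find the environment file under conf and load the environment variables defined in it, which determines the cluster run mode.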

if [ -z "${SPARK_HOME}" ]; then
  source "$(dirname "$0")"/find-spark-home
fi

if [ -z "$SPARK_ENV_LOADED" ]; then
  # If the environment file under SPARK_CONF_DIR exists, the if below is entered; the file usually exists, so this branch is usually taken
  if [ -f "${SPARK_CONF_DIR}/" ]; then
    # Promote all variable declarations to environment (exported) variables
    # Load the environment variables set in the file under SPARK_CONF_DIR; they mainly configure which kind of cluster Spark runs on
    set -a
    . "${SPARK_CONF_DIR}/"
    set +a
  fi
fi

# Setting SPARK_SCALA_VERSION if not already set.
if [ -z "$SPARK_SCALA_VERSION" ]; then


  if [[ -d "$ASSEMBLY_DIR2" && -d "$ASSEMBLY_DIR1" ]]; then
    echo -e "Presence of build for multiple Scala versions detected." 1>&2
    echo -e 'Either clean one of them or, export SPARK_SCALA_VERSION in' 1>&2
    exit 1
  fi

  if [ -d "$ASSEMBLY_DIR2" ]; then
    export SPARK_SCALA_VERSION="2.11"
  else
    export SPARK_SCALA_VERSION="2.12"
  fi
fi


All arguments of a job submitted with spark-submit end up being passed to org.apache.spark.launcher.Main. An example command:

 java -Xmx128m -cp $LAUNCH_CLASSPATH org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --master xx --deploy-mode cluster ... --jars xxx.jar,yyy.jar user-jar.jar

// Final result: {JAVA_HOME}/bin/java [java_opt] -cp xx/xxx;yy/yy/*… org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode cluster --conf xxx=xxx --jars jar1,jar2 --class xxxx … other spark-submit options, the user's jar, and the user's application arguments


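The org.apache.spark.launcher.Main class lives in the launcher directory of the Spark source tree. Its main job is to parse the arguments that follow spark-submit, extract them, and validate them.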

class Main {
  public static void main(String[] argsArray) throws Exception {
    // Check that at least one argument was passed; otherwise throw an exception with a hint
    checkArgument(argsArray.length > 0, "Not enough arguments: missing class name.");
    // Convert the argument array to an ArrayList
    List<String> args = new ArrayList<>(Arrays.asList(argsArray));
    // Take the fully qualified name of the class to run: org.apache.spark.deploy.SparkSubmit
    String className = args.remove(0);
    // Whether to print the launch command
    boolean printLaunchCommand = !isEmpty(System.getenv("SPARK_PRINT_LAUNCH_COMMAND"));
    AbstractCommandBuilder builder;
    // spark-submit takes this if branch
    if (className.equals("org.apache.spark.deploy.SparkSubmit")) {
      try {
        // Here args is: --master xx --deploy-mode cluster ... --jars xxx.jar,yyy.jar user-jar.jar
        // Once this step completes, the builder's fields hold the values given in args,
        // because the parse step of OptionParser, an inner class of SparkSubmitCommandBuilder, fills them in.
        // See the SparkSubmitCommandBuilder section below for the details of this step.
        builder = new SparkSubmitCommandBuilder(args);
      } catch (IllegalArgumentException e) {
        printLaunchCommand = false;
        System.err.println("Error: " + e.getMessage());
        System.err.println();

        MainClassOptionParser parser = new MainClassOptionParser();
        try {
          parser.parse(args);
        } catch (Exception ignored) {
          // Ignore parsing exceptions.
        }

        List<String> help = new ArrayList<>();
        if (parser.className != null) {
          help.add(parser.CLASS);
          help.add(parser.className);
        }
        help.add(parser.USAGE_ERROR);
        builder = new SparkSubmitCommandBuilder(help);
      }
    } else {
      builder = new SparkClassCommandBuilder(className, args);
    }

    Map<String, String> env = new HashMap<>();
    // Next, build the command; see the buildCommand method of SparkSubmitCommandBuilder for details
    List<String> cmd = builder.buildCommand(env);
    if (printLaunchCommand) {
      System.err.println("Spark Command: " + join(" ", cmd));
      System.err.println("========================================");
    }

    if (isWindows()) {
      System.out.println(prepareWindowsCommand(cmd, env));
    } else {
      // In bash, use NULL as the arg separator since it cannot be used in an argument.
      List<String> bashCmd = prepareBashCommand(cmd, env);
      // Final result: {JAVA_HOME}/bin/java [java_opt] -cp xx/xxx;yy/yy/*...  org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode cluster --conf xxx=xxx --jars jar1,jar2 --class xxxx ... other spark-submit options, the user's jar, and the user's application arguments
      // This output is captured by the spark-class shell script,
      // which then runs the assembled command with exec
      for (String c : bashCmd) {
        System.out.print(c);
        System.out.print('\0');
      }
    }
  }

  /**
   * Prepare a command line for execution from a Windows batch script.
   *
   * The method quotes all arguments so that spaces are handled as expected. Quotes within arguments
   * are "double quoted" (which is batch for escaping a quote). This page has more details about
   * quoting and other batch script fun stuff:
   */
  private static String prepareWindowsCommand(List<String> cmd, Map<String, String> childEnv) {
    StringBuilder cmdline = new StringBuilder();
    for (Map.Entry<String, String> e : childEnv.entrySet()) {
      cmdline.append(String.format("set %s=%s", e.getKey(), e.getValue()));
      cmdline.append(" && ");
    }
    for (String arg : cmd) {
      cmdline.append(quoteForBatchScript(arg));
      cmdline.append(" ");
    }
    return cmdline.toString();
  }

  /**
   * Prepare the command for execution from a bash script. The final command will have commands to
   * set up any needed environment variables needed by the child process.
   */
  private static List<String> prepareBashCommand(List<String> cmd, Map<String, String> childEnv) {
    if (childEnv.isEmpty()) {
      return cmd;
    }

    List<String> newCmd = new ArrayList<>();
    newCmd.add("env");

    for (Map.Entry<String, String> e : childEnv.entrySet()) {
      newCmd.add(String.format("%s=%s", e.getKey(), e.getValue()));
    }
    newCmd.addAll(cmd);
    return newCmd;
  }

  /**
   * A parser used when command line parsing fails for spark-submit. It's used as a best-effort
   * at trying to identify the class the user wanted to invoke, since that may require special
   * usage strings (handled by SparkSubmitArguments).
   */
  private static class MainClassOptionParser extends SparkSubmitOptionParser {

    String className;

    @Override
    protected boolean handle(String opt, String value) {
      if (CLASS.equals(opt)) {
        className = value;
      }
      return false;
    }

    @Override
    protected boolean handleUnknown(String opt) {
      return false;
    }

    @Override
    protected void handleExtraArgs(List<String> extra) {

    }

  }

}
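
To make it concrete what the shell receives when the child environment is not empty, here is a small standalone sketch of the env-prefix idea behind prepareBashCommand. The class EnvPrefixSketch, the method withEnv, and the sample values are assumptions made for this illustration; they are not Spark code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical standalone sketch; the class and method names are not Spark's.
final class EnvPrefixSketch {

  // Prefix the command with "env KEY=VALUE ..." so that a shell exec'ing the
  // token list starts the child process with the extra environment variables.
  static List<String> withEnv(List<String> cmd, Map<String, String> childEnv) {
    if (childEnv.isEmpty()) {
      return cmd;
    }
    List<String> newCmd = new ArrayList<>();
    newCmd.add("env");
    for (Map.Entry<String, String> e : childEnv.entrySet()) {
      newCmd.add(e.getKey() + "=" + e.getValue());
    }
    newCmd.addAll(cmd);
    return newCmd;
  }

  public static void main(String[] args) {
    Map<String, String> env = new LinkedHashMap<>();
    env.put("LD_LIBRARY_PATH", "/opt/native"); // made-up value
    List<String> cmd = Arrays.asList(
        "java", "-cp", "/opt/spark/jars/*", "org.apache.spark.deploy.SparkSubmit");
    System.out.println(String.join(" ", withEnv(cmd, env)));
  }
}
```

With an empty map the command is returned untouched, which is why a plain cluster-mode submission usually reaches the shell without any env prefix.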




SparkSubmitCommandBuilder class



// Here args is: --master xx --deploy-mode cluster ... --jars xxx.jar,yyy.jar user-jar.jar
SparkSubmitCommandBuilder(List<String> args) {
    this.allowsMixedArguments = false;
    this.sparkArgs = new ArrayList<>();
    boolean isExample = false;
    List<String> submitArgs = args;

    if (args.size() > 0) {
      // For a plain spark-submit, none of the cases in this switch match
      switch (args.get(0)) {
        case PYSPARK_SHELL:
          this.allowsMixedArguments = true;
          appResource = PYSPARK_SHELL;
          submitArgs = args.subList(1, args.size());
          break;

        case SPARKR_SHELL:
          this.allowsMixedArguments = true;
          appResource = SPARKR_SHELL;
          submitArgs = args.subList(1, args.size());
          break;

        case RUN_EXAMPLE:
          isExample = true;
          submitArgs = args.subList(1, args.size());
      }

      this.isExample = isExample; // false
      // See the OptionParser section for details
      OptionParser parser = new OptionParser();
      parser.parse(submitArgs);
      // true for spark-submit
      this.isAppResourceReq = parser.isAppResourceReq;
    }  else {
      this.isExample = isExample;
      this.isAppResourceReq = false;
    }
  }


// env is the empty map passed in, and appResource is the user's own job jar, so the else branch is taken
  public List<String> buildCommand(Map<String, String> env)
      throws IOException, IllegalArgumentException {
    if (PYSPARK_SHELL.equals(appResource) && isAppResourceReq) {
      return buildPySparkShellCommand(env);
    } else if (SPARKR_SHELL.equals(appResource) && isAppResourceReq) {
      return buildSparkRCommand(env);
    } else { // spark-submit takes this branch; details follow below
      return buildSparkSubmitCommand(env);
    }
  }


private List<String> buildSparkSubmitCommand(Map<String, String> env)
      throws IOException, IllegalArgumentException {
    // Load the properties file given by the user, or conf/spark-defaults.conf otherwise.
    // conf/spark-defaults.conf allows settings to be configured in one place; production environments usually have this file, though often with every entry commented out.
    Map<String, String> config = getEffectiveConfig();
    // This returns false here, since the job is submitted in cluster deploy mode
    boolean isClientMode = isClientMode(config);
    // So extraClassPath is null
    String extraClassPath = isClientMode ? config.get(SparkLauncher.DRIVER_EXTRA_CLASSPATH) : null;
    // Returns ${JAVA_HOME}/bin/java [java-opts] -cp ...
    // If the conf/java-opts file exists, the JVM options it contains are added after ${JAVA_HOME}/bin/java
    // The -cp value is assembled mainly in the buildClassPath method
    // The classpath includes spark_home/conf, spark_home/core/target/jars/*, spark_home/mllib/target/jars/*, spark_home/assembly/target/scala-%s/jars/*, and the entries from HADOOP_CONF_DIR, YARN_CONF_DIR and SPARK_DIST_CLASSPATH
    List<String> cmd = buildJavaCommand(extraClassPath);
    // Take Thrift Server as daemon
    if (isThriftServer(mainClass)) {
      addOptionString(cmd, System.getenv("SPARK_DAEMON_JAVA_OPTS"));
    }
    // At this point the command looks like: {JAVA_HOME}/bin/java [java_opt] -cp xx/xxx;yy/yy/*...
    // Next, append the JVM options from SPARK_SUBMIT_OPTS
    addOptionString(cmd, System.getenv("SPARK_SUBMIT_OPTS"));

    // We don't want the client to specify Xmx. These have to be set by their corresponding
    // memory flag --driver-memory or configuration entry spark.driver.memory
    String driverExtraJavaOptions = config.get(SparkLauncher.DRIVER_EXTRA_JAVA_OPTIONS);
    if (!isEmpty(driverExtraJavaOptions) && driverExtraJavaOptions.contains("Xmx")) {
      String msg = String.format("Not allowed to specify max heap(Xmx) memory settings through " +
                   "java options (was %s). Use the corresponding --driver-memory or " +
                   "spark.driver.memory configuration instead.", driverExtraJavaOptions);
      throw new IllegalArgumentException(msg);
    }

    if (isClientMode) {
      // Figuring out where the memory value come from is a little tricky due to precedence.
      // Precedence is observed in the following order:
      // - explicit configuration (setConf()), which also covers --driver-memory cli argument.
      // - properties file.
      // - SPARK_DRIVER_MEMORY env variable
      // - SPARK_MEM env variable
      // - default value (1g)
      // Take Thrift Server as daemon
      String tsMemory =
        isThriftServer(mainClass) ? System.getenv("SPARK_DAEMON_MEMORY") : null;
      String memory = firstNonEmpty(tsMemory, config.get(SparkLauncher.DRIVER_MEMORY),
        System.getenv("SPARK_DRIVER_MEMORY"), System.getenv("SPARK_MEM"), DEFAULT_MEM);
      cmd.add("-Xmx" + memory);
      addOptionString(cmd, driverExtraJavaOptions);
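      mergeEnvPathList(env, getLibPathEnvName(),
        config.get(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH));
    }

    // Add the main class. The command now looks like: {JAVA_HOME}/bin/java [java_opt] -cp xx/xxx;yy/yy/*...  org.apache.spark.deploy.SparkSubmit
    cmd.add("org.apache.spark.deploy.SparkSubmit");
    // Normalize and append the arguments that follow spark-submit:
    // --master yarn --deploy-mode cluster --conf xxx=xxx --jars jar1,jar2 --class xxxx ... other spark-submit options
    cmd.addAll(buildSparkSubmitArgs());
    // The final cmd:
    // {JAVA_HOME}/bin/java [java_opt] -cp xx/xxx;yy/yy/*...  org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode cluster --conf xxx=xxx --jars jar1,jar2 --class xxxx ... other spark-submit options, the user's jar, and the user's application arguments
    return cmd;
  }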
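
The driver memory precedence described in the comments of the client-mode branch can be tried out in isolation. The sketch below is a hedged illustration, not Spark's implementation: DriverMemorySketch, resolveDriverMemory and rejectXmx are names invented here, and only the precedence order and the 1g default are taken from the listing above.

```java
// Hypothetical standalone sketch; only the precedence order comes from the listing above.
final class DriverMemorySketch {
  static final String DEFAULT_MEM = "1g";

  // Return the first candidate that is neither null nor empty, or null if there is none.
  static String firstNonEmpty(String... candidates) {
    for (String s : candidates) {
      if (s != null && !s.isEmpty()) {
        return s;
      }
    }
    return null;
  }

  // daemonMemory stands for SPARK_DAEMON_MEMORY (Thrift Server only), confMemory for
  // spark.driver.memory, and the last two for the SPARK_DRIVER_MEMORY and SPARK_MEM variables.
  static String resolveDriverMemory(String daemonMemory, String confMemory,
      String envDriverMemory, String envMem) {
    return firstNonEmpty(daemonMemory, confMemory, envDriverMemory, envMem, DEFAULT_MEM);
  }

  // Heap size must come from the memory setting, never from the extra Java options.
  static void rejectXmx(String driverExtraJavaOptions) {
    if (driverExtraJavaOptions != null && driverExtraJavaOptions.contains("Xmx")) {
      throw new IllegalArgumentException(
          "Use --driver-memory or spark.driver.memory instead of -Xmx");
    }
  }

  public static void main(String[] args) {
    rejectXmx("-XX:+UseG1GC");
    // The configured value wins over the environment variable, so this prints -Xmx4g
    System.out.println("-Xmx" + resolveDriverMemory(null, "4g", "2g", null));
  }
}
```

In cluster deploy mode none of this applies to the local JVM, because isClientMode is false and the -Xmx flag is never added to the launcher command.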



protected final void parse(List<String> args) {
    Pattern eqSeparatedOpt = Pattern.compile("(--[^=]+)=(.+)");

    int idx = 0;
    for (idx = 0; idx < args.size(); idx++) {
      String arg = args.get(idx);
      String value = null;

      Matcher m = eqSeparatedOpt.matcher(arg);
      if (m.matches()) {
        arg = m.group(1);
        value = m.group(2);
      }

      // Look for options with a value.
      // Taking --master yarn as an example, name is --master.
      // If arg is the user's job jar (the last token before the application arguments), name is null.
      String name = findCliOption(arg, opts);
      if (name != null) {
        // In --master yarn the option and its value are separate tokens,
        // so idx is advanced to read yarn; if no further token exists, an exception is thrown
        if (value == null) {
          if (idx == args.size() - 1) {
            throw new IllegalArgumentException(
                String.format("Missing argument for option '%s'.", arg));
          }
          idx++;
          value = args.get(idx);
        }
        if (!handle(name, value)) {
          break;
        }
        continue;
      }

      // Look for a switch.
      // This handles the value-less flags of spark-submit, such as the help flag
      name = findCliOption(arg, switches);
      if (name != null) {
        if (!handle(name, null)) {
          break;
        }
        continue;
      }

      // This is where appResource, i.e. the user's own job jar, is handled
      if (!handleUnknown(arg)) {
        break;
      }
    }

    if (idx < args.size()) {
      idx++;
    }
    // Handle appArgs, the user's own arguments that follow the job jar
    handleExtraArgs(args.subList(idx, args.size()));
  }
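
The parsing loop is easier to follow with a stripped-down, runnable version. The sketch below is an assumption-laden simplification: OptionParseSketch is a made-up class, its OPTS and SWITCHES sets are small hypothetical subsets of the real option tables, and the handle/handleUnknown callbacks are replaced by direct field assignments. It keeps the three behaviours discussed above: --opt=value and --opt value are both accepted, the first unknown token becomes the application resource, and everything after it is treated as application arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical simplification of the parse loop; not Spark's SparkSubmitOptionParser.
final class OptionParseSketch {
  // Small illustrative subsets of the options that take a value and of the value-less switches.
  static final Set<String> OPTS = new HashSet<>(
      Arrays.asList("--master", "--deploy-mode", "--class", "--conf", "--jars"));
  static final Set<String> SWITCHES = new HashSet<>(Arrays.asList("--verbose", "--help"));

  // Note: a map keeps only the last value per option, so repeated --conf entries overwrite.
  final Map<String, String> options = new LinkedHashMap<>();
  final List<String> switches = new ArrayList<>();
  String appResource;
  List<String> appArgs = new ArrayList<>();

  void parse(List<String> args) {
    Pattern eqSeparatedOpt = Pattern.compile("(--[^=]+)=(.+)");
    int idx;
    for (idx = 0; idx < args.size(); idx++) {
      String arg = args.get(idx);
      String value = null;
      Matcher m = eqSeparatedOpt.matcher(arg);
      if (m.matches()) {
        arg = m.group(1);
        value = m.group(2);
      }
      if (OPTS.contains(arg)) {
        if (value == null) {
          if (idx == args.size() - 1) {
            throw new IllegalArgumentException("Missing argument for option '" + arg + "'.");
          }
          value = args.get(++idx); // the value is the next token
        }
        options.put(arg, value);
      } else if (SWITCHES.contains(arg)) {
        switches.add(arg);
      } else {
        appResource = arg; // first unknown token: the user's job jar
        break;
      }
    }
    if (idx < args.size()) {
      idx++; // skip the job jar itself
    }
    appArgs = new ArrayList<>(args.subList(idx, args.size()));
  }

  public static void main(String[] argv) {
    OptionParseSketch p = new OptionParseSketch();
    p.parse(Arrays.asList("--master", "yarn", "--deploy-mode=cluster",
        "--class", "com.example.App", "app.jar", "in", "out"));
    System.out.println(p.options + " " + p.appResource + " " + p.appArgs);
  }
}
```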

SparkSubmit Object

The end result of the spark-submit shell script described above is a call to the main method of the SparkSubmit object.
{JAVA_HOME}/bin/java [java_opt] -cp xx/xxx;yy/yy/*… org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode cluster --conf xxx=xxx --jars hdfs://jar1,jar2 --class xxxx … other spark-submit options, the user's jar, and the user's application arguments


// Here args is: --master yarn --deploy-mode cluster --conf xxx=xxx --jars hdfs://jar1,jar2 --class xxxx ... other spark-submit options, the user's jar, and the user's application arguments
override def main(args: Array[String]): Unit = {
    // Initialize logging if it hasn't been done yet. Keep track of whether logging needs to
    // be reset before the application starts.
    val uninitLog = initializeLogIfNecessary(true, silent = true)
    // Parse the arguments with SparkSubmitArguments; the actual parse logic is the same as the one shown above
    val appArgs = new SparkSubmitArguments(args)
    if (appArgs.verbose) {
      // scalastyle:off println
      printStream.println(appArgs)
      // scalastyle:on println
    }
    appArgs.action match {
      case SparkSubmitAction.SUBMIT => submit(appArgs, uninitLog)
      case SparkSubmitAction.KILL => kill(appArgs)
      case SparkSubmitAction.REQUEST_STATUS => requestStatus(appArgs)
    }
  }

// args: the SparkSubmitArguments parsed from --master yarn --deploy-mode cluster --conf xxx=xxx --jars jar1,jar2 --class xxxx ... other spark-submit options, the user's jar, and the user's application arguments
private def submit(args: SparkSubmitArguments, uninitLog: Boolean): Unit = {

    def doRunMain(): Unit = {
      if (args.proxyUser != null) {
        val proxyUser = UserGroupInformation.createProxyUser(args.proxyUser,
          UserGroupInformation.getCurrentUser())
        try {
          proxyUser.doAs(new PrivilegedExceptionAction[Unit]() {
            override def run(): Unit = {
              runMain(args, uninitLog)
            }
          })
        } catch {
          case e: Exception =>
            if (e.getStackTrace().length == 0) {
              // scalastyle:off println
              printStream.println(s"ERROR: ${e.getClass().getName()}: ${e.getMessage()}")
              // scalastyle:on println
              exitFn(1)
            } else {
              throw e
            }
        }
      } else {
        // Control passes on to runMain, which is covered next
        runMain(args, uninitLog)
      }
    }

    if (args.isStandaloneCluster && args.useRest) {
      try {
        // scalastyle:off println
        printStream.println("Running Spark using the REST application submission protocol.")
        // scalastyle:on println
        doRunMain()
      } catch {
        // Fail over to use the legacy submission gateway
        case e: SubmitRestConnectionException =>
          printWarning(s"Master endpoint ${args.master} was not a REST server. " +
            "Falling back to legacy submission gateway instead.")
          args.useRest = false
          submit(args, false)
      }
    // In all other modes, just run the main class as prepared
    } else {
      // For a YARN submission execution comes straight here and runs the inner function doRunMain
      doRunMain()
    }
  }

runMain method

// args: the SparkSubmitArguments parsed from --master yarn --deploy-mode cluster --conf xxx=xxx --jars jar1,jar2 --class xxxx ... other spark-submit options, the user's jar, and the user's application arguments
private def runMain(args: SparkSubmitArguments, uninitLog: Boolean): Unit = {
    // prepareSubmitEnvironment is an important and very long method. It determines the submission
    // class that connects to the chosen cluster manager and runs the user's jar.
    // It is walked through in detail below.
    val (childArgs, childClasspath, sparkConf, childMainClass) = prepareSubmitEnvironment(args)
    // sparkConf, childClasspath, childMainClass and childArgs are now initialized.
    // On YARN in cluster mode, childMainClass is org.apache.spark.deploy.yarn.YarnClusterApplication
    // childClasspath already contains the user's job jar
    // Let the main class re-initialize the logging system once it starts.
    if (uninitLog) {
      Logging.uninitialize()
    }

    // scalastyle:off println
    if (args.verbose) {
      printStream.println(s"Main class:\n$childMainClass")
      // sysProps may contain sensitive information, so redact before printing
      printStream.println(s"Spark config:\n${Utils.redact(sparkConf.getAll.toMap).mkString("\n")}")
      printStream.println(s"Classpath elements:\n${childClasspath.mkString("\n")}")
    // scalastyle:on println

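    val loader =
      if (sparkConf.get(DRIVER_USER_CLASS_PATH_FIRST)) {
        new ChildFirstURLClassLoader(new Array[URL](0),
          Thread.currentThread.getContextClassLoader)
      } else {
        new MutableURLClassLoader(new Array[URL](0),
          Thread.currentThread.getContextClassLoader)
      }
    Thread.currentThread.setContextClassLoader(loader)

    for (jar <- childClasspath) {
      addJarToClasspath(jar, loader)
    }

    var mainClass: Class[_] = null

    try {
      // Load org.apache.spark.deploy.yarn.YarnClusterApplication by reflection.
      // The class lives under resource-managers in the source tree,
      // and is defined in Client.scala of the org.apache.spark.deploy.yarn package.
      mainClass = Utils.classForName(childMainClass)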
    } catch {
      case e: ClassNotFoundException =>
        e.printStackTrace(printStream)
        if (childMainClass.contains("thriftserver")) {
          // scalastyle:off println
          printStream.println(s"Failed to load main class $childMainClass.")
          printStream.println("You need to build Spark with -Phive and -Phive-thriftserver.")
          // scalastyle:on println
        }
        System.exit(CLASS_NOT_FOUND_EXIT_STATUS)
      case e: NoClassDefFoundError =>
        e.printStackTrace(printStream)
        if (e.getMessage.contains("org/apache/hadoop/hive")) {
          // scalastyle:off println
          printStream.println(s"Failed to load hive class.")
          printStream.println("You need to build Spark with -Phive and -Phive-thriftserver.")
          // scalastyle:on println
        }
        System.exit(CLASS_NOT_FOUND_EXIT_STATUS)
    }

    val app: SparkApplication = if (classOf[SparkApplication].isAssignableFrom(mainClass)) {
      // This branch is taken: instantiate org.apache.spark.deploy.yarn.YarnClusterApplication
      mainClass.newInstance().asInstanceOf[SparkApplication]
    } else {
      // SPARK-4170
      if (classOf[scala.App].isAssignableFrom(mainClass)) {
        printWarning("Subclasses of scala.App may not work correctly. Use a main() method instead.")
      }
      new JavaMainApplication(mainClass)
    }

    def findCause(t: Throwable): Throwable = t match {
      case e: UndeclaredThrowableException =>
        if (e.getCause() != null) findCause(e.getCause()) else e
      case e: InvocationTargetException =>
        if (e.getCause() != null) findCause(e.getCause()) else e
      case e: Throwable =>
        e
    }

    try {
      // Call the start method of org.apache.spark.deploy.yarn.YarnClusterApplication, passing the arguments and sparkConf.
      // The steps that follow are covered in the YarnClusterApplication section below.
      app.start(childArgs.toArray, sparkConf)
    } catch {
      case t: Throwable =>
        findCause(t) match {
          case SparkUserAppException(exitCode) =>
            System.exit(exitCode)

          case t: Throwable =>
            throw t
        }
    }
  }

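
The dispatch at the heart of runMain (load a class by name, start it directly if it implements the application interface, otherwise wrap its static main) can be reproduced with plain Java reflection. Everything in the sketch below is a stand-in invented for illustration: App plays the role of SparkApplication, ClusterApp the role of YarnClusterApplication, PlainMain the role of an ordinary user class, and the lambda returned by load the role of JavaMainApplication.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical sketch of reflective dispatch; none of these types are Spark classes.
final class ReflectiveLaunchSketch {

  // Stand-in for the SparkApplication trait.
  interface App {
    void start(String[] args) throws Exception;
  }

  // Stand-in for a class such as YarnClusterApplication that implements the interface.
  static final class ClusterApp implements App {
    static String lastArgs;
    public ClusterApp() {}
    @Override
    public void start(String[] args) {
      lastArgs = String.join(",", args);
    }
  }

  // Stand-in for an ordinary class that only has a static main method.
  static final class PlainMain {
    static String lastArgs;
    public static void main(String[] args) {
      lastArgs = String.join(",", args);
    }
  }

  // Load the class by name and adapt it to the App interface.
  static App load(String className) throws Exception {
    Class<?> mainClass = Class.forName(className);
    if (App.class.isAssignableFrom(mainClass)) {
      return (App) mainClass.getDeclaredConstructor().newInstance();
    }
    Method main = mainClass.getMethod("main", String[].class);
    if (!Modifier.isStatic(main.getModifiers())) {
      throw new IllegalStateException("The main method in the given main class must be static");
    }
    // The cast to Object passes the array as a single argument instead of spreading it.
    return args -> main.invoke(null, (Object) args);
  }

  // Convenience wrapper that converts checked exceptions into an unchecked one.
  static void launch(String className, String... args) {
    try {
      load(className).start(args);
    } catch (Exception e) {
      throw new IllegalStateException("Failed to launch " + className, e);
    }
  }

  public static void main(String[] argv) {
    launch(ClusterApp.class.getName(), "--class", "com.example.App");
    launch(PlainMain.class.getName(), "in", "out");
    System.out.println(ClusterApp.lastArgs + " | " + PlainMain.lastArgs);
  }
}
```

The point of the indirection is that the submitting JVM never needs a compile-time dependency on the cluster-specific class: only its name, chosen by prepareSubmitEnvironment, is required.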
private[deploy] def prepareSubmitEnvironment(
      args: SparkSubmitArguments,
      conf: Option[HadoopConfiguration] = None)
      : (Seq[String], Seq[String], SparkConf, String) = {
    try {
      // Delegate to doPrepareSubmitEnvironment; conf is None here
      doPrepareSubmitEnvironment(args, conf)
      // The returned sparkConf, childClasspath, childMainClass and childArgs are now initialized.
      // On YARN in cluster mode, childMainClass is org.apache.spark.deploy.yarn.YarnClusterApplication
      // childClasspath already contains the user's job jar
    } catch {
      case e: SparkException =>
        throw e

private def doPrepareSubmitEnvironment(
      args: SparkSubmitArguments,
      conf: Option[HadoopConfiguration] = None)
      : (Seq[String], Seq[String], SparkConf, String) = {
    // Return values
    val childArgs = new ArrayBuffer[String]()
    val childClasspath = new ArrayBuffer[String]()
    val sparkConf = new SparkConf()
    var childMainClass = ""

    // clusterManager = YARN
    val clusterManager: Int = args.master match {
      case "yarn" => YARN
      case "yarn-client" | "yarn-cluster" =>
        printWarning(s"Master ${args.master} is deprecated since 2.0." +
          " Please use master \"yarn\" with specified deploy mode instead.")
      case m if m.startsWith("spark") => STANDALONE
      case m if m.startsWith("mesos") => MESOS
      case m if m.startsWith("k8s") => KUBERNETES
      case m if m.startsWith("local") => LOCAL
      case _ =>
        printErrorAndExit("Master must either be yarn or start with spark, mesos, k8s, or local")

    // deployMode = CLUSTER
    var deployMode: Int = args.deployMode match {
      case "client" | null => CLIENT
      case "cluster" => CLUSTER
      case _ => printErrorAndExit("Deploy mode must be either client or cluster"); -1
    // This if is entered, since args.master = "yarn"
    if (clusterManager == YARN) {
      (args.master, args.deployMode) match {
        case ("yarn-cluster", null) =>
          deployMode = CLUSTER
          args.master = "yarn"
        case ("yarn-cluster", "client") =>
          printErrorAndExit("Client deploy mode is not compatible with master \"yarn-cluster\"")
        case ("yarn-client", "cluster") =>
          printErrorAndExit("Cluster deploy mode is not compatible with master \"yarn-client\"")
        case (_, mode) =>
          args.master = "yarn"

      // Make sure YARN is included in our build if we're trying to use it
      if (!Utils.classIsLoadable(YARN_CLUSTER_SUBMIT_CLASS) && !Utils.isTesting) {
          "Could not load YARN classes. " +
          "This copy of Spark may not have been compiled with YARN support.")

    if (clusterManager == KUBERNETES) {
      args.master = Utils.checkAndGetK8sMasterUrl(args.master)
      // Make sure KUBERNETES is included in our build if we're trying to use it
      if (!Utils.classIsLoadable(KUBERNETES_CLUSTER_SUBMIT_CLASS) && !Utils.isTesting) {
          "Could not load KUBERNETES classes. " +
            "This copy of Spark may not have been compiled with KUBERNETES support.")

    // Fail fast, the following modes are not supported or applicable
    (clusterManager, deployMode) match {
      case (STANDALONE, CLUSTER) if args.isPython =>
        printErrorAndExit("Cluster deploy mode is currently not supported for python " +
          "applications on standalone clusters.")
      case (STANDALONE, CLUSTER) if args.isR =>
        printErrorAndExit("Cluster deploy mode is currently not supported for R " +
          "applications on standalone clusters.")
      case (KUBERNETES, _) if args.isPython =>
        printErrorAndExit("Python applications are currently not supported for Kubernetes.")
      case (KUBERNETES, _) if args.isR =>
        printErrorAndExit("R applications are currently not supported for Kubernetes.")
      case (KUBERNETES, CLIENT) =>
        printErrorAndExit("Client mode is currently not supported for Kubernetes.")
      case (LOCAL, CLUSTER) =>
        printErrorAndExit("Cluster deploy mode is not compatible with master \"local\"")
      case (_, CLUSTER) if isShell(args.primaryResource) =>
        printErrorAndExit("Cluster deploy mode is not applicable to Spark shells.")
      case (_, CLUSTER) if isSqlShell(args.mainClass) =>
        printErrorAndExit("Cluster deploy mode is not applicable to Spark SQL shell.")
      case (_, CLUSTER) if isThriftServer(args.mainClass) =>
        printErrorAndExit("Cluster deploy mode is not applicable to Spark Thrift server.")
      case _ =>

    // Update args.deployMode if it is null. It will be passed down as a Spark property later.
    (args.deployMode, deployMode) match {
      case (null, CLIENT) => args.deployMode = "client"
      case (null, CLUSTER) => args.deployMode = "cluster"
      case _ =>
    // isYarnCluster is true; the other three flags are false
    val isYarnCluster = clusterManager == YARN && deployMode == CLUSTER
    val isMesosCluster = clusterManager == MESOS && deployMode == CLUSTER
    val isStandAloneCluster = clusterManager == STANDALONE && deployMode == CLUSTER
    val isKubernetesCluster = clusterManager == KUBERNETES && deployMode == CLUSTER
    // This if branch is entered, but since the job here is a Scala program it is not examined in detail
    if (!isMesosCluster && !isStandAloneCluster) {
      val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies(
        args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath,

      if (!StringUtils.isBlank(resolvedMavenCoordinates)) {
        args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates)
        if (args.isPython || isInternal(args.primaryResource)) {
          args.pyFiles = mergeFileLists(args.pyFiles, resolvedMavenCoordinates)
      if (args.isR && !StringUtils.isBlank(args.jars)) {
        RPackageUtils.checkAndBuildRPackage(args.jars, printStream, args.verbose)
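    // Copy the settings given with --conf on the spark-submit command line into sparkConf,
    // then create a hadoopConf object derived from sparkConf; it mainly picks up the S3 settings and the sparkConf entries whose keys start with spark.hadoop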
    args.sparkProperties.foreach { case (k, v) => sparkConf.set(k, v) }
    val hadoopConf = conf.getOrElse(SparkHadoopUtil.newConfiguration(sparkConf))
    val targetDir = Utils.createTempDir()

    // assure a keytab is available from any place in a JVM
    if (clusterManager == YARN || clusterManager == LOCAL || clusterManager == MESOS) {
      if (args.principal != null) {
        if (args.keytab != null) {
          require(new File(args.keytab).exists(), s"Keytab file: ${args.keytab} does not exist")
          // Add keytab and principal configurations in sysProps to make them available
          // for later use; e.g. in spark sql, the isolated class loader used to talk
          // to HiveMetastore will use these settings. They will be set as Java system
          // properties and then loaded by SparkConf
          sparkConf.set(KEYTAB, args.keytab)
          sparkConf.set(PRINCIPAL, args.principal)
          UserGroupInformation.loginUserFromKeytab(args.principal, args.keytab)

    // Resolve glob patterns in the different path lists into fully qualified paths
    args.jars = Option(args.jars).map(resolveGlobPaths(_, hadoopConf)).orNull
    args.files = Option(args.files).map(resolveGlobPaths(_, hadoopConf)).orNull
    args.pyFiles = Option(args.pyFiles).map(resolveGlobPaths(_, hadoopConf)).orNull
    args.archives = Option(args.archives).map(resolveGlobPaths(_, hadoopConf)).orNull

    lazy val secMgr = new SecurityManager(sparkConf)

    // In client mode, download remote files.
    var localPrimaryResource: String = null
    var localJars: String = null
    var localPyFiles: String = null
    if (deployMode == CLIENT) {
      localPrimaryResource = Option(args.primaryResource).map {
        downloadFile(_, targetDir, sparkConf, hadoopConf, secMgr)
      localJars = Option(args.jars).map {
        downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
      localPyFiles = Option(args.pyFiles).map {
        downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)

    // This if is entered; it downloads resources from other remote schemes, such as http and ftp
    if (clusterManager == YARN) {
      val forceDownloadSchemes = sparkConf.get(FORCE_DOWNLOAD_SCHEMES)

      def shouldDownload(scheme: String): Boolean = {
        forceDownloadSchemes.contains(scheme) ||
          Try { FileSystem.getFileSystemClass(scheme, hadoopConf) }.isFailure

      def downloadResource(resource: String): String = {
        val uri = Utils.resolveURI(resource)
        uri.getScheme match {
          case "local" | "file" => resource
          case e if shouldDownload(e) =>
            val file = new File(targetDir, new Path(uri).getName)
            if (file.exists()) {
            } else {
              downloadFile(resource, targetDir, sparkConf, hadoopConf, secMgr)
          case _ => uri.toString

      args.primaryResource = Option(args.primaryResource).map { downloadResource }.orNull
      args.files = Option(args.files).map { files =>
      args.pyFiles = Option(args.pyFiles).map { pyFiles =>
      args.jars = Option(args.jars).map { jars =>
      args.archives = Option(args.archives).map { archives =>

    // If we're running a python app, set the main class to our specific python runner
    if (args.isPython && deployMode == CLIENT) {
      if (args.primaryResource == PYSPARK_SHELL) {
        args.mainClass = "org.apache.spark.api.python.PythonGatewayServer"
      } else {
        // If a python file is provided, add it to the child arguments and list of files to deploy.
        // Usage: PythonAppRunner 
[app arguments] args.mainClass = "org.apache.spark.deploy.PythonRunner" args.childArgs = ArrayBuffer(localPrimaryResource, localPyFiles) ++ args.childArgs if (clusterManager != YARN) { // The YARN backend distributes the primary file differently, so don't merge it. args.files = mergeFileLists(args.files, args.primaryResource) } } if (clusterManager != YARN) { // The YARN backend handles python files differently, so don't merge the lists. args.files = mergeFileLists(args.files, args.pyFiles) } if (localPyFiles != null) { sparkConf.set("spark.submit.pyFiles", localPyFiles) } } // In YARN mode for an R app, add the SparkR package archive and the R package // archive containing all of the built R libraries to archives so that they can // be distributed with the job if (args.isR && clusterManager == YARN) { val sparkRPackagePath = RUtils.localSparkRPackagePath if (sparkRPackagePath.isEmpty) { printErrorAndExit("SPARK_HOME does not exist for R application in YARN mode.") } val sparkRPackageFile = new File(sparkRPackagePath.get, SPARKR_PACKAGE_ARCHIVE) if (!sparkRPackageFile.exists()) { printErrorAndExit(s"$SPARKR_PACKAGE_ARCHIVE does not exist for R application in YARN mode.") } val sparkRPackageURI = Utils.resolveURI(sparkRPackageFile.getAbsolutePath).toString // Distribute the SparkR package. // Assigns a symbol link name "sparkr" to the shipped package. args.archives = mergeFileLists(args.archives, sparkRPackageURI + "#sparkr") // Distribute the R package archive containing all the built R packages. if (!RUtils.rPackages.isEmpty) { val rPackageFile = RPackageUtils.zipRLibraries(new File(RUtils.rPackages.get), R_PACKAGE_ARCHIVE) if (!rPackageFile.exists()) { printErrorAndExit("Failed to zip all the built R packages.") } val rPackageURI = Utils.resolveURI(rPackageFile.getAbsolutePath).toString // Assigns a symbol link name "rpkg" to the shipped package. 
        args.archives = mergeFileLists(args.archives, rPackageURI + "#rpkg")
      }
    }

    // TODO: Support distributing R packages with standalone cluster
    if (args.isR && clusterManager == STANDALONE && !RUtils.rPackages.isEmpty) {
      printErrorAndExit("Distributing R packages with standalone cluster is not supported.")
    }

    // TODO: Support distributing R packages with mesos cluster
    if (args.isR && clusterManager == MESOS && !RUtils.rPackages.isEmpty) {
      printErrorAndExit("Distributing R packages with mesos cluster is not supported.")
    }

    // If we're running an R app, set the main class to our specific R runner
    if (args.isR && deployMode == CLIENT) {
      if (args.primaryResource == SPARKR_SHELL) {
        args.mainClass = "org.apache.spark.api.r.RBackend"
      } else {
        // If an R file is provided, add it to the child arguments and list of files to deploy.
        // Usage: RRunner [app arguments]
        args.mainClass = "org.apache.spark.deploy.RRunner"
        args.childArgs = ArrayBuffer(localPrimaryResource) ++ args.childArgs
        args.files = mergeFileLists(args.files, args.primaryResource)
      }
    }

    if (isYarnCluster && args.isR) {
      // In yarn-cluster mode for an R app, add primary resource to files
      // that can be distributed with the job
      args.files = mergeFileLists(args.files, args.primaryResource)
    }

    // Special flag to avoid deprecation warnings at the client
    sys.props("SPARK_SUBMIT") = "true"

    // A list of rules to map each argument to system properties or command-line options in
    // each deploy mode; we iterate through these below
    val options = List[OptionAssigner](

      // All cluster managers
      OptionAssigner(args.master, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES, confKey = "spark.master"),
      OptionAssigner(args.deployMode, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
        confKey = "spark.submit.deployMode"),
      OptionAssigner(, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES, confKey = ""),
      OptionAssigner(args.ivyRepoPath, ALL_CLUSTER_MGRS, CLIENT, confKey = "spark.jars.ivy"),
      OptionAssigner(args.driverMemory, ALL_CLUSTER_MGRS, CLIENT,
        confKey = "spark.driver.memory"),
      OptionAssigner(args.driverExtraClassPath, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
        confKey = "spark.driver.extraClassPath"),
      OptionAssigner(args.driverExtraJavaOptions, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
        confKey = "spark.driver.extraJavaOptions"),
      OptionAssigner(args.driverExtraLibraryPath, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
        confKey = "spark.driver.extraLibraryPath"),

      // Propagate attributes for dependency resolution at the driver side
      OptionAssigner(args.packages, STANDALONE | MESOS, CLUSTER, confKey = "spark.jars.packages"),
      OptionAssigner(args.repositories, STANDALONE | MESOS, CLUSTER,
        confKey = "spark.jars.repositories"),
      OptionAssigner(args.ivyRepoPath, STANDALONE | MESOS, CLUSTER, confKey = "spark.jars.ivy"),
      OptionAssigner(args.packagesExclusions, STANDALONE | MESOS, CLUSTER,
        confKey = "spark.jars.excludes"),

      // Yarn only
      OptionAssigner(args.queue, YARN, ALL_DEPLOY_MODES, confKey = "spark.yarn.queue"),
      OptionAssigner(args.numExecutors, YARN, ALL_DEPLOY_MODES,
        confKey = "spark.executor.instances"),
      OptionAssigner(args.pyFiles, YARN, ALL_DEPLOY_MODES, confKey = "spark.yarn.dist.pyFiles"),
      OptionAssigner(args.jars, YARN, ALL_DEPLOY_MODES, confKey = "spark.yarn.dist.jars"),
      OptionAssigner(args.files, YARN, ALL_DEPLOY_MODES, confKey = "spark.yarn.dist.files"),
      OptionAssigner(args.archives, YARN, ALL_DEPLOY_MODES, confKey = "spark.yarn.dist.archives"),
      OptionAssigner(args.principal, YARN, ALL_DEPLOY_MODES, confKey = "spark.yarn.principal"),
      OptionAssigner(args.keytab, YARN, ALL_DEPLOY_MODES, confKey = "spark.yarn.keytab"),

      // Other options
      OptionAssigner(args.executorCores, STANDALONE | YARN | KUBERNETES, ALL_DEPLOY_MODES,
        confKey = "spark.executor.cores"),
      OptionAssigner(args.executorMemory, STANDALONE | MESOS | YARN | KUBERNETES, ALL_DEPLOY_MODES,
        confKey = "spark.executor.memory"),
      OptionAssigner(args.totalExecutorCores, STANDALONE | MESOS | KUBERNETES, ALL_DEPLOY_MODES,
        confKey = "spark.cores.max"),
      OptionAssigner(args.files, LOCAL | STANDALONE | MESOS | KUBERNETES, ALL_DEPLOY_MODES,
        confKey = "spark.files"),
      OptionAssigner(args.jars, LOCAL, CLIENT, confKey = "spark.jars"),
      OptionAssigner(args.jars, STANDALONE | MESOS | KUBERNETES, ALL_DEPLOY_MODES,
        confKey = "spark.jars"),
      OptionAssigner(args.driverMemory, STANDALONE | MESOS | YARN | KUBERNETES, CLUSTER,
        confKey = "spark.driver.memory"),
      OptionAssigner(args.driverCores, STANDALONE | MESOS | YARN | KUBERNETES, CLUSTER,
        confKey = "spark.driver.cores"),
      OptionAssigner(args.supervise.toString, STANDALONE | MESOS, CLUSTER,
        confKey = "spark.driver.supervise"),
      OptionAssigner(args.ivyRepoPath, STANDALONE, CLUSTER, confKey = "spark.jars.ivy"),

      // An internal option used only for spark-shell to add user jars to repl's classloader,
      // previously it uses "spark.jars" or "spark.yarn.dist.jars" which now may be pointed to
      // remote jars, so adding a new option to only specify local jars for spark-shell internally.
      OptionAssigner(localJars, ALL_CLUSTER_MGRS, CLIENT, confKey = "spark.repl.local.jars")
    )

    // In client mode, launch the application main class directly
    // In addition, add the main application jar and any added jars (if any) to the classpath
    if (deployMode == CLIENT) {
      childMainClass = args.mainClass
      if (localPrimaryResource != null && isUserJar(localPrimaryResource)) {
        childClasspath += localPrimaryResource
      }
      if (localJars != null) { childClasspath ++= localJars.split(",") }
    }
    // This branch is taken in yarn-cluster mode
    if (isYarnCluster) {
      // This is usually not configured
      if (isUserJar(args.primaryResource)) {
        childClasspath += args.primaryResource
      }
      // Add the --jars entries to childClasspath
      // sparkConf and childClasspath are initialized at this point
      if (args.jars != null) { childClasspath ++= args.jars.split(",") }
    }

    if (deployMode == CLIENT) {
      if (args.childArgs != null) { childArgs ++= args.childArgs }
    }

    // Map all arguments to command-line options or system properties for our chosen mode
    // sparkConf and childClasspath are initialized at this point
    // Set the value of every option that applies to this mode on sparkConf, one by one
    for (opt <- options) {
      if (opt.value != null &&
          (deployMode & opt.deployMode) != 0 &&
          (clusterManager & opt.clusterManager) != 0) {
        if (opt.clOption != null) { childArgs += (opt.clOption, opt.value) }
        if (opt.confKey != null) { sparkConf.set(opt.confKey, opt.value) }
      }
    }

    // In case of shells, spark.ui.showConsoleProgress can be true by default or by user.
    if (isShell(args.primaryResource) && !sparkConf.contains(UI_SHOW_CONSOLE_PROGRESS)) {
      sparkConf.set(UI_SHOW_CONSOLE_PROGRESS, true)
    }

    // Add the application jar automatically so the user doesn't have to call sc.addJar
    // For YARN cluster mode, the jar is already distributed on each node as "app.jar"
    // For python and R files, the primary resource is already distributed as a regular file
    if (!isYarnCluster && !args.isPython && !args.isR) {
      var jars = sparkConf.getOption("spark.jars").map(x => x.split(",").toSeq).getOrElse(Seq.empty)
      if (isUserJar(args.primaryResource)) {
        jars = jars ++ Seq(args.primaryResource)
      }
      sparkConf.set("spark.jars", jars.mkString(","))
    }

    // In standalone cluster mode, use the REST client to submit the application (Spark 1.3+).
    // All Spark parameters are expected to be passed to the client through system properties.
    if (args.isStandaloneCluster) {
      if (args.useRest) {
        childMainClass = REST_CLUSTER_SUBMIT_CLASS
        childArgs += (args.primaryResource, args.mainClass)
      } else {
        // In legacy standalone cluster mode, use Client as a wrapper around the user class
        childMainClass = STANDALONE_CLUSTER_SUBMIT_CLASS
        if (args.supervise) { childArgs += "--supervise" }
        Option(args.driverMemory).foreach { m => childArgs += ("--memory", m) }
        Option(args.driverCores).foreach { c => childArgs += ("--cores", c) }
        childArgs += "launch"
        childArgs += (args.master, args.primaryResource, args.mainClass)
      }
      if (args.childArgs != null) {
        childArgs ++= args.childArgs
      }
    }

    // Let YARN know it's a pyspark app, so it distributes needed libraries.
    // sparkConf and childClasspath are initialized at this point
    if (clusterManager == YARN) {
      if (args.isPython) {
        sparkConf.set("spark.yarn.isPython", "true")
      }
    }

    if (clusterManager == MESOS && UserGroupInformation.isSecurityEnabled) {
      setRMPrincipal(sparkConf)
    }

    // In yarn-cluster mode, use yarn.Client as a wrapper around the user class
    // sparkConf, childClasspath and childMainClass are initialized at this point
    if (isYarnCluster) {
      childMainClass = YARN_CLUSTER_SUBMIT_CLASS
      if (args.isPython) {
        childArgs += ("--primary-py-file", args.primaryResource)
        childArgs += ("--class", "org.apache.spark.deploy.PythonRunner")
      } else if (args.isR) {
        val mainFile = new Path(args.primaryResource).getName
        childArgs += ("--primary-r-file", mainFile)
        childArgs += ("--class", "org.apache.spark.deploy.RRunner")
      } else {
        if (args.primaryResource != SparkLauncher.NO_RESOURCE) {
          childArgs += ("--jar", args.primaryResource)
        }
        // sparkConf, childClasspath and childArgs are initialized at this point
        // childArgs += --class value
        childArgs += ("--class", args.mainClass)
      }
      // sparkConf, childClasspath, childMainClass and childArgs are initialized at this point
      // childArgs += --arg value, where each value is an argument of the user's own job
      if (args.childArgs != null) {
        args.childArgs.foreach { arg => childArgs += ("--arg", arg) }
      }
    }

    if (isMesosCluster) {
      assert(args.useRest, "Mesos cluster mode is only supported through the REST submission API")
      childMainClass = REST_CLUSTER_SUBMIT_CLASS
      if (args.isPython) {
        // Second argument is main class
        childArgs += (args.primaryResource, "")
        if (args.pyFiles != null) {
          sparkConf.set("spark.submit.pyFiles", args.pyFiles)
        }
      } else if (args.isR) {
        // Second argument is main class
        childArgs += (args.primaryResource, "")
      } else {
        childArgs += (args.primaryResource, args.mainClass)
      }
      if (args.childArgs != null) {
        childArgs ++= args.childArgs
      }
    }

    if (isKubernetesCluster) {
      childMainClass = KUBERNETES_CLUSTER_SUBMIT_CLASS
      if (args.primaryResource != SparkLauncher.NO_RESOURCE) {
        childArgs ++= Array("--primary-java-resource", args.primaryResource)
      }
      childArgs ++= Array("--main-class", args.mainClass)
      if (args.childArgs != null) {
        args.childArgs.foreach { arg =>
          childArgs += ("--arg", arg)
        }
      }
    }

    // Load any properties specified through --conf and the default properties file
    // sparkConf, childClasspath, childMainClass and childArgs are initialized at this point
    // sparkConf also picks up the --conf settings the user passed to spark-submit
    for ((k, v) <- args.sparkProperties) {
      sparkConf.setIfMissing(k, v)
    }

    // Ignore invalid in cluster modes.
    if (deployMode == CLUSTER) {
      sparkConf.remove("")
    }

    // Resolve paths in certain spark properties
    val pathConfigs = Seq(
      "spark.jars",
      "spark.files",
      "spark.yarn.dist.files",
      "spark.yarn.dist.archives",
      "spark.yarn.dist.jars")
    pathConfigs.foreach { config =>
      // Replace old URIs with resolved URIs, if they exist
      sparkConf.getOption(config).foreach { oldValue =>
        sparkConf.set(config, Utils.resolveURIs(oldValue))
      }
    }

    // Resolve and format python file paths properly before adding them to the PYTHONPATH.
    // The resolving part is redundant in the case of --py-files, but necessary if the user
    // explicitly sets `spark.submit.pyFiles` in his/her default properties file.
    sparkConf.getOption("spark.submit.pyFiles").foreach { pyFiles =>
      val resolvedPyFiles = Utils.resolveURIs(pyFiles)
      val formattedPyFiles = if (!isYarnCluster && !isMesosCluster) {
        PythonRunner.formatPaths(resolvedPyFiles).mkString(",")
      } else {
        // Ignoring formatting python path in yarn and mesos cluster mode, these two modes
        // support dealing with remote python files, they could distribute and add python files
        // locally.
        resolvedPyFiles
      }
      sparkConf.set("spark.submit.pyFiles", formattedPyFiles)
    }

    // sparkConf, childClasspath, childMainClass and childArgs are initialized at this point
    // Return these values
    // On YARN, childMainClass is org.apache.spark.deploy.yarn.YarnClusterApplication
    // childClasspath already contains the user's job jar
    (childArgs, childClasspath, sparkConf, childMainClass)
  }
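
The option table above is filtered with two bit masks: a rule applies only when its deploy-mode mask and its cluster-manager mask both overlap the chosen mode. A minimal standalone sketch of that filtering follows. The object name, the numeric flag values, and the reduced field set (no clOption) are assumptions made for illustration, not the definitions used in SparkSubmit.

```scala
// Hypothetical, simplified stand-in for SparkSubmit's option table.
object OptionAssignerSketch {
  // Bit flags for cluster managers and deploy modes; the concrete values are assumptions.
  val YARN = 1
  val STANDALONE = 2
  val MESOS = 4
  val LOCAL = 8
  val KUBERNETES = 16
  val ALL_CLUSTER_MGRS: Int = YARN | STANDALONE | MESOS | LOCAL | KUBERNETES

  val CLIENT = 1
  val CLUSTER = 2
  val ALL_DEPLOY_MODES: Int = CLIENT | CLUSTER

  final case class OptionAssigner(
      value: String,
      clusterManager: Int,
      deployMode: Int,
      confKey: String)

  /** Keeps the rules whose masks match the chosen mode and whose value is set. */
  def resolve(
      options: Seq[OptionAssigner],
      clusterManager: Int,
      deployMode: Int): Map[String, String] =
    options
      .filter(opt => opt.value != null &&
        (deployMode & opt.deployMode) != 0 &&
        (clusterManager & opt.clusterManager) != 0)
      .map(opt => opt.confKey -> opt.value)
      .toMap
}
```

With this sketch, a rule registered for YARN and ALL_DEPLOY_MODES is kept in yarn-cluster mode, while a rule restricted to CLIENT is dropped there, and rules whose value is null are always skipped.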


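The YarnClusterApplication class lives under the resource-managers/yarn directory of the Spark source tree.
It extends SparkApplication and overrides the start method.
childArgs contains: --class mainClass --jar primaryResource (--arg userselfargs)*
childClasspath contains: the jars passed via --jars, plus primaryResource
sparkConf contains: the --conf settings passed to spark-submit, the keytab and principal, and other required configuration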

private[spark] class YarnClusterApplication extends SparkApplication {
  // args are the user jar's own arguments; conf is the Spark configuration
  override def start(args: Array[String], conf: SparkConf): Unit = {
    // SparkSubmit would use yarn cache to distribute files & jars in yarn mode,
    // so remove them from sparkConf here for yarn mode.
    // ClientArguments and Client are covered in detail below
    new Client(new ClientArguments(args), conf).run()
  }
}
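
The args array handed to start has the childArgs layout described above. As a rough illustration for a plain JVM application in yarn-cluster mode, it could be assembled like this; ChildArgsSketch and yarnClusterArgs are hypothetical names, and the Python and R branches are left out.

```scala
// Hypothetical helper mirroring the childArgs layout for a JVM app in yarn-cluster mode.
object ChildArgsSketch {
  /** Builds: --jar primaryResource --class mainClass (--arg value)* */
  def yarnClusterArgs(
      primaryResource: String,
      mainClass: String,
      appArgs: Seq[String]): Seq[String] =
    Seq("--jar", primaryResource, "--class", mainClass) ++
      appArgs.flatMap(arg => Seq("--arg", arg))
}
```

Each user argument gets its own --arg flag, which is why ClientArguments can later collect them one by one.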



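childArgs contains: --class mainClass --jar primaryResource (--arg userselfargs)*
childClasspath contains: the jars passed via --jars, plus primaryResource
sparkConf contains: the --conf settings passed to spark-submit, the keytab and principal, and other required configuration
The main job of the ClientArguments class is to parse the incoming args of type Array[String].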

private[spark] class ClientArguments(args: Array[String]) {

  var userJar: String = null // the user's own jar
  var userClass: String = null // the user's main class
  var primaryPyFile: String = null
  var primaryRFile: String = null
  var userArgs: ArrayBuffer[String] = new ArrayBuffer[String]() // the user's own arguments

  private def parseArgs(inputArgs: List[String]): Unit = {
    var args = inputArgs
    // Extract the arguments via pattern matching
    while (!args.isEmpty) {
      args match {
        case ("--jar") :: value :: tail =>
          userJar = value
          args = tail

        case ("--class") :: value :: tail =>
          userClass = value
          args = tail

        case ("--primary-py-file") :: value :: tail =>
          primaryPyFile = value
          args = tail

        case ("--primary-r-file") :: value :: tail =>
          primaryRFile = value
          args = tail

        case ("--arg") :: value :: tail =>
          userArgs += value
          args = tail

        case Nil =>

        case _ =>
          throw new IllegalArgumentException(getUsageMessage(args))
      }
    }

    if (primaryPyFile != null && primaryRFile != null) {
      throw new IllegalArgumentException("Cannot have primary-py-file and primary-r-file" +
        " at the same time")
    }
  }

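
The parsing loop above can be tried out in isolation. The following is a minimal sketch of the same list pattern-matching technique, reduced to three flags. ClientArgsSketch, ParsedArgs and parse are hypothetical names, and Option fields replace the nullable vars of the original class.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified, self-contained sketch; not the real ClientArguments class.
object ClientArgsSketch {
  final case class ParsedArgs(
      userJar: Option[String],
      userClass: Option[String],
      userArgs: List[String])

  def parse(input: List[String]): ParsedArgs = {
    var userJar: Option[String] = None
    var userClass: Option[String] = None
    val userArgs = ArrayBuffer.empty[String]
    var args = input
    // Consume the list two elements at a time: a flag followed by its value.
    while (args.nonEmpty) {
      args match {
        case "--jar" :: value :: tail =>
          userJar = Some(value)
          args = tail
        case "--class" :: value :: tail =>
          userClass = Some(value)
          args = tail
        case "--arg" :: value :: tail =>
          userArgs += value
          args = tail
        case _ =>
          throw new IllegalArgumentException(s"Unknown/unsupported param $args")
      }
    }
    ParsedArgs(userJar, userClass, userArgs.toList)
  }
}
```

Because --arg may appear any number of times, the values are accumulated in order, so the user's job sees its arguments in the order they were given to spark-submit.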

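The purpose of the Client class is to submit the job to YARN.
childArgs contains: --class mainClass --jar primaryResource (--arg userselfargs)*
childClasspath contains: the hdfs:// jars passed via --jars, plus primaryResource
sparkConf contains: the --conf settings passed to spark-submit, the keytab and principal, and other required configuration
args: ClientArguments holds the parsed form of the incoming Array[String] args (childArgs)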

  private val yarnClient = YarnClient.createYarnClient // create the YARN client
  private val hadoopConf = new YarnConfiguration(SparkHadoopUtil.newConfiguration(sparkConf))
  // isClusterMode is true in this walkthrough
  private val isClusterMode = sparkConf.get("spark.submit.deployMode", "client") == "cluster"

  // AM memory, i.e. the driver memory
  private val amMemory = if (isClusterMode) {
  } else {
  private val amMemoryOverhead = {
    val amMemoryOverheadEntry = if (isClusterMode) DRIVER_MEMORY_OVERHEAD else AM_MEMORY_OVERHEAD
      math.max((MEMORY_OVERHEAD_FACTOR * amMemory).toLong, MEMORY_OVERHEAD_MIN)).toInt
  private val amCores = if (isClusterMode) {
  } else {

  // Executor related configurations
  private val executorMemory = sparkConf.get(EXECUTOR_MEMORY)
  private val executorMemoryOverhead = sparkConf.get(EXECUTOR_MEMORY_OVERHEAD).getOrElse(
    math.max((MEMORY_OVERHEAD_FACTOR * executorMemory).toLong, MEMORY_OVERHEAD_MIN)).toInt

  private val distCacheMgr = new ClientDistributedCacheManager()

  private var loginFromKeytab = false
  private var principal: String = null
  private var keytab: String = null
  private var credentials: Credentials = null
  private var amKeytabFileName: String = null
  private val launcherBackend = new LauncherBackend() {
    override protected def conf: SparkConf = sparkConf

    override def onStopRequest(): Unit = {
      if (isClusterMode && appId != null) {
      } else {
  private val fireAndForget = isClusterMode && !sparkConf.get(WAIT_FOR_APP_COMPLETION)

  private var appId: ApplicationId = null
  // Staging directory for the job
  private val appStagingBaseDir = sparkConf.get(STAGING_DIR).map { new Path(_) }

  private val credentialManager = new YARNHadoopDelegationTokenManager(
    conf => YarnSparkHadoopUtil.hadoopFSsToAccess(sparkConf, conf))
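
The overhead expressions above take the larger of a fraction of the container memory and a fixed floor. A small worked sketch follows, assuming MEMORY_OVERHEAD_FACTOR is 0.10 and MEMORY_OVERHEAD_MIN is 384 MiB; neither value is shown in the excerpt, so treat both as assumptions and check the constants in your Spark version.

```scala
// Worked example of the default memory-overhead formula; constants are assumed values.
object OverheadSketch {
  val MemoryOverheadFactor = 0.10
  val MemoryOverheadMinMiB = 384L

  /** Overhead in MiB when no explicit overhead is configured. */
  def defaultOverhead(memoryMiB: Int): Int =
    math.max((MemoryOverheadFactor * memoryMiB).toLong, MemoryOverheadMinMiB).toInt
}
```

Under these assumptions a 1024 MiB container is padded by the 384 MiB floor, while an 8192 MiB container is padded by 819 MiB, so the floor only matters for containers below roughly 3.75 GiB.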



def run(): Unit = {
    this.appId = submitApplication()
    if (!launcherBackend.isConnected() && fireAndForget) {
    // This tracks the job's running state, i.e. the state of this job (ACCEPTED | RUNNING | FAILED) is printed on the gateway
      val report = getApplicationReport(appId)
      val state = report.getYarnApplicationState
      logInfo(s"Application report for $appId (state: $state)")
      if (state == YarnApplicationState.FAILED || state == YarnApplicationState.KILLED) {
        throw new SparkException(s"Application $appId finished with status: $state")
    } else {
      val (yarnApplicationState, finalApplicationStatus) = monitorApplication(appId)
      if (yarnApplicationState == YarnApplicationState.FAILED ||
        finalApplicationStatus == FinalApplicationStatus.FAILED) {
        throw new SparkException(s"Application $appId finished with failed status")
      if (yarnApplicationState == YarnApplicationState.KILLED ||
        finalApplicationStatus == FinalApplicationStatus.KILLED) {
        throw new SparkException(s"Application $appId is killed")
      if (finalApplicationStatus == FinalApplicationStatus.UNDEFINED) {
        throw new SparkException(s"The final status of application $appId is undefined")

val amArgs =
  Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ primaryRFile ++ userArgs ++
  Seq("--properties-file", buildPath(Environment.PWD.$$(), LOCALIZED_CONF_DIR, SPARK_CONF_FILE))
amClass = org.apache.spark.deploy.yarn.ApplicationMaster
What gets monitored is the result of the org.apache.spark.deploy.yarn.ApplicationMaster class; it is this class that actually runs the user's own job jar.
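
The three checks at the end of run() map the pair of final states to an error. The sketch below restates that decision as a pure function so it can be exercised without a cluster. The sealed traits are hypothetical stand-ins for YARN's YarnApplicationState and FinalApplicationStatus enums, and failureMessage is an illustrative name.

```scala
// Stand-alone restatement of the status checks in run(); the types are stand-ins.
object RunOutcomeSketch {
  sealed trait AppState
  case object Finished extends AppState
  case object Failed extends AppState
  case object Killed extends AppState

  sealed trait FinalStatus
  case object Succeeded extends FinalStatus
  case object FinalFailed extends FinalStatus
  case object FinalKilled extends FinalStatus
  case object Undefined extends FinalStatus

  /** Returns the error message run() would raise, or None when the job succeeded. */
  def failureMessage(appId: String, state: AppState, status: FinalStatus): Option[String] =
    if (state == Failed || status == FinalFailed) {
      Some(s"Application $appId finished with failed status")
    } else if (state == Killed || status == FinalKilled) {
      Some(s"Application $appId is killed")
    } else if (status == Undefined) {
      Some(s"The final status of application $appId is undefined")
    } else {
      None
    }
}
```

The order matters: a failed state wins over a killed one, and the undefined check only fires when neither applies.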


def submitApplication(): ApplicationId = {
    var appId: ApplicationId = null
    try {
      // Setup the credentials before doing anything else,
      // so we don't have issues at any point.
      // Initialize yarnClient and start it

      logInfo("Requesting a new application from cluster with %d NodeManagers"

      // Create an application and obtain its appId
      val newApp = yarnClient.createApplication()
      val newAppResponse = newApp.getNewApplicationResponse()
      appId = newAppResponse.getApplicationId()

      new CallerContext("CLIENT", sparkConf.get(APP_CALLER_CONTEXT),

      // Verify that the requested resources do not exceed the maximum a single node allows

      // Set up the appropriate contexts to launch our AM
      // Launch environment and parameters for the AM
      val containerContext = createContainerLaunchContext(newAppResponse)
      val appContext = createApplicationSubmissionContext(newApp, containerContext)

      // Finally, submit and monitor the application
      logInfo(s"Submitting application $appId to ResourceManager")

    } catch {
      case e: Throwable =>
        if (appId != null) {
        throw e

val amArgs =
  Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ primaryRFile ++ userArgs ++
  Seq("--properties-file", buildPath(Environment.PWD.$$(), LOCALIZED_CONF_DIR, SPARK_CONF_FILE))
amClass = org.apache.spark.deploy.yarn.ApplicationMaster

The --properties-file contains, among other things, the conf settings passed to spark-submit.

** This is where the AppMaster is actually launched **
For the details, keep reading below.

The ApplicationMaster class itself will be covered in detail in a later post.


val amArgs =
  Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ primaryRFile ++ userArgs ++
  Seq("--properties-file", buildPath(Environment.PWD.$$(), LOCALIZED_CONF_DIR, SPARK_CONF_FILE))
amClass = org.apache.spark.deploy.yarn.ApplicationMaster

private def createContainerLaunchContext(newAppResponse: GetNewApplicationResponse)
    : ContainerLaunchContext = {
    logInfo("Setting up container launch context for our AM")
    val appId = newAppResponse.getApplicationId
    // The staging path is determined here
    val appStagingDirPath = new Path(appStagingBaseDir, getAppStagingDir(appId))
    val pySparkArchives =
      if (sparkConf.get(IS_PYTHON_APP)) {
      } else {
    // Launch parameters for the AM
    // appStagingDirPath is the staging path; pySparkArchives is Nil
    // launchEnv is determined here
    val launchEnv = setupLaunchEnv(appStagingDirPath, pySparkArchives)
    val localResources = prepareLocalResources(appStagingDirPath, pySparkArchives)

    val amContainer = Records.newRecord(classOf[ContainerLaunchContext])

    val javaOpts = ListBuffer[String]()

    // Set the environment variable through a command prefix
    // to append to the existing value of the variable
    var prefixEnv: Option[String] = None

    // Add Xmx for AM memory
    javaOpts += "-Xmx" + amMemory + "m"

    val tmpDir = new Path(Environment.PWD.$$(), YarnConfiguration.DEFAULT_CONTAINER_TEMP_DIR)
    javaOpts += "" + tmpDir
    val useConcurrentAndIncrementalGC = launchEnv.get("SPARK_USE_CONC_INCR_GC").exists(_.toBoolean)
    if (useConcurrentAndIncrementalGC) {
      // In our expts, using (default) throughput collector has severe perf ramifications in
      // multi-tenant machines
      javaOpts += "-XX:+UseConcMarkSweepGC"
      javaOpts += "-XX:MaxTenuringThreshold=31"
      javaOpts += "-XX:SurvivorRatio=8"
      javaOpts += "-XX:+CMSIncrementalMode"
      javaOpts += "-XX:+CMSIncrementalPacing"
      javaOpts += "-XX:CMSIncrementalDutyCycleMin=0"
      javaOpts += "-XX:CMSIncrementalDutyCycle=10"

    // Include driver-specific java options if we are launching a driver
    if (isClusterMode) {
      sparkConf.get(DRIVER_JAVA_OPTIONS).foreach { opts =>
        javaOpts ++= Utils.splitCommandString(opts).map(YarnSparkHadoopUtil.escapeForShell)
      val libraryPaths = Seq(sparkConf.get(DRIVER_LIBRARY_PATH),
      if (libraryPaths.nonEmpty) {
        prefixEnv = Some(getClusterPath(sparkConf, Utils.libraryPathEnvPrefix(libraryPaths)))
      if (sparkConf.get(AM_JAVA_OPTIONS).isDefined) {
        logWarning(s"${AM_JAVA_OPTIONS.key} will not take effect in cluster mode")
    } else {
      // Validate and include yarn am specific java options in yarn-client mode.
      sparkConf.get(AM_JAVA_OPTIONS).foreach { opts =>
        if (opts.contains("-Dspark")) {
          val msg = s"${AM_JAVA_OPTIONS.key} is not allowed to set Spark options (was '$opts')."
          throw new SparkException(msg)
        if (opts.contains("-Xmx")) {
          val msg = s"${AM_JAVA_OPTIONS.key} is not allowed to specify max heap memory settings " +
            s"(was '$opts'). Use instead."
          throw new SparkException(msg)
        javaOpts ++= Utils.splitCommandString(opts).map(YarnSparkHadoopUtil.escapeForShell)
      sparkConf.get(AM_LIBRARY_PATH).foreach { paths =>
        prefixEnv = Some(getClusterPath(sparkConf, Utils.libraryPathEnvPrefix(Seq(paths))))

    // For log4j configuration to reference
    javaOpts += ("" + ApplicationConstants.LOG_DIR_EXPANSION_VAR)

    val userClass =
      if (isClusterMode) {
        Seq("--class", YarnSparkHadoopUtil.escapeForShell(args.userClass))
      } else {
    val userJar =
      if (args.userJar != null) {
        Seq("--jar", args.userJar)
      } else {
    val primaryPyFile =
      if (isClusterMode && args.primaryPyFile != null) {
        Seq("--primary-py-file", new Path(args.primaryPyFile).getName())
      } else {
    val primaryRFile =
      if (args.primaryRFile != null) {
        Seq("--primary-r-file", args.primaryRFile)
      } else {
    // Here amClass is org.apache.spark.deploy.yarn.ApplicationMaster
    val amClass =
      if (isClusterMode) {
      } else {
    if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
      args.userArgs = ArrayBuffer(args.primaryRFile) ++ args.userArgs
    val userArgs = args.userArgs.flatMap { arg =>
      Seq("--arg", YarnSparkHadoopUtil.escapeForShell(arg))
    val amArgs =
      Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ primaryRFile ++ userArgs ++
      Seq("--properties-file", buildPath(Environment.PWD.$$(), LOCALIZED_CONF_DIR, SPARK_CONF_FILE))

    // Command for the ApplicationMaster
    // Assemble the command that launches the AM
    val commands = prefixEnv ++
      Seq(Environment.JAVA_HOME.$$() + "/bin/java", "-server") ++
      javaOpts ++ amArgs ++
        "1>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout",
        "2>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")

    // TODO: it would be nicer to just make sure there are no null commands here
    val printableCommands = => if (s == null) "null" else s).toList

    logDebug("YARN AM launch context:")
    logDebug(s"    user class: ${Option(args.userClass).getOrElse("N/A")}")
    logDebug("    env:")
    if (log.isDebugEnabled) {
      Utils.redact(sparkConf, launchEnv.toSeq).foreach { case (k, v) =>
        logDebug(s"        $k -> $v")
    logDebug("    resources:")
    localResources.foreach { case (k, v) => logDebug(s"        $k -> $v")}
    logDebug("    command:")
    logDebug(s"        ${printableCommands.mkString(" ")}")

    // send the acl settings into YARN to control who has access via YARN interfaces
    val securityManager = new SecurityManager(sparkConf)
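
To make the shape of the final AM launch command easier to see, here is a minimal sketch that concatenates the pieces in the same order as the excerpt: optional environment prefix, the java binary, JVM options, the AM arguments, then the stdout/stderr redirects. AmCommandSketch and build are hypothetical names, the "{{JAVA_HOME}}" string stands in for whatever Environment.JAVA_HOME.$$() expands to, and the --properties-file argument is omitted for brevity.

```scala
// Illustrative assembly of the AM launch command; not the real Client code.
object AmCommandSketch {
  def build(
      prefixEnv: Option[String],
      javaOpts: Seq[String],
      amClass: String,
      userClass: Option[String],
      userJar: Option[String],
      userArgs: Seq[String],
      logDir: String): Seq[String] = {
    // Each optional piece contributes either a flag/value pair or nothing.
    val classArgs = userClass.toSeq.flatMap(c => Seq("--class", c))
    val jarArgs = userJar.toSeq.flatMap(j => Seq("--jar", j))
    val appArgs = userArgs.flatMap(a => Seq("--arg", a))
    val amArgs = Seq(amClass) ++ classArgs ++ jarArgs ++ appArgs
    prefixEnv.toSeq ++
      Seq("{{JAVA_HOME}}/bin/java", "-server") ++
      javaOpts ++ amArgs ++
      Seq("1>", s"$logDir/stdout", "2>", s"$logDir/stderr")
  }
}
```

Joining the result with spaces gives a single shell command line, which is the form logged under "command:" in the debug output.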


childArgs contains: --class mainClass --jar primaryResource (--arg userselfargs)*
childClasspath contains: the jars passed via --jars, plus primaryResource
sparkConf contains: the --conf settings passed to spark-submit, the keytab and principal, and other required configuration
args: ClientArguments holds the parsed form of the incoming Array[String] args (childArgs),
i.e. --class mainClass --jar primaryResource (--arg userselfargs)*

// appStagingDirPath is the staging path; pySparkArchives is Nil
private def setupLaunchEnv(
      stagingDirPath: Path,
      pySparkArchives: Seq[String]): HashMap[String, String] = {
    logInfo("Setting up the launch environment for our AM container")
    val env = new HashMap[String, String]()
    // args is --class mainClass --jar primaryResource (--arg userselfargs)*
    // env holds the environment variables and parameters for the run
    populateClasspath(args, hadoopConf, sparkConf, env, sparkConf.get(DRIVER_CLASS_PATH))
    env("SPARK_YARN_STAGING_DIR") = stagingDirPath.toString
    env("SPARK_USER") = UserGroupInformation.getCurrentUser().getShortUserName()
    if (loginFromKeytab) {
      val credentialsFile = "credentials-" + UUID.randomUUID().toString
      sparkConf.set(CREDENTIALS_FILE_PATH, new Path(stagingDirPath, credentialsFile).toString)
      logInfo(s"Credentials file set to: $credentialsFile")

    // Pick up any environment variables for the AM provided through spark.yarn.appMasterEnv.*
    val amEnvPrefix = "spark.yarn.appMasterEnv."
      .filter { case (k, v) => k.startsWith(amEnvPrefix) }
      .map { case (k, v) => (k.substring(amEnvPrefix.length), v) }
      .foreach { case (k, v) => YarnSparkHadoopUtil.addPathToEnvironment(env, k, v) }

    // If pyFiles contains any .py files, we need to add LOCALIZED_PYTHON_DIR to the PYTHONPATH
    // of the container processes too. Add all files directly to PYTHONPATH.
    // NOTE: the code currently does not handle .py files defined with a "local:" scheme.
    val pythonPath = new ListBuffer[String]()
    val (pyFiles, pyArchives) = sparkConf.get(PY_FILES).partition(_.endsWith(".py"))
    if (pyFiles.nonEmpty) {
      pythonPath += buildPath(Environment.PWD.$$(), LOCALIZED_PYTHON_DIR)
    (pySparkArchives ++ pyArchives).foreach { path =>
      val uri = Utils.resolveURI(path)
      if (uri.getScheme != LOCAL_SCHEME) {
        pythonPath += buildPath(Environment.PWD.$$(), new Path(uri).getName())
      } else {
        pythonPath += uri.getPath()

    // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
    if (pythonPath.nonEmpty) {
      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
      env("PYTHONPATH") = pythonPathStr
      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)

    if (isClusterMode) {
      // propagate PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON to driver in cluster mode
      Seq("PYSPARK_DRIVER_PYTHON", "PYSPARK_PYTHON").foreach { envname =>
        if (!env.contains(envname)) {
          sys.env.get(envname).foreach(env(envname) = _)
      sys.env.get("PYTHONHASHSEED").foreach(env.put("PYTHONHASHSEED", _))

    sys.env.get(ENV_DIST_CLASSPATH).foreach { dcp =>
      env(ENV_DIST_CLASSPATH) = dcp
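
The spark.yarn.appMasterEnv.* handling above strips a fixed prefix from matching configuration keys and uses the remainder as the environment variable name. A minimal sketch of that step follows; AmEnvSketch and amEnv are hypothetical names, and unlike YarnSparkHadoopUtil.addPathToEnvironment this version simply overwrites a repeated key instead of appending to it.

```scala
// Illustrative extraction of AM environment variables from Spark settings.
object AmEnvSketch {
  val AmEnvPrefix = "spark.yarn.appMasterEnv."

  /** Keeps only prefixed keys and strips the prefix to get the variable name. */
  def amEnv(conf: Seq[(String, String)]): Map[String, String] =
    conf.collect {
      case (k, v) if k.startsWith(AmEnvPrefix) => k.substring(AmEnvPrefix.length) -> v
    }.toMap
}
```

For example, passing --conf spark.yarn.appMasterEnv.FOO=bar to spark-submit would surface as FOO=bar in the AM container environment under this scheme.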



args: ClientArguments holds the parsed form of the incoming Array[String] args (childArgs),
i.e. --class mainClass --jar primaryResource (--arg userselfargs)*
extraClassPath is None

private[yarn] def populateClasspath(
      args: ClientArguments,
      conf: Configuration,
      sparkConf: SparkConf,
      env: HashMap[String, String],
      extraClassPath: Option[String] = None): Unit = {
    extraClassPath.foreach { cp =>
      addClasspathEntry(getClusterPath(sparkConf, cp), env)

    // Set the classpath in env
    addClasspathEntry(Environment.PWD.$$(), env)

    addClasspathEntry(Environment.PWD.$$() + Path.SEPARATOR + LOCALIZED_CONF_DIR, env)

    if (sparkConf.get(USER_CLASS_PATH_FIRST)) {
      // in order to properly add the app jar when user classpath is first
      // we have to do the mainJar separate in order to send the right thing
      // into addFileToClasspath
      val mainJar =
        if (args != null) {
        } else {
      mainJar.foreach(addFileToClasspath(sparkConf, conf, _, APP_JAR_NAME, env))

      val secondaryJars =
        if (args != null) {
        } else {
      secondaryJars.foreach { x =>
        addFileToClasspath(sparkConf, conf, x, null, env)

    // Add the Spark jars to the classpath, depending on how they were distributed.
    addClasspathEntry(buildPath(Environment.PWD.$$(), LOCALIZED_LIB_DIR, "*"), env)
    if (sparkConf.get(SPARK_ARCHIVE).isEmpty) {
      sparkConf.get(SPARK_JARS).foreach { jars =>
        jars.filter(isLocalUri).foreach { jar =>
          val uri = new URI(jar)
          addClasspathEntry(getClusterPath(sparkConf, uri.getPath()), env)

    populateHadoopClasspath(conf, env)
    sys.env.get(ENV_DIST_CLASSPATH).foreach { cp =>
      addClasspathEntry(getClusterPath(sparkConf, cp), env)

    // Add the localized Hadoop config at the end of the classpath, in case it contains other
    // files (such as configuration files for different services) that are not part of the
    // YARN cluster's config.
      buildPath(Environment.PWD.$$(), LOCALIZED_CONF_DIR, LOCALIZED_HADOOP_CONF_DIR), env)
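
The order in which populateClasspath adds entries decides which classes win at load time. The sketch below reproduces only the first part of that ordering: extra classpath, working directory, localized conf directory, user jars when user-classpath-first is enabled, then the Spark libs wildcard. ClasspathSketch and entries are hypothetical names, and the directory names "__spark_conf__" and "__spark_libs__" as well as the "{{PWD}}" placeholder are assumptions standing in for LOCALIZED_CONF_DIR, LOCALIZED_LIB_DIR and Environment.PWD.$$().

```scala
// Illustrative ordering of classpath entries; Hadoop and dist classpath are left out.
object ClasspathSketch {
  def entries(
      extraClassPath: Option[String],
      userClassPathFirst: Boolean,
      userJars: Seq[String],
      pwd: String = "{{PWD}}"): Seq[String] = {
    // User jars are only placed ahead of the Spark jars when explicitly requested.
    val user = if (userClassPathFirst) userJars.map(j => s"$pwd/$j") else Seq.empty
    extraClassPath.toSeq ++
      Seq(pwd, s"$pwd/__spark_conf__") ++
      user ++
      Seq(s"$pwd/__spark_libs__/*")
  }
}
```

With userClassPathFirst disabled the user jars do not appear in this list at all, which matches the excerpt: they are only added here inside the USER_CLASS_PATH_FIRST branch.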
