Spark startup

Spark startup flow

sbin/start-all.sh -> start-master.sh -> start-slaves.sh

sbin/start-master.sh -> first reads the configuration variables, then runs: sbin/spark-daemon.sh start org.apache.spark.deploy.master.Master 1 --ip $SPARK_MASTER_IP --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT

sbin/spark-daemon.sh -> bin/spark-class $command "$@"

bin/spark-class -> exec "$RUNNER" -cp "$CLASSPATH" $JAVA_OPTS "$@"
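
Putting the chain together, the Master daemon ends up as an ordinary JVM process. As a hedged illustration, the final exec typically expands to something like the following (classpath layout, heap size, and host values vary by Spark version and configuration, so treat them as placeholders):

java -cp "$SPARK_HOME/conf:$SPARK_HOME/lib/*" -Xmx1g org.apache.spark.deploy.master.Master --ip $SPARK_MASTER_IP --port 7077 --webui-port 8080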


Spark job submission flow

bin/spark-submit --class cn.itcast.spark.WordCount --master spark://node-1.itcast.cn:7077 --executor-memory 2g --total-executor-cores 4 /path/to/wordcount.jar
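
Here cn.itcast.spark.WordCount is the user application class and /path/to/wordcount.jar is a placeholder for the jar that contains it. A minimal sketch of what such a class might look like (the input/output paths come from args and are hypothetical):

package cn.itcast.spark

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Master URL, executor memory and cores are supplied by spark-submit
    val conf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(conf) // created inside the SparkSubmit (driver) process
    sc.textFile(args(0))
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .saveAsTextFile(args(1))
    sc.stop()
  }
}

The chain below traces how spark-submit actually gets this class running.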

exec "$SPARK_HOME"/bin/spark-class org.apache.spark.deploy.SparkSubmit -> exec "$RUNNER" -cp "$CLASSPATH" $JAVA_OPTS "$@"

The key part to read is spark-class org.apache.spark.deploy.SparkSubmit -> submit -> doRunMain (args -> class cn.itcast.spark.WordCount …)
Class.forName performs the reflection inside the SparkSubmit process itself

--> Class.forName loads the user-defined class via reflection and invokes its main method (still only one process; no new JVM is spawned for the user code)
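
A simplified sketch of what doRunMain boils down to (variable names are illustrative; the real SparkSubmit code additionally sets up a classloader and error handling):

object ReflectiveLaunchSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder application arguments (hypothetical input/output paths)
    val childArgs = Array("inputPath", "outputPath")
    val mainClass = Class.forName("cn.itcast.spark.WordCount")
    val mainMethod = mainClass.getMethod("main", classOf[Array[String]])
    // Scala passes childArgs as the single Array[String] parameter of main;
    // everything runs inside the current (SparkSubmit) JVM
    mainMethod.invoke(null, childArgs)
  }
}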

The SparkContext runs in the SparkSubmit (driver) process; it then establishes a connection to the Master, and all subsequent communication with the Master happens over RPC
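
The first thing the driver does over that connection is register the application with the Master. Spark's real deploy code does this with private message classes (RegisterApplication in DeployMessages); the case classes below are simplified, runnable stand-ins for illustration only:

case class ApplicationDescription(name: String, maxCores: Int, memoryPerExecutorMB: Int)
case class RegisterApplication(desc: ApplicationDescription)

object RegistrationSketch {
  def main(args: Array[String]): Unit = {
    // Values mirror the spark-submit example above (4 cores, 2g per executor)
    val desc = ApplicationDescription("WordCount", maxCores = 4, memoryPerExecutorMB = 2048)
    // In Spark this message travels over the RPC link; the Master replies
    // with RegisteredApplication and then schedules executors on the Workers
    println(s"driver -> master: ${RegisterApplication(desc)}")
  }
}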

This is also where the TaskScheduler and DAGScheduler are created

new SparkContext -> invokes the primary constructor -> 1. create SparkEnv (which creates the ActorSystem) 2. create TaskScheduler -> create DAGScheduler -> start TaskScheduler
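
A runnable mock of that initialization order (the class names mirror Spark's internals but the bodies are stubs; the real constructor does far more):

class TaskSchedulerStub {
  def start(): Unit = println("4. TaskScheduler started")
}

class DAGSchedulerStub(taskScheduler: TaskSchedulerStub) {
  println("3. DAGScheduler created and wired to the TaskScheduler")
}

object SparkContextInitOrder {
  def main(args: Array[String]): Unit = {
    println("1. SparkEnv created (sets up the ActorSystem / RpcEnv)")
    val taskScheduler = new TaskSchedulerStub
    println("2. TaskScheduler created")
    val dagScheduler = new DAGSchedulerStub(taskScheduler)
    taskScheduler.start() // started only after the DAGScheduler exists
  }
}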
