Let's first draw a picture of the end-to-end execution flow of a Spark program.
1-2. When we submit an application with spark-submit, the launcher uses reflection to invoke our main class, and that JVM becomes the Driver process. The Driver creates a SparkContext, and the SparkContext initializes its two most important components: the DAGScheduler and the TaskScheduler.
3-7. The TaskScheduler (through its backend) registers the application with the Master. The Master then has the Workers launch Executor processes, and once an Executor is up it registers itself back with the TaskScheduler (reverse registration).
8-10. In the Spark program we write, every action triggers a job. The DAGScheduler splits a job into stages at shuffle boundaries; each stage consists of a set of tasks, which the DAGScheduler wraps in a TaskSet and submits to the TaskScheduler. The TaskScheduler sends the tasks to the Executors; inside an Executor each task is wrapped in a TaskRunner and run on a thread taken from the Executor's thread pool. Execution progress and results are reported back to the Driver.
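To make the job/stage/task terms concrete, here is a minimal word-count sketch (the app name and input path are hypothetical): the single collect() action triggers one job, and the shuffle introduced by reduceByKey causes the DAGScheduler to split that job into two stages.

import org.apache.spark.{SparkConf, SparkContext}

object JobStageDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("job-stage-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)

    val counts = sc.textFile("data/words.txt")   // hypothetical input path
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)                        // shuffle dependency -> stage boundary

    counts.collect()                             // action -> one job submitted to the DAGScheduler
    sc.stop()
  }
}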
-------------------------------------------------
That is the full execution flow of a Spark program. Now let's start the analysis from new SparkContext(), the very first step of every Spark program.
First, a diagram of the SparkContext initialization flow:
Open the Spark source code and go to the SparkContext.scala class:
_heartbeatReceiver = env.rpcEnv.setupEndpoint(
  HeartbeatReceiver.ENDPOINT_NAME, new HeartbeatReceiver(this))

// Create and start the scheduler
val (sched, ts) = SparkContext.createTaskScheduler(this, master)    ------1
_schedulerBackend = sched
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)

// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
// constructor
_taskScheduler.start()
Stepping into the call marked 1:
private def createTaskScheduler(
    sc: SparkContext,
    master: String): (SchedulerBackend, TaskScheduler) = {
  import SparkMasterRegex._

  // When running locally, don't try to re-execute tasks on failure.
  val MAX_LOCAL_TASK_FAILURES = 1

  master match {
    case "local" =>
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalBackend(sc.getConf, scheduler, 1)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_N_REGEX(threads) =>
      def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
      // local[*] estimates the number of cores on the machine; local[N] uses exactly N threads.
      val threadCount = if (threads == "*") localCpuCount else threads.toInt
      if (threadCount <= 0) {
        throw new SparkException(s"Asked to run locally with $threadCount threads")
      }
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalBackend(sc.getConf, scheduler, threadCount)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
      def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
      // local[*, M] means the number of cores on the computer with M failures
      // local[N, M] means exactly N threads with M failures
      val threadCount = if (threads == "*") localCpuCount else threads.toInt
      val scheduler = new TaskSchedulerImpl(sc, maxFailures.toInt, isLocal = true)
      val backend = new LocalBackend(sc.getConf, scheduler, threadCount)
      scheduler.initialize(backend)
      (backend, scheduler)

    case SPARK_REGEX(sparkUrl) =>
      val scheduler = new TaskSchedulerImpl(sc)
      val masterUrls = sparkUrl.split(",").map("spark://" + _)
      val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      (backend, scheduler)
......
This is a pattern match on the master URL. Let's look at the last case shown here, SPARK_REGEX, which matches a standalone spark:// master URL.
First it creates a TaskSchedulerImpl and a SparkDeploySchedulerBackend, then calls the TaskScheduler's initialize method:
def initialize(backend: SchedulerBackend) {
  this.backend = backend
  // temporarily set rootPool name to empty
  rootPool = new Pool("", schedulingMode, 0, 0)
  schedulableBuilder = {
    schedulingMode match {
      case SchedulingMode.FIFO =>
        new FIFOSchedulableBuilder(rootPool)
      case SchedulingMode.FAIR =>
        new FairSchedulableBuilder(rootPool, conf)
    }
  }
  schedulableBuilder.buildPools()
At its core this method initializes the scheduler's root pool; there are two scheduling modes, FIFO and FAIR.
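As a side note, the mode chosen above comes from the spark.scheduler.mode configuration (FIFO by default). A minimal sketch of switching to the FAIR builder, with a hypothetical app name:

import org.apache.spark.{SparkConf, SparkContext}

object SchedulerModeDemo {
  def main(args: Array[String]): Unit = {
    // Setting spark.scheduler.mode to FAIR makes initialize() above take the
    // FairSchedulableBuilder branch instead of the FIFO default.
    val conf = new SparkConf()
      .setAppName("scheduler-mode-demo")   // hypothetical app name
      .setMaster("local[2]")
      .set("spark.scheduler.mode", "FAIR")
    val sc = new SparkContext(conf)
    // ... run jobs as usual ...
    sc.stop()
  }
}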
Once initialize has finished, createTaskScheduler returns a tuple made up of the SparkDeploySchedulerBackend and the TaskSchedulerImpl.
Back in SparkContext, the next step creates a DAGScheduler object.
Then step into taskScheduler.start():
override def start() {
  backend.start()

  if (!isLocal && conf.getBoolean("spark.speculation", false)) {
    logInfo("Starting speculative execution thread")
    speculationScheduler.scheduleAtFixedRate(new Runnable {
      override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
        checkSpeculatableTasks()
      }
    }, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
  }
}
Focus on the first line, backend.start(); in this standalone flow that is SparkDeploySchedulerBackend.start():
override def start() {
  ......
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory,
    command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
  client = new AppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  client.start()
  launcherBackend.setState(SparkAppHandle.State.SUBMITTED)
  waitForRegistration()
  launcherBackend.setState(SparkAppHandle.State.RUNNING)
}
We can ignore the code before this point; what matters here is appDesc, an ApplicationDescription object. Stepping into it we can see:
name: String,
maxCores: Option[Int],
memoryPerExecutorMB: Int,
command: Command,
appUiUrl: String,
eventLogDir: Option[URI] = None,
// short name of compression codec used when writing event logs, if any (e.g. lzf)
eventLogCodec: Option[String] = None,
coresPerExecutor: Option[Int] = None,
user: String = System.getProperty("user.name", ""))
These are essentially the parameters we pass after spark-submit on the command line.
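As an illustration (not the exact code path), the fields above are populated from SparkConf entries that the usual spark-submit flags map onto; the mapping in the comments below is a sketch:

import org.apache.spark.SparkConf

object AppDescMappingSketch {
  // Illustrative only: in standalone mode SparkDeploySchedulerBackend reads these
  // SparkConf entries (usually set through spark-submit flags) when it builds the
  // ApplicationDescription shown above.
  val conf = new SparkConf()
    .setAppName("my-app")                  // --name                  -> name
    .set("spark.cores.max", "4")           // --total-executor-cores  -> maxCores
    .set("spark.executor.memory", "2g")    // --executor-memory       -> memoryPerExecutorMB
    .set("spark.executor.cores", "2")      // --executor-cores        -> coresPerExecutor
}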
Next it creates an AppClient object and calls its start() method, which sets up the client endpoint:
endpoint.set(rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv)))
Note ClientEndpoint's onStart method:
override def onStart(): Unit = {
  try {
    registerWithMaster(1)
  } catch {
    case e: Exception =>
      logWarning("Failed to connect to master", e)
      markDisconnected()
      stop()
  }
}
This is where the application registers with the Master:
private def registerWithMaster(nthRetry: Int) {
  registerMasterFutures.set(tryRegisterAllMasters())
  registrationRetryTimer.set(registrationRetryThread.scheduleAtFixedRate(new Runnable {
    override def run(): Unit = {
      Utils.tryOrExit {
        if (registered.get) {
          registerMasterFutures.get.foreach(_.cancel(true))
          registerMasterThreadPool.shutdownNow()
        } else if (nthRetry >= REGISTRATION_RETRIES) {
          markDead("All masters are unresponsive! Giving up.")
        } else {
          registerMasterFutures.get.foreach(_.cancel(true))
          registerWithMaster(nthRetry + 1)
        }
      }
    }
  }, REGISTRATION_TIMEOUT_SECONDS, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS))
}
Under the hood this is driven by a scheduled thread pool: it fires the asynchronous registration attempts and, after each timeout, either stops (on success), gives up (after too many retries), or retries; a stand-alone sketch of the pattern is shown below.
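The names and constants in this sketch are simplified stand-ins for the AppClient fields, not Spark's actual implementation; it only demonstrates the retry-on-a-scheduled-thread-pool idea.

import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object RegistrationRetrySketch {
  private val REGISTRATION_RETRIES = 3
  private val REGISTRATION_TIMEOUT_SECONDS = 20L
  private val registered = new AtomicBoolean(false)
  private val retryThread = Executors.newSingleThreadScheduledExecutor()

  // Fire an asynchronous registration attempt, then schedule a check: if we are
  // still not registered after the timeout, retry up to REGISTRATION_RETRIES times.
  def registerWithMaster(nthRetry: Int): Unit = {
    tryRegisterAllMasters()
    retryThread.schedule(new Runnable {
      override def run(): Unit = {
        if (registered.get) {
          retryThread.shutdownNow()                 // success: stop retrying
        } else if (nthRetry >= REGISTRATION_RETRIES) {
          println("All masters are unresponsive! Giving up.")
        } else {
          registerWithMaster(nthRetry + 1)          // try the next round
        }
      }
    }, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS)
  }

  // Placeholder for sending RegisterApplication to every configured master.
  private def tryRegisterAllMasters(): Unit = ()

  def main(args: Array[String]): Unit = registerWithMaster(1)
}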
--------
Now let's switch to the Master.scala class
and find, inside its receive method, the case
case RegisterApplication(description, driver)
case RegisterApplication(description, driver) => {
  // TODO Prevent repeated registrations from some driver
  if (state == RecoveryState.STANDBY) {
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, driver)
    registerApplication(app)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    persistenceEngine.addApplication(app)
    driver.send(RegisteredApplication(app.id, self))
    schedule()
  }
}
It first checks the Master's state; the registration is only processed if the Master is not in STANDBY.
It then creates an app (an ApplicationInfo) from the ApplicationDescription and registers the application:
val appAddress = app.driver.address
if (addressToApp.contains(appAddress)) {
  logInfo("Attempted to re-register application at same address: " + appAddress)
  return
}

applicationMetricsSystem.registerSource(app.appSource)
apps += app
idToApp(app.id) = app
endpointToApp(app.driver) = app
addressToApp(appAddress) = app
waitingApps += app
This is the application-registration method; essentially it just records the app in several containers (HashMaps and an ArrayBuffer), as the simplified sketch below shows.
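In this sketch, AppInfo is a stand-in for Spark's ApplicationInfo and only a subset of the Master's containers is shown; it captures the shape of the bookkeeping, not the real class.

import scala.collection.mutable

// Simplified stand-in for Spark's ApplicationInfo; only the fields needed here.
case class AppInfo(id: String, driverAddress: String)

object RegisterAppSketch {
  val apps = mutable.HashSet[AppInfo]()
  val idToApp = mutable.HashMap[String, AppInfo]()
  val addressToApp = mutable.HashMap[String, AppInfo]()
  val waitingApps = mutable.ArrayBuffer[AppInfo]()

  def registerApplication(app: AppInfo): Unit = {
    if (addressToApp.contains(app.driverAddress)) {
      println("Attempted to re-register application at same address: " + app.driverAddress)
      return
    }
    apps += app
    idToApp(app.id) = app
    addressToApp(app.driverAddress) = app
    waitingApps += app   // schedule() later walks this queue to launch executors
  }
}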
After registration, the Master persists the app with the persistence engine. A note on the persistence engine: the most common choice is the ZooKeeper-based one (enabled via spark.deploy.recoveryMode=ZOOKEEPER), which writes the app information to ZooKeeper so that Master HA is possible. The Master then sends the driver a RegisteredApplication message confirming successful registration, and finally calls schedule().
private def schedule(): Unit = {
  if (state != RecoveryState.ALIVE) { return }
  // Drivers take strict precedence over executors
  val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
    for (driver <- waitingDrivers) {
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        waitingDrivers -= driver
      }
    }
  }
  startExecutorsOnWorkers()
}
The most important part of schedule() is its last line, startExecutorsOnWorkers(), which launches the Executors on the Workers.
-------------
Next, let's take a quick look at the DAGScheduler.
This class contains an eventProcessLoop (a DAGSchedulerEventProcessLoop): an event loop backed by a single daemon thread. Scheduling events such as JobSubmitted are posted onto it, and the handlers that consume those events are what ultimately call down into the TaskScheduler underneath.
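A stripped-down sketch of that event-loop pattern follows; the event classes and handler are simplified stand-ins, not the real DAGSchedulerEventProcessLoop, and only show the shape of post-then-consume-on-one-thread.

import java.util.concurrent.LinkedBlockingDeque

// Simplified stand-ins for Spark's DAGSchedulerEvent hierarchy.
sealed trait SchedulerEvent
case class JobSubmitted(jobId: Int) extends SchedulerEvent

class EventLoopSketch {
  private val eventQueue = new LinkedBlockingDeque[SchedulerEvent]()

  private val eventThread = new Thread("dag-scheduler-event-loop") {
    setDaemon(true)
    override def run(): Unit = {
      try {
        while (!isInterrupted) {
          onReceive(eventQueue.take())   // blocks until an event is posted
        }
      } catch {
        case _: InterruptedException => // loop stopped
      }
    }
  }

  def start(): Unit = eventThread.start()

  // Callers (e.g. a job submission) just post an event and return immediately.
  def post(event: SchedulerEvent): Unit = eventQueue.put(event)

  private def onReceive(event: SchedulerEvent): Unit = event match {
    case JobSubmitted(jobId) =>
      // In the real DAGScheduler this is where stages are built and
      // TaskSets are handed down to the TaskScheduler.
      println(s"handling job $jobId")
  }
}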
------
SparkContext also creates a SparkUI.
Under the hood, SparkUI starts a Jetty server to serve the web UI that we can then visit in a browser.
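For example, the port the Jetty-backed UI binds to is configurable via spark.ui.port (4040 by default); a minimal sketch with a hypothetical app name:

import org.apache.spark.{SparkConf, SparkContext}

object UiPortDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("ui-port-demo")      // hypothetical app name
      .setMaster("local[2]")
      .set("spark.ui.port", "4050")    // Jetty-backed UI binds here instead of the default 4040
    val sc = new SparkContext(conf)
    // While the application runs, the UI is reachable at http://<driver-host>:4050
    sc.stop()
  }
}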