http://blog.csdn.net/anzhsoft/article/details/30802603
Spark's Cluster Manager can be deployed in several modes:
- Standalone
- Mesos
- YARN
- EC2
- Local
After a computing job is submitted to the cluster, the execution model is that the SparkContext defined by the Driver Program submits the job to the App Master, and the App Master schedules the compute resources and eventually completes the computation. For a detailed discussion, see 《Spark:大数据的电花火石!》.
So in Standalone mode, how do the Client, the Master, and the Worker communicate with each other, register, and bring their services up?
1. RPC between nodes: Akka
There are many mature options for inter-module communication, and established frameworks freed us from raw socket programming long ago. Broadly, they fall into two categories: message-based passing and synchronization over shared resources.
Among message-passing mechanisms, the most widely used is the Message Queue. A message queue is a method of application-to-application communication: applications communicate by writing and reading messages (application-specific data) to and from queues, without needing a dedicated connection linking them. Message passing means programs communicate by sending data in messages rather than by calling each other directly, as in technologies such as remote procedure calls. Queuing means the applications communicate through a queue, which removes the requirement that the sending and receiving applications run at the same time. Mature MQ products include IBM WebSphere MQ and RabbitMQ (an open-source implementation of AMQP, now maintained by Pivotal).
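To make the decoupling concrete, here is a minimal in-process sketch in Scala: a plain BlockingQueue stands in for a broker such as RabbitMQ, and `QueueDemo` and its "job-1" message are invented for illustration. The producer and the consumer never call each other, and the queue buffers the message regardless of which side runs first.

```scala
import java.util.concurrent.LinkedBlockingQueue

object QueueDemo extends App {
  val queue = new LinkedBlockingQueue[String]()

  // The consumer blocks on take() until a message arrives.
  val consumer = new Thread(new Runnable {
    def run() = println(s"processing ${queue.take()}")
  })
  // The producer enqueues and returns immediately; it never calls the consumer.
  val producer = new Thread(new Runnable {
    def run() = queue.put("job-1")
  })

  consumer.start()   // the consumer may start before the producer...
  producer.start()   // ...the queue decouples the two either way
  producer.join()
  consumer.join()
}
```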
ZeroMQ also deserves a mention: a socket-based programming framework that aspires to enter the Linux kernel. In the official description, ZeroMQ is "a simple, easy-to-use transport layer, a socket library that acts like a framework", making socket programming simpler, more concise, and faster. It is a message-processing queue library that scales elastically across threads, cores, and host machines, and its stated goal is to "become part of the standard networking stack and then make it into the Linux kernel".
For communication between many of its modules, Spark chose Akka, which Scala supports natively: a library written in Scala that simplifies building fault-tolerant, highly scalable Actor-model applications in Java and Scala. Akka advertises five characteristics (a minimal actor sketch follows the list):
- Simple Concurrency & Distribution: asynchronous communication and a distributed architecture by design, with higher-level abstractions such as Actors, Futures, and STM.
- Resilient by Design: systems can heal themselves, with supervision applied both locally and remotely.
- High Performance: up to 50,000,000 messages per second on a single machine, with a footprint small enough to hold about 2,500,000 actors per GB of heap.
- Elastic & Decentralized: adaptive load balancing, routing, partitioning, and configuration.
- Extensible: can be extended through Akka Extensions.
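Before reading Spark's code, here is a minimal actor in the classic Akka API of that era. `Greeter` and its messages are invented for illustration, but it shows the two pieces every Spark actor below has: a `receive` that declares the messages it handles, and `!` for sending.

```scala
import akka.actor._

class Greeter extends Actor {
  // receive declares exactly which messages this actor handles.
  def receive = {
    case "hello" => println("hello received")
    case other   => println(s"unexpected message: $other")
  }
}

object GreeterDemo extends App {
  val system = ActorSystem("demo")
  val greeter = system.actorOf(Props[Greeter], name = "greeter")
  greeter ! "hello"   // asynchronous, fire-and-forget send
  Thread.sleep(500)   // give the message time to be processed
  system.shutdown()   // Akka 2.2/2.3-era shutdown call
}
```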
In Spark, the Client, the Master, and the Worker are in fact each an actor. Take the Client as an example:
```scala
import akka.actor._
import akka.pattern.ask
import akka.remote.{AssociationErrorEvent, DisassociatedEvent, RemotingLifecycleEvent}

private class ClientActor(driverArgs: ClientArguments, conf: SparkConf) extends Actor with Logging {
  var masterActor: ActorSelection = _
  val timeout = AkkaUtils.askTimeout(conf)

  override def preStart() = {
    // Resolve the Master's actor from the spark://host:port URL.
    masterActor = context.actorSelection(Master.toAkkaUrl(driverArgs.master))

    // Subscribe to remoting lifecycle events so connection failures show up in receive.
    context.system.eventStream.subscribe(self, classOf[RemotingLifecycleEvent])

    println(s"Sending ${driverArgs.cmd} command to ${driverArgs.master}")

    driverArgs.cmd match {
      case "launch" =>
        ...
        masterActor ! RequestSubmitDriver(driverDescription)

      case "kill" =>
        val driverId = driverArgs.driverId
        // Note: ! is fire-and-forget and returns Unit; the reply arrives
        // as a KillDriverResponse in receive below.
        val killFuture = masterActor ! RequestKillDriver(driverId)
    }
  }

  override def receive = {

    case SubmitDriverResponse(success, driverId, message) =>
      println(message)
      if (success) pollAndReportStatus(driverId.get) else System.exit(-1)

    case KillDriverResponse(driverId, success, message) =>
      println(message)
      if (success) pollAndReportStatus(driverId) else System.exit(-1)

    case DisassociatedEvent(_, remoteAddress, _) =>
      println(s"Error connecting to master ${driverArgs.master} ($remoteAddress), exiting.")
      System.exit(-1)

    case AssociationErrorEvent(cause, _, remoteAddress, _) =>
      println(s"Error connecting to master ${driverArgs.master} ($remoteAddress), exiting.")
      println(s"Cause was: $cause")
      System.exit(-1)
  }
}

object Client {
  def main(args: Array[String]) {
    println("WARNING: This client is deprecated and will be removed in a future version of Spark.")
    println("Use ./bin/spark-submit with \"--master spark://host:port\"")

    val conf = new SparkConf()
    val driverArgs = new ClientArguments(args)

    if (!driverArgs.logLevel.isGreaterOrEqual(Level.WARN)) {
      conf.set("spark.akka.logLifecycleEvents", "true")
    }
    conf.set("spark.akka.askTimeout", "10")
    conf.set("akka.loglevel", driverArgs.logLevel.toString.replace("WARN", "WARNING"))
    Logger.getRootLogger.setLevel(driverArgs.logLevel)

    // Create the actor system and the ClientActor, then block until termination.
    val (actorSystem, _) = AkkaUtils.createActorSystem(
      "driverClient", Utils.localHostName(), 0, conf, new SecurityManager(conf))

    actorSystem.actorOf(Props(classOf[ClientActor], driverArgs, conf))

    actorSystem.awaitTermination()
  }
}
```
The line `masterActor ! RequestSubmitDriver(driverDescription)` in preStart is the request that submits the Driver to the Master; the Master handles it in its own receive method (walked through in section 2.2). The case clauses inside the ClientActor's receive, in turn, handle the messages the Client gets back: the Master's responses and the remoting lifecycle events.
As you can see, Akka makes handling communication between modules simple and efficient; this can fairly be called a defining feature of Spark's RPC.
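Akka actually offers two interaction styles, and Spark's deploy module uses both: `!` (tell) is fire-and-forget, as in `masterActor ! RequestSubmitDriver(...)`, while `?` (ask) returns a Future that the other side fulfils by replying to `sender`; the timeout built by `AkkaUtils.askTimeout` exists for the latter. A minimal sketch, with `EchoActor` invented for illustration:

```scala
import akka.actor._
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

class EchoActor extends Actor {
  def receive = {
    // Replies go back to the asker (or to dead letters for a plain tell).
    case msg => sender ! s"echo: $msg"
  }
}

object TellVsAsk extends App {
  implicit val timeout = Timeout(10.seconds)
  val system = ActorSystem("demo")
  val echo = system.actorOf(Props[EchoActor])

  echo ! "ping"                                              // tell: fire-and-forget
  val reply = Await.result(echo ? "ping", timeout.duration)  // ask: block on the Future for the reply
  println(reply)
  system.shutdown()
}
```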
2. How Client, Master, and Worker start up and communicate
Source location: spark-1.0.0\core\src\main\scala\org\apache\spark\deploy. The classes mainly involved are Client.scala, Master.scala, and Worker.scala. The communication among these three modules during start-up and job submission is outlined below.
The roles present in Standalone mode:
- Client: responsible for submitting jobs to the Master.
- Master: accepts jobs submitted by the Client, manages the Workers, and instructs the Workers to launch Drivers and Executors.
- Worker: manages the resources of its own node, reports heartbeats to the Master periodically, and carries out the Master's commands, such as launching Drivers and Executors.
In reality, the Master and the Worker handle far more messages than these; this outline covers only the main message handling during cluster start-up and job submission.
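All of these messages are ordinary Scala case classes sent through Akka. As a taste of the protocol, here is an abridged sketch of a few of them; the real definitions carry more fields and live in deploy/DeployMessages.scala.

```scala
// Abridged sketch of the deploy protocol: every message is an immutable case class.
sealed trait DeployMessage extends Serializable

// Worker -> Master
case class RegisterWorker(id: String, host: String, port: Int, cores: Int, memory: Int)
  extends DeployMessage
case class Heartbeat(workerId: String) extends DeployMessage

// Master -> Worker
case class RegisteredWorker(masterUrl: String, masterWebUiUrl: String) extends DeployMessage
case class KillDriver(driverId: String) extends DeployMessage

// Client -> Master, and the Master's reply
case class RequestKillDriver(driverId: String) extends DeployMessage
case class KillDriverResponse(driverId: String, success: Boolean, message: String)
  extends DeployMessage
```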
Next, we walk through the source code of each of these three roles.
2.1 Client source code walkthrough
Client start-up:
```scala
object Client {
  def main(args: Array[String]) {
    println("WARNING: This client is deprecated and will be removed in a future version of Spark.")
    println("Use ./bin/spark-submit with \"--master spark://host:port\"")

    val conf = new SparkConf()
    val driverArgs = new ClientArguments(args)

    if (!driverArgs.logLevel.isGreaterOrEqual(Level.WARN)) {
      conf.set("spark.akka.logLifecycleEvents", "true")
    }
    conf.set("spark.akka.askTimeout", "10")
    conf.set("akka.loglevel", driverArgs.logLevel.toString.replace("WARN", "WARNING"))
    Logger.getRootLogger.setLevel(driverArgs.logLevel)

    val (actorSystem, _) = AkkaUtils.createActorSystem(
      "driverClient", Utils.localHostName(), 0, conf, new SecurityManager(conf))

    actorSystem.actorOf(Props(classOf[ClientActor], driverArgs, conf))

    actorSystem.awaitTermination()
  }
}
```
The call `actorSystem.actorOf(Props(classOf[ClientActor], driverArgs, conf))` shows that the core logic is implemented by ClientActor, an extension of akka.Actor. For any actor, its override of receive tells you exactly which messages it has to handle.
```scala
override def receive = {

  // The Master's reply to RequestSubmitDriver.
  case SubmitDriverResponse(success, driverId, message) =>
    println(message)
    if (success) pollAndReportStatus(driverId.get) else System.exit(-1)

  // The Master's reply to RequestKillDriver.
  case KillDriverResponse(driverId, success, message) =>
    println(message)
    if (success) pollAndReportStatus(driverId) else System.exit(-1)

  // Remoting lifecycle events: the connection to the Master dropped or failed.
  case DisassociatedEvent(_, remoteAddress, _) =>
    println(s"Error connecting to master ${driverArgs.master} ($remoteAddress), exiting.")
    System.exit(-1)

  case AssociationErrorEvent(cause, _, remoteAddress, _) =>
    println(s"Error connecting to master ${driverArgs.master} ($remoteAddress), exiting.")
    println(s"Cause was: $cause")
    System.exit(-1)
}
```
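Both success branches call pollAndReportStatus, which is where `akka.pattern.ask` and the timeout from `AkkaUtils.askTimeout` come in: the Client asks the Master for the driver's status and waits synchronously for the reply. Condensed from the same Client.scala, with most console output and error handling trimmed; treat this as a sketch rather than the verbatim source:

```scala
// Ask the Master for the driver's status, block for the DriverStatusResponse,
// then report the state and exit.
def pollAndReportStatus(driverId: String) {
  println("... waiting before polling master for driver state")
  Thread.sleep(5000)
  val statusFuture = (masterActor ? RequestDriverStatus(driverId))(timeout)
    .mapTo[DriverStatusResponse]
  val statusResponse = Await.result(statusFuture, timeout)

  if (!statusResponse.found) {
    println(s"ERROR: Cluster master did not recognize $driverId")
    System.exit(-1)
  }
  println(s"State of $driverId is ${statusResponse.state.get}")
  System.exit(0)
}
```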
2.2 Master source code walkthrough
The walkthrough is given as inline comments in the receive method below; where a case's body is abridged, the comment summarizes what it does.
```scala
override def receive = {
  // This node has been elected leader: read back any persisted state and
  // either go straight to ALIVE or start recovering the previous master's state.
  case ElectedLeader => {
    val (storedApps, storedDrivers, storedWorkers) = persistenceEngine.readPersistedData()
    state = if (storedApps.isEmpty && storedDrivers.isEmpty && storedWorkers.isEmpty) {
      RecoveryState.ALIVE
    } else {
      RecoveryState.RECOVERING
    }
    logInfo("I have been elected leader! New state: " + state)
    if (state == RecoveryState.RECOVERING) {
      beginRecovery(storedApps, storedDrivers, storedWorkers)
      context.system.scheduler.scheduleOnce(WORKER_TIMEOUT millis) { completeRecovery() }
    }
  }

  case CompleteRecovery => completeRecovery()

  // Leadership was revoked (e.g. by the ZooKeeper-based election): shut down.
  case RevokedLeadership => {
    logError("Leadership has been revoked -- master shutting down.")
    System.exit(0)
  }

  // A Worker asks to register with this Master.
  case RegisterWorker(id, workerHost, workerPort, cores, memory, workerUiPort, publicAddress) => {
    if (state == RecoveryState.STANDBY) {
      // A standby master ignores the registration and sends no reply.
    } else if (idToWorker.contains(id)) {
      sender ! RegisterWorkerFailed("Duplicate worker ID")
    } else {
      val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory,
        sender, workerUiPort, publicAddress)
      if (registerWorker(worker)) {
        persistenceEngine.addWorker(worker)
        sender ! RegisteredWorker(masterUrl, masterWebUiUrl)
        schedule()
      } else {
        val workerAddress = worker.actor.path.address
        sender ! RegisterWorkerFailed(
          "Attempted to re-register worker at same address: " + workerAddress)
      }
    }
  }

  // The Client submits a Driver -- the RequestSubmitDriver sent from
  // ClientActor.preStart in section 2.1.
  case RequestSubmitDriver(description) => {
    if (state != RecoveryState.ALIVE) {
      val msg = s"Can only accept driver submissions in ALIVE state. Current state: $state."
      sender ! SubmitDriverResponse(false, None, msg)
    } else {
      // Create and persist the driver, queue it for scheduling, then acknowledge.
      val driver = createDriver(description)
      persistenceEngine.addDriver(driver)
      waitingDrivers += driver
      drivers.add(driver)
      schedule()
      sender ! SubmitDriverResponse(true, Some(driver.id),
        s"Driver successfully submitted as ${driver.id}")
    }
  }

  // The Client asks to kill a Driver.
  case RequestKillDriver(driverId) => {
    if (state != RecoveryState.ALIVE) {
      val msg = s"Can only kill drivers in ALIVE state. Current state: $state."
      sender ! KillDriverResponse(driverId, success = false, msg)
    } else {
      logInfo("Asked to kill driver " + driverId)
      drivers.find(_.id == driverId) match {
        case Some(d) =>
          if (waitingDrivers.contains(d)) {
            // Not scheduled yet: dequeue it and mark it KILLED directly.
            waitingDrivers -= d
            self ! DriverStateChanged(driverId, DriverState.KILLED, None)
          } else {
            // Already running: tell its Worker to kill it; the bookkeeping
            // finishes when the Worker reports the state change back.
            d.worker.foreach { w => w.actor ! KillDriver(driverId) }
          }
          val msg = s"Kill request for $driverId submitted"
          sender ! KillDriverResponse(driverId, success = true, msg)
        case None =>
          val msg = s"Driver $driverId has already finished or does not exist"
          sender ! KillDriverResponse(driverId, success = false, msg)
      }
    }
  }

  // The Client polls a Driver's status (see pollAndReportStatus above).
  case RequestDriverStatus(driverId) => {
    (drivers ++ completedDrivers).find(_.id == driverId) match {
      case Some(driver) =>
        sender ! DriverStatusResponse(found = true, Some(driver.state),
          driver.worker.map(_.id), driver.worker.map(_.hostPort), driver.exception)
      case None =>
        sender ! DriverStatusResponse(found = false, None, None, None, None)
    }
  }

  // An application (a SparkContext's scheduler backend) registers itself.
  case RegisterApplication(description) => {
    // Ignored by a STANDBY master; otherwise create and persist the
    // ApplicationInfo, reply with RegisteredApplication, and call schedule().
  }

  // A Worker reports that one of its Executors changed state: forward the
  // update to the owning application's driver and clean up finished executors.
  case ExecutorStateChanged(appId, execId, state, message, exitStatus) => {
    val execOption = idToApp.get(appId).flatMap(app => app.executors.get(execId))
    execOption match {
      case Some(exec) => {
        exec.state = state
        exec.application.driver ! ExecutorUpdated(execId, state, message, exitStatus)
        if (ExecutorState.isFinished(state)) {
          val appInfo = idToApp(appId)
          logInfo("Removing executor " + exec.fullId + " because it is " + state)
          appInfo.removeExecutor(exec)
          exec.worker.removeExecutor(exec)
          // On an abnormal exit, retry up to a limit, then fail the application.
        }
      }
      case None =>
        logWarning(s"Got status update for unknown executor $appId/$execId")
    }
  }

  // A Worker reports that a Driver it hosts changed state.
  case DriverStateChanged(driverId, state, exception) => {
    state match {
      case DriverState.ERROR | DriverState.FINISHED | DriverState.KILLED | DriverState.FAILED =>
        removeDriver(driverId, state, exception)
      case _ =>
        throw new Exception(s"Received unexpected state update for driver $driverId: $state")
    }
  }

  // Periodic heartbeat from a Worker: refresh its lastHeartbeat timestamp.
  case Heartbeat(workerId) => {
    idToWorker.get(workerId) match {
      case Some(workerInfo) =>
        workerInfo.lastHeartbeat = System.currentTimeMillis()
      case None =>
        logWarning("Got heartbeat from unregistered worker " + workerId)
    }
  }

  // Recovery handshake: an application acknowledges the new Master.
  case MasterChangeAcknowledged(appId) => {
    // Mark the application alive again and attempt completeRecovery().
  }

  // Recovery handshake: a Worker reports the executors and drivers it is
  // running so the new Master can rebuild its bookkeeping.
  case WorkerSchedulerStateResponse(workerId, executors, driverIds) => {
    idToWorker.get(workerId) match {
      case Some(worker) =>
        logInfo("Worker has been re-registered: " + workerId)
        worker.state = WorkerState.ALIVE

        // Re-attach the executors that belong to still-known applications.
        val validExecutors = executors.filter(exec => idToApp.get(exec.appId).isDefined)
        for (exec <- validExecutors) {
          val app = idToApp.get(exec.appId).get
          val execInfo = app.addExecutor(worker, exec.cores, Some(exec.execId))
          worker.addExecutor(execInfo)
          execInfo.copyState(exec)
        }

        // Re-attach the drivers this worker is running.
        for (driverId <- driverIds) {
          drivers.find(_.id == driverId).foreach { driver =>
            driver.worker = Some(worker)
            driver.state = DriverState.RUNNING
            worker.drivers(driverId) = driver
          }
        }
      case None =>
        logWarning("Scheduler state from unknown worker: " + workerId)
    }
  }

  // The remote end dropped; it may have been a Worker or an application.
  case DisassociatedEvent(_, address, _) => {
    logInfo(s"$address got disassociated, removing it.")
    addressToWorker.get(address).foreach(removeWorker)
    addressToApp.get(address).foreach(finishApplication)
  }

  // Queried by the Master web UI to render the cluster state.
  case RequestMasterState => {
    sender ! MasterStateResponse(host, port, workers.toArray, apps.toArray,
      completedApps.toArray, drivers.toArray, completedDrivers.toArray, state)
  }

  // Periodic self-check: drop workers whose heartbeats have timed out.
  case CheckForWorkerTimeOut => {
    timeOutDeadWorkers()
  }

  case RequestWebUIPort => {
    sender ! WebUIPortResponse(webUi.boundPort)
  }
}
```
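Several of the cases above end with a call to schedule(), the heart of the Master's resource allocation. Here is a condensed sketch of its driver-scheduling half; the real method in Master.scala goes on to hand out executor cores to waiting applications, honoring the spark.deploy.spreadOut setting.

```scala
// Condensed sketch of Master.schedule(): launch waiting drivers on ALIVE
// workers that have enough free memory and cores, then allocate executors.
private def schedule() {
  if (state != RecoveryState.ALIVE) { return }

  // Randomize to avoid always loading the same workers first.
  val shuffledWorkers = Random.shuffle(workers)
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
    for (driver <- waitingDrivers.toList) {  // iterate over a copy, since we mutate the queue
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        waitingDrivers -= driver
      }
    }
  }

  // ... then assign cores to waiting applications (spread-out or consolidated),
  // calling launchExecutor(worker, exec) for each allocation.
}
```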
2.3 Worker source code walkthrough
Having walked through the Client and the Master, you can probably already tell how to analyze the way the Worker talks to the Master. Sure enough, the answer is below:
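In outline, it is the other half of the handshake we saw in the Master. The following is an abridged sketch of Worker.scala's registration and heartbeat handling; the real receive also handles LaunchExecutor, LaunchDriver, KillDriver, MasterChanged, and more.

```scala
// Abridged from Worker.scala: registration with the Master and the heartbeat loop.
override def receive = {
  case RegisteredWorker(masterUrl, masterWebUiUrl) =>
    logInfo("Successfully registered with master " + masterUrl)
    registered = true
    changeMaster(masterUrl, masterWebUiUrl)
    // Once registered, schedule a periodic SendHeartbeat message to ourselves.
    context.system.scheduler.schedule(0 millis, HEARTBEAT_MILLIS millis, self, SendHeartbeat)

  case SendHeartbeat =>
    // Forward the heartbeat; the Master refreshes our lastHeartbeat timestamp.
    if (connected) { master ! Heartbeat(workerId) }

  case RegisterWorkerFailed(message) =>
    if (!registered) {
      logError("Worker registration failed: " + message)
      System.exit(1)
    }

  // ... LaunchDriver, KillDriver, LaunchExecutor, KillExecutor, MasterChanged, etc.
}
```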