The following notes are compiled from DT大数据梦工厂 (DT Big Data Dream Factory). Sina Weibo: www.weibo.com/ilovepains/
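1. How the Master accepts Driver registration
2. How the Master accepts Application registration
3. Inside the Master's handling of Worker registration
4. How the Master handles Driver state changes
5. How the Master handles Executor state changes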
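I. How the Master handles registration from other components
1. The components that register with the Master are mainly the Driver, the Application, and the Worker. Executors do not register with the Master; an Executor registers with the SchedulerBackend inside the Driver.
2. A Worker registers with the Master on its own initiative after it starts. As a result, when a new Worker is added to an already running Spark cluster in production, the cluster does not need to be restarted: the new Worker can be used right away to increase processing capacity.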
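The point about adding Workers to a running cluster can be illustrated with a minimal sketch. It is not Spark's implementation: MasterSketch, WorkerSketch, register, and totalCores are hypothetical names introduced only for this example, and the sketch assumes a registry that lives in memory and is updated whenever a Worker announces itself.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for illustration; these are not Spark classes.
final case class WorkerSketch(id: String, host: String, port: Int, cores: Int, memoryMb: Int)

final class MasterSketch {
  // In-memory registry keyed by worker ID; it can grow while the master keeps running.
  private val idToWorker = mutable.HashMap.empty[String, WorkerSketch]

  // Called when a worker announces itself; returns false for a duplicate ID.
  def register(worker: WorkerSketch): Boolean =
    if (idToWorker.contains(worker.id)) false
    else {
      idToWorker(worker.id) = worker
      true
    }

  // Cores currently available across all registered workers.
  def totalCores: Int = idToWorker.values.map(_.cores).sum
}
```

Because the registry is only ever updated by incoming registrations, a Worker that registers later raises totalCores without any restart of the master object, which is the behaviour the note describes for a live cluster.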
private val addressToWorker = new HashMap[RpcAddress, WorkerInfo]
case RegisterWorker(
    id, workerHost, workerPort, workerRef, cores, memory, workerUiPort, publicAddress) => {
  logInfo("Registering worker %s:%d with %d cores, %s RAM".format(
    workerHost, workerPort, cores, Utils.megabytesToString(memory)))
  if (state == RecoveryState.STANDBY) {
    // A standby Master does not accept registrations.
    context.reply(MasterInStandby)
  } else if (idToWorker.contains(id)) {
    // A Worker with this ID is already known.
    context.reply(RegisterWorkerFailed("Duplicate worker ID"))
  } else {
    val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory,
      workerRef, workerUiPort, publicAddress)
    if (registerWorker(worker)) {
      persistenceEngine.addWorker(worker)
      context.reply(RegisteredWorker(self, masterWebUiUrl))
      schedule()
    } else {
      val workerAddress = worker.endpoint.address
      logWarning("Worker registration failed. Attempted to re-register worker at same " +
        "address: " + workerAddress)
      context.reply(RegisterWorkerFailed("Attempted to re-register worker at same address: " +
        workerAddress))
    }
  }
}

3. When the Master receives a Worker's registration request, it first checks whether the Master on the current node is in STANDBY mode. If so, it does not process the registration and only replies MasterInStandby. It then checks its in-memory idToWorker map: if the Worker ID is already present, the request is rejected with RegisterWorkerFailed("Duplicate worker ID").
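The three-way decision in step 3 can be sketched in isolation. This is a simplified model, not the Master's actual code: the reply names follow the excerpt, but RegistrationHandler, handleRegisterWorker, the Standby/Alive states, and the use of a plain "host:port" string instead of a WorkerInfo are assumptions made for illustration.

```scala
import scala.collection.mutable

// Hypothetical reply types that mirror the message names in the excerpt.
sealed trait Reply
case object MasterInStandby extends Reply
final case class RegisterWorkerFailed(message: String) extends Reply
case object RegisteredWorker extends Reply

// Hypothetical two-state model of the master's recovery state.
sealed trait RecoveryState
case object Standby extends RecoveryState
case object Alive extends RecoveryState

final class RegistrationHandler(var state: RecoveryState) {
  // worker ID -> "host:port" (a stand-in for the WorkerInfo object)
  private val idToWorker = mutable.HashMap.empty[String, String]

  def handleRegisterWorker(id: String, host: String, port: Int): Reply =
    if (state == Standby) MasterInStandby // a standby master refuses to register workers
    else if (idToWorker.contains(id)) RegisterWorkerFailed("Duplicate worker ID")
    else {
      idToWorker(id) = s"$host:$port"
      RegisteredWorker
    }
}
```

The order of the checks matters in this sketch: the standby test comes first, so a standby handler never touches its registry, and the duplicate-ID test runs before any new entry is created.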
private def registerWorker(worker: WorkerInfo): Boolean = {
  // There may be one or more refs to dead workers on this same node (w/ different ID's),
  // remove them.
  workers.filter { w =>
    (w.host == worker.host && w.port == worker.port) && (w.state == WorkerState.DEAD)
  }.foreach { w =>
    workers -= w
  }

  val workerAddress = worker.endpoint.address
  if (addressToWorker.contains(workerAddress)) {
    val oldWorker = addressToWorker(workerAddress)
    if (oldWorker.state == WorkerState.UNKNOWN) {
      // A worker registering from UNKNOWN implies that the worker was restarted during recovery.
      // The old worker must thus be dead, so we will remove it and accept the new worker.
      removeWorker(oldWorker)
    } else {
      logInfo("Attempted to re-register worker at same address: " + workerAddress)
      return false
    }
  }

  // Record the new worker in the Master's in-memory structures.
  workers += worker
  idToWorker(worker.id) = worker
  addressToWorker(workerAddress) = worker
  true
}
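4. If the Master decides to accept the registering Worker, it first creates a WorkerInfo object to hold the Worker's information.
It then calls registerWorker to carry out the actual registration. Entries for the same host and port whose state is DEAD are filtered out directly. If the entry already recorded at the same address is in the UNKNOWN state, removeWorker is called to clean it up (including the Executors and Drivers under that Worker).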
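The cleanup rules in step 4 can be tried out with a small stand-alone model. It is a sketch under stated assumptions, not Spark's code: SketchState, WorkerEntry, and WorkerRegistry are hypothetical names, addresses are plain "host:port" strings, and removeWorker here only marks the entry DEAD and drops its address mapping (the cleanup of Executors and Drivers mentioned above is omitted).

```scala
import scala.collection.mutable

// Hypothetical worker states, named after the states mentioned in the text.
object SketchState extends Enumeration {
  val ALIVE, DEAD, UNKNOWN = Value
}

final class WorkerEntry(val id: String, val host: String, val port: Int) {
  var state: SketchState.Value = SketchState.ALIVE
  def address: String = s"$host:$port"
}

final class WorkerRegistry {
  val workers = mutable.HashSet.empty[WorkerEntry]
  val addressToWorker = mutable.HashMap.empty[String, WorkerEntry]

  // Simplified: mark the entry DEAD and forget its address; it stays in `workers`
  // until a later registration from the same host and port purges it.
  def removeWorker(w: WorkerEntry): Unit = {
    w.state = SketchState.DEAD
    addressToWorker -= w.address
  }

  def registerWorker(worker: WorkerEntry): Boolean = {
    // Purge stale DEAD entries recorded for the same host and port.
    workers
      .filter(w => w.host == worker.host && w.port == worker.port && w.state == SketchState.DEAD)
      .foreach(w => workers -= w)

    val accepted = addressToWorker.get(worker.address) match {
      case Some(old) if old.state == SketchState.UNKNOWN =>
        removeWorker(old) // the old worker is assumed to have restarted; replace it
        true
      case Some(_) => false // a live worker already occupies this address
      case None => true
    }
    if (accepted) {
      workers += worker
      addressToWorker(worker.address) = worker
    }
    accepted
  }
}
```

In this model a second registration from an address held by an ALIVE entry is refused, while the same registration succeeds once the old entry has been marked UNKNOWN, which corresponds to the restart-during-recovery case described in the source comments.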
5. During registration, the Driver is registered first, and then …
Master -- Worker -- Thread -- RegisterWorker -- WorkerInfo