Day 30: Master Registration Mechanism and State Management Demystified

The following content is compiled from DT大数据梦工厂 (DT Big Data Dream Factory); Sina Weibo: www.weibo.com/ilovepains/

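1. How the Master accepts Driver registration

2. How the Master accepts Application registration

3. The internals of how the Master accepts Worker registration

4. How the Master handles Driver state changes

5. How the Master handles Executor state changes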

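I. How the Master handles registration from other components

1. The components that register with the Master are mainly the Driver, the Application, and the Worker. Executors do not register with the Master; an Executor registers with the SchedulerBackend inside the Driver.

2. A Worker registers with the Master on its own initiative after it starts. In a production environment, a new Worker can therefore be added to a Spark cluster that is already running, and the cluster can use it to increase processing capacity without being restarted.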
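The effect of a Worker joining a running cluster can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: `ToyMaster`, `ToyWorker`, `submit`, and `coresStillWanted` are hypothetical names invented for illustration and are not Spark classes. The only idea borrowed from Spark is that registering a Worker triggers a scheduling pass, so waiting work can pick up the new capacity.

```scala
import scala.collection.mutable

// Toy model (hypothetical names, not Spark's API): a worker that registers
// while the "cluster" is running immediately contributes its cores.
object DynamicWorkerSketch {
  final class ToyWorker(val id: String, val cores: Int) {
    var coresUsed: Int = 0
    def coresFree: Int = cores - coresUsed
  }

  final class ToyMaster {
    private val workers = mutable.ArrayBuffer.empty[ToyWorker]
    // Cores requested by applications that could not be granted yet.
    private var pendingCores: Int = 0

    def totalCores: Int = workers.map(_.cores).sum
    def coresStillWanted: Int = pendingCores

    // An application asks for cores; grant whatever is free right now.
    def submit(coresWanted: Int): Unit = {
      pendingCores += coresWanted
      schedule()
    }

    // A worker registers at any time; re-run scheduling so that waiting
    // requests can use the new capacity without restarting anything.
    def registerWorker(worker: ToyWorker): Unit = {
      workers += worker
      schedule()
    }

    private def schedule(): Unit =
      for (w <- workers if pendingCores > 0) {
        val grant = math.min(w.coresFree, pendingCores)
        w.coresUsed += grant
        pendingCores -= grant
      }
  }
}
```

In this model, a request for 6 cores against a single 4-core worker leaves 2 cores pending; registering a second 4-core worker clears the backlog without recreating the master.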

 private val addressToWorker = new HashMap[RpcAddress, WorkerInfo]

case RegisterWorker(
        id, workerHost, workerPort, workerRef, cores, memory, workerUiPort, publicAddress) => {
      logInfo("Registering worker %s:%d with %d cores, %s RAM".format(
        workerHost, workerPort, cores, Utils.megabytesToString(memory)))
      if (state == RecoveryState.STANDBY) {
        context.reply(MasterInStandby)
      } else if (idToWorker.contains(id)) {
        context.reply(RegisterWorkerFailed("Duplicate worker ID"))
      } else {
        val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory,
          workerRef, workerUiPort, publicAddress)
        if (registerWorker(worker)) {
          persistenceEngine.addWorker(worker)
          context.reply(RegisteredWorker(self, masterWebUiUrl))
          schedule()
        } else {
          val workerAddress = worker.endpoint.address
          logWarning("Worker registration failed. Attempted to re-register worker at same " +
            "address: " + workerAddress)
          context.reply(RegisterWorkerFailed("Attempted to re-register worker at same address: "
            + workerAddress))
        }
      }
    }
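3. When the Master receives a Worker's registration request, it first checks whether the Master on the current node is in STANDBY mode. If it is, the registration is not processed and the Master replies with MasterInStandby. Next it checks whether its in-memory structure idToWorker already contains a registration for this Worker ID; if so, the Worker is not registered again and the Master replies with RegisterWorkerFailed.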
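The order of these checks can be sketched as a small, self-contained model. The reply names below echo the messages in the excerpt, but they are redefined here as simplified stand-ins, and `ToyMaster` and `handleRegister` are hypothetical names; none of this is Spark's actual implementation.

```scala
import scala.collection.mutable

// Simplified model of the decision order described above.
// All types are local stand-ins, not Spark's real classes.
object RegisterWorkerSketch {
  sealed trait Reply
  case object MasterInStandby extends Reply
  final case class RegisterWorkerFailed(message: String) extends Reply
  case object RegisteredWorker extends Reply

  final class ToyMaster(var standby: Boolean) {
    // Plays the role of idToWorker: worker ID -> "host:port".
    private val idToWorker = mutable.HashMap.empty[String, String]

    def handleRegister(id: String, host: String, port: Int): Reply =
      if (standby) {
        // A standby master does not register anyone.
        MasterInStandby
      } else if (idToWorker.contains(id)) {
        // The same worker ID is already known: refuse to register it again.
        RegisterWorkerFailed("Duplicate worker ID")
      } else {
        idToWorker(id) = s"$host:$port"
        RegisteredWorker
      }
  }
}
```

The standby check comes first on purpose: a standby master must not mutate any registration state, so it answers before looking at the ID table.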

private def registerWorker(worker: WorkerInfo): Boolean = {
    // There may be one or more refs to dead workers on this same node (w/ different ID's),
    // remove them.
    workers.filter { w =>
      (w.host == worker.host && w.port == worker.port) && (w.state == WorkerState.DEAD)
    }.foreach { w =>
      workers -= w
    }

    val workerAddress = worker.endpoint.address
    if (addressToWorker.contains(workerAddress)) {
      val oldWorker = addressToWorker(workerAddress)
      if (oldWorker.state == WorkerState.UNKNOWN) {
        // A worker registering from UNKNOWN implies that the worker was restarted during recovery.
        // The old worker must thus be dead, so we will remove it and accept the new worker.
        removeWorker(oldWorker)
      } else {
        logInfo("Attempted to re-register worker at same address: " + workerAddress)
        return false
      }
    }

    workers += worker
    idToWorker(worker.id) = worker
    addressToWorker(workerAddress) = worker
    true
  }

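4. If the Master decides to accept the registering Worker, it first creates a WorkerInfo object to hold the Worker's registration information.

It then calls registerWorker to perform the actual registration. Existing entries for the same host and port whose state is DEAD are filtered out of workers directly. If the entry already recorded at the same address is in the UNKNOWN state, removeWorker is called to clean it up (including the Executors and Drivers under that Worker); if it is in any other state, the registration is rejected.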
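The DEAD and UNKNOWN handling can be tried out with a toy model that mirrors the shape of registerWorker. This is a sketch under assumptions: `ToyWorker`, `ToyMaster`, and the `State` objects are hypothetical stand-ins, addresses are plain strings, and the toy `removeWorker` only marks the worker dead and drops its address mapping, without the Executor and Driver cleanup.

```scala
import scala.collection.mutable

// Toy model of the DEAD / UNKNOWN handling (hypothetical names, not Spark's classes).
object RegisterWorkerStateSketch {
  sealed trait State
  case object Alive extends State
  case object Dead extends State
  case object Unknown extends State

  final class ToyWorker(val id: String, val host: String, val port: Int,
                        var state: State = Alive) {
    def address: String = s"$host:$port"
  }

  final class ToyMaster {
    val workers = mutable.HashSet.empty[ToyWorker]
    val addressToWorker = mutable.HashMap.empty[String, ToyWorker]

    // Simplified cleanup: mark the worker dead and forget its address.
    def removeWorker(w: ToyWorker): Unit = {
      w.state = Dead
      addressToWorker -= w.address
    }

    def registerWorker(worker: ToyWorker): Boolean = {
      // Drop stale DEAD entries that sit on the same host and port.
      workers
        .filter(w => w.host == worker.host && w.port == worker.port && w.state == Dead)
        .foreach(w => workers -= w)

      val accept = addressToWorker.get(worker.address) match {
        case Some(old) if old.state == Unknown =>
          // The old worker was never heard from again: replace it.
          removeWorker(old)
          true
        case Some(_) => false // the address is still held by a known worker
        case None    => true
      }
      if (accept) {
        workers += worker
        addressToWorker(worker.address) = worker
      }
      accept
    }
  }
}
```

A second worker at an address held by a live worker is rejected, whereas the same attempt succeeds once the old entry is in the UNKNOWN state, because the old entry is removed first.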

5. At registration time, the Driver is registered first, and then




Master -- Worker -- Thread -- RegisterWorker -- WorkerInfo

