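The ResourceManager#serviceInit() method
1. Check whether HA is enabled. HA is enabled when the yarn.resourcemanager.ha.enabled parameter is true.
2. If HA is enabled, check whether automatic failover is enabled. Automatic failover is enabled when the yarn.resourcemanager.ha.automatic-failover.enabled parameter is true (the code below additionally requires HAUtil.isAutomaticFailoverEmbedded(conf)). If automatic failover is enabled, an EmbeddedElector is created.
There are two EmbeddedElector types:
If the yarn.resourcemanager.ha.curator-leader-elector.enabled parameter is true, the EmbeddedElector is a CuratorBasedElectorService; this parameter defaults to false. Otherwise it is an ActiveStandbyElectorBasedElectorService.
CuratorBasedElectorService implements the LeaderLatchListener interface of the Curator framework.
Curator's LeaderLatch wraps ZooKeeper leader election. When the current process is elected leader, LeaderLatch calls back the isLeader method of the LeaderLatchListener.
The isLeader method holds the logic the RM runs after being elected leader: once elected, the RM starts RMActiveServices.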
// Set HA configuration should be done before login
this.rmContext.setHAEnabled(HAUtil.isHAEnabled(this.conf));
if (this.rmContext.isHAEnabled()) {
  HAUtil.verifyAndSetConfiguration(this.conf);
}
// elector must be added post adminservice
if (this.rmContext.isHAEnabled()) {
  // If the RM is configured to use an embedded leader elector,
  // initialize the leader elector.
  if (HAUtil.isAutomaticFailoverEnabled(conf)
      && HAUtil.isAutomaticFailoverEmbedded(conf)) {
    EmbeddedElector elector = createEmbeddedElector();
    addIfService(elector);
    rmContext.setLeaderElectorService(elector);
  }
}
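The decision described above can be sketched without any Hadoop dependency. This is only an illustration: ElectorChoiceSketch, flag and chooseElector are hypothetical names, the configuration is a plain Map, missing keys are read as false for simplicity (YARN defines its own defaults), and the extra isAutomaticFailoverEmbedded check is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the elector decision in serviceInit(); not Hadoop code.
class ElectorChoiceSketch {
    static final String HA_ENABLED = "yarn.resourcemanager.ha.enabled";
    static final String AUTO_FAILOVER =
        "yarn.resourcemanager.ha.automatic-failover.enabled";
    static final String CURATOR_ELECTOR =
        "yarn.resourcemanager.ha.curator-leader-elector.enabled";

    // Assumption: a missing key counts as false. Real defaults come from YARN.
    static boolean flag(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    // Returns the name of the elector that would be created, or "none".
    static String chooseElector(Map<String, String> conf) {
        if (!flag(conf, HA_ENABLED) || !flag(conf, AUTO_FAILOVER)) {
            return "none";
        }
        return flag(conf, CURATOR_ELECTOR)
            ? "CuratorBasedElectorService"
            : "ActiveStandbyElectorBasedElectorService";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(HA_ENABLED, "true");
        conf.put(AUTO_FAILOVER, "true");
        conf.put(CURATOR_ELECTOR, "true");
        System.out.println(chooseElector(conf));
    }
}
```

With all three flags set to true, as in main, the sketch selects CuratorBasedElectorService.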
CuratorBasedElectorService starts the LeaderLatch
private void initAndStartLeaderLatch() throws Exception {
  leaderLatch = new LeaderLatch(curator, latchPath, rmId);
  leaderLatch.addListener(this);
  leaderLatch.start();
}
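The leaderLatch competes for leadership and, once elected, calls back the isLeader method of the LeaderLatchListener
Omitted here; see "Curator 2.1 source analysis: how LeaderLatch wraps ZK leader election".
The CuratorBasedElectorService#isLeader method
The RM's handling logic after it is elected leader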
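The listener callback can be pictured with a toy latch that needs neither Curator nor ZooKeeper. Everything here is an assumption made for illustration: ToyLatch, grant and revoke are hypothetical, and the Listener interface only redeclares the two callbacks (isLeader, notLeader) that LeaderLatchListener defines.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the listener callback pattern; no real election takes place.
class LatchCallbackSketch {
    // Same two callbacks as Curator's LeaderLatchListener, redeclared locally.
    interface Listener {
        void isLeader();
        void notLeader();
    }

    // Hypothetical in-memory latch: leadership changes only via grant()/revoke().
    static class ToyLatch {
        private final List<Listener> listeners = new ArrayList<>();
        private boolean leader;

        void addListener(Listener l) {
            listeners.add(l);
        }

        void grant() {
            if (!leader) {
                leader = true;
                for (Listener l : listeners) l.isLeader();
            }
        }

        void revoke() {
            if (leader) {
                leader = false;
                for (Listener l : listeners) l.notLeader();
            }
        }
    }

    // Registers one listener and records the callbacks it receives.
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        ToyLatch latch = new ToyLatch();
        latch.addListener(new Listener() {
            public void isLeader() { events.add("active"); }
            public void notLeader() { events.add("standby"); }
        });
        latch.grant();
        latch.grant();  // already leader, so no second callback
        latch.revoke();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```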
public void isLeader() {
  LOG.info(rmId + "is elected leader, transitioning to active");
  try {
    rm.getRMContext().getRMAdminService()
        .transitionToActive(
            new HAServiceProtocol.StateChangeRequestInfo(
                HAServiceProtocol.RequestSource.REQUEST_BY_ZKFC));
  } catch (Exception e) {
    LOG.info(rmId + " failed to transition to active, giving up leadership",
        e);
    notLeader();
    rejoinElection();
  }
}
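The failure path of isLeader() is worth isolating: if the transition to active throws, the RM steps down and rejoins the election instead of staying a leader that cannot serve. The following is a minimal sketch under stated assumptions: StepDownSketch and Transition are hypothetical, and the log list stands in for the real side effects.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "try to become active; on failure, step down and rejoin".
class StepDownSketch {
    // Hypothetical hook standing in for AdminService#transitionToActive.
    interface Transition {
        void toActive() throws Exception;
    }

    final List<String> log = new ArrayList<>();
    private final Transition transition;

    StepDownSketch(Transition transition) {
        this.transition = transition;
    }

    void isLeader() {
        try {
            transition.toActive();
            log.add("active");
        } catch (Exception e) {
            log.add("failed: " + e.getMessage());
            notLeader();       // give up leadership
            rejoinElection();  // compete again later
        }
    }

    void notLeader() {
        log.add("standby");
    }

    void rejoinElection() {
        log.add("rejoin");
    }

    public static void main(String[] args) {
        StepDownSketch s = new StepDownSketch(() -> {
            throw new Exception("boom");
        });
        s.isLeader();
        System.out.println(s.log);
    }
}
```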
The AdminService#transitionToActive() method
public synchronized void transitionToActive(
    HAServiceProtocol.StateChangeRequestInfo reqInfo) throws IOException {
  if (isRMActive()) {
    return;
  }
  // call refreshAdminAcls before HA state transition
  // for the case that adminAcls have been updated in previous active RM
  try {
    refreshAdminAcls(false);
  } catch (YarnException ex) {
    throw new ServiceFailedException("Can not execute refreshAdminAcls", ex);
  }
  UserGroupInformation user = checkAccess("transitionToActive");
  checkHaStateChange(reqInfo);
  try {
    // call all refresh*s for active RM to get the updated configurations.
    refreshAll();
  } catch (Exception e) {
    rm.getRMContext()
        .getDispatcher()
        .getEventHandler()
        .handle(
            new RMFatalEvent(RMFatalEventType.TRANSITION_TO_ACTIVE_FAILED,
                e, "failure to refresh configuration settings"));
    throw new ServiceFailedException(
        "Error on refreshAll during transition to Active", e);
  }
  try {
    rm.transitionToActive();
  } catch (Exception e) {
    RMAuditLogger.logFailure(user.getShortUserName(), "transitionToActive",
        "", "RM",
        "Exception transitioning to active");
    throw new ServiceFailedException(
        "Error when transitioning to Active mode", e);
  }
  RMAuditLogger.logSuccess(user.getShortUserName(), "transitionToActive",
      "RM");
}
The ResourceManager#transitionToActive() method
synchronized void transitionToActive() throws Exception {
  if (rmContext.getHAServiceState() == HAServiceProtocol.HAServiceState.ACTIVE) {
    LOG.info("Already in active state");
    return;
  }
  LOG.info("Transitioning to active state");
  this.rmLoginUGI.doAs(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
      try {
        // start RMActiveServices
        startActiveServices();
        return null;
      } catch (Exception e) {
        reinitialize(true);
        throw e;
      }
    }
  });
  rmContext.setHAServiceState(HAServiceProtocol.HAServiceState.ACTIVE);
  LOG.info("Transitioned to active state");
}
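Two properties of this method stand out: it is idempotent (a second call while ACTIVE does nothing), and a failed start is rolled back by reinitializing before the exception propagates. Both can be sketched as follows. This is an illustration, not Hadoop code: ActiveTransitionSketch and its counters are hypothetical, and the action is run directly, whereas the real method runs it through rmLoginUGI.doAs(...).

```java
import java.security.PrivilegedExceptionAction;

// Sketch of an idempotent transition that rolls back when the start fails.
class ActiveTransitionSketch {
    enum State { STANDBY, ACTIVE }

    State state = State.STANDBY;
    int starts = 0;                // how many times active services started
    int reinits = 0;               // how many rollbacks happened
    boolean failNextStart = false; // test hook to simulate a failing start

    void startActiveServices() throws Exception {
        if (failNextStart) {
            failNextStart = false;
            throw new Exception("start failed");
        }
        starts++;
    }

    void reinitialize() {
        reinits++;
    }

    synchronized void transitionToActive() throws Exception {
        if (state == State.ACTIVE) {
            return; // already active, nothing to do
        }
        PrivilegedExceptionAction<Void> action = () -> {
            try {
                startActiveServices();
                return null;
            } catch (Exception e) {
                reinitialize(); // roll back, then let the caller see the error
                throw e;
            }
        };
        action.run(); // assumption: no login user, so the action runs directly
        state = State.ACTIVE;
    }

    public static void main(String[] args) throws Exception {
        ActiveTransitionSketch rm = new ActiveTransitionSketch();
        rm.transitionToActive();
        rm.transitionToActive();
        System.out.println(rm.state + " after " + rm.starts + " start(s)");
    }
}
```

Because the state is set to ACTIVE only after the action returns, a failed start leaves the sketch in STANDBY, which mirrors the ordering in the method above.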
The ResourceManager#startActiveServices() method
void startActiveServices() throws Exception {
  if (activeServices != null) {
    clusterTimeStamp = System.currentTimeMillis();
    activeServices.start();
  }
}
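The ResourceManager.RMActiveServices#serviceInit() method
When RMActiveServices initializes, the yarn.resourcemanager.recovery.enabled parameter decides whether the active RM recovers its state after it starts. If the parameter is true, RMStateStoreFactory#getStore() is called to initialize the RMStateStore. RMStateStoreFactory instantiates the RMStateStore by reflection according to the yarn.resourcemanager.store.class parameter.
So when yarn.resourcemanager.recovery.enabled is true, yarn.resourcemanager.store.class should also be set to the intended store implementation.
RMStateStore has several implementations, for example:
If yarn.resourcemanager.store.class is set to ZKRMStateStore, the ResourceManager uses ZooKeeper-based state storage.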
recoveryEnabled = conf.getBoolean(YarnConfiguration.RECOVERY_ENABLED,
    YarnConfiguration.DEFAULT_RM_RECOVERY_ENABLED);
RMStateStore rmStore = null;
if (recoveryEnabled) {
  rmStore = RMStateStoreFactory.getStore(conf);
  boolean isWorkPreservingRecoveryEnabled =
      conf.getBoolean(
          YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_ENABLED,
          YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_ENABLED);
  rmContext
      .setWorkPreservingRecoveryEnabled(isWorkPreservingRecoveryEnabled);
} else {
  rmStore = new NullRMStateStore();
}
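Choosing a store class by reflection from a configuration value can be sketched as follows. All names here (StoreFactorySketch, Store, NullStore, ZkStore) are hypothetical stand-ins, the configuration is a plain Map, and the sketch simply rejects a missing class name; how Hadoop itself handles that case is not modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a reflective store factory; the classes are toy stand-ins.
class StoreFactorySketch {
    static final String RECOVERY_ENABLED = "yarn.resourcemanager.recovery.enabled";
    static final String STORE_CLASS = "yarn.resourcemanager.store.class";

    interface Store {
        String name();
    }

    // Stand-in for a store that persists nothing.
    static class NullStore implements Store {
        public String name() { return "null"; }
    }

    // Stand-in for a ZooKeeper-backed store.
    static class ZkStore implements Store {
        public String name() { return "zk"; }
    }

    static Store getStore(Map<String, String> conf)
            throws ReflectiveOperationException {
        boolean recovery =
            Boolean.parseBoolean(conf.getOrDefault(RECOVERY_ENABLED, "false"));
        if (!recovery) {
            return new NullStore();
        }
        String className = conf.get(STORE_CLASS);
        if (className == null) {
            // Assumption of this sketch only: no fallback class is provided.
            throw new IllegalStateException(STORE_CLASS + " is not set");
        }
        // Load the configured class and call its no-argument constructor.
        return Class.forName(className)
            .asSubclass(Store.class)
            .getDeclaredConstructor()
            .newInstance();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        conf.put(RECOVERY_ENABLED, "true");
        conf.put(STORE_CLASS, ZkStore.class.getName());
        System.out.println(getStore(conf).name());
    }
}
```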
The ResourceManager.RMActiveServices#serviceStart() method
@Override
protected void serviceStart() throws Exception {
  RMStateStore rmStore = rmContext.getStateStore();
  // The state store needs to start irrespective of recoveryEnabled as apps
  // need events to move to further states.
  rmStore.start();
  // whether to recover state after the active RM starts
  if (recoveryEnabled) {
    try {
      LOG.info("Recovery started");
      rmStore.checkVersion();
      if (rmContext.isWorkPreservingRecoveryEnabled()) {
        rmContext.setEpoch(rmStore.getAndIncrementEpoch());
      }
      RMState state = rmStore.loadState();
      recover(state);
      LOG.info("Recovery ended");
    } catch (Exception e) {
      // the Exception from loadState() needs to be handled for
      // HA and we need to give up master status if we got fenced
      LOG.error("Failed to load/recover state", e);
      throw e;
    }
  } else {
    if (HAUtil.isFederationEnabled(conf)) {
      long epoch = conf.getLong(YarnConfiguration.RM_EPOCH,
          YarnConfiguration.DEFAULT_RM_EPOCH);
      rmContext.setEpoch(epoch);
      LOG.info("Epoch set for Federation: " + epoch);
    }
  }
  super.serviceStart();
}
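The order of operations in the method above is easy to lose among the logging and error handling, so the sketch below reduces it to a list of step names. It is a hypothetical trace only (RecoveryFlowSketch and the step labels are made up for illustration) and it leaves out the federation branch.

```java
import java.util.ArrayList;
import java.util.List;

// Records the order of steps taken at start-up; performs no real work.
class RecoveryFlowSketch {
    static List<String> serviceStart(boolean recoveryEnabled,
                                     boolean workPreserving) {
        List<String> steps = new ArrayList<>();
        // The store always starts, whether or not recovery is enabled.
        steps.add("store.start");
        if (recoveryEnabled) {
            steps.add("store.checkVersion");
            if (workPreserving) {
                // The epoch is only bumped for work-preserving recovery.
                steps.add("store.getAndIncrementEpoch");
            }
            steps.add("store.loadState");
            steps.add("recover");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(serviceStart(true, true));
        System.out.println(serviceStart(false, false));
    }
}
```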
The ResourceManager#recover() method
@Override
public void recover(RMState state) throws Exception {
  // recover RMdelegationTokenSecretManager
  rmContext.getRMDelegationTokenSecretManager().recover(state);
  // recover AMRMTokenSecretManager
  rmContext.getAMRMTokenSecretManager().recover(state);
  // recover reservations
  if (reservationSystem != null) {
    reservationSystem.recover(state);
  }
  // recover applications
  rmAppManager.recover(state);
  setSchedulerRecoveryStartAndWaitTime(state, conf);
}
The ZKRMStateStore#getAndIncrementEpoch() method
public synchronized long getAndIncrementEpoch() throws Exception {
  String epochNodePath = getNodePath(zkRootNodePath, EPOCH_NODE);
  long currentEpoch = baseEpoch;
  if (exists(epochNodePath)) {
    // load current epoch
    byte[] data = getData(epochNodePath);
    Epoch epoch = new EpochPBImpl(EpochProto.parseFrom(data));
    currentEpoch = epoch.getEpoch();
    // increment epoch and store it
    byte[] storeData = Epoch.newInstance(nextEpoch(currentEpoch)).getProto()
        .toByteArray();
    zkManager.safeSetData(epochNodePath, storeData, -1, zkAcl,
        fencingNodePath);
  } else {
    // initialize epoch node with 1 for the next time.
    byte[] storeData = Epoch.newInstance(nextEpoch(currentEpoch)).getProto()
        .toByteArray();
    zkManager.safeCreate(epochNodePath, storeData, zkAcl,
        CreateMode.PERSISTENT, zkAcl, fencingNodePath);
  }
  return currentEpoch;
}
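The get-then-increment behaviour can be shown with an in-memory map in place of the ZooKeeper node. This is a sketch under stated assumptions: EpochSketch is hypothetical, the map entry stands in for the epoch znode, and a plain +1 replaces nextEpoch(...), whose exact behaviour is not shown in this article.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of "return the current epoch, persist the next one".
class EpochSketch {
    // Stand-in for the epoch znode; an absent key means the node does not exist.
    private final Map<String, Long> nodes = new HashMap<>();
    private final long baseEpoch;

    EpochSketch(long baseEpoch) {
        this.baseEpoch = baseEpoch;
    }

    synchronized long getAndIncrementEpoch() {
        // If the node exists, load it; otherwise start from the base epoch.
        long currentEpoch = nodes.getOrDefault("epoch", baseEpoch);
        // Assumption: the successor is currentEpoch + 1.
        nodes.put("epoch", currentEpoch + 1);
        // The caller gets the value from before the increment.
        return currentEpoch;
    }

    public static void main(String[] args) {
        EpochSketch store = new EpochSketch(0L);
        System.out.println(store.getAndIncrementEpoch());
        System.out.println(store.getAndIncrementEpoch());
    }
}
```

Each call returns the value from before the increment, so with a base epoch of 0 the first RM start in the sketch sees 0 and the next sees 1.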
To be continued.