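In Yarn, state transitions and event-driven dispatch usually work together. A request is first sent to the cluster as an event, routed by the event dispatcher to a specific event handler, and the handler then invokes the state machine to run the state-transition logic (for the event-driven flow itself, see the companion note 《学习笔记之Yarn中事件驱动模型.md》).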
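A state machine consists of a set of states, which fall roughly into three kinds: initial, intermediate, and final. The machine starts in an initial state A, passes through a series of intermediate states, and exits once it reaches a final state, so the states and transitions form a directed graph. (The graph is not strictly acyclic: self-transitions such as NEW -> NEW appear in the transition table later in this note.) The processing logic is: an event arrives and triggers a move from state A to state B, and the work of that move is done by the hook registered for the event.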
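To make the "event arrives, hook runs, state changes" cycle concrete, here is a minimal sketch of a table-driven state machine. Every name in it (MiniFsm, its states, its events, the StringBuilder operand) is hypothetical and chosen for illustration; it is not Yarn code, only the same idea at toy scale.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical names throughout; this is not YARN code.
class MiniFsm {
  enum State { NEW, RUNNING, FINISHED }
  enum Event { START, FINISH }

  // One arc of the graph: the target state plus the hook run during the move.
  static final class Arc {
    final State target;
    final Consumer<StringBuilder> hook;
    Arc(State target, Consumer<StringBuilder> hook) {
      this.target = target;
      this.hook = hook;
    }
  }

  // source state -> (event -> arc)
  private final Map<State, Map<Event, Arc>> table = new EnumMap<>(State.class);
  private State current = State.NEW;             // initial state
  final StringBuilder log = new StringBuilder(); // stands in for the operand

  MiniFsm() {
    add(State.NEW, Event.START, State.RUNNING, l -> l.append("started;"));
    add(State.RUNNING, Event.FINISH, State.FINISHED, l -> l.append("finished;"));
  }

  private void add(State from, Event on, State to, Consumer<StringBuilder> hook) {
    table.computeIfAbsent(from, k -> new HashMap<>()).put(on, new Arc(to, hook));
  }

  // Looks up the arc for (current state, event), runs its hook, then moves.
  State handle(Event event) {
    Map<Event, Arc> arcs = table.get(current);
    Arc arc = (arcs == null) ? null : arcs.get(event);
    if (arc == null) {
      throw new IllegalStateException("No transition from " + current + " on " + event);
    }
    arc.hook.accept(log);
    current = arc.target;
    return current;
  }

  public static void main(String[] args) {
    MiniFsm fsm = new MiniFsm();
    System.out.println(fsm.handle(Event.START));
    System.out.println(fsm.handle(Event.FINISH));
    System.out.println(fsm.log);
  }
}
```

An event that has no arc out of the current state is rejected with an exception, which is the toy counterpart of an invalid state transition.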
The source code discussed below is from Hadoop 2.7.1.
State machines in ResourceManager:
RMAppImpl
RMAppAttemptImpl
RMContainerImpl
RMNodeImpl
State machines in NodeManager:
ApplicationImpl
ContainerImpl
LocalizedResource
State machines in MapReduce: Job, Task, TaskAttempt.
For several ways to visualize these state machines, see: yarn状态机可视化
I exported a set of diagrams for reference: hadoop2.7.1状态机图
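The core class of the state machine is StateMachineFactory, and its most important job is to build a Map<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>> object. Because the class is generic, the concrete types are fixed at instantiation; in RMAppImpl, for example, it becomes Map<RMAppState, Map<RMAppEventType, Transition<RMAppImpl, RMAppState, RMAppEventType, RMAppEvent>>>. This map holds the state machine's metadata, and every later call is driven entirely by it. The map is organized from the "from" side to the "to" side: the outer STATE key is the source state. Several events can trigger a transition out of one state, each event maps to one Transition, and each Transition is an operation applied to the OPERAND. The UML class diagram of the Yarn state machine is as follows: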
The rest of this analysis uses RMAppImpl as the example.
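From the earlier note Yarn源码分析之集群启动流程 we know that starting Yarn instantiates ResourceManager, and ResourceManager in turn initializes RMAppManager, which stays resident for the lifetime of the ResourceManager. When a client submits a job over RPC, RMAppManager creates a new RMAppImpl instance whose initial state is RMAppState.NEW, and then sends an RMAppEvent of type RMAppEventType.START, which moves the state to RMAppState.NEW_SAVING. The key code in RMAppManager is: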
protected void submitApplication(
ApplicationSubmissionContext submissionContext, long submitTime,
String user) throws YarnException {
ApplicationId applicationId = submissionContext.getApplicationId();
// Create the RMAppImpl, whose state machine starts in the NEW state
RMAppImpl application =
createAndPopulateNewRMApp(submissionContext, submitTime, user, false);
ApplicationId appId = submissionContext.getApplicationId();
...
// Send an RMAppEvent; after dispatch, the event handler calls the state machine to transition NEW -> NEW_SAVING
this.rmContext.getDispatcher().getEventHandler()
.handle(new RMAppEvent(applicationId, RMAppEventType.START));
}
private RMAppImpl createAndPopulateNewRMApp(
ApplicationSubmissionContext submissionContext, long submitTime,
String user, boolean isRecovery) throws YarnException {
ApplicationId applicationId = submissionContext.getApplicationId();
ResourceRequest amReq =
validateAndCreateResourceRequest(submissionContext, isRecovery);
// Create RMApp
RMAppImpl application =
new RMAppImpl(applicationId, rmContext, this.conf,
submissionContext.getApplicationName(), user,
submissionContext.getQueue(),
submissionContext, this.scheduler, this.masterService,
submitTime, submissionContext.getApplicationType(),
submissionContext.getApplicationTags(), amReq);
...
}
Next we look closely at how the state machine is built inside RMAppImpl.
First, a static field stateMachineFactory is declared to build the state-transition topology:
private static final StateMachineFactory<RMAppImpl,
RMAppState,
RMAppEventType,
RMAppEvent> stateMachineFactory
= new StateMachineFactory<RMAppImpl,
RMAppState,
RMAppEventType,
RMAppEvent>(RMAppState.NEW)
// Transitions from NEW state
.addTransition(RMAppState.NEW, RMAppState.NEW,
RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
.addTransition(RMAppState.NEW, RMAppState.NEW_SAVING,
RMAppEventType.START, new RMAppNewlySavingTransition())
// other addTransition calls omitted
...
.installTopology();
In addition, a field stateMachine is declared to track the current state of this application instance:
private final StateMachine<RMAppState, RMAppEventType, RMAppEvent>
stateMachine;
public RMAppImpl(ApplicationId applicationId, RMContext rmContext,
Configuration config, String name, String user, String queue,
ApplicationSubmissionContext submissionContext, YarnScheduler scheduler,
ApplicationMasterService masterService, long submitTime,
String applicationType, Set<String> applicationTags,
ResourceRequest amReq) {
...
this.stateMachine = stateMachineFactory.make(this);
...
}
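The split between a static factory and a per-instance state machine can be sketched as follows: the topology is built once per class and shared, while each object carries only its own current state. The class AppSketch, the DONE state, and the SAVED event are hypothetical names used for illustration and are not taken from Hadoop.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch; AppSketch and its enums are not YARN classes.
class AppSketch {
  enum State { NEW, NEW_SAVING, DONE }
  enum Event { START, SAVED }

  // Built once when the class loads and shared by every instance,
  // in the spirit of the static stateMachineFactory field.
  private static final Map<State, Map<Event, State>> TOPOLOGY = buildTopology();

  private static Map<State, Map<Event, State>> buildTopology() {
    Map<State, Map<Event, State>> t = new EnumMap<>(State.class);
    t.put(State.NEW, Collections.singletonMap(Event.START, State.NEW_SAVING));
    t.put(State.NEW_SAVING, Collections.singletonMap(Event.SAVED, State.DONE));
    return Collections.unmodifiableMap(t);
  }

  // Per-instance part: only the current state, like the stateMachine field.
  private State current = State.NEW;

  synchronized State getState() {
    return current;
  }

  synchronized State handle(Event event) {
    Map<Event, State> arcs = TOPOLOGY.get(current);
    State next = (arcs == null) ? null : arcs.get(event);
    if (next == null) {
      throw new IllegalStateException(current + " cannot handle " + event);
    }
    current = next;
    return current;
  }

  public static void main(String[] args) {
    AppSketch a = new AppSketch();
    AppSketch b = new AppSketch();
    a.handle(Event.START);
    // a has moved on; b is untouched even though both share TOPOLOGY.
    System.out.println(a.getState() + " " + b.getState());
  }
}
```

The benefit of this layout is that creating many application objects costs one small state holder each, not one copy of the transition table each.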
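Once stateMachineFactory and stateMachine are initialized, the application can accept state-transition events, run the business logic, and move to the next state according to the topology. The diagram below, taken from the web, shows which event type triggers each state change; it makes the source easier to navigate: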
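Before moving on to the next section (skip this one if the details do not interest you), let us walk through the code behind these two variables. Start with stateMachineFactory. As the code above shows, the topology of a StateMachineFactory is assembled by calling the various addTransition overloads to register transitions, and installTopology then finalizes the topology. The initial state is passed to the StateMachineFactory constructor.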
First, addTransition. StateMachineFactory has five addTransition overloads:
public StateMachineFactory
<OPERAND, STATE, EVENTTYPE, EVENT>
addTransition(STATE preState, STATE postState, EVENTTYPE eventType) {
return addTransition(preState, postState, eventType, null);
}
public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(
STATE preState, STATE postState, Set<EVENTTYPE> eventTypes) {
return addTransition(preState, postState, eventTypes, null);
}
public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(
STATE preState, STATE postState, Set<EVENTTYPE> eventTypes,
SingleArcTransition<OPERAND, EVENT> hook) {
StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> factory = null;
for (EVENTTYPE event : eventTypes) {
if (factory == null) {
factory = addTransition(preState, postState, event, hook);
} else {
factory = factory.addTransition(preState, postState, event, hook);
}
}
return factory;
}
public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(STATE preState, STATE postState,
EVENTTYPE eventType,
SingleArcTransition<OPERAND, EVENT> hook){
return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
(this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
(preState, eventType, new SingleInternalArc(postState, hook)));
}
public StateMachineFactory
<OPERAND, STATE, EVENTTYPE, EVENT>
addTransition(STATE preState, Set<STATE> postStates,
EVENTTYPE eventType,
MultipleArcTransition<OPERAND, EVENT, STATE> hook){
return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
(this,
new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
(preState, eventType, new MultipleInternalArc(postStates, hook)));
}
The addTransition overloads above define three kinds of state transition:
addTransition(STATE preState, STATE postState, EVENTTYPE eventType, SingleArcTransition<OPERAND, EVENT> hook)
addTransition(STATE preState, STATE postState, Set<EVENTTYPE> eventTypes, SingleArcTransition<OPERAND, EVENT> hook)
addTransition(STATE preState, Set<STATE> postStates, EVENTTYPE eventType, MultipleArcTransition<OPERAND, EVENT, STATE> hook)
A closer look shows that addTransition wraps a SingleArcTransition in a SingleInternalArc (or a MultipleArcTransition in a MultipleInternalArc), wraps that arc in an ApplicableSingleOrMultipleTransition, and then uses the wrapped ApplicableSingleOrMultipleTransition to build the linked list described next.
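The main effect of addTransition is that, in a functional style, every call returns a newly created StateMachineFactory. The constructor of the new StateMachineFactory creates a TransitionsListNode, and these nodes chain the registered transitions into a linked list in the reverse of the order in which they were added. TransitionsListNode implements the list as follows: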
private class TransitionsListNode {
final ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition;
final TransitionsListNode next;
TransitionsListNode
(ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition,
TransitionsListNode next) {
this.transition = transition;
this.next = next;
}
}
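It has two fields: transition, of type ApplicableTransition (an interface implemented by ApplicableSingleOrMultipleTransition), and next, of type TransitionsListNode. As the constructor shows, transition is the handler object for the transition being registered, and next points to the next TransitionsListNode, which is in fact the TransitionsListNode held by the previous StateMachineFactory.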
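The "each call returns a new factory whose node points at the previous factory's node" pattern can be sketched in isolation. ChainBuilder below is a hypothetical stand-in that stores plain strings in place of transitions; it only illustrates why the resulting list ends up in reverse insertion order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an immutable builder; not the YARN class itself.
final class ChainBuilder {
  // Singly linked node; next points at the node added just before this one.
  private static final class Node {
    final String transition;
    final Node next;
    Node(String transition, Node next) {
      this.transition = transition;
      this.next = next;
    }
  }

  private final Node head; // most recently added transition, or null

  ChainBuilder() {
    this.head = null;
  }

  private ChainBuilder(ChainBuilder previous, String transition) {
    // The new node is prepended, so the list is in reverse insertion order.
    this.head = new Node(transition, previous.head);
  }

  // Returns a new builder; the receiver is left unchanged.
  ChainBuilder add(String transition) {
    return new ChainBuilder(this, transition);
  }

  // Walks the list as stored: newest first.
  List<String> storedOrder() {
    List<String> out = new ArrayList<>();
    for (Node n = head; n != null; n = n.next) {
      out.add(n.transition);
    }
    return out;
  }

  public static void main(String[] args) {
    ChainBuilder base = new ChainBuilder().add("NEW->NEW");
    ChainBuilder full = base.add("NEW->NEW_SAVING");
    System.out.println(full.storedOrder());
    System.out.println(base.storedOrder());
  }
}
```

Because no builder is ever mutated, an intermediate builder such as base stays valid after further calls, and prepending keeps each add at constant cost.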
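After all transitions have been added to the StateMachineFactory through addTransition, installTopology is called to initialize the state chain, which creates the final StateMachineFactory. The actual initialization happens in StateMachineFactory.makeStateMachineTable, which calls apply on each wrapped ApplicableSingleOrMultipleTransition to populate the stateMachineTable field: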
private void makeStateMachineTable() {
Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>> stack =
new Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>>();
Map<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>
prototype = new HashMap<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>();
prototype.put(defaultInitialState, null);
// I use EnumMap here because it'll be faster and denser. I would
// expect most of the states to have at least one transition.
stateMachineTable
= new EnumMap<STATE, Map<EVENTTYPE,
Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>(prototype);
for (TransitionsListNode cursor = transitionsListNode;
cursor != null;
cursor = cursor.next) {
stack.push(cursor.transition);
}
while (!stack.isEmpty()) {
stack.pop().apply(this);
}
}
static private class ApplicableSingleOrMultipleTransition
<OPERAND, STATE extends Enum<STATE>,
EVENTTYPE extends Enum<EVENTTYPE>, EVENT>
implements ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> {
final STATE preState;
final EVENTTYPE eventType;
final Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition;
ApplicableSingleOrMultipleTransition
(STATE preState, EVENTTYPE eventType,
Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition) {
this.preState = preState;
this.eventType = eventType;
this.transition = transition;
}
@Override
public void apply
(StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> subject) {
Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
= subject.stateMachineTable.get(preState);
if (transitionMap == null) {
// I use HashMap here because I would expect most EVENTTYPE's to not
// apply out of a particular state, so FSM sizes would be
// quadratic if I use EnumMap's here as I do at the top level.
transitionMap = new HashMap<EVENTTYPE,
Transition<OPERAND, STATE, EVENTTYPE, EVENT>>();
subject.stateMachineTable.put(preState, transitionMap);
}
transitionMap.put(eventType, transition);
}
}
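The stack in makeStateMachineTable undoes the reverse ordering of the linked list, so transitions are applied in the order they were registered. A small sketch of that replay step, assuming hypothetical names (TableBuildSketch, Entry, the KILL event and KILLED state) and storing only the target state instead of a Transition object:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the enums and Entry are illustrative only.
class TableBuildSketch {
  enum State { NEW, NEW_SAVING, KILLED }
  enum Event { START, KILL }

  static final class Entry {
    final State pre;
    final Event on;
    final State post;
    Entry(State pre, Event on, State post) {
      this.pre = pre;
      this.on = on;
      this.post = post;
    }
  }

  // newestFirst mimics the order in which the linked list is traversed.
  static Map<State, Map<Event, State>> build(List<Entry> newestFirst) {
    Deque<Entry> stack = new ArrayDeque<>();
    for (Entry e : newestFirst) {
      stack.push(e); // the oldest entry ends up on top
    }
    // Outer map keyed by enum (dense); inner maps are sparse HashMaps.
    Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    while (!stack.isEmpty()) {
      Entry e = stack.pop(); // applied in registration order
      table.computeIfAbsent(e.pre, k -> new HashMap<>()).put(e.on, e.post);
    }
    return table;
  }

  public static void main(String[] args) {
    List<Entry> newestFirst = Arrays.asList(
        new Entry(State.NEW, Event.KILL, State.KILLED),
        new Entry(State.NEW, Event.START, State.NEW_SAVING));
    System.out.println(build(newestFirst));
  }
}
```

One consequence of replaying in registration order is that if the same (state, event) pair is registered twice, the later registration overwrites the earlier one. The sketch passes State.class to EnumMap directly, whereas the original seeds a prototype map, presumably because the class object for the generic STATE type is not available at that point.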
Now the other variable, stateMachine. It is initialized in the RMAppImpl constructor with this.stateMachine = stateMachineFactory.make(this); the make function is:
public StateMachine<STATE, EVENTTYPE, EVENT> make(OPERAND operand) {
return new InternalStateMachine(operand, defaultInitialState);
}
private class InternalStateMachine
implements StateMachine<STATE, EVENTTYPE, EVENT> {
private final OPERAND operand;
private STATE currentState;
InternalStateMachine(OPERAND operand, STATE initialState) {
this.operand = operand;
this.currentState = initialState;
if (!optimized) {
maybeMakeStateMachineTable();
}
}
@Override
public synchronized STATE getCurrentState() {
return currentState;
}
@Override
public synchronized STATE doTransition(EVENTTYPE eventType, EVENT event)
throws InvalidStateTransitonException {
currentState = StateMachineFactory.this.doTransition
(operand, currentState, eventType, event);
return currentState;
}
}
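As the code shows, initializing stateMachine simply instantiates the inner class InternalStateMachine, which tracks the state of one application. Its currentState field stores the current state of that application's state machine. State transitions go through InternalStateMachine.doTransition, which calls StateMachineFactory.doTransition to run the transition registered in the state graph and assigns the returned state as the new current state. StateMachineFactory.doTransition is: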
private STATE doTransition
(OPERAND operand, STATE oldState, EVENTTYPE eventType, EVENT event)
throws InvalidStateTransitonException {
// We can assume that stateMachineTable is non-null because we call
// maybeMakeStateMachineTable() when we build an InnerStateMachine ,
// and this code only gets called from inside a working InnerStateMachine .
Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
= stateMachineTable.get(oldState);
if (transitionMap != null) {
Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition
= transitionMap.get(eventType);
if (transition != null) {
return transition.doTransition(operand, oldState, event, eventType);
}
}
throw new InvalidStateTransitonException(oldState, eventType);
}
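The key line is return transition.doTransition(operand, oldState, event, eventType);, which calls transition.doTransition. The Transition interface has two implementations, SingleInternalArc and MultipleInternalArc. As described earlier, they wrap the user-supplied transition so it can be used to build the topology. The main difference: a transition wrapped in SingleInternalArc does not return the resulting state, and the arc returns the postState fixed in the topology; a transition wrapped in MultipleInternalArc must return the resulting state itself. Both implementations are shown below: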
private interface Transition<OPERAND, STATE extends Enum<STATE>,
EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
STATE doTransition(OPERAND operand, STATE oldState,
EVENT event, EVENTTYPE eventType);
}
private class SingleInternalArc
implements Transition<OPERAND, STATE, EVENTTYPE, EVENT> {
private STATE postState;
private SingleArcTransition<OPERAND, EVENT> hook; // transition hook
SingleInternalArc(STATE postState,
SingleArcTransition<OPERAND, EVENT> hook) {
this.postState = postState;
this.hook = hook;
}
@Override
public STATE doTransition(OPERAND operand, STATE oldState,
EVENT event, EVENTTYPE eventType) {
if (hook != null) {
hook.transition(operand, event);
}
return postState;
}
}
private class MultipleInternalArc
implements Transition<OPERAND, STATE, EVENTTYPE, EVENT>{
// Fields
private Set<STATE> validPostStates;
private MultipleArcTransition<OPERAND, EVENT, STATE> hook; // transition hook
MultipleInternalArc(Set<STATE> postStates,
MultipleArcTransition<OPERAND, EVENT, STATE> hook) {
this.validPostStates = postStates;
this.hook = hook;
}
@Override
public STATE doTransition(OPERAND operand, STATE oldState,
EVENT event, EVENTTYPE eventType)
throws InvalidStateTransitonException {
STATE postState = hook.transition(operand, event);
if (!validPostStates.contains(postState)) {
throw new InvalidStateTransitonException(oldState, eventType);
}
return postState;
}
}
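The difference between the two arc types can be reduced to two small functions. This is a hedged sketch with hypothetical names (ArcSketch, its State enum, an int input standing in for the event), not the Hadoop classes: a single arc ignores whatever the hook does and returns the configured target, while a multiple arc lets the hook choose and then checks the choice against the declared set.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; names and the int "event" are illustrative only.
class ArcSketch {
  enum State { NEW, ACCEPTED, FAILED, KILLED }

  // Single arc: the hook has no say in the result; the configured state is returned.
  static State singleArc(State postState, Runnable hook) {
    if (hook != null) {
      hook.run();
    }
    return postState;
  }

  // Multiple arc: the hook picks the result, which must be a declared target.
  static State multipleArc(Set<State> validPostStates,
                           Function<Integer, State> hook, int input) {
    State post = hook.apply(input);
    if (!validPostStates.contains(post)) {
      throw new IllegalStateException("Undeclared target state: " + post);
    }
    return post;
  }

  public static void main(String[] args) {
    Set<State> targets = EnumSet.of(State.ACCEPTED, State.FAILED);
    Function<Integer, State> byExitCode =
        code -> code == 0 ? State.ACCEPTED : State.FAILED;

    System.out.println(singleArc(State.ACCEPTED, null));
    System.out.println(multipleArc(targets, byExitCode, 0));
    System.out.println(multipleArc(targets, byExitCode, 1));
  }
}
```

Declaring the valid target set up front means a hook that returns an unexpected state fails loudly at the transition rather than leaving the machine in a state the topology never described.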
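The previous section covered building the state machine topology and instantiating the current state. What remains is how it gets invoked. Earlier we used the example of a client submission that moves RMAppImpl from RMAppState.NEW to RMAppState.NEW_SAVING; the key calling code is: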
this.rmContext.getDispatcher().getEventHandler()
.handle(new RMAppEvent(applicationId, RMAppEventType.START));
Following Yarn's event-driven mechanism (see the companion article Yarn源码分析之事件模型), the ResourceManager class shows that the event handler is ApplicationEventDispatcher. Its key code is:
@Override
protected void serviceInit(Configuration configuration) throws Exception {
...
// Register event handler for RmAppEvents
rmDispatcher.register(RMAppEventType.class,
new ApplicationEventDispatcher(rmContext));
...
}
public static final class ApplicationEventDispatcher implements
EventHandler<RMAppEvent> {
private final RMContext rmContext;
public ApplicationEventDispatcher(RMContext rmContext) {
this.rmContext = rmContext;
}
@Override
public void handle(RMAppEvent event) {
ApplicationId appID = event.getApplicationId();
RMApp rmApp = this.rmContext.getRMApps().get(appID);
if (rmApp != null) {
try {
rmApp.handle(event);
} catch (Throwable t) {
LOG.error("Error in handling event type " + event.getType()
+ " for application " + appID, t);
}
}
}
}
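The registration and routing shown above can be sketched as a registry keyed by the enum class of the event type. The sketch is deliberately synchronous and all of its names (DispatchSketch, Dispatcher, AppEvent, appHandler) are hypothetical; it does not reproduce the real dispatcher, which is covered in the event-model note.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, synchronous sketch; the real dispatcher is not reproduced here.
class DispatchSketch {
  interface Event { Enum<?> getType(); }
  interface Handler { void handle(Event event); }

  enum AppEventType { START }

  static final class AppEvent implements Event {
    final String appId;
    final AppEventType type;
    AppEvent(String appId, AppEventType type) {
      this.appId = appId;
      this.type = type;
    }
    @Override
    public Enum<?> getType() {
      return type;
    }
  }

  // Registry keyed by the enum class of the event type.
  static final class Dispatcher {
    private final Map<Class<?>, Handler> handlers = new HashMap<>();

    void register(Class<?> eventTypeClass, Handler handler) {
      handlers.put(eventTypeClass, handler);
    }

    void dispatch(Event event) {
      Handler handler = handlers.get(event.getType().getDeclaringClass());
      if (handler == null) {
        throw new IllegalStateException("No handler for " + event.getType());
      }
      handler.handle(event);
    }
  }

  // Stand-in for the per-application handler: look the app up by id and
  // forward the event; events for unknown ids are silently dropped.
  static Handler appHandler(Map<String, List<String>> apps) {
    return event -> {
      AppEvent appEvent = (AppEvent) event;
      List<String> inbox = apps.get(appEvent.appId);
      if (inbox != null) {
        inbox.add(appEvent.type.name());
      }
    };
  }

  public static void main(String[] args) {
    Map<String, List<String>> apps = new HashMap<>();
    apps.put("app_1", new ArrayList<>());
    Dispatcher dispatcher = new Dispatcher();
    dispatcher.register(AppEventType.class, appHandler(apps));
    dispatcher.dispatch(new AppEvent("app_1", AppEventType.START));
    dispatcher.dispatch(new AppEvent("app_unknown", AppEventType.START));
    System.out.println(apps);
  }
}
```

Keying the registry by enum class lets one handler own a whole family of event types, and the handler itself decides which object the event belongs to.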
In ApplicationEventDispatcher.handle, the event handler ends up calling rmApp.handle(event) to drive the state transition. RMApp is implemented by RMAppImpl, so we look at RMAppImpl.handle:
@Override
public void handle(RMAppEvent event) {
this.writeLock.lock();
try {
ApplicationId appID = event.getApplicationId();
LOG.debug("Processing event for " + appID + " of type "
+ event.getType());
final RMAppState oldState = getState();
try {
/* keep the master in sync with the state machine */
this.stateMachine.doTransition(event.getType(), event);
} catch (InvalidStateTransitonException e) {
LOG.error("Can't handle this event at current state", e);
/* TODO fail the application on the failed transition */
}
if (oldState != getState()) {
LOG.info(appID + " State change from " + oldState + " to "
+ getState());
}
} finally {
this.writeLock.unlock();
}
}
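The shape of this handle method (take the write lock, attempt the transition, swallow an invalid one, report a change) can be sketched on its own. LockedHandleSketch and its enums are hypothetical, IllegalStateException stands in for the invalid-transition exception, and the read-locked getState is an assumption of the sketch.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; not the RMAppImpl implementation.
class LockedHandleSketch {
  enum State { NEW, NEW_SAVING }
  enum Event { START, UNEXPECTED }

  private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
  private final Lock readLock = rw.readLock();
  private final Lock writeLock = rw.writeLock();
  private State state = State.NEW;
  int rejected = 0; // counts events that had no valid transition

  State getState() {
    readLock.lock();
    try {
      return state;
    } finally {
      readLock.unlock();
    }
  }

  // Stand-in for stateMachine.doTransition: one valid arc, everything else fails.
  private State transition(State from, Event event) {
    if (from == State.NEW && event == Event.START) {
      return State.NEW_SAVING;
    }
    throw new IllegalStateException("Invalid event " + event + " at " + from);
  }

  void handle(Event event) {
    writeLock.lock();
    try {
      // A thread holding the write lock may also take the read lock.
      State old = getState();
      try {
        state = transition(old, event);
      } catch (IllegalStateException e) {
        rejected++; // the state is left unchanged, as in the original catch block
      }
      if (old != getState()) {
        System.out.println("State change from " + old + " to " + getState());
      }
    } finally {
      writeLock.unlock();
    }
  }

  public static void main(String[] args) {
    LockedHandleSketch app = new LockedHandleSketch();
    app.handle(Event.START);
    app.handle(Event.UNEXPECTED);
    System.out.println(app.getState() + " rejected=" + app.rejected);
  }
}
```

Holding the write lock for the whole method means two events for the same object are never processed concurrently, and catching the invalid transition keeps one bad event from taking down the handler thread.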
The core line in RMAppImpl.handle is this.stateMachine.doTransition(event.getType(), event);. As analyzed in the previous section, the call chain is InternalStateMachine.doTransition -> StateMachineFactory.doTransition -> SingleInternalArc.doTransition -> RMAppNewlySavingTransition.transition, and the actual logic is implemented in RMAppNewlySavingTransition.transition, as shown in the figure below: