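1: In the Master project, the core service thread bean is MasterSchedulerService. When the Master service starts, the container injects it, and the @PostConstruct run() method of MasterServer initializes and starts it.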
MasterServer
/**
 * run master server
 */
@PostConstruct
public void run() throws SchedulerException {
    // init remoting server
    NettyServerConfig serverConfig = new NettyServerConfig();
    serverConfig.setListenPort(masterConfig.getListenPort());
    this.nettyRemotingServer = new NettyRemotingServer(serverConfig);
    this.nettyRemotingServer.registerProcessor(CommandType.TASK_EXECUTE_RESPONSE, taskResponseProcessor);
    this.nettyRemotingServer.registerProcessor(CommandType.TASK_EXECUTE_ACK, taskAckProcessor);
    this.nettyRemotingServer.registerProcessor(CommandType.TASK_KILL_RESPONSE, new TaskKillResponseProcessor());
    this.nettyRemotingServer.registerProcessor(CommandType.STATE_EVENT_REQUEST, stateEventProcessor);
    this.nettyRemotingServer.registerProcessor(CommandType.TASK_FORCE_STATE_EVENT_REQUEST, taskEventProcessor);
    this.nettyRemotingServer.registerProcessor(CommandType.TASK_WAKEUP_EVENT_REQUEST, taskEventProcessor);
    this.nettyRemotingServer.registerProcessor(CommandType.CACHE_EXPIRE, cacheProcessor);
    this.nettyRemotingServer.start();

    // self tolerant
    this.masterRegistryClient.init();
    this.masterRegistryClient.start();
    this.masterRegistryClient.setRegistryStoppable(this);

    this.eventExecuteService.init();
    this.eventExecuteService.start();

    // initialize MasterSchedulerService
    this.masterSchedulerService.init();
    // start the scheduler service thread
    this.masterSchedulerService.start();

    this.scheduler.start();

    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        if (Stopper.isRunning()) {
            close("shutdownHook");
        }
    }));
}
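The registerProcessor calls above bind each command type to a handler. As a rough illustration of that registration pattern only, here is a minimal sketch. It is not the real NettyRemotingServer: the class ProcessorRegistrySketch, the enum SketchCommandType, and the dispatch method are hypothetical names introduced for this example, and handlers are reduced to plain functions on strings.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: NOT the real NettyRemotingServer, only the registration pattern.
class ProcessorRegistrySketch {

    // Hypothetical subset of command types, named after some of those registered above.
    enum SketchCommandType { TASK_EXECUTE_ACK, TASK_EXECUTE_RESPONSE, CACHE_EXPIRE }

    // One handler per command type; a handler maps a payload to a result string.
    private final Map<SketchCommandType, Function<String, String>> processors =
            new EnumMap<>(SketchCommandType.class);

    void registerProcessor(SketchCommandType type, Function<String, String> processor) {
        processors.put(type, processor);
    }

    // Looks up the handler for the incoming type; unknown types are reported, not thrown.
    String dispatch(SketchCommandType type, String payload) {
        Function<String, String> processor = processors.get(type);
        if (processor == null) {
            return "no processor for " + type;
        }
        return processor.apply(payload);
    }

    public static void main(String[] args) {
        ProcessorRegistrySketch server = new ProcessorRegistrySketch();
        server.registerProcessor(SketchCommandType.TASK_EXECUTE_ACK, body -> "ack:" + body);
        System.out.println(server.dispatch(SketchCommandType.TASK_EXECUTE_ACK, "task-1")); // ack:task-1
        System.out.println(server.dispatch(SketchCommandType.CACHE_EXPIRE, "user"));       // no processor for CACHE_EXPIRE
    }
}
```

Registering every processor before start() is called, as MasterServer does, means no message can arrive for a type that has no handler yet.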
MasterSchedulerService is a standalone thread bean. The main body of the thread:
/**
 * run of MasterSchedulerService
 */
@Override
public void run() {
    logger.info("master scheduler started");
    while (Stopper.isRunning()) {
        try {
            // check whether the CPU and memory load of this server has reached the configured limits
            boolean runCheckFlag = OSUtils.checkResource(masterConfig.getMaxCpuLoadAvg(), masterConfig.getReservedMemory());
            if (!runCheckFlag) {
                Thread.sleep(Constants.SLEEP_TIME_MILLIS);
                continue;
            }
            // run one scheduling round
            scheduleProcess();
        } catch (Exception e) {
            logger.error("master scheduler thread error", e);
        }
    }
}
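The loop above is a resource-gated polling loop: do one round of work when the host has headroom, otherwise sleep and try again. The sketch below isolates that shape so it can be run on its own. GatedLoopSketch, runGated, and the bounded iteration count are hypothetical and exist only for this illustration; the resource check is a plain BooleanSupplier rather than OSUtils.checkResource.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a resource-gated polling loop; names are illustrative only.
class GatedLoopSketch {

    // Runs `work` only on turns where `resourcesOk` is true, otherwise backs off.
    // Returns how many times `work` completed within `iterations` loop turns.
    static int runGated(int iterations, BooleanSupplier resourcesOk, Runnable work, long sleepMillis) {
        int completed = 0;
        for (int i = 0; i < iterations; i++) {
            if (!resourcesOk.getAsBoolean()) {
                try {
                    Thread.sleep(sleepMillis); // back off instead of spinning
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                    break;
                }
                continue;
            }
            try {
                work.run();
                completed++;
            } catch (RuntimeException e) {
                // A failure in one round is reported and must not end the loop.
                System.err.println("scheduler round failed: " + e.getMessage());
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        int[] turn = {0};
        // Pretend the host is overloaded on every odd turn: work runs on turns 0, 2 and 4.
        int completed = runGated(6, () -> turn[0]++ % 2 == 0, () -> { }, 1L);
        System.out.println(completed); // 3
    }
}
```

One deliberate difference: the sketch restores the interrupt flag and exits when interrupted during the sleep, whereas the loop above catches Exception, logs it, and keeps looping while Stopper.isRunning() is true.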
The scheduleProcess() method
This method wraps a workflow instance (processInstance) in a WorkflowExecuteThread object and hands it to an ExecutorService thread pool for execution.
/**
 * 1. get command by slot
 * 2. do not handle command if slot is empty
 */
private void scheduleProcess() throws Exception {
    // Only one Command is fetched per round (selected by the slot sharding algorithm).
    // Fetching Commands through the sharding algorithm actually has a problem.
    // Has anyone spotted it? Feel free to leave a comment below.
    Command command = findOneCommand();
    if (command != null) {
        logger.info("find one command: id: {}, type: {}", command.getId(), command.getCommandType());
        try {
            ProcessInstance processInstance = processService.handleCommand(logger, getLocalAddress(), command);
            if (processInstance != null) {
                WorkflowExecuteThread workflowExecuteThread = new WorkflowExecuteThread(
                        processInstance
                        , taskResponseService
                        , processService
                        , nettyExecutorManager
                        , processAlertManager
                        , masterConfig
                        , taskTimeoutCheckList
                        , taskRetryCheckList);

                this.processInstanceExecMaps.put(processInstance.getId(), workflowExecuteThread);
                if (processInstance.getTimeout() > 0) {
                    this.processTimeoutCheckList.put(processInstance.getId(), processInstance);
                }
                logger.info("handle command end, command {} process {} start...",
                        command.getId(), processInstance.getId());
                masterExecService.execute(workflowExecuteThread);
            }
        } catch (Exception e) {
            logger.error("scan command error ", e);
            processService.moveToErrorCommand(command, e.toString());
        }
    } else {
        // indicates that there is no command, sleep for 1s
        Thread.sleep(Constants.SLEEP_TIME_MILLIS);
    }
}
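findOneCommand() itself is not shown in this article. To make "get command by slot" concrete, the sketch below assumes a common sharding scheme: each master only claims commands whose id modulo the number of masters equals its own slot. That scheme is an assumption of this example, not something confirmed by the code above, and SlotShardingSketch and its parameters are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of slot-based sharding; not the article's findOneCommand().
class SlotShardingSketch {

    // Returns the first pending command id owned by `slot`,
    // ASSUMING ownership is decided by id % masterCount == slot.
    static Optional<Integer> findOneCommand(List<Integer> pendingIds, int masterCount, int slot) {
        if (masterCount <= 0) {
            return Optional.empty(); // no master registered, so nothing can be claimed
        }
        for (int id : pendingIds) {
            if (id % masterCount == slot) {
                return Optional.of(id);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Integer> pending = Arrays.asList(7, 8, 9, 10);
        // With 3 masters: slot 0 owns id 9, slot 1 owns ids 7 and 10, slot 2 owns id 8.
        System.out.println(findOneCommand(pending, 3, 0)); // Optional[9]
        System.out.println(findOneCommand(pending, 3, 1)); // Optional[7]
        System.out.println(findOneCommand(pending, 3, 2)); // Optional[8]
    }
}
```

Under this assumption, the owner of a given id shifts whenever masterCount changes, so slot and master count have to be read consistently within one round.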
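WorkflowExecuteThread
The run body of this thread mainly builds the DAG (directed acyclic graph) of the workflow's tasks and handles a large amount of extremely complex internal business logic such as task state transitions. It finally submits the tasks that have been split out and are ready to run to the WorkerServer for execution.
We only follow the main flow here and will not dig into the Master's internal business logic.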
private void startProcess() throws Exception {
    if (this.taskInstanceHashMap.size() == 0) {
        isStart = false;
        // build the DAG from the workflow instance
        buildFlowDag();
        // Initialize the task queues. WorkflowExecuteThread works mainly on the various
        // task list objects held inside this object; this is where those lists are initialized.
        initTaskQueue();
        submitPostNode(null);
        isStart = true;
    }
}
Next, let's look at submitPostNode(), the method that submits task nodes:
Set<String> submitTaskNodeList = DagHelper.parsePostNodes(parentNodeCode, skipTaskNodeList, dag, completeTaskList);
// 1: assorted business handling of the task nodes before submission, skipped here
List<TaskInstance> taskInstances = new ArrayList<>();
for (String taskNode : submitTaskNodeList) {
    TaskNode taskNodeObject = dag.getNode(taskNode);
    if (taskInstanceHashMap.containsColumn(taskNodeObject.getCode())) {
        continue;
    }
    TaskInstance task = createTaskInstance(processInstance, taskNodeObject);
    taskInstances.add(task);
}

// if previous node success , post node submit
for (TaskInstance task : taskInstances) {

    if (readyToSubmitTaskQueue.contains(task)) {
        continue;
    }

    if (completeTaskList.containsKey(Long.toString(task.getTaskCode()))) {
        logger.info("task {} has already run success, task id:{}", task.getName(), task.getId());
        continue;
    }
    if (task.getState().typeIsPause() || task.getState().typeIsCancel()) {
        logger.info("task {} stopped, the state is {}, task id:{}", task.getName(), task.getState(), task.getId());
    } else {
        addTaskToStandByList(task);
    }
}
// 2: submit the task nodes. Submission mainly works on the readyToSubmitTaskQueue field;
// tasks that passed the checks above have already been added to this queue.
submitStandByTask();
// 3: update the state of the workflow instance
updateProcessInstanceState();
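DagHelper.parsePostNodes is not shown in this article. As a simplified illustration of the idea behind it (after a node finishes, its successors become candidates once all of their own predecessors are complete), here is a sketch over a plain adjacency map. It ignores skipped nodes, conditional branches, and every other rule the real helper applies. PostNodeSketch and its methods are hypothetical, and treating a null parent as "start of the workflow" is an assumption modelled on the submitPostNode(null) call in startProcess().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified sketch; not DolphinScheduler's DagHelper.
class PostNodeSketch {

    // Collects every node that has a direct edge into `node`.
    static Set<String> predecessors(String node, Map<String, List<String>> edges) {
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, List<String>> entry : edges.entrySet()) {
            if (entry.getValue().contains(node)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // `edges` maps each node to its direct successors (every node must be a key).
    // A null parent means "start of the workflow", so all nodes are candidates.
    static Set<String> parsePostNodes(String parent, Map<String, List<String>> edges, Set<String> completed) {
        Collection<String> candidates;
        if (parent == null) {
            candidates = new ArrayList<>(edges.keySet());
        } else {
            candidates = edges.getOrDefault(parent, Collections.<String>emptyList());
        }
        Set<String> ready = new LinkedHashSet<>();
        for (String node : candidates) {
            // Ready = not run yet, and every predecessor has completed.
            if (!completed.contains(node) && completed.containsAll(predecessors(node, edges))) {
                ready.add(node);
            }
        }
        return ready;
    }

    public static void main(String[] args) {
        // Diamond-shaped workflow: A -> B, A -> C, B -> D, C -> D
        Map<String, List<String>> edges = new LinkedHashMap<>();
        edges.put("A", Arrays.asList("B", "C"));
        edges.put("B", Arrays.asList("D"));
        edges.put("C", Arrays.asList("D"));
        edges.put("D", Collections.<String>emptyList());

        System.out.println(parsePostNodes(null, edges, new HashSet<>()));                          // [A]
        System.out.println(parsePostNodes("B", edges, new HashSet<>(Arrays.asList("A", "B"))));      // []
        System.out.println(parsePostNodes("C", edges, new HashSet<>(Arrays.asList("A", "B", "C")))); // [D]
    }
}
```

In the diamond example, D is not released when B finishes because C is still pending; it is released only after both branches complete.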
The call chain eventually reaches the method notifyProcessHostUpdate(TaskInstance):
// Sends a command body through Netty. I don't quite understand why the method has this name;
// if you have figured it out, please leave a comment below.
// (Judging by the body, it sends a HostUpdateCommand carrying this master's address
// to the host recorded on the task instance.)
private void notifyProcessHostUpdate(TaskInstance taskInstance) {
    if (StringUtils.isEmpty(taskInstance.getHost())) {
        return;
    }

    try {
        HostUpdateCommand hostUpdateCommand = new HostUpdateCommand();
        hostUpdateCommand.setProcessHost(NetUtils.getAddr(masterConfig.getListenPort()));
        hostUpdateCommand.setTaskInstanceId(taskInstance.getId());
        Host host = new Host(taskInstance.getHost());
        // this is where the NettyClient wrapper object is actually called to send the command
        nettyExecutorManager.doExecute(host, hostUpdateCommand.convert2Command());
    } catch (Exception e) {
        logger.error("notify process host update", e);
    }
}
Continuing into NettyExecutorManager.doExecute:
public void doExecute(final Host host, final Command command) throws ExecuteException {
    /**
     * retry count,default retry 3
     */
    int retryCount = 3;
    boolean success = false;
    do {
        try {
            // This object is the Netty client.
            // As you can see, the send call wrapped by the Netty client basically takes
            // a Host (the target) plus a Command (the command to send).
            nettyRemotingClient.send(host, command);
            success = true;
        } catch (Exception ex) {
            logger.error(String.format("send command : %s to %s error", command, host), ex);
            retryCount--;
            ThreadUtils.sleep(100);
        }
    } while (retryCount >= 0 && !success);

    if (!success) {
        throw new ExecuteException(String.format("send command : %s to %s error", command, host));
    }
}
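One detail worth checking in doExecute is how many sends it can make. Because retryCount starts at 3 and the loop condition is retryCount >= 0, the body runs once and then up to three more times, so a persistently failing send is attempted four times before ExecuteException is thrown. The sketch below reproduces only the loop shape to count attempts. RetryLoopSketch and the BooleanSupplier standing in for the send call are hypothetical, and the 100 ms sleep is left out.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: same do/while shape as doExecute, with the send replaced
// by a supplier that returns true on success and false on failure.
class RetryLoopSketch {

    // Returns how many times `send` was invoked before the loop ended.
    static int attemptsUntilDone(BooleanSupplier send) {
        int retryCount = 3;
        boolean success = false;
        int attempts = 0;
        do {
            attempts++;
            if (send.getAsBoolean()) {
                success = true;
            } else {
                retryCount--; // mirrors the catch block in doExecute
            }
        } while (retryCount >= 0 && !success);
        return attempts;
    }

    public static void main(String[] args) {
        // Always failing: 1 initial attempt + 3 retries.
        System.out.println(attemptsUntilDone(() -> false)); // 4

        // Fails once, then succeeds.
        int[] calls = {0};
        System.out.println(attemptsUntilDone(() -> ++calls[0] >= 2)); // 2
    }
}
```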
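At this point we have worked through the main logic of MasterServer, as follows:
A Command record is produced either by Quartz scheduling or by an operation in the web UI.
MasterSchedulerService periodically queries the Command table, resolves a Command into a ProcessInstance, wraps it in a WorkflowExecuteThread, and hands it to the thread pool.
Inside WorkflowExecuteThread, the ProcessInstance is resolved into the corresponding TaskInstance objects, which are then added to the readyToSubmitTaskQueue queue.
NettyExecutorManager reads that queue and submits each TaskInstance to the corresponding WorkerServer.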