DataX Series: Learning the Distributed Task Scheduling Framework xxl-job

References:

https://juejin.cn/post/6938034809197297694

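Preface

The DataX-Web UI manages and schedules DataX plugins, and the underlying framework that DataX-Web uses to schedule them is the distributed task scheduling framework XXL-Job. These notes draw on another author's write-up (see References) and are for study purposes only.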

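1. Job Types

xxl-job supports seven job types:

Bean, GLUE(Java), GLUE(Shell), GLUE(Python), GLUE(PHP), GLUE(Nodejs), GLUE(PowerShell)

For every GLUE type, the business code is edited in the admin console. A Bean-type job instead integrates the user's business logic into xxl-job for scheduling; its source code lives in the user's project, not in the xxl-job admin module.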

xxl-job abstracts an IJobHandler component that executes jobs. It has three implementations (see the figure below):

[Figure 1]

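MethodJobHandler: the handler for Bean-type jobs. The logic of a Bean-type job is wrapped in a Method annotated with @XxlJob.

ScriptJobHandler: the handler for script-type jobs. Shell, Python, PHP, Nodejs, PowerShell and the like can all be regarded as script-type jobs and use this handler.

GlueJobHandler: the handler dedicated to GLUE(Java) jobs. As analyzed in the previous section, a Java-type job is compiled and instantiated by GlueFactory, then wrapped in a GlueJobHandler for execution.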
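The idea behind MethodJobHandler can be sketched without depending on xxl-job itself. In the minimal sketch below, every name (DemoJob, DemoJobHandler, DemoMethodJobHandler, DemoHandlerRegistry, SampleJobs) is hypothetical and stands in for the real classes. It only illustrates the general technique of discovering an annotated method by reflection and wrapping it behind a common handler interface; it is not the framework's actual implementation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for xxl-job's @XxlJob annotation (illustration only).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface DemoJob {
    String value();
}

// Simplified handler abstraction; an assumption, not the real IJobHandler.
abstract class DemoJobHandler {
    abstract String execute(String param);
}

// Wraps an annotated method so it can be invoked through the handler abstraction.
class DemoMethodJobHandler extends DemoJobHandler {
    private final Object target;
    private final Method method;

    DemoMethodJobHandler(Object target, Method method) {
        this.target = target;
        this.method = method;
    }

    @Override
    String execute(String param) {
        try {
            return (String) method.invoke(target, param);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("job method failed", e);
        }
    }
}

// User-side business code: the job logic lives in an annotated method.
class SampleJobs {
    @DemoJob("helloJob")
    public String hello(String param) {
        return "hello " + param;
    }
}

class DemoHandlerRegistry {
    private final Map<String, DemoJobHandler> handlers = new HashMap<>();

    // Scan a bean for @DemoJob methods and register one handler per method.
    void register(Object bean) {
        for (Method m : bean.getClass().getDeclaredMethods()) {
            DemoJob job = m.getAnnotation(DemoJob.class);
            if (job != null) {
                m.setAccessible(true);
                handlers.put(job.value(), new DemoMethodJobHandler(bean, m));
            }
        }
    }

    // Returns null when no handler was registered under the given name.
    DemoJobHandler get(String name) {
        return handlers.get(name);
    }

    public static void main(String[] args) {
        DemoHandlerRegistry registry = new DemoHandlerRegistry();
        registry.register(new SampleJobs());
        System.out.println(registry.get("helloJob").execute("xxl-job")); // prints "hello xxl-job"
    }
}
```

Keeping the job logic in an ordinary method means the business code stays in the user's project, while the scheduler only needs the handler name to find and invoke it.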

2. Execution Flow

Server-side flow

On the server side, the entry point that triggers job execution is JobTriggerPoolHelper#addTrigger:

public void addTrigger(final int jobId,
                       final TriggerTypeEnum triggerType,
                       final int failRetryCount,
                       final String executorShardingParam,
                       final String executorParam,
                       final String addressList) {

    // Choose one of two thread pools for this trigger.
    // fastTriggerPool: the default pool.
    // slowTriggerPool: the pool used for slow jobs.
    // Slow job: triggering took more than 500 ms, more than 10 times within one minute
    // (the counters are reset every minute); later triggers of that job use slowTriggerPool.
    ThreadPoolExecutor triggerPool_ = fastTriggerPool;
    AtomicInteger jobTimeoutCount = jobTimeoutCountMap.get(jobId);
    if (jobTimeoutCount != null && jobTimeoutCount.get() > 10) {
        // job-timeout 10 times in 1 min
        triggerPool_ = slowTriggerPool;
    }

    // trigger
    triggerPool_.execute(new Runnable() {
        @Override
        public void run() {

            long start = System.currentTimeMillis();

            try {
                // Trigger the job
                XxlJobTrigger.trigger(jobId, triggerType, failRetryCount, executorShardingParam, executorParam, addressList);
            } catch (Exception e) {
                logger.error(e.getMessage(), e);
            } finally {

                // Clear the slow-job counters once per minute
                long minTim_now = System.currentTimeMillis() / 60000;
                if (minTim != minTim_now) {
                    minTim = minTim_now;
                    jobTimeoutCountMap.clear();
                }

                // If triggering took more than 500 ms, increment the job's slow counter.
                // The executor works asynchronously: a dispatched job is queued on the executor
                // and the call returns at once, so this cost excludes the job's own run time.
                long cost = System.currentTimeMillis() - start;
                if (cost > 500) {
                    // job-timeout threshold 500ms
                    AtomicInteger timeoutCount = jobTimeoutCountMap.putIfAbsent(jobId, new AtomicInteger(1));
                    if (timeoutCount != null) {
                        timeoutCount.incrementAndGet();
                    }
                }
            }

        }
    });
}
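To make the fast/slow pool rule easier to reason about, the counting logic from addTrigger can be pulled out into a small standalone class. This is only a sketch: SlowJobTracker, choosePool and record are hypothetical names that do not exist in xxl-job, and the current time is passed in explicitly so the behaviour can be checked without waiting for a real minute to elapse.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper that isolates the fast/slow pool decision shown in addTrigger.
class SlowJobTracker {
    static final long SLOW_THRESHOLD_MS = 500; // a trigger slower than this counts as a timeout
    static final int SLOW_COUNT_LIMIT = 10;    // more than this many timeouts in a minute => slow job

    private final ConcurrentMap<Integer, AtomicInteger> timeoutCounts = new ConcurrentHashMap<>();
    private volatile long currentMinute;

    SlowJobTracker(long nowMs) {
        this.currentMinute = nowMs / 60000;
    }

    // Returns "slow" once a job has exceeded the timeout limit in the current minute, else "fast".
    String choosePool(int jobId) {
        AtomicInteger count = timeoutCounts.get(jobId);
        return (count != null && count.get() > SLOW_COUNT_LIMIT) ? "slow" : "fast";
    }

    // Called after each trigger with the measured cost and the current time.
    void record(int jobId, long costMs, long nowMs) {
        // Reset all counters when the minute changes.
        long minute = nowMs / 60000;
        if (minute != currentMinute) {
            currentMinute = minute;
            timeoutCounts.clear();
        }
        // Count the trigger as a timeout if it was slower than the threshold.
        if (costMs > SLOW_THRESHOLD_MS) {
            AtomicInteger existing = timeoutCounts.putIfAbsent(jobId, new AtomicInteger(1));
            if (existing != null) {
                existing.incrementAndGet();
            }
        }
    }
}
```

Note that the comparison is strictly greater than 10, so a job moves to the slow pool only after its eleventh slow trigger within the same minute, and it returns to the fast pool as soon as the counters are cleared.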

Tracing further down from XxlJobTrigger#trigger, we reach processTrigger:

private static void processTrigger(XxlJobGroup group, XxlJobInfo jobInfo, int finalFailRetryCount, TriggerTypeEnum triggerType, int index, int total){

    // Block strategy (defaults to SERIAL_EXECUTION)
    ExecutorBlockStrategyEnum blockStrategy = ExecutorBlockStrategyEnum.match(jobInfo.getExecutorBlockStrategy(), ExecutorBlockStrategyEnum.SERIAL_EXECUTION);
    // Route strategy
    ExecutorRouteStrategyEnum executorRouteStrategyEnum = ExecutorRouteStrategyEnum.match(jobInfo.getExecutorRouteStrategy(), null);    // route strategy
    // Sharding parameter, formatted as "index/total"; only set for SHARDING_BROADCAST
    String shardingParam = (ExecutorRouteStrategyEnum.SHARDING_BROADCAST==executorRouteStrategyEnum)?String.valueOf(index).concat("/").concat(String.valueOf(total)):null;

    // 1、save log-id
    XxlJobLog jobLog = new XxlJobLog();
    jobLog.setJobGroup(jobInfo.getJobGroup());
    jobLog.setJobId(jobInfo.getId());
    jobLog.setTriggerTime(new Date());
    // Insert the run log into the xxl_job_log table
    XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().save(jobLog);
    logger.debug(">>>>>>>>>>> xxl-job trigger start, jobId:{}", jobLog.getId());

    // 2、init trigger-param
    TriggerParam triggerParam = new TriggerParam();
    triggerParam.setJobId(jobInfo.getId());
    triggerParam.setExecutorHandler(jobInfo.getExecutorHandler());
    triggerParam.setExecutorParams(jobInfo.getExecutorParam());
    triggerParam.setExecutorBlockStrategy(jobInfo.getExecutorBlockStrategy());
    triggerParam.setExecutorTimeout(jobInfo.getExecutorTimeout());
    triggerParam.setLogId(jobLog.getId());
    triggerParam.setLogDateTime(jobLog.getTriggerTime().getTime());
    triggerParam.setGlueType(jobInfo.getGlueType());
    triggerParam.setGlueSource(jobInfo.getGlueSource());
    triggerParam.setGlueUpdatetime(jobInfo.getGlueUpdatetime().getTime());
    triggerParam.setBroadcastIndex(index);
    triggerParam.setBroadcastTotal(total);

    // Initialize the executor address
    String address = null;
    ReturnT<String> routeAddressResult = null;
    if (group.getRegistryList()!=null && !group.getRegistryList().isEmpty()) 
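The shardingParam built near the top of processTrigger is a plain "index/total" string that is set only for the SHARDING_BROADCAST route strategy. The sketch below shows how such a string could be built and consumed. ShardingParamDemo and its methods are hypothetical, and the modulo-based ownsItem rule is merely one common way for a shard to pick its share of the work; it is an assumption, not something taken from the xxl-job source.

```java
// Hypothetical helper illustrating the "index/total" sharding parameter format.
class ShardingParamDemo {

    // Mirrors the expression in processTrigger: null unless the route strategy is sharding broadcast.
    static String build(boolean shardingBroadcast, int index, int total) {
        return shardingBroadcast ? index + "/" + total : null;
    }

    // Splits "index/total" back into its two integers (assumes a well-formed input).
    static int[] parse(String shardingParam) {
        String[] parts = shardingParam.split("/");
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    // One possible work-splitting rule (assumption): a shard handles the items whose id maps to its index.
    static boolean ownsItem(int itemId, int index, int total) {
        return itemId % total == index;
    }
}
```

For example, with a total of 5 shards, the shard with index 2 would handle item 7 under this rule but not item 8.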
