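Quartz is one of the most widely used job scheduling frameworks. I have recently been working on distributed job scheduling, so I spent a weekend reading the Quartz source code and came away with a reasonable understanding of its basic architecture.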
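Interfaces
The main Quartz interfaces are:
SchedulerFactory: performs the various initialization steps and acts as the factory for Scheduler instances.
Scheduler: the scheduler interface, responsible for creating, modifying, deleting, and scheduling jobs.
Job: the job interface; a custom job implements its execute method.
JobDetail: the job metadata interface, covering the name, group, job class, JobKey, and so on.
Trigger: the trigger interface, covering the name, group, job name, job group, job data, and so on. It has two main sub-interfaces, SimpleTrigger and CronTrigger. SimpleTrigger handles jobs that run at a fixed interval for a given number of repeats; CronTrigger handles jobs whose schedule is defined by a cron expression.
ThreadPool: the thread pool interface. Its implementations include SimpleThreadPool and ZeroSizeThreadPool.
JobStore: the job store. Its implementations include RAMJobStore, JobStoreCMT, and TerracottaJobStore. RAMJobStore keeps job data in memory, JobStoreCMT is a JDBC-backed store that participates in container-managed transactions, and TerracottaJobStore keeps job data in Terracotta.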
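To make the SimpleTrigger entry above more concrete, here is a minimal sketch of how a fixed-interval, fixed-repeat schedule maps to fire times. SimpleFireTimes and compute are hypothetical names, not Quartz classes, and the sketch assumes that a repeat count of n means one initial firing plus n repeats.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Quartz: models SimpleTrigger-style firing.
class SimpleFireTimes {

    /**
     * Returns the fire times (epoch millis) of a schedule that first fires at
     * startMillis and then repeats repeatCount more times, intervalMillis apart.
     * Assumption: repeatCount counts the repeats, so there are repeatCount + 1 firings.
     */
    static List<Long> compute(long startMillis, long intervalMillis, int repeatCount) {
        if (intervalMillis <= 0 || repeatCount < 0) {
            throw new IllegalArgumentException("interval must be > 0 and repeatCount >= 0");
        }
        List<Long> times = new ArrayList<>();
        for (int i = 0; i <= repeatCount; i++) {
            times.add(startMillis + i * intervalMillis);
        }
        return times;
    }

    public static void main(String[] args) {
        // prints [1000, 1500, 2000, 2500]
        System.out.println(compute(1_000L, 500L, 3));
    }
}
```

A CronTrigger cannot be reduced to this kind of arithmetic, because its next fire time has to be derived from the calendar fields of the cron expression.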
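Class diagrams
1. Scheduler-related classes
StdScheduler and RemoteScheduler are proxies for QuartzScheduler: they expose all of its methods, but the real work is done by QuartzScheduler. The QuartzScheduler constructor creates the main scheduling thread and hands it to the ThreadExecutor to start.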
public QuartzScheduler(QuartzSchedulerResources resources, long idleWaitTime, @Deprecated long dbRetryInterval)
        throws SchedulerException {
    this.resources = resources;
    if (resources.getJobStore() instanceof JobListener) {
        addInternalJobListener((JobListener)resources.getJobStore());
    }
    // create the main scheduling thread and start it through the ThreadExecutor
    this.schedThread = new QuartzSchedulerThread(this, resources);
    ThreadExecutor schedThreadExecutor = resources.getThreadExecutor();
    schedThreadExecutor.execute(this.schedThread);
    if (idleWaitTime > 0) {
        this.schedThread.setIdleWaitTime(idleWaitTime);
    }
    jobMgr = new ExecutingJobsManager();
    addInternalJobListener(jobMgr);
    errLogger = new ErrorLogger();
    addInternalSchedulerListener(errLogger);
    signaler = new SchedulerSignalerImpl(this, this.schedThread);
    getLog().info("Quartz Scheduler v." + getVersion() + " created.");
}
2. Job- and trigger-related classes
JobDetail is implemented by JobDetailImpl.
SimpleTrigger is implemented by SimpleTriggerImpl.
CronTrigger is implemented by CronTriggerImpl.
3. Thread pool classes
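The scheduling loop shown later relies on two ThreadPool operations, blockForAvailableThreads() and runInThread(). The following is a rough stdlib-only sketch of that contract, not the SimpleThreadPool implementation. TinyThreadPool is a hypothetical class, and unlike a real pool this sketch simply rejects a task when no worker is free.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model of the two ThreadPool operations used by the scheduling loop.
class TinyThreadPool {
    private final Object lock = new Object();
    private final ExecutorService workers;
    private int available;

    TinyThreadPool(int threadCount) {
        this.workers = Executors.newFixedThreadPool(threadCount);
        this.available = threadCount;
    }

    /** Blocks until at least one worker is free, then returns the number of free workers. */
    int blockForAvailableThreads() {
        synchronized (lock) {
            while (available < 1) {
                try {
                    lock.wait(500L);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break; // give up waiting; the caller sees 0
                }
            }
            return available;
        }
    }

    /** Runs the task on a free worker; returns false if none is free (a simplification). */
    boolean runInThread(Runnable task) {
        if (task == null) {
            return false;
        }
        synchronized (lock) {
            if (available < 1) {
                return false;
            }
            available--;
        }
        workers.execute(() -> {
            try {
                task.run();
            } finally {
                synchronized (lock) {
                    available++;      // the worker is free again
                    lock.notifyAll(); // wake anyone blocked in blockForAvailableThreads()
                }
            }
        });
        return true;
    }

    void shutdown() {
        workers.shutdown();
    }

    /** Small self-check: two workers, three submissions, the third is rejected. */
    static boolean demo() {
        TinyThreadPool pool = new TinyThreadPool(2);
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocked = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        boolean first = pool.runInThread(blocked);
        boolean second = pool.runInThread(blocked);
        boolean third = pool.runInThread(blocked); // both workers are busy
        release.countDown();
        int free = pool.blockForAvailableThreads(); // returns once a worker finishes
        pool.shutdown();
        return first && second && !third && free >= 1;
    }
}
```

Because the scheduling thread calls blockForAvailableThreads() before acquiring triggers, it never asks for more triggers than it has workers to run them.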
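Key logic
1. SchedulerFactory initialization
- Initialize the configuration: read the configuration file and parse its properties.
- Initialize the ClassLoader
- Initialize remoteJmxScheduler
- Initialize the ThreadPool
- Initialize the JobStore
- Initialize the DataSources
- Initialize the SchedulerPlugins
- Initialize the JobListeners
- Initialize the TriggerListeners
- Initialize the ThreadExecutor
- Initialize QuartzScheduler and construct the Scheduler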
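The first step in the list above, reading and parsing configuration properties, can be sketched with java.util.Properties. The property keys below are ones Quartz reads from quartz.properties; the class FactoryConfigSketch, its fields, and the fallback values are illustrative assumptions rather than the factory's real code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical holder for a few settings a scheduler factory would parse.
class FactoryConfigSketch {
    final String instanceName;
    final String threadPoolClass;
    final int threadCount;
    final String jobStoreClass;

    FactoryConfigSketch(Properties props) {
        // Fallback values here are placeholders chosen for this sketch.
        instanceName = props.getProperty("org.quartz.scheduler.instanceName", "QuartzScheduler");
        threadPoolClass = props.getProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
        threadCount = Integer.parseInt(props.getProperty("org.quartz.threadPool.threadCount", "10").trim());
        jobStoreClass = props.getProperty("org.quartz.jobStore.class", "org.quartz.simpl.RAMJobStore");
    }

    /** Parses properties-file text into a config object. */
    static FactoryConfigSketch parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("unreadable properties", e);
        }
        return new FactoryConfigSketch(props);
    }

    public static void main(String[] args) {
        FactoryConfigSketch cfg = parse(
                "org.quartz.scheduler.instanceName=demo\n"
              + "org.quartz.threadPool.threadCount=3\n");
        System.out.println(cfg.instanceName + " uses " + cfg.threadCount + " threads and " + cfg.jobStoreClass);
    }
}
```

The remaining steps follow the same pattern: read a class name from the properties, instantiate it, and pass the instance to the resources object that QuartzScheduler is built from.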
2. Processing logic of the main thread, QuartzSchedulerThread
/**
 * The main processing loop of the QuartzSchedulerThread.
 */
@Override
public void run() {
int acquiresFailed = 0;
while (!halted.get()) {
try {
// check if we're supposed to pause...
// wait on the signal lock
synchronized (sigLock) {
// while paused and not halted
while (paused && !halted.get()) {
try {
// wait until togglePause(false) is called...
// wait for up to 1 second
sigLock.wait(1000L);
} catch (InterruptedException ignore) {
}
// reset failure counter when paused, so that we don't
// wait again after unpausing
acquiresFailed = 0;
}
if (halted.get()) {
break;
}
}
// wait a bit, if reading from job store is consistently
// failing (e.g. DB is down or restarting)..
if (acquiresFailed > 1) {
try {
long delay = computeDelayForRepeatedErrors(qsRsrcs.getJobStore(), acquiresFailed);
Thread.sleep(delay);
} catch (Exception ignore) {
}
}
int availThreadCount = qsRsrcs.getThreadPool().blockForAvailableThreads();
if(availThreadCount > 0) { // will always be true, due to semantics of blockForAvailableThreads...
List<OperableTrigger> triggers;
// capture the current timestamp
long now = System.currentTimeMillis();
clearSignaledSchedulingChange();
try {
// acquire the triggers that are due to fire soon; the returned list is sorted by fire time
triggers = qsRsrcs.getJobStore().acquireNextTriggers(
now + idleWaitTime, Math.min(availThreadCount, qsRsrcs.getMaxBatchSize()), qsRsrcs.getBatchTimeWindow());
acquiresFailed = 0;
if (log.isDebugEnabled())
log.debug("batch acquisition of " + (triggers == null ? 0 : triggers.size()) + " triggers");
} catch (JobPersistenceException jpe) {
if (acquiresFailed == 0) {
qs.notifySchedulerListenersError(
"An error occurred while scanning for the next triggers to fire.",
jpe);
}
if (acquiresFailed < Integer.MAX_VALUE)
acquiresFailed++;
continue;
} catch (RuntimeException e) {
if (acquiresFailed == 0) {
getLog().error("quartzSchedulerThreadLoop: RuntimeException "
+e.getMessage(), e);
}
if (acquiresFailed < Integer.MAX_VALUE)
acquiresFailed++;
continue;
}
if (triggers != null && !triggers.isEmpty()) {
// refresh the current timestamp
now = System.currentTimeMillis();
// fire time of the first (earliest) trigger
long triggerTime = triggers.get(0).getNextFireTime().getTime();
// time remaining from now until that trigger is due
long timeUntilTrigger = triggerTime - now;
while(timeUntilTrigger > 2) {
synchronized (sigLock) {
if (halted.get()) {
break;
}
if (!isCandidateNewTimeEarlierWithinReason(triggerTime, false)) {
try {
// we could have blocked a long while
// on 'synchronize', so we must recompute
now = System.currentTimeMillis();
// recompute the remaining time
timeUntilTrigger = triggerTime - now;
if(timeUntilTrigger >= 1)
// wait until the trigger's fire time arrives
sigLock.wait(timeUntilTrigger);
} catch (InterruptedException ignore) {
}
}
}
if(releaseIfScheduleChangedSignificantly(triggers, triggerTime)) {
break;
}
now = System.currentTimeMillis();
timeUntilTrigger = triggerTime - now;
}
// this happens if releaseIfScheduleChangedSignificantly decided to release triggers
if(triggers.isEmpty())
continue;
// set triggers to 'executing'
List<TriggerFiredResult> bndles = new ArrayList<TriggerFiredResult>();
boolean goAhead = true;
synchronized(sigLock) {
goAhead = !halted.get();
}
if(goAhead) {
try {
// mark the triggers as fired and collect one TriggerFiredResult per trigger
List<TriggerFiredResult> res = qsRsrcs.getJobStore().triggersFired(triggers);
if(res != null)
bndles = res;
} catch (SchedulerException se) {
qs.notifySchedulerListenersError(
"An error occurred while firing triggers '"
+ triggers + "'", se);
//QTZ-179 : a problem occurred interacting with the triggers from the db
//we release them and loop again
for (int i = 0; i < triggers.size(); i++) {
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
}
continue;
}
}
for (int i = 0; i < bndles.size(); i++) {
TriggerFiredResult result = bndles.get(i);
TriggerFiredBundle bndle = result.getTriggerFiredBundle();
Exception exception = result.getException();
if (exception instanceof RuntimeException) {
getLog().error("RuntimeException while firing trigger " + triggers.get(i), exception);
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
continue;
}
// it's possible to get 'null' if the triggers was paused,
// blocked, or other similar occurrences that prevent it being
// fired at this time... or if the scheduler was shutdown (halted)
if (bndle == null) {
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
continue;
}
JobRunShell shell = null;
try {
// create the JobRunShell
shell = qsRsrcs.getJobRunShellFactory().createJobRunShell(bndle);
// initialize the JobRunShell
shell.initialize(qs);
} catch (SchedulerException se) {
qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
continue;
}
// run the job on the thread pool
if (qsRsrcs.getThreadPool().runInThread(shell) == false) {
// this case should never happen, as it is indicative of the
// scheduler being shutdown or a bug in the thread pool or
// a thread pool being used concurrently - which the docs
// say not to do...
getLog().error("ThreadPool.runInThread() return false!");
qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
}
}
continue; // while (!halted)
}
} else { // if(availThreadCount > 0)
// should never happen, if threadPool.blockForAvailableThreads() follows contract
continue; // while (!halted)
}
long now = System.currentTimeMillis();
long waitTime = now + getRandomizedIdleWaitTime();
long timeUntilContinue = waitTime - now;
synchronized(sigLock) {
try {
if(!halted.get()) {
// QTZ-336 A job might have been completed in the mean time and we might have
// missed the scheduled changed signal by not waiting for the notify() yet
// Check that before waiting for too long in case this very job needs to be
// scheduled very soon
if (!isScheduleChanged()) {
sigLock.wait(timeUntilContinue);
}
}
} catch (InterruptedException ignore) {
}
}
} catch(RuntimeException re) {
getLog().error("Runtime error occurred in main trigger firing loop.", re);
}
} // while (!halted)
// drop references to scheduler stuff to aid garbage collection...
qs = null;
qsRsrcs = null;
}
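The main scheduling thread owns a lock object, sigLock. Its wait and notifyAll methods are what park and wake the thread.
The main thread runs an infinite loop; when it receives an external halt signal, it stops and exits.
In each iteration, the thread acquires the list of triggers that are about to fire. The list is already sorted by fire time, so the thread takes the first entry (the trigger that fires earliest), computes the difference between that fire time and the current time, and calls sigLock.wait() for that long, that is, until the trigger's fire time. Once the time is up, the trigger is fired.
Jobs are executed through JobRunShell: each firing creates a JobRunShell, which is then run on the thread pool via ThreadPool.runInThread(shell).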
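The wait-until-due idea described above can be reduced to a few lines. This is a simplified sketch, not the Quartz loop: MiniSchedulerLoop and Fire are hypothetical names, a PriorityQueue stands in for the JobStore, firing only records the name instead of handing a JobRunShell to a thread pool, and schedule-change signalling is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical miniature of the "sleep until the earliest trigger is due" loop.
class MiniSchedulerLoop {

    /** Stand-in for a trigger: a name plus its next fire time in epoch millis. */
    static final class Fire implements Comparable<Fire> {
        final String name;
        final long fireAtMillis;

        Fire(String name, long fireAtMillis) {
            this.name = name;
            this.fireAtMillis = fireAtMillis;
        }

        @Override
        public int compareTo(Fire other) {
            return Long.compare(fireAtMillis, other.fireAtMillis);
        }
    }

    private final Object sigLock = new Object();
    private final PriorityQueue<Fire> queue = new PriorityQueue<>();
    private final List<String> fired = new ArrayList<>();

    void add(Fire fire) {
        synchronized (sigLock) {
            queue.add(fire);
        }
    }

    /** Fires everything in the queue in fire-time order and returns the names. */
    List<String> runUntilEmpty() {
        synchronized (sigLock) {
            while (!queue.isEmpty()) {
                long timeUntilTrigger = queue.peek().fireAtMillis - System.currentTimeMillis();
                if (timeUntilTrigger > 0) {
                    try {
                        sigLock.wait(timeUntilTrigger); // sleep until the earliest trigger is due
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                    continue; // recompute: we may have woken early
                }
                fired.add(queue.poll().name); // "fire" the trigger
            }
            return new ArrayList<>(fired);
        }
    }

    static List<String> demo() {
        MiniSchedulerLoop loop = new MiniSchedulerLoop();
        long now = System.currentTimeMillis();
        loop.add(new Fire("late", now + 120));
        loop.add(new Fire("early", now + 20));
        loop.add(new Fire("middle", now + 70));
        return loop.runUntilEmpty(); // [early, middle, late]
    }
}
```

Waiting on a lock with a timeout, rather than calling Thread.sleep, is what lets another thread cut the wait short when the schedule changes.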
3. Next, let's look at how JobRunShell processes a job. Its run method is as follows:
public void run() {
qs.addInternalSchedulerListener(this);
try {
// get the trigger and the JobDetail
OperableTrigger trigger = (OperableTrigger) jec.getTrigger();
JobDetail jobDetail = jec.getJobDetail();
do {
JobExecutionException jobExEx = null;
// get the job instance
Job job = jec.getJobInstance();
try {
// begin(): a no-op in the base JobRunShell
begin();
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't begin execution.", se);
break;
}
// notify job & trigger listeners...
try {
// tell the job listeners and trigger listeners that execution is about to start
if (!notifyListenersBeginning(jec)) {
break;
}
} catch(VetoedException ve) {
try {
CompletedExecutionInstruction instCode = trigger.executionComplete(jec, null);
qs.notifyJobStoreJobVetoed(trigger, jobDetail, instCode);
// QTZ-205
// Even if trigger got vetoed, we still needs to check to see if it's the trigger's finalized run or not.
if (jec.getTrigger().getNextFireTime() == null) {
qs.notifySchedulerListenersFinalized(jec.getTrigger());
}
complete(true);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error during veto of Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
}
break;
}
// record the start time
long startTime = System.currentTimeMillis();
long endTime = startTime;
// execute the job
try {
log.debug("Calling execute on job " + jobDetail.getKey());
// invoke the user's Job implementation
job.execute(jec);
// record the end time
endTime = System.currentTimeMillis();
} catch (JobExecutionException jee) {
endTime = System.currentTimeMillis();
jobExEx = jee;
getLog().info("Job " + jobDetail.getKey() +
" threw a JobExecutionException: ", jobExEx);
} catch (Throwable e) {
endTime = System.currentTimeMillis();
getLog().error("Job " + jobDetail.getKey() +
" threw an unhandled Exception: ", e);
SchedulerException se = new SchedulerException(
"Job threw an unhandled exception.", e);
qs.notifySchedulerListenersError("Job ("
+ jec.getJobDetail().getKey()
+ " threw an exception.", se);
jobExEx = new JobExecutionException(se, false);
}
jec.setJobRunTime(endTime - startTime);
// notify all job listeners
// tell them that the job execution has completed
if (!notifyJobListenersComplete(jec, jobExEx)) {
break;
}
CompletedExecutionInstruction instCode = CompletedExecutionInstruction.NOOP;
// update the trigger
try {
// ask the trigger for the completion instruction
instCode = trigger.executionComplete(jec, jobExEx);
} catch (Exception e) {
// If this happens, there's a bug in the trigger...
SchedulerException se = new SchedulerException(
"Trigger threw an unhandled exception.", e);
qs.notifySchedulerListenersError(
"Please report this error to the Quartz developers.",
se);
}
// notify all trigger listeners
// tell them that the trigger has completed
if (!notifyTriggerListenersComplete(jec, instCode)) {
break;
}
// update job/trigger or re-execute job
if (instCode == CompletedExecutionInstruction.RE_EXECUTE_JOB) {
jec.incrementRefireCount();
try {
complete(false);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
}
continue;
}
try {
// complete(): a no-op in the base JobRunShell
complete(true);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
continue;
}
// notify the JobStore that the job is complete
qs.notifyJobStoreJobComplete(trigger, jobDetail, instCode);
break;
} while (true);
} finally {
// remove this shell from the scheduler's internal listeners
qs.removeInternalSchedulerListener(this);
}
}
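Before the job runs, JobRunShell calls begin(), which is an empty method, and then notifies the job listeners and trigger listeners that the job is about to start. It then executes the job. When execution finishes, it notifies the job listeners that the job has completed and the trigger listeners that the trigger has completed, and then calls complete(), which is also an empty method. Finally, it notifies the JobStore that the job is complete and removes itself from the scheduler's internal listeners.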
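The sequence just described (notify, execute, capture any failure, notify again) can be illustrated with a stripped-down sketch. RunShellSketch, Listener, and SimpleJob are hypothetical types invented for this example; the real JobRunShell also handles vetoes, re-execution, and the trigger's completion instruction, all of which are omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified model of the JobRunShell life cycle.
class RunShellSketch {

    /** Stand-in for Quartz's separate job and trigger listener interfaces. */
    interface Listener {
        void beginning(String jobName);
        void completed(String jobName, Exception error);
    }

    /** Stand-in for org.quartz.Job. */
    interface SimpleJob {
        void execute() throws Exception;
    }

    static void run(String jobName, SimpleJob job, List<Listener> listeners) {
        for (Listener listener : listeners) {
            listener.beginning(jobName);
        }
        Exception error = null;
        try {
            job.execute();
        } catch (Exception e) {
            // Captured rather than rethrown, so a failing job cannot kill the worker thread.
            error = e;
        }
        for (Listener listener : listeners) {
            listener.completed(jobName, error);
        }
    }

    static List<String> demo() {
        List<String> events = new ArrayList<>();
        Listener recorder = new Listener() {
            @Override
            public void beginning(String jobName) {
                events.add("begin:" + jobName);
            }

            @Override
            public void completed(String jobName, Exception error) {
                events.add("done:" + jobName + (error == null ? "" : ":failed"));
            }
        };
        List<Listener> listeners = new ArrayList<>();
        listeners.add(recorder);
        run("ok", () -> events.add("exec:ok"), listeners);
        run("bad", () -> { throw new IllegalStateException("boom"); }, listeners);
        return events; // [begin:ok, exec:ok, done:ok, begin:bad, done:bad:failed]
    }
}
```

The point to take away is that listeners are told about completion even when the job throws, which is why the exception is stored and passed along instead of propagating.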
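At this point you may wonder: what happens if a job is added or deleted while the main scheduling thread is waiting for the next trigger to fire?
Let's look at the logic for adding and deleting jobs.
Adding and deleting jobs is handled mainly by QuartzScheduler, through its addJob() and deleteJob() methods: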
/**
 * Add the given Job to the Scheduler - with no associated Trigger.
 * The Job will be 'dormant' until it is scheduled with a Trigger,
 * or Scheduler.triggerJob() is called for it.
 *
 * The Job must by definition be 'durable', if it is not,
 * SchedulerException will be thrown.
 *
 * @throws SchedulerException
 *           if there is an internal Scheduler error, or if the Job is not
 *           durable, or a Job with the same name already exists, and
 *           replace is false.
 */
public void addJob(JobDetail jobDetail, boolean replace) throws SchedulerException {
    addJob(jobDetail, replace, false);
}

public void addJob(JobDetail jobDetail, boolean replace, boolean storeNonDurableWhileAwaitingScheduling) throws SchedulerException {
    validateState();
    if (!storeNonDurableWhileAwaitingScheduling && !jobDetail.isDurable()) {
        throw new SchedulerException(
                "Jobs added with no trigger must be durable.");
    }
    resources.getJobStore().storeJob(jobDetail, replace);
    notifySchedulerThread(0L);
    notifySchedulerListenersJobAdded(jobDetail);
}
/**
 * Delete the identified Job from the Scheduler - and any
 * associated Triggers.
 *
 * @return true if the Job was found and deleted.
 * @throws SchedulerException
 *           if there is an internal Scheduler error.
 */
public boolean deleteJob(JobKey jobKey) throws SchedulerException {
    validateState();
    boolean result = false;
    List<? extends Trigger> triggers = getTriggersOfJob(jobKey);
    for (Trigger trigger : triggers) {
        if (!unscheduleJob(trigger.getKey())) {
            StringBuilder sb = new StringBuilder().append(
                    "Unable to unschedule trigger [").append(
                    trigger.getKey()).append("] while deleting job [")
                    .append(jobKey).append(
                    "]");
            throw new SchedulerException(sb.toString());
        }
        result = true;
    }
    result = resources.getJobStore().removeJob(jobKey) || result;
    if (result) {
        notifySchedulerThread(0L);
        notifySchedulerListenersJobDeleted(jobKey);
    }
    return result;
}
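As we can see, QuartzScheduler first adds the job to, or removes it from, the JobStore. It then calls notifySchedulerThread(), which ends up calling sigLock.notifyAll() on the main thread to wake it, and finally notifies the registered scheduler listeners that a job was added or deleted. So if the set of jobs changes while the main thread is waiting, the main thread is woken up to deal with the change.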
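The wake-up described above boils down to a flag plus notifyAll() under the same lock the waiter uses. Here is a minimal sketch under that assumption; SchedulingSignal and awaitChange are hypothetical names, and only signalSchedulingChange mirrors the name of the method the scheduler thread exposes.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model of how a schedule change wakes a waiting scheduler thread.
class SchedulingSignal {
    private final Object sigLock = new Object();
    private boolean signaled = false;

    /** Called after the job store has been updated (job added or deleted). */
    void signalSchedulingChange() {
        synchronized (sigLock) {
            signaled = true;     // remembered even if nobody is waiting yet
            sigLock.notifyAll(); // wake the scheduler thread if it is waiting
        }
    }

    /** Waits up to maxWaitMillis; returns true if woken by a signal, false on timeout. */
    boolean awaitChange(long maxWaitMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        synchronized (sigLock) {
            while (!signaled) { // checking the flag first avoids a missed notification
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) {
                    return false;
                }
                try {
                    sigLock.wait(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            signaled = false; // consume the signal
            return true;
        }
    }

    /** A second thread "adds a job" while the caller is prepared to wait ten seconds. */
    static boolean demo() {
        SchedulingSignal signal = new SchedulingSignal();
        Thread adder = new Thread(signal::signalSchedulingChange);
        adder.start();
        long start = System.nanoTime();
        boolean woken = signal.awaitChange(10_000L);
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return woken && elapsedMillis < 5_000L;
    }
}
```

Setting the flag inside the synchronized block matters: if the signal arrives just before the scheduler thread starts waiting, the flag check still catches it, which is the situation the QTZ-336 comment in the main loop refers to.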
That completes this walk through the core source code and architecture of the Quartz scheduling framework.