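In Quartz, the Scheduler that manages scheduled jobs is backed by a QuartzScheduler. The thread that manages those jobs, QuartzSchedulerThread, is also started in the QuartzScheduler constructor.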
public QuartzScheduler(QuartzSchedulerResources resources, long idleWaitTime, @Deprecated long dbRetryInterval)
throws SchedulerException {
this.resources = resources;
if (resources.getJobStore() instanceof JobListener) {
addInternalJobListener((JobListener)resources.getJobStore());
}
this.schedThread = new QuartzSchedulerThread(this, resources);
ThreadExecutor schedThreadExecutor = resources.getThreadExecutor();
schedThreadExecutor.execute(this.schedThread);
if (idleWaitTime > 0) {
this.schedThread.setIdleWaitTime(idleWaitTime);
}
jobMgr = new ExecutingJobsManager();
addInternalJobListener(jobMgr);
errLogger = new ErrorLogger();
addInternalSchedulerListener(errLogger);
signaler = new SchedulerSignalerImpl(this, this.schedThread);
if(shouldRunUpdateCheck())
updateTimer = scheduleUpdateCheck();
else
updateTimer = null;
getLog().info("Quartz Scheduler v." + getVersion() + " created.");
}
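The constructor creates the QuartzSchedulerThread, the main thread that manages scheduled jobs, and hands it to a ThreadExecutor, a minimal "thread pool" that Quartz implements itself. This executor is very simple: it just calls start() on the QuartzSchedulerThread, which formally starts the management thread. At this point, however, the QuartzSchedulerThread is not yet doing any real work.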
QuartzSchedulerThread(QuartzScheduler qs, QuartzSchedulerResources qsRsrcs, boolean setDaemon, int threadPrio) {
super(qs.getSchedulerThreadGroup(), qsRsrcs.getThreadName());
this.qs = qs;
this.qsRsrcs = qsRsrcs;
this.setDaemon(setDaemon);
if(qsRsrcs.isThreadsInheritInitializersClassLoadContext()) {
log.info("QuartzSchedulerThread Inheriting ContextClassLoader of thread: " + Thread.currentThread().getName());
this.setContextClassLoader(Thread.currentThread().getContextClassLoader());
}
this.setPriority(threadPrio);
// start the underlying thread, but put this object into the 'paused'
// state
// so processing doesn't start yet...
paused = true;
halted = new AtomicBoolean(false);
}
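In the constructor, paused is initially true, so the thread waits inside run() and does not begin preparing scheduled jobs right away.
The management thread really gets going in QuartzScheduler's start() method, which sets the paused field of QuartzSchedulerThread to false and wakes the thread that has been waiting in run() because of that flag. Once QuartzScheduler's start() has been called, the management of scheduled jobs formally begins.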
@Override
public void run() {
boolean lastAcquireFailed = false;
while (!halted.get()) {
try {
// check if we're supposed to pause...
synchronized (sigLock) {
while (paused && !halted.get()) {
try {
// wait until togglePause(false) is called...
sigLock.wait(1000L);
} catch (InterruptedException ignore) {
}
}
if (halted.get()) {
break;
}
}
int availThreadCount = qsRsrcs.getThreadPool().blockForAvailableThreads();
if(availThreadCount > 0) { // will always be true, due to semantics of blockForAvailableThreads...
List<OperableTrigger> triggers = null;
long now = System.currentTimeMillis();
clearSignaledSchedulingChange();
try {
triggers = qsRsrcs.getJobStore().acquireNextTriggers(
now + idleWaitTime, Math.min(availThreadCount, qsRsrcs.getMaxBatchSize()), qsRsrcs.getBatchTimeWindow());
lastAcquireFailed = false;
if (log.isDebugEnabled())
log.debug("batch acquisition of " + (triggers == null ? 0 : triggers.size()) + " triggers");
} catch (JobPersistenceException jpe) {
if(!lastAcquireFailed) {
qs.notifySchedulerListenersError(
"An error occurred while scanning for the next triggers to fire.",
jpe);
}
lastAcquireFailed = true;
continue;
} catch (RuntimeException e) {
if(!lastAcquireFailed) {
getLog().error("quartzSchedulerThreadLoop: RuntimeException "
+e.getMessage(), e);
}
lastAcquireFailed = true;
continue;
}
if (triggers != null && !triggers.isEmpty()) {
now = System.currentTimeMillis();
long triggerTime = triggers.get(0).getNextFireTime().getTime();
long timeUntilTrigger = triggerTime - now;
while(timeUntilTrigger > 2) {
synchronized (sigLock) {
if (halted.get()) {
break;
}
if (!isCandidateNewTimeEarlierWithinReason(triggerTime, false)) {
try {
// we could have blocked a long while
// on 'synchronize', so we must recompute
now = System.currentTimeMillis();
timeUntilTrigger = triggerTime - now;
if(timeUntilTrigger >= 1)
sigLock.wait(timeUntilTrigger);
} catch (InterruptedException ignore) {
}
}
}
if(releaseIfScheduleChangedSignificantly(triggers, triggerTime)) {
break;
}
now = System.currentTimeMillis();
timeUntilTrigger = triggerTime - now;
}
// this happens if releaseIfScheduleChangedSignificantly decided to release triggers
if(triggers.isEmpty())
continue;
// set triggers to 'executing'
List<TriggerFiredResult> bndles = new ArrayList<TriggerFiredResult>();
boolean goAhead = true;
synchronized(sigLock) {
goAhead = !halted.get();
}
if(goAhead) {
try {
List<TriggerFiredResult> res = qsRsrcs.getJobStore().triggersFired(triggers);
if(res != null)
bndles = res;
} catch (SchedulerException se) {
qs.notifySchedulerListenersError(
"An error occurred while firing triggers '"
+ triggers + "'", se);
//QTZ-179 : a problem occurred interacting with the triggers from the db
//we release them and loop again
for (int i = 0; i < triggers.size(); i++) {
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
}
continue;
}
}
for (int i = 0; i < bndles.size(); i++) {
TriggerFiredResult result = bndles.get(i);
TriggerFiredBundle bndle = result.getTriggerFiredBundle();
Exception exception = result.getException();
if (exception instanceof RuntimeException) {
getLog().error("RuntimeException while firing trigger " + triggers.get(i), exception);
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
continue;
}
// it's possible to get 'null' if the triggers was paused,
// blocked, or other similar occurrences that prevent it being
// fired at this time... or if the scheduler was shutdown (halted)
if (bndle == null) {
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
continue;
}
JobRunShell shell = null;
try {
shell = qsRsrcs.getJobRunShellFactory().createJobRunShell(bndle);
shell.initialize(qs);
} catch (SchedulerException se) {
qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
continue;
}
if (qsRsrcs.getThreadPool().runInThread(shell) == false) {
// this case should never happen, as it is indicative of the
// scheduler being shutdown or a bug in the thread pool or
// a thread pool being used concurrently - which the docs
// say not to do...
getLog().error("ThreadPool.runInThread() return false!");
qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
}
}
continue; // while (!halted)
}
} else { // if(availThreadCount > 0)
// should never happen, if threadPool.blockForAvailableThreads() follows contract
continue; // while (!halted)
}
long now = System.currentTimeMillis();
long waitTime = now + getRandomizedIdleWaitTime();
long timeUntilContinue = waitTime - now;
synchronized(sigLock) {
try {
if(!halted.get()) {
// QTZ-336 A job might have been completed in the mean time and we might have
// missed the scheduled changed signal by not waiting for the notify() yet
// Check that before waiting for too long in case this very job needs to be
// scheduled very soon
if (!isScheduleChanged()) {
sigLock.wait(timeUntilContinue);
}
}
} catch (InterruptedException ignore) {
}
}
} catch(RuntimeException re) {
getLog().error("Runtime error occurred in main trigger firing loop.", re);
}
} // while (!halted)
// drop references to scheduler stuff to aid garbage collection...
qs = null;
qsRsrcs = null;
}
At the top of run() you can see that while paused is true the thread keeps waiting and goes no further until paused is set to false.
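The pause gate can be tried out in isolation. The following is a minimal sketch, not Quartz code: PausableWorker, iterations() and sleepQuietly() are hypothetical names, and the loop body is only a placeholder for the real trigger handling.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for QuartzSchedulerThread; only the pause gate is modeled.
class PausableWorker extends Thread {
    private final Object sigLock = new Object();
    private final AtomicBoolean halted = new AtomicBoolean(false);
    private boolean paused = true;        // start out paused, as the constructor above does
    private volatile int iterations = 0;  // counts passes through the loop body

    void togglePause(boolean pause) {
        synchronized (sigLock) {
            paused = pause;
            sigLock.notifyAll();          // wake the thread waiting in run()
        }
    }

    void halt() {
        synchronized (sigLock) {
            halted.set(true);
            sigLock.notifyAll();
        }
    }

    int iterations() {
        return iterations;
    }

    @Override
    public void run() {
        while (!halted.get()) {
            synchronized (sigLock) {
                while (paused && !halted.get()) {
                    try {
                        sigLock.wait(1000L);   // re-check the flags at least once a second
                    } catch (InterruptedException ignore) {
                    }
                }
                if (halted.get()) {
                    break;
                }
            }
            iterations++;                      // placeholder for "acquire and fire triggers"
            sleepQuietly(5);
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ignore) {
        }
    }

    public static void main(String[] args) {
        PausableWorker worker = new PausableWorker();
        worker.start();                        // thread is running, but parked at the gate
        sleepQuietly(100);
        System.out.println("iterations while paused: " + worker.iterations());
        worker.togglePause(false);             // corresponds to what start() does on the scheduler
        sleepQuietly(100);
        System.out.println("made progress after unpausing: " + (worker.iterations() > 0));
        worker.halt();
    }
}
```

The timed wait(1000L) means a missed notification delays the thread by at most about a second instead of blocking it forever.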
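The loop first obtains the number of available threads in the thread pool that executes jobs. This pool is not the one that just started the management thread. Its implementation is much more complex: it is created when the Scheduler is obtained, and it creates as many threads as the configuration specifies. The threads in this pool are best called worker threads. The available thread count is one input for deciding how many ready-to-run triggers to fetch in one batch.
Next, acquireNextTriggers() is called on the JobStore to fetch, based on the current time, the triggers that are due to fire next. The number fetched is the smaller of the number of available worker threads and the configured maximum batch size. idleWaitTime is passed in as well (as now + idleWaitTime): a trigger whose fire time is further from the current time than this is not acquired.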
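The three arguments passed to acquireNextTriggers() can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: AcquireWindowDemo and its helper names are made up, and the numbers in main() are example values rather than anything read from a real configuration.

```java
// Hypothetical helper that mirrors how the scheduler thread derives its arguments.
class AcquireWindowDemo {
    // At most one trigger per free worker thread, capped by the configured batch size.
    static int batchSize(int availThreadCount, int maxBatchSize) {
        return Math.min(availThreadCount, maxBatchSize);
    }

    // Latest fire time the scheduler thread is willing to look ahead to.
    static long noLaterThan(long now, long idleWaitTime) {
        return now + idleWaitTime;
    }

    // A trigger is a candidate only if it fires inside the look-ahead window.
    static boolean isCandidate(long nextFireTime, long noLaterThan, long timeWindow) {
        return nextFireTime <= noLaterThan + timeWindow;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;          // example timestamp in milliseconds
        long idleWaitTime = 30_000L;    // example value
        long limit = noLaterThan(now, idleWaitTime);

        System.out.println(batchSize(10, 1));                        // prints 1
        System.out.println(isCandidate(now + 5_000L, limit, 0L));    // prints true
        System.out.println(isCandidate(now + 60_000L, limit, 0L));   // prints false
    }
}
```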
The default JobStore implementation is RAMJobStore, so let's look at its acquireNextTriggers() method.
public List<OperableTrigger> acquireNextTriggers(long noLaterThan, int maxCount, long timeWindow) {
synchronized (lock) {
List<OperableTrigger> result = new ArrayList<OperableTrigger>();
Set<JobKey> acquiredJobKeysForNoConcurrentExec = new HashSet<JobKey>();
Set<TriggerWrapper> excludedTriggers = new HashSet<TriggerWrapper>();
long firstAcquiredTriggerFireTime = 0;
// return empty list if store has no triggers.
if (timeTriggers.size() == 0)
return result;
while (true) {
TriggerWrapper tw;
try {
tw = timeTriggers.first();
if (tw == null)
break;
timeTriggers.remove(tw);
} catch (java.util.NoSuchElementException nsee) {
break;
}
if (tw.trigger.getNextFireTime() == null) {
continue;
}
if (applyMisfire(tw)) {
if (tw.trigger.getNextFireTime() != null) {
timeTriggers.add(tw);
}
continue;
}
if (tw.getTrigger().getNextFireTime().getTime() > noLaterThan + timeWindow) {
timeTriggers.add(tw);
break;
}
// If trigger's job is set as @DisallowConcurrentExecution, and it has already been added to result, then
// put it back into the timeTriggers set and continue to search for next trigger.
JobKey jobKey = tw.trigger.getJobKey();
JobDetail job = jobsByKey.get(tw.trigger.getJobKey()).jobDetail;
if (job.isConcurrentExectionDisallowed()) {
if (acquiredJobKeysForNoConcurrentExec.contains(jobKey)) {
excludedTriggers.add(tw);
continue; // go to next trigger in store.
} else {
acquiredJobKeysForNoConcurrentExec.add(jobKey);
}
}
tw.state = TriggerWrapper.STATE_ACQUIRED;
tw.trigger.setFireInstanceId(getFiredTriggerRecordId());
OperableTrigger trig = (OperableTrigger) tw.trigger.clone();
result.add(trig);
if(firstAcquiredTriggerFireTime == 0)
firstAcquiredTriggerFireTime = tw.trigger.getNextFireTime().getTime();
if (result.size() == maxCount)
break;
}
// If we did excluded triggers to prevent ACQUIRE state due to DisallowConcurrentExecution, we need to add them back to store.
if (excludedTriggers.size() > 0)
timeTriggers.addAll(excludedTriggers);
return result;
}
}
As you can see, the trigger whose next fire time is nearest is simply the first element of timeTriggers. Why?
protected TreeSet<TriggerWrapper> timeTriggers = new TreeSet<TriggerWrapper>(new TriggerWrapperComparator());
class TriggerWrapperComparator implements Comparator<TriggerWrapper>, java.io.Serializable {
private static final long serialVersionUID = 8809557142191514261L;
TriggerTimeComparator ttc = new TriggerTimeComparator();
public int compare(TriggerWrapper trig1, TriggerWrapper trig2) {
return ttc.compare(trig1.trigger, trig2.trigger);
}
@Override
public boolean equals(Object obj) {
return (obj instanceof TriggerWrapperComparator);
}
@Override
public int hashCode() {
return super.hashCode();
}
}
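Because timeTriggers is a TreeSet, that is, a red-black tree, built with a Comparator whose compare() method is overridden. Every Trigger that is added is therefore sorted by its next fire time inside the tree, so acquisition can simply take the first element and be sure it is the trigger that fires earliest.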
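The effect of the comparator is easy to see with a stripped-down trigger type. In the sketch below, DemoTrigger and FireTimeOrderDemo are hypothetical names; the tie-breaker on the name is an assumption added only so that two triggers with the same fire time are not treated as duplicates by the TreeSet.

```java
import java.util.Comparator;
import java.util.TreeSet;

class FireTimeOrderDemo {
    // Hypothetical, minimal trigger: just a name and a next fire time in milliseconds.
    static final class DemoTrigger {
        final String name;
        final long nextFireTime;

        DemoTrigger(String name, long nextFireTime) {
            this.name = name;
            this.nextFireTime = nextFireTime;
        }
    }

    // Order by next fire time first, then by name as a tie-breaker.
    static final Comparator<DemoTrigger> BY_FIRE_TIME = (a, b) -> {
        int byTime = Long.compare(a.nextFireTime, b.nextFireTime);
        return byTime != 0 ? byTime : a.name.compareTo(b.name);
    };

    static TreeSet<DemoTrigger> newStore() {
        return new TreeSet<>(BY_FIRE_TIME);
    }

    public static void main(String[] args) {
        TreeSet<DemoTrigger> timeTriggers = newStore();
        timeTriggers.add(new DemoTrigger("nightly", 9_000L));
        timeTriggers.add(new DemoTrigger("heartbeat", 1_000L));
        timeTriggers.add(new DemoTrigger("report", 5_000L));

        // Insertion order does not matter; first() is always the earliest trigger.
        System.out.println(timeTriggers.first().name);   // prints heartbeat

        // Removing and re-adding with a new fire time re-sorts the element.
        DemoTrigger fired = timeTriggers.pollFirst();
        timeTriggers.add(new DemoTrigger(fired.name, 11_000L));
        System.out.println(timeTriggers.first().name);   // prints report
    }
}
```

An element has to be removed before its fire time changes and added back afterwards, because a TreeSet does not notice when the sort key of an element it already holds is modified.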
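Once the earliest trigger in timeTriggers has been fetched, it is also removed from the set so that the next fetch is straightforward. But there is a problem here. A job's next fire time is fixed when it is scheduled with the Scheduler, yet at that moment the management thread may still be waiting because paused is true. What if the job's fire time is earlier than the moment the thread stops waiting? This is handled in the applyMisfire() method that follows.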
protected boolean applyMisfire(TriggerWrapper tw) {
long misfireTime = System.currentTimeMillis();
if (getMisfireThreshold() > 0) {
misfireTime -= getMisfireThreshold();
}
Date tnft = tw.trigger.getNextFireTime();
if (tnft == null || tnft.getTime() > misfireTime
|| tw.trigger.getMisfireInstruction() == Trigger.MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY) {
return false;
}
Calendar cal = null;
if (tw.trigger.getCalendarName() != null) {
cal = retrieveCalendar(tw.trigger.getCalendarName());
}
signaler.notifyTriggerListenersMisfired((OperableTrigger)tw.trigger.clone());
tw.trigger.updateAfterMisfire(cal);
if (tw.trigger.getNextFireTime() == null) {
tw.state = TriggerWrapper.STATE_COMPLETE;
signaler.notifySchedulerListenersFinalized(tw.trigger);
synchronized (lock) {
timeTriggers.remove(tw);
}
} else if (tnft.equals(tw.trigger.getNextFireTime())) {
return false;
}
return true;
}
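Here the current time minus a threshold is compared with the trigger's fire time. If the fire time is later than that value, the trigger has not been missed and the normal flow continues toward execution. If it is not later, the situation described above may have occurred, and the trigger's next fire time is recomputed. If there is no next fire time any more, the job will never fire again and its state is set directly to STATE_COMPLETE.
After applyMisfire() returns, the missed trigger has a recomputed next fire time. If it still has one, it is put back into timeTriggers; because that is a TreeSet, the trigger is re-sorted on insertion and waits for the next acquisition.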
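The comparison at the heart of applyMisfire() can be sketched on plain timestamps. MisfireCheckDemo is a hypothetical name, and nextFireTimeAfter() shows just one possible rescheduling policy for a fixed-interval trigger (skip ahead to the next slot); what Quartz really does in updateAfterMisfire() depends on the trigger type and its misfire instruction.

```java
class MisfireCheckDemo {
    // Same test as in applyMisfire(): a trigger counts as misfired when its
    // fire time is not later than (now - threshold).
    static boolean isMisfired(long nextFireTime, long now, long misfireThreshold) {
        long misfireTime = now;
        if (misfireThreshold > 0) {
            misfireTime -= misfireThreshold;
        }
        return nextFireTime <= misfireTime;
    }

    // Assumed policy for illustration only: move a fixed-interval trigger
    // to its first slot that lies strictly after 'now'.
    static long nextFireTimeAfter(long scheduledTime, long interval, long now) {
        if (scheduledTime > now) {
            return scheduledTime;
        }
        long slotsToSkip = (now - scheduledTime) / interval + 1;
        return scheduledTime + slotsToSkip * interval;
    }

    public static void main(String[] args) {
        long threshold = 5_000L;   // example threshold in milliseconds
        long scheduled = 1_000L;   // trigger was due at t = 1000

        System.out.println(isMisfired(scheduled, 3_000L, threshold));       // prints false
        System.out.println(isMisfired(scheduled, 6_500L, threshold));       // prints true
        System.out.println(nextFireTimeAfter(scheduled, 1_000L, 6_500L));   // prints 7000
    }
}
```

The threshold gives a trigger some slack: one that is only slightly late is still fired normally instead of being rescheduled.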
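With that handled, the trigger's next fire time is compared against the latest time the caller is willing to look ahead to (noLaterThan + timeWindow). If it is later than that, the trigger is not acquired and is put back into timeTriggers. Because timeTriggers is ordered, none of the remaining triggers need to be considered either, so acquisition ends right there.
Next, the concrete job is looked up by the JobKey of the acquired Trigger.
What happens next depends on the isConcurrentExectionDisallowed property.
This addresses another problem. Suppose a job is triggered every 1 second but takes 2 seconds to run. If the property is true, only one execution of that job may run at a time; otherwise several executions may run simultaneously.
Concretely, if it is true, a set is first checked for this job's key. If the key is absent, the trigger proceeds and the key is added to the set, so that a later trigger of the same job is not acquired in the same batch. If the key is already present, the trigger is placed in a second set and the loop moves on to the next trigger; that second set is put back into timeTriggers at the end.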
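The two-set bookkeeping is easier to follow on a reduced model. The sketch below is a simplified imitation of the acquisition loop, not the RAMJobStore code: Candidate, AcquireBatchDemo and the string job keys are hypothetical, and misfire handling and trigger state changes are left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class AcquireBatchDemo {
    // Hypothetical candidate: a trigger plus the bits of its job that matter here.
    static final class Candidate {
        final String triggerName;
        final String jobKey;
        final long nextFireTime;
        final boolean concurrentDisallowed;

        Candidate(String triggerName, String jobKey, long nextFireTime, boolean concurrentDisallowed) {
            this.triggerName = triggerName;
            this.jobKey = jobKey;
            this.nextFireTime = nextFireTime;
            this.concurrentDisallowed = concurrentDisallowed;
        }
    }

    static TreeSet<Candidate> newStore() {
        return new TreeSet<>((a, b) -> {
            int byTime = Long.compare(a.nextFireTime, b.nextFireTime);
            return byTime != 0 ? byTime : a.triggerName.compareTo(b.triggerName);
        });
    }

    static List<Candidate> acquire(TreeSet<Candidate> timeTriggers, long noLaterThan, int maxCount) {
        List<Candidate> result = new ArrayList<>();
        Set<String> acquiredJobKeys = new HashSet<>();   // jobs already in this batch
        Set<Candidate> excluded = new HashSet<>();       // triggers skipped for now

        while (!timeTriggers.isEmpty() && result.size() < maxCount) {
            Candidate c = timeTriggers.pollFirst();
            if (c.nextFireTime > noLaterThan) {
                timeTriggers.add(c);                     // too far away, and so is everything after it
                break;
            }
            if (c.concurrentDisallowed && !acquiredJobKeys.add(c.jobKey)) {
                excluded.add(c);                         // same job already acquired in this batch
                continue;
            }
            result.add(c);
        }
        timeTriggers.addAll(excluded);                   // skipped triggers go back into the store
        return result;
    }

    public static void main(String[] args) {
        TreeSet<Candidate> store = newStore();
        store.add(new Candidate("t1", "jobA", 100L, true));
        store.add(new Candidate("t2", "jobA", 200L, true));
        store.add(new Candidate("t3", "jobB", 300L, false));
        store.add(new Candidate("t4", "jobB", 10_000L, false));

        for (Candidate c : acquire(store, 1_000L, 10)) {
            System.out.println(c.triggerName);           // prints t1, then t3
        }
        System.out.println(store.size());                // prints 2 (t2 and t4 remain)
    }
}
```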
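Once all of the above is done, the trigger really is about to fire: its state is set to STATE_ACQUIRED, the earliest acquired fire time is recorded, and a clone of the Trigger is added to the list that will be returned.
Triggers keep being taken from timeTriggers in this way until the count reaches the maximum mentioned earlier or no remaining trigger falls within the maximum wait time.
When acquireNextTriggers() returns, the returned list of Triggers holds the jobs that are about to fire.
At this point, though, these are not guaranteed to be the jobs that finally fire.
One more problem remains: what if, in the meantime, a new job is added whose fire time is earlier than those of all the triggers just acquired?
Every newly added job updates the management thread's signaledNextFireTime with its own fire time. After the Trigger list has been acquired, isCandidateNewTimeEarlierWithinReason() compares the fire time of the earliest acquired Trigger with signaledNextFireTime. If the acquired time is the later of the two, the problem above really has occurred; releaseIfScheduleChangedSignificantly() then releases the acquired state of the Triggers and clears the Trigger list, and the next loop iteration acquires triggers afresh.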
private boolean releaseIfScheduleChangedSignificantly(
List<OperableTrigger> triggers, long triggerTime) {
if (isCandidateNewTimeEarlierWithinReason(triggerTime, true)) {
// above call does a clearSignaledSchedulingChange()
for (OperableTrigger trigger : triggers) {
qsRsrcs.getJobStore().releaseAcquiredTrigger(trigger);
}
triggers.clear();
return true;
}
return false;
}
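That solves the problem.
If the situation above did not occur, every job in the Trigger list will be fired.
The management thread waits for the difference between the earliest fire time and the current time, so that the first job fires on time. It is worth noting that although the thread waits here, adding a new job wakes it up, so the process above repeats and a newly added job with a nearer fire time still fires on time.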
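A timed wait that a scheduling change can cut short might look like the following. This is a simplified sketch under stated assumptions: FireTimeWaiter and waitUntil() are hypothetical names, a signaled time of 0 is treated as "unknown, assume earlier", and any extra "within reason" checks the real method applies are omitted.

```java
class FireTimeWaiter {
    private final Object sigLock = new Object();
    private boolean signaled = false;
    private long signaledNextFireTime = 0L;

    // Called by whichever thread schedules a new trigger.
    void signalSchedulingChange(long candidateNewNextFireTime) {
        synchronized (sigLock) {
            signaled = true;
            signaledNextFireTime = candidateNewNextFireTime;
            sigLock.notifyAll();   // wake the waiting scheduler thread
        }
    }

    // Returns true if triggerTime was reached, false if an earlier fire time
    // was signaled and the acquired triggers should be released instead.
    boolean waitUntil(long triggerTime) {
        synchronized (sigLock) {
            long remaining = triggerTime - System.currentTimeMillis();
            while (remaining > 0) {
                if (signaled) {
                    boolean earlier = signaledNextFireTime == 0L
                            || signaledNextFireTime < triggerTime;
                    signaled = false;          // consume the signal
                    if (earlier) {
                        return false;
                    }
                }
                try {
                    sigLock.wait(remaining);   // woken early by notifyAll() or by timeout
                } catch (InterruptedException ignore) {
                }
                remaining = triggerTime - System.currentTimeMillis();
            }
            return true;
        }
    }

    public static void main(String[] args) {
        FireTimeWaiter waiter = new FireTimeWaiter();
        long triggerTime = System.currentTimeMillis() + 2_000L;

        // Another thread adds a job that is due well before the acquired trigger.
        new Thread(() -> {
            try {
                Thread.sleep(100L);
            } catch (InterruptedException ignore) {
            }
            waiter.signalSchedulingChange(System.currentTimeMillis() + 200L);
        }).start();

        boolean reached = waiter.waitUntil(triggerTime);
        System.out.println(reached ? "fire acquired trigger" : "release and re-acquire");
    }
}
```

Because the signal flag is checked before every wait, a change that arrives just before the thread starts waiting is not lost.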
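When the wait ends and the first job's fire time is no more than 2 ms away, the thread prepares to execute the acquired triggers. The triggersFired() method also updates each trigger's next fire time and puts it back into timeTriggers, ready for the next run of a recurring job.
After that, each job to be executed is wrapped in a JobRunShell; the instance of the job class configured for the scheduled job is created in its initialize() method.
JobRunShell implements Runnable, so what is actually handed to the worker thread pool is the JobRunShell, and the run() method that gets invoked is JobRunShell's.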
public void run() {
qs.addInternalSchedulerListener(this);
try {
OperableTrigger trigger = (OperableTrigger) jec.getTrigger();
JobDetail jobDetail = jec.getJobDetail();
do {
JobExecutionException jobExEx = null;
Job job = jec.getJobInstance();
try {
begin();
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't begin execution.", se);
break;
}
// notify job & trigger listeners...
try {
if (!notifyListenersBeginning(jec)) {
break;
}
} catch(VetoedException ve) {
try {
CompletedExecutionInstruction instCode = trigger.executionComplete(jec, null);
qs.notifyJobStoreJobVetoed(trigger, jobDetail, instCode);
// QTZ-205
// Even if trigger got vetoed, we still needs to check to see if it's the trigger's finalized run or not.
if (jec.getTrigger().getNextFireTime() == null) {
qs.notifySchedulerListenersFinalized(jec.getTrigger());
}
complete(true);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error during veto of Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
}
break;
}
long startTime = System.currentTimeMillis();
long endTime = startTime;
// execute the job
try {
log.debug("Calling execute on job " + jobDetail.getKey());
job.execute(jec);
endTime = System.currentTimeMillis();
} catch (JobExecutionException jee) {
endTime = System.currentTimeMillis();
jobExEx = jee;
getLog().info("Job " + jobDetail.getKey() +
" threw a JobExecutionException: ", jobExEx);
} catch (Throwable e) {
endTime = System.currentTimeMillis();
getLog().error("Job " + jobDetail.getKey() +
" threw an unhandled Exception: ", e);
SchedulerException se = new SchedulerException(
"Job threw an unhandled exception.", e);
qs.notifySchedulerListenersError("Job ("
+ jec.getJobDetail().getKey()
+ " threw an exception.", se);
jobExEx = new JobExecutionException(se, false);
}
jec.setJobRunTime(endTime - startTime);
// notify all job listeners
if (!notifyJobListenersComplete(jec, jobExEx)) {
break;
}
CompletedExecutionInstruction instCode = CompletedExecutionInstruction.NOOP;
// update the trigger
try {
instCode = trigger.executionComplete(jec, jobExEx);
} catch (Exception e) {
// If this happens, there's a bug in the trigger...
SchedulerException se = new SchedulerException(
"Trigger threw an unhandled exception.", e);
qs.notifySchedulerListenersError(
"Please report this error to the Quartz developers.",
se);
}
// notify all trigger listeners
if (!notifyTriggerListenersComplete(jec, instCode)) {
break;
}
// update job/trigger or re-execute job
if (instCode == CompletedExecutionInstruction.RE_EXECUTE_JOB) {
jec.incrementRefireCount();
try {
complete(false);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
}
continue;
}
try {
complete(true);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
continue;
}
qs.notifyJobStoreJobComplete(trigger, jobDetail, instCode);
break;
} while (true);
} finally {
qs.removeInternalSchedulerListener(this);
}
}
Here the job's execute() method is called. After the long and intricate process above, the scheduled job finally runs on time.
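The shape of this run() method, a Runnable that wraps the job, times it, and loops while it is told to re-execute, can be sketched without any Quartz types. Everything below is hypothetical (DemoJob, DemoRunShell, Instruction), and the retry rule, re-execute on failure up to maxRefires, is an assumption for illustration; in Quartz the instruction comes from trigger.executionComplete().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class RunShellDemo {
    interface DemoJob {
        void execute() throws Exception;
    }

    enum Instruction { NOOP, RE_EXECUTE_JOB }

    // Hypothetical shell: the thread pool only ever sees this Runnable, never the job itself.
    static final class DemoRunShell implements Runnable {
        private final DemoJob job;
        private final int maxRefires;
        private volatile int refireCount = 0;
        private volatile long jobRunTime = 0L;

        DemoRunShell(DemoJob job, int maxRefires) {
            this.job = job;
            this.maxRefires = maxRefires;
        }

        int refireCount() {
            return refireCount;
        }

        long jobRunTime() {
            return jobRunTime;
        }

        @Override
        public void run() {
            while (true) {
                long startTime = System.currentTimeMillis();
                Exception failure = null;
                try {
                    job.execute();                       // the user's code runs here
                } catch (Exception e) {
                    failure = e;                         // failures never escape to the pool
                }
                jobRunTime = System.currentTimeMillis() - startTime;

                // Assumed policy: retry a failed job until the refire budget is used up.
                Instruction inst = (failure != null && refireCount < maxRefires)
                        ? Instruction.RE_EXECUTE_JOB
                        : Instruction.NOOP;
                if (inst == Instruction.RE_EXECUTE_JOB) {
                    refireCount++;
                    continue;
                }
                break;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] attempts = {0};
        DemoRunShell shell = new DemoRunShell(() -> {
            if (++attempts[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
        }, 5);

        ExecutorService workers = Executors.newFixedThreadPool(2);
        workers.execute(shell);
        workers.shutdown();
        workers.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("attempts=" + attempts[0] + ", refires=" + shell.refireCount());
        // prints attempts=3, refires=2
    }
}
```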
But the follow-up work after the job completes is not over yet. Once the job's execute() method returns, QuartzScheduler's notifyJobStoreJobComplete() method is called to finish off the run.
protected void notifyJobStoreJobComplete(OperableTrigger trigger, JobDetail detail, CompletedExecutionInstruction instCode) {
resources.getJobStore().triggeredJobComplete(trigger, detail, instCode);
}
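QuartzScheduler calls the JobStore's triggeredJobComplete() method, which persists the executed job's data according to configuration, makes jobs that were blocked because of isConcurrentExectionDisallowed ready to run again, removes triggers that will never fire again, and so on.
That is how Quartz implements scheduled jobs.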