Enterprise Search Engine Development: Connector (Part 14)

Recall the start method of the Context class. Part of it is a method that starts the scheduler:

  /**
   * Start up the Scheduler.
   */
  private void startScheduler() {
    traversalScheduler =
        (TraversalScheduler) getRequiredBean("TraversalScheduler",
            TraversalScheduler.class);
    if (traversalScheduler != null) {
      traversalScheduler.init();
    }
  }

That is, it calls the init() method of the TraversalScheduler object.

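TraversalScheduler implements the Runnable interface, so its run() method executes on a separate thread, which init() creates and starts. Its source is as follows: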

/**
 * Scheduler that schedules connector traversal.  This class is thread safe.
 * Must initialize TraversalScheduler before running it.
 *
 * <p> This facility includes a schedule thread that runs a loop.
 * Each iteration it asks the instantiator for the schedule
 * for each Connector Instance and runs batches for those that
 * are
 * <OL>
 * <LI> scheduled to run.
 * <LI> have not exhausted their quota for the current time interval.
 * <LI> are not currently running.
 * </OL>
 * The implementation must handle the situation that a Connector
 * Instance is running.
 */
public class TraversalScheduler implements Runnable {
  public static final String SCHEDULER_CURRENT_TIME = "/Scheduler/currentTime";

  private static final Logger LOGGER =
    Logger.getLogger(TraversalScheduler.class.getName());

  private final Instantiator instantiator;

  private boolean isInitialized; // Protected by instance lock.
  private boolean isShutdown; // Protected by instance lock.

  /**
   * Create a scheduler object.
   *
   * @param instantiator used to get schedule for connector instances
   */
  public TraversalScheduler(Instantiator instantiator) {
    this.instantiator = instantiator;
    this.isInitialized = false;
    this.isShutdown = false;
  }

  public synchronized void init() {
    if (isInitialized) {
      return;
    }
    isInitialized = true;
    isShutdown = false;
    new Thread(this, "TraversalScheduler").start();
  }

  public synchronized void shutdown() {
    if (isShutdown) {
      return;
    }
    isInitialized = false;
    isShutdown = true;
  }

  /**
   * Determines whether scheduler should run.
   *
   * @return true if we are in a running state and scheduler should run or
   *         continue running.
   */
  private synchronized boolean isRunningState() {
    return isInitialized && !isShutdown;
  }

  private void scheduleBatches() {
    for (String connectorName : instantiator.getConnectorNames()) {
      NDC.pushAppend(connectorName);
      try {
        instantiator.getConnectorCoordinator(connectorName).startBatch();
      } catch (ConnectorNotFoundException e) {
        // Looks like the connector just got deleted.  Don't schedule it.
      } finally {
        NDC.pop();
      }
    }
  }

  public void run() {
    NDC.push("Traverse");
    try {
      while (true) {
        try {
          if (!isRunningState()) {
            LOGGER.info("TraversalScheduler thread is stopping due to "
                + "shutdown or not being initialized.");
            return;
          }
          scheduleBatches();
          // Give someone else a chance to run.
          try {
            synchronized (this) {
              wait(1000);
            }
          } catch (InterruptedException e) {
            // May have been interrupted for shutdown.
          }
        } catch (Throwable t) {
          LOGGER.log(Level.SEVERE,
              "TraversalScheduler caught unexpected Throwable: ", t);
        }
      }
    } finally {
      NDC.remove();
    }
  }
}

TraversalScheduler depends on the Instantiator class: it uses the Instantiator to iterate over every connector's ConnectorCoordinatorImpl object and call its startBatch() method.

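The run() method is an infinite loop that keeps polling scheduleBatches(). Between iterations it waits up to one second (wait(1000)), and it exits only when isRunningState() returns false.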
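The polling loop can be tried in isolation with a minimal sketch like the one below. PollingLoopSketch, its work and pollMillis parameters, and the demo() helper are hypothetical names introduced here for illustration; they are not part of the connector manager. The sketch also makes one deliberate change: shutdown() calls notifyAll() so the loop wakes immediately, whereas the original simply lets wait(1000) time out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for TraversalScheduler, reduced to its polling loop. */
class PollingLoopSketch implements Runnable {
  private final Runnable work;     // Plays the role of scheduleBatches().
  private final long pollMillis;   // The original hard-codes 1000.
  private boolean isInitialized;   // Protected by instance lock.
  private boolean isShutdown;      // Protected by instance lock.
  private Thread thread;           // Protected by instance lock.

  PollingLoopSketch(Runnable work, long pollMillis) {
    this.work = work;
    this.pollMillis = pollMillis;
  }

  /** Starts the polling thread; repeated calls while running are no-ops. */
  synchronized void init() {
    if (isInitialized) {
      return;
    }
    isInitialized = true;
    isShutdown = false;
    thread = new Thread(this, "PollingLoopSketch");
    thread.start();
  }

  /** Requests a stop. The notifyAll() call is this sketch's addition. */
  synchronized void shutdown() {
    if (isShutdown) {
      return;
    }
    isInitialized = false;
    isShutdown = true;
    notifyAll();  // Wake the loop now instead of letting wait() time out.
  }

  private synchronized boolean isRunningState() {
    return isInitialized && !isShutdown;
  }

  @Override
  public void run() {
    while (isRunningState()) {
      try {
        work.run();
      } catch (RuntimeException e) {
        // Like the original's catch (Throwable), keep the loop alive.
      }
      synchronized (this) {
        if (!isRunningState()) {
          return;
        }
        try {
          wait(pollMillis);
        } catch (InterruptedException e) {
          return;  // Treat interruption as a stop request.
        }
      }
    }
  }

  /** Polls until work has run `rounds` times, then stops; true if the thread exited. */
  static boolean demo(int rounds) {
    CountDownLatch latch = new CountDownLatch(rounds);
    PollingLoopSketch scheduler = new PollingLoopSketch(latch::countDown, 10);
    scheduler.init();
    scheduler.init();  // Second call is ignored because isInitialized is set.
    try {
      boolean polled = latch.await(5, TimeUnit.SECONDS);
      scheduler.shutdown();
      Thread t;
      synchronized (scheduler) {
        t = scheduler.thread;
      }
      t.join(5000);
      return polled && !t.isAlive();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("stopped cleanly: " + demo(3));
  }
}
```

Both flags are read and written only while holding the instance lock, which is why isRunningState() is synchronized in the original as well.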

Recall the startBatch() method of the ConnectorCoordinatorImpl class covered earlier:

  //@Override
  public synchronized boolean startBatch() throws ConnectorNotFoundException {
    verifyConnectorInstanceAvailable();
    if (!shouldRun()) {
      return false;
    }

    BatchSize batchSize = loadManager.determineBatchSize();
    if (batchSize.getMaximum() == 0) {
      return false;
    }
    taskHandle = null;
    currentBatchKey = new Object();

    try {
      BatchCoordinator batchCoordinator = new BatchCoordinator(this);
      TraversalManager traversalManager =
          getConnectorInterfaces().getTraversalManager();
      Traverser traverser = new QueryTraverser(pusherFactory,
          traversalManager, batchCoordinator, name,
          Context.getInstance().getTraversalContext());
      TimedCancelable batch =  new CancelableBatch(traverser, name,
          batchCoordinator, batchCoordinator, batchSize);
      taskHandle = threadPool.submit(batch);
      return true;
    } catch (ConnectorNotFoundException cnfe) {
      LOGGER.log(Level.WARNING, "Connector not found - this is normal if you "
          + " recently reconfigured your connector instance: " + cnfe);
    } catch (InstantiatorException ie) {
      LOGGER.log(Level.WARNING,
          "Failed to perform connector content traversal.", ie);
      delayTraversal(TraversalDelayPolicy.ERROR);
    }
    return false;
  }
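startBatch() stores the handle returned by threadPool.submit(batch) in taskHandle, so a later call can tell whether the previous batch is still running. A minimal sketch of that pattern follows, assuming java.util.concurrent's ExecutorService and Future as stand-ins for the connector manager's own thread pool and task handle types. SingleBatchGuardSketch, currentHandle() and demo() are hypothetical names used only for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical coordinator that allows at most one batch in flight. */
class SingleBatchGuardSketch {
  private final ExecutorService threadPool;  // Stand-in for the real thread pool.
  private Future<?> taskHandle;              // Protected by instance lock.

  SingleBatchGuardSketch(ExecutorService threadPool) {
    this.threadPool = threadPool;
  }

  /** Submits the batch unless the previous one is still running. */
  synchronized boolean startBatch(Runnable batch) {
    if (taskHandle != null && !taskHandle.isDone()) {
      return false;  // Already running, so do not start again.
    }
    taskHandle = threadPool.submit(batch);
    return true;
  }

  synchronized Future<?> currentHandle() {
    return taskHandle;
  }

  /** Returns the results of three startBatch() calls, comma separated. */
  static String demo() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      SingleBatchGuardSketch coordinator = new SingleBatchGuardSketch(pool);
      CountDownLatch release = new CountDownLatch(1);
      Runnable slowBatch = () -> {
        try {
          release.await();  // Keep the batch running until released.
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      };
      boolean first = coordinator.startBatch(slowBatch);   // Accepted.
      boolean second = coordinator.startBatch(slowBatch);  // Rejected: still running.
      release.countDown();
      coordinator.currentHandle().get(5, TimeUnit.SECONDS);  // Wait for completion.
      boolean third = coordinator.startBatch(() -> { });   // Accepted again.
      return first + "," + second + "," + third;
    } catch (Exception e) {
      return "error: " + e;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());  // prints "true,false,true"
  }
}
```

Because the scheduler calls startBatch() for every connector roughly once per second, this guard is what keeps the polling loop from stacking up overlapping batches for the same connector.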

After verifying that the connector instance is available, the method checks !shouldRun(). Let's look at the source of shouldRun():

  /**
   * Returns {@code true} if it is OK to start a traversal,
   * {@code false} otherwise.
   */
  // Package access because this is called by tests.
  synchronized boolean shouldRun() {
    // Are we already running? If so, we shouldn't run again.
    if (taskHandle != null && !taskHandle.isDone()) {
      return false;
    }

    // Don't run if we have postponed traversals.
    if (System.currentTimeMillis() < traversalDelayEnd) {
      return false;
    }

    Schedule schedule = getSchedule();

    // Don't run if traversals are disabled.
    if (schedule.isDisabled()) {
      return false;
    }

    // Don't run if we have exceeded our configured host load.
    if (loadManager.shouldDelay()) {
      return false;
    }

    // OK to run if we are within one of the Schedule's traversal intervals.
    Calendar now = Calendar.getInstance();
    int hour = now.get(Calendar.HOUR_OF_DAY);
    for (ScheduleTimeInterval interval : schedule.getTimeIntervals()) {
      int startHour = interval.getStartTime().getHour();
      int endHour = interval.getEndTime().getHour();
      if (0 == endHour) {
        endHour = 24;
      }
      if (endHour < startHour) {
        // The traversal interval straddles midnight.
        if ((hour >= startHour) || (hour < endHour)) {
          return true;
        }
      } else {
        // The traversal interval falls wholly within the day.
        if ((hour >= startHour) && (hour < endHour)) {
          return true;
        }
      }
    }

    return false;
  }

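This method vets whether the connector should be scheduled: whether the previous batch has finished, whether a postponed traversal's delay is still in effect, whether the schedule is disabled, whether the load manager requires a delay, and whether the current hour falls within one of the schedule's traversal intervals.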
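The interval test at the end of shouldRun() is the least obvious part, because a window such as 22:00 to 06:00 wraps past midnight. The sketch below isolates that comparison as a pure function so it can be checked by hand. HourWindowSketch and inWindow are hypothetical names, and the hours are passed as plain ints rather than read from ScheduleTimeInterval.

```java
/** Hypothetical helper isolating the hour-window test from shouldRun(). */
class HourWindowSketch {
  /**
   * Returns true if hour lies in [startHour, endHour).
   * An endHour of 0 means midnight at the end of the day (24),
   * and endHour < startHour means the window wraps past midnight.
   */
  static boolean inWindow(int hour, int startHour, int endHour) {
    if (endHour == 0) {
      endHour = 24;
    }
    if (endHour < startHour) {
      // The window straddles midnight: late evening OR early morning.
      return hour >= startHour || hour < endHour;
    }
    // The window falls wholly within the day.
    return hour >= startHour && hour < endHour;
  }

  public static void main(String[] args) {
    System.out.println(inWindow(23, 22, 6));  // true: before midnight
    System.out.println(inWindow(3, 22, 6));   // true: after midnight
    System.out.println(inWindow(12, 22, 6));  // false: outside the window
    System.out.println(inWindow(23, 9, 0));   // true: end hour 0 is treated as 24
  }
}
```

Note that the end hour is exclusive, so a 09:00 to 17:00 interval stops matching once the clock reaches 17:00, and an interval with equal non-zero start and end hours never matches.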

---------------------------------------------------------------------------

This series, Enterprise Search Engine Development: Connector, is my original work.

Please credit the source when reposting: 博客园 (cnblogs), 刺猬的温驯

Article link: http://www.cnblogs.com/chenying99/archive/2013/03/20/2970378.html
