Spring Boot Graceful Shutdown, Viewed Through JUC

In short

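  • Spring Boot's graceful shutdown is built on a JVM ShutdownHook callback (a point that countless articles have already covered).

  • While the hook runs, Spring uses a CountDownLatch to block the hook thread, so the process stays alive for a bounded time and can finish its remaining work.

  • Originally published on Juejin: https://juejin.cn/post/7197292579057221693. This copy was reposted to CSDN, so the formatting may not be ideal.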

Topics covered


Spring

  • SmartLifecycle

  • DefaultLifecycleProcessor

  • WebServerGracefulShutdownLifecycle

  • WebServerStartStopLifecycle

  • WebServerManager

  • TomcatWebServer implements WebServer

Java

  • java.util.concurrent.CountDownLatch

  • java.lang.Runtime

  • java.lang.ApplicationShutdownHooks

  • java.lang.Shutdown


A few things about Spring's shutdown hook


  • When the hook is registered

  • When the hook is triggered

  • What happens after the hook is triggered

Registering the hook thread and triggering the hook


SpringApplication#run()

  • org.springframework.boot.SpringApplication#refreshContext()

  • org.springframework.context.support.AbstractApplicationContext#registerShutdownHook()

@Override
public void registerShutdownHook() {
   if (this.shutdownHook == null) {
      // No shutdown hook registered yet.
      this.shutdownHook = new Thread(SHUTDOWN_HOOK_THREAD_NAME) {
         @Override
         public void run() {
            synchronized (startupShutdownMonitor) {
               doClose();
            }
         }
      };
      Runtime.getRuntime().addShutdownHook(this.shutdownHook);
   }
}
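  • As the code shows, Spring registers a shutdownHook with the Runtime while refreshing the context. According to the Runtime API documentation, the JVM runs this thread when it responds to a shutdown signal. Signals it cannot handle, such as kill -9, never trigger the hook.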
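The registration and removal steps can be tried without Spring. The following is a minimal sketch using only java.lang.Runtime; the class name, the helper methods and the thread name are hypothetical and are not part of Spring.

```java
// Minimal sketch: registering and removing a JVM shutdown hook with the plain Runtime API.
class ShutdownHookDemo {

    // Registers a hook thread and returns it so the caller can remove it later.
    static Thread register(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "demo-shutdown-hook"); // hypothetical thread name
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    // Returns true if the hook had been registered and has now been removed.
    static boolean unregister(Thread hook) {
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    public static void main(String[] args) {
        register(() -> System.out.println("cleanup runs while the JVM shuts down"));
        System.out.println("main is returning; the JVM will now run the hook");
    }
}
```

Removing the hook mirrors what close() does in AbstractApplicationContext: once the context has been closed explicitly, the hook is no longer needed.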

What happens after the hook fires


Core entry point

  • As the hook registration shows, the JVM callback runs doClose(), which makes this method the core entry point for closing the container.

  • org.springframework.context.support.AbstractApplicationContext#doClose()

Simulating shutdown

public static void main(String[] args){
 ConfigurableApplicationContext context = SpringApplication.run(MvcApplication.class, args);
 // Simulate the shutdown call
 context.close();
}
@Override
public void close() {
   synchronized (this.startupShutdownMonitor) {
      // The real shutdown logic lives in doClose()
      doClose();
      if (this.shutdownHook != null) {
         try {
            Runtime.getRuntime().removeShutdownHook(this.shutdownHook);
         }
         catch (IllegalStateException ex) {
            // ignore - VM is already shutting down
         }
      }
   }
}

protected void doClose() {
      // ... code outside the scope of this article is omitted; see the source if you are interested
      // Stop all Lifecycle beans, to avoid delays during individual destruction.
      if (this.lifecycleProcessor != null) {
         try {
            // Stop the beans that implement Lifecycle
            this.lifecycleProcessor.onClose();
         }
         catch (Throwable ex) {
            logger.warn("Exception thrown from LifecycleProcessor on context close", ex);
         }
      }
      // ...
}
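  • The code above can be skimmed; it is only the outer shell of Spring Boot's shutdown. The part that matters is lifecycleProcessor.onClose(): in DefaultLifecycleProcessor it delegates to stopBeans(), shown next.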

private void stopBeans() {
   Map<String, Lifecycle> lifecycleBeans = getLifecycleBeans();
   Map<Integer, LifecycleGroup> phases = new HashMap<>();
   lifecycleBeans.forEach((beanName, bean) -> {
      int shutdownPhase = getPhase(bean);
      LifecycleGroup group = phases.get(shutdownPhase);
      if (group == null) {
         group = new LifecycleGroup(shutdownPhase, this.timeoutPerShutdownPhase, lifecycleBeans, false);
         phases.put(shutdownPhase, group);
      }
      group.add(beanName, bean);
   });
   if (!phases.isEmpty()) {
      List<Integer> keys = new ArrayList<>(phases.keySet());
      keys.sort(Collections.reverseOrder());
      for (Integer key : keys) {
         // Key step: stop each group, highest phase first
         phases.get(key).stop();
      }
   }
}
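  • stopBeans does two things, grouping and sorting; neither is the important part.

  • What matters is the result of the grouping: Lifecycle beans with the same phase end up in the same LifecycleGroup. A LifecycleGroup holds multiple Lifecycle members, and its stop() walks through them one by one in a for loop.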
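The grouping and reverse ordering can be illustrated in isolation. The sketch below is not Spring code: bean names and phase values are hypothetical, and plain lists of names stand in for LifecycleGroup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch: group bean names by phase, then visit the groups from highest phase to lowest.
class PhaseOrderDemo {

    // Returns the groups in shutdown order (highest phase first).
    static List<List<String>> shutdownOrder(Map<String, Integer> phaseByBean) {
        Map<Integer, List<String>> groups = new HashMap<>();
        phaseByBean.forEach((name, phase) ->
                groups.computeIfAbsent(phase, p -> new ArrayList<>()).add(name));

        List<Integer> keys = new ArrayList<>(groups.keySet());
        keys.sort(Collections.reverseOrder());

        List<List<String>> ordered = new ArrayList<>();
        for (Integer key : keys) {
            ordered.add(groups.get(key));
        }
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, Integer> beans = new LinkedHashMap<>();
        beans.put("beanA", 2); // hypothetical bean names and phase values
        beans.put("beanB", 1);
        beans.put("beanC", 2);
        System.out.println(shutdownOrder(beans)); // prints [[beanA, beanC], [beanB]]
    }
}
```

Beans in a higher phase are stopped before beans in a lower phase, which is the reverse of the order in which they were started.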

// LifecycleGroup
public void stop() {
   if (this.members.isEmpty()) {
      return;
   }
   if (logger.isDebugEnabled()) {
      logger.debug("Stopping beans in phase " + this.phase);
   }
   this.members.sort(Collections.reverseOrder());
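   // Countdown latch; its count is the number of SmartLifecycle members in this group
   CountDownLatch latch = new CountDownLatch(this.smartMemberCount);

   Set<String> countDownBeanNames = Collections.synchronizedSet(new LinkedHashSet<>());

   // Snapshot of the bean names, taken before doStop starts removing entries from lifecycleBeans
   Set<String> lifecycleBeanNames = new HashSet<>(this.lifecycleBeans.keySet());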
   
   for (LifecycleGroupMember member : this.members) {
      if (lifecycleBeanNames.contains(member.name)) {
         doStop(this.lifecycleBeans, member.name, latch, countDownBeanNames);
      }
      else if (member.bean instanceof SmartLifecycle) {
         // Already removed: must have been a dependent bean from another phase
         latch.countDown();
      }
   }
   try {
       // Wait with a timeout: if countDown() is never called above, the timeout acts as the fallback and releases the thread
      latch.await(this.timeout, TimeUnit.MILLISECONDS);
      if (latch.getCount() > 0 && !countDownBeanNames.isEmpty() && logger.isInfoEnabled()) {
         logger.info("Failed to shut down " + countDownBeanNames.size() + " bean" +
               (countDownBeanNames.size() > 1 ? "s" : "") + " with phase value " +
               this.phase + " within timeout of " + this.timeout + "ms: " + countDownBeanNames);
      }
   }
   catch (InterruptedException ex) {
      Thread.currentThread().interrupt();
   }
}
private void doStop(Map<String, ? extends Lifecycle> lifecycleBeans, final String beanName,
      final CountDownLatch latch, final Set<String> countDownBeanNames) {
    // Remove this bean from the map and get its instance
   Lifecycle bean = lifecycleBeans.remove(beanName);
   if (bean != null) {
       // Stop the beans that depend on this one first, recursively
      String[] dependentBeans = getBeanFactory().getDependentBeans(beanName);
      for (String dependentBean : dependentBeans) {
         doStop(lifecycleBeans, dependentBean, latch, countDownBeanNames);
      }
      try {
         if (bean.isRunning()) {
            if (bean instanceof SmartLifecycle) {
               if (logger.isTraceEnabled()) {
                  logger.trace("Asking bean '" + beanName + "' of type [" +
                        bean.getClass().getName() + "] to stop");
               }
               countDownBeanNames.add(beanName);
               // Core step: call stop; the callback runs countDown once stop has finished
               ((SmartLifecycle) bean).stop(() -> {
                  latch.countDown();
                  countDownBeanNames.remove(beanName);
                  if (logger.isDebugEnabled()) {
                     logger.debug("Bean '" + beanName + "' completed its stop procedure");
                  }
               });
            }
            else {
               if (logger.isTraceEnabled()) {
                  logger.trace("Stopping bean '" + beanName + "' of type [" +
                        bean.getClass().getName() + "]");
               }
               bean.stop();
               if (logger.isDebugEnabled()) {
                  logger.debug("Successfully stopped bean '" + beanName + "'");
               }
            }
         }
         else if (bean instanceof SmartLifecycle) {
            // Don't wait for beans that aren't running...
            latch.countDown();
         }
      }
      catch (Throwable ex) {
         if (logger.isWarnEnabled()) {
            logger.warn("Failed to stop bean '" + beanName + "'", ex);
         }
      }
   }
}
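  • The core of the two snippets above is really just one use of CountDownLatch.

  • The number of SmartLifecycle members in the LifecycleGroup is the latch's count. Each member that stops successfully releases one count, until all of them have been released.

  • latch.await(this.timeout, TimeUnit.MILLISECONDS)

  • While the latch's count has not reached zero, the thread stays blocked here.

  • As a fallback, once timeout elapses without every stop completing, the thread is no longer blocked. This timeout is the value we configure in the yaml file (spring.lifecycle.timeout-per-shutdown-phase).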
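The timed wait can be reproduced with a few lines of plain JUC code. This is a minimal sketch rather than Spring's implementation; the class name, the method and the timings are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Minimal sketch: wait for workers with a timeout, as LifecycleGroup.stop() does for its members.
class LatchTimeoutDemo {

    // Starts `workers` tasks that each take workMillis, then waits at most timeoutMillis.
    // Returns how many tasks had not counted down when the wait ended.
    static long awaitWorkers(int workers, long workMillis, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(workMillis); // simulated stop work
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    return;
                }
                latch.countDown();
            });
            t.setDaemon(true); // do not keep the JVM alive for slow workers
            t.start();
        }
        try {
            // Returns early when the count reaches zero, otherwise gives up after the timeout.
            latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return latch.getCount();
    }

    public static void main(String[] args) {
        System.out.println("fast workers left: " + awaitWorkers(3, 10, 2000));
        System.out.println("slow workers left: " + awaitWorkers(3, 5000, 100));
    }
}
```

In the second call the wait ends on the timeout while the count is still above zero, which corresponds to the "Failed to shut down ... within timeout" log message in LifecycleGroup.stop().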

The SmartLifecycle callback

default void stop(Runnable callback) {
   stop();
   callback.run();
}
((SmartLifecycle) bean).stop(() -> {
   latch.countDown();
   countDownBeanNames.remove(beanName);
   if (logger.isDebugEnabled()) {
      logger.debug("Bean '" + beanName + "' completed its stop procedure");
   }
});

Looking at this SmartLifecycle method, it takes a callback: the function we pass in, latch.countDown(), runs only after stop has completed.
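Because the callback is handed to the bean, an implementation is free to finish stopping on another thread and run the callback later. The sketch below illustrates that contract without depending on Spring: Stoppable is a hypothetical stand-in for the stop()/stop(Runnable) pair of SmartLifecycle, and both components are invented for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Minimal sketch: a synchronous and an asynchronous stop, both signalling completion via a callback.
class CallbackStopDemo {

    // Hypothetical stand-in for the stop()/stop(Runnable) pair of SmartLifecycle.
    interface Stoppable {
        void stop();

        default void stop(Runnable callback) {
            stop();
            callback.run();
        }
    }

    // Relies on the default method: stop() runs inline, then the callback runs.
    static class SyncComponent implements Stoppable {
        volatile boolean stopped;

        @Override
        public void stop() {
            stopped = true;
        }
    }

    // Overrides stop(Runnable): returns immediately and runs the callback from a worker thread.
    static class AsyncComponent implements Stoppable {
        volatile boolean stopped;

        @Override
        public void stop() {
            stopped = true;
        }

        @Override
        public void stop(Runnable callback) {
            Thread worker = new Thread(() -> {
                try {
                    Thread.sleep(50); // simulated in-flight work
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                }
                stop();
                callback.run();
            }, "async-stop");
            worker.setDaemon(true);
            worker.start();
        }
    }

    // Stops every component and waits until all callbacks have run or the timeout elapses.
    static boolean stopAll(long timeoutMillis, Stoppable... components) {
        CountDownLatch latch = new CountDownLatch(components.length);
        for (Stoppable component : components) {
            component.stop(latch::countDown);
        }
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("all stopped: " + stopAll(2000, new SyncComponent(), new AsyncComponent()));
    }
}
```

The asynchronous variant is the shape the web server's graceful shutdown takes: stop(Runnable) returns at once, and the latch is released only when the background work reports completion.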

When stop takes a long time to finish

  • org.springframework.boot.web.reactive.context.WebServerGracefulShutdownLifecycle#stop(java.lang.Runnable)

  • org.springframework.boot.web.reactive.context.WebServerManager#shutDownGracefully

  • org.springframework.boot.web.embedded.tomcat.TomcatWebServer#shutDownGracefully

  • org.springframework.boot.web.embedded.tomcat.GracefulShutdown#shutDownGracefully

void shutDownGracefully(GracefulShutdownCallback callback) {
   logger.info("Commencing graceful shutdown. Waiting for active requests to complete");
   new Thread(() -> doShutdown(callback), "tomcat-shutdown").start();
}

private void doShutdown(GracefulShutdownCallback callback) {
   List<Connector> connectors = getConnectors();
   connectors.forEach(this::close);
   try {
      for (Container host : this.tomcat.getEngine().findChildren()) {
         for (Container context : host.findChildren()) {
            while (isActive(context)) {
               if (this.aborted) {
                  logger.info("Graceful shutdown aborted with one or more requests still active");
                  callback.shutdownComplete(GracefulShutdownResult.REQUESTS_ACTIVE);
                  return;
               }
               Thread.sleep(50);
            }
         }
      }

   }
   catch (InterruptedException ex) {
      Thread.currentThread().interrupt();
   }
   logger.info("Graceful shutdown complete");
   callback.shutdownComplete(GracefulShutdownResult.IDLE);
}
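  • That is a fair amount of code, but since you have made it this far, here is the call stack in detail.

  • shutDownGracefully(callback)

  • It starts a new thread and hands all of the work over to it, so everything runs asynchronously (do not forget that the parameter is a callback).

  • Inside that thread it calls doShutdown(callback).

  • doShutdown(callback) is the key part.

  • It closes every Connector. Anyone familiar with Tomcat knows that a Connector manages socket connections, so closing the Connectors means no new requests are accepted.

  • The loop keeps running while isActive(context) == true. A look at the source makes this clear: it checks the tasks Tomcat is still processing and returns true as long as even one has not finished. This method is the heart of graceful shutdown: a request that is still being processed is allowed to finish.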
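The polling loop in doShutdown can be reduced to a small stand-alone sketch. Nothing here is Tomcat or Spring Boot code: the request counter replaces isActive(context), the Result enum only borrows the value names of GracefulShutdownResult, and all other names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch: poll until in-flight requests drain, or give up when aborted.
class DrainDemo {

    // Stand-in for GracefulShutdownResult; only the two values used in the article.
    enum Result { IDLE, REQUESTS_ACTIVE }

    private final AtomicInteger activeRequests = new AtomicInteger();
    private volatile boolean aborted;

    void requestStarted() {
        activeRequests.incrementAndGet();
    }

    void requestFinished() {
        activeRequests.decrementAndGet();
    }

    // Called when the caller no longer wants to wait, for example after a timeout.
    void abort() {
        aborted = true;
    }

    // Sleeps in short intervals while requests are active; stops early on abort or interrupt.
    Result drain(long pollMillis) {
        while (activeRequests.get() > 0) {
            if (aborted) {
                return Result.REQUESTS_ACTIVE;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return activeRequests.get() == 0 ? Result.IDLE : Result.REQUESTS_ACTIVE;
    }

    public static void main(String[] args) {
        DrainDemo demo = new DrainDemo();
        demo.requestStarted();
        Thread request = new Thread(() -> {
            try {
                Thread.sleep(200); // simulated request still being processed
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
            demo.requestFinished();
        });
        request.start();
        System.out.println(demo.drain(50));
    }
}
```

The abort flag is what keeps the loop from running forever: when the caller gives up, the loop reports that requests are still active instead of waiting for them.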

Summary

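  • A CountDownLatch blocks the hook thread; its count is the number of SmartLifecycle beans in the group.

  • Each Lifecycle is stopped in a loop, and countDownLatch.countDown() is called when its stop completes.

  • At the outermost level, countDownLatch.await is called with a timeout. Once the timeout elapses the hook thread is no longer blocked, the hook flow runs to completion, and the process exits.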

Writing this took effort; please credit the source when reposting.
