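We once hit a case where the project ran into an exception but Spring did not stop normally. The cause turned out to be a deadlock in Spring's shutdown handling.
A framework we use embeds code similar to the following: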
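import org.springframework.context.ApplicationListener;
import org.springframework.context.event.ContextRefreshedEvent;
import org.springframework.stereotype.Component;

@Component
public class ShutDownHookTest implements ApplicationListener<ContextRefreshedEvent> {

    // Placeholder (assumption): stands for whatever failure check the framework performs.
    private volatile boolean onException = true;

    @Override
    public void onApplicationEvent(ContextRefreshedEvent event) {
        if (onException) {
            System.out.println("test shutdown hook deadlock");
            // Runs on the main thread, while refresh() is still in progress.
            System.exit(0);
        }
    }
}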
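The intent is to make sure the application exits after an exception by calling System.exit. The listener is not asynchronous, so System.exit runs on the main thread. Yet the Spring Boot server kept running, and the program looked healthy: ours is a Dubbo-based service system, and the services in the test environment were still working normally.
This is clearly not what one would expect. Normally Spring notices a System.exit call through its JVM shutdown hook and performs its shutdown handling. Let's first look at how Spring registers its ShutdownHook: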
public abstract class AbstractApplicationContext extends DefaultResourceLoader
        implements ConfigurableApplicationContext {

    @Override
    public void registerShutdownHook() {
        if (this.shutdownHook == null) {
            // No shutdown hook registered yet.
            this.shutdownHook = new Thread(SHUTDOWN_HOOK_THREAD_NAME) {
                @Override
                public void run() {
                    // Key point: the hook acquires the monitor lock of startupShutdownMonitor here
                    synchronized (startupShutdownMonitor) {
                        doClose();
                    }
                }
            };
            Runtime.getRuntime().addShutdownHook(this.shutdownHook);
        }
    }

    protected void doClose() {
        // Check whether an actual close attempt is necessary...
        if (this.active.get() && this.closed.compareAndSet(false, true)) {
            if (logger.isDebugEnabled()) {
                logger.debug("Closing " + this);
            }
            if (!NativeDetector.inNativeImage()) {
                LiveBeansView.unregisterApplicationContext(this);
            }
            try {
                // Publish shutdown event.
                publishEvent(new ContextClosedEvent(this));
            }
            catch (Throwable ex) {
                logger.warn("Exception thrown from ApplicationListener handling ContextClosedEvent", ex);
            }
            // Stop all Lifecycle beans, to avoid delays during individual destruction.
            if (this.lifecycleProcessor != null) {
                try {
                    this.lifecycleProcessor.onClose();
                }
                catch (Throwable ex) {
                    logger.warn("Exception thrown from LifecycleProcessor on context close", ex);
                }
            }
            // Destroy all cached singletons in the context's BeanFactory.
            destroyBeans();
            // Close the state of this context itself.
            closeBeanFactory();
            // Let subclasses do some final clean-up if they wish...
            onClose();
            // Reset local application listeners to pre-refresh state.
            if (this.earlyApplicationListeners != null) {
                this.applicationListeners.clear();
                this.applicationListeners.addAll(this.earlyApplicationListeners);
            }
            // Switch to inactive.
            this.active.set(false);
        }
    }
}
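Stripped of the Spring specifics, the registration pattern can be sketched with the JDK alone. This is only a minimal model, not Spring's code: the class name, the `monitor` field, and the thread name are made up for illustration. It shows that the hook is an ordinary unstarted thread handed to `Runtime.addShutdownHook`, and that its body needs a monitor before it does any work.

```java
// Minimal JDK-only sketch of registering a shutdown hook, similar in spirit to
// AbstractApplicationContext.registerShutdownHook(). All names are illustrative.
class ShutdownHookSketch {

    // Stand-in for Spring's startupShutdownMonitor (assumption for this sketch).
    private final Object monitor = new Object();
    private Thread shutdownHook;

    // Registers the hook once and returns it; the JVM starts it during shutdown.
    synchronized Thread registerShutdownHook() {
        if (shutdownHook == null) {
            shutdownHook = new Thread(() -> {
                // The hook must own the monitor before it may close anything.
                synchronized (monitor) {
                    System.out.println("closing context");
                }
            }, "SketchShutdownHook");
            Runtime.getRuntime().addShutdownHook(shutdownHook);
        }
        return shutdownHook;
    }

    public static void main(String[] args) {
        ShutdownHookSketch sketch = new ShutdownHookSketch();
        Thread hook = sketch.registerShutdownHook();
        // The hook is registered but not started until the JVM begins shutting down.
        System.out.println(hook.getName() + " state: " + hook.getState());
    }
}
```

The hook thread only runs once the JVM begins shutting down, for example when `main` returns or `System.exit` is called.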
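In other words, Spring creates a new thread and registers it as a JVM shutdown hook. The key point is that the hook must acquire the monitor lock of startupShutdownMonitor before it closes the context. This lock looks familiar: Spring acquires the same lock in refresh().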
public abstract class AbstractApplicationContext extends DefaultResourceLoader
        implements ConfigurableApplicationContext {

    @Override
    public void refresh() throws BeansException, IllegalStateException {
        synchronized (this.startupShutdownMonitor) {
            StartupStep contextRefresh = this.applicationStartup.start("spring.context.refresh");
            // Prepare this context for refreshing.
            prepareRefresh();
            // Tell the subclass to refresh the internal bean factory.
            ConfigurableListableBeanFactory beanFactory = obtainFreshBeanFactory();
            // Prepare the bean factory for use in this context.
            prepareBeanFactory(beanFactory);
            ......
        }
    }
}
At this point our guess was a deadlock on startupShutdownMonitor. Let's take a thread dump with jstack:
"SpringContextShutdownHook" #18 prio=5 os_prio=0 tid=0x0000000024e00800 nid=0x407c waiting for monitor entry [0x000000002921f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.springframework.context.support.AbstractApplicationContext$1.run(AbstractApplicationContext.java:991)
- waiting to lock <0x00000006c494f430> (a java.lang.Object)
"main" #1 prio=5 os_prio=0 tid=0x0000000002de4000 nid=0x1ff4 in Object.wait() [0x0000000002dde000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
at java.lang.Thread.join(Thread.java:1252)
- locked <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
at java.lang.Thread.join(Thread.java:1326)
at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:107)
at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
at java.lang.Shutdown.runHooks(Shutdown.java:123)
at java.lang.Shutdown.sequence(Shutdown.java:167)
at java.lang.Shutdown.exit(Shutdown.java:212)
- locked <0x00000006c4845128> (a java.lang.Class for java.lang.Shutdown)
at java.lang.Runtime.exit(Runtime.java:109)
at java.lang.System.exit(System.java:971)
at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:12)
at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:7)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:176)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:169)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:143)
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:421)
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:378)
at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:938)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
- locked <0x00000006c494f430> (a java.lang.Object)
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:144)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:771)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:763)
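At first glance jstack does not report a deadlock (tools such as jvisualvm and jconsole do not either). That is because main is not BLOCKED on a monitor: it is WAITING inside Thread.join, so there is no cycle of monitor owners to find. The thread stacks still tell the story:
The root cause is that the main thread is blocked inside System.exit. Tracing further down shows that it is blocked in ApplicationShutdownHooks: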
class ApplicationShutdownHooks {

    /* Iterates over all application hooks creating a new thread for each
     * to run in. Hooks are run concurrently and this method waits for
     * them to finish.
     */
    static void runHooks() {
        Collection<Thread> threads;
        synchronized(ApplicationShutdownHooks.class) {
            threads = hooks.keySet();
            hooks = null;
        }
        for (Thread hook : threads) {
            hook.start();
        }
        for (Thread hook : threads) {
            while (true) {
                try {
                    // Wait for the shutdown hook thread to finish
                    hook.join();
                    break;
                } catch (InterruptedException ignored) {
                }
            }
        }
    }
}
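Note the retry loop around `hook.join()`: an interrupt does not get the waiting thread out of it. The sketch below reproduces only that loop with plain JDK threads, to show that a thread stuck in it stays stuck until the hook really finishes. `JoinRetrySketch`, `joinIgnoringInterrupts`, and the latch-controlled "hook" are hypothetical names for this illustration, not JDK or Spring APIs.

```java
import java.util.concurrent.CountDownLatch;

class JoinRetrySketch {

    // Same retry pattern as ApplicationShutdownHooks.runHooks(): interrupts are
    // swallowed and join() is retried until the target thread has finished.
    static void joinIgnoringInterrupts(Thread target) {
        while (true) {
            try {
                target.join();
                break;
            } catch (InterruptedException ignored) {
                // Keep waiting, as runHooks() does.
            }
        }
    }

    // Returns true if the waiter was still alive after being interrupted,
    // i.e. the interrupt did not get it out of the join loop.
    static boolean waiterSurvivesInterrupt() {
        CountDownLatch release = new CountDownLatch(1);
        Thread hook = new Thread(() -> {
            try {
                release.await(); // stays "stuck" until the sketch releases it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sketch-hook");
        Thread waiter = new Thread(() -> joinIgnoringInterrupts(hook), "sketch-waiter");
        hook.start();
        waiter.start();
        try {
            waiter.interrupt();
            waiter.join(200);                 // give the interrupt time to take effect
            boolean stillWaiting = waiter.isAlive();
            release.countDown();              // let the hook finish
            waiter.join();                    // only now can the waiter return
            return stillWaiting;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("waiter still blocked after interrupt: " + waiterSurvivesInterrupt());
    }
}
```

So interrupting the main thread from outside would not resolve the hang; only the hook finishing does.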
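The whole deadlock flow, as read from the thread dump:
1. main enters refresh() and locks startupShutdownMonitor (0x00000006c494f430).
2. Still inside refresh(), finishRefresh() publishes ContextRefreshedEvent, and the listener calls System.exit on the main thread.
3. System.exit runs ApplicationShutdownHooks.runHooks(), which starts SpringContextShutdownHook and then calls join() on it.
4. SpringContextShutdownHook blocks while trying to lock startupShutdownMonitor, which main still holds.
5. main cannot leave refresh() until join() returns, and the hook cannot finish until main releases the lock, so neither thread makes progress.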
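The same flow can be modeled with two plain JDK threads, which also makes it possible to ask the JVM's own detector what it sees. This is a sketch under assumptions: `sketch-main` plays the role of main, `sketch-hook` plays SpringContextShutdownHook, and `monitor` stands in for startupShutdownMonitor. None of these names come from Spring. Unlike the real case, the sketch interrupts the waiting thread at the end so that the demo can terminate.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class UndetectedDeadlockSketch {

    // Returns what the JVM's monitor-deadlock detector reports while the two
    // threads are stuck; null means "no deadlock found".
    static long[] detectWhileStuck() {
        final Object monitor = new Object(); // stand-in for startupShutdownMonitor

        // Plays the role of SpringContextShutdownHook: needs the monitor to proceed.
        Thread hook = new Thread(() -> {
            synchronized (monitor) {
                // doClose() would run here.
            }
        }, "sketch-hook");

        // Plays the role of main: holds the monitor, then starts and joins the hook.
        Thread refresher = new Thread(() -> {
            synchronized (monitor) {
                hook.start();
                try {
                    hook.join(); // cannot return: the hook never gets the monitor
                } catch (InterruptedException e) {
                    // Only this sketch gives up on interrupt, so the demo can end.
                }
            }
        }, "sketch-main");

        refresher.start();
        try {
            // Wait until both threads have reached their stuck states.
            while (hook.getState() != Thread.State.BLOCKED
                    || refresher.getState() != Thread.State.WAITING) {
                Thread.sleep(10);
            }
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            long[] deadlocked = mx.findMonitorDeadlockedThreads();

            // Break the cycle so the demo terminates.
            refresher.interrupt();
            refresher.join();
            hook.join();
            return deadlocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long[] ids = detectWhileStuck();
        System.out.println(ids == null ? "no deadlock reported" : ids.length + " threads reported");
    }
}
```

The detector is expected to report nothing here, which matches what jstack showed: one thread is BLOCKED on a monitor, but its owner is WAITING in `join()` rather than blocked on another monitor, so there is no monitor cycle.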
So System.exit must not be triggered on the refreshing thread before Spring has finished refresh().
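One possible way around this, offered as a sketch and not as the fix the framework adopted, is to hand the exit call to another thread so that the refreshing thread can return and release the monitor. The model below is JDK-only and uses made-up names (`DeferredExitSketch`, `exitAsync`, `simulate`). The exit action is a parameter so that the sketch can run without terminating the JVM; in a real listener it would be `() -> System.exit(0)`.

```java
class DeferredExitSketch {

    // Runs the exit action on its own thread so the caller can return and
    // release any monitor it holds. A real listener would pass
    // () -> System.exit(0); here it is a parameter so the sketch is testable.
    static Thread exitAsync(Runnable exitAction) {
        Thread exitThread = new Thread(exitAction, "sketch-async-exit");
        exitThread.start();
        return exitThread;
    }

    // Models refresh() followed by shutdown; returns true if the simulated
    // shutdown hook was able to run to completion.
    static boolean simulate() {
        final Object monitor = new Object(); // stand-in for startupShutdownMonitor
        final boolean[] closed = {false};

        // Simulated shutdown hook: needs the monitor, then "closes" the context.
        Thread hook = new Thread(() -> {
            synchronized (monitor) {
                closed[0] = true;
            }
        }, "sketch-hook");

        // Simulated System.exit: start the hook and wait for it, like runHooks().
        Runnable simulatedExit = () -> {
            hook.start();
            try {
                hook.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        Thread exitThread;
        synchronized (monitor) {            // "refresh()" holds the monitor
            exitThread = exitAsync(simulatedExit);
        }                                   // refresh() returns, monitor released
        try {
            exitThread.join();              // the hook can now acquire the monitor
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return closed[0];
    }

    public static void main(String[] args) {
        System.out.println("shutdown completed: " + simulate());
    }
}
```

With this arrangement the exit thread still waits for the hook, and the hook still waits for the monitor, but the monitor's owner is free to finish refresh() and release it, so the wait ends.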