Fixing a Nacos issue: an NPE caused by concurrency

ISSUE

Spring Boot application startup is terminated #21

Error analysis

Inheritance of DeferredApplicationEventPublisher

import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationEvent;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.ApplicationListener;
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.context.event.ContextRefreshedEvent;

public class DeferredApplicationEventPublisher implements ApplicationEventPublisher, ApplicationListener<ContextRefreshedEvent> {
  ...
}
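The two interfaces in the declaration above suggest a buffer-then-replay pattern: events published before the context is ready are held, then replayed once the context has refreshed. The following is a minimal, Spring-free sketch of that pattern under stated assumptions. The class name `DeferredPublisherSketch`, the `String` events, and the `contextReady` flag are all hypothetical stand-ins, and this is not the nacos-spring-context source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical model of a deferred publisher; not the real class.
class DeferredPublisherSketch {

    // Events published too early wait here. LinkedList is not thread-safe.
    private final List<String> deferredEvents = new LinkedList<>();
    // Stands in for the listeners of the real application context.
    private final List<String> delivered = new ArrayList<>();
    private boolean contextReady = false;

    // Counterpart of ApplicationEventPublisher.publishEvent.
    void publishEvent(String event) {
        if (contextReady) {
            delivered.add(event);
        } else {
            deferredEvents.add(event);
        }
    }

    // Counterpart of onApplicationEvent(ContextRefreshedEvent): replay the buffer.
    void onContextRefreshed() {
        contextReady = true;
        Iterator<String> iterator = deferredEvents.iterator();
        while (iterator.hasNext()) {
            delivered.add(iterator.next());
            iterator.remove();
        }
    }

    List<String> delivered() {
        return delivered;
    }

    int pending() {
        return deferredEvents.size();
    }

    public static void main(String[] args) {
        DeferredPublisherSketch publisher = new DeferredPublisherSketch();
        publisher.publishEvent("listener-registered");
        System.out.println("pending before refresh: " + publisher.pending());
        publisher.onContextRefreshed();
        publisher.publishEvent("config-changed");
        System.out.println("delivered: " + publisher.delivered());
    }
}
```

Nothing in this sketch guards the buffer, so it is only safe if `publishEvent` and the replay are never called from more than one thread at a time.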

Dependency graph of DeferredApplicationEventPublisher

Now let's analyze exactly why the NPE occurs

First, look at addListener in EventPublishingConfigService

@Override
public void addListener(String dataId, String group, Listener listener) throws NacosException {
  Listener listenerAdapter = new DelegatingEventPublishingListener(configService, dataId, group, applicationEventPublisher, executor, listener);
  configService.addListener(dataId, group, listenerAdapter);
  publishEvent(new NacosConfigListenerRegisteredEvent(configService, dataId, group, listener, true));
}

Next, look at the DelegatingEventPublishingListener class

import com.alibaba.nacos.api.config.ConfigService;
import com.alibaba.nacos.api.config.listener.Listener;
import org.springframework.context.ApplicationEventPublisher;

import java.util.concurrent.Executor;

final class DelegatingEventPublishingListener implements Listener {
  DelegatingEventPublishingListener(ConfigService configService, String dataId, String groupId, ApplicationEventPublisher applicationEventPublisher, Executor executor, Listener delegate) {
    this.configService = configService;
    this.dataId = dataId;
    this.groupId = groupId;
    this.applicationEventPublisher = applicationEventPublisher;
    this.executor = executor;
    this.delegate = delegate;
  }
}

As you can see, when a DelegatingEventPublishingListener is created, it is passed a thread pool (an Executor) and an ApplicationEventPublisher (which is in fact the DeferredApplicationEventPublisher).

Then look at what CacheData.safeNotifyListener() does

private void safeNotifyListener(final String dataId, final String group, final String content, final String md5, final ManagerListenerWrap listenerWrap) {
        final Listener listener = listenerWrap.listener;
        Runnable job = new Runnable() {
            public void run() {
                ClassLoader myClassLoader = Thread.currentThread().getContextClassLoader();
                ClassLoader appClassLoader = listener.getClass().getClassLoader();
                try {
                    if (listener instanceof AbstractSharedListener) {
                        AbstractSharedListener adapter = (AbstractSharedListener)listener;
                        adapter.fillContext(dataId, group);
                        LOGGER.info("[{}] [notify-context] dataId={}, group={}, md5={}", name, dataId, group, md5);
                    }
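                    // Before running the callback, set the thread's context classloader to the webapp's own classloader, so that SPI calls made inside the callback do not fail or resolve the wrong implementation (only a problem when several applications are deployed together).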
                    Thread.currentThread().setContextClassLoader(appClassLoader);

                    ConfigResponse cr = new ConfigResponse();
                    cr.setDataId(dataId);
                    cr.setGroup(group);
                    cr.setContent(content);
                    configFilterChainManager.doFilter(null, cr);
                    String contentTmp = cr.getContent();
                    listener.receiveConfigInfo(contentTmp);
                    listenerWrap.lastCallMd5 = md5;
                    LOGGER.info("[{}] [notify-ok] dataId={}, group={}, md5={}, listener={} ", name, dataId, group, md5,
                        listener);
                } catch (NacosException de) {
                    LOGGER.error("[{}] [notify-error] dataId={}, group={}, md5={}, listener={} errCode={} errMsg={}", name,
                        dataId, group, md5, listener, de.getErrCode(), de.getErrMsg());
                } catch (Throwable t) {
                    LOGGER.error("[{}] [notify-error] dataId={}, group={}, md5={}, listener={} tx={}", name, dataId, group,
                        md5, listener, t.getCause());
                } finally {
                    Thread.currentThread().setContextClassLoader(myClassLoader);
                }
            }
        };

        final long startNotify = System.currentTimeMillis();
        try {
            if (null != listener.getExecutor()) {
                listener.getExecutor().execute(job);
            } else {
                job.run();
            }
        }
  ...
}
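
Here you can see that safeNotifyListener is what delivers the change notification to the Listeners. It also contains one extremely important line, which is the reason the LinkedList ends up being used concurrently: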

listener.getExecutor().execute(job);
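To make the effect of that line concrete, here is a small sketch of the same branch in isolation: with an executor the job runs on a pool thread, without one it runs on the caller's thread. The class `DispatchSketch` and its methods are hypothetical and only mirror the `getExecutor()` check; they are not part of the Nacos client.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of "use the listener's executor if it has one".
class DispatchSketch {

    // Same shape as the branch in safeNotifyListener.
    static void dispatch(Executor executorOrNull, Runnable job) {
        if (executorOrNull != null) {
            executorOrNull.execute(job);
        } else {
            job.run();
        }
    }

    // Returns the name of the thread that actually ran the job.
    static String threadUsed(Executor executorOrNull) {
        final String[] name = new String[1];
        final CountDownLatch done = new CountDownLatch(1);
        dispatch(executorOrNull, () -> {
            name[0] = Thread.currentThread().getName();
            done.countDown();
        });
        try {
            done.await(); // also makes the write to name[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return name[0];
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println("no executor:   " + threadUsed(null));
            System.out.println("with executor: " + threadUsed(pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Once callbacks run on pool threads, any object they share, such as a single event publisher, has to tolerate concurrent calls.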

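Remember that the DelegatingEventPublishingListener was given an Executor when it was created? Here the Listener hands the job above to that Executor, so the job is scheduled on the thread pool. Several callbacks can therefore run at the same time on different threads, and DeferredApplicationEventPublisher may be used concurrently.

Reproducing the error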

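import java.util.Iterator;
import java.util.LinkedList;
import java.util.concurrent.CountDownLatch;

public class DeferrNPE {

    // Shared, non-thread-safe list that the three writer threads add to.
    private static final LinkedList<String> list = new LinkedList<>();

    // latch: signals that every writer has finished.
    // start: releases all writers at the same moment to maximize contention.
    private static final CountDownLatch latch = new CountDownLatch(3);
    private static final CountDownLatch start = new CountDownLatch(3);

    private static class MyListener implements Runnable {

        @Override
        public void run() {
            start.countDown();
            try {
                start.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            // Unsynchronized add: concurrent calls can corrupt the list's links and size.
            list.add(String.valueOf(System.currentTimeMillis()));
            latch.countDown();
        }
    }

    public static void main(String[] args) {
        MyListener l1 = new MyListener();
        MyListener l2 = new MyListener();
        MyListener l3 = new MyListener();
        new Thread(l1).start();
        new Thread(l2).start();
        new Thread(l3).start();
        try {
            latch.await();
            // The race is timing-dependent, so the failure may not appear on every run.
            Iterator<String> iterator = list.iterator();
            while (iterator.hasNext()) {
                System.out.println(iterator.next());
                iterator.remove();
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

}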

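The final fix

Because a non-thread-safe collection was being used in a concurrent scenario, the only option was to change the container used in the upper layer, nacos-spring-context, replacing the non-thread-safe LinkedList with the thread-safe ConcurrentLinkedQueue.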
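As a rough illustration of that change, the sketch below has several threads add to a ConcurrentLinkedQueue at the same moment and then drains it with `poll()`. The class and method names (`ConcurrentQueueSketch`, `addConcurrentlyAndDrain`) are hypothetical, and this is not the actual nacos-spring-context patch, only a demonstration of the container swap under those assumptions.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo of replacing LinkedList with ConcurrentLinkedQueue.
class ConcurrentQueueSketch {

    // Starts `threads` writers that add at the same moment, then drains the queue.
    // Returns how many elements were drained.
    static int addConcurrentlyAndDrain(int threads) {
        final Queue<String> queue = new ConcurrentLinkedQueue<>();
        final CountDownLatch start = new CountDownLatch(threads);
        final CountDownLatch done = new CountDownLatch(threads);

        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                start.countDown();
                try {
                    start.await(); // wait until every writer is ready
                    queue.add(String.valueOf(System.nanoTime()));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }

        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // poll() returns null once the queue is empty, so no iterator is needed.
        int drained = 0;
        while (queue.poll() != null) {
            drained++;
        }
        return drained;
    }

    public static void main(String[] args) {
        System.out.println("drained: " + addConcurrentlyAndDrain(3));
    }
}
```

ConcurrentLinkedQueue is non-blocking and does not accept null elements, so a null from `poll()` unambiguously means the queue is empty.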

Reposted from: https://juejin.im/post/5cee66f66fb9a07eeb138b55