日志框架冲突引发的问题及思考

问题

最近遇到了一个奇怪的问题,在线上发布应用时,有一部分机器能正常运行,有一部分起不来。代码部署的都是同一套,应用是tomcat中部署的SpringBoot war包,应用日志看不到,查看tomcat日志,发现报错信息为:

Exception in thread "main" java.lang.IllegalArgumentException: LoggerFactory is not a Logback LoggerContext but Logback is on the classpath. Either remove Logback or the competing implementation (class org.apache.logging.slf4j.Log4jLoggerFactory loaded from ......WEB-INFO/lib/log4j-slf4j-impl-2.13.3.jar). If you are using WebLogic you will need to add 'org.slf4j' to prefer-application-packages in WEB-INF/weblogic.xml: org.apache.logging.slf4j.Log4jLoggerFactory
    at org.springframework.util.Assert.instanceCheckFailed(Assert.java:699)
    at org.springframework.util.Assert.isInstanceOf(Assert.java:599)
    at org.springframework.boot.logging.logback.LogbackLoggingSystem.getLoggerContext(LogbackLoggingSystem.java:284)
    at org.springframework.boot.logging.logback.LogbackLoggingSystem.beforeInitialize(LogbackLoggingSystem.java:104)
......

看提示信息是日志框架冲突了,我们用的是logback却在新版本中无意中引入了log4j的jar包,排除掉log4j的相关引用后都可以正常发布运行了!

问题虽然找到原因也解决了,但是为什么这个问题会引发异常还不是很清楚,为什么有的机器可以有的不行也很奇怪,正好也学习下日志框架。

观察到有一段SLF4J的日志,提示有多个日志实现框架,报错的机器Actual binding都是log4j,不报错的是logback。
启动成功的日志:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:../ch/qos/logback/logback-classic/1.2.3/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:.../org/apache/logging/log4j/log4j-slf4j-impl/2.13.3/log4j-slf4j-impl-2.13.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]

启动失败的日志:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:.../org/apache/logging/log4j/log4j-slf4j-impl/2.13.3/log4j-slf4j-impl-2.13.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:../ch/qos/logback/logback-classic/1.2.3/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

就先研究下SLF4J。

SLF4J

slf4j是日志接口层,提供了日志的统一接口,可以有不同的日志实现。
slf4j通过LoggerFactory.getLogger方法返回Logger接口供上层使用,我们通过源码来看下是如何实现绑定不同的日志框架的。以下代码来自slf4j-api-1.7.25:

public static Logger getLogger(Class clazz) {
        Logger logger = getLogger(clazz.getName());
        ......下面省略非主流程......
}

public static Logger getLogger(String name) {
        ILoggerFactory iLoggerFactory = getILoggerFactory();
        return iLoggerFactory.getLogger(name);
    }

使用的工厂模式来创建日志对象,那关键就是看如何获取的不同日志工厂了。

public static ILoggerFactory getILoggerFactory() {
        if (INITIALIZATION_STATE == UNINITIALIZED) {
            synchronized (LoggerFactory.class) {
                if (INITIALIZATION_STATE == UNINITIALIZED) {
                    INITIALIZATION_STATE = ONGOING_INITIALIZATION;
                    performInitialization();
                }
            }
        }
        ......下面省略非主流程......
}

private final static void performInitialization() {
        bind();
        if (INITIALIZATION_STATE == SUCCESSFUL_INITIALIZATION) {
            versionSanityCheck();
        }
    }

private final static void bind() {
        try {
            Set staticLoggerBinderPathSet = null;
            // skip check under android, see also
            // http://jira.qos.ch/browse/SLF4J-328
            if (!isAndroid()) {
                staticLoggerBinderPathSet = findPossibleStaticLoggerBinderPathSet();
                reportMultipleBindingAmbiguity(staticLoggerBinderPathSet);
            }
            // the next line does the binding
            StaticLoggerBinder.getSingleton();
            INITIALIZATION_STATE = SUCCESSFUL_INITIALIZATION;
            reportActualBinding(staticLoggerBinderPathSet);
            ......下面省略非主流程......
        }
}

可以看到,绑定日志实现类的流程在bind方法里面,reportMultipleBindingAmbiguity和reportActualBinding就是打印上面SLF4J日志的地方了。但是具体是如何绑定的呢?

            // the next line does the binding
            StaticLoggerBinder.getSingleton();

纳尼?StaticLoggerBinder这个类的实现不存在slf4j的jar包中啊,为什么调用一下获取的单例方法就能够绑定日志框架了?
这里是通过java的类加载机制,在slf4j框架中不仅定义了绑定接口(LoggerFactoryBinder),还固定了它的实现类必须得是org.slf4j.impl.StaticLoggerBinder,也就是说,需要绑定slf4j的日志实现框架,不仅要实现LoggerFactoryBinder接口,而且还必须得定义org.slf4j.impl.StaticLoggerBinder这个类来实现,这样在slf4j绑定的时候就可以通过类名来加载这个类了,哪个先被类加载器加载到,就绑定的那个实现框架。不知道这里为啥这样设计,一般定义好接口,获取外部可拓展的实现类时,都是通过java SPI机制来实现的,SLF4J这样设计猜测可能是出于性能考虑。

原因

上面看到SLF4J绑定日志框架是看先加载到哪个类,而绑定到logback时应用正常启动,绑定到log4j时应用启动失败。而失败堆栈是springboot打印出来的,猜测应该是springboot适配日志系统的问题。
出错处代码在LogbackLoggingSystem中:

private LoggerContext getLoggerContext() {
        ILoggerFactory factory = StaticLoggerBinder.getSingleton().getLoggerFactory();
        Assert.isInstanceOf(LoggerContext.class, factory,
                String.format(
                        "LoggerFactory is not a Logback LoggerContext but Logback is on "
                                + "the classpath. Either remove Logback or the competing "
                                + "implementation (%s loaded from %s). If you are using "
                                + "WebLogic you will need to add 'org.slf4j' to "
                                + "prefer-application-packages in WEB-INF/weblogic.xml",
                        factory.getClass(), getLocation(factory)));
        return (LoggerContext) factory;
    }

也就是springboot在获得LoggingSystem取到的是LogbackLoggingSystem,所以在LoggerBinder绑定Logback的时候正常,而在绑定Log4j的时候会抛异常。这就是只有一部分机器启动失败的原因了。那么为什么LoggingSystem会是LogbackLoggingSystem呢,再看下获取LoggingSystem的代码(spring-boot-2.3.2.RELEASE)

public static LoggingSystem get(ClassLoader classLoader) {
        String loggingSystem = System.getProperty(SYSTEM_PROPERTY);
        if (StringUtils.hasLength(loggingSystem)) {
            if (NONE.equals(loggingSystem)) {
                return new NoOpLoggingSystem();
            }
            return get(classLoader, loggingSystem);
        }
        return SYSTEMS.entrySet().stream().filter((entry) -> ClassUtils.isPresent(entry.getKey(), classLoader))
                .map((entry) -> get(classLoader, entry.getValue())).findFirst()
                .orElseThrow(() -> new IllegalStateException("No suitable logging system located"));
    }

可以看到是先从系统属性中取loggingSystem,没有的话再从SYSTEMS变量中遍历当前类存在的。SYSTEMS是这样初始化的:

private static final Map SYSTEMS;
    static {
        Map systems = new LinkedHashMap<>();
        systems.put("ch.qos.logback.core.Appender", "org.springframework.boot.logging.logback.LogbackLoggingSystem");
        systems.put("org.apache.logging.log4j.core.impl.Log4jContextFactory",
                "org.springframework.boot.logging.log4j2.Log4J2LoggingSystem");
        systems.put("java.util.logging.LogManager", "org.springframework.boot.logging.java.JavaLoggingSystem");
        SYSTEMS = Collections.unmodifiableMap(systems);
    }

由于没有设置SYSTEM_PROPERTY的系统属性,而SYSTEMS中第一项是logback,同时logback的jar包也在类路径中,所以LoggingSystem每次取到的都是LogbackLoggingSystem了。至此问题出现的原因已找到,同时也找到了另外一种修改的方式,即通过设置系统属性来让springboot不使用日志系统输出日志:

System.setProperty(LoggingSystem.class.getName(), "none");

经测试果然也能避免异常的发生。

猜测

现在问题的原因基本已找到,但还有一个最根本的问题没有解决,那就是,为什么有的机器是先加载的Log4j的StaticLoggerBinder类,有的先加载的是Logback中的。这个问题暂时还没有找到答案,只能有一些大概的猜测,哪位大佬如果有想法可以留言交流下。
以下分析tomcat版本为8.0.30:
Tomcat中Web应用的类加载器使用的是WebappClassLoader来加载WEB-INF/lib中的jar包,在数据完全一致的时候为什么会出现不一样的结果呢?查看WebappClassLoader加载类的流程,发现是将lib中的jar包封装成WebResourceSet存到WebResourceRoot中,而加载lib中的jar的代码在StandardRoot.processWebInfLib:

protected void processWebInfLib() throws LifecycleException {
        WebResource[] possibleJars = listResources("/WEB-INF/lib", false);

        for (WebResource possibleJar : possibleJars) {
            if (possibleJar.isFile() && possibleJar.getName().endsWith(".jar")) {
                createWebResourceSet(ResourceSetType.CLASSES_JAR,
                        "/WEB-INF/classes", possibleJar.getURL(), "/");
            }
        }
    }

jar包加载的顺序是listResources决定的,里面会调用File.list()方法来获取目录中的所有文件,而java中这个方法的注释有说:

  There is no guarantee that the name strings in the resulting array
     * will appear in any specific order; they are not, in particular,
     * guaranteed to appear in alphabetical order.
      ...还有一些别的注释...
      public String[] list() {
        SecurityManager security = System.getSecurityManager();
        if (security != null) {
            security.checkRead(path);
        }
        if (isInvalid()) {
            return null;
        }
        return fs.list(this);
    }

翻译过来也就是说不保证返回的顺序,尤其是不保证安字母顺序返回,我猜测这就是加载类顺序部固定的原因,但是也have no guarantee

你可能感兴趣的:(日志框架冲突引发的问题及思考)