Hive service startup failure: troubleshooting notes

Environment

hadoop-2.7.4
hive-2.3.2
hbase-1.4.2
jdk1.8.0_161

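Symptoms
The hiveserver2 and metastore services had been running without problems. After both were restarted, each failed with the exception below.

Example start command: hive --service hiveserver2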

Exception in thread "main" java.lang.NoSuchMethodError: com.ibm.icu.impl.ICUBinary.getRequiredData(Ljava/lang/String;)Ljava/nio/ByteBuffer;
    at com.ibm.icu.charset.UConverterAlias.haveAliasData(UConverterAlias.java:131)
    at com.ibm.icu.charset.UConverterAlias.getCanonicalName(UConverterAlias.java:525)
    at com.ibm.icu.charset.CharsetProviderICU.getICUCanonicalName(CharsetProviderICU.java:126)
    at com.ibm.icu.charset.CharsetProviderICU.charsetForName(CharsetProviderICU.java:62)
    at java.nio.charset.Charset$2.run(Charset.java:412)
    at java.nio.charset.Charset$2.run(Charset.java:407)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.nio.charset.Charset.lookupViaProviders(Charset.java:406)
    at java.nio.charset.Charset.lookup2(Charset.java:477)
    at java.nio.charset.Charset.lookup(Charset.java:464)
    at java.nio.charset.Charset.forName(Charset.java:528)
    at com.sun.org.apache.xml.internal.serializer.Encodings$EncodingInfos.findCharsetNameFor(Encodings.java:386)
    at com.sun.org.apache.xml.internal.serializer.Encodings$EncodingInfos.findCharsetNameFor(Encodings.java:422)
    at com.sun.org.apache.xml.internal.serializer.Encodings$EncodingInfos.loadEncodingInfo(Encodings.java:450)
    at com.sun.org.apache.xml.internal.serializer.Encodings$EncodingInfos.<init>(Encodings.java:308)
    at com.sun.org.apache.xml.internal.serializer.Encodings$EncodingInfos.<init>(Encodings.java:296)
    at com.sun.org.apache.xml.internal.serializer.Encodings.<clinit>(Encodings.java:564)
    at com.sun.org.apache.xml.internal.serializer.ToStream.<init>(ToStream.java:134)
    at com.sun.org.apache.xml.internal.serializer.ToXMLStream.<init>(ToXMLStream.java:67)
    at com.sun.org.apache.xml.internal.serializer.ToUnknownStream.<init>(ToUnknownStream.java:143)
    at com.sun.org.apache.xalan.internal.xsltc.runtime.output.TransletOutputHandlerFactory.getSerializationHandler(TransletOutputHandlerFactory.java:159)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.getOutputHandler(TransformerImpl.java:438)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:328)
    at org.apache.hadoop.conf.Configuration.writeXml(Configuration.java:2790)
    at org.apache.hadoop.conf.Configuration.writeXml(Configuration.java:2769)
    at org.apache.hadoop.hive.conf.HiveConf.getConfVarInputStream(HiveConf.java:3628)
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:4051)
    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:4003)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:81)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:65)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:702)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

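Troubleshooting process:
1. From the stack trace, my first guess was that while the Java process was loading a charset at startup, it called a method that the class found in some library does not have. A NoSuchMethodError like this usually means mismatched versions of a library are on the classpath.
2. The charset provider CharsetProviderICU does not ship with the JDK (I searched the JDK's lib directory and found nothing), so I read the JDK source to see how this class gets loaded.
Charset.java:340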

 ClassLoader cl = ClassLoader.getSystemClassLoader();
 ServiceLoader<CharsetProvider> sl = ServiceLoader.load(CharsetProvider.class, cl);
 Iterator<CharsetProvider> i = sl.iterator();

The core of the loading logic in ServiceLoader is excerpted below:

try {
    String fullName = PREFIX + service.getName();
    if (loader == null)
        configs = ClassLoader.getSystemResources(fullName);
    else
        configs = loader.getResources(fullName);
} catch (IOException x) {
    fail(service, "Error locating configuration files", x);
}

where PREFIX is defined as

private static final String PREFIX = "META-INF/services/";

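So the registration of CharsetProviderICU must be declared in a META-INF/services/xxxx file inside some jar. However, I could not find the CharsetProviderICU class anywhere under Hive's lib directory.
3. Next I searched HBase's lib directory for CharsetProviderICU and found it in phoenix-4.14.0-HBase-1.4-client.jar. If phoenix-4.14.0-HBase-1.4-client.jar really is on Hive's classpath, the error makes sense. The Hive services had not been restarted since Phoenix was introduced, which is why the error had not appeared earlier.
4. The remaining question is why Hive loads jars from HBase's lib directory at startup. The Hive launch script (the bin/hive file) contains the following section: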

if [ "$SKIP_HBASECP" = false ]; then
  # HBase detection. Need bin/hbase and a conf dir for building classpath entries.
  # Start with BigTop defaults for HBASE_HOME and HBASE_CONF_DIR.
  HBASE_HOME=${HBASE_HOME:-"/usr/lib/hbase"}
  HBASE_CONF_DIR=${HBASE_CONF_DIR:-"/etc/hbase/conf"}
  if [[ ! -d $HBASE_CONF_DIR ]] ; then
    # not explicitly set, nor in BigTop location. Try looking in HBASE_HOME.
    HBASE_CONF_DIR="$HBASE_HOME/conf"
  fi
 
  # perhaps we've located the HBase config. if so, include it on classpath.
  if [[ -d $HBASE_CONF_DIR ]] ; then
    export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${HBASE_CONF_DIR}"
  fi
 
  # look for the hbase script. First check HBASE_HOME and then ask PATH.
  if [[ -e $HBASE_HOME/bin/hbase ]] ; then
    HBASE_BIN="$HBASE_HOME/bin/hbase"
  fi
  HBASE_BIN=${HBASE_BIN:-"$(which hbase)"}
 
  # perhaps we've located HBase. If so, include its details on the classpath
  if [[ -n $HBASE_BIN ]] ; then
    # exclude ZK, PB, and Guava (See HIVE-2055)
    # depends on HBASE-8438 (hbase-0.94.14+, hbase-0.96.1+) for `hbase mapredcp` command
    for x in $($HBASE_BIN mapredcp 2>> ${STDERR} | tr ':' '\n') ; do
      if [[ $x == *zookeeper* || $x == *protobuf-java* || $x == *guava* ]] ; then
        continue
      fi
      # TODO: should these should be added to AUX_PARAM as well?
      export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${x}"
    done
  fi
fi

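The key statement is hbase mapredcp 2>> ${STDERR} | tr ':' '\n', which lists the jars HBase needs at runtime, one per line. The loop then appends each of them to HADOOP_CLASSPATH, the classpath Hive runs with. This explains why Hive loads jars from HBase's lib directory.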
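
To double-check steps 3 and 4, the filtering loop can be replayed outside the launcher and the surviving jars searched for the ICU provider. The following is a minimal sketch, not part of Hive or HBase: the function names filter_hbase_cp and jars_containing are made up for illustration, and the second one assumes unzip is installed. Since ServiceLoader looks providers up under META-INF/services/ plus the service interface name, the registration file for a CharsetProvider is META-INF/services/java.nio.charset.spi.CharsetProvider.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of bin/hive.

# Print the classpath entries that the loop in bin/hive keeps, one per line.
# $1: colon-separated classpath, e.g. the output of "hbase mapredcp".
filter_hbase_cp() {
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
    case "$entry" in
      '') continue ;;                                   # skip empty entries
      *zookeeper*|*protobuf-java*|*guava*) continue ;;  # same exclusions as bin/hive
    esac
    printf '%s\n' "$entry"
  done
}

# Read jar paths on stdin and print those whose file listing contains $1.
# A jar is an ordinary zip archive, so unzip -l can list its entries.
jars_containing() {
  while IFS= read -r jar; do
    [ -f "$jar" ] || continue                           # ignore directories and missing files
    if unzip -l "$jar" 2>/dev/null | grep -qF -- "$1"; then
      printf '%s\n' "$jar"
    fi
  done
}

# Intended use on a host where the hbase command is available:
#   filter_hbase_cp "$(hbase mapredcp 2>/dev/null)" \
#     | jars_containing com/ibm/icu/charset/CharsetProviderICU.class
#   filter_hbase_cp "$(hbase mapredcp 2>/dev/null)" \
#     | jars_containing META-INF/services/java.nio.charset.spi.CharsetProvider
```

If the Phoenix client jar shows up in the output, it is being appended to HADOOP_CLASSPATH by the launcher, which matches the conclusion above.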

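Solution:
The hive launch script accepts a --skiphbasecp option. When it is passed, SKIP_HBASECP is set to true and the HBase jars are not added to the classpath, so the start commands become: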

nohup hive --skiphbasecp --service hiveserver2 >> /opt/hive-server2.log 2>&1 &
nohup hive --skiphbasecp --service metastore &
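
Why the flag helps can be pictured with a simplified sketch of launcher-style argument handling. This is an illustration written for these notes, not the verbatim bin/hive code: the function name parse_launcher_args is hypothetical, and it assumes the flag does nothing more than flip the SKIP_HBASECP variable that guards the HBase block quoted earlier.

```shell
#!/bin/sh
# Simplified illustration; not copied from bin/hive.
# parse_launcher_args is a hypothetical name used only in this sketch.
parse_launcher_args() {
  SKIP_HBASECP=false          # default: the HBase classpath block runs
  SERVICE=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --skiphbasecp)
        SKIP_HBASECP=true     # the HBase block is guarded by this variable
        shift
        ;;
      --service)
        SERVICE="${2:-}"      # service name follows the flag
        shift
        [ $# -gt 0 ] && shift # drop the service name if one was given
        ;;
      *)
        break                 # leave remaining arguments untouched
        ;;
    esac
  done
}

parse_launcher_args --skiphbasecp --service hiveserver2
if [ "$SKIP_HBASECP" = false ]; then
  echo "would append HBase jars to HADOOP_CLASSPATH for $SERVICE"
else
  echo "skipping HBase jars for $SERVICE"
fi
```

With the flag present the guarded branch is never entered, so nothing from HBase's lib directory, including the Phoenix client jar, reaches the classpath.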

After the services started, I tested Hive, Hive on HBase (which I personally do not recommend for production use), and HBase itself. All of them worked normally.
