Common HiveServer2 exceptions and how to handle them

1. Connection timed out

java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000/default: java.net.ConnectException: Connection timed out (Connection timed out)
	at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:256)
	at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:247)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.singleConnectionForAllStatementsExecute(MultiThreadStatementTest.java:88)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.execute(MultiThreadStatementTest.java:73)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.lambda$parallelExecute$0(MultiThreadStatementTest.java:55)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Possible causes:

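  1. HiveServer has reached its connection limit, because HiveServer needs one thread for each JDBC connection. The limit is set by the following property:
<property>
   <name>hive.server2.thrift.max.worker.threads</name>
   <value>500</value>
   <description>Maximum number of Thrift worker threads</description>
 </property>
  2. HiveServer is running out of memory (OOM) and cannot process client requests in time.
  3. Hive Metastore is not responding, or its backing database is stuck.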
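
On the client side, a transient timeout caused by a busy server can sometimes be absorbed by retrying the connection with a growing pause. The following is a minimal sketch of that idea, not something the original test code does: the class HiveConnectRetry and the method withRetry are hypothetical names, the attempt count and pauses are arbitrary, and the JDBC URL is the one from the stack trace above. Retrying does not fix any of the three causes; it only avoids failing on the first timeout.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.concurrent.Callable;

/** Hypothetical helper: retries an action, doubling the pause after each failure. */
final class HiveConnectRetry {

    /** Runs the action up to maxAttempts times and rethrows the last failure. */
    static <T> T withRetry(Callable<T> action, int maxAttempts, long initialBackoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry when the calling thread was asked to stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoff);
                    backoff *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Assumes the Hive JDBC driver is on the classpath.
        String url = "jdbc:hive2://localhost:10000/default";
        // Login timeout in seconds; a hint that a driver may or may not honor.
        DriverManager.setLoginTimeout(10);
        try (Connection conn = withRetry(() -> DriverManager.getConnection(url), 3, 1000L)) {
            System.out.println("connected: " + !conn.isClosed());
        }
    }
}
```

If every attempt fails, the last exception is rethrown unchanged, so the original stack trace is still available for diagnosis.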

2. Connection reset by peer

Caused by: java.net.SocketException: Connection reset by peer (connect failed)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
	... 14 more

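The backlog of the HiveServer2 server socket defaults to 0, yet on a CentOS system the backlog observed for HiveServer2 is 50. This is consistent with java.net.ServerSocket, which falls back to a backlog of 50 when the requested value is less than 1. When the server accepts connections too slowly and the operating system's queue of pending connections fills up, new connection requests are dropped and the client reports this error.
The following command shows the current backlog of HiveServer. It writes its output to a file named antp; for sockets in the LISTEN state, the Send-Q column is the backlog and the Recv-Q column is the number of connections currently waiting to be accepted: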

ss -antp > antp
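
To pull the backlog out of that file instead of reading it by eye, the lines can be filtered for the listening socket on the HiveServer2 port. The sketch below assumes the usual ss column order (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port), the file name antp from the command above, and port 10000 from the JDBC URL in section 1; the class ListenBacklog and its method are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalInt;

/** Hypothetical helper: extracts the listen backlog from saved `ss -antp` output. */
final class ListenBacklog {

    /** Returns Send-Q of the first LISTEN socket bound to the given port, if any. */
    static OptionalInt backlogForPort(List<String> ssLines, int port) {
        String suffix = ":" + port;
        for (String line : ssLines) {
            String[] f = line.trim().split("\\s+");
            // Assumed columns: State Recv-Q Send-Q Local:Port Peer:Port [Process]
            if (f.length >= 4 && f[0].equals("LISTEN") && f[3].endsWith(suffix)) {
                try {
                    return OptionalInt.of(Integer.parseInt(f[2]));
                } catch (NumberFormatException e) {
                    // Not a numeric Send-Q column; keep scanning.
                }
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) throws IOException {
        // "antp" is the file written by the ss command above.
        List<String> lines = Files.readAllLines(Paths.get("antp"));
        OptionalInt backlog = backlogForPort(lines, 10000);
        System.out.println(backlog.isPresent()
                ? "backlog = " + backlog.getAsInt()
                : "no LISTEN socket found on port 10000");
    }
}
```

A Recv-Q value on the same line that stays close to Send-Q would suggest that the accept queue is the bottleneck.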

3. Running, pool size = 100, active threads = 100, queued tasks = 100, completed tasks = xxx

The exception looks like this:

org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@6ef9d564 rejected from java.util.concurrent.ThreadPoolExecutor@6e2c02d2[Running, pool size = 100, active threads = 100, queued tasks = 100, completed tasks = 234]
	at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)
	at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)
	at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:324)
	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:265)
	at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:497)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.execute(MultiThreadStatementTest.java:82)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.lambda$parallelExecute$0(MultiThreadStatementTest.java:56)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

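The cause is that the wait queue of the async execution thread pool is full. Adjusting the following two parameters resolves it.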

<property>
   <name>hive.server2.async.exec.threads</name>
   <value>100</value>
   <description>Number of threads in the async thread pool for HiveServer2</description>
 </property>
 <property>
   <name>hive.server2.async.exec.wait.queue.size</name>
   <value>100</value>
   <description>
     Size of the wait queue for async thread pool in HiveServer2.
     After hitting this limit, the async thread pool will reject new requests.
   </description>
 </property>
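
The numbers in the exception message (pool size = 100, queued tasks = 100) follow from how a bounded java.util.concurrent.ThreadPoolExecutor behaves: once every thread is busy and the queue is full, the next task is rejected. The sketch below reproduces that behavior with plain JDK classes. It is an illustration, not HiveServer2's own code, and the class AsyncPoolRejectionDemo and its method are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: a fixed pool with a bounded queue rejects overflow tasks. */
final class AsyncPoolRejectionDemo {

    /** Submits `tasks` blocking tasks and returns how many were rejected. */
    static int countRejected(int threads, int queueSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 10L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueSize));
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task blocks until released, simulating a long-running query.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 100 threads + 100 queue slots hold 200 tasks; the 201st is rejected.
        System.out.println("rejected = " + countRejected(100, 100, 201));
    }
}
```

Under this model, the number of statements that can be outstanding at once is the sum of the two settings, so raising either value moves the rejection point, at the cost of more threads or more queued work held in memory.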
