Troubleshooting Timeout Exceptions in a HiveServer-Based Query Platform

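When our online query platform first went live, queries would often throw an exception after running for about 5 minutes. Because the platform is built on HiveServer2, the first step was to check whether the timeout settings on the target side were at fault.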

In Hive 0.10, two parameters determine HiveServer2's timeout behavior. Their default values are:

<property>
  <name>hive.server.read.socket.timeout</name>
  <value>10</value>
  <description>Timeout for the HiveServer to close the connection if no response from the client in N seconds, defaults to 10 seconds.</description>
</property>
<property>
  <name>hive.server.tcp.keepalive</name>
  <value>true</value>
  <description>Whether to enable TCP keepalive for the HiveServer. Keepalive will prevent accumulation of half-open connections.</description>
</property>
boolean tcpKeepAlive = conf.getBoolVar(HiveConf.ConfVars.SERVER_TCP_KEEP_ALIVE);
TServerTransport serverTransport = tcpKeepAlive
    ? new TServerSocketKeepAlive(cli.port)
    : new TServerSocket(cli.port, 1000 * conf.getIntVar(HiveConf.ConfVars.SERVER_READ_SOCKET_TIMEOUT));

Socket result = serverSocket_.accept();
TSocket result2 = new TSocket(result);
result2.setTimeout(clientTimeout_);
return result2;


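As the code shows, by default a TServerSocketKeepAlive is returned, which means no timeout occurs at the HiveServer level. Only when hive.server.tcp.keepalive is set to false is hive.server.read.socket.timeout applied as the timeout for reads from the client.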
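To make the difference between the two branches concrete, here is a minimal sketch using plain java.net sockets rather than Thrift. It assumes that the timeout passed to TSocket ends up as an ordinary socket read timeout (SO_TIMEOUT); the class and method names (ReadTimeoutDemo, readTimesOut) are made up for illustration and are not part of Hive. A silent client is accepted, keepalive is enabled, and a read is attempted with a short read timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Accepts one connection from a client that never sends anything and
     * tries to read a byte from it.
     * Returns true if the read was aborted by the read timeout.
     */
    static boolean readTimesOut(int soTimeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // Keepalive only probes for dead peers; it does not limit how
            // long a read may block on a healthy but silent connection.
            accepted.setKeepAlive(true);
            // The read timeout is what actually bounds a blocking read.
            accepted.setSoTimeout(soTimeoutMillis);
            InputStream in = accepted.getInputStream();
            try {
                in.read(); // the client never writes, so this blocks
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a read timeout of 0 (the java.net default, meaning "wait forever") the same read would simply keep blocking, which corresponds to the keepalive branch above never timing out on its own.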

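Since the far end does not time out, the next step is to walk the chain from the front. The read timeout proxy_read_timeout on the Nginx proxy is set to 30 minutes, so that should be fine. Next comes Tomcat's session timeout, which is left at the default of 30 minutes: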

<session-config>
       <session-timeout>30</session-timeout>
</session-config>

Only at this point did I remember that our HiveServer sits behind haproxy for HA. Sure enough, the timeout there was set to 5 minutes:

   timeout client 300000

   timeout server 300000

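Changing these to 30 minutes solved the problem. Note that both the client and the server timeout must be set. If the server timeout is not set, an exception is thrown in HiveStatement.execute(); if the server timeout is set but the client timeout is not, a "HiveQueryResult cannot create resultset" exception is thrown from retrieveSchema().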
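For reference, the adjusted haproxy settings would look roughly like the fragment below. This is a sketch rather than our actual configuration: the values are simply 30 minutes expressed in milliseconds, to match the unit of the original lines, and which section they live in (defaults, or the relevant frontend/backend) is an assumption.

```
    # 30 minutes = 30 * 60 * 1000 ms; both directions must be raised
    timeout client 1800000
    timeout server 1800000
```

haproxy also accepts time-unit suffixes, so `timeout client 30m` and `timeout server 30m` express the same thing.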

Timeout settings in a layered architecture really do have to be coordinated :)
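One way to picture the coordination problem is that the layer with the shortest timeout decides when a long query gets cut off. The sketch below lists the values from this investigation and picks the tightest one. The class and method names (TimeoutChain, tightestLayer) are made up for illustration, and HiveServer itself is left out because in keepalive mode it applies no read timeout.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

public class TimeoutChain {

    /** Returns the name of the layer with the shortest timeout, or null if the map is empty. */
    static String tightestLayer(Map<String, Duration> timeouts) {
        String tightest = null;
        for (Map.Entry<String, Duration> entry : timeouts.entrySet()) {
            if (tightest == null
                    || entry.getValue().compareTo(timeouts.get(tightest)) < 0) {
                tightest = entry.getKey();
            }
        }
        return tightest;
    }

    public static void main(String[] args) {
        // Values as found during the investigation, before the fix.
        Map<String, Duration> layers = new LinkedHashMap<>();
        layers.put("nginx proxy_read_timeout", Duration.ofMinutes(30));
        layers.put("tomcat session-timeout", Duration.ofMinutes(30));
        layers.put("haproxy timeout client/server", Duration.ofMillis(300000));

        String layer = tightestLayer(layers);
        System.out.println("tightest layer: " + layer + " (" + layers.get(layer) + ")");
    }
}
```

Running a check like this against the whole chain would have pointed at haproxy straight away, since its 300000 ms is the only value below 30 minutes.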

