微步229

HiveServer2(Spark ThriftServer)自定义权限认证

Hive 除了为我们提供一个 CLI 方式来查询数据之外，还给我们提供了基于 JDBC/ODBC 的方式来连接 Hive，这就是 HiveServer2（HiveServer）。但是默认情况下通过 JDBC 连接 HiveServer2 不需要任何的权限认证（hive.server2.authentication = NONE）；这意味着任何知道 ThriftServer 地址的人都可以连接我们的 Hive，并执行一些操作。更可怕的是，这些人甚至可以执行一些诸如 drop table xxx 之类的操作，这势必威胁到我们的数据安全。基于此，本文介绍如何通过自定义的方式来对连接 ThriftServer 的人进行有效的权限验证。

如果想及时了解Spark、Hadoop或者Hbase相关的文章，欢迎关注微信公共帐号：iteblog_hadoop

HiveServer2 支持多种用户安全认证方式：NONE, NOSASL, KERBEROS, LDAP, PAM ,CUSTOM 等等。我们可以通过 hive.server2.authentication 参数进行配置。这篇文章我们涉及到的配置是 hive.server2.authentication=CUSTOM，这时候我们需要通过 hive.server2.custom.authentication.class 参数配置我们自定义的权限认证类，这个类必须实现 org.apache.hive.service.auth.PasswdAuthenticationProvider 接口，org.apache.hive.service.auth.PasswdAuthenticationProvider 接口的定义如下：

package org.apache.hive.service.auth;

import javax.security.sasl.AuthenticationException;

public interface PasswdAuthenticationProvider {

/**

* The Authenticate method is called by the HiveServer2 authentication layer

* to authenticate users for their requests.

* If a user is to be granted, return nothing/throw nothing.

* When a user is to be disallowed, throw an appropriate {@link AuthenticationException}.

*

* For an example implementation, see {@link LdapAuthenticationProviderImpl}.

*

* @param user The username received over the connection request

* @param password The password received over the connection request

*

* @throws AuthenticationException When a user is found to be

* invalid by the implementation

*/

void Authenticate(String user, String password) throws AuthenticationException;

}

从上可知，我们需要实现 Authenticate 方法（为啥这个方法第一个字母大写？不应该是 authenticate 吗？）。现在我们来定义一个自己的授权认证类 IteblogPasswdAuthenticationProvider。为了简便，我把用户名和密码等数据写到了文件中；你要是愿意，你可以将这些信息存储到数据库中。实现代码非常简单，如下：

package com.iteblog.hive;

import com.google.common.base.Charsets;

import com.google.common.collect.Maps;

import com.google.common.io.Files;

import com.google.common.io.LineProcessor;

import org.apache.hadoop.hive.conf.HiveConf;

import org.apache.hive.service.auth.PasswdAuthenticationProvider;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

import javax.security.sasl.AuthenticationException;

import java.io.File;

import java.io.IOException;

import java.util.Map;

public class IteblogPasswdAuthenticationProvider implements PasswdAuthenticationProvider {

private static Logger logger = LoggerFactory.getLogger("IteblogPasswdAuthenticationProvider");

private static Map lines = null;

static {

HiveConf hiveConf = new HiveConf();

String filePath = hiveConf.get("hive.server2.custom.authentication.passwd.filepath");

logger.warn("Start init configuration file: {}.", filePath);

try {

File file = new File(filePath);

lines = Files.readLines(file, Charsets.UTF_8, new LineProcessor>() {

Map map = Maps.newHashMap();

public boolean processLine(String line) {

String arr[] = line.split(",");

if (arr.length != 2) {

logger.error("Configuration error: {}", line);

return false;

}

map.put(arr[0], arr[1]);

return true;

}

public Map getResult() {

return map;

}

});

} catch (IOException e) {

logger.error("Read configuration file error: {}", e.getMessage());

System.exit(127);

}

public void Authenticate(String username, String password) throws AuthenticationException {

if (lines == null) {

throw new AuthenticationException("Configuration file parser error!");

}

String passwd = lines.get(username);

if (passwd == null) {

throw new AuthenticationException("Unauthorized user: " + username + ", please contact iteblog to add an account.");

} else if (!passwd.equals(password)) {

throw new AuthenticationException("Incorrect password for " + username + ", please try again.");

}

logger.warn("User[{}] authorized success.", username);

}

上面的代码逻辑非常简单，我就不介绍了。我们把用户名和密码信息写到了 /home/iteblog/hive-thrift-passwd文件中，内容日下:

iteblog,123

并通过参数 hive.server2.custom.authentication.passwd.filepath 参数指定这个文件的路径。现在我们编译上面的类，得到的 jar 包放到 $HIVE_HOME/lib 路径下面（如果你是用 Spark，那你将这个 jar 包放到 $SPARK_HOME/jars 或 $SPARK_HOME/lib 目录下）。并在 hive-site.xml 文件里面加入以下的配置：

<property>

<name>hive.server2.authenticationname>

   <value>CUSTOMvalue>
 property>
  
 <property>
   <name>hive.server2.custom.authentication.classname>
   <value>com.iteblog.hive.IteblogPasswdAuthenticationProvidervalue>
 property>
  
 <property>
   <name>hive.server2.custom.authentication.passwd.filepathname>
   <value>/home/iteblog/hive-thrift-passwdvalue>
 property>

重启 thriftServer，现在我们来测试下这个设置是否生效：

beeline> !connect jdbc:hive2://www.iteblog.com:10000

Connecting to jdbc:hive2://www.iteblog.com:10000

Enter username for jdbc:hive2://www.iteblog.com:10000:

Enter password for jdbc:hive2://www.iteblog.com:10000:

Error: Could not open client transport with JDBC Uri: jdbc:hive2://www.iteblog.com:10000: Peer indicated failure: Error validating the login (state=08S01,code=0)

0: jdbc:hive2://www.iteblog.com:10000 (closed)>

上面测试我们登录的时候没有设置用户名和密码，如果你没有输入用户名系统默认会用 anonymous 用户进行登录，但是我们并没有配置这个用户登录，这时候应该是不能登录的。从上面终端输出也可以看出登录失败，而且日志里面有如下异常输出

18/01/10 16:50:40 WARN IteblogPasswdAuthenticationProvider: Start init configuration file: /home/iteblog/hive-thrift-passwd.

18/01/10 16:50:40 ERROR TSaslTransport: SASL negotiation failure

javax.security.sasl.SaslException: Error validating the login [Caused by javax.security.sasl.AuthenticationException: Unauthorized user: anonymous, please contact iteblog to add an account.]

at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:109)

at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)

at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)

at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)

at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)

at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

Caused by: javax.security.sasl.AuthenticationException: Unauthorized user: anonymous, please contact iteblog to add an account.

at com.iteblog.hive.IteblogPasswdAuthenticationProvider.Authenticate(IteblogPasswdAuthenticationProvider.java:58)

at org.apache.hive.service.auth.CustomAuthenticationProviderImpl.Authenticate(CustomAuthenticationProviderImpl.java:47)

at org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:106)

at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:102)

... 8 more

现在我们用正确的用户登录，但是密码是错误的，看下结果：

beeline> !connect jdbc:hive2://www.iteblog.com:10000

Connecting to jdbc:hive2://www.iteblog.com:10000

Enter username for jdbc:hive2://www.iteblog.com:10000: iteblog

Enter password for jdbc:hive2://www.iteblog.com:10000: ***

Error: Could not open client transport with JDBC Uri: jdbc:hive2://www.iteblog.com:10000: Peer indicated failure: Error validating the login (state=08S01,code=0)

从上面的结果可以看出，登录同样失败，日志里面也会有以下的异常信息：

18/01/10 16:53:39 ERROR TSaslTransport: SASL negotiation failure

javax.security.sasl.SaslException: Error validating the login [Caused by javax.security.sasl.AuthenticationException: Incorrect password for iteblog, please try again.]