Using Flink to Access Hive in a Kerberos Environment

Contents

Test Environment

Project Setup

Sample Code and Execution

Summary

This article describes how to use Flink to access Hive in a Kerberos-enabled environment.

Test Environment

1. Hive version: 2.1.1

2. Flink version: 1.10.0

Project Setup

Create a Java project with Maven in your IDE; the detailed creation steps are not covered here.

1. Add the following dependencies to the project's pom.xml:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java-bridge_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner-blink_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-hive_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-hadoop-compatibility_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-shaded-hadoop-2-uber</artifactId>
    <version>2.7.5-8.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>${hive.version}</version>
</dependency>
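The snippet above assumes that the flink.version, hive.version, and scala.binary.version properties are defined elsewhere in the POM. A minimal sketch matching the versions of the test environment:

<properties>
    <!-- Versions from the test environment described above -->
    <flink.version>1.10.0</flink.version>
    <hive.version>2.1.1</hive.version>
    <scala.binary.version>2.11</scala.binary.version>
</properties>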

2. Add the hive-site.xml, krb5.conf, and keytab files to the classpath, as shown below.

(Figure 1: resource files added to the project classpath)
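Assuming the files live under src/main/resources (the paths the KerberosAuth class below actually loads, where the Kerberos config is named krb5.ini), the layout looks roughly like this; note that hive-site.xml must also sit in whatever directory is passed to HiveCatalog as hiveConfDir:

src/main/resources/
├── hive-site.xml
├── krb5.ini
└── test.keytab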

Sample Code and Execution

1. The main program is as follows:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;
import org.apache.flink.types.Row;
import org.apache.hadoop.security.UserGroupInformation;

import java.io.IOException;
import java.security.PrivilegedExceptionAction;

public class HiveCatalogExample {

    private static String name = "myhive";           // name of the HiveCatalog
    private static String defaultDatabase = null;    // default Hive database (null falls back to "default")
    private static String hiveConfDir = "D:\\test";  // local path containing hive-site.xml
    private static String version = "2.1.1";         // Hive version

    public static void main(String[] args) throws Exception {

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env, settings);

        new KerberosAuth().kerberosAuth(false);  // Kerberos login
        HiveCatalog hive = getHiveCatalog();     // create the HiveCatalog
        tableEnv.registerCatalog(name, hive);    // set the HiveCatalog as the current catalog of the session
        tableEnv.useCatalog(name);
        tableEnv.useDatabase("02_logical_layer");

        Table table = tableEnv.from("test").select("withColumns(1 to 3)");
        tableEnv.toRetractStream(table, Row.class).print();
        tableEnv.execute("demo");

    }

    // Create the HiveCatalog as the logged-in Kerberos user
    public static HiveCatalog getHiveCatalog() throws Exception {
        HiveCatalog hiveCatalog = null;
        try {
            hiveCatalog = UserGroupInformation.getLoginUser().doAs(new PrivilegedExceptionAction<HiveCatalog>() {
                @Override
                public HiveCatalog run() throws Exception {
                    return new HiveCatalog(name, defaultDatabase, hiveConfDir, version);
                }
            });
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return hiveCatalog;
    }

}
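Besides the Table API expression used above, the registered catalog can also be queried with plain SQL. A minimal sketch that could replace the Table API lines in main(), assuming the same test table in the current catalog and database:

// Equivalent query expressed in SQL against the current catalog/database
Table result = tableEnv.sqlQuery("SELECT * FROM test");
tableEnv.toRetractStream(result, Row.class).print();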

2. The KerberosAuth class, which performs the Kerberos login:

import org.apache.hadoop.security.UserGroupInformation;

public class KerberosAuth {
    public void kerberosAuth(Boolean debug) {
        try {
            // Point the JVM at the Kerberos config, then log in from the keytab
            System.setProperty("java.security.krb5.conf", "src/main/resources/krb5.ini");
            System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
            if (debug) System.setProperty("sun.security.krb5.debug", "true");
            UserGroupInformation.loginUserFromKeytab("[email protected]", "src/main/resources/test.keytab");
            System.out.println(UserGroupInformation.getCurrentUser());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
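loginUserFromKeytab caches the credentials on the static login user. For long-running jobs, it is also common to re-login from the keytab when the TGT nears expiry; a minimal sketch using Hadoop's UserGroupInformation API:

import org.apache.hadoop.security.UserGroupInformation;
import java.io.IOException;

public class KerberosRelogin {
    // Call periodically from long-running code; this is a no-op while the TGT is still fresh.
    public static void relogin() throws IOException {
        UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
    }
}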

The run result is as follows:

(Figure 2: screenshot of the run output)

Summary

  1. When accessing Hive in a Kerberos environment, use the UserGroupInformation class from the Hadoop API to perform the Kerberos login; after a successful login, the API starts a background thread that periodically renews the credentials.
  2. Flink 1.10 rounds out the HiveCatalog, which makes reading Hive tables much simpler.
