Flink 1.9.1 Reading and Writing Hive (Successfully Integrated on CDH 5.14.2)

Table of Contents

  • Preface
  • Environment
  • Steps
    • 1. Configure Flink 1.9.1 to use Hive 1.2.1
    • 2. Basic sql-client test
    • 3. Submitting a Flink job from Java (demo)
  • References

Preface

This post documents getting Flink 1.9.1 to read from and write to Hive on a CDH 5.14.2 cluster (Hive upgraded to 1.2.1, Hadoop 2.6.0-cdh5.14.2). It covers connecting to Hive through the bundled sql-client as well as a small Java demo that does the same.

An earlier attempt on a CDH 6.1.0 cluster failed because of the Hive version there (Hive 2.1.1), and the stock Hive 1.1.0 shipped with CDH 5.14.2 also caused problems.
Note that Flink only added Hive integration in 1.9.x and it is still a beta feature. Flink officially supports only Hive 2.3.4 and 1.2.1. On the CDH 6.1.0 / Hadoop 3.0.0 cluster (Hive 2.1.1), configuring either 2.3.4 or 1.2.1 produced errors; on the CDH 5.14.2 cluster, after upgrading Hive from 1.1.0 to 1.2.1, the integration worked.

So it is best to use one of the officially supported Hive versions (or at least one with the same or a very close major version), otherwise you will run into errors such as missing methods.

Environment

A Flink 1.9.1 distribution compiled against CDH 5.14.2
Hive 1.1.0-cdh5.14.2
The CDH 5.14.2 cluster's Hive upgraded from 1.1.0 to 1.2.1

Steps

1. Configure Flink 1.9.1 to use Hive 1.2.1

First, edit flink-1.9.1/conf/sql-client-defaults.yaml and configure the catalog parameters for Hive. On CDH, the hive-conf directory is /etc/hive/conf.cloudera.hive.

Flink currently supports only Hive 2.3.4 and 1.2.1, so hive-version must be set to one of those two values.

[root@node01 lib]# vi /opt/flink-1.9.1/conf/sql-client-defaults.yaml
...
#==============================================================================
# Catalogs
#==============================================================================

# Define catalogs here.

#catalogs: [] # empty list
# A typical catalog definition looks like:
#  - name: myhive
#    type: hive
#    hive-conf-dir: /opt/hive_conf/
#    default-database: ...

catalogs:
   - name: myhive
     type: hive
     property-version: 1
     hive-conf-dir: /etc/hive/conf.cloudera.hive
     hive-version: 1.2.1

Start the sql-client with bin/sql-client.sh embedded. The first run fails:

[root@node01 flink-1.9.1]# bin/sql-client.sh embedded
No default environment specified.
Searching for '/opt/flink-1.9.1/conf/sql-client-defaults.yaml'...found.
Reading default environment from: file:/opt/flink-1.9.1/conf/sql-client-defaults.yaml
No session environment specified.
Validating current environment...

Exception in thread "main" org.apache.flink.table.client.SqlClientException: The configured environment is invalid. Please check your environment files again.
        at org.apache.flink.table.client.SqlClient.validateEnvironment(SqlClient.java:147)
        at org.apache.flink.table.client.SqlClient.start(SqlClient.java:99)
        at org.apache.flink.table.client.SqlClient.main(SqlClient.java:194)
Caused by: org.apache.flink.table.client.gateway.SqlExecutionException: Could not create execution context.
        at org.apache.flink.table.client.gateway.local.LocalExecutor.getOrCreateExecutionContext(LocalExecutor.java:562)
        at org.apache.flink.table.client.gateway.local.LocalExecutor.validateSession(LocalExecutor.java:382)
        at org.apache.flink.table.client.SqlClient.validateEnvironment(SqlClient.java:144)
        ... 2 more
Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.CatalogFactory' in
the classpath.

Reason: No context matches.

The following properties are requested:
hive-conf-dir=/etc/hive/conf.cloudera.hive
hive-version=1.2.1
property-version=1
type=hive

The following factories have been considered:
org.apache.flink.table.catalog.GenericInMemoryCatalogFactory
org.apache.flink.table.sources.CsvBatchTableSourceFactory
org.apache.flink.table.sources.CsvAppendTableSourceFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
org.apache.flink.table.planner.StreamPlannerFactory
org.apache.flink.table.executor.StreamExecutorFactory
org.apache.flink.table.planner.delegation.BlinkPlannerFactory
org.apache.flink.table.planner.delegation.BlinkExecutorFactory
        at org.apache.flink.table.factories.TableFactoryService.filterByContext(TableFactoryService.java:283)
        at org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:191)
        at org.apache.flink.table.factories.TableFactoryService.findSingleInternal(TableFactoryService.java:144)
        at org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.java:114)
        at org.apache.flink.table.client.gateway.local.ExecutionContext.createCatalog(ExecutionContext.java:258)
        at org.apache.flink.table.client.gateway.local.ExecutionContext.lambda$new$0(ExecutionContext.java:136)
        at java.util.HashMap.forEach(HashMap.java:1289)
        at org.apache.flink.table.client.gateway.local.ExecutionContext.<init>(ExecutionContext.java:135)
        at org.apache.flink.table.client.gateway.local.LocalExecutor.getOrCreateExecutionContext(LocalExecutor.java:558)
        ... 4 more

Add the jars produced under the Flink build directory:

{flink-compile-home}/flink-connectors/flink-connector-hive/target/flink-connector-hive_2.11-1.9.1.jar
{flink-compile-home}/flink-connectors/flink-hadoop-compatibility/target/flink-hadoop-compatibility_2.11-1.9.1.jar
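These two jars just need to end up in {flink-home}/lib, where the sql-client picks up its dependencies. A minimal sketch, assuming the Flink source tree was built under /opt/flink-src (a hypothetical path) and Flink is installed at /opt/flink-1.9.1:

[root@node01 ~]# cp /opt/flink-src/flink-connectors/flink-connector-hive/target/flink-connector-hive_2.11-1.9.1.jar /opt/flink-1.9.1/lib/
[root@node01 ~]# cp /opt/flink-src/flink-connectors/flink-hadoop-compatibility/target/flink-hadoop-compatibility_2.11-1.9.1.jar /opt/flink-1.9.1/lib/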

Run it again and a new error appears:

[root@node01 flink-1.9.1]# bin/sql-client.sh embedded
No default environment specified.
Searching for '/opt/flink-1.9.1/conf/sql-client-defaults.yaml'...found.
Reading default environment from: file:/opt/flink-1.9.1/conf/sql-client-defaults.yaml
No session environment specified.
Validating current environment...

Exception in thread "main" org.apache.flink.table.client.SqlClientException: The configured environment is invalid. Please check your environment files again.
        at org.apache.flink.table.client.SqlClient.validateEnvironment(SqlClient.java:147)
        at org.apache.flink.table.client.SqlClient.start(SqlClient.java:99)
        at org.apache.flink.table.client.SqlClient.main(SqlClient.java:194)
Caused by: org.apache.flink.table.client.gateway.SqlExecutionException: Could not create execution context.
        at org.apache.flink.table.client.gateway.local.LocalExecutor.getOrCreateExecutionContext(LocalExecutor.java:562)
        at org.apache.flink.table.client.gateway.local.LocalExecutor.validateSession(LocalExecutor.java:382)
        at org.apache.flink.table.client.SqlClient.validateEnvironment(SqlClient.java:144)
        ... 2 more
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/common/util/HiveVersionInfo
        at org.apache.flink.table.catalog.hive.client.HiveShimLoader.getHiveVersion(HiveShimLoader.java:58)
        at org.apache.flink.table.catalog.hive.factories.HiveCatalogFactory.createCatalog(HiveCatalogFactory.java:82)
        at org.apache.flink.table.client.gateway.local.ExecutionContext.createCatalog(ExecutionContext.java:259)
        at org.apache.flink.table.client.gateway.local.ExecutionContext.lambda$new$0(ExecutionContext.java:136)
        at java.util.HashMap.forEach(HashMap.java:1289)
        at org.apache.flink.table.client.gateway.local.ExecutionContext.<init>(ExecutionContext.java:135)
        at org.apache.flink.table.client.gateway.local.LocalExecutor.getOrCreateExecutionContext(LocalExecutor.java:558)
        ... 4 more
Caused by: java.lang.ClassNotFoundException: org.apache.hive.common.util.HiveVersionInfo
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 11 more

The error shows that the Hive jars are missing from the classpath. The sql-client loads its jars directly from {flink-home}/lib, and since Flink only supports Hive 2.3.4 and 1.2.1 (the cluster here runs Hive 1.2.1), download the Hive 1.2.1 release from http://archive.apache.org/dist/hive/hive-1.2.1/ (for Hive 2.3.4: http://archive.apache.org/dist/hive/hive-2.3.4/apache-hive-2.3.4-bin.tar.gz).

Copy the relevant jars from {hive-home}/lib:

{hive-home}/lib/hive-exec-1.2.1.jar
{hive-home}/lib/hive-common-1.2.1.jar
{hive-home}/lib/hive-metastore-1.2.1.jar
{hive-home}/lib/hive-shims-common-1.2.1.jar
{hive-home}/lib/antlr-runtime-3.4.jar
{hive-home}/lib/datanucleus-api-jdo-3.2.6.jar
{hive-home}/lib/datanucleus-core-3.2.10.jar
{hive-home}/lib/datanucleus-rdbms-3.2.9.jar
{hive-home}/lib/javax.jdo-3.2.0-m3.jar
{hive-home}/lib/libfb303-0.9.2.jar
{hive-home}/lib/commons-cli-1.2.jar
{hive-home}/lib/mysql-connector-java-5.1.34.jar
{hive-home}/lib/libthrift-0.9.2.jar
{hive-home}/lib/hive-serde-1.2.1.jar
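All of these also go into {flink-home}/lib. A sketch of the copy, assuming the downloaded Hive 1.2.1 release is unpacked at /opt/apache-hive-1.2.1-bin (a hypothetical path; the MySQL driver must already be present there, or be copied from wherever your metastore driver lives):

[root@node01 ~]# cd /opt/apache-hive-1.2.1-bin/lib
[root@node01 lib]# cp hive-exec-1.2.1.jar hive-common-1.2.1.jar hive-metastore-1.2.1.jar \
    hive-shims-common-1.2.1.jar hive-serde-1.2.1.jar antlr-runtime-3.4.jar \
    datanucleus-api-jdo-3.2.6.jar datanucleus-core-3.2.10.jar datanucleus-rdbms-3.2.9.jar \
    javax.jdo-3.2.0-m3.jar libfb303-0.9.2.jar libthrift-0.9.2.jar commons-cli-1.2.jar \
    mysql-connector-java-5.1.34.jar /opt/flink-1.9.1/lib/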

Run it once more; yet another error:

[root@node01 flink-1.9.1]# bin/sql-client.sh embedded
Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.commons.cli.Option.builder(Ljava/lang/String;)Lorg/apache/commons/cli/Option$Builder;
        at org.apache.flink.table.client.cli.CliOptionsParser.&lt;clinit&gt;(CliOptionsParser.java:43)
        at org.apache.flink.table.client.SqlClient.main(SqlClient.java:188)

The commons-cli-1.2.jar copied in the previous step is too old: Option.builder(String) only exists from commons-cli 1.3 onwards. Replace it with commons-cli-1.3.1.jar — a sketch, assuming you pull 1.3.1 from Maven Central:
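[root@node01 flink-1.9.1]# rm -f lib/commons-cli-1.2.jar
[root@node01 flink-1.9.1]# wget -P lib/ https://repo1.maven.org/maven2/commons-cli/commons-cli/1.3.1/commons-cli-1.3.1.jar

After swapping the jar, sql-client starts successfully: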

[root@node01 flink-1.9.1]# bin/sql-client.sh embedded
Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
No default environment specified.
Searching for '/opt/module/flink-1.9.1/conf/sql-client-defaults.yaml'...found.
Reading default environment from: file:/opt/module/flink-1.9.1/conf/sql-client-defaults.yaml
No session environment specified.
Validating current environment...done.

                                   ▒▓██▓██▒
                               ▓████▒▒█▓▒▓███▓▒
                            ▓███▓░░        ▒▒▒▓██▒  ▒
                          ░██▒   ▒▒▓▓█▓▓▒░      ▒████
                          ██▒         ░▒▓███▒    ▒█▒█▒
                            ░▓█            ███   ▓░▒██
                              ▓█       ▒▒▒▒▒▓██▓░▒░▓▓█
                            █░ █   ▒▒░       ███▓▓█ ▒█▒▒▒
                            ████░   ▒▓█▓      ██▒▒▒ ▓███▒
                         ░▒█▓▓██       ▓█▒    ▓█▒▓██▓ ░█░
                   ▓░▒▓████▒ ██         ▒█    █▓░▒█▒░▒█▒
                  ███▓░██▓  ▓█           █   █▓ ▒▓█▓▓█▒
                ░██▓  ░█░            █  █▒ ▒█████▓▒ ██▓░▒
               ███░ ░ █░          ▓ ░█ █████▒░░    ░█░▓  ▓░
              ██▓█ ▒▒▓▒          ▓███████▓░       ▒█▒ ▒▓ ▓██▓
           ▒██▓ ▓█ █▓█       ░▒█████▓▓▒░         ██▒▒  █ ▒  ▓█▒
           ▓█▓  ▓█ ██▓ ░▓▓▓▓▓▓▓▒              ▒██▓           ░█▒
           ▓█    █ ▓███▓▒░              ░▓▓▓███▓          ░▒░ ▓█
           ██▓    ██▒    ░▒▓▓███▓▓▓▓▓██████▓▒            ▓███  █
          ▓███▒ ███   ░▓▓▒░░   ░▓████▓░                  ░▒▓▒  █▓
          █▓▒▒▓▓██  ░▒▒░░░▒▒▒▒▓██▓░                            █▓
          ██ ▓░▒█   ▓▓▓▓▒░░  ▒█▓       ▒▓▓██▓    ▓▒          ▒▒▓
          ▓█▓ ▓▒█  █▓░  ░▒▓▓██▒            ░▓█▒   ▒▒▒░▒▒▓█████▒
           ██░ ▓█▒█▒  ▒▓▓▒  ▓█                █░      ░░░░   ░█▒
           ▓█   ▒█▓   ░     █░                ▒█              █▓
            █▓   ██         █░                 ▓▓        ▒█▓▓▓▒█░
             █▓ ░▓██░       ▓▒                  ▓█▓▒░░░▒▓█░    ▒█
              ██   ▓█▓░      ▒                    ░▒█▒██▒      ▓▓
               ▓█▒   ▒█▓▒░                         ▒▒ █▒█▓▒▒░░▒██
                ░██▒    ▒▓▓▒                     ▓██▓▒█▒ ░▓▓▓▓▒█▓
                  ░▓██▒                          ▓░  ▒█▓█  ░░▒▒▒
                      ▒▓▓▓▓▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░▓▓  ▓░▒█░
          
    ______ _ _       _       _____  ____  _         _____ _ _            _  BETA   
   |  ____| (_)     | |     / ____|/ __ \| |       / ____| (_)          | |  
   | |__  | |_ _ __ | | __ | (___ | |  | | |      | |    | |_  ___ _ __ | |_ 
   |  __| | | | '_ \| |/ /  \___ \| |  | | |      | |    | | |/ _ \ '_ \| __|
   | |    | | | | | |   <   ____) | |__| | |____  | |____| | |  __/ | | | |_ 
   |_|    |_|_|_| |_|_|\_\ |_____/ \___\_\______|  \_____|_|_|\___|_| |_|\__|
          
        Welcome! Enter 'HELP;' to list all available commands. 'QUIT;' to exit.

This post was a very helpful reference during the setup: https://blog.csdn.net/h335146502/article/details/100689010

The author's notes on these pitfalls saved a lot of time; in general, each error is resolved by finding the Hive jar that contains the class named in the stack trace and adding it to {flink-home}/lib.
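One quick way to locate the jar that holds a missing class is to grep the jar listings — a sketch, again assuming the Hive release is unpacked at /opt/apache-hive-1.2.1-bin:

[root@node01 ~]# for j in /opt/apache-hive-1.2.1-bin/lib/*.jar; do unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hive/common/util/HiveVersionInfo.class' && echo "$j"; done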

2. Basic sql-client test

Following the official docs, create a mytable table in Hive:

CREATE TABLE mytable(name string, age int);
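The queries below assume the table already holds a couple of rows (zane and mort). For reference, they could be seeded from the Hive CLI — a hypothetical example (Hive 1.2 supports INSERT ... VALUES):

hive -e "USE test_myq; INSERT INTO TABLE mytable VALUES ('zane', 17), ('mort', 18);"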

The test session looks like this:

Flink SQL> show catalogs;
default_catalog
myhive

Flink SQL> use catalog myhive;

Flink SQL> show databases;
default
test_myq

Flink SQL> use test_myq;

Flink SQL> show tables;
mytable
test

Flink SQL> describe mytable;
root
 |-- name: STRING
 |-- age: INT

Flink SQL> select * from mytable;
                                                                     SQL Query Result (Table)                                                                      
 Table program finished.                                                  Page: Last of 1                                                    Updated: 15:27:34.496 

                      name                       age
                      zane                        17
                      mort                        18


Q Quit                          + Inc Refresh                   G Goto Page                     N Next Page                     O Open Row                      
R Refresh                       - Dec Refresh                   L Last Page                     P Prev Page                     

Flink SQL> insert into mytable values('hadoop', 2);
[INFO] Submitting SQL update statement to the cluster...
[INFO] Table update statement has been successfully submitted to the cluster:
Cluster ID: StandaloneClusterId
Job ID: eea25d95d6c724d483019f9c5a9bf646
Web interface: http://node01:8081

Note: the Flink cluster must be running before you submit anything from sql-client, otherwise you get the following error:

Caused by: java.util.concurrent.CompletionException: org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: node01/192.168.1.100:8081
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943)
	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
	... 16 more
Caused by: org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: node01/192.168.1.100:8081
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327)
	at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)
	... 6 more
Caused by: java.net.ConnectException: Connection refused
	... 10 more

Start the cluster with:

[root@node01 ~]# cd /opt/module/flink-1.9.1
[root@node01 flink-1.9.1]# bin/start-cluster.sh

The details of the job just executed can be seen in the Flink Web UI:
(Figures 1 and 2: screenshots of the submitted job and its details in the Flink Web UI)

3. Submitting a Flink job from Java (demo)

The code:

package com.wonders.flink.hive;

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.typeutils.TupleTypeInfo;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.BatchTableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

/**
 * A small demo of Flink reading from and writing to Hive through a HiveCatalog.
 */
public class IntegrationHiveHandler {
    public static void main(String[] args) {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        BatchTableEnvironment tableEnv = BatchTableEnvironment.create(env);

        String catalogName     = "myhive";
        String defaultDatabase = "default";
        String hiveConfDir     = "/etc/hive/conf.cloudera.hive/";
        String version         = "1.2.1";
        
        HiveCatalog hive = new HiveCatalog(catalogName, defaultDatabase, hiveConfDir, version);
        tableEnv.registerCatalog(catalogName, hive);

        try {
            tableEnv.useCatalog(catalogName);
            tableEnv.useDatabase("test_myq");

            tableEnv.sqlUpdate("insert into mytable values ('Mao', 6)");
            tableEnv.execute("insert into mytable");

            Table mytable = tableEnv.sqlQuery("select * from mytable");
            // convert the Table into a DataSet of Tuple2 via a TypeInformation
            TupleTypeInfo<Tuple2<String, Integer>> tupleType = new TupleTypeInfo<>(
                    Types.STRING,
                    Types.INT);
            DataSet<Tuple2<String, Integer>> dsTuple = tableEnv.toDataSet(mytable, tupleType);

            dsTuple.print();
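            // Note: print() already triggers execution of the batch program, so the
            // env.execute() call below finds no new sinks and throws "No new data sinks
            // have been defined ..." (visible in the run output further down); the
            // exception does not affect the results printed above.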
            env.execute("TestFlinkHive");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The pom.xml is as follows:


<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.wonders.flink.hive</groupId>
    <artifactId>flink-hive</artifactId>
    <version>1.0-SNAPSHOT</version>

    <!-- Cloudera repository for CDH artifacts -->
    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
    </repositories>

    <properties>
        <flink.version>1.9.1</flink.version>
        <cdh-hive.version>1.2.1</cdh-hive.version>
        <hadoop.version>2.6.0</hadoop.version>
    </properties>

    <dependencies>
        <!-- Flink core (Scala 2.11 builds) -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-scala_2.11</artifactId>
            <version>${flink.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>scala-library</artifactId>
                    <groupId>org.scala-lang</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>scala-parser-combinators_2.11</artifactId>
                    <groupId>org.scala-lang.modules</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>slf4j-api</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-scala_2.11</artifactId>
            <version>${flink.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>commons-compress</artifactId>
                    <groupId>org.apache.commons</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>slf4j-api</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- logging -->
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.7.25</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.25</version>
        </dependency>

        <!-- Flink Table API and planner -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-planner_2.11</artifactId>
            <version>${flink.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>slf4j-api</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-scala-bridge_2.11</artifactId>
            <version>${flink.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>slf4j-api</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-scala_2.11</artifactId>
            <version>${flink.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>slf4j-api</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-common</artifactId>
            <version>${flink.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>slf4j-api</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Hive connector and Hadoop compatibility (provided: the jars are already in {flink-home}/lib) -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-hive_2.11</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-hadoop-compatibility_2.11</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <!-- Hadoop (shaded uber jar, provided by the Flink installation) -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-shaded-hadoop-2-uber</artifactId>
            <version>2.7.5-8.0</version>
            <scope>provided</scope>
            <exclusions>
                <exclusion>
                    <artifactId>slf4j-log4j12</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>${cdh-hive.version}</version>
        </dependency>

        <!-- Flink JDBC connector -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-jdbc_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <!-- Java compiler -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.5.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
            <!-- Scala compiler -->
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <!-- fat jar with all dependencies -->
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass></mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

</project>
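To build the fat jar, an ordinary Maven build is enough (a sketch; the assembly plugin bound to the package phase produces the jar-with-dependencies artifact):

mvn clean package -DskipTests
# -> target/flink-hive-1.0-SNAPSHOT-jar-with-dependencies.jar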

Copy the resulting jar to the cluster and submit it with the following command:

[root@node01 flink-1.9.1]# bin/flink run --class com.wonders.flink.hive.IntegrationHiveHandler /opt/jars/flink-hive-1.0-SNAPSHOT-jar-with-dependencies.jar 

The output:

Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
Starting execution of program
(Mao,6)
(mort,18)
(zane,17)
(hadoop,2)
java.lang.RuntimeException: No new data sinks have been defined since the last execution. The last execution refers to the latest call to 'execute()', 'count()', 'collect()', or 'print()'.
        at org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:944)
        at org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:926)
        at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
        at com.wonders.flink.hive.IntegrationHiveHandler.main(IntegrationHiveHandler.java:44)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:576)
        at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:438)
        at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:274)
        at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:746)
        at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:273)
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
        at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
        at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
Program execution finished
Job with JobID 852fa70b9531ac57592654b126cf6175 has finished.
Job Runtime: 3274 ms
Accumulator Results: 
- 7b4b9fe834333b4e95ca28513c06ed12 (java.util.ArrayList) [6 elements]

You can see that the record ('Mao', 6) was inserted successfully and then queried and printed together with the existing rows. (The RuntimeException at the end is thrown because env.execute() has no new sink to run — print() already executed the job — and it does not affect the program's logic.)
Web UI details:
(Figure 3: screenshot of the job in the Flink Web UI)

References

Flink Hive Integration

Flink Reading & Writing Hive Tables
