Configuring the sql-cli of the latest Flink release (1.12.0) to connect to a yarn-session

I. Main references:

https://mp.weixin.qq.com/s/99ehmNzJVwW3cOrw_UkGsg
https://mp.weixin.qq.com/s/YuR-s5zCtBz_5ku_bttbaw
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/hive/#dependencies
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sqlClient.html
https://www.jianshu.com/p/af96d6618854

II. Dependency issues

1. Add three JARs to Flink's lib directory:

  • flink-connector-hive_2.11-1.12.0.jar

  • flink-sql-connector-hive-2.2.0_2.11-1.12.0.jar

  • hive-exec-2.1.1-cdh6.3.1.jar
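
Before starting the client it can help to confirm that all three JARs really are in place. The following is a minimal sketch, not part of the original setup: the function name `check_hive_jars` and the `/opt/flink` fallback path are assumptions made for illustration, and the file names are simply the three listed above.

```shell
# Sketch: report which of the three Hive-related JARs are present in a
# given Flink lib directory. The function name is hypothetical.
check_hive_jars() {
  local lib_dir="$1"
  local missing=0
  local jar
  for jar in \
    flink-connector-hive_2.11-1.12.0.jar \
    flink-sql-connector-hive-2.2.0_2.11-1.12.0.jar \
    hive-exec-2.1.1-cdh6.3.1.jar
  do
    if [ -f "$lib_dir/$jar" ]; then
      echo "found:   $jar"
    else
      echo "missing: $jar"
      missing=1
    fi
  done
  # Exit status 0 when all three are present, 1 otherwise.
  return "$missing"
}

# Example call (the /opt/flink fallback is an assumed install path):
# check_hive_jars "${FLINK_HOME:-/opt/flink}/lib"
```

The check only looks at file names, so it will not notice a JAR built for a different Hive or Scala version under the same name.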

2. Edit conf/sql-client-defaults.yaml under the Flink directory

#catalogs: [] # empty list
# A typical catalog definition looks like:
catalogs:
   - name: myhive
     type: hive
     hive-conf-dir: /etc/hive/conf
     default-database: default

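# Set the current catalog and database
# (in the stock sql-client-defaults.yaml these two keys sit under the execution: section):
current-catalog: myhive
current-database: default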

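3. With the three JARs above in place, Standalone mode works normally:

Start the cluster with ./start-cluster.sh,
then start the client with ./sql-client.sh embedded,
then type a query and it runs directly.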
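
The two commands above can be wrapped in a small helper so that the client is only launched when the cluster start succeeded. This is a sketch under assumptions: the function name `start_standalone_sql_client` and its `flink_home` argument are hypothetical, and it relies only on both scripts living in Flink's bin directory.

```shell
# Sketch: start a Standalone cluster, then open the SQL client in
# embedded mode. Function name and argument are illustrative.
start_standalone_sql_client() {
  local flink_home="$1"
  # Stop here if the cluster scripts report a failure.
  "$flink_home/bin/start-cluster.sh" || return 1
  # Opens the interactive SQL prompt; queries are typed there.
  "$flink_home/bin/sql-client.sh" embedded
}

# Example call (assumed install path):
# start_standalone_sql_client /opt/flink
```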

4. If the Standalone cluster is not started, the client connects to the yarn-session instead, but it then fails with an error about org.apache.hadoop.mapred.JobConf:

Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobConf
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 52 more

The odd part is that I had already set the following environment variable in /etc/profile:

 export HADOOP_CLASSPATH=`hadoop classpath`

But it had no effect; the Hadoop dependencies were still missing, so the Hadoop JARs also have to be added to the lib directory:

Either this one:
       flink-shaded-hadoop-2-uber-2.7.5-8.0.jar

or these:
      hadoop-common-3.0.0-cdh6.3.1.jar
      hadoop-mapreduce-client-common-3.0.0-cdh6.3.1.jar
      hadoop-mapreduce-client-core-3.0.0-cdh6.3.1.jar
      hadoop-mapreduce-client-hs-3.0.0-cdh6.3.1.jar
      hadoop-mapreduce-client-jobclient-3.0.0-cdh6.3.1.jar
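
To narrow down this kind of ClassNotFoundException, two quick checks can be scripted. This is a sketch under assumptions, not a diagnosis of the case above: the function names are hypothetical, and the idea that the shell launching sql-client.sh may not have loaded /etc/profile is only one possible explanation that the original notes do not confirm. The second check looks for either of the two options listed above by file name; org.apache.hadoop.mapred.JobConf itself ships in hadoop-mapreduce-client-core.

```shell
# Sketch: helpers for diagnosing missing Hadoop classes.
# Both function names are illustrative.

# True when HADOOP_CLASSPATH is non-empty in the *current* shell,
# i.e. the shell that will actually launch sql-client.sh.
has_hadoop_classpath() {
  [ -n "${HADOOP_CLASSPATH:-}" ]
}

# True when the given lib directory holds either the shaded uber JAR
# or the mapreduce-client-core JAR (any version).
has_hadoop_jar_in_lib() {
  local lib_dir="$1"
  local f
  for f in "$lib_dir"/flink-shaded-hadoop-2-uber-*.jar \
           "$lib_dir"/hadoop-mapreduce-client-core-*.jar
  do
    # An unmatched glob stays literal and fails the -f test.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Example calls (the /opt/flink fallback is an assumed install path):
# has_hadoop_classpath || echo "HADOOP_CLASSPATH is empty in this shell"
# has_hadoop_jar_in_lib "${FLINK_HOME:-/opt/flink}/lib" || echo "no Hadoop JAR in lib"
```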

III. For the pitfalls and optimizations of running Flink SQL on Hive in production, see my other post (continuously updated …)

https://blog.csdn.net/weixin_44500374/article/details/112610629
