Connecting Pentaho Kettle to CDH Hive (how to fix the "No suitable driver found for jdbc:hive2" error)

Key points:

Clear the Kettle cache:

```shell
rm -rf /home/user/data-integration/system/karaf/caches
rm -rf /home/user/data-integration/system/karaf/data
```

Karaf is the component Kettle uses to implement plugins; for example, the various big-data shims all count as Kettle plugins.

 

Configure the Kettle big data settings:

In <Kettle install dir>/data-integration/plugins/pentaho-big-data-plugin/plugin.properties, find the setting

active.hadoop.configuration=cdh513

Here "cdh513" is the name of a subdirectory under <Kettle install dir>/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations, also called a shim; it acts as a driver for a particular Hadoop distribution.

Four shims ship by default, one for each of four Hadoop distributions; pick the one you need and set it in the plugin.properties file described above.
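Switching the active shim is a one-line edit of plugin.properties, which can also be scripted. The sketch below is an illustration only: the /tmp path and the hdp26 placeholder are assumptions standing in for a real install.

```shell
# Illustration: switch the active shim in plugin.properties.
# /tmp/kettle-demo and the hdp26 value are stand-ins for a real install.
KETTLE_HOME=/tmp/kettle-demo
PLUGIN_PROPS="$KETTLE_HOME/data-integration/plugins/pentaho-big-data-plugin/plugin.properties"
mkdir -p "$(dirname "$PLUGIN_PROPS")"
printf 'active.hadoop.configuration=hdp26\n' > "$PLUGIN_PROPS"  # pretend another shim is active

# Repoint Kettle at the cdh513 shim directory:
sed -i 's/^active\.hadoop\.configuration=.*/active.hadoop.configuration=cdh513/' "$PLUGIN_PROPS"
grep '^active.hadoop.configuration' "$PLUGIN_PROPS"
```

On a real install you would of course point KETTLE_HOME at the actual data-integration directory instead of creating one under /tmp.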

Configure the shim's Hadoop settings files:

Under <Kettle install dir>/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/cdh513/ there are six XML files: core-site.xml, hbase-site.xml, hdfs-site.xml, yarn-site.xml, hive-site.xml, and mapred-site.xml.

Or simply copy the files over from the cluster and overwrite them. (Config file paths for the CDP components: Hadoop: /etc/hadoop/conf, HBase: /etc/hbase/conf, Hive: /etc/hive/conf)
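The copy-and-overwrite step can be sketched as below. On a real gateway host the sources would be /etc/hadoop/conf, /etc/hive/conf, and /etc/hbase/conf as listed above; the /tmp paths here are stand-ins so the steps can be tried anywhere.

```shell
# Stand-in directories (real hosts: /etc/hadoop/conf, /etc/hive/conf, /etc/hbase/conf
# and the shim dir under the actual Kettle install):
HADOOP_CONF=/tmp/cdh-demo/etc/hadoop/conf
HIVE_CONF=/tmp/cdh-demo/etc/hive/conf
HBASE_CONF=/tmp/cdh-demo/etc/hbase/conf
SHIM=/tmp/cdh-demo/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/cdh513
mkdir -p "$HADOOP_CONF" "$HIVE_CONF" "$HBASE_CONF" "$SHIM"
for f in core hdfs yarn mapred; do : > "$HADOOP_CONF/$f-site.xml"; done  # fake cluster configs
: > "$HIVE_CONF/hive-site.xml"
: > "$HBASE_CONF/hbase-site.xml"

# The actual overwrite step -- copy the six *-site.xml files into the shim:
for f in core hdfs yarn mapred; do cp "$HADOOP_CONF/$f-site.xml" "$SHIM/"; done
cp "$HIVE_CONF/hive-site.xml" "$SHIM/"
cp "$HBASE_CONF/hbase-site.xml" "$SHIM/"
ls "$SHIM" | wc -l   # all six site files are now in place
```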

 

Copy the jar files:

If kitchen or pan fails with: No suitable driver found for jdbc:hive2

you can copy the jars into Kettle's lib directory as well as the active shim's lib directory.

CDH's Hive lives in /opt/cloudera/parcels/CDH/lib/hive

Copy every jar starting with "hive" under /opt/cloudera/parcels/CDH/lib/hive/lib into <Kettle install dir>/data-integration/lib and <Kettle install dir>/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/cdh513/lib.

For example (assuming Kettle is installed under /opt and the active shim is called cdh513):

```shell
cp /opt/cloudera/parcels/CDH/lib/hive/lib/hive*.jar /opt/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/cdh513/lib
cp /opt/cloudera/parcels/CDH/lib/hive/lib/hive*.jar /opt/data-integration/lib
```

Then clear the Kettle cache once more, otherwise Kettle may not pick up the jars you just copied:

```shell
rm -rf /home/fr-renjie.wei/data-integration/system/karaf/caches
rm -rf /home/fr-renjie.wei/data-integration/system/karaf/data
```

 

No suitable driver found for jdbc:hive2

There seems to be some bug in how Kettle on Linux loads the Hive jars; this error shows up frequently, and plenty of people have asked about it online.

In my case, Kitchen reported this error when running jobs, while pan did not.

My workaround:

  1. Clear the cache once (rm -rf /home/user/data-integration/system/karaf/caches)
  2. Run kitchen

There is also a once-and-for-all fix: edit Kitchen.sh directly and add the rm line:

```shell
#!/bin/sh
# *****************************************************************************
#
# Pentaho Data Integration
#
# Copyright (C) 2005-2018 by Hitachi Vantara : http://www.pentaho.com
#
# *****************************************************************************
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# *****************************************************************************
INITIALDIR="`pwd`"
BASEDIR="`dirname $0`"
cd "$BASEDIR"
DIR="`pwd`"
cd - > /dev/null
rm -rf "$BASEDIR"/system/karaf/caches   # add this!
if [ "$1" = "-x" ]; then
  set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$BASEDIR/lib
  export LD_LIBRARY_PATH
  export OPT="-Xruntracer $OPT"
  shift
fi
export IS_KITCHEN="true"
"$DIR/spoon.sh" -main org.pentaho.di.kitchen.Kitchen -initialDir "$INITIALDIR/" "$@"
```

 
