Installing Hive with MySQL as the Metastore

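Hive stores its metadata in Derby by default.
Derby has two drawbacks:
1. Derby does not allow more than one client to be connected at the same time.
2. With Derby, Hive must always be started from the same directory; otherwise the tables created earlier may not be found.
For example, if the hive program is started from /hive, the metadata for the tables you create is saved under /hive; if it is started from /home, it is saved under /home.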
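Embedded Derby writes its metadata into a directory under whatever working directory hive was started from, so stray copies are easy to leave behind. The following is a minimal sketch for locating them, assuming the default directory name metastore_db; find_derby_dirs is a hypothetical helper name, not part of Hive.

```shell
#!/bin/sh
# List every embedded-Derby metastore directory below a root directory.
# Assumption: Derby uses its default directory name, "metastore_db".
find_derby_dirs() {
    root=${1:-.}
    find "$root" -type d -name metastore_db 2>/dev/null
}

# Example: find_derby_dirs /home
```

More than one result usually means hive was launched from several different directories, each with its own private set of tables.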
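The steps below are carried out as the hadoop user.
Download Hive and extract it into the hive directory.
3. Replace the jars so that they match HBase 0.96 and Hadoop 2.2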

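The Hive release we downloaded is built against Hadoop 1.3 and HBase 0.94, so its jars have to be replaced. Our HBase 0.96 is built on Hadoop 2.2, so the Hadoop version problem in Hive must be solved first. The Hive packages currently available from the official site are all compiled against Hadoop 1.x, so we need to download the source and rebuild Hive against Hadoop 2.x. The process is simple and takes only the following steps:

(1) Change to /usr/hive/lib

Only part of the directory listing was captured here.

(2) Synchronize the HBase version
First cd to hive-0.12.0/lib and delete the two jars there whose names start with hbase-0.94. Then copy in every jar whose name starts with hbase from /home/hadoop/hbase-0.96.0-hadoop2/lib: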

find /usr/hbase/hbas/lib -name "hbase*.jar" | xargs -I {} cp {} ./

(3) The basic synchronization is now complete.
Check in particular that the zookeeper and protobuf jars match the ones HBase uses. If they do not,
copy protobuf.**.jar and zookeeper-3.4.5.jar into hive/lib.
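The consistency check in step (3) can be sketched as a comparison of jar file names between the two lib directories. This is only an illustration: list_jars, same_jars and check_libs are hypothetical helper names, and the two directory arguments are whatever your Hive and HBase lib paths happen to be.

```shell
#!/bin/sh
# Print the file names of all jars in a directory whose names start with a prefix.
list_jars() {
    dir=$1
    prefix=$2
    find "$dir" -maxdepth 1 -type f -name "${prefix}*.jar" -exec basename {} \; | sort
}

# Succeed only if both directories hold the same set of jars for the prefix.
same_jars() {
    [ "$(list_jars "$1" "$3")" = "$(list_jars "$2" "$3")" ]
}

# Report every prefix whose jars differ between the two lib directories.
check_libs() {
    hive_lib=$1
    hbase_lib=$2
    status=0
    for prefix in zookeeper protobuf; do
        if ! same_jars "$hive_lib" "$hbase_lib" "$prefix"; then
            echo "MISMATCH: $prefix"
            status=1
        fi
    done
    return $status
}

# Example: check_libs /usr/hive/lib /home/hadoop/hbase-0.96.0-hadoop2/lib
```

Comparing file names only catches version differences that show up in the jar name, which is the case for the zookeeper and protobuf jars mentioned above.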

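(4) Use MySQL as the metastore database.
Find a MySQL JDBC driver jar, mysql-connector-java-5.1.10-bin.jar, and copy it into hive-0.12.0/lib as well.

Check whether the jar is already present in the lib directory.

If it is not, download it:

Link: http://pan.baidu.com/s/1gdCDoGj Password: 80yl

Note: change the permissions of mysql-connector-java-5.1.10-bin.jar to 777 (chmod 777 mysql-connector-java-5.1.10-bin.jar)

Also check that the jar that lets Hive communicate with HBase is present.

You can do so with the following command: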

aboutyun@master:/usr/hive/lib$ find -name hive-hbase-handler*
./hive-hbase-handler-0.13.0-SNAPSHOT.jar
If it is not there, download it:

Link: http://pan.baidu.com/s/1gd9p0Fh Password: 94g1

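  1. Install MySQL
    • On Ubuntu, install it with apt-get
    • sudo apt-get install mysql-server
    • Create the database hivemeta
    • create database hivemeta;
    • Create the hive user and grant it privileges
    • grant all on hivemeta.* to 'hive'@'%' identified by 'hive';
    • flush privileges;
    If you are not familiar with installing MySQL, see: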

Uninstalling and installing MySQL on Ubuntu

http://www.aboutyun.com/thread-7788-1-1.html

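A few notes on the commands above:
• sudo apt-get install mysql-server installs the database server. If you want to connect remotely from another client machine, you also need to install mysql-client.

• create database hivemeta;
This creates the database that stores the Hive metadata. The name must match the database name in the JDBC URL in hive-site exactly, because database names are case-sensitive on Linux.

• grant all on hivemeta.* to 'hive'@'%' identified by 'hive'; This grants the privileges and is important; without it, remote connections from the Hive client fail.
Do not copy these values blindly; adapt them to your own setup. In the command above both the user name and the password are hive.

If the connection still fails, try the root user:
grant all on hivemeta.* to 'root'@'%' identified by '123456';
flush privileges;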
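To avoid retyping the statements with mismatched names, they can be generated from one set of values. The sketch below only prints SQL; metastore_sql is a hypothetical helper name, and the database, user and password are placeholders supplied by the caller. Note that GRANT ... IDENTIFIED BY is accepted by MySQL 5.x servers such as the one used in this post but was removed in MySQL 8.0.

```shell
#!/bin/sh
# Print the MySQL statements that create the metastore database and account.
# Arguments: database name, user name, password (all chosen by the caller).
metastore_sql() {
    db=$1
    user=$2
    password=$3
    cat <<EOF
CREATE DATABASE IF NOT EXISTS \`$db\`;
GRANT ALL ON \`$db\`.* TO '$user'@'%' IDENTIFIED BY '$password';
FLUSH PRIVILEGES;
EOF
}

# Example (prompts for the MySQL root password):
#   metastore_sql hivemeta hive hive | mysql -u root -p
```

Generating the statements from a single database name keeps the CREATE DATABASE and GRANT lines pointing at the same database.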


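  1. Edit the hive-site configuration file:

Points to note about the configuration below:
(1) The local configuration uses the MySQL root user with password 123. If you use the hadoop account instead, change the user name and password to hadoop.

(2) Create the directories in HDFS and make them group-writable (-p also creates the parent directory /hive if it does not exist yet):

bin/hadoop fs -mkdir -p /hive/warehouse
bin/hadoop fs -mkdir -p /hive/scratchdir
bin/hadoop fs -chmod g+w /hive/warehouse
bin/hadoop fs -chmod g+w /hive/scratchdir

(3) Make sure hive.aux.jars.path is configured correctly.
The value must not contain line breaks or spaces. Line breaks are the main danger: many articles split the value across several lines, which is a trap newcomers fall into easily.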


<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/hive/lib/hive-hbase-handler-0.13.0-SNAPSHOT.jar,file:///usr/hive/lib/protobuf-java-2.5.0.jar,file:///usr/hive/lib/hbase-client-0.96.0-hadoop2.jar,file:///usr/hive/lib/hbase-common-0.96.0-hadoop2.jar,file:///usr/hive/lib/zookeeper-3.4.5.jar,file:///usr/hive/lib/guava-11.0.2.jar</value>
</property>
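A quick way to catch a stray space or line break before Hive does is to test the value as a plain string. This is a minimal sketch, not part of Hive: check_aux_path is a hypothetical helper, and the rule that every entry looks like file:///....jar is an assumption taken from the examples in this post.

```shell
#!/bin/sh
# Succeed only if a hive.aux.jars.path value has no whitespace and every
# comma-separated entry has the form file:///.../name.jar
check_aux_path() {
    value=$1
    if [ -z "$value" ]; then
        echo "value is empty"
        return 1
    fi
    # Deleting all whitespace must leave the value unchanged.
    stripped=$(printf '%s' "$value" | tr -d ' \t\r\n')
    if [ "$stripped" != "$value" ]; then
        echo "value contains a space, tab or line break"
        return 1
    fi
    rest=$value
    while [ -n "$rest" ]; do
        entry=${rest%%,*}            # text before the first comma
        case $entry in
            file:///*.jar) ;;
            *) echo "unexpected entry: $entry"; return 1 ;;
        esac
        case $rest in
            *,*) rest=${rest#*,} ;;  # drop the entry just checked
            *)   rest= ;;
        esac
    done
    return 0
}

# Example: check_aux_path "file:///usr/hive/lib/zookeeper-3.4.5.jar,file:///usr/hive/lib/guava-11.0.2.jar"
```

Paste the text between the value tags as the single argument; any complaint means the value should be joined back onto one line before it goes into hive-site.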

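With the points above taken care of, put the following content into the hive-site file.

Two ways of configuring it are described here: a remote configuration and a local configuration. The remote configuration is the better choice.

Remote configuration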

<configuration>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://master:8020/hive/warehouse</value>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>hdfs://master:8020/hive/scratchdir</value>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/usr/hive/logs</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://172.16.77.15:3306/hivemeta?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/hive/lib/hive-hbase-handler-0.13.0-SNAPSHOT.jar,file:///usr/hive/lib/protobuf-java-2.5.0.jar,file:///usr/hive/lib/hbase-client-0.96.0-hadoop2.jar,file:///usr/hive/lib/hbase-common-0.96.0-hadoop2.jar,file:///usr/hive/lib/zookeeper-3.4.5.jar,file:///usr/hive/lib/guava-11.0.2.jar</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://172.16.77.15:9083</value>
</property>
</configuration>

Local configuration:

<configuration>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive_remote/warehouse</value>
</property>

<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost/hive_remote?createDatabaseIfNotExist=true</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123</value>
</property>
</configuration>

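  1. Other configuration changes:

1. Edit Hadoop's hadoop-env.sh (otherwise Hive reports a class-not-found error on startup)
export HIVE_CONF_DIR=/usr/hive/conf

2. Edit hive-config.sh in $HIVE_HOME/bin and add the following three lines
export JAVA_HOME=XXXXX
export HIVE_HOME=XXXXX
export HADOOP_HOME=XXXXX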
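A misspelled variable name or a wrong path in these exports tends to surface only later as a confusing startup error. The sketch below checks that each named variable is set and points at an existing directory; check_home_vars is a hypothetical helper, not something shipped with Hive or Hadoop.

```shell
#!/bin/sh
# Succeed only if every named variable is set and points to an existing directory.
check_home_vars() {
    status=0
    for name in "$@"; do
        eval "dir=\${$name:-}"   # look up the variable whose name is in $name
        if [ -z "$dir" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$dir" ]; then
            echo "$name points to a missing directory: $dir"
            status=1
        fi
    done
    return $status
}

# Example, after sourcing the edited hive-config.sh:
#   check_home_vars JAVA_HOME HIVE_HOME HADOOP_HOME
```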

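Next, start the metastore service first:
hive --service metastore -hiveconf hive.root.logger=DEBUG,console &
The trailing & runs it in the background so that you get the command line back.
Note that the terminal appears to hang at this point. Do not worry: the service has started successfully.

Then start the client:
hive

Hive is now installed.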

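Now for the problems encountered along the way.
1. Problems encountered

Problem 1: the metastore service is not running
To summarize the problem first: the metastore service has to be started before anything else, with one of the following commands:
(1) hive --service metastore
(2) hive --service metastore -hiveconf hive.root.logger=DEBUG,console

Note:
-hiveconf hive.root.logger=DEBUG,console turns on debug-level logging to the console, which makes errors easier to track down.

If you skip the metastore service and simply run

hive

you get the following error: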

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:295)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:679)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1345)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2420)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2432)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:289)
    ... 7 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1343)
    ... 12 more
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:288)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:169)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1343)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2420)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2432)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:289)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:679)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
    ... 19 more
)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:334)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:169)
    ... 17 more

Problem 2: what the metastore looks like when it is running

hive --service metastore
Starting Hive Metastore Server
14/05/27 20:14:51 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/05/27 20:14:51 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/05/27 20:14:51 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/05/27 20:14:51 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/05/27 20:14:51 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/05/27 20:14:51 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/05/27 20:14:51 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative

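When I first ran into this output, I assumed something was misconfigured, and I spent a long time on it without finding a fix. When I then ran

hive --service metastore

a second time, it reported that port 9083 was already in use; the error is shown below. Port 9083 being occupied means the metastore started by the first command was still running and listening, so the apparent hang was in fact a successful start. When I then entered the hive command, the client came up as shown in Figure 1.

Figure 1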
Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083.
    at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:93)
    at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:75)
    at org.apache.hadoop.hive.metastore.TServerSocketKeepAlive.<init>(TServerSocketKeepAlive.java:34)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:4291)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:4248)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Exception in thread "main" org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083.
    at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:93)
    at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:75)
    at org.apache.hadoop.hive.metastore.TServerSocketKeepAlive.<init>(TServerSocketKeepAlive.java:34)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:4291)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:4248)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
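If the port is occupied, the process holding it can be found and killed as follows:

netstat -ap | grep 9083

The command above finds the ID of the process that occupies the port. Then kill that process with:

kill -9 <PID>

For more detail see:
Common Linux commands used when configuring Hadoop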
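Picking the process ID out of the netstat output by eye is error-prone, so the lookup can be scripted. This sketch assumes the column layout printed by the Linux net-tools netstat with the -tlnp flags (local address in column 4, state in column 6, PID/program in column 7); pid_on_port is a hypothetical helper name.

```shell
#!/bin/sh
# Read `netstat -tlnp` output on stdin and print the PID listening on a TCP port.
pid_on_port() {
    port=$1
    awk -v port="$port" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, owner, "/")   # column 7 looks like "12345/java"
            print owner[1]
        }'
}

# Example (run as root to see PIDs owned by other users):
#   netstat -tlnp 2>/dev/null | pid_on_port 9083
```

Matching on the exact ":port" suffix avoids false hits such as 19083, which a plain grep for 9083 would also return. When netstat cannot see the owner it prints "-" in that column, and so does this helper.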

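Problem 3: the hive.aux.jars.path value contains a line break or a space, which produces the errors below

Symptom 1: /usr/hive/lib/hbase-client-0.96.0-
hadoop2.jar
The path is broken in the middle, so the system cannot resolve it. The break is simply a line break in the value.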
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

java.io.FileNotFoundException: File does not exist: hdfs://hydra0001/opt/module/hive-0.10.0-cdh4.3.0/lib/hive-builtins-0.10.0-cdh4.3.0.jar
2014-05-24 19:32:06,563 ERROR exec.Task (SessionState.java:printError(440)) - Job Submission failed with exception 'java.io.FileNotFoundException(File file:/usr/hive/lib/hbase-client-0.96.0-
hadoop2.jar does not exist)'
java.io.FileNotFoundException: File file:/usr/hive/lib/hbase-client-0.96.0-
hadoop2.jar does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyRemoteFiles(JobSubmitter.java:139)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:212)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:424)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:152)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1481)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1258)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1092)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:932)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:922)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

2014-05-24 19:32:06,571 ERROR ql.Driver (SessionState.java:printError(440)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

Symptom 2:

hive.aux.jars.path
file:///usr/hive/lib/hive-hbase-handler-0.13.0-SNAPSHOT.jar,
file:///usr/hive/lib/protobuf-java-2.5.0.jar,
file:///usr/hive/lib/hbase-client-0.96.0-hadoop2.jar,
file:///usr/hive/lib/hbase-common-0.96.0-hadoop2.jar,
file:///usr/hive/lib/zookeeper-3.4.5.jar,
file:///usr/hive/lib/guava-11.0.2.jar

Laid out like this the value looks tidy, but pasting it into the configuration file as-is produces the following error.

Caused by: java.net.URISyntaxException: Illegal character in scheme name at index 0:
file:///usr/hive/lib/protobuf-java-2.5.0.jar
    at java.net.URI$Parser.fail(URI.java:2829)
    at java.net.URI$Parser.checkChars(URI.java:3002)
    at java.net.URI$Parser.checkChar(URI.java:3012)
    at java.net.URI$Parser.parse(URI.java:3028)
    at java.net.URI.<init>(URI.java:753)
    at org.apache.hadoop.fs.Path.initialize(Path.java:203)
    ... 37 more
Job Submission failed with exception 'java.lang.IllegalArgumentException(java.net.URISyntaxException: Illegal character in scheme name at index 0:
file:///usr/hive/lib/protobuf-java-2.5.0.jar)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

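Verifying the Hive and HBase integration:
I. Start HBase and Hive
Start HBase

hbase shell

Start Hive
(1) Start the metastore service (hive --service metastore, as described earlier), then start the hive client and run:

CREATE TABLE hbase_table_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz");

The statement above creates the table hbase_table_1 in Hive and, through the org.apache.hadoop.hive.hbase.HBaseStorageHandler class, creates the corresponding table xyz in HBase.
(1) Before running the statement
Look at HBase and Hive first:
HBase is empty:

Hive is empty:

(2) Run the statement

CREATE TABLE hbase_table_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz");

(3) Compare the changes
HBase shows the new table xyz

Hive shows the new table hbase_table_1

III. Verify the integration by inserting data into the HBase table

(1) Add data through HBase
Insert one record in HBase:
put 'xyz','10001','cf1:val','www.aboutyun.com'

Then look at how the HBase and Hive tables changed:
(1) Changes in HBase

(2) Changes in Hive

(2) Add data through Hive
The method widely circulated online, inserting from the pokes table, did not succeed here. According to what I found online, this may be a bug in Hive 0.12. The details: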

INSERT OVERWRITE TABLE hbase_table_1 SELECT * FROM pokes;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.IllegalArgumentException: Property value must not be null
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:810)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:792)
    at org.apache.hadoop.hive.ql.exec.Utilities.copyTableJobPropertiesToConf(Utilities.java:1996)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:864)
    at org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(HiveOutputFormatImpl.java:67)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:424)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:152)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1481)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1258)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1092)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:932)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:922)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.lang.IllegalArgumentException(Property value must not be null)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

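After going through a lot of material online, my conclusion is that this is probably a bug, one that has been fixed in Hive 0.13.0.
For details see:
https://issues.apache.org/jira/browse/HIVE-5515

Reposted from http://www.aboutyun.com/thread-7881-1-1.html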
