Installing and configuring Hive with MySQL (apache-hive-2.0.1-bin.tar.gz)

Introduction: Hive is a data warehouse tool that offers a powerful, SQL-like query language. This article walks through setting up a Hive development and test environment.

1. What is Hive?

   Hive is a data warehouse tool built on top of Hadoop. It maps structured data files to database tables and provides a simple SQL-like query capability, translating the SQL statements into MapReduce jobs for execution. Its main advantage is a low learning curve: simple MapReduce-based statistics can be produced quickly with SQL-like statements, without developing dedicated MapReduce applications, which makes it well suited to statistical analysis in a data warehouse.
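
   For instance, a query like the following is compiled by Hive into MapReduce jobs behind the scenes (the table name and input file here are made up purely for illustration):

    # Map a delimited text file to a table and run a SQL-like aggregation;
    # Hive turns the SELECT into a MapReduce job, no MapReduce code required.
    hive -e "CREATE TABLE IF NOT EXISTS page_views (user_id INT, url STRING)
             ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
             LOAD DATA LOCAL INPATH '/tmp/page_views.csv' INTO TABLE page_views;
             SELECT url, COUNT(*) AS pv FROM page_views GROUP BY url;"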

2.  Prerequisites for installing Hive

    2.1  A Hadoop cluster environment is already installed

    2.2  This article uses CentOS 6 as the development environment

3. Installation steps

    3.1 Download the Hive package: apache-hive-2.0.1-bin.tar.gz

    3.2 Extract it into the /opt/my directory

   tar xzvf apache-hive-2.0.1-bin.tar.gz -C /opt/my

    3.3 The extracted directory name is long, so shorten it

        # mv apache-hive-2.0.1-bin  hive

    3.4 Set the environment variables

   export HIVE_HOME=/opt/my/hive

   export PATH=$PATH:$HIVE_HOME/bin

   export CLASSPATH=$CLASSPATH:$HIVE_HOME/bin
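
   To make these variables permanent across logins, you can append them to /etc/profile (a sketch for a CentOS-style setup as used here; adapt the profile file to your shell):

    # Persist the Hive environment variables and reload the profile
    echo 'export HIVE_HOME=/opt/my/hive' >> /etc/profile
    echo 'export PATH=$PATH:$HIVE_HOME/bin' >> /etc/profile
    echo 'export CLASSPATH=$CLASSPATH:$HIVE_HOME/bin' >> /etc/profile
    source /etc/profile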

   3.5 Edit hive-env.sh. In $HIVE_HOME/conf, rename the template first: mv hive-env.sh.template hive-env.sh, then set the following in it:

    # Set HADOOP_HOME to point to a specific hadoop install directory

    HADOOP_HOME=/opt/my/hadoop

    # Hive Configuration Directory can be controlled by:

     export HIVE_CONF_DIR=/opt/my/hive/conf

   3.6 The default Hadoop heap size is 1 GB. You can raise it if you have special requirements; here the default is fine:

         export HADOOP_HEAPSIZE=1024

   3.7 Edit hive-site.xml, mainly the metastore database connection settings. Rename the template first: mv hive-default.xml.template hive-site.xml, then configure:

   
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.1.100:3306/hive?characterEncoding=UTF-8</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hadoop</value>
    <description>password to use against metastore database</description>
  </property>

    Save the file and exit.

3.8 Install MySQL to serve as the Hive metastore

      For details on installing MySQL, see my other article: http://blog.csdn.net/nengyu/article/details/51615836
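
The hive-site.xml above expects a database named hive on 192.168.1.100 that the configured root/hadoop account can reach. A minimal sketch of preparing it (assumes MySQL 5.x; the wildcard host grant is only an example, tighten it for a real deployment):

    # Run on the MySQL server at 192.168.1.100
    mysql -uroot -p -e "CREATE DATABASE IF NOT EXISTS hive DEFAULT CHARACTER SET utf8;
    GRANT ALL PRIVILEGES ON hive.* TO 'root'@'%' IDENTIFIED BY 'hadoop';
    FLUSH PRIVILEGES;"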

At the same time, you need to put the MySQL JDBC connector jar into the /opt/my/hive/lib directory, as shown below.
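
For example (the connector version below is just an example; use whichever mysql-connector-java jar you downloaded):

    # Copy the MySQL JDBC driver into Hive's lib directory
    cp mysql-connector-java-5.1.38-bin.jar /opt/my/hive/lib/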

 3.9 Several online guides I consulted stop at this point and start Hive right away, but doing so produces the following error:

Exception in thread "main" java.lang.RuntimeException: Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql)

 3.10 To fix it, I ran the schema initialization tool:

   # schematool -dbType mysql -initSchema
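
If the initialization succeeds, schematool creates the metastore tables inside the hive database. You can confirm this from the MySQL side, for example:

    # List the freshly created metastore tables (credentials as configured above)
    mysql -h 192.168.1.100 -uroot -phadoop -e "USE hive; SHOW TABLES;"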


3.11 After that, Hive still reports the following error:

Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
 

 3.12 Don't panic. Create an iotmp folder under the Hive directory and modify hive-site.xml as shown below:
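
Create the directory first; it is the path the properties below point to:

    # Local scratch/log directory referenced by hive-site.xml
    mkdir -p /opt/my/hive/iotmp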


  <property>
    <name>hive.querylog.location</name>
    <value>/opt/my/hive/iotmp</value>
    <description>Location of Hive run time structured log file</description>
  </property>

  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/opt/my/hive/iotmp</value>
    <description>Local scratch space for Hive jobs</description>
  </property>

  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/opt/my/hive/iotmp</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>

 Save and exit, then start Hive:
  

[root@Master bin]# ./hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/my/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/my/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/opt/my/hive/lib/hive-common-2.0.1.jar!/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive> 
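
From this prompt, a quick sanity check is to list the databases; a fresh installation should show at least the built-in default database:

    hive> show databases;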

3.13 Test the MySQL database connection for Hive

         Log in with a MySQL client using username root and password hadoop. I had created the hive database on MySQL beforehand; after logging in it looks like the screenshot below:

[Screenshot: MySQL client session showing the hive metastore database]
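
For reference, the same check from the mysql command-line client looks roughly like this (credentials as configured above; DBS and TBLS are standard Hive metastore tables):

    mysql -h 192.168.1.100 -uroot -phadoop
    mysql> use hive;
    mysql> show tables;        -- metastore tables such as DBS, TBLS, SDS, COLUMNS_V2, ...
    mysql> select * from DBS;  -- includes a row for Hive's built-in "default" database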


It really is quite amazing!!!

