Apache Hive 2.1.0 Installation Notes

Hive 2.x is now stable enough for regular use. I have installed Hive 0.x and Hive 1.x before; today we look at how to install and use Hive 2.x.

Environment:

CentOS 7.1

Hadoop 2.7.3

JDK 8

Hive 2.1.0

1. Download the latest stable Hive release, and make sure your Hadoop cluster is already running normally

http://ftp.kddilabs.jp/infosystems/apache/hive/

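2. Extract the archive to the target directory

Then go into the conf directory and remove the template suffix from every file that has one. The one exception is hive-default.xml: after removing the suffix, it must also be renamed to hive-site.xml.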
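The renaming step can be sketched as a small shell function. The function name strip_templates is hypothetical, and copying instead of moving is an assumption made here so the original templates stay available for reference; the article itself simply removes the suffix.

```shell
#!/bin/sh
# Strip the .template suffix from the files in a Hive conf directory.
# hive-default.xml.template is the exception: it becomes hive-site.xml.
strip_templates() {
    conf_dir="$1"
    for f in "$conf_dir"/*.template; do
        [ -e "$f" ] || continue            # no template files: glob stayed literal
        target="${f%.template}"
        case "$(basename "$target")" in
            hive-default.xml) target="$conf_dir/hive-site.xml" ;;
        esac
        # Copy (assumption) and never overwrite an existing config file.
        [ -e "$target" ] || cp "$f" "$target"
    done
}

# Usage, with HIVE_HOME set as in the environment variables at the end:
# strip_templates "$HIVE_HOME/conf"
```

Because existing files are never overwritten, running it a second time leaves any edited hive-site.xml untouched.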

3. Configure Hive logging

vi conf/hive-log4j2.properties 

Set the following two properties:
property.hive.log.dir = /home/search/hive/logs 
property.hive.log.file = hive.log
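If you prefer not to edit the file by hand, the same change can be scripted. This is a minimal sketch: set_property is a hypothetical helper, and it assumes the value contains neither | nor &, which sed would treat specially.

```shell
#!/bin/sh
# Set "key = value" in a properties file: rewrite the line if the key
# already exists, otherwise append it.
set_property() {
    file="$1"; key="$2"; value="$3"
    # Escape dots so sed and grep match the key literally.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${key_re}[[:space:]]*=" "$file"; then
        sed "s|^${key_re}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$file.new" &&
            mv "$file.new" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage, with the values from this article:
# set_property conf/hive-log4j2.properties property.hive.log.dir /home/search/hive/logs
# set_property conf/hive-log4j2.properties property.hive.log.file hive.log
```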

4. Use MySQL as the metastore database

For installing MySQL and granting privileges, see the author's earlier article: http://qindongliang.iteye.com/blog/2337865

vi hive-site.xml

Set the following properties:
javax.jdo.option.ConnectionURL=jdbc:mysql://192.168.10.40:3306/hive?createDatabaseIfNotExist=true&characterEncoding=utf-8
javax.jdo.option.ConnectionUserName=root
javax.jdo.option.ConnectionPassword=pwd
javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver
hive.metastore.warehouse.dir=hdfs://192.168.10.38:8020/user/hive/warehouse

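For every other property whose value contains the ${system:java.io.tmpdir} variable, replace the variable with an absolute path. You can create a tmp directory under the Hive root directory and point all of them there.

Finally, remember to copy the MySQL JDBC driver JAR into the hive/lib directory.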
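The replacement can be done in one pass with sed. In this sketch replace_tmpdir is a hypothetical helper name, and the target directory is assumed to contain neither | nor &. It only touches ${system:java.io.tmpdir}; other variables such as ${system:user.name} are left as they are.

```shell
#!/bin/sh
# Replace every ${system:java.io.tmpdir} in a config file with an
# absolute directory, creating that directory if needed.
replace_tmpdir() {
    site_file="$1"
    tmp_dir="$2"
    mkdir -p "$tmp_dir"
    # \$ and \. make sed match the dollar sign and the dots literally.
    sed 's|\${system:java\.io\.tmpdir}|'"$tmp_dir"'|g' "$site_file" > "$site_file.new" &&
        mv "$site_file.new" "$site_file"
}

# Usage, with HIVE_HOME set as in the environment variables at the end:
# replace_tmpdir "$HIVE_HOME/conf/hive-site.xml" "$HIVE_HOME/tmp"
```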

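A note on the JDBC URL above: the characterEncoding parameter sets the database connection encoding to utf-8. Also, because hive-site.xml is an XML file, the & character must be escaped as &amp; when the URL is written there:

jdbc:mysql://192.168.10.40:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=utf-8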
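To show what the escaped entry looks like inside hive-site.xml, here is a small sketch that prints a property element for a given name and value. The helpers xml_escape and hive_property are hypothetical names introduced for illustration; the property/name/value layout is the standard Hadoop configuration format.

```shell
#!/bin/sh
# Escape the characters that are special in XML text.
# & must be handled first so the other entities are not escaped twice.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print one <property> element in the Hadoop/Hive configuration format.
hive_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' \
        "$1" "$(xml_escape "$2")"
}

hive_property javax.jdo.option.ConnectionURL \
    'jdbc:mysql://192.168.10.40:3306/hive?createDatabaseIfNotExist=true&characterEncoding=utf-8'
```

The printed element can be pasted between the configuration tags of hive-site.xml; the other four properties follow the same pattern.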

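In addition, Hive by default loads the HBase libraries, and if HBase is not installed Hive will fail to start. Download HBase and set HBASE_HOME accordingly; all the environment variables are listed at the end of this article.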

5. Starting with Hive 2.x, the metastore schema must be initialized first:

$HIVE_HOME/bin/schematool -initSchema -dbType mysql

Note: if you skip this step and run hive directly, you get the following error:

Caused by: MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3364)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336)
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590)

On success, the output looks like this:

[search@es1 ~]$ $HIVE_HOME/bin/schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/search/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/search/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://192.168.10.40:3306/hive?createDatabaseIfNotExist=true&characterEncoding=utf-8
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.mysql.sql
Initialization script completed
schemaTool completed

6. Test that the cluster works

Create a file named a on the local disk with the following content:

1,a
2,b
3,c
4,a
5,a
2,a
4,2
1,a
1,a
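The file can also be created from the shell with a here-document. This is a sketch; write_test_file is a hypothetical helper name, and the path in the usage line is the one the load statement in this article reads from.

```shell
#!/bin/sh
# Write the nine comma-separated test rows to the given path.
write_test_file() {
    mkdir -p "$(dirname "$1")"
    cat > "$1" <<'EOF'
1,a
2,b
3,c
4,a
5,a
2,a
4,2
1,a
1,a
EOF
}

# Usage:
# write_test_file /home/search/test_hive/a
```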

The create_sql script is as follows:

-- Drop the table if it already exists
drop table if exists info;
-- Create the table
CREATE TABLE info(id string, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
-- Load the data
load data local inpath '/home/search/test_hive/a' into table info;

Finally, run the script. If it finishes without errors, the test has passed:

hive -f create_sql

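Since Hive 2.x, running jobs on MapReduce (MR) is no longer recommended; Tez or Spark is the recommended execution engine, although MR is still supported.

Run the following statement to test: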

hive -e "select count(*) from info"

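If it runs successfully, Hive and Hadoop are integrated correctly.

I will cover Hive on Tez integration in the next article.

7. The environment variables are as follows: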

#JDK
export JAVA_HOME=/home/search/jdk1.8.0_102/
export CLASSPATH=.:$JAVA_HOME/lib
export PATH=$JAVA_HOME/bin:$PATH

#Maven
export MAVEN_HOME=/home/search/apache-maven-3.3.9
export CLASSPATH=$CLASSPATH:$MAVEN_HOME/lib
export PATH=$PATH:$MAVEN_HOME/bin

#Ant
export ANT_HOME=/home/search/ant
export CLASSPATH=$CLASSPATH:$ANT_HOME/lib
export PATH=$PATH:$ANT_HOME/bin

#Hadoop
export HADOOP_HOME=/home/search/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export CLASSPATH=$CLASSPATH:$HADOOP_COMMON_HOME:$HADOOP_COMMON_HOME/lib:$HADOOP_MAPRED_HOME:$HADOOP_HDFS_HOME

#Hbase
export HBASE_HOME=/home/search/hbase
export CLASSPATH=$CLASSPATH:$HBASE_HOME/lib
export PATH=$HBASE_HOME/bin:$PATH

#Pig
export PIG_HOME=/home/search/pig
export PIG_CLASSPATH=$PIG_HOME/lib:$HADOOP_HOME/etc/hadoop
export PATH=$PIG_HOME/bin:$PATH

#Zookeeper
export ZOOKEEPER_HOME=/home/search/zookeeper
export CLASSPATH=$CLASSPATH:$ZOOKEEPER_HOME/lib
export PATH=$PATH:$ZOOKEEPER_HOME/bin

#Hive
export HIVE_HOME=/home/search/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
export CLASSPATH=$CLASSPATH:$HIVE_HOME/lib
export PATH=$PATH:$HIVE_HOME/bin:$HIVE_HOME/conf

#JStorm
export JSTORM_HOME=/home/search/jstorm-2.1.1
export CLASSPATH=$CLASSPATH:$JSTORM_HOME/lib
export PATH=$PATH:$JSTORM_HOME/bin

#Scala
export SCALA_HOME=/home/search/scala
export CLASSPATH=$CLASSPATH:$SCALA_HOME/lib
export PATH=$PATH:$SCALA_HOME/bin

#Spark
export SPARK_HOME=/ROOT/server/spark
export PATH=$PATH:$SPARK_HOME/bin
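Since a missing or mistyped home directory (HBASE_HOME, for example) can stop Hive from starting, it may help to verify the variables after sourcing the profile. This is a minimal sketch; check_home is a hypothetical helper, and the list of variables checked is just the subset this article depends on.

```shell
#!/bin/sh
# Report whether the named variable is set and points to a directory.
check_home() {
    name="$1"
    eval "dir=\${$name:-}"               # indirect lookup of the variable
    if [ -z "$dir" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "$name=$dir is not a directory"
        return 1
    fi
    echo "$name=$dir ok"
}

for v in JAVA_HOME HADOOP_HOME HBASE_HOME HIVE_HOME; do
    check_home "$v" || true              # keep going so every variable is reported
done
```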
