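Apache Hive is a data warehouse built on top of Hadoop that provides data summarization, querying, and analysis. Hadoop is assumed to be installed already (see the Hadoop database installation guide); this article describes how to install and configure Hive. The Hive architecture is shown in the figure below:
[Figure 1: Hive architecture]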
I. MySQL Configuration
The metastore is a standalone relational database in which Hive stores table schemas and other system metadata. MySQL serves as the Hive metastore here.

[root@mydb01 ~]# mysql -u smsqw -p
mysql> drop database if exists metadb;
mysql> create database metadb;
mysql> grant all on metadb.* to smsqw@'%' identified by 'abcABC@12';
mysql> flush privileges;

II. Hive Installation and Configuration
1. Downloading and Installing Hive

hadoop@bdi:~$ cd /u01/software
hadoop@bdi:/u01/software$ wget http://mirrors.hust.edu.cn/apache/hive/stable-2/apache-hive-2.3.2-bin.tar.gz
hadoop@bdi:/u01/software$ tar -xzf apache-hive-2.3.2-bin.tar.gz -C /u01
hadoop@bdi:/u01/software$ cd ../
hadoop@bdi:/u01$ mv apache-hive-2.3.2-bin hive

2. Setting the Hive Environment Variables
Edit the .bashrc file and add the following:

hadoop@bdi:~$ vi .bashrc
export HIVE_HOME=/u01/hive
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:/u01/hive/lib/*:.
hadoop@bdi:~$ source .bashrc
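
A quick check after sourcing .bashrc can catch a mistyped path before the later steps fail. The following is a minimal sketch that assumes the values set above; `path_contains` is a hypothetical helper written for this illustration, not something Hive provides.

```shell
# Hypothetical helper: succeed if directory $1 is an entry in PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Same values as in the .bashrc lines above.
export HIVE_HOME=/u01/hive
export PATH=$PATH:$HIVE_HOME/bin

if path_contains "$HIVE_HOME/bin"; then
  echo "PATH contains $HIVE_HOME/bin"
else
  echo "PATH is missing $HIVE_HOME/bin" >&2
fi
```

Wrapping PATH in colons before matching avoids false positives on entries that merely share a prefix, such as /u01/hive/bin2.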

3. Creating Directories on HDFS

hadoop@bdi:~$ hadoop fs -mkdir /tmp 
hadoop@bdi:~$ hadoop fs -mkdir -p /user/hive/warehouse
hadoop@bdi:~$ hadoop fs -chmod g+w /tmp 
hadoop@bdi:~$ hadoop fs -chmod g+w /user/hive/warehouse
hadoop@bdi:~$ hadoop fs -ls /
Found 3 items
drwxr-xr-x   - hadoop supergroup          0 2017-12-05 14:48 /hbase
drwxr-xr-x   - hadoop supergroup          0 2017-12-06 10:07 /user
drwxrwx---   - hadoop supergroup          0 2017-12-05 10:13 /tmp
hadoop@bdi:~$ hadoop fs -ls /user/hive
Found 1 items
drwxrwxr-x   - hadoop supergroup          0 2017-12-06 10:07 /user/hive/warehouse

4. Configuring Hive
Edit the hive-env.sh file and add the following:

hadoop@bdi:~$ cd $HIVE_HOME/conf
hadoop@bdi:/u01/hive/conf$ cp hive-env.sh.template hive-env.sh
hadoop@bdi:/u01/hive/conf$ vi hive-env.sh
......
export HADOOP_HOME=/u01/hadoop

Create the hive-site.xml file and modify it as follows:

hadoop@bdi:/u01/hive/conf$ cp hive-default.xml.template hive-site.xml
hadoop@bdi:/u01/hive/conf$ vi hive-site.xml

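[Figure 2: hive-site.xml settings]
The configuration above sets the temporary working directory (hive.exec.scratchdir), the warehouse directory (hive.metastore.warehouse.dir), and the connection to the MySQL database that stores Hive's metadata and table schemas.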
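
Since the settings survive only as a figure, here is a hedged sketch of how those properties might look. The connection URL, driver class, and user are taken from the schematool output later in this guide, and the password from the MySQL grant statement. The scratch directory value /tmp/hive is an assumption. The sketch writes to a scratch file rather than to $HIVE_HOME/conf/hive-site.xml. The schematool output also reports mysql-connector-java-5.1.44, so the MySQL JDBC driver JAR is presumably already in $HIVE_HOME/lib.

```shell
# Write the sketch to a scratch file so an existing hive-site.xml is untouched.
out=$(mktemp)

cat > "$out" <<'EOF'
<configuration>
  <!-- HDFS scratch space for query execution; /tmp/hive is an assumed value -->
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
  </property>
  <!-- HDFS warehouse directory created in step 3 -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <!-- MySQL metastore connection, as reported by schematool in step 5 -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.120.92:3306/metadb?useSSL=false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>smsqw</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>abcABC@12</value>
  </property>
</configuration>
EOF

echo "Sketch written to $out"
```

In the copied template these properties already exist with default values, so there the matching `<value>` elements would be edited in place rather than appended.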
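In addition, create a local temporary directory and replace every value in hive-site.xml that contains ${system:java.io.tmpdir} with its path. Otherwise the hive command fails with the error shown in the figure below:
hadoop@bdi:~$ mkdir $HIVE_HOME/iotmp
hadoop@bdi:~$ vi $HIVE_HOME/conf/hive-site.xml
[Figure 3: error reported by the hive command]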
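
The placeholder occurs many times in the template copy, so the replacement can also be scripted instead of done by hand in vi. The following is a minimal sketch using sed, run on a throwaway sample file so that it works without a Hive install. It assumes the template pairs the placeholder with `${system:user.name}` and that this placeholder needs the same treatment. The user name hadoop is taken from the shell prompts in this guide.

```shell
# Build a small sample file that mimics one affected property.
workdir=$(mktemp -d)
conf="$workdir/hive-site.xml"
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>${system:java.io.tmpdir}/${system:user.name}</value>
  </property>
</configuration>
EOF

iotmp=/u01/hive/iotmp   # the $HIVE_HOME/iotmp directory created above
user=hadoop             # assumed OS user running Hive

# Replace both placeholders in place; a .bak copy keeps the original.
sed -i.bak \
  -e 's|\${system:java\.io\.tmpdir}|'"$iotmp"'|g' \
  -e 's|\${system:user\.name}|'"$user"'|g' \
  "$conf"

grep '<value>' "$conf"
```

To apply it to the real file, point `conf` at $HIVE_HOME/conf/hive-site.xml and drop the sample-file creation.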
5. Initializing the Metastore Schema
The offline schematool utility is used to initialize the schema, as follows:

hadoop@bdi:/u01/hive/lib$ schematool -dbType mysql -initSchema -verbose
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/u01/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://192.168.120.92:3306/metadb?useSSL=false
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       smsqw
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Connecting to jdbc:mysql://192.168.120.92:3306/metadb?useSSL=false
Connected to: MySQL (version 5.7.20)
Driver: MySQL Connector Java (version mysql-connector-java-5.1.44 ( Revision: b3cda4f864902ffdde495b9df93937c3e20009be ))
Transaction isolation: TRANSACTION_READ_COMMITTED
......
0: jdbc:mysql://192.168.120.92:3306/metadb> !closeall
Closing: 0: jdbc:mysql://192.168.120.92:3306/metadb?useSSL=false
beeline> 
beeline> Initialization script completed
schemaTool completed
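
After initialization, schematool can also report the schema version recorded in the metastore through its `-info` option, which gives a quick confirmation that the MySQL connection settings work. The following is a minimal sketch; `check_metastore_schema` is a hypothetical wrapper name used only for this illustration.

```shell
# Hypothetical wrapper: print metastore schema info,
# or return 1 if schematool is not available on PATH.
check_metastore_schema() {
  if ! command -v schematool >/dev/null 2>&1; then
    echo "schematool not found on PATH; is \$HIVE_HOME/bin exported?" >&2
    return 1
  fi
  schematool -dbType mysql -info
}

check_metastore_schema || echo "schema check did not succeed" >&2
```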

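6. Testing and Verifying Hive
With the preparation complete, the configuration can be verified through the Hive CLI: create a database and a table, then display the table's schema.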

hadoop@bdi:~$ hive --service metastore &
hadoop@bdi:~$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/u01/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/u01/hive/lib/hive-common-2.3.2.jar!/hive-log4j2.properties Async: true
hive> show databases;
OK
default
Time taken: 5.321 seconds, Fetched: 1 row(s)
hive> create database if not exists hivedb;  
OK
Time taken: 0.399 seconds
hive> create table hivedb.htb01 (id int,name string);
OK
Time taken: 0.23 seconds
hive> use hivedb;
OK
Time taken: 0.027 seconds
hive> show tables;
OK
htb01
Time taken: 0.037 seconds, Fetched: 1 row(s)
hive> show create table hivedb.htb01;
OK
CREATE TABLE `hivedb.htb01`(
  `id` int, 
  `name` string)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://192.168.120.95:9000/user/hive/warehouse/hivedb.db/htb01'
TBLPROPERTIES (
  'transient_lastDdlTime'='1512562710')
Time taken: 0.215 seconds, Fetched: 13 row(s)
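
The session above exercises only DDL. To check that data can be written and read as well, a few rows can be loaded non-interactively with `hive -e`. The following is a minimal sketch. The sample rows are made up for illustration. The file uses Ctrl-A (\001) as the field separator, which is the default for a text table created without a ROW FORMAT clause, as htb01 was.

```shell
# Build a two-row sample file; the values are illustrative only.
data=$(mktemp)
printf '1\001alice\n2\001bob\n' > "$data"

if command -v hive >/dev/null 2>&1; then
  # Load the local file into the table created above, then read it back.
  hive -e "LOAD DATA LOCAL INPATH '$data' INTO TABLE hivedb.htb01;
           SELECT id, name FROM hivedb.htb01;"
else
  echo "hive not found on PATH; sample file left at $data" >&2
fi
```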
