1. Install Hadoop.
2. Download mysql-connector-java-5.1.26-bin.jar (or another version of the driver JAR) from Maven and place it in the lib folder under the Hive directory.
3. Configure the Hive environment variable: HIVE_HOME=F:\hadoop\apache-hive-2.1.1-bin
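If HIVE_HOME is not yet defined, it can be set from a Windows command prompt; a minimal sketch (the path matches the install location above; setx only affects newly opened consoles, and %HIVE_HOME%\bin still needs to be added to PATH by hand):

setx HIVE_HOME "F:\hadoop\apache-hive-2.1.1-bin"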
4. Hive configuration
Hive's configuration files live in $HIVE_HOME/conf, which contains four default configuration templates:
hive-default.xml.template            default configuration template
hive-env.sh.template                 default hive-env.sh settings
hive-exec-log4j.properties.template  default exec log settings
hive-log4j.properties.template       default log settings
Hive will run without any of these being modified: out of the box, the metadata is stored in an embedded Derby database, which few people are familiar with. We want to switch to MySQL for storing the metadata, and also change the data and log locations, so we need a configuration of our own. The following describes how to set it up.
(1) Create the configuration files from the templates (sample copy commands follow this list):
$HIVE_HOME/conf/hive-default.xml.template -> $HIVE_HOME/conf/hive-site.xml
$HIVE_HOME/conf/hive-env.sh.template -> $HIVE_HOME/conf/hive-env.sh
$HIVE_HOME/conf/hive-exec-log4j.properties.template -> $HIVE_HOME/conf/hive-exec-log4j.properties
$HIVE_HOME/conf/hive-log4j.properties.template -> $HIVE_HOME/conf/hive-log4j.properties
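On Windows each file can be created by copying its template; a minimal sketch, run from a command prompt in the Hive install directory (F:\hadoop\apache-hive-2.1.1-bin, per the paths above):

copy conf\hive-default.xml.template conf\hive-site.xml
copy conf\hive-env.sh.template conf\hive-env.sh
copy conf\hive-exec-log4j.properties.template conf\hive-exec-log4j.properties
copy conf\hive-log4j.properties.template conf\hive-log4j.properties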
(2) Edit hive-env.sh
export HADOOP_HOME=F:\hadoop\hadoop-2.7.2
export HIVE_CONF_DIR=F:\hadoop\apache-hive-2.1.1-bin\conf
export HIVE_AUX_JARS_PATH=F:\hadoop\apache-hive-2.1.1-bin\lib
(3) Edit hive-site.xml, setting the following properties (shown in the <property> form the file uses):

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>F:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>F:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>F:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
  <description>Location of Hive run time structured log file</description>
</property>
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>F:/hadoop/apache-hive-2.1.1-bin/hive/iotmp/operation_logs</value>
  <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>root</value>
</property>
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>true</value>
</property>
<property>
  <name>datanucleus.autoCreateTables</name>
  <value>true</value>
</property>
<property>
  <name>datanucleus.autoCreateColumns</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>Enforce metastore schema version consistency.
  True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures proper metastore schema migration. (Default)
  False: Warn if the version information stored in metastore doesn't match with one from Hive jars.</description>
</property>
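One hedged side note: Hive 2.x consolidated the three datanucleus.autoCreate* switches, so if they appear to have no effect on this version, the single equivalent property (assuming Hive 2.1 behavior) would be:

<property>
  <name>datanucleus.schema.autoCreateAll</name>
  <value>true</value>
</property>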
Note: the HDFS directories have to be created in Hadoop beforehand (-p creates any missing parent directories):
./hadoop fs -mkdir -p /tmp
./hadoop fs -mkdir -p /user/hive/warehouse
./hadoop fs -chmod g+w /tmp
./hadoop fs -chmod g+w /user/hive/warehouse
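A quick check that the directories exist with group write permission (listing output will vary by cluster):

./hadoop fs -ls /
./hadoop fs -ls /user/hive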
(4) Log file configuration: omitted here.
5. MySQL setup
(1) Create the hive database: create database hive default character set latin1;
(2) grant all on hive.* to hive@'localhost' identified by 'hive';
flush privileges;
-- I used the root user, so this step can be skipped.
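To verify the grant (assuming the hive/hive credentials created in step (2)), a one-off connection from the command line should list the hive database:

mysql -uhive -phive -e "show databases;"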
6. Start Hive
(1) Start Hadoop: start-all.cmd
(2) Start the metastore service: hive --service metastore
(3) Start Hive: hive
If Hive starts successfully, the local-mode installation of Hive is complete.
7. Check the MySQL database:
use hive; show tables;
8. Create a table in Hive: CREATE TABLE xp(id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
9. Check it in MySQL: select * from TBLS;
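If everything is wired up, TBLS should now contain a row for the table created in step 8; a narrower query (column names taken from the standard metastore schema) would be:

select TBL_ID, TBL_NAME, TBL_TYPE from TBLS;

The row for xp should show TBL_TYPE as MANAGED_TABLE.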
Problems encountered during installation
(1) When starting Hive, the error: Required table missing : "`VERSION`" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations
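A common cause is a metastore schema that was never initialized in MySQL. Assuming the MySQL settings above, initializing it explicitly with schematool (shipped with Hive 2.x) is the usual remedy; run it before starting Hive:

hive --service schematool -dbType mysql -initSchema

Alternatively, with hive.metastore.schema.verification left at false and the DataNucleus auto-create properties enabled as in step 4, Hive can create the VERSION table on first start.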