Basic environment and resources
Hadoop: 2.7.x
Hive: 2.1.x binary release (bin.tar.gz)
Hive: 1.x source release (src.tar.gz)
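Step 1: Install Hadoop 2.7.x on Windows; please refer to:
Step 2: Download Hive.tar.gz from the official archive: http://archive.apache.org/dist/hive
Then unpack Hive.tar.gz into the target directory (C:\hive) and configure the global Hive environment variables.
Hive global environment variables: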
Step 3: Hive configuration files (C:\hive\apache-hive-2.1.1-bin\conf)
The configuration directory C:\hive\apache-hive-2.1.1-bin\conf contains four default configuration templates; copy each one to a new file name:
hive-default.xml.template -----> hive-site.xml
hive-env.sh.template -----> hive-env.sh
hive-exec-log4j2.properties.template -----> hive-exec-log4j2.properties
hive-log4j2.properties.template -----> hive-log4j2.properties
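The four copies above can also be scripted. The following is a minimal sketch in Python, not part of the original setup: the function name copy_templates is hypothetical, and the log4j2 template names in the mapping are an assumption about what the Hive 2.x conf directory contains, so adjust the mapping if your directory lists different names. Templates that are missing are skipped, and existing target files are never overwritten.

```python
import shutil
from pathlib import Path

# Template name -> target name. The log4j2 template names are an assumption
# about the Hive 2.x conf directory; edit the mapping to match your files.
DEFAULT_TEMPLATES = {
    "hive-default.xml.template": "hive-site.xml",
    "hive-env.sh.template": "hive-env.sh",
    "hive-exec-log4j2.properties.template": "hive-exec-log4j2.properties",
    "hive-log4j2.properties.template": "hive-log4j2.properties",
}


def copy_templates(conf_dir, templates=DEFAULT_TEMPLATES):
    """Copy each template to its target name without overwriting.

    Returns the list of target file names that were actually created.
    """
    conf = Path(conf_dir)
    created = []
    for template_name, target_name in templates.items():
        source = conf / template_name
        target = conf / target_name
        if not source.is_file() or target.exists():
            continue  # template missing, or target already present
        shutil.copyfile(source, target)
        created.append(target_name)
    return created
```

For the layout used in this guide, it would be called as copy_templates(r"C:\hive\apache-hive-2.1.1-bin\conf").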
Step 4: Create a local directory that the configuration files below will use
C:\hive\apache-hive-2.1.1-bin\my_hive
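The hive-site.xml values in this guide point at four sub-directories of my_hive (scratch_dir, resources_dir, querylog_dir, operation_logs_dir). Hive may create them itself, but creating them up front rules out one class of permission and path errors. A minimal Python sketch, with the hypothetical helper name make_local_dirs:

```python
from pathlib import Path

# Sub-directory names taken from the hive-site.xml values used in this guide.
LOCAL_SUBDIRS = (
    "scratch_dir",
    "resources_dir",
    "querylog_dir",
    "operation_logs_dir",
)


def make_local_dirs(base_dir):
    """Create base_dir and its Hive sub-directories; safe to run repeatedly."""
    base = Path(base_dir)
    made = []
    for name in LOCAL_SUBDIRS:
        path = base / name
        path.mkdir(parents=True, exist_ok=True)  # also creates base_dir
        made.append(path)
    return made
```

Example call for this guide's layout: make_local_dirs(r"C:\hive\apache-hive-2.1.1-bin\my_hive").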
Step 5: Configuration files that need to be adjusted (hive-site.xml and hive-env.sh)
Edit the file C:\hive\apache-hive-2.1.1-bin\conf\hive-site.xml and set the following properties:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>C:/hive/apache-hive-2.1.1-bin/my_hive/scratch_dir</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>C:/hive/apache-hive-2.1.1-bin/my_hive/resources_dir/${hive.session.id}_resources</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>C:/hive/apache-hive-2.1.1-bin/my_hive/querylog_dir</value>
  <description>Location of Hive run time structured log file</description>
</property>
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>C:/hive/apache-hive-2.1.1-bin/my_hive/operation_logs_dir</value>
  <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.60.178:3306/hive?serverTimezone=UTC&amp;useSSL=false&amp;allowPublicKeyRetrieval=true</value>
  <description>JDBC connect string for a JDBC metastore.</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.cj.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>admini</value>
  <description>Username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>
    Enforce metastore schema version consistency.
    True: Verify that the version information stored in the metastore is compatible with the one from the Hive jars. Also disable automatic
          schema migration attempts. Users are required to manually migrate the schema after a Hive upgrade, which ensures
          proper metastore schema migration. (Default)
    False: Warn if the version information stored in the metastore doesn't match the one from the Hive jars.
  </description>
</property>
<property>
  <name>datanucleus.schema.autoCreateAll</name>
  <value>true</value>
  <description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once. To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
</property>
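Because hive-site.xml is copied from a very large template, it is easy to miss one of the properties listed above. The following Python sketch is one possible sanity check, not something Hive provides: it parses the XML text and reports which of the properties from this guide are absent or empty. The names load_properties and missing_properties are hypothetical helpers. Note that inside the XML file the & characters of the JDBC URL must be written as &amp;amp; and the parser turns them back into plain & characters.

```python
import xml.etree.ElementTree as ET

# Property names taken from the list edited in this guide.
REQUIRED = (
    "hive.metastore.warehouse.dir",
    "hive.exec.scratchdir",
    "hive.exec.local.scratchdir",
    "hive.downloaded.resources.dir",
    "hive.querylog.location",
    "hive.server2.logging.operation.log.location",
    "javax.jdo.option.ConnectionURL",
    "javax.jdo.option.ConnectionDriverName",
    "javax.jdo.option.ConnectionUserName",
    "javax.jdo.option.ConnectionPassword",
    "hive.metastore.schema.verification",
    "datanucleus.schema.autoCreateAll",
)


def load_properties(xml_text):
    """Return {name: value} for every <property> element in the XML text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_properties(props, required=REQUIRED):
    """List required property names that are absent or have an empty value."""
    return [name for name in required if not props.get(name)]
```

A typical use would read the file with Path(...).read_text(encoding="utf-8"), pass the text to load_properties, and print the result of missing_properties.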
Edit the file C:\hive\apache-hive-2.1.1-bin\conf\hive-env.sh (the paths are single-quoted so that a POSIX shell keeps the backslashes):
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME='C:\hadoop\hadoop-2.7.6'
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR='C:\hive\apache-hive-2.1.1-bin\conf'
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH='C:\hive\apache-hive-2.1.1-bin\lib'
Step 6: Create the HDFS directories on Hadoop
hadoop fs -mkdir /tmp
hadoop fs -mkdir /user/
hadoop fs -mkdir /user/hive/
hadoop fs -mkdir /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
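The same directories can be created from a script. The sketch below is an illustration under a few assumptions: it uses the -p flag of hadoop fs -mkdir so that parent directories are created in one call and reruns do not fail on existing directories, and it resolves the hadoop launcher through the PATH (on Windows this is typically hadoop.cmd). The function names build_commands and run_commands are hypothetical. By default it only prints the commands.

```python
import shutil
import subprocess

# HDFS directories used by the hive-site.xml values in this guide.
HDFS_DIRS = ("/tmp", "/user/hive/warehouse")


def build_commands(dirs=HDFS_DIRS):
    """Return the hadoop fs argument lists that create and open up each dir."""
    commands = []
    for path in dirs:
        commands.append(["hadoop", "fs", "-mkdir", "-p", path])
        commands.append(["hadoop", "fs", "-chmod", "g+w", path])
    return commands


def run_commands(commands, dry_run=True):
    """Print every command; execute it as well when dry_run is False."""
    launcher = None
    if not dry_run:
        # Resolve the real launcher (for example hadoop.cmd on Windows).
        launcher = shutil.which("hadoop")
        if launcher is None:
            raise FileNotFoundError("hadoop launcher not found on PATH")
    for command in commands:
        print(" ".join(command))
        if launcher is not None:
            subprocess.run([launcher] + command[1:], check=True)
```

Calling run_commands(build_commands()) shows what would be executed; pass dry_run=False once Hadoop is running.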
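Step 7: In MySQL, create the database named hive that the Hive metastore initialization depends on. Note the character set: latin1.
Step 8: Start the Hive services
(1) First start Hadoop by running: start-all.cmd
(2) Initialize the Hive metastore data by running: hive --service metastore
If everything is working, the cmd window shows output like the following screenshot:
If the initialization succeeded, the hive database in MySQL now contains the metastore tables, as in the following screenshot:
(3) Start Hive by running: hive
This completes the Hive setup on Windows 10.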
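Problem (1): Initializing the Hive data with hive --service metastore kept failing with errors.
Solution: Initialize the Hive database with the SQL scripts that ship with Hive.
The scripts are located under C:\hive\apache-hive-2.1.1-bin\scripts\metastore\upgrade; choose the SQL version to run, as in the following screenshot:
Choose the SQL script version (hive-schema-x.x.x.mysql.sql) that corresponds to the Hive version you are running (Hive_x.x.x).
Note: My Hive version is 2.1.1, so I chose the script hive-schema-2.1.0.mysql.sql.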
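Picking the script by eye is easy to get wrong when the directory holds many versions. The Python sketch below encodes one possible rule: take the highest hive-schema-x.x.x.mysql.sql version that is not newer than the Hive version. That rule is my generalization of the 2.1.1 to 2.1.0 choice above, not something stated by the Hive documentation, and pick_schema_script is a hypothetical helper name.

```python
import re

# Matches file names such as hive-schema-2.1.0.mysql.sql.
SCHEMA_RE = re.compile(r"^hive-schema-(\d+)\.(\d+)\.(\d+)\.mysql\.sql$")


def parse_version(text):
    """Turn '2.1.1' into the comparable tuple (2, 1, 1)."""
    return tuple(int(part) for part in text.split("."))


def pick_schema_script(file_names, hive_version):
    """Return the newest schema script not newer than hive_version, or None."""
    target = parse_version(hive_version)
    candidates = []
    for name in file_names:
        match = SCHEMA_RE.match(name)
        if match is None:
            continue  # upgrade scripts and other files are ignored
        version = tuple(int(group) for group in match.groups())
        if version <= target:
            candidates.append((version, name))
    if not candidates:
        return None
    return max(candidates)[1]
```

The file names would normally come from os.listdir on the MySQL script directory; the chosen script is then run against the hive database from the MySQL client.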
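Problem (2): The Hive_x.x.x_bin.tar.gz release lacks the executables and launch scripts needed to run Hive on Windows.
Solution: Download an older Hive release (apache-hive-1.0.0-src) and replace the original bin directory of the target installation (C:\hive\apache-hive-2.1.1-bin) with its bin directory.
Screenshot: the directory structure of apache-hive-1.0.0-src\bin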