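I won't introduce Hive here; plenty of overviews are a quick search away.
This post focuses on configuring Hive on an already working Hadoop cluster.
1. Hive download address (Apache mirror): http://mirror.bit.edu.cn/apache/hive/
2. Extract the archive and rename the directory to hive
### Download the archive to the local machine with wget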
wget http://mirror.bit.edu.cn/apache/hive/hive-2.3.6/apache-hive-2.3.6-bin.tar.gz
tar -zxvf apache-hive-2.3.6-bin.tar.gz -C /usr/local/
## Rename the extracted directory (it was unpacked into /usr/local/)
mv /usr/local/apache-hive-2.3.6-bin /usr/local/hive
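3. Add Hive to the environment variables in /etc/profile
A quick tip: in the lines below there must be no spaces on either side of "=", otherwise the shell will not set the variable.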
[root@master local]# vi /etc/profile
# Hive setting
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
## Reload the profile so the new variables take effect
[root@master local]# source /etc/profile
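The "no spaces around =" rule is easy to break when editing by hand. Below is a minimal sketch of a check for it; the function name find_bad_exports is hypothetical (not part of Hive or the shell), and the demo runs on a throwaway file rather than the real /etc/profile.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list export lines that have whitespace on either
# side of "=", for example "export HIVE_HOME = /usr/local/hive".
# Prints matching lines with their line numbers; returns non-zero if none.
find_bad_exports() {
    local file="$1"
    grep -nE '^[[:space:]]*export[[:space:]]+[A-Za-z_][A-Za-z0-9_]*([[:space:]]+=|=[[:space:]]+)' "$file"
}

# Demo on a temporary file: one correct line, one broken line.
demo_file="$(mktemp)"
cat > "$demo_file" <<'EOF'
export HIVE_HOME=/usr/local/hive
export PATH = $PATH:$HIVE_HOME/bin
EOF

if find_bad_exports "$demo_file"; then
    echo "fix the lines listed above"
else
    echo "all export lines look fine"
fi
rm -f "$demo_file"
```

To check the real file, call find_bad_exports /etc/profile after editing and before running source.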
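4. Check that the configuration works
If the command below prints the Hive version, the environment is set up correctly.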
[root@master local]# hive --version
Hive 2.3.6
Git git://HW13934/Users/gates/tmp/hive-branch-2.3/hive -r 2c2fdd524e8783f6e1f3ef15281cc2d5ed08728f
Compiled by gates on Tue Aug 13 11:56:35 PDT 2019
From source with checksum c44b0b7eace3e81ba3cf64e7c4ea3b19
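5. Let me pause the Hive configuration here for a moment.
If MySQL or MariaDB is already installed locally on this server, skip this step. If not, I suggest simply installing MariaDB: it is a community-developed, open-source fork of MySQL and can be installed straight from yum, as shown below.
For everything in this post, MariaDB is used the same way as MySQL.
5.1 Install and initialize MariaDB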
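## Install the MariaDB server and client
[root@master ~]# yum -y install mariadb-server mariadb
## Start the database and enable it at boot (the server must be running before the next step)
[root@master ~]# systemctl start mariadb
[root@master ~]# systemctl enable mariadb
## Run the interactive initialization script (root password, test accounts, remote root login)
[root@master ~]# mysql_secure_installation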
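If you are unsure how to answer the mysql_secure_installation prompts, look the command up first.
5.2 Once initialization is done, create a database named hive for the metastore; the later steps depend on it.
MariaDB [(none)]> create database hive;
That is all that is needed here.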
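The same database can be created without opening the interactive prompt. This is only a sketch: hive_db_sql is a hypothetical helper, and the commented command assumes the mysql client is on the PATH and that root logs in with a password.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SQL that creates the metastore database.
# Building the statement separately makes it easy to review before running.
hive_db_sql() {
    local db="${1:-hive}"
    printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$db"
}

# Print the statement for the database name used in this post.
hive_db_sql hive

# To run it against a live server (prompts for the root password):
#   hive_db_sql hive | mysql -uroot -p
```

IF NOT EXISTS makes the command safe to repeat if the database was already created by hand.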
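6. Configure hive-site.xml
This is where most mistakes happen, especially around schema initialization. Following the steps below closely should get it working on the first try.
A fresh install has no hive-site.xml in hive/conf, only a template file.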
[root@master conf]# pwd
/usr/local/hive/conf
[root@master conf]# cp hive-default.xml.template hive-site.xml
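6.1 Prepare the hive-site.xml file.
I suggest deleting everything copied from the template first, so that nothing interferes with the settings added next. If you need a property later, copy it over from hive-default.xml.template.
Trim the file down to just this: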
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
</configuration>
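6.2 Add the settings to hive-site.xml
Copying this is fine, but be sure to change the values to your own rather than pasting them unchanged.
Pay attention to the comments in the snippet,
because a mistake here makes the initialization in step 7 fail.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Database user name. -->
  <!-- If your user is not root, change this to your own! -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <!-- Database password. -->
  <!-- If your password is not 123456, change this to your own! -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
  <!-- JDBC connection URL: host, port and database name. -->
  <!-- Use localhost here. After mysql_secure_installation the root account is typically -->
  <!-- allowed to connect only from localhost, so an IP address or host name is rejected -->
  <!-- (see error 2 in step 8). -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive</value>
  </property>
  <!-- JDBC driver class. The MySQL Connector/J jar must be on Hive's classpath, -->
  <!-- for example in /usr/local/hive/lib. -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
</configuration>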
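Because the only parts that change between machines are the user, password and URL, the file can also be generated from variables. The sketch below assumes the five properties from step 6.2; write_hive_site is a hypothetical helper, and the demo writes to a temporary file so nothing under /usr/local/hive is touched.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a minimal hive-site.xml for a MySQL/MariaDB metastore.
# Arguments: output path, database user, database password, JDBC URL.
# Note: values containing XML special characters (& or <) would need escaping first.
write_hive_site() {
    local out="$1" user="$2" password="$3" url="$4"
    cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>${user}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>${password}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>${url}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
</configuration>
EOF
}

# Example values taken from this post; replace them with your own.
target="$(mktemp)"
write_hive_site "$target" root 123456 "jdbc:mysql://localhost:3306/hive"
echo "wrote $target"
```

After reviewing the generated file, copy it to /usr/local/hive/conf/hive-site.xml.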
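7. Initialize the metastore database
7.1 First start the Hadoop cluster and the database, and make sure the hive database exists.
7.2 Run this step only after 7.1 is done!
1) Make sure you are in the right directory.
2) I suggest invoking the tool as ./schematool.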
[root@master bin]# pwd
/usr/local/hive/bin
## Initialize the metastore schema
[root@master bin]# ./schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://localhost:3306/hive
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
[root@master bin]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/java/jdk1.8.0_231/bin:/usr/local/java/jdk1.8.0_231/jre/bin:/usr/local/hadoop/bin/:/usr/local/hadoop/sbin/:/usr/local/hive/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.3.6.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
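8. Here are the failures I ran into when step 6.2 was not configured correctly.
The errors look like this:
Error 1: initialization fails outright. (42X01 is an Apache Derby SQLSTATE, so schematool was most likely still using the default embedded Derby connection instead of the MySQL settings.)
Error: Syntax error: Encountered "" at line 1, column 64. (state=42X01,code=30000)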
Initialization script hive-schema-2.3.0.mysql.sql
Error: Syntax error: Encountered "" at line 1, column 64. (state=42X01,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
Error 2: Hive starts, but the metastore schema was never initialized
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : null, message from server: "Host 'master.hadoop.com' is not allowed to connect to this MariaDB server"
[root@master bin]# ./schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://192.168.12.131:3306/hive
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : null, message from server: "Host 'master.hadoop.com' is not allowed to connect to this MariaDB server"
SQL Error code: 1130
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
[root@master hive]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/java/jdk1.8.0_231/bin:/usr/local/java/jdk1.8.0_231/jre/bin:/usr/local/hadoop/bin/:/usr/local/hadoop/sbin/:/usr/local/hive/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.3.6.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
NoViableAltException(24@[])
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1300)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
FAILED: ParseException line 1:0 cannot recognize input near 'exit' 'exit' ''
Interrupting... Be patient, this might take some time.
Press Ctrl+C again to kill JVM
hive>
>
The fix is the one from step 6.2: the host in the database connection URL must be localhost.
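The host can be checked before running schematool at all. This is a rough sketch, not a real XML parser: metastore_host is a hypothetical helper that assumes the <name> and <value> elements sit on consecutive lines, as in step 6.2, and that the URL starts with jdbc:mysql://.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the host part of javax.jdo.option.ConnectionURL.
# Assumes <name> and <value> are on consecutive lines in hive-site.xml.
metastore_host() {
    local site="$1"
    grep -A1 '<name>javax.jdo.option.ConnectionURL</name>' "$site" \
        | sed -n 's|.*<value>jdbc:mysql://\([^:/]*\).*|\1|p'
}

# Check the file from this post's layout if it exists; otherwise do nothing.
site="${HIVE_HOME:-/usr/local/hive}/conf/hive-site.xml"
if [ -r "$site" ]; then
    host="$(metastore_host "$site")"
    if [ "$host" = "localhost" ]; then
        echo "metastore host is localhost"
    else
        echo "warning: metastore host is '$host', expected localhost"
    fi
else
    echo "no readable hive-site.xml at $site; nothing to check"
fi
```

With the failing configuration shown above, the helper would report 192.168.12.131 instead of localhost.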
Case 3: a successful run
[root@master bin]# ./schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://localhost:3306/hive
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
[root@master bin]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/java/jdk1.8.0_231/bin:/usr/local/java/jdk1.8.0_231/jre/bin:/usr/local/hadoop/bin/:/usr/local/hadoop/sbin/:/usr/local/hive/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.3.6.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
If you still hit errors, go back to step 6 and match its configuration exactly.
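The three outcomes above can be told apart by a few strings that appear in the logs. The sketch below looks only for strings copied from those logs; the function name and the labels it prints are hypothetical, chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a saved schematool log by the marker
# strings seen in the three cases of this post.
diagnose_schematool_log() {
    local log="$1"
    if grep -q 'schemaTool completed' "$log"; then
        echo "ok"
    elif grep -q 'state=42X01' "$log"; then
        echo "sql-syntax-error"      # error 1: recheck hive-site.xml
    elif grep -q 'SQL Error code: 1130' "$log"; then
        echo "host-not-allowed"      # error 2: use localhost in the JDBC URL
    else
        echo "unknown"
    fi
}

# Demo with a two-line excerpt of the error 2 log.
demo_log="$(mktemp)"
cat > "$demo_log" <<'EOF'
SQL Error code: 1130
*** schemaTool failed ***
EOF
diagnose_schematool_log "$demo_log"
rm -f "$demo_log"

# To capture a real run for this check:
#   ./schematool -dbType mysql -initSchema 2>&1 | tee /tmp/schematool.log
#   diagnose_schematool_log /tmp/schematool.log
```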