1. Prepare the installation packages:
MySQL-5.6.26-1.linux_glibc2.5.x86_64.rpm-bundle
mysql-connector-java-5.1.39.jar (JDBC driver; it must be placed in /hive/lib, see the copy command at the end of step 5)
apache-hive-1.2.1-bin.tar.gz
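Before starting, the Hive tarball needs to be unpacked to the path used throughout this guide (/software/hive). A minimal sketch, assuming the packages were downloaded to the current directory:
tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /software/
mv /software/apache-hive-1.2.1-bin /software/hive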
2. Install MySQL to store Hive's metadata. Hive ships with Derby, but Derby is not stable enough for this purpose, so MySQL is used for metadata management instead.
1). It only needs to be installed on one node of the cluster; here the hadoop1 node is chosen.
2). Install MariaDB on hadoop1:
yum -y install mariadb-server mariadb
3). Start the service and enable it at boot:
systemctl start mariadb.service
systemctl enable mariadb.service
4). Set a password. The first login uses an empty password; afterwards the password is set with SQL.
mysql -u root -p
After logging in, first check that the databases look normal, then set the password with SQL:
> use mysql;
> update user set password=password('123456') where user='root';
Then allow the root user to log in from any host, with access to all databases and tables:
> grant all privileges on *.* to root@'%' identified by '123456';
> grant all privileges on *.* to root@'hadoop1' identified by '123456';
> grant all privileges on *.* to root@'localhost' identified by '123456';
> FLUSH PRIVILEGES;
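As a quick sanity check that the grants work (assuming the password 123456 set above), a remote login from another node should now succeed:
mysql -h hadoop1 -u root -p123456 -e "show databases;"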
3. Configure the Hive environment variables (in /etc/profile)
export HIVE_HOME=/software/hive
export PATH=$PATH:$HIVE_HOME/bin
Apply the configuration: source /etc/profile
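A quick way to confirm the variables are in effect (the exact version string depends on your Hive build):
echo $HIVE_HOME
hive --version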
4. Go to /hive/conf and edit hive-env.sh to add the environment paths
1). cp hive-env.sh.template hive-env.sh
2). vi hive-env.sh and add the environment paths:
export JAVA_HOME=/usr/local/java/jdk1.8.0_65
export HADOOP_HOME=/software/hadoop-3.2.0
export HIVE_HOME=/software/hive
export HIVE_CONF_DIR=/software/hive/conf
export HIVE_AUX_JARS_PATH=/software/hive/lib
5. Configure hive-default.xml (note that these are edits to existing entries: find each key below and replace its value) to switch metadata management from the default Derby to MySQL
1. Rename hive-default.xml.template to hive-site.xml
2. Configure the database connection information:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <!-- must match the MySQL root password set in step 2 -->
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
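For this configuration to work, the MySQL JDBC driver from step 1 has to be on Hive's classpath. A sketch, assuming the jar is in the current directory and Hive lives at /software/hive:
cp mysql-connector-java-5.1.39.jar /software/hive/lib/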
6. Create a folder named tmp under the Hive directory (the command is shown right after the snippet below), then add the following configuration to hive-site.xml. Without this configuration, Hive reports an error at startup.
<property>
  <name>system:java.io.tmpdir</name>
  <value>/software/hive/tmp</value>
</property>
<property>
  <name>system:user.name</name>
  <value>root</value>
</property>
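The local directory referenced by system:java.io.tmpdir must exist, so create it first:
mkdir -p /software/hive/tmp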
7. Create the warehouse paths configured in hive-site.xml on HDFS and grant permissions
bin/hadoop fs -mkdir -p /user/hive/warehouse
bin/hadoop fs -mkdir -p /user/hive/tmp
bin/hadoop fs -mkdir -p /user/hive/log
bin/hadoop fs -chmod -R 777 /user/hive/warehouse
bin/hadoop fs -chmod -R 777 /user/hive/tmp
bin/hadoop fs -chmod -R 777 /user/hive/log
hadoop fs -chmod -R 777 /tmp/hive
8. Initialization
In Hive's bin directory, run:
schematool -dbType mysql -initSchema
If it runs without errors, it prints "completed". A new database named hive will now appear in MySQL, containing many tables; these are the tables Hive creates during initialization.
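A quick check of the result (assuming the MySQL root password set in step 2):
mysql -u root -p123456 -e "use hive; show tables;"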
1). Create a Hive database
The simplest syntax for creating a database in Hive is almost the same as in MySQL:
create database foo;
Create the database only if a database named foo does not already exist:
create database if not exists foo;
Specify a location when creating the database; this is usually a path on HDFS:
create database foo location '/db/foo';
List the databases that have been created:
show databases ;
Switch to the foo database:
hive> use foo;
OK
Time taken: 0.748 seconds
hive> show tables;
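Optionally, a small table can be created inside foo to confirm everything is wired up; a sketch with a hypothetical table name t1:
hive> create table if not exists t1 (id int, name string);
hive> show tables;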
2). Check where the foo database is managed in the metastore (in MySQL, use hive, then query the DBS table)
MariaDB [hive]> select * from DBS;
+-------+-----------------------+--------------------------------------------+---------+------------+------------+-----------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE | CTLG_NAME |
+-------+-----------------------+--------------------------------------------+---------+------------+------------+-----------+
| 1 | Default Hive database | hdfs://cluster1/user/hive/warehouse | default | public | ROLE | hive |
| 6 | NULL | hdfs://cluster1/user/hive/warehouse/foo.db | foo | root | USER | hive |
+-------+-----------------------+--------------------------------------------+---------+------------+------------+-----------+
2 rows in set (0.00 sec)
MariaDB [hive]>
3). Connect to the Hive data warehouse remotely via JDBC (if the connection fails, see: https://blog.csdn.net/swing2008/article/details/53743960 or https://blog.csdn.net/alan_liuyue/article/details/90299035)
1. Start the hiveserver2 server, listening on port 10000
$>hive --service hiveserver2 &
2. Connect to hiveserver2 through the beeline command line
$>beeline //enter the beeline command line (same as hive --service beeline)
$beeline>!help //show help
$beeline>!quit //quit
$beeline>!connect jdbc:hive2://localhost:10000/mydb2 //connect to the Hive database mydb2
$beeline>show databases ;
$beeline>use mydb2 ;
$beeline>show tables;
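Besides beeline, the same HiveServer2 endpoint can be reached programmatically through the Hive JDBC driver. A minimal sketch, assuming hive-jdbc and its dependencies are on the classpath and the mydb2 database used above exists:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcDemo {
    public static void main(String[] args) throws Exception {
        // Register the HiveServer2 JDBC driver (provided by hive-jdbc)
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Connect to HiveServer2 on port 10000 as user "root" with an empty password (no auth configured)
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://localhost:10000/mydb2", "root", "");
             Statement stmt = conn.createStatement()) {
            // List the tables in mydb2
            try (ResultSet rs = stmt.executeQuery("show tables")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}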
For the related error, see: https://blog.csdn.net/qq_16633405/article/details/82190440
Add the following properties to Hadoop's core-site.xml configuration file:
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>
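After editing core-site.xml, restart HDFS/YARN so the proxy-user settings take effect; on a running cluster they can usually also be refreshed in place (worth verifying on your Hadoop version):
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration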
Set the node where Hive is installed to Active (a command for switching is sketched below).
hdfs haadmin -getServiceState nn1 //check the node state
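If that node is currently standby, it can be switched manually (nn1 is assumed to be the service id of the NameNode on that node; --forcemanual bypasses the automatic-failover guard, so use it with care):
hdfs haadmin -transitionToActive --forcemanual nn1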
Problems encountered during installation:
1. Running schematool -dbType mysql -initSchema fails:
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:446)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.2.0
at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
... 7 more
Solution: the Hive version is too old for Hadoop 3.2.0; switch to Hive 2.x or 3.x.
2. A special character in hive-site.xml causes a parse error
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3210,96,"file:/usr/local/hive/apache-hive-3.1.1-bin/conf/hive-site.xml"]
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2981)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2930)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2805)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1459)
at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:4990)
at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5063)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5150)
at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5098)
at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java:96)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3210,96,"file:/usr/local/hive/apache-hive-3.1.1-bin/conf/hive-site.xml"]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2456)
at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2403)
at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2369)
at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1515)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2828)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3277)
at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3071)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2964)
... 15 more
Solution: there is a special character at line 3210 of hive-site.xml; delete that character.
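From the trace, the offending entity is the one expanding to character code 0x8, i.e. &#8;. A sketch of removing it in place, using the file path shown in the trace:
sed -i '3210s/&#8;//g' /usr/local/hive/apache-hive-3.1.1-bin/conf/hive-site.xml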