Hadoop Learning Notes, Day 5 (Hive installation, deployment and configuration)

1. Prepare the installation packages:
MySQL-5.6.26-1.linux_glibc2.5.x86_64.rpm-bundle
mysql-connector-java-5.1.39.jar (JDBC driver; must be copied into /hive/lib)
apache-hive-1.2.1-bin.tar.gz
2. Install a MySQL-compatible database to store the metastore data. Hive ships with Derby, but Derby is unreliable for this purpose, so MySQL/MariaDB is used for metadata management instead.
1). It only needs to be installed on one node of the cluster; here we choose the hadoop1 node.
2). Install MariaDB on hadoop1:
yum -y install mariadb-server mariadb
3). Start the service and enable it at boot:
systemctl start mariadb.service
systemctl enable mariadb.service
4). Set the password. The first login uses an empty password; after that, set the password with SQL:
mysql -u root -p
After logging in, first check that the databases look normal, then set the password with SQL:
> use mysql;
> update user set password=password('123456') where user='root';
Then allow the root user to log in from any host, with access to every database and table:
> grant all privileges on *.* to 'root'@'%' identified by '123456';
> grant all privileges on *.* to 'root'@'hadoop1' identified by '123456';
> grant all privileges on *.* to 'root'@'localhost' identified by '123456';
> FLUSH PRIVILEGES;
3. Configure the Hive environment variables (in /etc/profile):
export HIVE_HOME=/software/hive
export PATH=$PATH:$HIVE_HOME/bin

Apply the changes:  source /etc/profile
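After sourcing the profile, a quick sanity check that the new PATH entry is visible can look like the sketch below (the /software/hive path follows the layout above):

```shell
# check that $HIVE_HOME/bin ended up on PATH after sourcing /etc/profile
HIVE_HOME=/software/hive
PATH="$PATH:$HIVE_HOME/bin"
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "hive on PATH" ;;
  *)                    echo "hive missing from PATH" ;;
esac
```

On a correctly configured node, `which hive` should then resolve to /software/hive/bin/hive.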

4. Go to /hive/conf and add the environment paths to hive-env.sh:
1). cp hive-env.sh.template hive-env.sh
2). vi hive-env.sh and add the paths:
export JAVA_HOME=/usr/local/java/jdk1.8.0_65
export HADOOP_HOME=/software/hadoop-3.2.0
export HIVE_HOME=/software/hive
export HIVE_CONF_DIR=/software/hive/conf
export HIVE_AUX_JARS_PATH=/software/hive/lib
5. Configure hive-site.xml (note: modify the existing entries, matching each key and replacing its value) so that metadata management switches from the default Derby to MySQL.
1. Rename hive-default.xml to hive-site.xml.
2. Configure the database connection information:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>username to use against metastore database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <!-- must match the MySQL root password set in step 2 -->
  <value>root</value>
  <description>password to use against metastore database</description>
</property>


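Since a single stray character in this file can stop Hive from starting at all (see the problems section at the end), it is worth verifying the file is well-formed XML after editing. A minimal sketch; the temp file path and the single property shown are illustrative:

```shell
# write a minimal hive-site.xml fragment and verify it parses as XML
cat > /tmp/hive-site-min.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
</configuration>
EOF
python3 -c "import xml.dom.minidom; xml.dom.minidom.parse('/tmp/hive-site-min.xml'); print('well-formed')"
```

On a real install, point the parse call at /software/hive/conf/hive-site.xml instead of the sample file.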
6. Create a new folder named tmp under the Hive directory, then add the configuration below to hive-site.xml. Without this configuration, Hive reports an error at startup.

<property>
  <name>system:java.io.tmpdir</name>
  <value>/software/hive/tmp</value>
</property>

<property>
  <name>system:user.name</name>
  <value>root</value>
</property>

7. Create the warehouse paths configured in hive-site.xml on HDFS and grant permissions:

bin/hadoop fs -mkdir -p /user/hive/warehouse

bin/hadoop fs -mkdir -p /user/hive/tmp

bin/hadoop fs -mkdir -p /user/hive/log

bin/hadoop fs -chmod -R 777 /user/hive/warehouse

bin/hadoop fs -chmod -R 777 /user/hive/tmp

bin/hadoop fs -chmod -R 777 /user/hive/log

hadoop fs -chmod -R 777 /tmp/hive
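The mkdir/chmod pairs above can be collapsed into one loop. In this sketch the `hadoop` command is stubbed out with a shell function so the generated commands are only echoed; on a real cluster, delete the stub so the actual binary runs:

```shell
# stub for illustration only -- remove this function on a real cluster
hadoop() { echo "hadoop $*"; }

# create each Hive directory on HDFS and open its permissions
for d in /user/hive/warehouse /user/hive/tmp /user/hive/log; do
  hadoop fs -mkdir -p "$d"
  hadoop fs -chmod -R 777 "$d"
done > /tmp/hadoop-cmds.log
cat /tmp/hadoop-cmds.log
```

Note that chmod 777 is the lab-notes shortcut used above; production clusters normally grant narrower permissions to the hive user.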

8. Initialize the metastore schema

From Hive's bin directory, run:

 schematool -dbType mysql -initSchema 

If this runs without errors it prints "completed", and the MySQL instance now contains a new database named hive with many tables inside it; these are the tables created by Hive's schema initialization.

9. Hive command-line operations

1). Create a Hive database
	The simplest form of creating a database is much the same as in MySQL:

	create database foo;
	Create it only if a database named foo does not already exist:
	
	create database if not exists foo;
	Create the database at a specified location, usually a path on HDFS:

	create database foo location '/db/foo';

	List the databases that have been created:
	show databases;
	
	Switch to the foo database:
	
	hive> use foo;
	OK
	Time taken: 0.748 seconds
	hive> show tables;
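Once inside foo, a table can be created and listed. The table name and columns below are illustrative examples, not from the original notes:

```sql
-- create a simple delimited-text table in foo (names are examples)
use foo;
create table if not exists t1 (id int, name string)
row format delimited fields terminated by ',';
show tables;
```

The table's data directory appears under foo's location on HDFS (e.g. under /user/hive/warehouse/foo.db/).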


2). Look up where the foo database is stored in the metastore (in MySQL: use hive, then query the DBS table)
	MariaDB [hive]> select * from DBS;
	+-------+-----------------------+--------------------------------------------+---------+------------+------------+-----------+
	| DB_ID | DESC                  | DB_LOCATION_URI                            | NAME    | OWNER_NAME | OWNER_TYPE | CTLG_NAME |
	+-------+-----------------------+--------------------------------------------+---------+------------+------------+-----------+
	|     1 | Default Hive database | hdfs://cluster1/user/hive/warehouse        | default | public     | ROLE       | hive      |
	|     6 | NULL                  | hdfs://cluster1/user/hive/warehouse/foo.db | foo     | root       | USER       | hive      |
	+-------+-----------------------+--------------------------------------------+---------+------------+------------+-----------+
	2 rows in set (0.00 sec)

	MariaDB [hive]>
3) Connect to the Hive data warehouse remotely over JDBC (if the connection fails, see: https://blog.csdn.net/swing2008/article/details/53743960 or https://blog.csdn.net/alan_liuyue/article/details/90299035)

	1. Start the hiveserver2 server, listening on port 10000
	$>hive --service hiveserver2 &

	2. Connect to hiveserver2 with the beeline command line
	$>beeline										// enter the beeline command line (equivalent to hive --service beeline)
	$beeline>!help										// show help
	$beeline>!quit										// quit
	$beeline>!connect jdbc:hive2://localhost:10000/mydb2				// connect to the mydb2 database in Hive
	$beeline>show databases ;
	$beeline>use mydb2 ;
	$beeline>show tables;		
	If you hit errors here, see: https://blog.csdn.net/qq_16633405/article/details/82190440

	Add the following properties to Hadoop's core-site.xml (restart HDFS afterwards so they take effect):
	<property>
		<name>hadoop.proxyuser.root.hosts</name>
		<value>*</value>
	</property>
	<property>
		<name>hadoop.proxyuser.root.groups</name>
		<value>*</value>
	</property>
	
	Make the node where Hive is installed the Active NameNode.
	hdfs haadmin -getServiceState nn1	(check a NameNode's HA state)

Problems encountered during installation:
1. Running schematool -dbType mysql -initSchema throws:

Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:446)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.2.0
at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.(HiveConf.java:368)
… 7 more
Solution: caused by the Hive version being too old for Hadoop 3.2.0 (its shims do not recognize Hadoop major version 3); switch to Hive 2.x or 3.x.

2. Parse failure caused by an illegal character
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8

at [row,col,system-id]: [3210,96,"file:/usr/local/hive/apache-hive-3.1.1-bin/conf/hive-site.xml"]
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2981)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2930)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2805)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1459)
at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:4990)
at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5063)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5150)
at org.apache.hadoop.hive.conf.HiveConf.(HiveConf.java:5098)
at org.apache.hive.beeline.HiveSchemaTool.(HiveSchemaTool.java:96)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3210,96,"file:/usr/local/hive/apache-hive-3.1.1-bin/conf/hive-site.xml"]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2456)
at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2403)
at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2369)
at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1515)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2828)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3277)
at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3071)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2964)
… 15 more

Solution: there is an illegal character on line 3210 of hive-site.xml; delete that character.
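A quick way to hunt for such characters is to grep the file for raw control bytes. The sketch below demonstrates this on a small sample file containing a backspace (0x08); on a real install, point the grep at hive-site.xml instead. GNU grep with -P (Perl regex) support is assumed:

```shell
# create a sample file containing a raw backspace (0x08), like the one that broke the XML parser
printf 'good line\nbad\bline\n' > /tmp/sample-config.txt
# print the line numbers of any lines containing raw control characters
# (tab, CR and LF are deliberately excluded from the class, since they are legal in XML)
grep -n -P '[\x00-\x08\x0b\x0c\x0e-\x1f]' /tmp/sample-config.txt
```

Here grep reports line 2 as the offender, matching the [3210,96] row/column position that the WstxParsingException reported for the real file.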
