Hive: Configuring Hive Local Mode with a MySQL Metastore (Hive 2.3.3 + Hadoop 2.9.0 + MySQL 5.7.18)

When reposting, please credit the source: https://blog.csdn.net/l1028386804/article/details/80160042

1. Environment

    This article uses a VMware 10 virtual machine running Ubuntu 16.04 (64-bit); the hostname is hadoop.

    Hive 2.3.3: apache-hive-2.3.3-bin.tar.gz

    MySQL connector: mysql-connector-java-5.1.42.tar.gz

    Hadoop 2.9.0: hadoop-2.9.0.tar.gz

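2. Installing Hadoop

1) Pseudo-distributed installation

See the post "Hadoop: Hadoop 2.4.1 Pseudo-Distributed Setup".

2) Cluster installation

See the post "Hadoop: CentOS + Hadoop 2.5.2 Distributed Environment Configuration".

3) High-availability cluster installation

See the posts "Hadoop: Hadoop 2.5.2 HA Cluster Setup (Hadoop + Zookeeper), Preparation" and "Hadoop: Hadoop 2.5.2 HA Cluster Setup (Hadoop + Zookeeper)".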

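3. Basic Hive installation

1) Extract the archive and set permissions

Extract Hive (the archive and directory names follow the file listed in section 1):

tar -zxvf apache-hive-2.3.3-bin.tar.gz -C /usr/local/
cd /usr/local/
mv apache-hive-2.3.3-bin hive

Change the permissions and owner of the hive directory:

sudo chmod 777 hive
sudo chown -R hadoop hive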

2) Configure environment variables

Edit /etc/profile. It holds the environment variables for the JDK, Hadoop and so on; for Hive the relevant entries are HIVE_HOME and PATH.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HIVE_HOME=/usr/local/hive
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$PATH
Apply the configuration:
source /etc/profile
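
After sourcing the profile it can help to confirm that the variables actually took effect. The following is a minimal sketch, not part of Hive: path_contains and check_hive_env are hypothetical helper names, and the check only covers JAVA_HOME, HADOOP_HOME, HIVE_HOME and the $HIVE_HOME/bin entry in PATH.

```shell
#!/bin/sh
# Sketch: sanity-check the variables exported in /etc/profile.
# path_contains and check_hive_env are hypothetical helpers, not Hive tools.

# Succeeds if directory $1 is an entry of the colon-separated PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Prints one line per problem found; returns non-zero if any was found.
check_hive_env() {
    status=0
    for name in JAVA_HOME HADOOP_HOME HIVE_HOME; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        fi
    done
    if [ -n "${HIVE_HOME:-}" ] && ! path_contains "$HIVE_HOME/bin"; then
        echo "\$HIVE_HOME/bin is not on PATH"
        status=1
    fi
    return $status
}
```

Running check_hive_env in a fresh login shell prints nothing and returns 0 when everything is in place.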

3) Edit the configuration files

Create the configuration files from the four templates in the conf directory (/usr/local/hive/conf):
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
cp hive-log4j2.properties.template hive-log4j2.properties
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
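
If this step is repeated, a plain cp would overwrite files that have already been edited. As a rough sketch under that concern, the copy can be wrapped so that existing targets are left alone; copy_hive_templates is a hypothetical helper name and takes the conf directory as its only argument.

```shell
#!/bin/sh
# Sketch: create the four config files from their templates without
# overwriting files that already exist. copy_hive_templates is a
# hypothetical helper; the conf directory is passed as $1.
copy_hive_templates() {
    conf_dir=$1
    # Each entry is "template-name target-name", matching the cp commands.
    for pair in \
        "hive-env.sh.template hive-env.sh" \
        "hive-default.xml.template hive-site.xml" \
        "hive-log4j2.properties.template hive-log4j2.properties" \
        "hive-exec-log4j2.properties.template hive-exec-log4j2.properties"
    do
        src=$conf_dir/${pair%% *}   # first word of the pair
        dst=$conf_dir/${pair##* }   # last word of the pair
        if [ -e "$dst" ]; then
            echo "skip: $dst already exists"
        elif [ -f "$src" ]; then
            cp "$src" "$dst" && echo "created: $dst"
        else
            echo "missing template: $src" >&2
            return 1
        fi
    done
}

# Usage (not run here):
#   copy_hive_templates /usr/local/hive/conf
```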

Edit hive-env.sh (vim hive-env.sh):

# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HIVE_HOME=/usr/local/hive
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/usr/local/hive/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/usr/local/hive/lib

Edit hive-site.xml (vim hive-site.xml):

Replace ${system:java.io.tmpdir}/${hive.session.id}_resources with the local path /tmp/hive/resources
Replace ${system:java.io.tmpdir}/${system:user.name}/operation_logs with the local path /tmp/hive/operation_logs
Replace ${system:java.io.tmpdir}/${system:user.name} with the local path /tmp/hive
Then create the corresponding local directories:
cd /tmp
mkdir hive
mkdir hive/operation_logs
mkdir hive/resources
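
The three replacements can also be scripted rather than done by hand in vim. The sketch below assumes the placeholders appear in hive-site.xml exactly as written above; replace_tmpdir_placeholders and make_hive_tmpdirs are hypothetical helper names. The longer patterns are applied first so that the shortest one does not rewrite part of them.

```shell
#!/bin/sh
# Sketch: apply the three substitutions to the hive-site.xml given as $1.
# replace_tmpdir_placeholders is a hypothetical helper, not a Hive tool.
replace_tmpdir_placeholders() {
    site_file=$1
    # Write to a temporary file and move it into place, which avoids
    # relying on the non-portable "sed -i".
    sed \
        -e 's#\${system:java\.io\.tmpdir}/\${hive\.session\.id}_resources#/tmp/hive/resources#g' \
        -e 's#\${system:java\.io\.tmpdir}/\${system:user\.name}/operation_logs#/tmp/hive/operation_logs#g' \
        -e 's#\${system:java\.io\.tmpdir}/\${system:user\.name}#/tmp/hive#g' \
        "$site_file" > "$site_file.new" && mv "$site_file.new" "$site_file"
}

# Create the local directories the new values point at.
# $1 is the base directory and defaults to /tmp/hive.
make_hive_tmpdirs() {
    base=${1:-/tmp/hive}
    mkdir -p "$base/operation_logs" "$base/resources"
}

# Usage (not run here):
#   replace_tmpdir_placeholders /usr/local/hive/conf/hive-site.xml
#   make_hive_tmpdirs
```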

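4) Initialize the default Derby database and start Hive

Initialize the schema:
schematool -dbType derby -initSchema
or, once MySQL has been configured as described in section 4:
schematool -dbType mysql -initSchema --verbose (--verbose prints the error details)
Start Hadoop:
/usr/local/hadoop/sbin/start-all.sh
Start Hive by running: hive
Test Hive:
hive> create table test(id int, name string);
hive> describe test;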

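4. Connecting to a MySQL database

You can follow the post "MySQL: Compiling and Installing MySQL 5.6.16 on CentOS 6.5" directly, or use the steps below.

1) Install MySQL

yum -y install mysql-server  (this also installs mysql-client automatically)

The yum command applies to CentOS. On the Ubuntu environment described in section 1, the equivalent is sudo apt-get install mysql-server.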

2) Create the hive user in MySQL

Log in to MySQL:
mysql -uroot -proot
Create the hive user:
create user 'hive'@'localhost' identified by 'hive';
Create the hive database:
create database hive;
alter database hive character set latin1;
Grant privileges to the hive user:
grant all on hive.* to hive@'%'  identified by 'hive';
grant all on hive.* to hive@'localhost'  identified by 'hive';
Check the granted privileges:
show grants for hive@'localhost';
Reload the privilege tables:
flush privileges;
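
To make this step repeatable, the statements can be generated by a small script and reviewed before they are sent to the server. This is only a sketch: hive_metastore_sql is a hypothetical helper name, and the statements use the MySQL 5.7 syntax from this article (GRANT ... IDENTIFIED BY was removed in MySQL 8.0).

```shell
#!/bin/sh
# Sketch: print the SQL for this step for a given user, password and
# database, so it can be reviewed and then piped into the mysql client.
# hive_metastore_sql is a hypothetical helper (MySQL 5.7 syntax).
hive_metastore_sql() {
    user=$1 pass=$2 db=$3
    cat <<EOF
CREATE DATABASE IF NOT EXISTS $db;
ALTER DATABASE $db CHARACTER SET latin1;
GRANT ALL ON $db.* TO '$user'@'%' IDENTIFIED BY '$pass';
GRANT ALL ON $db.* TO '$user'@'localhost' IDENTIFIED BY '$pass';
FLUSH PRIVILEGES;
EOF
}

# Usage (not run here):
#   hive_metastore_sql hive hive hive | mysql -uroot -p
```

The arguments are pasted into the SQL as-is, so the helper is only suitable for trusted, simple values such as the ones used in this article.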

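3) Install the MySQL connector

Hive needs the JDBC driver jar on its classpath, not the tar.gz archive, so extract the archive first and copy the jar into Hive's lib directory:

tar -zxvf mysql-connector-java-5.1.42.tar.gz
sudo cp mysql-connector-java-5.1.42/mysql-connector-java-5.1.42-bin.jar /usr/local/hive/lib/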
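
What Hive loads from its lib directory is the driver jar inside the archive. The sketch below unpacks a connector tarball into a temporary directory and copies the jar over. install_connector_jar is a hypothetical helper name, and it assumes the jar inside the archive matches the pattern mysql-connector-java-*.jar.

```shell
#!/bin/sh
# Sketch: unpack a Connector/J tarball and copy the driver jar into Hive's
# lib directory. install_connector_jar is a hypothetical helper.
# Assumption: the jar inside the archive matches mysql-connector-java-*.jar.
install_connector_jar() {
    tarball=$1 lib_dir=$2
    work_dir=$(mktemp -d) || return 1
    if ! tar -zxf "$tarball" -C "$work_dir"; then
        rm -rf "$work_dir"
        return 1
    fi
    # Pick the first matching jar anywhere inside the unpacked tree.
    jar=$(find "$work_dir" -type f -name 'mysql-connector-java-*.jar' | head -n 1)
    if [ -z "$jar" ]; then
        echo "no driver jar found in $tarball" >&2
        rm -rf "$work_dir"
        return 1
    fi
    cp "$jar" "$lib_dir/"
    rc=$?
    rm -rf "$work_dir"
    return $rc
}

# Usage (not run here; needs write access to the lib directory):
#   install_connector_jar mysql-connector-java-5.1.42.tar.gz /usr/local/hive/lib
```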

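4) Edit hive-site.xml to configure the MySQL connection properties

Open hive-site.xml (vim hive-site.xml) and change only the properties below; leave everything else as it is. Because the file is XML, the & in the connection URL must be written as &amp;.

<property>
	<name>hive.metastore.warehouse.dir</name>
	<value>/user/hive/warehouse</value>
	<description>location of default database for the warehouse</description>
</property>

<property>
	<name>hive.metastore.localhive_db</name>
	<value>true</value>
	<description>Use false if a production metastore server is used</description>
</property>

<property>
	<name>hive.exec.scratchdir</name>
	<value>/tmp/hive</value>
	<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>

<property>
	<name>javax.jdo.option.ConnectionURL</name>
	<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
	<description>
	(Roy) JDBC connect string for a JDBC metastore.
	To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
	For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
	</description>
</property>

<property>
	<name>javax.jdo.option.ConnectionDriverName</name>
	<value>com.mysql.jdbc.Driver</value>
	<description>User-defined (Roy) Driver class name for a JDBC metastore</description>
</property>

<property>
	<name>javax.jdo.option.ConnectionUserName</name>
	<value>hive</value>
	<description>User-defined (Roy) Username to use against metastore database</description>
</property>

<property>
	<name>javax.jdo.option.ConnectionPassword</name>
	<value>hive</value>
	<description>User-defined (Roy) password to use against metastore database</description>
</property>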
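
Each of these settings is a property element in hive-site.xml, and the JDBC URL contains an & that has to be escaped as &amp; inside XML. As an illustration only, the following sketch prints one property element with its value escaped; xml_escape and hive_property are hypothetical helper names, and the output is meant to be pasted into hive-site.xml by hand.

```shell
#!/bin/sh
# Sketch: print one hive-site.xml <property> element with the value
# XML-escaped. xml_escape and hive_property are hypothetical helpers.
xml_escape() {
    # '&' is handled first so the entities produced afterwards are not
    # escaped a second time.
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

hive_property() {
    printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' \
        "$1" "$(xml_escape "$2")"
}

# Usage:
#   hive_property javax.jdo.option.ConnectionURL \
#       'jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&useSSL=false'
```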

5) Initialize and run

Initialize the MySQL metastore schema:
schematool -dbType mysql -initSchema
Start Hive by running: hive

5. Problems encountered

1) Class path contains multiple SLF4J bindings.

Symptom:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.RuntimeException: Couldn't create directory ${system:java.io.tmpdir}/${hive.session.id}_resources
    at org.apache.hadoop.hive.ql.util.ResourceDownloader.ensureDirectory(ResourceDownloader.java:126)
    at org.apache.hadoop.hive.ql.util.ResourceDownloader.<init>(ResourceDownloader.java:48)
    at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:390)
    at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:363)
    at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:663)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Fix: the two SLF4J bindings (one shipped with Hive, one with Hadoop) conflict. Keep Hadoop's and delete Hive's:
rm /usr/local/hive/lib/log4j-slf4j-impl-2.4.1.jar
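
Before removing anything it is worth listing which binding jars are actually present, since the version numbers in the file names differ between releases. The sketch below uses two hypothetical helpers: list_slf4j_bindings searches the given directories for the two jar name patterns seen in the log above, and disable_jar renames a jar instead of deleting it. The rename approach assumes the Hive launcher only adds files ending in .jar to the classpath.

```shell
#!/bin/sh
# Sketch: list SLF4J binding jars under the given directories, and move
# one aside instead of deleting it. Both helpers are hypothetical.
list_slf4j_bindings() {
    for dir in "$@"; do
        find "$dir" -type f \
            \( -name 'log4j-slf4j-impl-*.jar' -o -name 'slf4j-log4j12-*.jar' \)
    done
}

# Rename a jar to NAME.disabled so it can be restored later.
# Assumption: only files ending in .jar are put on the classpath.
disable_jar() {
    mv "$1" "$1.disabled"
}

# Usage (not run here):
#   list_slf4j_bindings /usr/local/hive/lib /usr/local/hadoop/share/hadoop/common/lib
#   disable_jar /usr/local/hive/lib/log4j-slf4j-impl-2.4.1.jar
```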

2) Couldn't create directory ${system:java.io.tmpdir}/${hive.session.id}_resources

Symptom:
Exception in thread "main" java.lang.RuntimeException: Couldn't create directory ${system:java.io.tmpdir}/${hive.session.id}_resources
    at org.apache.hadoop.hive.ql.util.ResourceDownloader.ensureDirectory(ResourceDownloader.java:126)
    at org.apache.hadoop.hive.ql.util.ResourceDownloader.<init>(ResourceDownloader.java:48)
    at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:390)
    at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:363)
    at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:663)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Fix: follow the hive-site.xml changes in step 3) of section 3 and replace the ${system:...} placeholders with absolute paths.
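
A quick way to confirm that no placeholder was missed is to search hive-site.xml for the literal text ${system:. A minimal sketch, where find_system_placeholders is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: report lines of a hive-site.xml that still contain
# ${system:...} placeholders. find_system_placeholders is hypothetical.
# Exit status is 0 if any placeholder remains, 1 if the file is clean.
find_system_placeholders() {
    # -n prints line numbers; -F treats the pattern as literal text.
    grep -nF '${system:' "$1"
}

# Usage (not run here):
#   find_system_placeholders /usr/local/hive/conf/hive-site.xml
```

Any line it prints still needs to be changed to an absolute path.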

3) Establishing SSL connection without server's identity verification is not recommended.

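This warning appears because the MySQL connection expects SSL, which we have not configured. You can explicitly disable SSL by changing the MySQL connection URL in hive-site.xml (inside the XML file, write the & as &amp;):
javax.jdo.option.ConnectionURL
jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&useSSL=false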

4) FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)

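Once the schema has been initialized with schematool -dbType derby -initSchema or schematool -dbType mysql -initSchema, running the command again fails because the metastore already exists. Delete the existing metadata first: for Derby, delete the metastore_db directory under the bin directory; for MySQL, drop all tables in the hive database.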
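
For the Derby case, a slightly safer variant is to move the old metastore_db directory aside rather than delete it. The sketch below does that with a timestamped backup name; reset_derby_metastore is a hypothetical helper name, and its argument is whichever directory contains metastore_db.

```shell
#!/bin/sh
# Sketch: move an existing embedded-Derby metastore out of the way so that
# schematool -initSchema can run again. reset_derby_metastore is a
# hypothetical helper; $1 is the directory that contains metastore_db.
reset_derby_metastore() {
    base_dir=$1
    stamp=$(date +%Y%m%d%H%M%S)
    if [ -d "$base_dir/metastore_db" ]; then
        mv "$base_dir/metastore_db" "$base_dir/metastore_db.bak.$stamp" &&
            echo "moved to $base_dir/metastore_db.bak.$stamp"
    else
        echo "nothing to reset in $base_dir"
    fi
}

# Usage (not run here):
#   reset_derby_metastore /usr/local/hive/bin
#   schematool -dbType derby -initSchema
```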

