Prerequisites: JDK 1.8 and Hadoop 3.x
Part 1: Installing Hadoop 3.2.3 (standalone)
1. Download the release tarball
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.3/hadoop-3.2.3.tar.gz
2. Extract the archive
tar -zxvf hadoop-3.2.3.tar.gz
3. Enter the hadoop-3.2.3/ directory, create an input folder, and copy the stock configuration files into it
mkdir input
cp ./etc/hadoop/*.xml ./input
4. Run the MapReduce example job
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.3.jar grep ./input ./output 'dfs[a-z.]+'
5. Check the output; if the job ran successfully, the standalone Hadoop installation is working
cat ./output/*
Full output:
[root@VM-16-6-centos hadoop-3.2.3]# cat ./output/*
1 dfsadmin
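The example job above runs a MapReduce "grep" that counts every string matching the regex dfs[a-z.]+ across the input files. The same pattern can be tried locally with plain grep; the sample file below is made up for illustration:

```shell
# Create a small file resembling a Hadoop config snippet (fabricated content)
printf '<name>dfs.replication</name>\n<value>1</value>\n' > /tmp/sample.xml
# Extract every match of the same regex the example job uses
grep -oE 'dfs[a-z.]+' /tmp/sample.xml
# → dfs.replication
```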
Part 2: Installing Hive 3.1.2
1. Download the Hive 3.1.2 release
https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
2. Extract the archive
tar -zxvf apache-hive-3.1.2-bin.tar.gz
3. Download the MySQL JDBC driver mysql-connector-java-5.1.46.jar. (By default Hive keeps its metadata in Derby, which, like SQLite, is only suitable for testing; in production the metastore should live in MySQL or another mainstream database.)
https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.46/mysql-connector-java-5.1.46.jar
Copy the driver into Hive's lib directory:
cp mysql-connector-java-5.1.46.jar apache-hive-3.1.2-bin/lib/
Reference: "hive3.1.3安装" on 灰信网 (a software-development blog aggregator)
4. Go to the configuration directory and create hive-env.sh from the template
cd apache-hive-3.1.2-bin/conf
cp hive-env.sh.template hive-env.sh
5. Add the following to hive-env.sh:
export JAVA_HOME=/usr/java/jdk1.8.0_51
export HADOOP_HOME=/home/app/hadoop/hadoop-3.2.3
export HIVE_HOME=/home/app/hive/apache-hive-3.1.2-bin
export HIVE_CONF_DIR=/home/app/hive/apache-hive-3.1.2-bin/conf
6. In the conf directory, create hive-site.xml with the metastore database settings plus the HiveServer2 bind host and port:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://xxxx.xxxx.xxxx.xx:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>xxxx</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>127.0.0.1</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
</configuration>
7. Initialize the metastore schema
./bin/schematool -dbType mysql -initSchema
This step may fail with the error below, caused by a Guava version conflict between Hadoop and Hive: copy hadoop-3.2.3/share/hadoop/common/lib/guava-27.0-jre.jar into apache-hive-3.1.2-bin/lib and delete the old guava-19.0.jar from apache-hive-3.1.2-bin/lib.
[root@VM-16-6-centos conf]# ../bin/schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/app/hive/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/app/hadoop/hadoop-3.2.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:536)
at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:554)
at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:448)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5141)
at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5104)
at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java:96)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
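The Guava swap described in step 7 amounts to two commands; the paths below assume the install locations used throughout this guide:

```shell
# Replace Hive's bundled Guava with the newer version shipped by Hadoop
cp /home/app/hadoop/hadoop-3.2.3/share/hadoop/common/lib/guava-27.0-jre.jar \
   /home/app/hive/apache-hive-3.1.2-bin/lib/
rm /home/app/hive/apache-hive-3.1.2-bin/lib/guava-19.0.jar
```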
8. Re-run the metastore initialization; this time it completes:
[root@VM-16-6-centos apache-hive-3.1.2-bin]# ./bin/schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/app/hive/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/app/hadoop/hadoop-3.2.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://129.204.20.98:3306/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.mysql.sql
Initialization script completed
schemaTool completed
9. Log into the database you configured and verify that the hive database now exists and contains the metastore tables.
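With the MySQL metastore configured above, a quick check might look like this (a sketch: the host and password are the placeholders from hive-site.xml, so substitute your own values):

```shell
# List the metastore tables that schematool created in the hive database
# (host and password below are placeholders, not real values)
mysql -h xxxx.xxxx.xxxx.xx -u root -p -e 'SHOW TABLES IN hive;'
```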
10. From the apache-hive-3.1.2-bin/bin directory, start the Hive command-line client:
./hive
List the databases; if the query succeeds, the installation is working:
hive> show databases;
Full output:
[root@VM-16-6-centos bin]# ./hive
which: no hbase in (/usr/java/jdk1.8.0_51/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/app/hive/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/app/hadoop/hadoop-3.2.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = afc39434-c699-4365-bc8f-2e06dde12996
Logging initialized using configuration in jar:file:/home/app/hive/apache-hive-3.1.2-bin/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive Session ID = 5be8b38a-1465-493f-8987-000db1832348
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
OK
default
Time taken: 0.951 seconds, Fetched: 1 row(s)
hive>
11. From the bin directory, start the metastore and HiveServer2 services
#If only hiveserver2 is started without the metastore service, remote Hive connections cannot run MapReduce operations (e.g. aggregations)
#listens on port 9083
nohup ./hive --service metastore &
#listens on ports 10000 and 10002
nohup ./hive --service hiveserver2 &
Full output:
[root@VM-16-6-centos bin]# ./hiveserver2
which: no hbase in (/usr/java/jdk1.8.0_51/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
2022-05-16 23:16:52: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/app/hive/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/app/hadoop/hadoop-3.2.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = b329d09b-68ad-4bfe-96d2-b202995e7ee4
Hive Session ID = 1c58be8d-4793-413b-8074-71ac5877d318
Hive Session ID = 1d8f8f0b-ead2-4418-a396-966dceda11f3
Hive Session ID = e6a8d7f3-f29a-44bc-a40f-44b5db7c954b
12. In a second SSH session on the server, check whether the service port 10000 and the web-UI port 10002 are listening
netstat -ntlp
If 10000 and 10002 are listed, the services started successfully.
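A convenience filter (not from the original guide) narrows the netstat output down to just those two ports; on a live system you would run `netstat -ntlp | grep -E ':(10000|10002) '`. Demonstrated here on a fabricated netstat-style line:

```shell
# A made-up line in netstat -ntlp format, used to illustrate the filter
sample='tcp  0  0 0.0.0.0:10000  0.0.0.0:*  LISTEN  1234/java'
# Keep only lines whose local address uses one of the HiveServer2 ports
echo "$sample" | grep -E ':(10000|10002) '
```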
13. Test a JDBC connection to Hive with the Beeline client
#start the client
./beeline --color=true
#connect to HiveServer2; no username/password is configured by default, so any input works
beeline> !connect jdbc:hive2://127.0.0.1:10000
#once connected, list the databases
0: jdbc:hive2://127.0.0.1:10000> show databases;
Full output:
[root@VM-16-6-centos bin]# ./beeline --color=true
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/app/hive/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/app/hadoop/hadoop-3.2.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.2 by Apache Hive
beeline> !connect jdbc:hive2://127.0.0.1:10000
Connecting to jdbc:hive2://127.0.0.1:10000
Enter username for jdbc:hive2://127.0.0.1:10000: hive
Enter password for jdbc:hive2://127.0.0.1:10000: ****
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://127.0.0.1:10000> show databases;
INFO : Compiling command(queryId=root_20220516225230_d70d6472-f9e7-4b86-abe3-179faecf8784): show databases
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=root_20220516225230_d70d6472-f9e7-4b86-abe3-179faecf8784); Time taken: 0.016 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20220516225230_d70d6472-f9e7-4b86-abe3-179faecf8784): show databases
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=root_20220516225230_d70d6472-f9e7-4b86-abe3-179faecf8784); Time taken: 1.155 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+----------------+
| database_name |
+----------------+
| default |
+----------------+
1 row selected (1.259 seconds)