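Apache Hive is a distributed, fault-tolerant data warehouse system that enables analytics at massive scale. The Hive Metastore (HMS) provides a central repository of metadata that can easily be analyzed to make informed, data-driven decisions, which makes it a critical component of many data lake architectures. Hive is built on top of Apache Hadoop and supports storage on S3, ADLS, GS and similar systems through HDFS. Hive lets users read, write, and manage petabytes of data using SQL.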
yum install mysql-server -y
service mysqld start
chkconfig mysqld on
mysql -uroot -p
# Run the following statement at the mysql> prompt
grant all privileges on *.* to 'root'@'%' identified by '123456' with grant option;
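The grant can also be applied without an interactive session. The sketch below only prints the statement so that it can be piped into the client; `hive_grant_sql` is a hypothetical helper name, not a MySQL command, and the syntax assumes MySQL 5.x as installed in this guide (MySQL 8.0 no longer accepts `GRANT ... IDENTIFIED BY`, so there the user would have to be created first).

```shell
#!/usr/bin/env bash
# hive_grant_sql USER PASSWORD
# Hypothetical helper: prints the MySQL 5.x GRANT statement used in this guide.
hive_grant_sql() {
  local user="$1" password="$2"
  # %% prints a literal % (the "any host" wildcard in MySQL account names)
  printf "GRANT ALL PRIVILEGES ON *.* TO '%s'@'%%' IDENTIFIED BY '%s' WITH GRANT OPTION;\n" \
    "$user" "$password"
}

# Possible use (prompts for the current root password):
#   hive_grant_sql root 123456 | mysql -uroot -p
```

Granting all privileges on `*.*` to a remotely reachable root account is convenient on a test cluster; a dedicated account limited to the metastore database would be the more cautious choice elsewhere.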
For detailed installation steps, see "Installing Hive from a Stable Release" in the official documentation.
This guide installs hive-2.3.9.
# Extract the archive
tar -xf apache-hive-2.3.9-bin.tar.gz
# Rename the directory
mv apache-hive-2.3.9-bin hive-2.3.9
Set the environment variable HIVE_HOME to point to the installation directory:
vi /etc/profile
# Add the following lines
export HIVE_HOME=/opt/bigdata/hive-2.3.9
export PATH=$HIVE_HOME/bin:$PATH
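After editing /etc/profile, the change only takes effect once the file is reloaded. A minimal sketch for checking the result follows; `check_hive_home` is a hypothetical helper, and it assumes that a valid installation contains an executable `bin/hive`.

```shell
#!/usr/bin/env bash
# check_hive_home [DIR]
# Hypothetical helper: succeeds only if DIR (default: $HIVE_HOME) contains
# an executable bin/hive launcher.
check_hive_home() {
  local home="${1:-${HIVE_HOME:-}}"
  if [ -z "$home" ]; then
    echo "HIVE_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$home/bin/hive" ]; then
    echo "no executable bin/hive under $home" >&2
    return 1
  fi
  echo "HIVE_HOME looks valid: $home"
}

# Possible use after editing /etc/profile:
#   source /etc/profile && check_hive_home
```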
The configuration files are in the conf directory under the Hive installation path:
cd /opt/bigdata/hive-2.3.9/conf
mv hive-default.xml.template hive-site.xml
vi hive-site.xml
# Place the following properties inside the <configuration> element
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://node01:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
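Because hive-site.xml starts out as a copy of the template, it also carries every default property. As an alternative, the sketch below writes a minimal file that holds only the five properties listed above. `write_hive_site` is a hypothetical helper, not part of Hive, and its arguments (output path, JDBC URL, user, password) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# write_hive_site OUT_FILE JDBC_URL USER PASSWORD
# Hypothetical helper: writes a minimal hive-site.xml.
# Values are inserted verbatim, so an "&" in the URL must be passed as "&amp;".
write_hive_site() {
  local out="$1" url="$2" user="$3" password="$4"
  cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>${url}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>${user}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>${password}</value>
  </property>
</configuration>
EOF
}

# Possible use with the values from this guide:
#   write_hive_site /opt/bigdata/hive-2.3.9/conf/hive-site.xml \
#     'jdbc:mysql://node01:3306/hive?createDatabaseIfNotExist=true' root 123456
```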
Copy the MySQL JDBC driver mysql-connector-java-5.1.32-bin.jar into the Hive lib directory:
/opt/bigdata/hive-2.3.9/lib
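The copy step can be sketched as a small function that fails early when either path is wrong. `install_jdbc_driver` is a hypothetical name, and the location of the downloaded jar is an assumption, since the guide does not say where it was saved.

```shell
#!/usr/bin/env bash
# install_jdbc_driver JAR LIB_DIR
# Hypothetical helper: copies the JDBC driver jar into Hive's lib directory
# and lists the copied file so the result can be checked by eye.
install_jdbc_driver() {
  local jar="$1" lib_dir="$2"
  [ -f "$jar" ] || { echo "driver jar not found: $jar" >&2; return 1; }
  [ -d "$lib_dir" ] || { echo "lib directory not found: $lib_dir" >&2; return 1; }
  cp "$jar" "$lib_dir/" && ls -l "$lib_dir/$(basename "$jar")"
}

# Possible use, assuming the jar sits in the current directory:
#   install_jdbc_driver mysql-connector-java-5.1.32-bin.jar /opt/bigdata/hive-2.3.9/lib
```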
Initialize the metastore schema:
schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bigdata/hive-2.3.9/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://node01:3306/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
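The SLF4J lines in this output are warnings about duplicate logging bindings; the line that matters is the final `schemaTool completed`. A minimal sketch for checking a saved log for that line follows. `schema_init_ok` is a hypothetical helper and the log file name is an assumption.

```shell
#!/usr/bin/env bash
# schema_init_ok
# Hypothetical helper: reads schematool output on stdin and succeeds only if
# it contains the "schemaTool completed" line shown above.
schema_init_ok() {
  grep -q 'schemaTool completed'
}

# Possible use (schema-init.log is an arbitrary file name):
#   schematool -dbType mysql -initSchema > schema-init.log 2>&1
#   schema_init_ok < schema-init.log && echo "metastore schema is ready"
# The stored schema version can afterwards be inspected with:
#   schematool -dbType mysql -info
```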
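For a remote metastore, pick two nodes: one Hive instance acts as the client (node01) and the other acts as the server (node02), which connects to MySQL.
Setting the HIVE_HOME environment variable works as described above and is not repeated here.
Server (node02) hive-site.xml configuration: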
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive_remote/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://node01:3306/hive_remote?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
The metastore server and client communicate using the Thrift protocol.
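Client configuration parameters:
hive.metastore.local = false (the metastore is remote; note that this is no longer needed as of Hive 0.10)
hive.metastore.uris = thrift://<host_name>:<port> (host and port of the Thrift metastore server)
hive.metastore.warehouse.dir = <base hdfs path> (default location of non-external Hive tables in HDFS)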
Client (node01) hive-site.xml configuration:
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive_remote/warehouse</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://node02:9083</value>
</property>
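To confirm which metastore address the client will use, the value can be read back from the file. The sketch below assumes a well-formed hive-site.xml in which `<name>` and `<value>` sit on consecutive lines, as in this guide, and a single URI rather than a comma-separated list; both function names are hypothetical.

```shell
#!/usr/bin/env bash
# get_hive_property FILE NAME
# Hypothetical helper: prints the <value> on the line after the given <name>.
get_hive_property() {
  local file="$1" name="$2"
  grep -F -A1 "<name>${name}</name>" "$file" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
    | head -n 1
}

# split_thrift_uri URI
# Hypothetical helper: turns thrift://host:port into "host port".
split_thrift_uri() {
  local hostport="${1#thrift://}"
  echo "${hostport%%:*} ${hostport##*:}"
}

# Possible use on the client node:
#   uri=$(get_hive_property "$HIVE_HOME/conf/hive-site.xml" hive.metastore.uris)
#   split_thrift_uri "$uri"
```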
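# On the server (node02): initialize the schema, then start the metastore service
schematool -dbType mysql -initSchema
hive --service metastore
# On the client (node01): start the Hive CLI
hive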
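The client can only connect once the metastore service is listening. A minimal sketch for waiting on the Thrift port before starting the CLI follows. It relies on bash's `/dev/tcp` redirection, both function names are hypothetical, and running the metastore in the background with `nohup` is a suggestion rather than something this guide prescribes.

```shell
#!/usr/bin/env bash
# port_open HOST PORT
# Hypothetical helper: succeeds if a TCP connection can be opened.
# The subshell closes the connection again as soon as it exits.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# wait_for_port HOST PORT [TRIES]
# Hypothetical helper: retries once per second, up to TRIES times (default 30).
wait_for_port() {
  local host="$1" port="$2" tries="${3:-30}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    port_open "$host" "$port" && return 0
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Possible use:
#   on node02:  nohup hive --service metastore > metastore.log 2>&1 &
#   on node01:  wait_for_port node02 9083 && hive
```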