Hive: Installing and Configuring Hive 2.3.4

Please credit the source when reposting: https://blog.csdn.net/l1028386804/article/details/88014099

Hive is a data warehouse built on top of Hadoop. It maps structured data files onto tables and provides SQL-like querying; under the hood, Hive translates SQL statements into MapReduce jobs.
Download Hive 2.3.4 to /home/dc2-user on the master node and extract it:

wget http://mirror.bit.edu.cn/apache/hive/hive-2.3.4/apache-hive-2.3.4-bin.tar.gz
tar zxvf apache-hive-2.3.4-bin.tar.gz
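
If the mirror above no longer carries this release, the Apache archive keeps every historical version; the following URL is an assumed alternative based on the archive's standard layout:

wget https://archive.apache.org/dist/hive/hive-2.3.4/apache-hive-2.3.4-bin.tar.gz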

Set the Hive environment variables

Edit the /etc/profile file and add the following lines:

sudo vi /etc/profile
export HIVE_HOME=/home/dc2-user/apache-hive-2.3.4-bin
export PATH=$PATH:$HIVE_HOME/bin

Apply the environment variables:

source /etc/profile
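
A quick sanity check that the variables took effect:

echo $HIVE_HOME
which hive    # should print /home/dc2-user/apache-hive-2.3.4-bin/bin/hive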

Configure Hive

Copy the template configuration files:

cd apache-hive-2.3.4-bin/conf/ 
cp hive-env.sh.template hive-env.sh 
cp hive-default.xml.template hive-site.xml 
cp hive-log4j2.properties.template hive-log4j2.properties 
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties

Edit hive-env.sh:

export JAVA_HOME=/home/dc2-user/java/jdk1.8.0_191       ## Java path
export HADOOP_HOME=/home/dc2-user/hadoop-2.7.7          ## Hadoop installation path
export HIVE_HOME=/home/dc2-user/apache-hive-2.3.4-bin   ## Hive installation path
export HIVE_CONF_DIR=$HIVE_HOME/conf                    ## Hive configuration directory

Edit hive-site.xml

Update the values of the following properties:

vi hive-site.xml

<property>
	<name>hive.exec.scratchdir</name>
	<value>/tmp/hive-${user.name}</value>
	<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
	<name>hive.exec.local.scratchdir</name>
	<value>/tmp/${user.name}</value>
	<description>Local scratch space for Hive jobs</description>
</property>
<property>
	<name>hive.downloaded.resources.dir</name>
	<value>/tmp/hive/resources</value>
	<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
	<name>hive.querylog.location</name>
	<value>/tmp/${user.name}</value>
	<description>Location of Hive run time structured log file</description>
</property>
<property>
	<name>hive.server2.logging.operation.log.location</name>
	<value>/tmp/${user.name}/operation_logs</value>
	<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
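
The hive-site.xml copied from hive-default.xml.template runs to several thousand lines, so finding these properties by hand is tedious. A quick way to locate them (a plain grep sketch, nothing Hive-specific):

grep -n -e 'hive.exec.scratchdir' -e 'hive.exec.local.scratchdir' \
     -e 'hive.downloaded.resources.dir' -e 'hive.querylog.location' \
     -e 'hive.server2.logging.operation.log.location' hive-site.xml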

Configure the Hive Metastore

The Hive Metastore holds the metadata for Hive tables and partitions; in this example, MariaDB is used to store it.
Place mysql-connector-java-5.1.40-bin.jar into $HIVE_HOME/lib and configure the MySQL connection settings in hive-site.xml.
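
Assuming the connector jar has already been downloaded to the current directory (it is available from Maven Central as mysql:mysql-connector-java:5.1.40; adjust the path to wherever you saved it):

cp mysql-connector-java-5.1.40-bin.jar $HIVE_HOME/lib/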

<property>
	<name>javax.jdo.option.ConnectionURL</name>
	<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
	<name>javax.jdo.option.ConnectionDriverName</name>
	<value>com.mysql.jdbc.Driver</value>
</property>
<property>
	<name>javax.jdo.option.ConnectionUserName</name>
	<value>hive</value>
</property>
<property>
	<name>javax.jdo.option.ConnectionPassword</name>
	<value>hive</value>
</property>
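
Note that a literal & is not valid inside an XML value, so the parameter separators in the JDBC URL must be escaped as &amp; as shown above.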
Create HDFS directories for Hive

start-dfs.sh   # can be skipped if HDFS was already started during the Hadoop installation
hdfs dfs -mkdir /tmp
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /user/hive/warehouse
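
The warehouse path should match the hive.metastore.warehouse.dir property. In Hive 2.3.4 its default value is /user/hive/warehouse, so no hive-site.xml change is needed when using that path; for reference, the property looks like this:

<property>
	<name>hive.metastore.warehouse.dir</name>
	<value>/user/hive/warehouse</value>
	<description>location of default database for the warehouse</description>
</property>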

Install MySQL

This example uses MariaDB.

sudo yum install -y mariadb-server
sudo systemctl start mariadb
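
Optionally, have MariaDB start automatically on boot:

sudo systemctl enable mariadb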

Log in to MySQL (there is no password initially), create the hive user, and set its password.

mysql -uroot
MariaDB [(none)]> create user 'hive'@'localhost' identified by 'hive';
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> grant all privileges on *.* to hive@localhost identified by 'hive';
Query OK, 0 rows affected (0.00 sec)
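
To verify the new account, log in as the hive user; the hive database itself does not need to be created by hand, since createDatabaseIfNotExist=true in the JDBC URL creates it on first connection:

mysql -uhive -phive -e 'select 1;'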

Run Hive

Before running Hive, HDFS must already be running; you can start it with start-dfs.sh. If it was started earlier during the Hadoop installation, this step can be skipped.
Starting with Hive 2.1, you must run the schematool command to initialize the metastore schema before starting Hive for the first time:

schematool -dbType mysql -initSchema 
SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/home/dc2-user/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: Found binding in [jar:file:/home/dc2-user/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] 
Metastore connection URL: jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true 
Metastore Connection Driver : com.mysql.jdbc.Driver 
Metastore connection User: hive 
Starting metastore schema initialization to 2.3.0 
Initialization script hive-schema-2.3.0.mysql.sql 
Initialization script completed
schemaTool completed
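
To confirm the schema was created, list the metastore tables directly in MariaDB; you should see tables such as DBS and TBLS:

mysql -uhive -phive -e 'use hive; show tables;'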

Start Hive by typing the hive command:

hive 

which: no hbase in (/home/dc2-user/java/jdk1.8.0_191/bin:/home/dc2-user/hadoop-2.7.7/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/bin:/home/dc2-user/apache-hive-2.3.4-bin/bin:/home/dc2-user/.local/bin:/home/dc2-user/bin) 
SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/home/dc2-user/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: Found binding in [jar:file:/home/dc2-user/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] 
Logging initialized using configuration in file:/home/dc2-user/apache-hive-2.3.4-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 

Test Hive

Create a table in Hive:

hive> create table test_hive(id int, name string)
    > row format delimited fields terminated by '\t'  -- fields are separated by a tab character
    > stored as textfile;  -- storage format for loaded data; the default is TEXTFILE. For plain-text files, STORED AS TEXTFILE copies the file to HDFS as-is and Hive can read it directly
OK 
Time taken: 10.857 seconds 
hive> show tables; 
OK
test_hive
Time taken: 0.396 seconds, Fetched: 1 row(s)
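
To double-check the table definition before loading data, describe it; the output should list the id and name columns with their types:

hive> desc test_hive;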

The table was created successfully. Enter quit; to exit Hive, then create the test data as a text file:

vi test_db.txt
101	aa
102	bb
103	cc
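
The fields must be separated by literal tab characters to match the table's delimiter. If your editor converts tabs to spaces, printf is a safe alternative:

printf '101\taa\n102\tbb\n103\tcc\n' > /home/dc2-user/test_db.txt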

Re-enter Hive and load the data:

hive> load data local inpath '/home/dc2-user/test_db.txt' into table test_hive; 
Loading data to table default.test_hive 
OK 
Time taken: 6.679 seconds 

hive> select * from test_hive; 
101 aa 
102 bb 
103 cc 
Time taken: 2.814 seconds, Fetched: 3 row(s)
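
A plain select * is served by a simple fetch task and does not exercise the MapReduce path mentioned at the start. As a final check, an aggregate query should launch a MapReduce job and return 3:

hive> select count(*) from test_hive;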

The data was loaded successfully and can be queried as expected.
