Installing apache-hive-3.1.3 on Windows 10

1. Introduction to Hive

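What Hive is: Hive is a data warehouse tool built on Hadoop for extracting, transforming, and loading data. It provides a way to store, query, and analyze large datasets kept in Hadoop. Hive maps structured data files to database tables and offers SQL querying, translating SQL statements into MapReduce jobs for execution. In short, Hive is a data warehouse layer that wraps Hadoop and exposes a SQL-like language, HiveQL, for querying data.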

2. Downloading Hive

https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.3/

3. Configuring the Hive environment variables

[Figures 1 and 2: screenshots of the Hive environment variable settings]

4. Modifying parameters in the Hive configuration files

4.1 Modifying the parameters in hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory
# (single quotes keep the shell from stripping the backslashes in Windows paths)
export HADOOP_HOME='D:\bigdata\hadoop'

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR='D:\bigdata\apache-hive-3.1.3-bin\conf'

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH='D:\bigdata\apache-hive-3.1.3-bin\lib'

4.2 Create a hive database in MySQL; the configuration in 4.4 refers to it
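Step 4.2 can be done from the MySQL command-line client. The following is a minimal sketch, assuming the database name `hive` and the user `root` that appear in the JDBC settings of step 4.4. The function name `create_metastore_db` and its optional argument are illustrative only and are not part of Hive or MySQL.

```shell
# Create the database that javax.jdo.option.ConnectionURL points at.
# $1 (optional, illustrative): client command to run; defaults to "mysql".
create_metastore_db() {
  mysql_cmd="${1:-mysql}"
  # -p makes the client prompt for the password instead of taking it on the command line
  "$mysql_cmd" -u root -p -e "CREATE DATABASE IF NOT EXISTS hive;"
}
```

Calling `create_metastore_db` with no argument runs the real client. The SQL statement itself can also be typed directly at the `mysql>` prompt.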

4.3 Hive runs on top of Hadoop, so a few directories need to be created through Hadoop
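The original does not list which directories were created. As a hedged sketch, the two HDFS paths used in the hive-site.xml of step 4.4 (`/tmp/hive` and `/user/hive/warehouse`) are assumed here. The function name `create_hive_dirs` and its optional argument are illustrative. The `hdfs dfs` subcommands are the same when typed in a Windows cmd window.

```shell
# Create the HDFS directories that hive-site.xml points at.
# $1 (optional, illustrative): command to run instead of "hdfs", e.g. "echo" for a dry run.
create_hive_dirs() {
  hdfs_cmd="${1:-hdfs}"
  for dir in /tmp/hive /user/hive/warehouse; do
    "$hdfs_cmd" dfs -mkdir -p "$dir" || return 1
    # add group write permission (the Hive getting-started guide does the same for its directories)
    "$hdfs_cmd" dfs -chmod g+w "$dir" || return 1
  done
}
```

Running `create_hive_dirs echo` prints the four commands without touching HDFS, which is a convenient way to review them first. Hadoop must already be running for the real call to succeed.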

4.4 Modifying the hive-site.xml configuration


<property>
	<name>hive.metastore.warehouse.dir</name>
	<value>/user/hive/warehouse</value>
	<description>location of default database for the warehouse</description>
</property>

<property>
	<name>hive.exec.scratchdir</name>
	<value>/tmp/hive</value>
	<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>

<property>
	<name>hive.exec.local.scratchdir</name>
	<value>D:/bigdata/apache-hive-3.1.3-bin/my_hive/scratch_dir</value>
	<description>Local scratch space for Hive jobs</description>
</property>

<property>
	<name>hive.downloaded.resources.dir</name>
	<value>D:/bigdata/apache-hive-3.1.3-bin/my_hive/resources_dir/${hive.session.id}_resources</value>
	<description>Temporary local directory for added resources in the remote file system.</description>
</property>

<property>
	<name>hive.querylog.location</name>
	<value>D:/bigdata/apache-hive-3.1.3-bin/my_hive/querylog_dir</value>
	<description>Location of Hive run time structured log file</description>
</property>

<property>
	<name>hive.server2.logging.operation.log.location</name>
	<value>D:/bigdata/apache-hive-3.1.3-bin/my_hive/operation_logs_dir</value>
	<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>

<property>
	<name>javax.jdo.option.ConnectionURL</name>
	<value>jdbc:mysql://localhost:3306/hive?serverTimezone=UTC&amp;useSSL=false&amp;allowPublicKeyRetrieval=true</value>
	<description>
		JDBC connect string for a JDBC metastore.
	</description>
</property>

<property>
	<name>javax.jdo.option.ConnectionDriverName</name>
	<value>com.mysql.cj.jdbc.Driver</value>
	<description>Driver class name for a JDBC metastore</description>
</property>

<property>
	<name>javax.jdo.option.ConnectionUserName</name>
	<value>root</value>
	<description>Username to use against metastore database</description>
</property>

<property>
	<name>javax.jdo.option.ConnectionPassword</name>
	<value>123456</value>
	<description>password to use against metastore database</description>
</property>

<property>
	<name>hive.metastore.schema.verification</name>
	<value>false</value>
	<description>
		Enforce metastore schema version consistency.
		True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
		schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
		proper metastore schema migration. (Default)
		False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
	</description>
</property>

<property>
	<name>datanucleus.schema.autoCreateAll</name>
	<value>true</value>
	<description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once. To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
</property>

5. Starting Hive

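By default, the 3.1.3 release does not ship the cmd scripts needed on Windows. Download the apache-hive-2.2.0-src.tar.gz package and copy the corresponding cmd files from its bin directory into the bin directory of the Hive 3.1.3 installation.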
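The copy step can be scripted from a Unix-style shell such as Git Bash. This is only a sketch: the function name `copy_cmd_scripts` is made up for illustration, both directories are passed in as arguments because the original does not say where the 2.2.0 source was unpacked, and the copy is recursive on the assumption that some cmd files may sit in subdirectories of bin.

```shell
# Copy every *.cmd file from one bin directory into another, keeping the relative layout.
# $1: bin directory of the unpacked Hive 2.2.0 source
# $2: bin directory of the Hive 3.1.3 installation
copy_cmd_scripts() {
  src_bin="$1"
  dest_bin="$2"
  # List the cmd files relative to the source directory, then copy them one by one
  (cd "$src_bin" && find . -type f -name '*.cmd') | while IFS= read -r rel; do
    mkdir -p "$dest_bin/$(dirname "$rel")"
    cp "$src_bin/$rel" "$dest_bin/$rel"
  done
}
```

Files without the .cmd extension are left alone, so the shell scripts that ship with 3.1.3 are not overwritten.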

Before starting Hive, start Hadoop first. Then run hive.cmd start in Hive's bin directory.

After running the command, the following error appeared:

Beeline version 3.1.3 by Apache Hive
Hive Session ID = 5f6198d8-1183-4500-9cba-830da0127197
2023-11-09 01:24:22,203 INFO SessionState: Hive Session ID = 5f6198d8-1183-4500-9cba-830da0127197
Error applying authorization policy on hive configuration: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /tmp/hive/用户名/5f6198d8-1183-4500-9cba-830da0127197. Name node is in safe mode.
The reported blocks 49 has reached the threshold 0.9990 of total blocks 49. The minimum number of live datanodes is not required. In safe mode extension. Safe mode will be turned off automatically in 3 seconds. NamenodeHostName:localhost
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1612)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1599)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3437)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1166)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:742)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1213)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1089)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1012)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3026)

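The NameNode was still in safe mode, so Hive could not create its scratch directory. Turn off safe mode manually: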

hadoop dfsadmin -safemode leave
D:\bigdata\hadoop\sbin>hadoop dfsadmin -safemode leave
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Safe mode is OFF
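The error message above says safe mode would be turned off automatically within a few seconds, so waiting is an alternative to forcing it off. A minimal sketch using the non-deprecated `hdfs` command follows. The function name `wait_for_safemode_off` and its optional argument are illustrative, not part of Hadoop.

```shell
# Block until the NameNode has left safe mode on its own, then print the current state.
# $1 (optional, illustrative): command to run instead of "hdfs", e.g. "echo" for a dry run.
wait_for_safemode_off() {
  hdfs_cmd="${1:-hdfs}"
  "$hdfs_cmd" dfsadmin -safemode wait || return 1
  "$hdfs_cmd" dfsadmin -safemode get
}
```

The same two `hdfs dfsadmin` commands can be typed directly in a cmd window. `hdfs dfsadmin -safemode leave` is the non-deprecated form of the command shown above.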

Start Hive again by running hive.cmd start in Hive's bin directory

[Figure 3: screenshot of the Hive startup output]

Messages like this during startup can be ignored for now; they come from a version mismatch between Hive and Hadoop.

2023-11-09 01:28:25,800 INFO session.SessionState: Resetting thread name to  main
2023-11-09 01:28:25,807 INFO conf.HiveConf: Using the default value passed in for log id: 255c0661-6506-4a5f-bbad-f48b74f972dd
2023-11-09 01:28:25,807 INFO session.SessionState: Updating thread name to 255c0661-6506-4a5f-bbad-f48b74f972dd main
2023-11-09 01:28:25,836 INFO conf.HiveConf: Using the default value passed in for log id: 255c0661-6506-4a5f-bbad-f48b74f972dd
2023-11-09 01:28:25,837 INFO session.SessionState: Resetting thread name to  main
Beeline version 3.1.3 by Apache Hive
hive>

This shows that Hive started successfully.
