Hadoop: Hive

Configuration

The configuration files that typically need to be customized are:
etc/hadoop/core-site.xml
etc/hadoop/hdfs-site.xml
etc/hadoop/mapred-site.xml
etc/hadoop/yarn-site.xml
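
For a single-node (pseudo-distributed) setup, a minimal core-site.xml can look like the sketch below; the hdfs://localhost:9000 address is an assumption for a local NameNode and should match your environment.

# Write a minimal core-site.xml for a pseudo-distributed cluster (a sketch).
cat > etc/hadoop/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF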

FAQ

Running the MapReduce examples fails

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar grep /input/* output 'dfs[a-z.]+'

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=anonymous, access=WRITE, inode="/user/hive/warehouse":root:supergroup:drwxrwxr-x
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1869)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1853)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1812)
	at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3215)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1127)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:713)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

Solution:
Add the following configuration to etc/hadoop/mapred-site.xml,
then restart Hadoop.


<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/</value>
</property>

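After editing mapred-site.xml, restart YARN and re-run the example job to confirm the fix. A minimal sketch, assuming the daemons were started with the stock sbin scripts under the Hadoop root:

# Restart the YARN daemons so the new env settings take effect.
sbin/stop-yarn.sh
sbin/start-yarn.sh
# Re-run the example job to verify.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar grep /input/* output 'dfs[a-z.]+'
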
Hive Installation

Create a test table

create database if not exists test;

create table test.person(
id  int,
name string,
age  int
)
stored as orc;


insert into table  test.person values(1, 'Haimeimei', 10);
insert into table  test.person values(2, 'David', 11);
insert into table  test.person values(3, 'Json', 12);
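
The rows can also be checked non-interactively through beeline once HiveServer2 is up (a sketch; assumes the default port 10000 and the root user, as in the sessions below):

# -n names the connecting user; -e runs one statement and exits.
beeline -u jdbc:hive2://localhost:10000 -n root -e "select * from test.person;"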

Quick Hive install using Derby

Basic setup

hadoop fs -mkdir       /tmp
hadoop fs -mkdir -p    /user/hive/warehouse
hadoop fs -chmod g+w   /tmp
hadoop fs -chmod g+w   /user/hive/warehouse
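
A quick way to confirm the directories and their group-write bits (a sketch):

# -d lists the directories themselves rather than their contents,
# so the permission columns (e.g. drwxrwxr-x) are easy to check.
hadoop fs -ls -d /tmp /user/hive/warehouse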

Configure environment variables in ~/.bashrc:

export HIVE_HOME=/opt/Beaver/hive/
export PATH=$HIVE_HOME/bin:$PATH
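
Reload the profile and confirm the hive binary is on the PATH (a sketch):

source ~/.bashrc
# Should print the Hive version, e.g. "Hive 3.1.2".
hive --version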

In the Hive root directory, initialize the Derby DB:

bin/schematool -initSchema -dbType derby
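
To verify that the schema was created, schematool can print the metastore connection info and schema version (a sketch):

# Prints the connection URL and schema version on success.
bin/schematool -dbType derby -info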

hiveserver2 and beeline

bin/hiveserver2
# bin/hiveserver2
2023-06-13 14:14:07: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/Beaver/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/Beaver/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 4d5bbf08-f3f1-476b-ab97-060118b4494e
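
hiveserver2 stays in the foreground; to keep it running you can background it and check that it is listening on the default port 10000 (a sketch; use netstat -lnt on systems without ss):

# Background HiveServer2 and capture its output.
nohup bin/hiveserver2 > /tmp/hiveserver2.log 2>&1 &
# HiveServer2 listens on port 10000 by default.
ss -lnt | grep 10000
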
beeline
# beeline -u jdbc:hive2://localhost:10000 -n root
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/Beaver/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/Beaver/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://localhost:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.2 by Apache Hive
0: jdbc:hive2://localhost:10000> insert into table  test.person values(1, 'Haimeimei', 10);
...

0: jdbc:hive2://localhost:10000> select * from test.person limit 10;
INFO  : Compiling command(queryId=root_20230613142009_df75b7a3-ae61-4da5-8ec8-b60621d432bc): select * from test.person limit 10
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:person.id, type:int, comment:null), FieldSchema(name:person.name, type:string, comment:null), FieldSchema(name:person.age, type:int, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=root_20230613142009_df75b7a3-ae61-4da5-8ec8-b60621d432bc); Time taken: 0.165 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=root_20230613142009_df75b7a3-ae61-4da5-8ec8-b60621d432bc): select * from test.person limit 10
INFO  : Completed executing command(queryId=root_20230613142009_df75b7a3-ae61-4da5-8ec8-b60621d432bc); Time taken: 0.0 seconds
INFO  : OK
INFO  : Concurrency mode is disabled, not creating a lock manager
+------------+--------------+-------------+
| person.id  | person.name  | person.age  |
+------------+--------------+-------------+
| 1          | Haimeimei    | 10          |
| 2          | David        | 11          |
| 3          | Json         | 12          |
+------------+--------------+-------------+
3 rows selected (0.265 seconds)
0: jdbc:hive2://localhost:10000> 

metastore

The Hive metastore provides Hive's metadata service.
Because this setup uses Derby, first stop any other service holding the Derby DB (such as hiveserver2),
then start the metastore:

nohup bin/hive --service metastore &
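
The metastore's Thrift service binds to port 9083 by default; a quick check that it is up (a sketch):

# Confirm the metastore is listening on its default port.
ss -lnt | grep 9083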

Create the following hive-site.xml and put it into Spark's conf directory.

# cat hive-site.xml
<?xml version="1.0"?>
<configuration>
	<property>
		<name>hive.metastore.uris</name>
		<value>thrift://localhost:9083</value>
		<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
	</property>
</configuration>
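
Then copy the file into Spark's conf directory (a sketch; $SPARK_HOME is assumed to point at the Spark installation):

# Make the remote metastore visible to Spark SQL.
cp hive-site.xml $SPARK_HOME/conf/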

Start spark-sql to verify.
The rows from the Hive table come back, so the setup works.

spark-sql> show databases;
default
test
Time taken: 2.273 seconds, Fetched 2 row(s)
spark-sql> select * from test.person;
1	Haimeimei	10
2	David	11
3	Json	12
Time taken: 4.065 seconds, Fetched 3 row(s)

FAQ

The following error appears:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate anonymous

Connecting to jdbc:hive2://localhost:10000
23/06/13 00:39:46 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate anonymous (state=08S01,code=0)
Beeline version 3.1.2 by Apache Hive

Add the following configuration to core-site.xml
and restart the Hadoop services:


<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>
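
After restarting, reconnect with an explicit user name so HiveServer2 impersonates root rather than anonymous (a sketch):

# Passing -n avoids connecting as the anonymous user.
beeline -u jdbc:hive2://localhost:10000 -n root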