The configuration files that typically need to be modified are:
etc/hadoop/core-site.xml
etc/hadoop/hdfs-site.xml
etc/hadoop/mapred-site.xml
etc/hadoop/yarn-site.xml
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar grep /input/* output 'dfs[a-z.]+'
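The example job scans the files under /input and counts occurrences of strings matching the regex `dfs[a-z.]+`. In plain Python terms (the sample text is made up for illustration):

```python
import re

# Same pattern the hadoop-mapreduce-examples grep job uses above.
pattern = re.compile(r"dfs[a-z.]+")

sample = "dfs.replication is set in hdfs-site.xml; so is dfs.namenode.name.dir"
print(pattern.findall(sample))
# ['dfs.replication', 'dfs.namenode.name.dir']
```

Note that `hdfs-site.xml` does not match: `[a-z.]+` requires at least one letter or dot after `dfs`, and a `-` follows it there.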
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=anonymous, access=WRITE, inode="/user/hive/warehouse":root:supergroup:drwxrwxr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1869)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1853)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1812)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3215)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1127)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:713)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
Solution:
Add the following configuration to etc/hadoop/mapred-site.xml,
then restart Hadoop:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/</value>
</property>
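All three properties point HADOOP_MAPRED_HOME at the same install path, so a typo in any one of them is easy to miss. A small sanity-check sketch (standard library only; not part of Hadoop) that parses a `*-site.xml` file's text and verifies the expected properties are present:

```python
import xml.etree.ElementTree as ET

# Properties that mapred-site.xml should contain after the edit above.
EXPECTED = {
    "yarn.app.mapreduce.am.env": "HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/",
    "mapreduce.map.env": "HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/",
    "mapreduce.reduce.env": "HADOOP_MAPRED_HOME=/opt/Beaver/hadoop/",
}

def read_properties(xml_text: str) -> dict:
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}

def check_site_xml(xml_text: str) -> bool:
    """True if every expected property is present with the right value."""
    props = read_properties(xml_text)
    return all(props.get(name) == value for name, value in EXPECTED.items())
```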
create database if not exists test;
create table test.person(
id int,
name string,
age int
)
stored as orc;
insert into table test.person values(1, 'Haimeimei', 10);
insert into table test.person values(2, 'David', 11);
insert into table test.person values(3, 'Json', 12);
hadoop fs -mkdir /tmp
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
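For reference, `g+w` only adds the group-write permission bit; starting from a typical `rwxr-xr-x` (755) directory:

```python
import stat

# chmod g+w: OR in the group-write bit, leaving everything else unchanged.
mode = 0o755            # rwxr-xr-x
mode |= stat.S_IWGRP    # group write
print(oct(mode))        # 0o775 -> rwxrwxr-x
```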
Configure environment variables in ~/.bashrc:
export HIVE_HOME=/opt/Beaver/hive/
export PATH=$HIVE_HOME/bin:$PATH
In the Hive root directory, initialize the Derby DB:
bin/schematool -dbType derby -initSchema
# bin/hiveserver2
2023-06-13 14:14:07: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/Beaver/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/Beaver/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 4d5bbf08-f3f1-476b-ab97-060118b4494e
# beeline -u jdbc:hive2://localhost:10000 -n root
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/Beaver/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/Beaver/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://localhost:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.2 by Apache Hive
0: jdbc:hive2://localhost:10000> insert into table test.person values(1, 'Haimeimei', 10);
...
0: jdbc:hive2://localhost:10000> select * from test.person limit 10;
INFO : Compiling command(queryId=root_20230613142009_df75b7a3-ae61-4da5-8ec8-b60621d432bc): select * from test.person limit 10
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:person.id, type:int, comment:null), FieldSchema(name:person.name, type:string, comment:null), FieldSchema(name:person.age, type:int, comment:null)], properties:null)
INFO : Completed compiling command(queryId=root_20230613142009_df75b7a3-ae61-4da5-8ec8-b60621d432bc); Time taken: 0.165 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20230613142009_df75b7a3-ae61-4da5-8ec8-b60621d432bc): select * from test.person limit 10
INFO : Completed executing command(queryId=root_20230613142009_df75b7a3-ae61-4da5-8ec8-b60621d432bc); Time taken: 0.0 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+------------+--------------+-------------+
| person.id | person.name | person.age |
+------------+--------------+-------------+
| 1 | Haimeimei | 10 |
| 2 | David | 11 |
| 3 | Json | 12 |
+------------+--------------+-------------+
3 rows selected (0.265 seconds)
0: jdbc:hive2://localhost:10000>
The Hive Metastore provides the metadata service.
Because it uses Derby DB here, and embedded Derby only allows one process at a time, first stop any other service that uses Derby DB (such as HiveServer2),
then start the metastore:
nohup bin/hive --service metastore &
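Since the metastore is backgrounded with `nohup`, it is worth confirming it actually came up. A quick connectivity check (sketch; 9083 is the metastore's default Thrift port):

```python
import socket

def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Once the metastore is up, this should return True:
# is_listening("localhost", 9083)
```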
Create the following hive-site.xml
and put it in Spark's conf directory:
# cat hive-site.xml
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
</configuration>
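A small helper to generate that file (a sketch: the URI is the one used above, and the target directory would be something like `$SPARK_HOME/conf`):

```python
from pathlib import Path

# Exact content shown above; localhost:9083 is where the metastore
# started in the previous step listens.
HIVE_SITE = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore
      client to connect to remote metastore.</description>
  </property>
</configuration>
"""

def write_hive_site(spark_conf_dir: str) -> Path:
    """Write hive-site.xml into Spark's conf directory and return its path."""
    target = Path(spark_conf_dir) / "hive-site.xml"
    target.write_text(HIVE_SITE)
    return target
```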
Start spark-sql to verify.
The Hive table data comes back, so the integration works:
spark-sql> show databases;
default
test
Time taken: 2.273 seconds, Fetched 2 row(s)
spark-sql> select * from test.person;
1 Haimeimei 10
2 David 11
3 Json 12
Time taken: 4.065 seconds, Fetched 3 row(s)
The following error appears:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate anonymous
Connecting to jdbc:hive2://localhost:10000
23/06/13 00:39:46 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate anonymous (state=08S01,code=0)
Beeline version 3.1.2 by Apache Hive
Add the following configuration to core-site.xml,
then restart the Hadoop services:
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>
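The same pattern applies to any service user that needs to impersonate others, not just root; a tiny sketch that emits the property pairs (note that `*` means any host and any group, so in production these should be narrowed to real hosts and groups):

```python
def proxyuser_properties(user: str):
    """core-site.xml entries allowing `user` to impersonate other users.
    '*' is wide open; restrict hosts/groups outside of test setups."""
    return [
        (f"hadoop.proxyuser.{user}.hosts", "*"),
        (f"hadoop.proxyuser.{user}.groups", "*"),
    ]

print(proxyuser_properties("root"))
```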