Query failures caused by a missing HDFS directory behind a Hive table partition

1. Symptoms

1.1 Spark Thrift Server query against the Hive partitioned table fails

0: jdbc:hive2://10.0.194.131:10017> select * from xy_app.xy_log_collect_user_funnels limit 10;
Error: java.lang.RuntimeException: serious problem (state=,code=0)

1.2 Presto query against the same table also fails

presto>  select * from xy_app.xy_log_collect_user_funnels limit 10;

Query 20200519_080407_03244_tmamh, FAILED, 1 node
Splits: 9 total, 0 done (0.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20200519_080407_03244_tmamh failed: Partition location does not exist: hdfs://xydwns/user/hive/warehouse/xy_app.db/xy_log_collect_user_funnels/xy_date=20191007/project_id=pay_h5

Presto at least tells you exactly which partition location is missing.
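
Incidentally, if the goal is just to keep Presto queries running while the metadata is being fixed, the Presto Hive connector can be configured to skip partitions whose locations are missing instead of failing the whole query. A hedged sketch (the catalog file name is an assumption, and this masks the symptom rather than repairing the metastore):

# etc/catalog/hive.properties on the Presto coordinator and workers
hive.ignore-absent-partitions=true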

1.3 Hive reports the same missing path as Presto (hit here while trying to drop the partition)

hive> alter table xy_app.xy_log_collect_user_funnels drop if exists PARTITION(xy_date='20191007');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Failed to delete parent: File does not exist: /user/hive/warehouse/xy_app.db/xy_log_collect_user_funnels/xy_date=20191007
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getContentSummary(FSDirectory.java:2414)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getContentSummary(FSNamesystem.java:4653)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getContentSummary(NameNodeRpcServer.java:1101)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getContentSummary(AuthorizationProviderProxyClientProtocol.java:566)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getContentSummary(ClientNamenodeProtocolServerSideTranslatorPB.java:895)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)

By contrast, Spark Thrift Server's error reporting falls short: something went wrong, but "serious problem" is all it tells you.

1.4 hdfs dfs -ls confirms the directory really is missing

hdfs dfs -ls /user/hive/warehouse/xy_app.db/xy_log_collect_user_funnels/xy_date=20191007
ls: `/user/hive/warehouse/xy_app.db/xy_log_collect_user_funnels/xy_date=20191007': No such file or directory 

1.5 Yet show partitions still lists xy_date=20191007

0: jdbc:hive2://10.0.194.131:10017> show partitions xy_app.xy_log_collect_user_funnels;
+-------------------------------------------------+--+
|                    partition                    |
+-------------------------------------------------+--+
| xy_date=20190304/project_id=invest              |

...

| xy_date=20191006/project_id=cardloan            |
| xy_date=20191006/project_id=cardloan_h5         |
| xy_date=20191006/project_id=pay_h5              |
| xy_date=20191006/project_id=test                |
| xy_date=20191006/project_id=wallet              |
| xy_date=20191006/project_id=wallet_h5           |
| xy_date=20191007/project_id=cardloan            |
| xy_date=20191007/project_id=cardloan_h5         |
| xy_date=20191007/project_id=pay_h5              |
| xy_date=20191007/project_id=test                |
| xy_date=20191007/project_id=wallet              |
| xy_date=20191007/project_id=wallet_h5           |

+-------------------------------------------------+--+
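
The same can be confirmed at the metastore level. A minimal sketch, assuming a MySQL-backed metastore whose schema database is named hive (DBS, TBLS and PARTITIONS are the standard metastore tables; connection details are placeholders):

# prompts for the metastore DB password; 'hive' as the schema name is an assumption
mysql -u hive -p hive -e "
SELECT p.PART_NAME
  FROM DBS d
  JOIN TBLS t       ON t.DB_ID  = d.DB_ID
  JOIN PARTITIONS p ON p.TBL_ID = t.TBL_ID
 WHERE d.NAME = 'xy_app'
   AND t.TBL_NAME = 'xy_log_collect_user_funnels'
   AND p.PART_NAME LIKE 'xy_date=20191007%';"

If rows come back here while hdfs dfs -ls fails, the mismatch is confirmed.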

2. Root cause

The Hive metastore's partition metadata is out of sync with the directory layout on HDFS: the metastore still records partitions under xy_date=20191007, but the corresponding directory has been deleted, so reads fail.
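
To enumerate every affected partition up front, rather than discovering them one failed query at a time, a minimal detection sketch (table name and warehouse path are taken from the logs above; the hive and hdfs CLIs are assumed to be on the PATH):

# list partitions the metastore knows about, then test each directory on HDFS
TABLE_DIR=/user/hive/warehouse/xy_app.db/xy_log_collect_user_funnels
for part in $(hive -S -e "show partitions xy_app.xy_log_collect_user_funnels"); do
  hdfs dfs -test -d "$TABLE_DIR/$part" || echo "missing on HDFS: $part"
done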

3. Solutions

3.1 Attempt 1: try repairing the partitions; the error persists

0: jdbc:hive2://10.0.194.131:10017> msck repair table xy_app.xy_log_collect_user_funnels;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (17.727 seconds)
0: jdbc:hive2://10.0.194.131:10017> select * from xy_app.xy_log_collect_user_funnels limit 10;
Error: java.lang.RuntimeException: serious problem (state=,code=0) 
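
This outcome is expected: msck repair table only discovers directories that exist on HDFS and adds their missing metastore entries; it never drops metastore entries whose directories are gone, which is exactly our situation. Hive 3.0 and later support the reverse direction explicitly (not available on the version used here, hence the attempts below):

# Hive 3.0+ only: drop metastore partitions whose HDFS directories are gone
hive -e "msck repair table xy_app.xy_log_collect_user_funnels drop partitions;"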

3.2 Attempt 2: drop the stale partition and then repair; the drop itself fails

0: jdbc:hive2://10.0.194.131:10017> alter table xy_app.xy_log_collect_user_funnels drop PARTITION(xy_date='20191007');
Error: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: File does not exist: /user/hive/warehouse/xy_app.db/xy_log_collect_user_funnels/xy_date=20191007
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getContentSummary(FSDirectory.java:2414)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getContentSummary(FSNamesystem.java:4653)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getContentSummary(NameNodeRpcServer.java:1101)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getContentSummary(AuthorizationProviderProxyClientProtocol.java:566)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getContentSummary(ClientNamenodeProtocolServerSideTranslatorPB.java:895)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
; (state=,code=0)

3.3 Attempt 3: since the stale partition cannot be dropped, recreate its directory first, then drop it

The drop in attempt 2 fails because, before removing a partition, Hive asks the NameNode for a content summary of the partition directory (the getContentSummary frames in the stack trace above), and that call throws when the directory does not exist. Recreating an empty directory is enough to let the drop proceed.

Create the HDFS directory for the stale partition:

hdfs dfs -mkdir /user/hive/warehouse/xy_app.db/xy_log_collect_user_funnels/xy_date=20191007

Then drop the stale partition:

hive> alter table xy_app.xy_log_collect_user_funnels drop if exists PARTITION(xy_date='20191007');
OK
Time taken: 0.085 seconds 

Finally, run the query again; it succeeds:

 hive> select visit_id from xy_app.xy_log_collect_user_funnels limit 10;
OK
833ED40B-E899-466B-A459-4834E2F315FB
20F5BB1F-5107-4557-A83D-0C5B8FA5AED8
5637D35D-3855-46F7-A4CB-E5AFAEC206F9
AFD17530-0C1F-4C90-AC01-A8CBB2910854
FD9E2680-3800-4470-B741-69755173A908
E7188AEC-8119-4C71-A7B4-AA7FD5CB95C4
55224461-14C5-40C4-92D5-6DE0CAE32F6E
F3158B3D-BDBB-410C-9E2C-CEAEAE18AD23
32CEA784-8B5D-4CBA-8C5D-97831D7DB7DF
CB98781B-A9A1-4110-AE5B-B617B6B1CB0E
Time taken: 2.047 seconds, Fetched: 10 row(s)

In most cases attempt 1 alone fixes this kind of problem, and attempt 2 covers most of the rest. Attempt 3 is a last resort; we never did find out who performed the improper operation on this table (most likely deleting the partition directory directly on HDFS, behind the metastore's back).
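
When many partitions are stale, attempt 3 can be scripted. A rough sketch under the same assumptions as the detection loop above, plus the assumption that partition values contain no commas or quotes (the partition spec is rebuilt with sed):

TABLE=xy_app.xy_log_collect_user_funnels
TABLE_DIR=/user/hive/warehouse/xy_app.db/xy_log_collect_user_funnels
for part in $(hive -S -e "show partitions $TABLE"); do
  if ! hdfs dfs -test -d "$TABLE_DIR/$part"; then
    # recreate an empty directory so the drop's content-summary call succeeds
    hdfs dfs -mkdir -p "$TABLE_DIR/$part"
    # "xy_date=20191007/project_id=pay_h5" -> xy_date='20191007',project_id='pay_h5'
    spec=$(echo "$part" | sed "s|/|,|g; s|=\([^,]*\)|='\1'|g")
    hive -e "alter table $TABLE drop if exists partition($spec);"
  fi
done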

 
