Defining delimiters in Hive with octal ASCII codes

Problem description:
While running a job with Azkaban today, I ran into the following error:

14-11-2021 15:50:00 CST analysis INFO - MismatchedTokenException(24!=347)
14-11-2021 15:50:00 CST analysis INFO - 	at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
14-11-2021 15:50:00 CST analysis INFO - 	at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.parse.HiveParser.cteStatement(HiveParser.java:36027)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.parse.HiveParser.withClause(HiveParser.java:35886)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:35700)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2284)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1333)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
14-11-2021 15:50:00 CST analysis INFO - 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
14-11-2021 15:50:00 CST analysis INFO - 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
14-11-2021 15:50:00 CST analysis INFO - 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
14-11-2021 15:50:00 CST analysis INFO - 	at java.lang.reflect.Method.invoke(Method.java:498)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
14-11-2021 15:50:00 CST analysis INFO - 	at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
14-11-2021 15:50:00 CST analysis INFO - FAILED: ParseException line 1:75 mismatched input 'Nov' expecting ) near 'Sun' in statement
14-11-2021 15:50:01 CST analysis INFO - Process completed unsuccessfully in 52 seconds.
14-11-2021 15:50:01 CST analysis ERROR - Job run failed!

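One of the jobs processes data and writes the result into a table created in advance, but the insert failed.
The table was created with this statement: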

create table user_info(active_num string,`date` string)
row format delimited fields terminated by '\t' ;

The cause of the problem is that delimiters in Hive are expressed as octal ASCII codes.
The codes are listed in the table below:

Octal Hex Decimal Char Octal Hex Decimal Char
00 00 0 nul 100 40 64 @
01 01 1 soh 101 41 65 A
02 02 2 stx 102 42 66 B
03 03 3 etx 103 43 67 C
04 04 4 eot 104 44 68 D
05 05 5 enq 105 45 69 E
06 06 6 ack 106 46 70 F
07 07 7 bel 107 47 71 G
10 08 8 bs 110 48 72 H
11 09 9 ht 111 49 73 I
12 0a 10 nl 112 4a 74 J
13 0b 11 vt 113 4b 75 K
14 0c 12 ff 114 4c 76 L
15 0d 13 cr 115 4d 77 M
16 0e 14 so 116 4e 78 N
17 0f 15 si 117 4f 79 O
20 10 16 dle 120 50 80 P
21 11 17 dc1 121 51 81 Q
22 12 18 dc2 122 52 82 R
23 13 19 dc3 123 53 83 S
24 14 20 dc4 124 54 84 T
25 15 21 nak 125 55 85 U
26 16 22 syn 126 56 86 V
27 17 23 etb 127 57 87 W
30 18 24 can 130 58 88 X
31 19 25 em 131 59 89 Y
32 1a 26 sub 132 5a 90 Z
33 1b 27 esc 133 5b 91 [
34 1c 28 fs 134 5c 92 \
35 1d 29 gs 135 5d 93 ]
36 1e 30 rs 136 5e 94 ^
37 1f 31 us 137 5f 95 _
40 20 32 sp 140 60 96 `
41 21 33 ! 141 61 97 a
42 22 34 " 142 62 98 b
43 23 35 # 143 63 99 c
44 24 36 $ 144 64 100 d
45 25 37 % 145 65 101 e
46 26 38 & 146 66 102 f
47 27 39 ' 147 67 103 g
50 28 40 ( 150 68 104 h
51 29 41 ) 151 69 105 i
52 2a 42 * 152 6a 106 j
53 2b 43 + 153 6b 107 k
54 2c 44 , 154 6c 108 l
55 2d 45 - 155 6d 109 m
56 2e 46 . 156 6e 110 n
57 2f 47 / 157 6f 111 o
60 30 48 0 160 70 112 p
61 31 49 1 161 71 113 q
62 32 50 2 162 72 114 r
63 33 51 3 163 73 115 s
64 34 52 4 164 74 116 t
65 35 53 5 165 75 117 u
66 36 54 6 166 76 118 v
67 37 55 7 167 77 119 w
70 38 56 8 170 78 120 x
71 39 57 9 171 79 121 y
72 3a 58 : 172 7a 122 z
73 3b 59 ; 173 7b 123 {
74 3c 60 < 174 7c 124 |
75 3d 61 = 175 7d 125 }
76 3e 62 > 176 7e 126 ~
77 3f 63 ? 177 7f 127 del

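If the character you need is not in the table, you can work out its code with a base converter:
https://www.bejson.com/convert/jinzhi/
For example, looking up the horizontal tab (ht) gives decimal 9, which is 011 in octal.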
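The same conversion can also be done locally instead of on a website. Here is a minimal Python sketch; the helper name `hive_octal_escape` is made up for illustration and is not part of Hive or any library. It assumes the delimiter is a single ASCII character and returns the three-digit, backslash-prefixed form used in the table definition.

```python
def hive_octal_escape(ch: str) -> str:
    """Return the three-digit octal escape (such as \\011) for one ASCII character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    code = ord(ch)  # decimal ASCII code, e.g. 9 for a horizontal tab
    if code > 127:
        raise ValueError("only ASCII characters (0-127) are supported")
    # "03o" formats the number in octal, zero-padded to three digits
    return "\\" + format(code, "03o")


print(hive_octal_escape("\t"))    # \011
print(hive_octal_escape(","))     # \054
print(hive_octal_escape("\x01"))  # \001
```

The results agree with the table: tab is 011 and the comma is 054.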
Recreate the table:

create table user_info(active_num string,`date` string) row format delimited fields terminated by '\011';
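If several tables need the same treatment, the statement can be generated rather than typed by hand. The following is only a sketch under my own assumptions: `octal_escape` and `create_table_ddl` are hypothetical helper names, column names are always wrapped in backticks, and the script just builds the statement text without talking to Hive.

```python
def octal_escape(ch: str) -> str:
    """Three-digit octal escape for a single ASCII character, e.g. tab -> \\011."""
    if len(ch) != 1 or ord(ch) > 127:
        raise ValueError("expected one ASCII character")
    return "\\" + format(ord(ch), "03o")


def create_table_ddl(table: str, columns: list, delimiter: str) -> str:
    """Build a 'row format delimited' create-table statement as plain text."""
    # Quote every column name with backticks so reserved words like date are safe
    cols = ", ".join(f"`{name}` {col_type}" for name, col_type in columns)
    escaped = octal_escape(delimiter)
    return (
        f"create table {table}({cols}) "
        f"row format delimited fields terminated by '{escaped}';"
    )


ddl = create_table_ddl(
    "user_info",
    [("active_num", "string"), ("date", "string")],
    "\t",
)
print(ddl)
```

The printed statement ends with `fields terminated by '\011';`, the same delimiter clause as the statement above.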

A little progress every day.
