Reading complex data types in Hive (Array, Map, Struct)

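1. Array
Data file hive_array.txt. An array such as Array(1,2,3,4) holds elements that all share the same data type. In the file, the two columns are separated by a tab and the array elements by commas.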

zhnagsan        PEK,SHA,HAK,NKG
lisi    CTU,CKG,XIY,CSX

Create the table:

create table hive_array(name string, work_locations array<string>)
row format delimited fields terminated by '\t'
COLLECTION ITEMS TERMINATED BY ',';

Load the data: load data local inpath '/home/hadoop/testdata/hive_array.txt' into table hive_array;
Query the data (select * from hive_array):

+------------------+----------------------------+--+
| hive_array.name  | hive_array.work_locations  |
+------------------+----------------------------+--+
| zhnagsan         | ["PEK","SHA","HAK","NKG"]  |
| lisi             | ["CTU","CKG","XIY","CSX"]  |
+------------------+----------------------------+--+
2 rows selected (0.233 seconds)
0: jdbc:hive2://hadoop002:10000/test> 

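Query the first element of the array column: work_locations[index], where index starts at zero. The queries after it use size() to count the elements and array_contains() to filter on a value.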

0: jdbc:hive2://hadoop002:10000/test> select name,work_locations[0] from hive_array;

+-----------+------+--+
|   name    | _c1  |
+-----------+------+--+
| zhnagsan  | PEK  |
| lisi      | CTU  |
+-----------+------+--+
2 rows selected (0.103 seconds)
0: jdbc:hive2://hadoop002:10000/test> select name,size(work_locations) from hive_array;

+-----------+------+--+
|   name    | _c1  |
+-----------+------+--+
| zhnagsan  | 4    |
| lisi      | 4    |
+-----------+------+--+
2 rows selected (0.096 seconds)
0: jdbc:hive2://hadoop002:10000/test> select name from hive_array where  array_contains(work_locations,'PEK');

+-----------+--+
|   name    |
+-----------+--+
| zhnagsan  |
+-----------+--+
1 row selected (0.075 seconds)
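
The queries above read one element, the element count, or a membership test. To turn each array element into its own row, Hive provides the explode() function together with LATERAL VIEW. The following is a minimal sketch against the hive_array table defined above; the aliases t and loc are arbitrary names chosen for illustration and do not appear in the original.

```sql
-- Flatten the array: one output row per (name, location) pair.
-- "t" and "loc" are illustrative aliases, not part of the original table.
SELECT name, loc
FROM hive_array
LATERAL VIEW explode(work_locations) t AS loc;
```

With the two sample rows, each holding four locations, this should return eight rows.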

2. Map
Data file hive_map.txt

1,zhangsan,father:laozhang#mother:xiaohuang#brother:xiaohu,55
2,lisi,father:laoli#mother:xiaoxu#brother:xiaoqiang,44
3,wangwu,father:laowang#mother:xiaoliu#sister:wanghua,33
4,zhaoliu,father:xiaozhao#mother:xiaozhu,22

Create the table and load the data

hive (test)> create table hive_map(id int,name string, members map<string,string>, age int)
           > row format delimited fields terminated by ','
           > COLLECTION ITEMS TERMINATED BY '#'
           > MAP KEYS TERMINATED BY ':';
OK
Time taken: 0.305 seconds
hive (test)> load data local inpath '/home/hadoop/testdata/hive_map.txt' into table hive_map;
Loading data to table test.hive_map
Table test.hive_map stats: [numFiles=1, totalSize=218]
OK
Time taken: 0.398 seconds

Query the data

hive (test)> select * from hive_map;
OK
hive_map.id	hive_map.name	hive_map.members	hive_map.age
1	zhangsan	{"father":"laozhang","mother":"xiaohuang","brother":"xiaohu"}	55
2	lisi	{"father":"laoli","mother":"xiaoxu","brother":"xiaoqiang"}	44
3	wangwu	{"father":"laowang","mother":"xiaoliu","sister":"wanghua"}	33
4	zhaoliu	{"father":"xiaozhao","mother":"xiaozhu"}	22
Time taken: 0.2 seconds, Fetched: 4 row(s)
hive (test)> select id,name,age,members["father"],members["sister"] from hive_map;
OK
id	name	age	_c3	_c4
1	zhangsan	55	laozhang	NULL
2	lisi	44	laoli	NULL
3	wangwu	33	laowang	wanghua
4	zhaoliu	22	xiaozhao	NULL
Time taken: 0.096 seconds, Fetched: 4 row(s)
hive (test)> select id,map_keys(members),size(members) from hive_map;
OK
id	_c1	_c2
1	["father","mother","brother"]	3
2	["father","mother","brother"]	3
3	["father","mother","sister"]	3
4	["father","mother"]	2
Time taken: 0.046 seconds, Fetched: 4 row(s)
hive (test)> select id,name,age,members from hive_map where array_contains(map_keys(members),'brother');
OK
id	name	age	members
1	zhangsan	55	{"father":"laozhang","mother":"xiaohuang","brother":"xiaohu"}
2	lisi	44	{"father":"laoli","mother":"xiaoxu","brother":"xiaoqiang"}
Time taken: 0.054 seconds, Fetched: 2 row(s)
hive (test)> 
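
The transcript above looks up values by key and lists keys with map_keys(). As a complement, the sketch below uses two other Hive built-ins: map_values() to list the values, and explode() with LATERAL VIEW to emit one row per key/value pair. The aliases m, relation, and member_name are assumptions made for this example.

```sql
-- List only the values stored in the map column.
SELECT id, map_values(members)
FROM hive_map;

-- One output row per key/value pair of the map.
-- "m", "relation" and "member_name" are illustrative aliases.
SELECT id, name, relation, member_name
FROM hive_map
LATERAL VIEW explode(members) m AS relation, member_name;
```

For the four sample rows (3 + 3 + 3 + 2 entries), the second query should produce eleven rows.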

3. Struct
Data file hive_struct.txt

192.168.1.1#zhangsan:40
192.168.1.2#lisi:30
192.168.1.3#wangwu:20

Create the table and load the data

hive (test)> create table hive_struct(ip string, userinfo struct<name:string,age:int>)
           > row format delimited fields terminated by '#'
           > COLLECTION ITEMS TERMINATED BY ':';
OK
Time taken: 0.068 seconds
hive (test)> load data local inpath '/home/hadoop/testdata/hive_struct.txt' into hive_struct;
FAILED: ParseException line 1:68 missing TABLE at 'hive_struct' near ''
hive (test)> load data local inpath '/home/hadoop/testdata/hive_struct.txt' into table hive_struct;
Time taken: 0.172 seconds

Query the data

hive (test)> select * from hive_struct;
OK
hive_struct.ip	hive_struct.userinfo
192.168.1.1	{"name":"zhangsan","age":40}
192.168.1.2	{"name":"lisi","age":30}
192.168.1.3	{"name":"wangwu","age":20}
Time taken: 0.044 seconds, Fetched: 3 row(s)

hive (test)> select ip,userinfo.name,userinfo.age from hive_struct;
OK
ip	name	age
192.168.1.1	zhangsan	40
192.168.1.2	lisi	30
192.168.1.3	wangwu	20
Time taken: 0.032 seconds, Fetched: 3 row(s)
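
Struct fields accessed with dot notation can also be used in a WHERE clause, not only in the select list. A minimal sketch, assuming the hive_struct table and the sample data shown above (the threshold 30 is an arbitrary value chosen for illustration):

```sql
-- Filter on a struct field; dot notation works in WHERE as well.
SELECT ip, userinfo.name
FROM hive_struct
WHERE userinfo.age >= 30;
```

With the three sample rows, this should return the rows for zhangsan (40) and lisi (30).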



 
