Processing JSON data in Hive

1、Sample data
Suppose the info table has two columns, id and content, where content holds a JSON string such as:
content={"resultCode":"0000","message":"处理成功","properties":[{"birthday":"1996-2-22","issued_by":"公安局","valid_date":"2017.03.19-2027.03.19","address":"XXXX","gender":"男","race":"汉","name":"Jack","id_card_number":"123456789012345678"}]}
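To follow along, the sample row can be loaded into a small test table. This is only a sketch: the article names the columns but not their types, so INT and STRING are assumptions, and the INSERT ... VALUES form assumes Hive 0.14 or later. The id value 1 is taken from the result shown at the end of the article.

```sql
-- Column types are assumptions; the article only gives the column names.
CREATE TABLE IF NOT EXISTS info (
  id      INT,
  content STRING
);

-- INSERT ... VALUES assumes Hive 0.14 or later.
INSERT INTO TABLE info VALUES
  (1, '{"resultCode":"0000","message":"处理成功","properties":[{"birthday":"1996-2-22","issued_by":"公安局","valid_date":"2017.03.19-2027.03.19","address":"XXXX","gender":"男","race":"汉","name":"Jack","id_card_number":"123456789012345678"}]}');
```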
2、Hive built-in functions used
get_json_object, json_tuple, regexp_replace
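As a quick orientation, each function can be tried on a literal before touching the table. The literals below are made up for illustration, and the FROM-less SELECT assumes Hive 0.13 or later; on older versions, select from any one-row table instead.

```sql
-- get_json_object(json_string, path): returns the value found at the path as a string.
SELECT get_json_object('{"a":{"b":"x"}}', '$.a.b');

-- json_tuple(json_string, key1, key2, ...): a table function that returns
-- several top-level keys of a JSON object in one call.
SELECT json_tuple('{"k1":"v1","k2":"v2"}', 'k1', 'k2') AS (c1, c2);

-- regexp_replace(string, pattern, replacement): replaces every match of the pattern.
-- Here the pattern matches a literal "[" or "]".
SELECT regexp_replace('[{"k":"v"}]', '\\[|\\]', '');
```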
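3、Explanation
Use get_json_object to parse content and extract properties, then use regexp_replace to strip the square brackets. This is needed because json_tuple reads keys from a JSON object and does not handle the array type. Finally, use json_tuple to pull the required fields out of properties. Note that stripping the brackets only yields a valid JSON object when the array holds a single object, as it does in the sample data.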
Example: get id, name and id_card_number
select id, t2.*
from (
  select id, regexp_replace(get_json_object(content,"$.properties"),'\\[|\\]','') properties
  from info
) t1
lateral view json_tuple(t1.properties,"name","id_card_number") t2 as name, id_card_number;
Result:
id    t2.name    t2.id_card_number
1     Jack       123456789012345678
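
When only the first element of the array is needed, a possible alternative is to address it directly in the path passed to get_json_object, which accepts a [] subscript for arrays. This is a sketch under the assumption that every row's properties array holds at least one object; it avoids the bracket stripping, which would also remove any "[" or "]" that appears inside a field value.

```sql
-- Assumption: properties[0] exists for every row, as in the sample data.
SELECT id,
       get_json_object(content, '$.properties[0].name')           AS name,
       get_json_object(content, '$.properties[0].id_card_number') AS id_card_number
FROM info;
```

The trade-off is that get_json_object parses the JSON string once per call, so json_tuple may still be the better fit when many fields are extracted from the same object.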
