The data to be processed looks like this:
{
  "IP": "192.168.1.1",
  "appName": "sichuan_yunyingyong",
  "customEvent": [
    {
      "eventName": "xx1",
      "du": "xx",
      "timestamp": "1480521763049",
      "eventParams": {
        "ContentID": "yixiuge",
        "account": "13856976635",
        "networkType": "WIFI",
        "result": "0",
        "type": "11"
      }
    },
    {
      "eventName": "xx2",
      "du": "xx",
      "timestamp": "1480521763049",
      "eventParams": {
        "ContentID": "yixiuge",
        "account": "13856976636",
        "networkType": "WIFI",
        "result": "0",
        "type": "11"
      }
    }
  ]
}
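posexplode(array) turns an array into one row per element, each row being a (pos, json) pair of the element's position and the element itself: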
select
lateral view posexplode(split(regexp_replace(regexp_replace(j1.j1_customEvent,'\\}\\,\\{','\\}\\|\\|\\{'),'\\[|\\]',''), '\\|\\|')) j2 as j2_customEvents_pos, j2_customEvent_json
The result is:
192.168.1.1 sichuan_yunyingyong {"eventName":"xx1","du":"xx","timestamp":"1480521763049","eventParams":{"ContentID":"yixiuge","account":"13856976635","networkType":"WIFI","result":"0","type":"11"}}
192.168.1.1 sichuan_yunyingyong {"eventName":"xx2","du":"xx","timestamp":"1480521763049","eventParams":{"ContentID":"yixiuge","account":"13856976636","networkType":"WIFI","result":"0","type":"11"}}
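Here the JSON array is first rewritten by substitution into the form {json1}||{json2}, then the array's square brackets are removed, and finally the string is split on || to produce an array with two elements. posexplode then turns that array into (pos, json) rows, where pos records the element's position and json is the actual JSON data, so one input record becomes two rows.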
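The statement above is shown only in part, so the following is a minimal sketch of what a complete query might look like, not the author's exact statement. The table `raw_events` and its column `line` are hypothetical, and the use of `json_tuple` to produce the `j1` aliases is an assumption inferred from the name `j1.j1_customEvent`; the `posexplode` expression is copied unchanged from the fragment above.

```sql
-- Hypothetical source table: raw_events(line STRING), one JSON document per row.
SELECT
  j1.j1_ip,
  j1.j1_appName,
  j2.j2_customEvents_pos,   -- 0-based position of the element in the array
  j2.j2_customEvent_json    -- one element of customEvent, as a JSON string
FROM raw_events t
-- Assumed step: pull the top-level fields out of the raw JSON string.
LATERAL VIEW json_tuple(t.line, 'IP', 'appName', 'customEvent') j1
  AS j1_ip, j1_appName, j1_customEvent
-- Rewrite "},{" to "}||{", strip "[" and "]", split on "||", then explode.
LATERAL VIEW posexplode(
  split(
    regexp_replace(
      regexp_replace(j1.j1_customEvent, '\\}\\,\\{', '\\}\\|\\|\\{'),
      '\\[|\\]', ''),
    '\\|\\|')
) j2 AS j2_customEvents_pos, j2_customEvent_json;
```

Because the second `regexp_replace` removes every `[` and `]`, and the first one rewrites every `},{`, this approach only works when the array elements contain no nested arrays and no `},{` sequence of their own, as is the case for the sample data.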
With that in place, getting IP, appName and account is straightforward:
select
192.168.1.1 sichuan_yunyingyong 13856976636
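The final statement is truncated in the text, so here is a hedged sketch of how the three fields could be selected once each array element is on its own row. As before, `raw_events`, `line` and the `json_tuple` step are assumptions made for illustration; `get_json_object` is a built-in Hive function, used here with the path `$.eventParams.account` to reach the nested field.

```sql
-- Hypothetical source table: raw_events(line STRING), one JSON document per row.
SELECT
  j1.j1_ip      AS ip,
  j1.j1_appName AS app_name,
  -- Read the nested account field from each exploded element.
  get_json_object(j2.j2_customEvent_json, '$.eventParams.account') AS account
FROM raw_events t
LATERAL VIEW json_tuple(t.line, 'IP', 'appName', 'customEvent') j1
  AS j1_ip, j1_appName, j1_customEvent
LATERAL VIEW posexplode(
  split(
    regexp_replace(
      regexp_replace(j1.j1_customEvent, '\\}\\,\\{', '\\}\\|\\|\\{'),
      '\\[|\\]', ''),
    '\\|\\|')
) j2 AS j2_customEvents_pos, j2_customEvent_json;
```

Under these assumptions the query should return one row per element of customEvent, each carrying the shared IP and appName alongside that element's own account.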