Regular expressions in Hive's regexp_replace and split functions

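In Hive, the second argument of regexp_replace is a regular expression, while the third argument is the replacement string. Regex metacharacters such as { and } therefore have to be escaped in the pattern but not in the replacement: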

select split(regexp_replace(data,'\\},\\{','}||{'),'\\|\\|')[0] as test
from
(select '[{"source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9"},{"source":"jd","monthSales":2090,"userCount":78981,"score":"9.8"},{"source":"jdmart","monthSales":6987,"userCount":1600,"score":"9.0"}]'
 as data) a
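
To look at the regexp_replace step on its own, here is a minimal sketch on a much shorter string. The input literal is made up for illustration and is not taken from the original data.

```sql
-- Hypothetical short input, used only for illustration.
-- 2nd argument: a regex, so } and { are escaped as \\} and \\{.
-- 3rd argument: the replacement text; }||{ contains nothing that needs escaping.
select regexp_replace('[{"a":1},{"b":2}]', '\\},\\{', '}||{') as replaced;
-- replaced: [{"a":1}||{"b":2}]
```

The || marker is inserted only so that the later split has an unambiguous delimiter between the JSON objects.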

split also interprets its delimiter (the second argument) as a regular expression.

So if the query above is written like this instead:

select split(regexp_replace(data,'\\},\\{','}||{'),'||')[0] as test  -- the split pattern is changed here
from
(select '[{"source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9"},{"source":"jd","monthSales":2090,"userCount":78981,"score":"9.8"},{"source":"jdmart","monthSales":6987,"userCount":1600,"score":"9.0"}]'
 as data) a

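it no longer returns the desired result. In a regular expression an unescaped | means alternation, so '||' is a pattern made of empty alternatives and matches the empty string at every position. The data is then split between individual characters rather than at the literal || delimiter.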
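
The difference is easier to see on a short string. The sketch below uses a made-up input, 'a||b', and adds a character-class variant as one possible alternative to backslash escaping; the column names are illustrative only.

```sql
-- Hypothetical short input, used only for illustration.
select
  split('a||b', '\\|\\|') as escaped,     -- delimiter is the literal text ||, giving ["a","b"]
  split('a||b', '[|][|]') as char_class,  -- same delimiter written with character classes, giving ["a","b"]
  split('a||b', '||')     as unescaped;   -- empty alternatives: the string is cut apart character by character
```

Inside a character class | is an ordinary character, so '[|][|]' avoids the double backslashes, which some readers may find easier to check by eye.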

Reference: https://blog.csdn.net/longshenlmj/article/details/49027145
