Hive: using group by with cube and grouping sets

group by with cube

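This approach builds a cube over every grouping dimension. In the subtotal rows, each dimension that has been rolled up is automatically set to NULL. Wrapping the dimension in coalesce(xxxx,'all') in the select list turns that NULL into 'all', so a row marked 'all' holds the total for that field.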

select
  coalesce(page_city_name,'all') as page_city_name,
  coalesce(item_index,'all') as item_index,
  count(distinct if(partition_event_type='view',device_id,null)) as view_uv,
  count(distinct if(partition_event_type='click',device_id,null)) as click_uv
from
    (
        -- the subquery must expose every column the outer query uses
        select
             device_id,
             city_name as page_city_name,
             item_index,
             partition_event_type
         from xxxxxxxxx
         where partition_date='2019-09-10'
         and partition_event_type in ('click','view')
    ) t
group by
  page_city_name,
  item_index
with cube
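
For reference, Hive treats with cube as shorthand for a grouping sets clause that lists every combination of the grouping columns. The sketch below spells out that equivalent clause for the two dimensions used above. The table name t and the uv measure are hypothetical placeholders standing in for the subquery and its metrics, and item_index is assumed to be a string column.

```sql
-- Sketch only: `t` and `uv` are hypothetical names standing in for the subquery above.
-- "group by page_city_name, item_index with cube" is equivalent to:
select
  coalesce(page_city_name,'all') as page_city_name,
  coalesce(item_index,'all') as item_index,
  count(distinct device_id) as uv
from t
group by
  page_city_name,
  item_index
grouping sets(
  (page_city_name, item_index),  -- detail rows
  (page_city_name),              -- subtotal per city; item_index comes back as NULL
  (item_index),                  -- subtotal per item; page_city_name comes back as NULL
  ()                             -- grand total; both columns come back as NULL
)
```

With n grouping columns the cube produces 2^n grouping sets, so it is worth listing the sets explicitly when only a few of the combinations are needed.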

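Note that the group by clause itself must not use coalesce(page_city_name,'all'). If it does, the cube rolls up the coalesce expression itself, and the subtotal rows still come out as NULL. Group by the plain columns and apply coalesce only in the select list.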

Correct (the NULLs in the subtotal rows are replaced by the coalesce in the select list):

group by
  page_city_name,
  item_index

Incorrect (the subtotal rows are still NULL):

group by
  coalesce(page_city_name,'all'),
  coalesce(item_index,'all')

grouping sets

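grouping sets is comparatively simple to use. It is equivalent to running a separate group by for each listed set of columns and then combining the results with union all; columns that are not part of a given set are returned as NULL. For example: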

group by 
       module,
       origin,
       page_city_name,
       page_city_rank
grouping sets(
       (module),
       (module,origin),
       (module,page_city_rank,page_city_name),
       (module,origin,page_city_rank,page_city_name)
    )
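
To make the union all equivalence concrete, here is a minimal sketch that uses only the first two grouping sets from the example above. The table name t and the count(*) as cnt measure are assumptions for illustration, and origin is assumed to be a string column.

```sql
-- Sketch only: `t` and `cnt` are hypothetical names.
-- Form 1: grouping sets
select module, origin, count(*) as cnt
from t
group by module, origin
grouping sets(
  (module),
  (module, origin)
);

-- Form 2: the same rows (row order aside) written as one group by per set,
-- combined with union all; the column missing from a set is filled with NULL
select module, cast(null as string) as origin, count(*) as cnt
from t
group by module
union all
select module, origin, count(*) as cnt
from t
group by module, origin;
```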
