Hive Window Functions: GROUPING SETS, GROUPING__ID, CUBE, ROLLUP

Sample data in table cookies2 (columns: month, day, cookieid):

2015-03,2015-03-10,cookie1
2015-03,2015-03-10,cookie5
2015-03,2015-03-12,cookie7
2015-04,2015-04-12,cookie3
2015-04,2015-04-13,cookie2
2015-04,2015-04-13,cookie4
2015-04,2015-04-16,cookie4
2015-03,2015-03-10,cookie2
2015-03,2015-03-10,cookie3
2015-04,2015-04-12,cookie5
2015-04,2015-04-13,cookie6
2015-04,2015-04-15,cookie3
2015-04,2015-04-15,cookie2
2015-04,2015-04-16,cookie1

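1. GROUPING SETS

GROUPING SETS aggregates over several dimension combinations within a single GROUP BY query. It is equivalent to running one GROUP BY per combination and merging the result sets with UNION ALL. The GROUPING__ID column identifies which grouping set each result row belongs to.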

SELECT 
month,
day,
COUNT(DISTINCT cookieid) AS uv,
GROUPING__ID 
FROM cookies2 
GROUP BY month,day 
GROUPING SETS (month,day) 
ORDER BY GROUPING__ID;

This is equivalent to:

SELECT month,NULL,COUNT(DISTINCT cookieid) AS uv,1 AS GROUPING__ID FROM cookies2 GROUP BY month 
UNION ALL 
SELECT NULL,day,COUNT(DISTINCT cookieid) AS uv,2 AS GROUPING__ID FROM cookies2 GROUP BY day

Result:

[Figure 1: output of the GROUPING SETS query]
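
To make the equivalence concrete, here is a minimal Python sketch (not Hive code) that emulates the GROUPING SETS query on the sample rows. The helper name `grouping_sets_uv` and the in-memory `ROWS` list are illustrative assumptions, and GROUPING__ID is computed with the same numbering the equivalent UNION ALL query uses (month = 1, day = 2).

```python
# Sample rows from the article: (month, day, cookieid).
ROWS = [
    ("2015-03", "2015-03-10", "cookie1"),
    ("2015-03", "2015-03-10", "cookie5"),
    ("2015-03", "2015-03-12", "cookie7"),
    ("2015-04", "2015-04-12", "cookie3"),
    ("2015-04", "2015-04-13", "cookie2"),
    ("2015-04", "2015-04-13", "cookie4"),
    ("2015-04", "2015-04-16", "cookie4"),
    ("2015-03", "2015-03-10", "cookie2"),
    ("2015-03", "2015-03-10", "cookie3"),
    ("2015-04", "2015-04-12", "cookie5"),
    ("2015-04", "2015-04-13", "cookie6"),
    ("2015-04", "2015-04-15", "cookie3"),
    ("2015-04", "2015-04-15", "cookie2"),
    ("2015-04", "2015-04-16", "cookie1"),
]

DIMS = ("month", "day")  # the GROUP BY columns, in order


def grouping_sets_uv(rows, dims, grouping_sets):
    """Emulate COUNT(DISTINCT cookieid) with GROUPING SETS in plain Python.

    Returns tuples of (month, day, uv, grouping_id), one per output row.
    """
    results = []
    for gset in grouping_sets:
        # Numbering assumed from the article: the i-th GROUP BY column
        # contributes 2**i when it is part of the grouping set.
        grouping_id = sum(2 ** i for i, dim in enumerate(dims) if dim in gset)
        groups = {}
        for row in rows:
            record = dict(zip(dims, row))  # zip stops before cookieid
            # Columns outside the grouping set are reported as NULL (None).
            key = tuple(record[d] if d in gset else None for d in dims)
            groups.setdefault(key, set()).add(row[-1])
        ordered = sorted(groups.items(), key=lambda kv: [v or "" for v in kv[0]])
        for key, cookies in ordered:
            results.append(key + (len(cookies), grouping_id))
    return results


# One GROUP BY per grouping set, concatenated like UNION ALL.
for out_row in grouping_sets_uv(ROWS, DIMS, [("month",), ("day",)]):
    print(out_row)
```

On the sample data this yields a UV of 5 for 2015-03 and 6 for 2015-04 (GROUPING__ID 1), followed by one row per day (GROUPING__ID 2).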

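2. CUBE

CUBE aggregates over every combination of the GROUP BY dimensions. With n dimensions this produces 2^n grouping sets; for month and day these are (), (month), (day), and (month, day).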

SELECT 
month,
day,
COUNT(DISTINCT cookieid) AS uv,
GROUPING__ID 
FROM cookies2 
GROUP BY month,day 
WITH CUBE 
ORDER BY GROUPING__ID;

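This is equivalent to the query below. The GROUPING__ID values shown follow the numbering used by Hive before 2.3.0; from Hive 2.3.0 onward the values for the grand total and for the full (month, day) grouping are swapped (3 and 0 respectively), while 1 and 2 stay the same: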

SELECT NULL,NULL,COUNT(DISTINCT cookieid) AS uv,0 AS GROUPING__ID FROM cookies2
UNION ALL 
SELECT month,NULL,COUNT(DISTINCT cookieid) AS uv,1 AS GROUPING__ID FROM cookies2 GROUP BY month 
UNION ALL 
SELECT NULL,day,COUNT(DISTINCT cookieid) AS uv,2 AS GROUPING__ID FROM cookies2 GROUP BY day
UNION ALL 
SELECT month,day,COUNT(DISTINCT cookieid) AS uv,3 AS GROUPING__ID FROM cookies2 GROUP BY month,day

Result:

[Figure 2: output of the CUBE query]
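
As a rough illustration of which grouping sets CUBE expands to, the following Python sketch enumerates them and computes GROUPING__ID under two numbering schemes. The function names are hypothetical. `legacy_grouping_id` reproduces the values used in the UNION ALL query above; `standard_grouping_id` reflects my understanding of the Hive 2.3.0+ behavior (a bit is set when a column is aggregated away, with the first GROUP BY column as the most significant bit) and should be checked against the Hive version in use.

```python
from itertools import combinations


def cube_sets(dims):
    """All subsets of dims, smallest first: the grouping sets CUBE expands to."""
    return [combo
            for size in range(len(dims) + 1)
            for combo in combinations(dims, size)]


def legacy_grouping_id(dims, gset):
    # Numbering used in the article: column i contributes 2**i when present.
    return sum(2 ** i for i, dim in enumerate(dims) if dim in gset)


def standard_grouping_id(dims, gset):
    # Assumed Hive 2.3.0+ numbering: a bit is set when the column is NOT in
    # the grouping set; the first GROUP BY column is the highest bit.
    n = len(dims)
    return sum(2 ** (n - 1 - i) for i, dim in enumerate(dims) if dim not in gset)


DIMS = ("month", "day")
for gset in cube_sets(DIMS):
    print(gset, legacy_grouping_id(DIMS, gset), standard_grouping_id(DIMS, gset))
```

Because the two schemes disagree on some values, filtering on a hard-coded GROUPING__ID is version-sensitive; checking which dimension columns are NULL is one alternative when the data itself contains no NULLs.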

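3. ROLLUP

ROLLUP produces a subset of the CUBE grouping sets. It aggregates hierarchically: it starts from the full list of GROUP BY columns and drops columns one at a time from the right, so the leftmost dimension sits at the top of the hierarchy.

For example, hierarchical aggregation led by the month dimension: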
SELECT 
month,
day,
COUNT(DISTINCT cookieid) AS uv,
GROUPING__ID  
FROM cookies2 
GROUP BY month,day
WITH ROLLUP 
ORDER BY GROUPING__ID;
This supports the following roll-up path:
UV per month and day -> UV per month -> total UV

Result:

[Figure 3: output of the ROLLUP query grouped by month, day]

-- Swapping the order of month and day makes day the leading dimension of the hierarchy:
 
SELECT 
day,
month,
COUNT(DISTINCT cookieid) AS uv,
GROUPING__ID  
FROM cookies2 
GROUP BY day,month 
WITH ROLLUP 
ORDER BY GROUPING__ID;

This supports the following roll-up path:
UV per day and month -> UV per day -> total UV

Result:

[Figure 4: output of the ROLLUP query grouped by day, month]
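
The effect of column order on ROLLUP can be sketched in a few lines of Python. The helpers `rollup_sets` and `cube_sets` are hypothetical names used only for illustration; they list the grouping sets each clause expands to rather than running any aggregation.

```python
from itertools import combinations


def rollup_sets(dims):
    """Prefixes of dims, most detailed first: the grouping sets of ROLLUP."""
    return [tuple(dims[:size]) for size in range(len(dims), -1, -1)]


def cube_sets(dims):
    """All subsets of dims: the grouping sets of CUBE."""
    return [combo
            for size in range(len(dims) + 1)
            for combo in combinations(dims, size)]


for dims in (("month", "day"), ("day", "month")):
    sets = rollup_sets(dims)
    # Every ROLLUP grouping set also appears among the CUBE grouping sets.
    assert set(sets) <= set(cube_sets(dims))
    print(dims, "->", sets)
```

With (month, day) the single-column level is (month,); with (day, month) it is (day,). In both cases the other single-column grouping that CUBE would produce is absent, which is why ROLLUP returns fewer rows than CUBE over the same columns.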
