Hive窗口函数之SUM、AVG、MIN和MAX

本文中介绍Hive中的sum、avg、min和max四种窗口函数。

1、数据准备

数据如下所示:

cookie1	2015-04-10	1
cookie1	2015-04-11	5
cookie1	2015-04-12	7
cookie1	2015-04-13	3
cookie1	2015-04-14	2
cookie1	2015-04-15	4
cookie1	2015-04-16	4
cookie2	2015-04-10	2
cookie2	2015-04-11	3
cookie2	2015-04-12	5
cookie2	2015-04-13	6
cookie2	2015-04-14	3
cookie2	2015-04-15	9
cookie2	2015-04-16	7

2、SUM函数

select cookieid, createtime,pv,
sum(pv) over(partition by cookieid order by createtime) as sumpv1,
sum(pv) over(partition by cookieid order by createtime rows between unbounded preceding and current row) as sumpv2,
sum(pv) over(partition by cookieid order by createtime rows between 3 preceding and current row) as sumpv3,
sum(pv) over(partition by cookieid order by createtime rows between 3 preceding and 1 following) as sumpv4,
sum(pv) over(partition by cookieid order by createtime rows between current row and unbounded following) as sumpv5
from cookies;

结果:

Hive窗口函数之SUM、AVG、MIN和MAX_第1张图片

2、AVG函数

SELECT cookieid,
createtime,
pv,
AVG(pv) OVER(PARTITION BY cookieid ORDER BY createtime) AS avgpv1,
AVG(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS avgpv2, 
AVG(pv) OVER(PARTITION BY cookieid) AS avgpv3,
AVG(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS avgpv4,
AVG(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING) AS avgpv5,    
AVG(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS avgpv6   
FROM cookies; 

结果:

Hive窗口函数之SUM、AVG、MIN和MAX_第2张图片

后面的MIN和MAX函数与前两种方式一样,在这里就不一一演示。

你可能感兴趣的:(Hive)