Hive 窗口函数

测试表数据

user_login 表

该表表示,用户登录的日期


image.png

user_name 表

该表记录 id 和对应的姓名


image.png

over() 函数

所谓窗口函数,就是在sql 执行到每一行的时候,都会有的一个操作 ,over() 窗口函数,需要和聚合函数连用

快速入门

先看一个简单的例子
select id, name, max(id) over() as max_id from user_name;
结果如下

id  name    max_id
5   e           5
4   d           5
3   c           5
2   b           5
1   a           5

over() 不加任何限制的时候,窗口大小默认是全部行数据,也就是说,遍历到每一行数据时,都会有个窗口,包含全部的数据,外面结合聚合函数 max() ,每次取到的 都是 5

开始给窗口加上条件

partition by XXX sort by XXX

使用 partition by 的窗口

窗口函数实战

获取用户连续登陆的天数

SELECT b.uid, MAX(c)
FROM (
    SELECT a.uid AS uid, a.dt1 AS dt2, COUNT(*) AS c
    FROM (
        SELECT uid AS uid, date_sub(dt, row_number() OVER (PARTITION BY uid ORDER BY dt)) AS dt1
        FROM user_login
    ) a
    GROUP BY a.uid, a.dt1
) b
GROUP BY b.uid;

将表的奇数行和偶数行互换,如果最后一行是奇数行,最后一行不变

select 
  if(old_id %2==1, follow_id, pre_id) as new_id, 
  old_name  
  from (
    select 
      id as old_id, 
      name as old_name, 
      min(id) over(order by id range between 1 preceding and 1 following) as pre_id, 
      max(id) over(order by id range between 1 preceding and 1 following) as follow_id 
     from user_name)a;
-- mysql 写法  
select 
    case when id <> (select max(id) from user_name) and mod(id, 2)!=0 
             then id+1
         when id = (select max(id) from user_name) and mod(id, 2)!=0 
             then id
    else id-1 end as id, 
    name
from user_name
order by id

你可能感兴趣的:(Hive 窗口函数)