Hive row_number: usage pitfalls

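row_number is one of the most commonly used window functions in Hive. Its purpose is to number the rows within each partition according to a sort order.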

Scenario: when the ORDER BY value is duplicated across rows and row_number is used twice in the same query, the results may not be what you expect.

drop table if exists test.test_row_number;
create table test.test_row_number
as
select date_add(current_timestamp(),-10) as event_time,'1' as id, 'A' as a,'B' as b,'C' as c
union all select date_add(current_timestamp(),-10) as event_time,'1' as id, 'a' as a, null as b,null as c
union all select date_add(current_timestamp(),-10) as event_time,'1' as id,null as a, 'b' as b,null as c
union all select date_add(current_timestamp(),-9) as event_time,'1' as id,null as a, null as b,'c' as c
;

SELECT
*
from 
test.test_row_number
;

SELECT
event_time,id,a,b,c,
ROW_NUMBER() OVER (PARTITION by id order by event_time) as rn,
ROW_NUMBER() OVER (PARTITION by id order by event_time desc ) as back_rn
from
test.test_row_number
;

The result is:

[Figure 1: screenshot of the query result]
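
Three of the sample rows share the same event_time, so each ROW_NUMBER() call may break that tie in its own way, and back_rn is not guaranteed to mirror rn. One possible way around this, offered as a sketch rather than the only fix, is to make the ordering total with tiebreaker columns and then derive the reverse number from rn instead of calling ROW_NUMBER() a second time. The choice of a, b, c as tiebreakers is an assumption that holds for the sample data above. In a real table you would use whatever column combination is unique within each id.

```sql
-- Sketch: deterministic rn plus a reverse number derived from it.
-- Assumption: (event_time, a, b, c) is unique within each id,
-- which is true for the sample rows in test.test_row_number.
SELECT
event_time,id,a,b,c,
rn,
-- partition size minus rn plus 1 is always the exact mirror of rn
COUNT(*) OVER (PARTITION by id) - rn + 1 as back_rn
from
(
  SELECT
  event_time,id,a,b,c,
  -- a, b, c act as tiebreakers so rows with equal event_time get a stable order
  ROW_NUMBER() OVER (PARTITION by id order by event_time, a, b, c) as rn
  from
  test.test_row_number
) t
;
```

With this form, rn + back_rn equals the partition size plus one for every row, regardless of how many rows share the same event_time.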
