hivesql窗口函数

一、累计计算窗口函数

sum(field) over(partition by field order by field asc/desc rows between … and …)

avg(field) over(partition by field order by field asc/desc rows between … and …)

max(field) over(partition by field order by field asc/desc rows between … and …)

min(field) over(partition by field order by field asc/desc rows between … and …)

rows between unbounded preceding and current row-表示包含本行和之前所有行

rows between current row and unbounded following-表示包含本行和之后所有行

rows between 3 preceding and 1 following-表示包含前三行和后一行(共计五行)

默认rows between unbounded preceding and unbounded following-表示不限制行数

说明:field表示字段;partition by起到分组作用;order by起到排序作用,升序asc/降序desc,默认asc

二、分区排序窗口函数

row_number() over(partition by field order by field asc/desc)

rank() over(partition by field order by field asc/desc)

dense_rank() over(partition by field order by field asc/desc)

row_number会为查询的每一行记录生成一个序号,连续且不重复

rank生成的序号有可能不连续,跳跃排序,出现相同排名时,下一次rank值为相关行之前的行数+1

dense_rank生成的序号连续,出现相同排名时,下一次rank值仍跟着上一次的rank值

说明:field表示字段

三、分组排序窗口函数

ntile(n) over(partition by field order by field asc/desc)

说明:field表示字段;ntile(n)用于将分组数据按照顺序切成n片,返回当前切片值,如果切片不均匀,默认增加第一个切片的分布

四、偏移分析窗口函数

lag(exp_str,offset,defval) over(partition by field order by field asc/desc)

lead(exp_str,offset,defval) over(partition by field order by field asc/desc)

lag表示从当前行向上偏移,lead表示从当前行向下偏移。

说明:field表示字段;exp_str是字段名称;offset是偏移量,向上或向下偏移行数,默认值为1;defval是默认值,当从当前行位置向上或向下偏移offset行已经超出了表的范围,则返回defval值,如果未指定defval值,则返回null。

你可能感兴趣的:(hivesql,hive,sql)