percent_rank函数以0到1之间的分数形式返回某个值在数据分区中的排名 。 个人感觉适用场景较少
建表:
create table SMALL_CUSTOMERS(CUSTOMER_ID NUMBER,SUM_ORDERS NUMBER); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (1000, 10); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (1000, 20); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (1000, 30); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (800, 5); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (800, 10); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (800, 1);
具体用法:
select customer_id, sum_orders, percent_rank() over(order by sum_orders) percentRank from small_customers t;
percentile_cont:
percentile_cont函数对于计算内插值是非常有用的。percentile_cont函数接收一个0到1之间的几率值并返回与声明了排序的percent_rank函数计算值相等的内插值百分比。
语法:percentile_con(expr) within group(sort-clause) over(partition-clause order-by-clause)
具体用法:
select customer_id, sum_orders, percent_rank() over(order by sum_orders) percentRank, percentile_cont(0.4) within group(order by sum_orders) over(partition by customer_id) percentileCont from small_customers t;
percentile_disc:
percentile_disc函数在功能上类似于percentile_cont函数,只是percentile_cont函数使用了连续分布模型,而percentile_disc函数使用了离期 分布模型。当没有值与指定的percent_rank精确匹配的时候,percentile_cont(0.5)会计算两个离得最近的值的平均值。相反,在升序排列的情况下,percentile_disc函数只取比所传递的参数percent_rank值更大的值。在降序排列的时候,percentile_disc函数只取比所传递的参数percent_rank值更小的值。
具体用法:
select customer_id, sum_orders, percent_rank() over(order by sum_orders) percentRank, percentile_cont(0.4) within group(order by sum_orders) over(partition by customer_id) percentileCont, percentile_disc(0.4) within group(order by sum_orders) over(partition by customer_id) percentileDisc from small_customers t;