SQL Window Functions: Ranking Window Functions

For the basics of window functions, see the article "SQL Window Functions".

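Ranking window functions return the rank of each row within its group. The common ranking window functions are:

  1. ROW_NUMBER assigns a sequence number, starting from 1, to each row in the partition.
  2. RANK returns the rank of the current row within the partition. Rows with equal values share a rank, and the ranks that follow skip ahead, leaving gaps.
  3. DENSE_RANK returns the rank of the current row within the partition. Rows with equal values share a rank, and the ranks that follow stay consecutive.
  4. PERCENT_RANK returns the rank of the current row within the partition as a percentage. Like RANK, it leaves gaps after ties.
  5. CUME_DIST computes the cumulative distribution of the current row within the partition.
  6. NTILE splits the rows of the partition into N groups of equal size, as far as possible, and returns the number of the group that contains the current row.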

Ranking window functions do not accept a dynamic window size (frame) option; they always analyze the entire partition as the window.

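Case studies

Sample tables used in the examples

The queries below use two tables. The employee table stores basic employee information, including name, hire date, department ID, and salary. Here is part of its data: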

[Image 1: partial data of the employee table]

The department table records department information: the department ID and the department name. Here is its data:

[Image 2: data of the department table]

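The initialization script for both tables is at the end of this article.

1. Ranking within groups

The following query uses four ranking functions to compute each employee's monthly-salary rank within their department: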

-- Column aliases: 部门名称 = department name, 姓名 = name, 月薪 = monthly salary
SELECT d.dept_name AS "部门名称", e.emp_name AS "姓名", e.salary AS "月薪",
       ROW_NUMBER() OVER(
         PARTITION BY e.dept_id ORDER BY e.salary DESC
       ) AS "row_number",
       RANK() OVER(
         PARTITION BY e.dept_id ORDER BY e.salary DESC
       ) AS "rank",
       DENSE_RANK() OVER(
         PARTITION BY e.dept_id ORDER BY e.salary DESC
       ) AS "dense_rank",
       PERCENT_RANK() OVER(
         PARTITION BY e.dept_id ORDER BY e.salary DESC
       ) AS "percent_rank"
FROM employee e
JOIN department d ON e.dept_id=d.dept_id;

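All four window functions use the same OVER clause: PARTITION BY partitions the rows by department, and ORDER BY sorts them by monthly salary from highest to lowest. The query returns the following result: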

[Image 3: result of the ranking query]

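Take the "研发部" (R&D) department as an example. ROW_NUMBER assigns each employee a consecutive number: "廖化" and "张苞" earn the same monthly salary but receive different numbers. Which of the two gets the lower number is not determined by the ORDER BY clause.

RANK returns a rank for each employee: "廖化" and "张苞" both rank 6, and "赵统", who comes after them, ranks 8, so rank 7 is skipped.

DENSE_RANK also returns a rank for each employee: "廖化" and "张苞" both rank 6, and "赵统" ranks 7, with no gap.

PERCENT_RANK expresses the rank as a fraction between 0 and 1, computed as (rank - 1) / (rows in partition - 1). With 9 employees in the department, "赵统" gets (8 - 1) / (9 - 1) = 0.875, so the gap left by RANK carries over.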
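
The arithmetic behind these four columns can be reproduced outside the database. The following is a minimal Python sketch, not how Oracle computes the result; the function name rank_rows is hypothetical, and the salaries are the R&D rows from the sample script at the end of this article. Ties are numbered here in input order, whereas the database may number them in either order.

```python
# Monthly salaries of the R&D department, copied from the sample data.
salaries = {
    "赵云": 15000, "周仓": 8000, "关兴": 7000, "关平": 6800, "赵氏": 6600,
    "廖化": 6500, "张苞": 6500, "赵统": 6000, "马岱": 5800,
}


def rank_rows(values):
    """Return (name, value, row_number, rank, dense_rank, percent_rank)
    tuples, ordered by value from highest to lowest."""
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    n = len(ordered)
    result = []
    rank = dense = 0
    prev = None
    for row_number, (name, value) in enumerate(ordered, start=1):
        if value != prev:
            rank = row_number  # RANK jumps to the current position
            dense += 1         # DENSE_RANK advances by exactly one
            prev = value
        # PERCENT_RANK = (rank - 1) / (rows - 1); 0 for a single-row partition
        percent = (rank - 1) / (n - 1) if n > 1 else 0.0
        result.append((name, value, row_number, rank, dense, percent))
    return result


rows = rank_rows(salaries)
for row in rows:
    print(row)
```

In the printed rows, "廖化" and "张苞" share rank 6 and dense rank 6, while "赵统" has rank 8, dense rank 7, and percent rank 0.875, matching the description above.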

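2. Top-N rankings

Ranking window functions also make per-group Top-N lists possible. For example, the following statement finds the 2 earliest hires in each department: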

-- Column aliases: 入职日期 = hire date, 入职顺序 = hire order
SELECT dept_name AS "部门名称", emp_name AS "姓名",
       to_char(hire_date,'yyyy-mm-dd hh24:mi:ss') AS "入职日期",
       rn AS "入职顺序"
FROM (
  SELECT d.dept_name, e.emp_name, e.hire_date,
         ROW_NUMBER() OVER(
           PARTITION BY e.dept_id ORDER BY e.hire_date
         ) AS rn
  FROM employee e
  JOIN department d ON e.dept_id=d.dept_id
)
WHERE rn <=2;

Query result:

[Image 4: result of the Top-N query]
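
The same "number the rows per partition, then filter" pattern can be sketched in plain Python. This is an illustration under stated assumptions, not the database's execution plan: the function name top_n_per_group is hypothetical, and only a few rows of two departments from the sample script are used. Note that ROW_NUMBER breaks ties on hire date arbitrarily, so a department whose earliest hires share a date may return any 2 of them; RANK would keep all tied rows instead.

```python
from datetime import date
from itertools import groupby

# (department, name, hire date) for some rows of the sample data.
employees = [
    ("人力资源部", "诸葛亮", date(2006, 3, 15)),
    ("人力资源部", "黄忠", date(2008, 10, 25)),
    ("人力资源部", "魏延", date(2007, 4, 1)),
    ("销售部", "法正", date(2017, 4, 9)),
    ("销售部", "庞统", date(2017, 6, 6)),
    ("销售部", "蒋琬", date(2018, 1, 28)),
    ("销售部", "黄权", date(2018, 3, 14)),
]


def top_n_per_group(rows, n):
    """Keep the first n rows of each department, ordered by hire date."""
    # Sort by (department, hire date) so that groupby sees each
    # partition as one contiguous, already ordered run.
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    result = []
    for dept, group in groupby(ordered, key=lambda r: r[0]):
        for rn, (_, name, hired) in enumerate(group, start=1):
            if rn <= n:  # plays the role of WHERE rn <= 2
                result.append((dept, name, hired, rn))
    return result


top2 = top_n_per_group(employees, 2)
for row in top2:
    print(row)
```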

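3. Cumulative distribution

CUME_DIST returns the cumulative distribution of the current row within the partition, that is, the proportion of rows ranked at or before the current row (the current row and any rows tied with it included). Its value is greater than 0 and less than or equal to 1.

For example, the following query returns the cumulative distribution of all employees ordered by monthly salary: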

-- Column alias: 累积占比 = cumulative share
SELECT e.emp_name AS "姓名", e.salary AS "月薪",
       CUME_DIST() OVER(
         ORDER BY e.salary
       ) AS "累积占比"
FROM employee e;

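The OVER clause has no partitioning option, so CUME_DIST analyzes all employees as a single group. ORDER BY sorts by monthly salary from lowest to highest. The query returns the following result: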

[Image 5: result of the CUME_DIST query]

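The result shows that 8% of employees earn a monthly salary of 4000 yuan or less; put another way, a monthly salary of 4000 yuan places an employee in the lowest-paid 8% of the company.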
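
The 8% figure can be checked by hand: 2 of the 25 employees earn 4000 yuan or less, and 2 / 25 = 0.08. The Python sketch below reproduces that calculation; it assumes the 25 salaries from the sample script, and the function name cume_dist is chosen here only to mirror the SQL function.

```python
from bisect import bisect_right

# Monthly salaries of all 25 employees, in emp_id order, from the sample script.
salaries = [30000, 26000, 24000, 24000, 8000, 7500, 12000, 6000, 15000, 6500,
            6800, 6600, 7000, 6500, 6000, 8000, 5800, 10000, 4100, 4000,
            4200, 4300, 4000, 4800, 4700]


def cume_dist(values):
    """Map each distinct value to the share of rows whose value is
    less than or equal to it (ascending order)."""
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right counts the rows <= v, tied rows included.
    return {v: bisect_right(ordered, v) / n for v in set(ordered)}


dist = cume_dist(salaries)
print(dist[4000])   # share of employees earning 4000 or less
print(dist[30000])  # the highest salary always maps to 1.0
```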

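4. Splitting into equal groups

NTILE splits the rows of the partition into N equal groups and returns the number of the group that contains the current row.

For example, the following statement splits the employees into 5 groups by hire order and returns the group each employee falls into: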

-- Column alias: 分组位置 = group number
SELECT e.emp_name AS "姓名", to_char(e.hire_date,'yyyy-mm-dd') AS "入职日期",
       NTILE(5) OVER(
         ORDER BY e.hire_date
       ) AS "分组位置"
FROM employee e;

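The OVER clause has no partitioning option, so NTILE analyzes all employees as a single group. ORDER BY sorts by hire date from earliest to latest. The query returns the following result: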

[Image 6: result of the NTILE query]

Group 1 holds the earliest 20% of hires, and group 5 holds the latest 20%.
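
With 25 employees and 5 groups, every group holds exactly 5 rows. When the row count is not divisible by N, the leftover rows go one each to the lowest-numbered groups. The Python sketch below shows that distribution rule for already ordered rows; the function name ntile and its signature are hypothetical, chosen for illustration.

```python
def ntile(row_count, buckets):
    """Return the 1-based group number for each of row_count ordered rows."""
    base, extra = divmod(row_count, buckets)
    result = []
    for bucket in range(1, buckets + 1):
        # The first `extra` groups each take one additional row.
        size = base + (1 if bucket <= extra else 0)
        result.extend([bucket] * size)
    return result


groups = ntile(25, 5)  # the case from this section: 5 rows per group
print(groups)
print(ntile(9, 4))     # uneven case: group sizes 3, 2, 2, 2
```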

Sample tables and script

-- Department table
CREATE TABLE department
    ( dept_id    NUMBER
    , dept_name  VARCHAR2(50) NOT NULL
    ) ;
COMMENT ON TABLE department IS 'Department information table';
COMMENT ON COLUMN department.dept_id IS 'Department ID, auto-increment primary key';
COMMENT ON COLUMN department.dept_name IS 'Department name';


-- Generate test data
INSERT INTO department(DEPT_ID,dept_name) VALUES (1,'行政管理部');
INSERT INTO department(DEPT_ID,dept_name) VALUES (2,'人力资源部');
INSERT INTO department(DEPT_ID,dept_name) VALUES (3,'财务部');
INSERT INTO department(DEPT_ID,dept_name) VALUES (4,'研发部');
INSERT INTO department(DEPT_ID,dept_name) VALUES (5,'销售部');
INSERT INTO department(DEPT_ID,dept_name) VALUES (6,'保卫部');





-- Employee information table
CREATE TABLE employee
    ( emp_id    NUMBER
    , emp_name  VARCHAR2(50) NOT NULL
    , sex       VARCHAR2(10) NOT NULL
    , dept_id   INTEGER NOT NULL
    , manager   INTEGER
    , hire_date DATE NOT NULL
    , job_id    INTEGER NOT NULL
    , salary    NUMERIC(8,2) NOT NULL
    , bonus     NUMERIC(8,2)
    , email     VARCHAR2(100) NOT NULL
    , comments  VARCHAR2(500)
    , create_by VARCHAR2(50) NOT NULL
    , create_ts TIMESTAMP NOT NULL
    , update_by VARCHAR2(50)
    , update_ts TIMESTAMP
    ) ;
COMMENT ON TABLE employee IS 'Employee information table';
COMMENT ON COLUMN employee.emp_id IS 'Employee ID, auto-increment primary key';
COMMENT ON COLUMN employee.emp_name IS 'Employee name';
COMMENT ON COLUMN employee.sex IS 'Sex';
COMMENT ON COLUMN employee.dept_id IS 'Department ID';
COMMENT ON COLUMN employee.manager IS 'Direct manager';
COMMENT ON COLUMN employee.hire_date IS 'Hire date';
COMMENT ON COLUMN employee.job_id IS 'Job ID';
COMMENT ON COLUMN employee.salary IS 'Monthly salary';
COMMENT ON COLUMN employee.bonus IS 'Year-end bonus';
COMMENT ON COLUMN employee.email IS 'Email address';
COMMENT ON COLUMN employee.comments IS 'Remarks';
COMMENT ON COLUMN employee.create_by IS 'Created by';
COMMENT ON COLUMN employee.create_ts IS 'Creation time';
COMMENT ON COLUMN employee.update_by IS 'Updated by';
COMMENT ON COLUMN employee.update_ts IS 'Update time';
 
 
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (1,'刘备', '男', 1, NULL, DATE '2000-01-01', 1, 30000, 10000, '[email protected]', NULL, 'Admin', TIMESTAMP '2000-01-01 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (2,'关羽', '男', 1, 1, DATE '2000-01-01', 2, 26000, 10000, '[email protected]', NULL, 'Admin', TIMESTAMP '2000-01-01 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (3,'张飞', '男', 1, 1, DATE '2000-01-01', 2, 24000, 10000, '[email protected]', NULL, 'Admin', TIMESTAMP '2000-01-01 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (4,'诸葛亮', '男', 2, 1, DATE '2006-03-15', 3, 24000, 8000, '[email protected]', NULL, 'Admin', TIMESTAMP '2006-03-15 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (5,'黄忠', '男', 2, 4, DATE '2008-10-25', 4, 8000, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2008-10-25 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (6,'魏延', '男', 2, 4, DATE '2007-04-01', 4, 7500, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2007-04-01 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (7,'孙尚香', '女', 3, 1, DATE '2002-08-08', 5, 12000, 5000, '[email protected]', NULL, 'Admin', TIMESTAMP '2002-08-08 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (8,'孙丫鬟', '女', 3, 7, DATE '2002-08-08', 6, 6000, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2002-08-08 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (9,'赵云', '男', 4, 1, DATE '2005-12-19', 7, 15000, 6000, '[email protected]', NULL, 'Admin', TIMESTAMP '2005-12-19 10:00:00', 'Admin', TIMESTAMP '2006-12-31 10:00:00');
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (10,'廖化', '男', 4, 9, DATE '2009-02-17', 8, 6500, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2009-02-17 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (11,'关平', '男', 4, 9, DATE '2011-07-24', 8, 6800, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2011-07-24 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (12,'赵氏', '女', 4, 9, DATE '2011-11-10', 8, 6600, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2011-11-10 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (13,'关兴', '男', 4, 9, DATE '2011-07-30', 8, 7000, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2011-07-30 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (14,'张苞', '男', 4, 9, DATE '2012-05-31', 8, 6500, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2012-05-31 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (15,'赵统', '男', 4, 9, DATE '2012-05-03', 8, 6000, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2012-05-03 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (16,'周仓', '男', 4, 9, DATE '2010-02-20', 8, 8000, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2010-02-20 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (17,'马岱', '男', 4, 9, DATE '2014-09-16', 8, 5800, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2014-09-16 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (18,'法正', '男', 5, 2, DATE '2017-04-09', 9, 10000, 5000, '[email protected]', NULL, 'Admin', TIMESTAMP '2017-04-09 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (19,'庞统', '男', 5, 18, DATE '2017-06-06', 10, 4100, 2000, '[email protected]', NULL, 'Admin', TIMESTAMP '2017-06-06 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (20,'蒋琬', '男', 5, 18, DATE '2018-01-28', 10, 4000, 1500, '[email protected]', NULL, 'Admin', TIMESTAMP '2018-01-28 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (21,'黄权', '男', 5, 18, DATE '2018-03-14', 10, 4200, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2018-03-14 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (22,'糜竺', '男', 5, 18, DATE '2018-03-27', 10, 4300, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2018-03-27 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (23,'邓芝', '男', 5, 18, DATE '2018-11-11', 10, 4000, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2018-11-11 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (24,'简雍', '男', 5, 18, DATE '2019-05-11', 10, 4800, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2019-05-11 10:00:00', NULL, NULL);
INSERT INTO employee(EMP_ID,emp_name, sex, dept_id, manager, hire_date, job_id, salary, bonus, email, comments, create_by, create_ts, update_by, update_ts) VALUES (25,'孙乾', '男', 5, 18, DATE '2018-10-09', 10, 4700, NULL, '[email protected]', NULL, 'Admin', TIMESTAMP '2018-10-09 10:00:00', NULL, NULL);
