[LeetCode-SQL] 185. Department Top Three Salaries

Contents

  • I. Problem
  • II. Solutions
    • 1. dense_rank()
    • 2. User variables
    • 3. Table join
  • III. References

I. Problem

The Employee table holds all employees. Each employee has an Id, a Name, a Salary, and a DepartmentId.

+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 85000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
| 5  | Janet | 69000  | 1            |
| 6  | Randy | 85000  | 1            |
| 7  | Will  | 70000  | 1            |
+----+-------+--------+--------------+

The Department table holds all departments of the company.

+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+

Write a SQL query to find the employees who earn the top three salaries in each department. For the tables above, the query should return:

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| IT         | Randy    | 85000  |
| IT         | Joe      | 85000  |
| IT         | Will     | 70000  |
| Sales      | Henry    | 80000  |
| Sales      | Sam      | 60000  |
+------------+----------+--------+

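Explanation

In the IT department, Max earns the highest salary, Randy and Joe both earn the second highest, and Will earns the third highest. The Sales department has only two employees: Henry earns the highest salary and Sam the second highest.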

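II. Solutions

1. dense_rank()

Idea:

Use the dense_rank() window function to rank salaries within each department from highest to lowest, then keep the rows with dense_rank <= 3. Because dense_rank() gives tied salaries the same rank and leaves no gaps, every employee earning one of the top three distinct salaries is kept.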

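Code, version 1:

On large data sets, try to avoid joining tables on a numeric column such as salary; performance becomes hard to predict.

# beats 74.59% of submissions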
select 
    Department
    , Employee
    , Salary 
from (
    select 
        t2.name as department
        , t1.name as employee
        , t1.salary
        , dense_rank() over (partition by departmentid order by salary desc) as rk
    from employee as t1
    inner join department as t2 on t1.departmentid = t2.id
) as t3 
where rk <= 3;

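Code, version 2:

Alternatively, compute dense_rank first and then join the dimension table (in Hive or Spark, use a map join where needed); in distributed computing this tends to perform somewhat better.

# faster; beats 90.91% of submissions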
select 
    t2.name AS Department
    , t1.Employee
    , t1.Salary 
from (
    select 
        DepartmentId, 
        name as employee, 
        salary, 
        dense_rank() over (partition by departmentid order by salary desc) as rnk
    from employee
) t1
inner join department as t2 on t1.departmentid = t2.id
where t1.rnk <= 3;
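
To make the ranking rule concrete, here is a minimal Python sketch of what dense_rank() over (partition by departmentid order by salary desc) computes on the sample data. The function name dense_rank_by_department and the tuple layout are illustrative assumptions, not part of the original solution.

```python
from collections import defaultdict

# Sample rows from the Employee table: (name, salary, department_id).
EMPLOYEES = [
    ("Joe", 85000, 1), ("Henry", 80000, 2), ("Sam", 60000, 2),
    ("Max", 90000, 1), ("Janet", 69000, 1), ("Randy", 85000, 1),
    ("Will", 70000, 1),
]

def dense_rank_by_department(rows):
    """Return {(name, department_id): rank}, mimicking
    dense_rank() over (partition by departmentid order by salary desc).
    Hypothetical helper for illustration only."""
    # Collect the distinct salaries of each department.
    salaries = defaultdict(set)
    for _name, salary, dept in rows:
        salaries[dept].add(salary)
    # Rank the distinct salaries: ties share a rank and there are no gaps.
    rank_of = {
        dept: {s: i for i, s in enumerate(sorted(vals, reverse=True), start=1)}
        for dept, vals in salaries.items()
    }
    return {(name, dept): rank_of[dept][salary] for name, salary, dept in rows}

ranks = dense_rank_by_department(EMPLOYEES)
# Equivalent of the outer filter "where rnk <= 3".
top_three = sorted(name for (name, _dept), rk in ranks.items() if rk <= 3)
print(top_three)
```

Joe and Randy share rank 2, Will gets rank 3 rather than 4, and Janet (rank 4) is the only employee filtered out.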

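2. User variables

Idea:

S1: Break the problem down and first work out the salary rank within each department. Sort the employee rows by department ascending and salary descending; each row then falls into one of two cases:
(1) The row has the same department ID as the previous row: if the salary is also the same, the rank stays unchanged; if the salary differs, the rank increases by 1.
(2) The row has a different department ID from the previous row: it is the first row of a new department, so the rank is reset to 1.

The implementation is as follows: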

    SELECT 
        name
        , Salary
        , DepartmentId
        , CASE 
            WHEN @preDeptId = DepartmentId AND @preSal = Salary THEN @rnk := @rnk
            WHEN @preDeptId = DepartmentId AND @preSal != Salary THEN @rnk := @rnk + 1
            WHEN @preDeptId != DepartmentId THEN @rnk := 1
            ELSE @rnk := 1  # first row: @preDeptId is NULL, so the comparisons above evaluate to NULL
        END AS RNK
        , @preDeptId := DepartmentId
        , @preSal := Salary
    FROM Employee, (SELECT @preDeptId := null,  @preSal := null, @rnk := 1) as init
    ORDER BY DepartmentId, Salary DESC
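
The same row-by-row scan can be traced outside MySQL. The following is a minimal Python sketch of the logic in S1, assuming rows are (name, salary, department_id) tuples; the function name rank_by_scan is hypothetical. The locals prev_dept, prev_salary, and rank play the roles of @preDeptId, @preSal, and @rnk.

```python
# Sample rows from the Employee table: (name, salary, department_id).
EMPLOYEES = [
    ("Joe", 85000, 1), ("Henry", 80000, 2), ("Sam", 60000, 2),
    ("Max", 90000, 1), ("Janet", 69000, 1), ("Randy", 85000, 1),
    ("Will", 70000, 1),
]

def rank_by_scan(rows):
    """Return (name, salary, department_id, rank) tuples in scan order.
    Hypothetical helper that mirrors the CASE expression above."""
    # Equivalent of ORDER BY DepartmentId, Salary DESC.
    ordered = sorted(rows, key=lambda r: (r[2], -r[1]))
    prev_dept, prev_salary, rank = None, None, 1
    ranked = []
    for name, salary, dept in ordered:
        if prev_dept == dept and prev_salary == salary:
            pass        # same department, same salary: rank unchanged
        elif prev_dept == dept:
            rank += 1   # same department, different salary: next rank
        else:
            rank = 1    # first row overall, or first row of a new department
        ranked.append((name, salary, dept, rank))
        # Remember this row for the comparison on the next iteration.
        prev_dept, prev_salary = dept, salary
    return ranked

for row in rank_by_scan(EMPLOYEES):
    print(row)
```

Unlike the SQL version, the Python loop has a guaranteed evaluation order, so it is only a model of the intended behavior, not a statement about how MySQL evaluates the variables.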

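S2: Filter the rows to get the expected result. Join the department table to get the department name, then apply the condition rnk <= 3 to keep the employees with the top three salaries.

Code:

# beats 92.10% of submissions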
SELECT t2.Name Department, t1.Name Employee, t1.Salary
FROM 
( ## user-variable ranking: salary rank of each employee within the department
    SELECT 
        name
        , Salary
        , DepartmentId
        , CASE 
            WHEN @preDeptId = DepartmentId AND @preSal = Salary THEN @rnk := @rnk
            WHEN @preDeptId = DepartmentId AND @preSal != Salary THEN @rnk := @rnk + 1
            WHEN @preDeptId != DepartmentId THEN @rnk := 1
            ELSE @rnk := 1  # first row: @preDeptId is NULL, so the comparisons above evaluate to NULL
        END AS RNK
        , @preDeptId := DepartmentId
        , @preSal := Salary
    FROM Employee, (SELECT @preDeptId := null,  @preSal := null, @rnk := 1) as init
    ORDER BY DepartmentId, Salary DESC
) as t1
INNER JOIN Department as t2 ON t1.DepartmentId = t2.Id
where t1.RNK <= 3;

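3. Table join

Idea:

First find the top three salaries within each department: a salary is in the top three when fewer than three distinct salaries in the same department are higher than it.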

SELECT e1.Salary
FROM Employee AS e1
WHERE 3 > (
    SELECT count(DISTINCT e2.Salary)
    FROM Employee AS e2
    WHERE e1.Salary < e2.Salary AND e1.DepartmentId = e2.DepartmentId
);

For example, suppose e1 = e2 = [4,5,6,7,8] (the salaries of a single department):

e1.Salary = 4: e2.Salary can be [5,6,7,8], so count(DISTINCT e2.Salary) = 4

e1.Salary = 5: e2.Salary can be [6,7,8], so count(DISTINCT e2.Salary) = 3

e1.Salary = 6: e2.Salary can be [7,8], so count(DISTINCT e2.Salary) = 2

e1.Salary = 7: e2.Salary can be [8], so count(DISTINCT e2.Salary) = 1

e1.Salary = 8: e2.Salary can be [], so count(DISTINCT e2.Salary) = 0

Since the condition is 3 > count(DISTINCT e2.Salary), e1.Salary can be [6,7,8], which are the three highest salaries in the set.
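
The counting rule can be checked with a few lines of Python. This is a minimal sketch for the salaries of one department; the function name top_n_salaries is an illustrative assumption.

```python
def top_n_salaries(salaries, n=3):
    """Keep each salary that has fewer than n distinct higher salaries.
    Hypothetical helper mirroring the correlated subquery above."""
    kept = []
    for s1 in salaries:
        # Equivalent of count(DISTINCT e2.Salary) with e1.Salary < e2.Salary.
        higher = {s2 for s2 in salaries if s2 > s1}
        if n > len(higher):
            kept.append(s1)
    return kept

print(top_n_salaries([4, 5, 6, 7, 8]))       # [6, 7, 8]
print(top_n_salaries([90, 85, 85, 70, 69]))  # [90, 85, 85, 70]
```

The second call shows why DISTINCT matters: the two tied values of 85 count as one higher salary, so 70 still has only two distinct salaries above it and stays in the result.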

Then join the Department table with the Employee table to get the employees earning the top three salaries in each department.

Code:

SELECT
    t2.NAME AS Department
    , t1.NAME AS Employee
    , t1.Salary AS Salary
FROM
    Employee AS t1, Department AS t2
WHERE t1.DepartmentId = t2.Id
AND 3 > (
    SELECT count(DISTINCT t3.Salary)
    FROM Employee AS t3
    WHERE t1.Salary < t3.Salary AND t1.DepartmentId = t3.DepartmentId
)
ORDER BY t2.NAME, Salary DESC;
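
To try the query against the sample data without a MySQL server, the sketch below runs it through Python's built-in sqlite3 module. SQLite is only a stand-in here (an assumption, since the original targets MySQL), and the column types in the CREATE TABLE statements are guesses; this query happens to use only syntax that both engines accept.

```python
import sqlite3

# In-memory database loaded with the sample tables from the problem.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER, Name TEXT, Salary INTEGER, DepartmentId INTEGER);
    CREATE TABLE Department (Id INTEGER, Name TEXT);
    INSERT INTO Employee VALUES
        (1, 'Joe', 85000, 1), (2, 'Henry', 80000, 2), (3, 'Sam', 60000, 2),
        (4, 'Max', 90000, 1), (5, 'Janet', 69000, 1), (6, 'Randy', 85000, 1),
        (7, 'Will', 70000, 1);
    INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
""")

# The table-join solution, unchanged apart from layout.
QUERY = """
    SELECT t2.Name AS Department, t1.Name AS Employee, t1.Salary AS Salary
    FROM Employee AS t1, Department AS t2
    WHERE t1.DepartmentId = t2.Id
      AND 3 > (
            SELECT count(DISTINCT t3.Salary)
            FROM Employee AS t3
            WHERE t1.Salary < t3.Salary AND t1.DepartmentId = t3.DepartmentId
          )
    ORDER BY t2.Name, Salary DESC;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The order of Joe and Randy within the 85000 tie is not determined by the ORDER BY clause, so either may appear first.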

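III. References

1. LeetCode 185. Department Top Three Salaries
2. A more concise solution with dense_rank; avoid joins (join, in, exists) on numeric columns such as salary
3. dense_rank() window function: concise, clear, and easy to follow; beats 98%
4. A MySQL user-variable solution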
