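This problem really is hard, but it is an interesting one.
Problem
The Employee table holds all employees. It has three columns: employee Id, company name, and salary.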
+-----+------------+--------+
|Id | Company | Salary |
+-----+------------+--------+
|1 | A | 2341 |
|2 | A | 341 |
|3 | A | 15 |
|4 | A | 15314 |
|5 | A | 451 |
|6 | A | 513 |
|7 | B | 15 |
|8 | B | 13 |
|9 | B | 1154 |
|10 | B | 1345 |
|11 | B | 1221 |
|12 | B | 234 |
|13 | C | 2345 |
|14 | C | 2645 |
|15 | C | 2645 |
|16 | C | 2652 |
|17 | C | 65 |
+-----+------------+--------+
Write a SQL query to find the median salary of each company. Bonus challenge: can you solve it without using any built-in SQL functions?
+-----+------------+--------+
|Id | Company | Salary |
+-----+------------+--------+
|5 | A | 451 |
|6 | A | 513 |
|12 | B | 234 |
|9 | B | 1154 |
|14 | C | 2645 |
+-----+------------+--------+
Creating the data
CREATE TABLE Employee4(
Id INT,
Company VARCHAR(3),
Salary INT);
INSERT INTO Employee4 VALUES (1, 'A', 2341),(2, 'A', 341),(3, 'A', 15),
(4, 'A', 15314),(5, 'A', 451),(6, 'A', 513),(7, 'B', 15),
(8, 'B', 13),(9, 'B', 1154),(10, 'B', 1345),(11, 'B', 1221),
(12, 'B', 234),(13, 'C', 2345),(14, 'C', 2645),(15, 'C', 2645),
(16, 'C', 2652),(17, 'C', 65);
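Analysis
It really is a hard problem. At first I thought I only had to pick the median of a single set of rows,
but the task is actually to find the median within each group.
Let's first work out how to find the median of a single group, using company A as the example.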
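-- Number company A's rows by ascending salary.
-- `rank` is backtick-quoted because RANK is a reserved word in MySQL 8.0+.
SELECT E.Salary, @num := @num + 1 AS `rank`
FROM Employee4 AS E, (SELECT @num := 0) AS tmp
WHERE E.`Company` = 'A'
ORDER BY E.Salary;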
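With an odd number of rows, take the single middle row; with an even number, take the two middle rows.
This is where I got stuck: how do you pick one row for an odd count and two rows for an even count? I could not think of a direct way.
Turn it into a math problem. For odd n, take row floor((n+1)/2). For even n, take rows n/2 and n/2 + 1 (which are also floor((n+1)/2) and floor((n+1)/2) + 1).
Unify the two cases as an interval: [floor((n+1)/2), floor((n+1)/2) + a], where a is 0 when n is odd and 1 when n is even.
That is enough to solve it.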
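As a quick sanity check of the interval formula, here is a small Python sketch. Python is used only for illustration, and the helper name `median_positions` is hypothetical. It returns the 1-based positions [floor((n+1)/2), floor((n+1)/2) + a] for a sorted group of n rows.

```python
def median_positions(n):
    """Return the 1-based positions of the median row(s) in a sorted group of n rows."""
    sta = (n + 1) // 2          # floor((n + 1) / 2)
    a = 0 if n % 2 == 1 else 1  # odd count: one row, even count: two rows
    return list(range(sta, sta + a + 1))


# In the sample data, company A has 6 rows and company C has 5 rows.
print(median_positions(6))  # [3, 4]
print(median_positions(5))  # [3]
```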
First select floor((n+1)/2) and a:
SELECT FLOOR((COUNT(*)+1)/2) AS sta, IF(COUNT(*)%2 = 1,0,1) AS a
FROM Employee4 AS E
WHERE E.`Company` = 'A';
Join the two derived tables and keep the rows whose rank falls between sta and sta + a:
SELECT tmp1.Salary
FROM (SELECT E.Salary, @num := @num + 1 AS `rank`
      FROM Employee4 AS E, (SELECT @num := 0) AS tmp
      WHERE E.`Company` = 'A'
      ORDER BY E.Salary) AS tmp1
JOIN (SELECT FLOOR((COUNT(*) + 1) / 2) AS sta, IF(COUNT(*) % 2 = 1, 0, 1) AS a
      FROM Employee4 AS E
      WHERE E.`Company` = 'A') AS tmp2
ON tmp1.`rank` BETWEEN tmp2.sta AND tmp2.sta + tmp2.a;
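Lesson learned.
Solution
How do we rank rows separately within each group?
Introduce a variable pre_company to detect when the group changes.
The sort can no longer be on salary alone; it has to be on company first, then salary.
Initialize num to 1 and pre_company to NULL.
For each row, compare the current company with pre_company: if they match, increment the counter (@num + 1); otherwise reset it to 1. Because the rows are sorted by company, each company's rows are processed as one contiguous block.
Then update pre_company to the current company.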
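-- MySQL does not guarantee the evaluation order of user variables within one SELECT;
-- this relies on @num being assigned before @pre_company, left to right.
SELECT
    E.Salary,
    E.Company,
    @num := IF(E.Company = @pre_company, @num + 1, 1) AS `rank`,
    @pre_company := E.Company
FROM
    Employee4 AS E,
    (SELECT
        @num := 1,
        @pre_company := NULL) AS tmp
ORDER BY E.Company, E.Salary;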
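To make the counter logic easier to follow, the same idea can be traced in plain Python. This is only an illustrative sketch: the function name `rank_within_company` is hypothetical, and the rows are the (company, salary) pairs from the Employee table shown at the top.

```python
rows = [  # (company, salary) pairs from the sample Employee table
    ("A", 2341), ("A", 341), ("A", 15), ("A", 15314), ("A", 451), ("A", 513),
    ("B", 15), ("B", 13), ("B", 1154), ("B", 1345), ("B", 1221), ("B", 234),
    ("C", 2345), ("C", 2645), ("C", 2645), ("C", 2652), ("C", 65),
]


def rank_within_company(rows):
    """Mimic the @num / @pre_company counter on rows sorted by (company, salary)."""
    num, pre_company = 1, None
    ranked = []
    for company, salary in sorted(rows):
        # Same company as the previous row: keep counting; otherwise restart at 1.
        num = num + 1 if company == pre_company else 1
        pre_company = company
        ranked.append((company, salary, num))
    return ranked


ranked = rank_within_company(rows)
print(ranked[:3])  # [('A', 15, 1), ('A', 341, 2), ('A', 451, 3)]
```

The counter restarts whenever the company changes, which only works because the rows are visited in (company, salary) order.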
Then select floor((n+1)/2) and a for each group:
SELECT E.`Company`, FLOOR((COUNT(*)+1)/2) AS sta, IF(COUNT(*)%2 = 1, 0, 1) AS a
FROM Employee4 AS E
GROUP BY E.`Company`;
Join the two derived tables:
SELECT tmp1.Company, tmp1.Salary
FROM (SELECT
          E.Salary,
          E.Company,
          @num := IF(E.Company = @pre_company, @num + 1, 1) AS `rank`,
          @pre_company := E.Company
      FROM
          Employee4 AS E,
          (SELECT
              @num := 1,
              @pre_company := NULL) AS tmp
      ORDER BY E.Company, E.Salary) AS tmp1
JOIN (SELECT E.`Company`, FLOOR((COUNT(*) + 1) / 2) AS sta, IF(COUNT(*) % 2 = 1, 0, 1) AS a
      FROM Employee4 AS E
      GROUP BY E.`Company`) AS tmp2
ON tmp1.Company = tmp2.Company AND
   tmp1.`rank` BETWEEN tmp2.sta AND tmp2.sta + tmp2.a;
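Another approach
First compute, for each company, the 0-based starting rank beg and the number of extra rows cnt (0 for an odd count, 1 for an even count). Call this table A.
(
SELECT E.Company, FLOOR((COUNT(*) - 1) / 2) AS `beg`, IF(COUNT(*) % 2 = 1, 0, 1) AS `cnt`
FROM Employee4 AS E
GROUP BY E.Company
) AS A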
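Compute each person's salary rank in ascending order: the lowest salary ranks first, the second lowest ranks second, and so on.
Salary comparison rule for two employees A and B:
if (A.salary = B.salary and A.id > B.id or A.salary > B.salary)
{
    A ranks after B.
}
Left join the employee table to itself to get, for every person, all the people in the same company who rank before them.
SELECT *
FROM Employee4 AS E1
LEFT JOIN Employee4 AS E2
ON (E1.Company = E2.Company AND (E1.Salary = E2.Salary AND E1.Id > E2.Id OR E1.Salary > E2.Salary))
Then, after grouping by E1, the number of E2 rows in each group is E1's rank, counting from 0.
Sort the final result by salary in ascending order and call it table B.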
(
SELECT E1.Id, E1.Company, E1.Salary, COUNT(E2.Salary) AS `trank`
FROM Employee4 AS E1
LEFT JOIN Employee4 AS E2
ON (E1.Company = E2.Company AND (E1.Salary = E2.Salary AND E1.Id > E2.Id OR E1.Salary > E2.Salary))
GROUP BY E1.Id, E1.Company, E1.Salary
ORDER BY E1.Company, E1.Salary
) AS B
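The counting idea can be checked by hand on company C, where two employees share the salary 2645. The Python sketch below is illustrative only, and the names `comes_before` and `zero_based_ranks` are hypothetical. It applies the same comparison rule and counts, for each row, how many rows rank before it.

```python
company_c = [(13, 2345), (14, 2645), (15, 2645), (16, 2652), (17, 65)]  # (Id, Salary)


def comes_before(b, a):
    """True when row b ranks before row a: lower salary, or equal salary and lower Id."""
    return b[1] < a[1] or (b[1] == a[1] and b[0] < a[0])


def zero_based_ranks(rows):
    """Map each Id to the number of rows that rank before it (its 0-based rank)."""
    return {a[0]: sum(comes_before(b, a) for b in rows) for a in rows}


print(zero_based_ranks(company_c))
# {13: 1, 14: 2, 15: 3, 16: 4, 17: 0}
```

The Id tie-break gives the two 2645 rows distinct ranks (2 and 3), so exactly one of them lands on the median position.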
Join the two tables and the job is done:
SELECT B.Id, B.Company, B.Salary
FROM
(
    SELECT E.Company, FLOOR((COUNT(*) - 1) / 2) AS `beg`, IF(COUNT(*) % 2 = 1, 0, 1) AS `cnt`
    FROM Employee4 AS E
    GROUP BY E.Company
) AS A
JOIN (
    SELECT E1.Id, E1.Company, E1.Salary, COUNT(E2.Salary) AS `trank`
    FROM Employee4 AS E1
    LEFT JOIN Employee4 AS E2
    ON (E1.Company = E2.Company AND (E1.Salary = E2.Salary AND E1.Id > E2.Id OR E1.Salary > E2.Salary))
    GROUP BY E1.Id, E1.Company, E1.Salary
    ORDER BY E1.Company, E1.Salary
) AS B
ON (A.Company = B.Company AND B.trank BETWEEN A.beg AND (A.beg + A.cnt));
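One way to try the self-join approach without a MySQL server is Python's built-in sqlite3 module. The sketch below is an adaptation, not the original query: it assumes the table is named Employee4, and it replaces FLOOR(...) with SQLite's integer division and IF(...) with a CASE expression, since those MySQL functions may not be available in SQLite. The function name `median_rows` is hypothetical.

```python
import sqlite3

ROWS = [  # (Id, Company, Salary) from the sample Employee table
    (1, "A", 2341), (2, "A", 341), (3, "A", 15), (4, "A", 15314),
    (5, "A", 451), (6, "A", 513), (7, "B", 15), (8, "B", 13),
    (9, "B", 1154), (10, "B", 1345), (11, "B", 1221), (12, "B", 234),
    (13, "C", 2345), (14, "C", 2645), (15, "C", 2645), (16, "C", 2652),
    (17, "C", 65),
]

QUERY = """
SELECT B.Id, B.Company, B.Salary
FROM (
    -- 0-based start rank and extra-row count per company
    SELECT Company,
           (COUNT(*) - 1) / 2 AS beg,
           CASE WHEN COUNT(*) % 2 = 1 THEN 0 ELSE 1 END AS cnt
    FROM Employee4
    GROUP BY Company
) AS A
JOIN (
    -- 0-based rank: number of same-company rows that sort before E1
    SELECT E1.Id, E1.Company, E1.Salary, COUNT(E2.Salary) AS trank
    FROM Employee4 AS E1
    LEFT JOIN Employee4 AS E2
      ON E1.Company = E2.Company
     AND (E1.Salary = E2.Salary AND E1.Id > E2.Id OR E1.Salary > E2.Salary)
    GROUP BY E1.Id, E1.Company, E1.Salary
) AS B
  ON A.Company = B.Company AND B.trank BETWEEN A.beg AND A.beg + A.cnt
ORDER BY B.Company, B.Salary
"""


def median_rows():
    """Load the sample rows into an in-memory database and run the median query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee4 (Id INT, Company VARCHAR(3), Salary INT)")
    conn.executemany("INSERT INTO Employee4 VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(median_rows())
# [(5, 'A', 451), (6, 'A', 513), (12, 'B', 234), (9, 'B', 1154), (14, 'C', 2645)]
```

The rows returned match the expected output table in the problem statement.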