LeetCode 1179. Reformat Department Table (Easy)

Table: Department

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| revenue       | int     |
| month         | varchar |
+---------------+---------+

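(id, month) is the composite primary key of this table.
The table holds the monthly revenue of each department.
The month column takes one of the values ["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"].

Write an SQL query to reformat the table so that the new table has one department id column and one revenue column for each month.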

The query result format is shown in the following example:

Department table:
+------+---------+-------+
| id   | revenue | month |
+------+---------+-------+
| 1    | 8000    | Jan   |
| 2    | 9000    | Jan   |
| 3    | 10000   | Feb   |
| 1    | 7000    | Feb   |
| 1    | 6000    | Mar   |
+------+---------+-------+

Result table:

+------+-------------+-------------+-------------+-----+-------------+
| id   | Jan_Revenue | Feb_Revenue | Mar_Revenue | ... | Dec_Revenue |
+------+-------------+-------------+-------------+-----+-------------+
| 1    | 8000        | 7000        | 6000        | ... | null        |
| 2    | 9000        | null        | null        | ... | null        |
| 3    | null        | 10000       | null        | ... | null        |
+------+-------------+-------------+-------------+-----+-------------+

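Note that the result table has 13 columns (1 department id column + 12 monthly revenue columns).

Reading the problem
This is really the familiar conversion between long-format and wide-format data, much like a pivot table.
In Python or R it is easy to do.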
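For comparison, here is a minimal sketch of the same long-to-wide reshaping in plain Python, using only the sample rows from the problem. The helper name `pivot_revenue` and the dictionary layout are illustrative assumptions, not part of the problem statement.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Sample rows from the problem statement: (id, revenue, month).
ROWS = [(1, 8000, "Jan"), (2, 9000, "Jan"), (3, 10000, "Feb"),
        (1, 7000, "Feb"), (1, 6000, "Mar")]


def pivot_revenue(rows):
    """Turn long (id, revenue, month) rows into one wide dict per id."""
    wide = {}
    for dept_id, revenue, month in rows:
        # Every department starts with all 12 columns set to None (SQL NULL).
        record = wide.setdefault(
            dept_id, {f"{m}_Revenue": None for m in MONTHS})
        record[f"{month}_Revenue"] = revenue
    return wide


result = pivot_revenue(ROWS)
for dept_id in sorted(result):
    print(dept_id, result[dept_id])
```

Each department ends up with exactly 12 revenue keys, and months without data stay `None`, which mirrors the `null` cells in the expected result table.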
Now let's see how to solve it in SQL.

Generating the data

CREATE TABLE Department(
id INT,
revenue INT,
`month` VARCHAR(10),
PRIMARY KEY(id, `month`));

INSERT INTO Department VALUES (1, 8000, 'Jan'),(2, 9000, 'Jan'),(3, 10000, 'Feb'),(1, 7000, 'Feb'),(1, 6000, 'Mar');

My own solution
Use an IF expression for each month, e.g. IF(month='Jan',revenue,NULL) Jan_Revenue

SELECT id,
IF(`month`='Jan',revenue,NULL) Jan_Revenue,
IF(`month`='Feb',revenue,NULL) Feb_Revenue,
IF(`month`='Mar',revenue,NULL) Mar_Revenue
FROM Department;

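The result has one row per input row, with the revenue only in the column for that row's month and NULL in the other columns.

Group by id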

SELECT id,
IF(`month`='Jan',revenue,NULL) Jan_Revenue,
IF(`month`='Feb',revenue,NULL) Feb_Revenue,
IF(`month`='Mar',revenue,NULL) Mar_Revenue
FROM Department
GROUP BY id;

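This does not give the desired result.

Why not?
For id = 1 all three months have revenue, but each row holds only one non-NULL value, so no single row contains all three.
Without an aggregate function, MySQL takes the values for each group from a single row (often the first one), so the result is not what we expected. With ONLY_FULL_GROUP_BY enabled, which is the default since MySQL 5.7.5, this query is rejected outright.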
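To make the reasoning above concrete, the sketch below prints the row-wise intermediate result for id = 1. It is only an approximation of the setup in this post: it runs on SQLite through Python's built-in `sqlite3` module instead of MySQL, and it uses `CASE WHEN` in place of MySQL's `IF()`, so treat it as an illustration rather than the exact MySQL behavior.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department (id INT, revenue INT, month VARCHAR(10), "
    "PRIMARY KEY (id, month))")
conn.executemany(
    "INSERT INTO Department VALUES (?, ?, ?)",
    [(1, 8000, "Jan"), (2, 9000, "Jan"), (3, 10000, "Feb"),
     (1, 7000, "Feb"), (1, 6000, "Mar")])

# Row-wise step only (no GROUP BY); CASE stands in for MySQL's IF().
per_row = conn.execute("""
    SELECT id,
           CASE WHEN month = 'Jan' THEN revenue END AS Jan_Revenue,
           CASE WHEN month = 'Feb' THEN revenue END AS Feb_Revenue,
           CASE WHEN month = 'Mar' THEN revenue END AS Mar_Revenue
    FROM Department
    WHERE id = 1
    ORDER BY revenue DESC
""").fetchall()

for row in per_row:
    print(row)
```

Department 1 produces three rows, each with a single non-NULL column. Whichever single row is kept for the group, the other two months are lost.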

What about doing it this way?

SELECT id, MAX(tmp.Jan_Revenue) AS Jan_Revenue, 
MAX(tmp.Feb_Revenue) AS Feb_Revenue, 
MAX(tmp.Mar_Revenue) AS Mar_Revenue
FROM (SELECT id,
IF(`month`='Jan',revenue,NULL) Jan_Revenue,
IF(`month`='Feb',revenue,NULL) Feb_Revenue,
IF(`month`='Mar',revenue,NULL) Mar_Revenue
FROM Department) tmp
GROUP BY id;

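This works, but it needs a subquery.

Applying the aggregate function directly should work as well. Within each group at most one row is non-NULL for a given month, and aggregate functions ignore NULLs, so all we need is to pick out that one non-NULL value. MAX, SUM, and MIN all do the job.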

SELECT id,
MAX(IF(`month`='Jan',revenue,NULL)) Jan_Revenue,
MAX(IF(`month`='Feb',revenue,NULL)) Feb_Revenue,
MAX(IF(`month`='Mar',revenue,NULL)) Mar_Revenue
FROM Department
GROUP BY id;

The result is the same.
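As a quick check of the claim that MAX, MIN, and SUM are interchangeable here, the following sketch runs the three-month pivot once per aggregate. It again assumes SQLite via Python's `sqlite3` as a stand-in for MySQL and `CASE WHEN` as a stand-in for `IF()`; the helper name `pivot_with` is hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department (id INT, revenue INT, month VARCHAR(10), "
    "PRIMARY KEY (id, month))")
conn.executemany(
    "INSERT INTO Department VALUES (?, ?, ?)",
    [(1, 8000, "Jan"), (2, 9000, "Jan"), (3, 10000, "Feb"),
     (1, 7000, "Feb"), (1, 6000, "Mar")])


def pivot_with(agg):
    """Run the three-month pivot with the given aggregate function name."""
    sql = f"""
        SELECT id,
               {agg}(CASE WHEN month = 'Jan' THEN revenue END) AS Jan_Revenue,
               {agg}(CASE WHEN month = 'Feb' THEN revenue END) AS Feb_Revenue,
               {agg}(CASE WHEN month = 'Mar' THEN revenue END) AS Mar_Revenue
        FROM Department
        GROUP BY id
        ORDER BY id
    """
    return conn.execute(sql).fetchall()


# The aggregate names come from a fixed tuple, never from user input.
results = {agg: pivot_with(agg) for agg in ("MAX", "MIN", "SUM")}
for agg, rows in results.items():
    print(agg, rows)
```

All three aggregates return the same rows because each (id, month) pair occurs at most once, so every aggregate sees either one value or only NULLs.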

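Other people's solutions
Replace the IF function with a CASE WHEN expression; the idea is the same.
Note that an aggregate function is still required here.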

SELECT id,
SUM(CASE `month` WHEN 'Jan' THEN revenue END) Jan_Revenue,
SUM(CASE `month` WHEN 'Feb' THEN revenue END) Feb_Revenue,
SUM(CASE `month` WHEN 'Mar' THEN revenue END) Mar_Revenue,
SUM(CASE `month` WHEN 'Apr' THEN revenue END) Apr_Revenue,
SUM(CASE `month` WHEN 'May' THEN revenue END) May_Revenue,
SUM(CASE `month` WHEN 'Jun' THEN revenue END) Jun_Revenue,
SUM(CASE `month` WHEN 'Jul' THEN revenue END) Jul_Revenue,
SUM(CASE `month` WHEN 'Aug' THEN revenue END) Aug_Revenue,
SUM(CASE `month` WHEN 'Sep' THEN revenue END) Sep_Revenue,
SUM(CASE `month` WHEN 'Oct' THEN revenue END) Oct_Revenue,
SUM(CASE `month` WHEN 'Nov' THEN revenue END) Nov_Revenue,
SUM(CASE `month` WHEN 'Dec' THEN revenue END) Dec_Revenue
FROM Department
GROUP BY id;

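From someone else's blog post (on using GROUP BY): GROUP BY on its own, without aggregate functions, shows only one record from each group (in MySQL an arbitrary one, often the first).
I used to think that each group effectively had just one element, so GROUP BY alone would show exactly that element and the aggregate function could be skipped. When I tried it, the answer was not accepted.
So the lesson is: whenever you use GROUP BY, every selected column that is not in the GROUP BY clause should go through an aggregate function (MAX / MIN / SUM / AVG / COUNT).

Another very simple idea is to select the January rows, then the February rows, and so on through December, joining them all on id.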

SELECT
DISTINCT 
    a.id,
    Jan.revenue AS Jan_Revenue,
    Feb.revenue AS Feb_Revenue,
    Mar.revenue AS Mar_Revenue,
    Apr.revenue AS Apr_Revenue,
    May.revenue AS May_Revenue,
    Jun.revenue AS Jun_Revenue,
    Jul.revenue AS Jul_Revenue,
    Aug.revenue AS Aug_Revenue,
    Sep.revenue AS Sep_Revenue,
    Octo.revenue AS Oct_Revenue,
    Nov.revenue AS Nov_Revenue,
    Dece.revenue AS Dec_Revenue
FROM
    Department a
LEFT JOIN
    Department Jan
ON
    a.id = Jan.id
AND
    Jan.month = 'Jan'
LEFT JOIN
    Department Feb
ON
    a.id = Feb.id
AND
    Feb.month = 'Feb'
LEFT JOIN
    Department Mar
ON
    a.id = Mar.id
AND
    Mar.month = 'Mar'
LEFT JOIN
    Department Apr
ON
    a.id = Apr.id
AND
    Apr.month = 'Apr'
LEFT JOIN
    Department May
ON
    a.id = May.id
AND
    May.month = 'May'
LEFT JOIN
    Department Jun
ON
    a.id = Jun.id
AND
    Jun.month = 'Jun'
LEFT JOIN
    Department Jul
ON
    a.id = Jul.id
AND
    Jul.month = 'Jul'
LEFT JOIN
    Department Aug
ON
    a.id = Aug.id
AND
    Aug.month = 'Aug'
LEFT JOIN
    Department Sep
ON
    a.id = Sep.id
AND
    Sep.month = 'Sep'
LEFT JOIN
    Department Octo
ON
    a.id = Octo.id
AND
    Octo.month = 'Oct'
LEFT JOIN
    Department Nov
ON
    a.id = Nov.id
AND
    Nov.month = 'Nov'
LEFT JOIN
    Department Dece
ON
    a.id = Dece.id
AND
    Dece.month = 'Dec';
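The role of DISTINCT in the join approach can be seen in a shortened three-month version of the query. This is a sketch under assumptions: it runs on SQLite via Python's `sqlite3` rather than MySQL, and the `JOIN_SQL` template with its `{distinct}` placeholder is introduced only for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department (id INT, revenue INT, month VARCHAR(10), "
    "PRIMARY KEY (id, month))")
conn.executemany(
    "INSERT INTO Department VALUES (?, ?, ?)",
    [(1, 8000, "Jan"), (2, 9000, "Jan"), (3, 10000, "Feb"),
     (1, 7000, "Feb"), (1, 6000, "Mar")])

# Three-month version of the self-join; {distinct} is filled in below.
JOIN_SQL = """
    SELECT {distinct} a.id,
           Jan.revenue AS Jan_Revenue,
           Feb.revenue AS Feb_Revenue,
           Mar.revenue AS Mar_Revenue
    FROM Department a
    LEFT JOIN Department Jan ON a.id = Jan.id AND Jan.month = 'Jan'
    LEFT JOIN Department Feb ON a.id = Feb.id AND Feb.month = 'Feb'
    LEFT JOIN Department Mar ON a.id = Mar.id AND Mar.month = 'Mar'
    ORDER BY a.id
"""

with_distinct = conn.execute(JOIN_SQL.format(distinct="DISTINCT")).fetchall()
without_distinct = conn.execute(JOIN_SQL.format(distinct="")).fetchall()

print(len(without_distinct), "rows without DISTINCT")
for row in with_distinct:
    print(row)
```

The driving table `a` has one row per (id, month), so department 1 appears three times before deduplication. DISTINCT collapses those identical rows into one row per department.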
