目录
练习1:分组求和
练习2:交换位置
练习3:分数排名
练习4:连续出现的数字
练习5:树节点
练习6:至少有五名直接下属的经理
练习7:分数排名
练习8:查询回答率最高的问题
练习9:各部门前3高工资的员工
练习10:平面上最近距离
练习11:行程和用户
各部门工资最高的员工(难度:中等)
1. 创建employee 表,包含所有员工信息,每个员工有其对应的 Id, Salary 和 DepartmentID。
Id |
Name |
Salary |
DepartmentID |
1 |
Joe |
70000 |
1 |
2 |
Henry |
80000 |
2 |
3 |
Sam |
60000 |
2 |
4 |
Max |
90000 |
1 |
【START】
第一步:创建employee表
这里我将Id和DepartmentID作为字符串处理。
Id VARCHAR(1) NOT NULL,
Name VARCHAR(225) NOT NULL,
Salary INT ,
DepartmentID VARCHAR(1),
PRIMARY KEY(Id)
);
第二步:插入数据
INSERT INTO employee VALUES('1','Joe',70000,'1'),
('2','Henry',80000,'2'),
('3','Sam',60000,'2'),
('4','Max',90000,'1');
第三步:查看表及表数据
SELECT * FROM employee
【END】
2. 创建department 表,包含公司所有部门的信息。
Id | Name |
1 | IT |
2 | Sales |
【START】
参考上面的步骤,建立新的表然后插入数据
CREATE TABLE department (
Id VARCHAR(1) NOT NULL,
Name VARCHAR(225) NOT NULL,
PRIMARY KEY(Id)
);
INSERT INTO department VALUES('1','IT'),
('2','Sales');
SELECT * FROM department
【END】
3. 编写一个 SQL 查询,找出每个部门工资最高的员工。例如,根据上述给定的表格,Max 在 IT 部门有最高工资,Henry 在 Sales 部门有最高工资。
Department | Employee | Salary |
IT | Max | 90000 |
Sales | Henry | 80000 |
【START】
思路:观察数据,可以发现,Department在department 表中,其他两个字段在employee表中,我们需要将这两张表通过DepartmentID关联起来,之后再求出部门里面最高工资即可。
SELECT dep.Name AS Department,
emp.Name AS Employee,
MAX(emp.Salary) AS Salary
FROM employee emp
LEFT JOIN department dep
ON emp.DepartmentID = dep.ID
GROUP BY dep.Name;
【END】
换座位
纵列的id是连续递增的,写一个 SQL query,变相邻俩学生的座位。如果学生人数是奇数,则不需要改变最后一个同学的座位。
seat表:
id | student |
1 | Abbot |
2 | Doris |
3 | Emerson |
4 | Green |
5 | Jeames |
输出结果:
id | student |
1 | Doris |
2 | Abbot |
3 | Green |
4 | Emerson |
5 | Jeames |
【START】
思路:观察上面的数据,我们可以发现奇数行的相下一位,偶数行的向上一位,只需要将奇数行并且不等于最大id,则id+1,偶数行则id-1。
此处需要注意,因为我们会对id进行求余的运算,需要将id维护为INT型。
CREATE TABLE seat (
Id INT NOT NULL,
student VARCHAR(225) NOT NULL,
PRIMARY KEY(Id)
);
INSERT INTO seat VALUES(1,'Abbot'),
(2,'Doris'),
(3,'Emerson'),
(4,'Green'),
(5,'Jeames');
SELECT CASE WHEN id%2=0 THEN id-1
WHEN id%2=1 AND id!=(SELECT MAX(id) FROM seat) THEN id+1
ELSE id END ID,
student
FROM seat
ORDER BY ID
【END】
编写一个 SQL 查询来实现分数排名,如果两个分数相同,则两个分数排名(Rank)相同。请注意,平分后的下一个名次应该是下一个连续的整数值。换句话说,名次之间不应该有“间隔”。
创建score表:
Id | Score |
1 | 3.50 |
2 | 3.65 |
3 | 4.00 |
4 | 3.85 |
5 | 4.00 |
6 | 3.65 |
查询的结果应该为
Score | Rank |
4.00 | 1 |
4.00 | 1 |
3.85 | 2 |
3.65 | 3 |
3.65 | 3 |
3.50 | 4 |
【START】
思路:按照Score降序排列,并且得到Rank后的值,值是按照无间隔排序。使用窗口函数中的ROW_NUMBER()
注意:decimal(m,n)中m是总个数;命名为Rank的时候由于rank是内置函数,需要加上引号。
CREATE TABLE score_t (
Id INT NOT NULL,
Score DECIMAL(6,2) NOT NULL,
PRIMARY KEY(Id)
);
INSERT INTO score_t VALUES (1,3.50),
(2,3.65),
(3,4.00),
(4,3.85),
(5,4.00),
(6,3.65);
SELECT Score,
DENSE_RANK() OVER (ORDER BY Score desc) AS 'Rank'
FROM score_t
【END】
【START】
编写一个 SQL 查询,查找所有至少连续出现三次的数字。
Id | Num |
1 | 1 |
2 | 1 |
3 | 1 |
4 | 2 |
5 | 1 |
6 | 2 |
7 | 2 |
CREATE TABLE NUM (
Id INT NOT NULL,
Num INT NOT NULL,
PRIMARY KEY(Id)
);
INSERT INTO NUM VALUES (1,1),
(2,1),
(3,1),
(4,2),
(5,1),
(6,2),
(7,2)
【START】
思路:如果为连续出现的数,它的顺序排序与它分组后的顺序排序的差值是相等的(此处的排序必须为不重复排序)
SELECT Num AS ConsecutiveNums
FROM
(SELECT Num,
diff,
count(1) cot
FROM
(SELECT Num,
(row_number() OVER (ORDER BY Id)-row_number() OVER (PARTITION BY Num ORDER BY ID))
AS diff
FROM NUM) t
GROUP BY Num,
diff) t2
WHERE cot>=3;
【END】
对于tree表,id是树节点的标识,p_id是其父节点的id
id | p_id |
1 | null |
2 | 1 |
3 | 1 |
4 | 2 |
5 | 2 |
每个节点都是以下三种类型中的一种:
写一条查询语句打印节点id及对应的节点类型。按照节点id排序。上面例子的对应结果为:
id | Type |
1 | Root |
2 | Inner |
3 | Leaf |
4 | Leaf |
5 | Leaf |
说明
下面是树的图形:
注意
如果一个树只有一个节点,只需要输出根节点属性。
【START】
思路:首先需要拼接一张既有上节点又有下节点的表,然后再根据条件使用CASE WHEN对他们赋值。
CREATE TABLE tree (
id VARCHAR(4) NOT NULL,
p_id VARCHAR(4) ,
PRIMARY KEY(Id)
);
INSERT INTO tree VALUES (1,null),
(2,1),
(3,1),
(4,2),
(5,2)
SELECT DISTINCT id,
CASE WHEN p_id IS NULL AND t_id IS NOT NULL THEN 'Root'
WHEN p_id IS NOT NULL AND t_id IS NOT NULL THEN 'Inner'
WHEN p_id IS NOT NULL AND t_id IS NULL THEN 'Leaf'
END Type
FROM
(SELECT t1.id,
t1.p_id,
t2.id t_id
FROM tree t1
LEFT JOIN tree t2
ON t1.id = t2.p_id) t
【END】
Employee表包含所有员工及其上级的信息。每位员工都有一个Id,并且还有一个对应主管的Id(ManagerId)。
id | name | department | managerid |
101 | John | A | null |
102 | Dan | A | 101 |
103 | James | A | 101 |
104 | Amy | A | 101 |
105 | Anne | A | 101 |
106 | Ron | B | 101 |
针对Employee表,写一条SQL语句找出有5个下属的主管。对于上面的表,结果应输出:
注意:
没有人向自己汇报。
【START】
思路:找到managerid的名称,并统计managerid出现的次数,之后再做筛选。
CREATE TABLE employee_t (
id VARCHAR(3) NOT NULL,
name VARCHAR(25) NOT NULL,
department VARCHAR(1),
managerid VARCHAR(3),
PRIMARY KEY(id)
);
INSERT INTO employee_t VALUES (101,'John','A',null),
(102,'Dan','A',101),
(103,'James','A',101),
(104,'Amy','A',101),
(105,'Anne','A',101),
(106,'Ron','A',101)
SELECT managername AS name
FROM
SELECT t1.name,
t1.managerid,
t2.name AS managername,
COUNT(1) AS num
FROM employee_t t1
LEFT JOIN employee_t t2
ON t1.managerid = t2.id
GROUP BY t1.name,
t1.managerid) t3
GROUP BY managername
HAVING SUM(num) >= 5
结果:
【END】
练习三的分数表,实现排名功能,但是排名需要是非连续的
【START】
思路:将练习3中的desen_rank()改为rank即可。
SELECT Score,
RANK() OVER (ORDER BY Score desc) AS 'Rank'
FROM score_t
【END】
求出survey_log表中回答率最高的问题,表格的字段有:uid, action, question_id, answer_id, q_num, timestamp。
uid是用户id;action的值为:“show”, “answer”, “skip”;当action是"answer"时,answer_id不为空,相反,当action是"show"和"skip"时为空(null);q_num是问题的数字序号。
写一条sql语句找出回答率最高的问题。
uid | action | question_id | answer_id | q_num | timestamp |
5 | show | 285 | null | 1 | 123 |
5 | answer | 285 | 124124 | 1 | 124 |
5 | show | 285 | null | 2 | 125 |
5 | skip | 285 | null | 2 | 126 |
【SATRT】
思路:找到回到问题的次数以及问题的次数,之后再计算回答率并求出最大回答率。
CREATE TABLE survey_log (
uid VARCHAR(1) NOT NULL,
action VARCHAR(25) NOT NULL,
question_id VARCHAR(3),
answer_id VARCHAR(25),
q_num INT,
timestamp VARCHAR(3)
);
INSERT INTO survey_log VALUES (5,'show',285,null,1,123),
(5,'answer',285,124124,1,124),
(5,'show',369,null,2,125),
(5,'skip',369,null,2,126)
SELECT CASE WHEN MAX(answer_num/quest_num) THEN question_id END survey_log
FROM(
SELECT question_id,
1 quest_num,
count(answer_id) answer_num
FROM survey_log
GROUP BY question_id) t1
【END】
编写一个 SQL 查询,找出每个部门工资前三高的员工。此外,请考虑实现各部门前N高工资的员工功能
id | name | salary | departmentid |
1 | Joe | 70000 | 1 |
2 | Henry | 80000 | 2 |
3 | Sam | 60000 | 2 |
4 | Max | 90000 | 1 |
5 | Janet | 69000 | 1 |
6 | Randy | 85000 | 1 |
id | name |
1 | IT |
2 | Sales |
【START】
思路:根据departmentid以及name进行分组排序,然后找到排序前三的,这里使用的是窗口的函数ROW_NUMBER(),对于找前N高工资的功能只需要将最后where语句的3改为n即可。
CREATE TABLE employee_n (
id VARCHAR(3) NOT NULL,
name VARCHAR(25) NOT NULL,
salary INT,
departmentid VARCHAR(1),
PRIMARY KEY(id)
);
CREATE TABLE department (
Id VARCHAR(1) NOT NULL,
Name VARCHAR(225) NOT NULL,
PRIMARY KEY(Id)
);
INSERT INTO employee_n VALUES(1,'Joe',70000,1),
(2,'Henry',80000,2),
(3,'Sam',60000,2),
(4,'Max',90000,1),
(5,'Janet',69000,1),
(6,'Randy',85000,1);
INSERT INTO department VALUES('1','IT'),
('2','Sales');
SELECT Department,
Emplyee,
Salary
FROM
(SELECT dep.name Department,
emp.Name Emplyee,
emp.Salary,
ROW_NUMBER() OVER(PARTITION BY dep.name ORDER BY Salary) AS ranking
FROM employee_n emp
LEFT JOIN department dep
ON emp.departmentid = dep.id
)t1
WHERE ranking <= 3;
【END】
point_2d表包含一个平面内一些点(超过两个)的坐标值(x,y)
写一条查询语句求出这些点中的最短距离并保留2位小数。(注意: 所有点的最大距离小于10000)
x | y |
-1 | -1 |
0 | 0 |
-1 | -2 |
最短距离是1,从点(-1,-1)到点(-1,-2)。
结果为:
shortest |
1.00 |
【START】
思路:我们需要去遍历所有的x,y的组合,所以我们考虑用笛卡尔积,但是我们不需要自身到自身的距离,所以引入id序号,将id序号相同的排除,然后使用欧式距离计算公式sqrt((x1-x2)^2+(y1-y1)^2)得到最小距离。
CREATE TABLE point_2d
(id INT,
x INT,
y INT)
INSERT INTO point_2d VALUES
(1,-1,-1),
(2,0,0),
(3,-1,-2)
SELECT t1.x x1,
t1.y y1,
t2.x x2,
t2.y y2,
MIN(SQRT(POW((t1.x-t2.x),2)+POW((t1.y-t2.y),2))) shortest
FROM
point_2d t1,point_2d t2
WHERE t1.id <> t2.id
SELECT MIN(SQRT(POW((t1.x-t2.x),2)+POW((t1.y-t2.y),2))) shortest
FROM
point_2d t1,point_2d t2
WHERE t1.id <> t2.id
Trips 表中存所有出租车的行程信息。每段行程有唯一键 Id,Client_Id 和 Driver_Id 是 Users 表中 Users_Id 的外键。Status 是枚举类型,枚举成员为 (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’)
id | client_id | driver_id | city_id | status | request_at |
1 | 1 | 10 | 1 | completed | 2013-10-1 |
2 | 2 | 11 | 1 | cancelled_by_driver | 2013-10-1 |
3 | 3 | 12 | 6 | completed | 2013-10-1 |
4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-1 |
5 | 1 | 10 | 1 | completed | 2013-10-2 |
6 | 2 | 11 | 6 | completed | 2013-10-2 |
7 | 3 | 12 | 6 | completed | 2013-10-2 |
8 | 2 | 12 | 12 | completed | 2013-10-3 |
9 | 3 | 10 | 12 | completed | 2013-10-3 |
10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-3 |
Users 表存所有用户。每个用户有唯一键 Users_Id。Banned 表示这个用户是否被禁止,Role 则是一个表示(‘client’, ‘driver’, ‘partner’)的枚举类型。
user_id | banned | role |
1 | No | client |
2 | Yes | client |
3 | No | client |
4 | No | client |
10 | No | driver |
11 | No | driver |
12 | No | driver |
13 | No | driver |
写一段 SQL 语句查出2013年10月1日至2013年10月3日期间非禁止用户的取消率。基于上表,你的 SQL 语句应返回如下结果,取消率(Cancellation Rate)保留两位小数。
结果为:
Day | Cancellation Rate |
2013-10-01 | 0.33 |
2013-10-02 | 0.00 |
2013-10-03 | 0.50 |
【START】
思路:
第一,我们查的是request_at在2013-10-1至2013-10-3范围;
第二,banned要为Yes,一个是client_id需要匹配Users表,一个是driver_id需要匹配Users表
第三,Status包含cancelled的为取消订单。合计订单包含取消订单数,取消率为取消订单数除以合计订单数
CREATE TABLE trips
(id INT NOT NULL,
client_id INT,
driver_id INT,
city_id INT,
status VARCHAR(60),
request_at DATE);
INSERT INTO trips VALUES
(1,1,10,1,'completed','2013-10-1'),
(2,2,11,1,'cancelled_by_driver','2013-10-1'),
(3,3,12,6,'completed','2013-10-1'),
(4,4,13,6,'cancelled_by_client','2013-10-1'),
(5,1,10,1,'completed','2013-10-2'),
(6,2,11,6,'completed','2013-10-2'),
(7,3,12,6,'completed','2013-10-2'),
(8,2,12,12,'completed','2013-10-3'),
(9,3,10,12,'completed','2013-10-3'),
(10,4,13,12,'cancelled_by_driver','2013-10-3');
CREATE TABLE users
(user_id INT NOT NULL,
banned VARCHAR(4),
role VARCHAR(60));
INSERT INTO users VALUES
(1,'No','client'),
(2,'YES','client'),
(3,'No','client'),
(4,'No','client'),
(10,'No','driver'),
(11,'No','driver'),
(12,'No','driver'),
(13,'No','driver');
SELECT request_at Day,
ROUND(sum(cancelled_num)/sum(all_num),2) Cancellation_Rate
FROM
(SELECT request_at,
status,
count(1) all_num,
CASE WHEN t1.status <> 'completed' THEN 1 ELSE 0 END cancelled_num
FROM trips t1
LEFT JOIN users u1
ON t1.client_id = u1.user_id
LEFT JOIN users u2
ON t1.driver_id = u2.user_id
WHERE u1.banned = 'No'
AND u2.banned = 'No'
AND request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY request_at,
status) tp
GROUP BY request_at
【END】