Create the Employee table, which holds all employee records. Each employee has an Id, a Salary, and a DepartmentId.
+----+-------+--------+--------------+
| Id | Name | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
+----+-------+--------+--------------+
Create the Department table, which holds information about every department in the company.
+----+----------+
| Id | Name |
+----+----------+
| 1 | IT |
| 2 | Sales |
+----+----------+
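Write a SQL query that finds the highest-paid employee in each department. For example, given the tables above, Max has the highest salary in the IT department and Henry has the highest salary in the Sales department.
# keep each employee whose salary equals the maximum salary of their department
select department.name as department,employee.name as employee,employee.salary as salary
from employee join department on employee.DepartmentId=department.id
where (employee.DepartmentId,employee.salary) in
(select DepartmentId,max(salary) from employee group by DepartmentId)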
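Xiaomei teaches information technology at a middle school. She keeps a seat table that stores each student's name and the corresponding seat id.
The id column is a consecutive, increasing sequence.
Xiaomei wants to swap the seats of every two adjacent students.
Can you write a SQL query that outputs the result she wants?
Create the seat table as shown below: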
select s.id,s.student from(
select id-1 as id,student from seat where mod(id,2)=0
union
select id+1 as id,student from seat where mod(id,2)=1 and id!=(select count(*) from seat)
union
select id,student from seat where mod(id,2)=1 and id=(select count(*) from seat)
)s order by id
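To see the swap in action, here is a minimal sketch that runs the same idea against an in-memory SQLite database. The five student names are hypothetical sample data, the single CASE expression is an alternative formulation of the UNION query above (not the original solution), and SQLite's `%` operator stands in for MySQL's `mod()`.

```python
import sqlite3

# Hypothetical sample data: five students in seats 1..5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seat (id INTEGER PRIMARY KEY, student TEXT)")
conn.executemany(
    "INSERT INTO seat VALUES (?, ?)",
    [(1, "Abbot"), (2, "Doris"), (3, "Emerson"), (4, "Green"), (5, "Jeames")],
)

# One CASE expression instead of three UNION branches.
SWAP_SQL = """
SELECT CASE
         WHEN id % 2 = 0 THEN id - 1                    -- even seat moves down
         WHEN id = (SELECT COUNT(*) FROM seat) THEN id  -- odd last seat stays
         ELSE id + 1                                    -- other odd seats move up
       END AS new_id,
       student
FROM seat
ORDER BY new_id
"""

swapped = conn.execute(SWAP_SQL).fetchall()
for row in swapped:
    print(row)
```

With an odd number of students the last one keeps their seat, which is the case the third branch of the UNION query handles.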
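Suppose that in a final exam the average scores of the four second-grade classes are 93, 93, 93, and 91. How many different rankings can be produced? Which function produces each one, and what does each ranking look like? (Consider descending order only.)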
+-------+-----------+
| class | score_avg |
+-------+-----------+
| 1 | 93 |
| 2 | 93 |
| 3 | 93 |
| 4 | 91 |
+-------+-----------+
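Use window functions. Three of them give three different rankings:
RANK: ties share the same rank, and the ranks that follow are skipped.
Example: with three records tied for first place: 1, 1, 1, 4, ...
DENSE_RANK: ties also share the same rank, but the ranks that follow are not skipped.
Example: with three records tied for first place: 1, 1, 1, 2, ...
ROW_NUMBER: assigns unique consecutive ranks.
Example: with three records tied for first place: 1, 2, 3, 4, ...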
# RANK: the ranking is 1, 1, 1, 4
SELECT class, score_avg, RANK() OVER (ORDER BY score_avg DESC) AS "ranking"
FROM Score;
# DENSE_RANK: the ranking is 1, 1, 1, 2
SELECT class, score_avg, DENSE_RANK() OVER (ORDER BY score_avg DESC) AS "ranking"
FROM Score;
# ROW_NUMBER: the ranking is 1, 2, 3, 4 (the order among the tied classes is arbitrary)
SELECT class, score_avg, ROW_NUMBER() OVER (ORDER BY score_avg DESC) AS "ranking"
FROM Score;
Write a SQL query that finds all numbers that appear at least three times in a row.
+----+-----+
| Id | Num |
+----+-----+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
+----+-----+
For example, given the Logs table above, 1 is the only number that appears at least three times in a row.
+-----------------+
| ConsecutiveNums |
+-----------------+
| 1 |
+-----------------+
select distinct t1.Num as ConsecutiveNums
from
subnum t1,
subnum t2,
subnum t3
where t1.Id = t2.Id-1
and t2.Id = t3.Id-1
and t1.Num=t2.Num
and t2.Num=t3.Num;
# Alternative: count the length of the current run with user variables
select distinct num as consecutiveNums
from(
select num,
case
when @prev = num then @count:=@count+1
when (@prev := num) is not null then @count := 1
end
as cnt
from subnum,(select @prev := null,@count := null)as t
)
as temp
where
temp.cnt >= 3
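# Alternative: same idea; rnk is the length of the current run
# (rank is a reserved word in MySQL 8.0, so the alias is rnk; distinct removes repeats for runs longer than three)
select distinct b.num as ConsecutiveNums
from (select z.num,@rownum:=@rownum+1 ,
if(@nums=z.num,@rank:=@rank+1,@rank:=1) as rnk,
@nums:=z.num
from subnum as z,
(select @rownum :=0 , @nums := null ,@rank:=0)as a
)as b
where b.rnk>=3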
In the tree table, id identifies a tree node and p_id is the id of its parent node.
+----+------+
| id | p_id |
+----+------+
| 1 | null |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
+----+------+
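Each node is one of three types: Root (the node has no parent), Leaf (the node has no children), or Inner (the node has both a parent and children).
Write a query that prints each node's id and its type, ordered by id. For the example above the result is: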
+----+------+
| id | Type |
+----+------+
| 1 | Root |
| 2 | Inner|
| 3 | Leaf |
| 4 | Leaf |
| 5 | Leaf |
+----+------+
Explanation
The tree looks like this:
      1
     / \
    2   3
   / \
  4   5
Note
If the tree has only one node, output only the Root type for it.
select id,(
case
# a node without a parent is the root, which also covers the single-node tree
when p_id is null then 'Root'
# a node that never appears as a parent has no children
when id not in(select p_id from tree where p_id is not null)then 'Leaf'
else 'Inner'
end)
as Type
from tree
order by id
The Employee table holds all employees and their managers. Each employee has an Id and a column for the Id of their manager (ManagerId).
+------+----------+-----------+----------+
|Id |Name |Department |ManagerId |
+------+----------+-----------+----------+
|101 |John |A |null |
|102 |Dan |A |101 |
|103 |James |A |101 |
|104 |Amy |A |101 |
|105 |Anne |A |101 |
|106 |Ron |B |101 |
+------+----------+-----------+----------+
Given the Employee table, write a SQL statement that finds the managers with 5 direct reports. For the table above, the result should be:
+-------+
| Name |
+-------+
| John |
+-------+
Note:
No one reports to themselves.
select e.name from employee2 as e,(
select managerid,count(*)as num
from employee2
group by managerid
having managerid is not null)as temp
where temp.num>=5 and temp.managerid=e.id
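Find the question with the highest answer rate in the survey_log table, whose columns are: uid, action, question_id, answer_id, q_num, timestamp.
uid is the user id. action takes one of the values "show", "answer", or "skip". answer_id is non-null when action is "answer" and null when action is "show" or "skip". q_num is the sequence number of the question.
Write a SQL statement that finds the question_id with the highest answer rate.
Example:
Input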
uid | action | question_id | answer_id | q_num | timestamp |
---|---|---|---|---|---|
5 | show | 285 | null | 1 | 123 |
5 | answer | 285 | 124124 | 1 | 124 |
5 | show | 369 | null | 2 | 125 |
5 | skip | 369 | null | 2 | 126 |
Output
question_id |
---|
285 |
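Explanation
Question 285 has an answer rate of 1/1, whereas question 369 has an answer rate of 0/1, so the output is 285.
Note:
The answer rate of a question is the number of times it was answered divided by the number of times it was shown.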
SELECT question_id as survey_log
FROM
(
SELECT question_id,
SUM(case when action = 'answer' THEN 1 ELSE 0 END) as num_answers,
SUM(case when action = 'show' THEN 1 ELSE 0 END) as num_shows
FROM survey_log
GROUP BY question_id
) as tbl
ORDER BY (num_answers / num_shows) DESC
LIMIT 1;
Empty the employee table from Exercise 1 and insert the following data (alternatively, copy the employee table from Exercise 1 and add rows 5 and 6):
+----+-------+--------+--------------+
| Id | Name | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
+----+-------+--------+--------------+
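Write a SQL query that finds the employees with the three highest salaries in each department. For example, given the tables above, the query should return: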
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT | Max | 90000 |
| IT | Randy | 85000 |
| IT | Joe | 70000 |
| Sales | Henry | 80000 |
| Sales | Sam | 60000 |
+------------+----------+--------+
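In addition, consider how to generalize this to the top N salaries in each department.
# rank is a reserved word in MySQL 8.0, so the alias is rnk; dense_rank keeps employees whose salaries tie
select Department.Name as Department,t.Name as Employee,t.Salary from
(select Id,Name,DepartmentId,Salary,dense_rank() over (partition by DepartmentId order by Salary DESC) as rnk
from employee) t
left join Department
on t.DepartmentId = Department.id
where t.rnk<=3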
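For the top-N variant, one option is to turn the cutoff into a bound parameter. The sketch below is an illustration, not the original solution: it swaps the window function for a correlated subquery that counts the distinct higher salaries in the same department, and it runs on an in-memory SQLite database instead of MySQL. The names `TOP_N_SQL` and `top_n_per_department` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT,
                       Salary INTEGER, DepartmentId INTEGER);
INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
INSERT INTO Employee VALUES
  (1, 'Joe', 70000, 1), (2, 'Henry', 80000, 2), (3, 'Sam', 60000, 2),
  (4, 'Max', 90000, 1), (5, 'Janet', 69000, 1), (6, 'Randy', 85000, 1);
""")

# An employee is in the top N when fewer than N distinct salaries
# in the same department are strictly higher than theirs.
TOP_N_SQL = """
SELECT d.Name AS Department, e.Name AS Employee, e.Salary
FROM Employee AS e
JOIN Department AS d ON e.DepartmentId = d.Id
WHERE (SELECT COUNT(DISTINCT e2.Salary)
       FROM Employee AS e2
       WHERE e2.DepartmentId = e.DepartmentId
         AND e2.Salary > e.Salary) < ?
ORDER BY d.Name, e.Salary DESC
"""

def top_n_per_department(connection, n):
    """Return (Department, Employee, Salary) rows for the top n distinct salaries."""
    return connection.execute(TOP_N_SQL, (n,)).fetchall()

top3 = top_n_per_department(conn, 3)
top1 = top_n_per_department(conn, 1)
for row in top3:
    print(row)
```

Calling `top_n_per_department(conn, 3)` reproduces the five rows of the expected result, and `n = 1` gives the highest-paid employee per department from Exercise 1.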
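The point_2d table holds the (x, y) coordinates of some points (more than two) in a plane.
Write a query that finds the shortest distance between any two of these points, rounded to 2 decimal places.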
|x | y |
|----|----|
| -1 | -1 |
| 0 | 0 |
| -1 | -2 |
The shortest distance is 1, from point (-1,-1) to point (-1,-2), so the output is:
+--------+
|shortest|
+--------+
|1.00 |
+--------+
**Note:** The largest distance between any two points is less than 10000.
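One way to approach this is a self-join that pairs every point with every other point and takes the minimum Euclidean distance. The sketch below is an illustration under two assumptions: the points are distinct (the join condition excludes pairs with identical coordinates), and SQLite stands in for MySQL, so `sqrt` is registered from Python's `math` module (MySQL has `SQRT` built in).

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Make sqrt available even if this SQLite build lacks the math functions.
conn.create_function("sqrt", 1, math.sqrt)
conn.execute("CREATE TABLE point_2d (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO point_2d VALUES (?, ?)",
    [(-1, -1), (0, 0), (-1, -2)],
)

# Pair each point with every different point, then keep the smallest distance.
SHORTEST_SQL = """
SELECT ROUND(MIN(sqrt((a.x - b.x) * (a.x - b.x)
                    + (a.y - b.y) * (a.y - b.y))), 2) AS shortest
FROM point_2d AS a
JOIN point_2d AS b
  ON a.x <> b.x OR a.y <> b.y
"""

shortest = conn.execute(SHORTEST_SQL).fetchone()[0]
print(f"{shortest:.2f}")
```

The self-join visits every pair twice, which is harmless for MIN.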
# Solution to the Trips exercise stated below: both the client and the driver must be unbanned
SELECT T.request_at AS Day,
ROUND(
SUM(IF(T.STATUS = 'completed',0,1))
/
COUNT(T.STATUS),2) AS `Cancellation Rate`
FROM Trips AS T
JOIN Users AS U1 ON (T.client_id = U1.users_id AND U1.banned ='No')
JOIN Users AS U2 ON (T.driver_id = U2.users_id AND U2.banned ='No')
WHERE T.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY T.request_at
The Trips table stores all taxi trips. Each trip has a unique key Id; Client_Id and Driver_Id are foreign keys to Users_Id in the Users table. Status is an enum with the members ('completed', 'cancelled_by_driver', 'cancelled_by_client').
Id | Client_Id | Driver_Id | City_Id | Status | Request_at |
---|---|---|---|---|---|
1 | 1 | 10 | 1 | completed | 2013-10-1 |
2 | 2 | 11 | 1 | cancelled_by_driver | 2013-10-1 |
3 | 3 | 12 | 6 | completed | 2013-10-1 |
4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-1 |
5 | 1 | 10 | 1 | completed | 2013-10-2 |
6 | 2 | 11 | 6 | completed | 2013-10-2 |
7 | 3 | 12 | 6 | completed | 2013-10-2 |
8 | 2 | 12 | 12 | completed | 2013-10-3 |
9 | 3 | 10 | 12 | completed | 2013-10-3 |
10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-3 |
The Users table stores all users. Each user has a unique key Users_Id. Banned indicates whether the user is banned, and Role is an enum with the members ('client', 'driver', 'partner').
+----------+--------+--------+
| Users_Id | Banned | Role |
+----------+--------+--------+
| 1 | No | client |
| 2 | Yes | client |
| 3 | No | client |
| 4 | No | client |
| 10 | No | driver |
| 11 | No | driver |
| 12 | No | driver |
| 13 | No | driver |
+----------+--------+--------+
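Write a SQL statement that finds the cancellation rate of unbanned users between October 1, 2013 and October 3, 2013. Based on the tables above, your statement should return the following result, with the cancellation rate (Cancellation Rate) rounded to two decimal places.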
+------------+-------------------+
| Day | Cancellation Rate |
+------------+-------------------+
| 2013-10-01 | 0.33 |
| 2013-10-02 | 0.00 |
| 2013-10-03 | 0.50 |
+------------+-------------------+
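# Variant that filters out banned clients only; this is enough for the sample data because no driver is banned
select t.Request_at Day,(round(count(if(status!="completed",status,null))/count(status),2) ) `Cancellation Rate`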
from Users u inner join Trips t
on u.Users_id = t.Client_Id and u.banned != 'Yes'
where t.Request_at >= '2013-10-01' and t.Request_at <= '2013-10-03'
group by t.Request_at
Suppose three children A, B, and C have the following final exam scores:
+-----+-----------+------+
| name| subject |score |
+-----+-----------+------+
| A | chinese | 99 |
| A | math | 98 |
| A | english | 97 |
| B | chinese | 92 |
| B | math | 91 |
| B | english | 90 |
| C | chinese | 88 |
| C | math | 87 |
| C | english | 86 |
+-----+-----------+------+
Use SQL to convert the scores above into the following format:
+-----+-----------+------+---------+
| name| chinese | math | english |
+-----+-----------+------+---------+
| A | 99 | 98 | 97 |
| B | 92 | 91 | 90 |
| C | 88 | 87 | 86 |
+-----+-----------+------+---------+
select name,
SUM(case subject when 'chinese' then score else 0 end) as chinese,
SUM(case subject when 'math' then score else 0 end) as math,
SUM(case subject when 'english' then score else 0 end) as english
from scores group by name
Suppose three children A, B, and C have the following final exam scores:
+-----+-----------+------+---------+
| name| chinese | math | english |
+-----+-----------+------+---------+
| A | 99 | 98 | 97 |
| B | 92 | 91 | 90 |
| C | 88 | 87 | 86 |
+-----+-----------+------+---------+
Use SQL to convert the scores above into the following format:
+-----+-----------+------+
| name| subject |score |
+-----+-----------+------+
| A | chinese | 99 |
| A | math | 98 |
| A | english | 97 |
| B | chinese | 92 |
| B | math | 91 |
| B | english | 90 |
| C | chinese | 88 |
| C | math | 87 |
| C | english | 86 |
+-----+-----------+------+
select name,
'chinese' as subject,
chinese as score
from scores2
union all
select name,
'math' as subject,
math as score
from scores2
union all
select name,
'english' as subject,
english as score
from scores2
order by name
Suppose a platform's daily sales statistics for its livestream sales anchors in 2021 are as follows:
Table name: anchor_sales
+-------------+------------+---------+
| anchor_name | date | sales |
+-------------+------------+---------+
| A | 20210101 | 40000 |
| B | 20210101 | 80000 |
| A | 20210102 | 10000 |
| C | 20210102 | 90000 |
| A | 20210103 | 7500 |
| C | 20210103 | 80000 |
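+-------------+------------+---------+
Definition: if an anchor's sales on a given day account for 90% or more of the platform's total sales that day, the anchor is called a star anchor and that day is called a star-anchor day.
Use SQL to compute the following:
a. How many star-anchor days were there in 2021?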
select
count(if(a.percent>=0.9,1,null)) number_date
from
(select
date
,anchor_name
,sales
,sum(sales) over(partition by date order by date) total_sales
,sales/sum(sales) over(partition by date order by date) percent
from anchor_sales) a
b. How many star anchors were there in 2021?
select
count(distinct anchor_name) number_person
from
(select
date
,anchor_name
,sales
,sum(sales) over(partition by date order by date) total_sales
,sales/sum(sales) over(partition by date order by date) percent
from anchor_sales) a
where a.percent>=0.9
Suppose we have the following match results (胜 = win, 负 = loss):
+--------------+-----------+
| cdate | result |
+--------------+-----------+
| 2021-01-01 | 胜 |
| 2021-01-01 | 胜 |
| 2021-01-01 | 负 |
| 2021-01-03 | 胜 |
| 2021-01-03 | 负 |
| 2021-01-03 | 负 |
+--------------+-----------+
Use SQL to convert the match results into the following form (比赛日期 = match date):
+--------------+-----+-----+
| 比赛日期 | 胜 | 负 |
+--------------+-----+-----+
| 2021-01-01 | 2 | 1 |
| 2021-01-03 | 1 | 2 |
+--------------+-----+-----+
select
cdate as '比赛日期'
,count(if(result='胜',1,null)) as '胜'
,count(if(result='负',1,null)) as '负'
from cresults
group by 1
Suppose we have the following match results (比赛日期 = match date, 胜 = win, 负 = loss):
+--------------+-----+-----+
| 比赛日期 | 胜 | 负 |
+--------------+-----+-----+
| 2021-01-01 | 2 | 1 |
| 2021-01-03 | 1 | 2 |
+--------------+-----+-----+
Use SQL to convert the match results into the following form:
+--------------+-----------+
| cdate | result |
+--------------+-----------+
| 2021-01-01 | 胜 |
| 2021-01-01 | 胜 |
| 2021-01-01 | 负 |
| 2021-01-03 | 胜 |
| 2021-01-03 | 负 |
| 2021-01-03 | 负 |
+--------------+-----------+
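# expand each count into that many rows with a recursive CTE (MySQL 8.0+)
with recursive r as (
select 比赛日期 as cdate,'胜' as result,胜 as cnt from results where 胜>0
union all
select 比赛日期,'负',负 from results where 负>0
union all
# emit one more copy of the row until its count is used up
select cdate,result,cnt-1 from r where cnt>1
)
select cdate,result from r
order by 1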
The user activity log table t_act_records has two columns: uid (user ID) and imp_date (date).
The MySQL statements that build the table are:
DROP TABLE if EXISTS t_act_records;
CREATE TABLE t_act_records
(uid VARCHAR(20),
imp_date DATE);
INSERT INTO t_act_records VALUES('u1001', 20210101);
INSERT INTO t_act_records VALUES('u1002', 20210101);
INSERT INTO t_act_records VALUES('u1003', 20210101);
INSERT INTO t_act_records VALUES('u1003', 20210102);
INSERT INTO t_act_records VALUES('u1004', 20210101);
INSERT INTO t_act_records VALUES('u1004', 20210102);
INSERT INTO t_act_records VALUES('u1004', 20210103);
INSERT INTO t_act_records VALUES('u1004', 20210104);
INSERT INTO t_act_records VALUES('u1004', 20210105);
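Suppose we need an algorithm that recommends products to each user_id. The algorithm is fairly simple: recommend the products bought by users similar to that user, as described below:
Input table: orders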
+---------+------------+
| user_id | product_id |
+---------+------------+
| 123 | 1 |
| 123 | 2 |
| 123 | 3 |
| 456 | 1 |
| 456 | 2 |
| 456 | 4 |
+---------+------------+
Output table:
+---------+------------+
| user_id | product_id |
+---------+------------+
| 123 | 4 |
| 456 | 3 |
+---------+------------+
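The sketch below shows one way the recommendation could be written. The similarity rule is an assumption, since the exercise text above does not spell it out: two users count as similar when they have bought at least two products in common, and products a user already owns are not recommended. The CTE name `similar_users` is hypothetical, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, product_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(123, 1), (123, 2), (123, 3), (456, 1), (456, 2), (456, 4)],
)

RECOMMEND_SQL = """
WITH similar_users AS (
  -- assumed rule: pairs of users who share at least two products
  SELECT a.user_id AS user_id, b.user_id AS peer_id
  FROM orders AS a
  JOIN orders AS b
    ON a.product_id = b.product_id AND a.user_id <> b.user_id
  GROUP BY a.user_id, b.user_id
  HAVING COUNT(DISTINCT a.product_id) >= 2
)
SELECT DISTINCT s.user_id, o.product_id
FROM similar_users AS s
JOIN orders AS o ON o.user_id = s.peer_id
WHERE NOT EXISTS (
  -- skip products the user has already bought
  SELECT 1 FROM orders AS own
  WHERE own.user_id = s.user_id AND own.product_id = o.product_id
)
ORDER BY s.user_id, o.product_id
"""

recommendations = conn.execute(RECOMMEND_SQL).fetchall()
for row in recommendations:
    print(row)
```

On the sample data, users 123 and 456 share products 1 and 2, so each is offered the one product only the other has bought.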
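Suppose table t1 has 6 rows (2 of them with NULL in the join column name) and table t2 has 6 rows (3 of them with NULL in the join column name).
How many rows will SELECT * FROM t1 LEFT JOIN t2 on t1.name = t2.name return?
Refer to the figure below.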
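The exact count depends on the actual values, but the rule is fixed: NULL = NULL does not evaluate to true, so rows with a NULL name never match, while the LEFT JOIN still keeps every t1 row. Here is a minimal sketch with made-up values (the names 'a' to 'd' are hypothetical, and SQLite stands in for MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (name TEXT)")
conn.execute("CREATE TABLE t2 (name TEXT)")
# Hypothetical values; Python's None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO t1 VALUES (?)",
    [("a",), ("b",), ("c",), ("d",), (None,), (None,)],
)
conn.executemany(
    "INSERT INTO t2 VALUES (?)",
    [("a",), ("b",), ("c",), (None,), (None,), (None,)],
)

rows = conn.execute(
    "SELECT t1.name, t2.name FROM t1 LEFT JOIN t2 ON t1.name = t2.name"
).fetchall()

# Rows whose right side is NULL found no partner in t2.
matched = [r for r in rows if r[1] is not None]
unmatched = [r for r in rows if r[1] is None]
print(len(rows), len(matched), len(unmatched))
```

With these values the join returns 6 rows: three matches plus three t1 rows padded with NULL ('d' and the two NULL names). Duplicate non-NULL names on either side would raise the count, because each matching pair produces its own row.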