【数据分析SQL练习LeetCode Hard】(3/21)

本篇文章将会记录LeetCode上21道SQL Hard题的解题思路和通过代码(包含Plus会员题),持续更新中。欢迎大家一起讨论~

LeetCode185-部门工资前三高的所有员工

编写一个 SQL 查询,找出每个部门获得前三高工资的所有员工
思路:前三高工资的是指按照工资进行排序,存在工资相同的情况。这里有多种解题思路: a. 窗口函数dense_rank()进行排序,b.利用自连接+groupby+having(count)找出各部门前三高的工资,以(departmentID,Salary)去和部门、员工匹配。

## way b
select
   d.name as department,
   e.name as employee,
   t1.salary
from department d
left join 
(
select
   e1.departmentId,
   e1.salary,
   count(distinct e2.salary) as num
from Employee e1,employee e2
where e1.departmentId = e2.departmentID and e2.salary >= e1.salary
group by e1.departmentId,e1.salary
having num <=3) t1
on d.id = t1.departmentID
inner join employee e
on t1.departmentId = e.departmentId and t1.salary = e.salary

LeetCode262-行程和用户

写一段 SQL 语句查出 2013年10月1日 至 2013年10月3日 期间非禁止用户的取消率。基于上表,你的 SQL 语句应返回如下结果,取消率(Cancellation Rate)保留两位小数。

思路:

  1. 首先需要明确“非禁止用户的取消率”的计算公式=(被司机或乘客取消的非禁止用户生成的订单数量) / (非禁止用户生成的订单总数)
  2. 将trips表分别left join从users表里筛选出的非禁止client和driver,并对时间进行筛选
  3. 利用groupby对日期进行分组,sum(case when then end)计算取消的数量,count(*)计算订单总数
select
    request_at as Day,
    round(sum(case when status = 'cancelled_by_client' or status = 'cancelled_by_driver' then 1 else 0 end) /count(*),2) as 'Cancellation Rate'
from trips t
left join (select * from users where banned='No' and role = 'client') c
    on t.client_id = c.users_id
left join (select * from users where banned = 'No' and role='driver') d
    on t.driver_id = d.users_id
where request_at between '2013-10-01' and '2013-10-03' and c.users_id is not null and d.users_id is not null
group by request_at
order by request_at

LeetCode1097-游戏玩法分析5⃣️

吐槽一下这道题的中文翻译,简单来说就是一道求次日留存率的题目。求留存率的主要思路是利用自链接构造临时表,利用group by 对时间进行分组,case when添加留存率的时间条件,sum distinct计算符合条件的id。这一题需要注意的是,需要先对player的安装日期进行判断,同时存在留存率为0的情况,也需要查询出来。

select
    a1.event_date as install_dt,
    count(distinct a1.player_id) as installs,
    round(sum(case when datediff(a2.event_date,a1.event_date)=1 then 1 else 0 end)/count(distinct a1.player_id),2) as Day1_retention
from
(
select
    *,
    row_number() over(partition by player_id order by event_date) as rn
from activity) a1,activity a2
where a1.player_id = a2.player_id
    and a1.rn = 1
group by a1.event_date

LeetCode1645-Hopper Company Queries II

题目解读:本题想要计算的是2020年每个月活跃司机的占比。活跃司机是指当月至少accept一次ride的司机。总司机数量是指截止计算月份,所有加入的司机的cumsum
思路:需要注意的是drivers表和rides表提供的都是记录数据,是间断的,而result要的是从1-12的连续数据。
1)所以第一步可以利用with recursive构建一列1-12月的month临时表。
2)1-12月的活跃司机数量,可以通过将month表left join 每个月活跃司机的数量,未关联上的null即为0个活跃司机数量。活跃司机的数量通过rides和acceptrides关联,根据month分组,利用count(distinct driver_id)计算得到,注意对司机id需要去重
3)总司机数量可以利用窗口函数sum over(order by date)累加得到,但注意需要先和month表join,得到空缺月份的新增司机数量为0,再进行累加

with recursive tn as 
(select 1 as n 
union all
select n+1 from tn where n<12)

select
   month,

   round(ifnull(num2/num1*100,0),2) as working_percentage
from
(
select
   t1.month,
   ifnull(sum(num1) over(order by month),0) as num1,
   ifnull(num2,0) as num2  
from
(
(select 0 as month,(select count(*) from drivers where join_date <'2020-1-1') as num1)
union all
(
   select 
       tn.n as month,
       count(driver_id) as num1
   from tn left join drivers d 
   on tn.n = month(d.join_date) and year(d.join_date) = 2020
   group by tn.n
)) t1
left join 
(
   select
       month(requested_at) as month,
       count(distinct driver_id) as num2
   from rides r left join acceptedrides a on r.ride_id = a.ride_id
   where year(requested_at) = 2020
   group by month(requested_at)
) t2 on t1.month = t2.month) t3
where month > 0

你可能感兴趣的:(【数据分析SQL练习LeetCode Hard】(3/21))