leetcode的高频SQL50题基础版(答案和要点)

第一题:

1.可回收且低脂的产品

select
product_id
from
Products
where
low_fats = "Y" and recyclable = "Y"

2.寻找用户推荐人

考察SQL里面空值也是不等于2的。

即:数值不等于2或者空值。也可以用<=>表示,即not referee_id <=> 2。

select
name
from
customer
where
referee_id <> 2 or referee_id is NULL

3.大的国家

select
name,population,area
from
World
where
area >= 3000000 or population >= 25000000

4.文章浏览 I

select
distinct author_id as id
from
Views
where
viewer_id = author_id
order by
author_id

5.无效的推文

 length()中一个汉字会算3个字符,而char_length() 一个汉字算一个字符。

select
tweet_id
from
Tweets
where length(content) > 15

6.使用唯一标识码替换员工ID

select
unique_id,name
from
Employees
left join
EmployeeUNI
on
Employees.id = EmployeeUNI.id

7.产品销售分析 I

select
product_name,year,price
from
Sales
left join
Product
on
Sales.product_id = Product.product_id

8.进店却未进行过交易的顾客

select
customer_id,
count(*) count_no_trans
from
Visits
left join
Transactions
on
Visits.visit_id = Transactions.visit_id
where
Transactions.visit_id is null
group by
customer_id

9.上升的温度

数据库表纵向相比的思路就行转化为同表联立后做差。

日期做差函数:datediff(W1.recordDate,W2.recordDate) = 1

select
W1.id as id
from
Weather W1
left join
Weather W2
on
datediff(W1.recordDate,W2.recordDate) = 1
where
W1.Temperature > W2.Temperature

10.每台机器的进程平均运行时间

小数取值函数round(,3)。

这题可以采用上题的思路,自连接后进行数据判断,然后用平均函数计算一下就行,即:

select
a1.machine_id,
round(avg(a2.timestamp - a1.timestamp),3) processing_time
from
Activity a1
left join
Activity a2
on
a1.machine_id = a2.machine_id and a1.process_id = a2.process_id
where 
a1.activity_type = "start" and a2.activity_type = "end"
group by
machine_id

答案给了其他的思路,即采用根据activity_type字段的值来判断timestamp是否取负值,采用if函数来完成这个操作。如果数据库的不同字段存在一定关联的时候,要能够想到这样的解法可能。

select
    machine_id,
    round(avg(if(activity_type='start', -timestamp, timestamp))*2, 3) processing_time 
from Activity 
group by machine_id

11.员工奖金

select
name,bonus
from
Employee
left join
Bonus
on
Employee.empId = Bonus.empId
where
bonus < 1000 or bonus is null

12.学生们参加各科测试的次数

笛卡尔积可以用cross join 或者join表示;

select
Students.student_id,student_name,Subjects.subject_name,
count(Examinations.subject_name) attended_exams
from
Students
cross join
Subjects
left join
Examinations
on
Examinations.student_id = Students.student_id
and
Examinations.subject_name = Subjects.subject_name
group by
Students.student_id,subject_name
order by
student_id,subject_name

13.至少有5名直接下属的经理

select
name
from(
select
e1.name,
count(*) num
from
Employee e1
left join
Employee e2
on
e1.id = e2.managerId
group by
e1.id
having
num >=5)t

14.确认率

比率问题可以用MySQL的avg+if进行操作,null值会被归为0,无需ifnull。

select
Signups.user_id,
round(avg(if(action = 'confirmed',1,0)),2) confirmation_rate
from
Signups
left join
Confirmations
on
Signups.user_id = Confirmations.user_id
group by
Signups.user_id

15.有趣的电影

select
id,movie,description,rating
from
cinema
where
description <> 'boring' and id%2 = 1
order by
rating desc

16.平均售价

能在select里面用if的就要考虑是不是取数据的时候就取少一点。

select
UnitsSold.product_id,
round(sum(price*units)/sum(units),2) average_price
from
UnitsSold
join
Prices
on
UnitsSold.product_id = Prices.product_id
where
purchase_date between start_date and end_date
group by
product_id

17.项目员工 I

select
project_id,
round(sum(experience_years)/count(*),2)
average_years
from
Project
left join
Employee
on
Project.employee_id = Employee.employee_id
group by
project_id

18.各赛事的用户注册率

取一个数据库的值直接(select count(*) from Users)就行。

select
contest_id,
round(100 * count(Register.user_id)/(select count(*) from Users),2)
percentage
from
Register
left join
Users
on
Register.user_id = Users.user_id
group by
contest_id
order by
percentage desc,contest_id

19.查询结果的质量和占比

select
query_name,
round(avg(rating / position),2) quality,
round(100*sum(if(rating<3,1,0))/count(*),2) poor_query_percentage
from
Queries
group by
query_name

20.每月交易 I

不能直接group by的情况,最好在select部分把划分依据修改好,在group by后直接用别名划分最好。

select
date_format(trans_date,"%Y-%m") month,country,
count(*) trans_count,
sum(if(state = 'approved',1,0)) approved_count,
sum(amount) trans_total_amount,
sum(if(state = 'approved',amount,0)) approved_total_amount
from
Transactions
group by
month,country

21.即时食物配送 II

select
round(100 * sum(if(order_date = customer_pref_delivery_date,1,0))/count(*),2) immediate_percentage
from
Delivery
where
(customer_id, order_date) in (select customer_id, min(order_date)
from Delivery group by customer_id)

22.游戏玩法分析 IV

当连接过的表除了需要的行外,全为null时。在分组情况下,avg(列名)可以直接得出比例。

select
round(avg(event_date is not null),2) fraction
from
(select
player_id, min(event_date) as login
from
Activity
group by
player_id)t
left join
Activity
on
t.player_id = Activity.player_id and
datediff(Activity.event_date, t.login) = 1

23.每位教师所教授的科目种类的数量

select
teacher_id, count(distinct subject_id) cnt
from
Teacher
group by
teacher_id

24.查询近30天活跃用户数

select
activity_date day,
count(distinct user_id) active_users
from
Activity
where
activity_date between "2019-06-28" and "2019-07-27"
group by
activity_date

25.销售分析III

只在某个日期可以直接分组后判断最小日期和最大日期是否在这个区间(where后面不能跟聚合函数)。

select
Product.product_id, product_name
from
Product
join
Sales
on
Product.product_id = Sales.product_id
group by
Product.product_id
having
min(sale_date) >= "2019-01-01" and max(sale_date) <= "2019-03-31"

26.超过5名学生的课

select
class
from
Courses
group by
class
having count(student) >= 5

27.求关注者的数量

select
user_id,
count(distinct follower_id) followers_count
from
Followers
group by
user_id

28.只出现一次的最大数字

没有输出结果要输出null的情况,只需要外层嵌套一个select就行。

select(
select
num
from
MyNumbers
group by
num
having
count(distinct num) = count(num)
order by
num desc limit 1)as num

29.买下所有产品的客户

select
customer_id
from 
Customer
group by
customer_id
having
count(distinct product_key) = (select count(*) from Product)

30.每位经理的下属员工数量

select
Employees.reports_to employee_id,e2.name,
count(*) reports_count,
round(avg(Employees.age),0) average_age
from
Employees
join
Employees e2
on 
Employees.reports_to = e2.employee_id
group by
Employees.reports_to
having
employee_id is not null
order by
employee_id

31.员工的直属部门

nuion可以自动去重,union all是保留重复。

select
employee_id,
department_id
from
Employee
group by
employee_id
having
count(*) = 1
union
select
employee_id,
department_id
from
Employee
where
primary_flag = 'Y'
group by
employee_id

32.判断三角形

select
x,y,z,
if(x+y>z and x+z>y and y+z>x, 'Yes', 'No') triangle
from
Triangle

33.连续出现的数字

用窗口函数lag()来解决这个问题。

select
distinct num ConsecutiveNums
from
(
    select
    id,num,
    lag(num,1)over(order by id) num1,
    lag(num,2)over(order by id) num2
    from
    Logs
) t
where
t.num = t.num1 and t.num1 = t.num2

34.指定日期的产品价格

思路介绍一下:

1.先取出指定日期之前的数据,然后从中再取出最大时间的。这就可以得到规定日期之前的最新数据,但是这数据不包括没有修改的数据。

2.取出全表数据,然后连接,补齐所有数据。

select
t1.product_id,
if(t2.new_price is null,10,t2.new_price) price
from
(
  select product_id,new_price
  from Products
  group by
  product_id
)t1
left join
(
select
product_id,
new_price
from
Products
where
(product_id,change_date) in (
  select product_id, max(change_date)
  from
  Products 
  where change_date <= '2019-08-16'
  group by
  product_id
)
)t2
on
t1.product_id = t2.product_id

35.最后一个能进入电梯的人

用窗口函数累加,累加完成后取出累加值小于1000的,最后取出最后一行即可。

select
person_name
from
(
    select
    person_name,
    sum(weight) over(order by turn) as s
    from
    Queue
)t
where t.s <= 1000
order by
s desc limit 1

36.按分类统计薪水

select
'Low Salary' category,
count(*) accounts_count
from
Accounts
where
income < 20000
union
select
'Average Salary' category,
count(*) accounts_count
from
Accounts
where
income >= 20000 and income <= 50000
union
select
'High Salary' category,
count(*) accounts_count
from
Accounts
where
income > 50000

37.上级经理已离职的公司员工

提一下多个or和and相连的准则吧:condition1  OR   condition2  AND   condition3

其运算实际上是等价于:condition1 OR   (condition2 AND condition3)     //先运算and  再运算or

select
employee_id
from
Employees
where
manager_id not in (select employee_id from Employees) and salary < 30000
order by
employee_id

38.换座位

select(
  case
  when id % 2 = 1 and id = (select count(*) from Seat) then id
  when id % 2 = 1 then id + 1
  else id - 1
  end
)as id, student
from
Seat
order by
id

39.电影评分

如果 union/union all前后有 order by,limit 就需要加括号,否则语法通过不了。

(select
name results
from
MovieRating
left join
Users
on
MovieRating.user_id = Users.user_id
group by
MovieRating.user_id
order by 
count(movie_id) desc, Users.name limit 1)
union all
(select
title results
from
MovieRating
left join
Movies
on
MovieRating.movie_id = Movies.movie_id
where
MovieRating.created_at like "2020-02%"
group by
MovieRating.movie_id
order by
avg(rating) desc, Movies.title limit 1)

40.餐馆营业额变化增长

1.窗口函数range和row字段的使用。

2.表中有重复行,按天排序才行,所以用dense_rank而不能用row_number。

select
distinct visited_on,
t.amount,
t.average_amount
from(
select
distinct visited_on,
sum(amount) over(order by visited_on range interval 6 day preceding) amount,
round(sum(amount) over(order by visited_on range interval 6 day preceding)/7,2) average_amount,
dense_rank() over(order by visited_on) rk
from
Customer)t
where t.rk > 6
order by 
visited_on

41.好友申请 II :谁有最多的好友

select
t.requester_id id,
count(*) num
from(
select
requester_id,accepter_id,accept_date
from
RequestAccepted
union all
select
accepter_id,requester_id,accept_date
from
RequestAccepted
)t
group by
t.requester_id
order by
num desc limit 1

42.2016年的投资

多个条件判断的不能用where的情况,考虑用窗口函数。

select
round(sum(tiv_2016),2) tiv_2016
from
(
    select
    tiv_2016,
    count(*) over(partition by tiv_2015) c1,
    count(*) over(partition by lat,lon) c2
    from
    Insurance
)t
where
t.c1>1 and t.c2 = 1

43. 部门工资前三高的所有员工

select
Department, Employee, Salary
from(
select
Department.name Department, Employee.name Employee, Salary,
dense_rank() over(partition by departmentId order by salary desc) rk
from
Employee
join
Department
on
Employee.departmentId = Department.id)t
where t.rk <= 3

44.修复表中的名字

CONCAT 用来拼接字符串 ;LEFT 从左边截取字符 ;RIGHT 从右边截取字符;UPPER 变为大写 ; LOWER 变为小写 ;LENGTH 获取字符串长度。

select
user_id,
concat(upper(left(name,1)),lower(right(name,length(name)-1))) name
from
Users
order by
user_id

45.患某种疾病的患者

select
patient_id,patient_name,conditions
from
Patients
where
conditions like "% DIAB1%" or conditions like "DIAB1%"

46.删除重复的电子邮箱

删除语句不是查询语句。

delete from Person
where id in (
  select
  id
  from(
  select
  id,
  row_number() over(partition by email order by id) rk
  from
  Person)t
  where t.rk > 1
)

47.第二高的薪水

主要有这几种可能:1.如果第一高的薪水有两个,那么第二高的薪水默认没有,即为空,这样的话row_number()就不合适了;2.rank()函数会跳数,会出现没有第二高薪水的情况;3.dense_rank()比较合适,可能会输出重复的,去个重就可以解决。

select
(
select
distinct t.salary SecondHighestSalary
from
(select
id,salary,
dense_rank() over(order by salary desc) rn
from
Employee)t
where
t.rn = 2)as SecondHighestSalary

48.按日期分组销售产品

组内字符串拼接函数:group_concat(distinct product order by product separator ',')

select
sell_date,
count(distinct product) num_sold,
group_concat(distinct product order by product separator ',') products
from
Activities
group by
sell_date

49.列出指定时间段内所有的下单产品

select
product_name,
sum(unit) unit
from
Orders
join
Products
on
Orders.product_id = Products.product_id
where
order_date like "2020-02%"
group by
Orders.product_id
having
unit >= 100

50.查找拥有有效邮箱的用户

正则表达式的应用

select
user_id,name,mail
from
Users
where
mail rlike '^[a-z|A-Z][0-9|A-Z|a-z|_|.|/|-]*@leetcode\\.com$'

你可能感兴趣的:(大数据,学习,sql,mysql)