SQL面试题, since 2022-03-29

(2022.03.29 Tues)

1. 存在一个关系score_board(stu_id, stu_name, subjects, stu_score)，保存的是学生姓名、学号、科目和成绩。找出最低成绩大于80的学生姓名。

SELECT stu_name
FROM score_board
GROUP BY stu_name
HAVING min(stu_score) > 80;

另一种解法，排除法

SELECT stu_name
FROM score_board
WHERE stu_name NOT IN (
SELECT stu_name 
FROM score_board
WHERE score < 80);

1. 同上题关系，找出平均成绩大于80的学生姓名和平均成绩

SELECT stu_name, AVG(stu_score)
FROM score_board
GROUP BY stu_name
HAVING AVG(stu_score) > 80;

1. 同上题关系，加入一个自增的stid列作为primary key，关系中有些记录的内容重复，仅stid不同。显示非冗余项。

SELECT * 
FROM score_board
WHERE stid in (
SELECT min(stid)
FROM score_board
GROUP BY stu_name, stu_score, subjects, stu_score);

或删除

DELETE FROM score_board
WHERE stid not in (
SELECT min(stid)
FROM score_board
GROUP BY stu_name, stu_score, subjects, stu_score);

1. 一个关系r，仅有一列，该列的值为单一的字母。查询出该列中不同的字母两两得到的所有可能组合。

SELECT a.cname, b.cname
FROM alphabet_table a, alphabet_table b
where a.cname < b.cname;

1. 一个关系r(years, months, amount)，其中的month字段的取值范围是[1, 2, 3, 4]。写一个查询，查询结果是关系k(years, month1, month2, month3, month4)，其中month1到month4字段保留的是特定years在对应month的amount值。
  提示：使用case命令的话会出现比较多的空值。

SELECT years,
(select amount from month_amt m where months = 1 and m.years = a.years) as 'm1',
(select amount from month_amt m where months = 2 and m.years = a.years) as 'm2',
(select amount from month_amt m where months = 3 and m.years = a.years) as 'm3',
(select amount from month_amt m where months = 4 and m.years = a.years) as 'm4'
FROM month_amt a
GROUP BY a.years;

(2022.03.31 Thur)
如果每个月份有多个记录

SELECT years,
SUM(CASE WHEN months = 1 then month_amt else 0 end) AS m1,
SUM(CASE WHEN months = 2 then month_amt else 0 end) AS m2,
SUM(CASE WHEN months = 3 then month_amt else 0 end) AS m3,
SUM(CASE WHEN months = 4 then month_amt else 0 end) AS m4
FROM month_amt
group by years;

(2022.03.31 Thur)

1. 创建关系满足，1) 根据另一个关系的结构，2) 根据另一个关系结构并复制数据

CREATE TABLE table2 LIKE table1;

CREATE TABLE table2 LIKE table1;
INSERT INTO table2 SELECT * FROM table1;

1. 有关系a(article, submitter, update_ts)，返回每个文章和提交人的最后update时间。注，同一个文章可能被多人update，显示每个submitter的最后update时间。

SELECT article, submitter, max(update_ts)
FROM a
GROUP BY article, submitter;

另一个方法，可用性令人怀疑

select a.article, a.submitter, c.update_ts 
from a b, 
(select max(update_ts) from a where a.article=b.article) c;

1. 有主副两个关系，删除主关系中在副关系中不存在的sid

DELETE FROM primary_table NOT EXISTS (SELECT * FROM primary_table where primary_table.sid=secondary_table.sid);

1. 有两个关系da和db，都有key和value两个字段，修改表da的值，使得如果db表中有相同的key字段值，则da中对应的value字段与db中的value字段相同。

UPDATE da SET value = (select value from db where da.key = db.key) 
where key in (select key from db where db.key = da.key);

1. 成绩单关系r(name, subjects, score)，返回一个新的关系r1(name, subjects, score, grade)，成绩高于80则good，否则为normal。

select *,
case when score > 80 then 'good'
else 'normal' end as mark
from r

1. 存在一个关系score_board(stu_id, stu_name, subjects, stu_score)，保存的是学生姓名、学号、科目和成绩。找出没有一门成绩低于75，并且有至少两门成绩高于85分的学生，并给出其平均成绩。

SELECT stu_name, AVG(score)
FROM score_board
WHERE 
stu_name NOT IN (
    SELECT stu_name FROM score_board
    WHERE score < 75)
AND
stu_name IN (
    SELECT stu_name
    FROM score_board
    WHERE score > 85
    GROUP BY stu_name
    HAVING count(*) > 1)
GROUP BY stu_name
ORDER BY AVG(score);

(2022.04.11 Mon)

1. 依然考虑关系score_board(stu_id, stu_name, subjects, stu_score)，找出每个科目最高分，已经该学生的信息。

select * 
from score_board a
where score in 
(select max(score) from score_board b 
 where a.subjects=b.subjects);

或者使用窗口函数。最低最高的问题都可以用窗口函数解决，比如FIRST_VALUE配合DESC/ASC解决。

SELECT *
FROM (
SELECT *, 
FIRT_VALUE(score) OVER(PARTITION BY subjects ORDER BY scores DESC) AS highest_score
FROM score_board) AS a
WHERE a.score = a.highest_score;

1. 考虑关系score_board(stu_id, stu_name, subjects, stu_score)，已知有所有人都参加了三门课考试，找出每个人的最低成绩，和其他信息。

select * from 
(select *, 
 dense_ranking() over(partition by stu_name order by scores desc) AS ranking 
 from score_board) a 
where a.ranking > 2;

注意，该题可转化为经典的Top N问题，或“既要分组又要排序”问题，而该问题都可以用ranking的窗口函数来解决。

1. 考虑关系score_board(stu_id, stu_name, subjects, stu_score)，找出成绩低于科目平均成绩的同学信息

select * 
from test_data.stu_score a
where score < 
(select avg(score) from test_data.stu_score b 
 where a.subjects = b.subjects);

(2022.05.27 Fri)

15 (leetcode 176. second highest score, easy) 一个关系employ(id, salary)，该关系中id是primary key不重复。返回一个关系仅有变量SecondHighestSalary其中给出工资第二高的salary

select max(salary) SecondHighestSalary
from employ
where salary < (select max(salary) from employ);

17 (leetcode 177. the Nth highest salary) 一个关系employ(id, salary)，其中id不重复，返回一个表只有一个变量getNthHighestSalary。（提示：用窗口函数）

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  RETURN (
      # Write your MySQL query statement below.
      select distinct(salary) getNthHighestSalary from (
      select salary, dense_rank() over(order by salary desc) 'rank'
          from Employee
      ) a
      where a.rank=N
  );
END

用窗口函数生成的表作为源表，从中筛选出满足特定条件的值。源表中只保留最终要使用的变量，即salary可节省runtime。

18 (leetcode 178. rank score, medium)关系如上employ(id, salary)，每个id不同，对salary排序，如果有并列则并列之后的序号是当前序号加一。返回关系含有salary和rank两个字段。

select salary, DENSE_RANK() OVER(ORDER BY salary DESC) AS 'rank'
FROM employ;

这里需要注意的是使用DENSE_RANK指令可以使得排序没有跳号。更值得注意的是返回字段rank是sql的关键字，所以在重命名时需要加引号。

19 (leetcode 180. consecutive numbers, medium)有一个关系log(id, num)，写一个SQL查询找到num字段中连续出现最少3次的数字，返回关系的字段是ConsecutiveNums。
(placeholder)
20 (leetcode 191. Employees Earning More Than Their Managers) 有关系Employee(id, salary, name, managerID)，所有人的id在这个表里。提示：用join。

select a.name Employee
from Employee a
inner join Employee b
ON a.managerID = b.id
WHERE a.salary > b.salary;

21 (182. Duplicate Emails, easy) 有关系Person(id, email)，id独一无二，email有的有重复，找出有重复的email

SELECT email from Person
group by email 
having count(*) > 1;

(2022.05.28 Sat)

22 (197. Rising Temperature, easy) 有关系Weather(id, recordDate, temperature)，其中的recordDate是连续日期，找出所有当天气温比前一天气温高的id。

select a.id from Weather a, Weather b
WHERE timestampdiff(day, a.recordDate, b.recordDate) = -1 
AND a.temperature > b.temperature;

这里用到的MySQL中的timestampdiff函数，传入的第一个参数是时间维度，可选year/month/day等，第二三个参数是日期。这里第二个参数的日期比第三个参数的滞后一天，则返回-1，提前则为正。

23 (183. Customers Who Never Order, easy) 有两个关系Customers(id, name)和Orders(id, customerID)，其中的Customers.id和Orders.customerID等价。Orders关系中记录了售出商品的id和购买者id。写一个查询找出没有购买商品的用户名。

SELECT a.name from Customers a
WHERE a.id not in (
SELECT b.id FROM Customers b 
INNER JOIN Orders c
ON b.id = c.customerID
);

这里的查询条件是customers.id而非名字，因为可能有重名的情况，这点需要注意。

24 (185. Department Top Three Salaries, hard) 有两个关系Employee(id, salary, name, departmentID)和Department(id, name)分别记录了职工的基本信息和部门信息，要求返回一个关系，记录了每个部门工资最高数额前三位的所有员工，其中包含(Department, Name, Salary)。每个数额上可能有不止一位员工。

SELECT b.name 'Department', a. name Name, a.salary Salary
FROM (select name, id, departmentID, 
DENSE_RANK() over(partition by departmentID order by salary desc) 'rank'
from Employee
) a
INNER JOIN Department b
ON a.departmentID = b.id
WHERE a.rank <= 3;

先分组在求top问题需要用到窗口函数，根据题目中的提示，可知排序用的函数是dense_rank而非row_number和rank。

25 (196. Delete Duplicate Emails, easy) 有关系Person(id, email)，其中相同email对应了不同的id，清除冗余的id信息，只保留最小的id。

# Mysql
DELETE p1 FROM Person p1, Person p2
WHERE p1.email = p2.email AND p1.id > p2.id;
# Oracle
DELETE FROM Person 
WHERE id in (
SELECT a.id id FROM Person a, Person b
WHERE a.email = b.email AND a.id > b.id
);

26 (184. Department Highest Salary, medium) 有关系Employee(id, name, salary, departmentID)和Department(id, name)，要求找出各部门最高工资的同事，可并列，(Department, Employee, Salary)

SELECT b.name 'Department', a.name 'Employee', a.salary 'Salary'
FROM ( SELECT id, name, salary, departmentID, 
rank() over(partition by departmentID order by salary desc) 'ranking' 
FROM Employee
) a
INNER JOIN Department b
ON a.departmentID = b.id
WHERE a.ranking = 1;

27 (608. Tree Node, medium) 有关系Tree(id, p_id)，保存的是树中的节点id和其parent id，如果p_id为Null则该点为Root，如果该节点是其他节点的parent，则该点为Inner，否则该节点类型为Leaf。返回节点类型(id, type)。

SELECT id, 
(CASE WHEN p_id is NUL THEN 'Root'
      WHEN id in (select p_id from Tree) THEN 'Inner'
      ELSE 'Leaf'
 END) 'type'
FROM Tree

28 (1158. Market Analysis I, medium) 有三个关系记录用户和订单信息，Users(user_id, join_date, favourite_brand)，Orders(order_id, order_date, item_id, buyer_id, seller_id)和Items(item_id, item_brand)，返回用户id和加入日期以及在2019年的下单次数，(buyer_id, join_date, orders_in_2019)

SELECT user_id 'buyer_id', join_date, count(*) orders_in_2019
FROM Users a
LEFT JOIN ( SELECT item_id, buyer_id From Orders 
WHERE order_date between '2019-01-01' and '2019-12-31'
) b
ON a.user_id = b.buyer_id
GROUP BY a.user_id;

本题过多无关信息，注意分辨。

SQL面试题, since 2022-03-29

你可能感兴趣的:(SQL面试题, since 2022-03-29)