力扣题目资源来自“熊大的数据分析之路”,写下这系列练习记录是为了自己实践一遍,非商业用途。
1. 游戏玩法分析 I
活动表 Activity:
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
表的主键是 (player_id, event_date)。
这张表展示了一些游戏玩家在游戏平台上的行为活动。
每行数据记录了一名玩家在退出平台之前,当天使用同一台设备登录平台后打开的游戏的数目(可能是 0 个)。
写一条 SQL 查询语句获取 每位玩家 第一次登陆平台的日期。
查询结果的格式如下所示:
Activity 表:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+
Result 表:
+-----------+-------------+
| player_id | first_login |
+-----------+-------------+
| 1 | 2016-03-01 |
| 2 | 2017-06-25 |
| 3 | 2016-03-02 |
+-----------+-------------+
解:每一位玩家(用group by)第一次登陆平台的日期 (求日期的最小值)
select player_id,min(event_date) as first_login
from activity
group by player_id;
2.游戏玩法分析 II
Table: Activity
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id, event_date) 是这个表的两个主键
这个表显示的是某些游戏玩家的游戏活动情况
每一行是在某天使用某个设备登出之前登录并玩多个游戏(可能为0)的玩家的记录
请编写一个 SQL 查询,描述每一个玩家首次登陆的设备名称
查询结果格式在以下示例中:
Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+
Result table:
+-----------+-----------+
| player_id | device_id |
+-----------+-----------+
| 1 | 2 |
| 2 | 3 |
| 3 | 1 |
+-----------+-----------+
解:每一个玩家首次登陆的设备名称
select player_id,device_id
from activity
where(player_id,event_date) in
(select player_id, min(event_date) as first_login
from activity
group by player_id)
3. 游戏玩法分析 III
Table: Activity
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id,event_date)是此表的主键。
这张表显示了某些游戏的玩家的活动情况。
每一行是一个玩家的记录,他在某一天使用某个设备注销之前登录并玩了很多游戏(可能是 0 )。
编写一个 SQL 查询,同时报告每组玩家和日期,以及玩家到目前为止玩了多少游戏。也就是说,在此日期之前玩家所玩的游戏总数。详细情况请查看示例。
查询结果格式如下所示:
Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 1 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+
Result table:
+-----------+------------+---------------------+
| player_id | event_date | games_played_so_far |
+-----------+------------+---------------------+
| 1 | 2016-03-01 | 5 |
| 1 | 2016-05-02 | 11 |
| 1 | 2017-06-25 | 12 |
| 3 | 2016-03-02 | 0 |
| 3 | 2018-07-03 | 5 |
+-----------+------------+---------------------+
对于 ID 为 1 的玩家,2016-05-02 共玩了 5+6=11 个游戏,2017-06-25 共玩了 5+6+1=12 个游戏。
对于 ID 为 3 的玩家,2018-07-03 共玩了 0+5=5 个游戏。
请注意,对于每个玩家,我们只关心玩家的登录日期。
解:①用两表连接的方法可以求出来
select a1.player_id,a1.event_date,sum(a2.games_played) as games_played_so_far
from activity a1,activity a2
where a1.player_id = a2.player_id
and a1.event_date >= a2.event_date
group by a1.player_id,a1.event_date
②学到这个oracle窗口函数,以玩家分组按时间排序:
select player_id ,event_date,
sum(games_played) over(partition by player_id orderby event_date ) as games_played_so_far
from activity;
4.游戏玩法分析IV
Table: Activity
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id,event_date)是此表的主键。
这张表显示了某些游戏的玩家的活动情况。
每一行是一个玩家的记录,他在某一天使用某个设备注销之前登录并玩了很多游戏(可能是 0)。
编写一个 SQL 查询,报告在首次登录的第二天再次登录的玩家的分数,四舍五入到小数点后两位。换句话说,您需要计算从首次登录日期开始至少连续两天登录的玩家的数量,然后除以玩家总数。
查询结果格式如下所示:
Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-03-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+
Result table:
+-----------+
| fraction |
+-----------+
| 0.33 |
+-----------+
只有 ID 为 1 的玩家在第一天登录后才重新登录,所以答案是 1/3 = 0.33
解:
# 找到首次登录的日期selectplayer_id,min(event_date)asfirst_datefromActivitygroupbyplayer_id
# 将首次登录的日期作为临时表select*fromActivitya,(selectplayer_id,min(event_date)asfirst_datefromActivitygroupbyplayer_id)bonb.player_id=a.player_id
# 计算日期差,使用datediff函数selectdatediff(a.event_date,b.first_date),(selectcount(distinct(player_id))fromActivity)fromActivitya,(selectplayer_id,min(event_date)asfirst_datefromActivitygroupbyplayer_id)bonb.player_id=a.player_id
# 过滤出符合的玩家selectsum(casewhendatediff(a.event_date,b.first_date)=1then1else0end),(selectcount(distinct(player_id))fromActivity)fromActivitya,(selectplayer_id,min(event_date)asfirst_datefromActivitygroupbyplayer_id)bonb.player_id=a.player_id
# 使用公式,保留两位小数selectround(sum(casewhendatediff(a.event_date,b.first_date)=1then1else0end)/(selectcount(distinct(player_id))fromActivity),2)fromActivitya,(selectplayer_id,min(event_date)asfirst_datefromActivitygroupbyplayer_id)bonb.player_id=a.player_id
# 最终selectround(sum(casewhendatediff(a.event_date,b.first_date)=1then1else0end)/(selectcount(distinct(player_id))fromActivity),2)asfractionfromActivitya,(selectplayer_id,min(event_date)asfirst_datefromActivitygroupbyplayer_id)bwhereb.player_id=a.player_id
5. 员工薪水中位数
Employee表包含所有员工。Employee表有三列:员工Id,公司名和薪水。
+-----+------------+--------+
|Id | Company | Salary |
+-----+------------+--------+
|1 | A | 2341 |
|2 | A | 341 |
|3 | A | 15 |
|4 | A | 15314 |
|5 | A | 451 |
|6 | A | 513 |
|7 | B | 15 |
|8 | B | 13 |
|9 | B | 1154 |
|10 | B | 1345 |
|11 | B | 1221 |
|12 | B | 234 |
|13 | C | 2345 |
|14 | C | 2645 |
|15 | C | 2645 |
|16 | C | 2652 |
|17 | C | 65 |
+-----+------------+--------+
请编写SQL查询来查找每个公司的薪水中位数。挑战点:你是否可以在不使用任何内置的SQL函数的情况下解决此问题。
+-----+------------+--------+
|Id | Company | Salary |
+-----+------------+--------+
|5 | A | 451 |
|6 | A | 513 |
|12 | B | 234 |
|9 | B | 1154 |
|14 | C | 2645 |
+-----+------------+--------+
解:中位数的定义:中位数是按顺序排列的一组数据中居于中间位置的数,即在这组数据中,有一半的数据比他大,有一半的数据比他小。
奇数的时候,只有一个结果;偶数的时候就会有两个中位数。
需要先对每个公司的员工工资进行排序并增加一列序号,然后计算每个公司人数,把两部分join起来,取每个公司序号为该公司(总数+1)/2的数据
学习下,还有点不是特别明白
select a.Id, a.Company, a.Salary
from(
select a.*,
@num:=if(@company=Company, @num+1, 1) as num,
@company:=Company
from Employee as a, (select @num:=0, @company:='') as b
order by Company, Salary desc
) as a
join (
select Company, count(*) as count
from Employee
group by Company
) as b
on a.Company=b.Company and a.num=(b.count+1) div 2
6. 至少有5名直接下属的经理
Employee表包含所有员工和他们的经理。每个员工都有一个 Id,并且还有一列是经理的 Id。
+------+----------+-----------+----------+
|Id |Name |Department |ManagerId |
+------+----------+-----------+----------+
|101 |John |A |null |
|102 |Dan |A |101 |
|103 |James |A |101 |
|104 |Amy |A |101 |
|105 |Anne |A |101 |
|106 |Ron |B |101 |
+------+----------+-----------+----------+
给定Employee表,请编写一个SQL查询来查找至少有5名直接下属的经理。对于上表,您的SQL查询应该返回:
+-------+
| Name |
+-------+
| John |
+-------+
注意:
没有人是自己的下属。
select a.name"Name"
from employee a,employee b
where a.id = b.managerid
group by a.name
having count(*)>=5;
7. 当选者
表:Candidate
+-----+---------+
| id | Name |
+-----+---------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
| 5 | E |
+-----+---------+
表:Vote
+-----+--------------+
| id | CandidateId |
+-----+--------------+
| 1 | 2 |
| 2 | 4 |
| 3 | 3 |
| 4 | 2 |
| 5 | 5 |
+-----+--------------+
id 是自动递增的主键,
CandidateId 是 Candidate 表中的 id.
请编写 sql 语句来找到当选者的名字,上面的例子将返回当选者B.
+------+
| Name |
+------+
| B |
+------+
注意:
你可以假设没有平局,换言之,最多只有一位当选者。
selectdistinctname
fromcandidate c,vote v
wherec.id = v.candidateidandv.candidateid = (selectcandidateid
from(selectcandidateid,dense_rank()over(orderbycndesc) rn
from(selectcandidateid,count(*) cn
fromvote
groupbycandidateid))
wherern=1);
或者:
selectNamefromCandidatewhereid=(selectCandidateIdfromVotegroupbyCandidateIdhavingcount(CandidateId)=(selectmax(total_count)from(selectcount(CandidateId)astotal_countfromVotegroupbyCandidateId) tmp))
577. 员工奖金
选出所有 bonus < 1000 的员工的 name 及其 bonus。
Employee表单
+-------+--------+-----------+--------+
| empId | name | supervisor| salary |
+-------+--------+-----------+--------+
| 1 | John | 3 | 1000 |
| 2 | Dan | 3 | 2000 |
| 3 | Brad | null | 4000 |
| 4 | Thomas | 3 | 4000 |
+-------+--------+-----------+--------+
empId 是这张表单的主关键字
Bonus表单
+-------+-------+
| empId | bonus |
+-------+-------+
| 2 | 500 |
| 4 | 2000 |
+-------+-------+
empId 是这张表单的主关键字
输出示例:
+-------+-------+
| name | bonus |
+-------+-------+
| John | null |
| Dan | 500 |
| Brad | null |
+-------+-------
select name,bonus
from employee e
left join bonus b
on e.empid=b.empid
where nvl(bonus,0)<1000;
NVL( string1, replace_with)
NVL(E1, E2)的功能为:如果E1为NULL,则函数返回E2,否则返回E1本身
578. 查询回答率最高的问题
从survey_log表中获得回答率最高的问题,survey_log表包含这些列:uid,action,question_id,answer_id,q_num,timestamp。
uid 表示用户 id;action 有以下几种值:"show","answer","skip";当 action 值为 "answer" 时 answer_id 非空,而 action 值为 "show" 或者 "skip" 时 answer_id 为空;q_num 表示当前会话中问题的编号。
请编写SQL查询来找到具有最高回答率的问题。
示例:
输入:
+------+-----------+--------------+------------+-----------+------------+
| uid | action | question_id | answer_id | q_num | timestamp |
+------+-----------+--------------+------------+-----------+------------+
| 5 | show | 285 | null | 1 | 123 |
| 5 | answer | 285 | 124124 | 1 | 124 |
| 5 | show | 369 | null | 2 | 125 |
| 5 | skip | 369 | null | 2 | 126 |
+------+-----------+--------------+------------+-----------+------------+
输出:
+-------------+
| survey_log |
+-------------+
| 285 |
+-------------+
解释:
参考一个我觉得最好的答案:
***求什么比率一般都是:
(sum(case when `action` like 'answer' then 1 else 0 end) / sum(case when `action` like 'show' then 1 else 0 end)) as rate
select question_id as survey_log
from(
select (sum(case when `action` like 'answer' then 1 else 0 end) / sum(case when `action` like 'show' then 1 else 0 end)) as rate,question_id
from survey_log
group by question_id
order by rate desc
limit 1
————————————————
版权声明:本文为CSDN博主「偲偲粑」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
原文链接:https://blog.csdn.net/weixin_43329319/java/article/details/96367844