回城传送–》《32天SQL筑基》
今天是学习 SQL 打卡的第 5 天,每天我会提供一篇文章供群成员阅读( 不需要订阅付钱 )。
希望大家先自己思考,如果实在没有想法,再看下面的解题思路,自己再实现一遍。在小虚竹JAVA社区 中对应的 【打卡贴】打卡,今天的任务就算完成了,养成每天学习打卡的好习惯。
虚竹哥会组织大家一起学习同一篇文章,所以有什么问题都可以在群里问,群里的小伙伴可以迅速地帮到你,一个人可以走得很快,一群人可以走得很远,有一起学习交流的战友,是多么幸运的事情。
我的学习策略很简单,题海策略+ 费曼学习法。如果能把这些题都认认真真自己实现一遍,那意味着 SQL 已经筑基成功了。后面的进阶学习,可以继续跟着我,一起走向架构师之路。
今天的学习内容是:必会的常用函数
题目链接 | 难度 |
---|---|
条件函数:SQL26 计算25岁以上和以下的用户数量 | ★★★☆☆ |
日期函数:SQL28 计算用户8月每天的练题数量 | ★★★☆☆ |
日期函数:SQL29 计算用户的平均次日留存率 | ★★★★★ |
文本函数:SQL32 截取出年龄 | ★★★☆☆ |
窗口函数:SQL33 找出每个学校GPA最低的同学 | ★★★☆☆ |
drop table if exists `user_profile`;
drop table if exists `question_practice_detail`;
CREATE TABLE `user_profile` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`gender` varchar(14) NOT NULL,
`age` int ,
`university` varchar(32) NOT NULL,
`gpa` float,
`active_days_within_30` int ,
`question_cnt` int ,
`answer_cnt` int
);
CREATE TABLE `question_practice_detail` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`question_id`int NOT NULL,
`result` varchar(32) NOT NULL
);
CREATE TABLE `question_detail` (
`id` int NOT NULL,
`question_id`int NOT NULL,
`difficult_level` varchar(32) NOT NULL
);
INSERT INTO user_profile VALUES(1,2138,'male',21,'北京大学',3.4,7,2,12);
INSERT INTO user_profile VALUES(2,3214,'male',null,'复旦大学',4.0,15,5,25);
INSERT INTO user_profile VALUES(3,6543,'female',20,'北京大学',3.2,12,3,30);
INSERT INTO user_profile VALUES(4,2315,'female',23,'浙江大学',3.6,5,1,2);
INSERT INTO user_profile VALUES(5,5432,'male',25,'山东大学',3.8,20,15,70);
INSERT INTO user_profile VALUES(6,2131,'male',28,'山东大学',3.3,15,7,13);
INSERT INTO user_profile VALUES(7,4321,'male',28,'复旦大学',3.6,9,6,52);
INSERT INTO question_practice_detail VALUES(1,2138,111,'wrong');
INSERT INTO question_practice_detail VALUES(2,3214,112,'wrong');
INSERT INTO question_practice_detail VALUES(3,3214,113,'wrong');
INSERT INTO question_practice_detail VALUES(4,6543,111,'right');
INSERT INTO question_practice_detail VALUES(5,2315,115,'right');
INSERT INTO question_practice_detail VALUES(6,2315,116,'right');
INSERT INTO question_practice_detail VALUES(7,2315,117,'wrong');
INSERT INTO question_practice_detail VALUES(8,5432,117,'wrong');
INSERT INTO question_practice_detail VALUES(9,5432,112,'wrong');
INSERT INTO question_practice_detail VALUES(10,2131,113,'right');
INSERT INTO question_practice_detail VALUES(11,5432,113,'wrong');
INSERT INTO question_practice_detail VALUES(12,2315,115,'right');
INSERT INTO question_practice_detail VALUES(13,2315,116,'right');
INSERT INTO question_practice_detail VALUES(14,2315,117,'wrong');
INSERT INTO question_practice_detail VALUES(15,5432,117,'wrong');
INSERT INTO question_practice_detail VALUES(16,5432,112,'wrong');
INSERT INTO question_practice_detail VALUES(17,2131,113,'right');
INSERT INTO question_practice_detail VALUES(18,5432,113,'wrong');
INSERT INTO question_practice_detail VALUES(19,2315,117,'wrong');
INSERT INTO question_practice_detail VALUES(20,5432,117,'wrong');
INSERT INTO question_practice_detail VALUES(21,5432,112,'wrong');
INSERT INTO question_practice_detail VALUES(22,2131,113,'right');
INSERT INTO question_practice_detail VALUES(23,5432,113,'wrong');
INSERT INTO question_detail VALUES(1,111,'hard');
INSERT INTO question_detail VALUES(2,112,'medium');
INSERT INTO question_detail VALUES(3,113,'easy');
INSERT INTO question_detail VALUES(4,115,'easy');
INSERT INTO question_detail VALUES(5,116,'medium');
INSERT INTO question_detail VALUES(6,117,'easy');
要求统计:
分析:
SELECT
case
when age < 25 then ‘25岁以下’
when age >= 25 then ‘25岁及以上’
end age_cut
FROM
user_profile
group by
age_cut
SELECT
case
when age < 25 then ‘25岁以下’
when age >= 25 then ‘25岁及以上’
end age_cut,
count(1) as number
FROM
user_profile
group by
age_cut
SELECT
case
when age < 25
or age is null then '25岁以下'
when age >= 25 then '25岁及以上'
end age_cut,
count(1) as number
FROM
user_profile
group by
age_cut
drop table if exists `user_profile`;
drop table if exists `question_practice_detail`;
drop table if exists `question_detail`;
CREATE TABLE `user_profile` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`gender` varchar(14) NOT NULL,
`age` int ,
`university` varchar(32) NOT NULL,
`gpa` float,
`active_days_within_30` int ,
`question_cnt` int ,
`answer_cnt` int
);
CREATE TABLE `question_practice_detail` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`question_id`int NOT NULL,
`result` varchar(32) NOT NULL,
`date` date NOT NULL
);
CREATE TABLE `question_detail` (
`id` int NOT NULL,
`question_id`int NOT NULL,
`difficult_level` varchar(32) NOT NULL
);
INSERT INTO user_profile VALUES(1,2138,'male',21,'北京大学',3.4,7,2,12);
INSERT INTO user_profile VALUES(2,3214,'male',null,'复旦大学',4.0,15,5,25);
INSERT INTO user_profile VALUES(3,6543,'female',20,'北京大学',3.2,12,3,30);
INSERT INTO user_profile VALUES(4,2315,'female',23,'浙江大学',3.6,5,1,2);
INSERT INTO user_profile VALUES(5,5432,'male',25,'山东大学',3.8,20,15,70);
INSERT INTO user_profile VALUES(6,2131,'male',28,'山东大学',3.3,15,7,13);
INSERT INTO user_profile VALUES(7,4321,'male',28,'复旦大学',3.6,9,6,52);
INSERT INTO question_practice_detail VALUES(1,2138,111,'wrong','2021-05-03');
INSERT INTO question_practice_detail VALUES(2,3214,112,'wrong','2021-05-09');
INSERT INTO question_practice_detail VALUES(3,3214,113,'wrong','2021-06-15');
INSERT INTO question_practice_detail VALUES(4,6543,111,'right','2021-08-13');
INSERT INTO question_practice_detail VALUES(5,2315,115,'right','2021-08-13');
INSERT INTO question_practice_detail VALUES(6,2315,116,'right','2021-08-14');
INSERT INTO question_practice_detail VALUES(7,2315,117,'wrong','2021-08-15');
INSERT INTO question_practice_detail VALUES(8,3214,112,'wrong','2021-05-09');
INSERT INTO question_practice_detail VALUES(9,3214,113,'wrong','2021-08-15');
INSERT INTO question_practice_detail VALUES(10,6543,111,'right','2021-08-13');
INSERT INTO question_practice_detail VALUES(11,2315,115,'right','2021-08-13');
INSERT INTO question_practice_detail VALUES(12,2315,116,'right','2021-08-14');
INSERT INTO question_practice_detail VALUES(13,2315,117,'wrong','2021-08-15');
INSERT INTO question_practice_detail VALUES(14,3214,112,'wrong','2021-08-16');
INSERT INTO question_practice_detail VALUES(15,3214,113,'wrong','2021-08-18');
INSERT INTO question_practice_detail VALUES(16,6543,111,'right','2021-08-13');
INSERT INTO question_detail VALUES(1,111,'hard');
INSERT INTO question_detail VALUES(2,112,'medium');
INSERT INTO question_detail VALUES(3,113,'easy');
INSERT INTO question_detail VALUES(4,115,'easy');
INSERT INTO question_detail VALUES(5,116,'medium');
INSERT INTO question_detail VALUES(6,117,'easy');
要求统计:
分析:
SELECT DAY(date) AS day
FROM question_practice_detail
WHERE date LIKE ‘2021-08-%’
SELECT DAY(date) AS day , COUNT(date) AS question_cnt
FROM question_practice_detail
WHERE date LIKE '2021-08-%'
GROUP BY date
只要日期大于等于2021-8-1 且小于等于2021-8-31 ,那查询出来的条件也是符合的,使用日期函数date() ,把字符串格式的日期进行转化。
select
day(date) as day,
count(date) as question_cnt
from
question_practice_detail
where
date('2021-8-31') >= date
and date('2021-8-1') <= date
group by
date;
只要年份是2021 ,月份是8 ,也符合条件:这时可使用函数 YEAR() 获取年份,使用函数MONTH() 获取月份。
SELECT
DAY(date) AS day,
COUNT(date) AS question_cnt
FROM
question_practice_detail
WHERE
YEAR(date) = 2021
AND MONTH(date) = 08
GROUP BY
day
drop table if exists `user_profile`;
drop table if exists `question_practice_detail`;
drop table if exists `question_detail`;
CREATE TABLE `user_profile` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`gender` varchar(14) NOT NULL,
`age` int ,
`university` varchar(32) NOT NULL,
`gpa` float,
`active_days_within_30` int ,
`question_cnt` int ,
`answer_cnt` int
);
CREATE TABLE `question_practice_detail` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`question_id`int NOT NULL,
`result` varchar(32) NOT NULL,
`date` date NOT NULL
);
CREATE TABLE `question_detail` (
`id` int NOT NULL,
`question_id`int NOT NULL,
`difficult_level` varchar(32) NOT NULL
);
INSERT INTO user_profile VALUES(1,2138,'male',21,'北京大学',3.4,7,2,12);
INSERT INTO user_profile VALUES(2,3214,'male',null,'复旦大学',4.0,15,5,25);
INSERT INTO user_profile VALUES(3,6543,'female',20,'北京大学',3.2,12,3,30);
INSERT INTO user_profile VALUES(4,2315,'female',23,'浙江大学',3.6,5,1,2);
INSERT INTO user_profile VALUES(5,5432,'male',25,'山东大学',3.8,20,15,70);
INSERT INTO user_profile VALUES(6,2131,'male',28,'山东大学',3.3,15,7,13);
INSERT INTO user_profile VALUES(7,4321,'male',28,'复旦大学',3.6,9,6,52);
INSERT INTO question_practice_detail VALUES(1,2138,111,'wrong','2021-05-03');
INSERT INTO question_practice_detail VALUES(2,3214,112,'wrong','2021-05-09');
INSERT INTO question_practice_detail VALUES(3,3214,113,'wrong','2021-06-15');
INSERT INTO question_practice_detail VALUES(4,6543,111,'right','2021-08-13');
INSERT INTO question_practice_detail VALUES(5,2315,115,'right','2021-08-13');
INSERT INTO question_practice_detail VALUES(6,2315,116,'right','2021-08-14');
INSERT INTO question_practice_detail VALUES(7,2315,117,'wrong','2021-08-15');
INSERT INTO question_practice_detail VALUES(8,3214,112,'wrong','2021-05-09');
INSERT INTO question_practice_detail VALUES(9,3214,113,'wrong','2021-08-15');
INSERT INTO question_practice_detail VALUES(10,6543,111,'right','2021-08-13');
INSERT INTO question_practice_detail VALUES(11,2315,115,'right','2021-08-13');
INSERT INTO question_practice_detail VALUES(12,2315,116,'right','2021-08-14');
INSERT INTO question_practice_detail VALUES(13,2315,117,'wrong','2021-08-15');
INSERT INTO question_practice_detail VALUES(14,3214,112,'wrong','2021-08-16');
INSERT INTO question_practice_detail VALUES(15,3214,113,'wrong','2021-08-18');
INSERT INTO question_practice_detail VALUES(16,6543,111,'right','2021-08-13');
INSERT INTO question_detail VALUES(1,111,'hard');
INSERT INTO question_detail VALUES(2,112,'medium');
INSERT INTO question_detail VALUES(3,113,'easy');
INSERT INTO question_detail VALUES(4,115,'easy');
INSERT INTO question_detail VALUES(5,116,'medium');
INSERT INTO question_detail VALUES(6,117,'easy');
要求统计:
分析:
SELECT DATEDIFF(day,‘2008-12-29’,‘2008-12-30’) AS DiffDate
//结果DiffDate 值为1
SELECT DATEDIFF(day,‘2008-12-30’,‘2008-12-29’) AS DiffDate
//结果DiffDate 值为-1
SELECT
qpd1.*
FROM
question_practice_detail qpd1
LEFT JOIN question_practice_detail qpd2 ON qpd1.device_id = qpd2.device_id
and datediff(qpd1.date, qpd2.date) = 1
COUNT(distinct qpd2.device_id, qpd2.date) / COUNT(distinct qpd1.device_id, qpd1.date) as avg_ret
SELECT
round(COUNT(distinct qpd2.device_id, qpd2.date) / COUNT(distinct qpd1.device_id, qpd1.date),4) as avg_ret
FROM
question_practice_detail qpd1
LEFT JOIN question_practice_detail qpd2 ON qpd1.device_id = qpd2.device_id
and datediff(qpd1.date, qpd2.date) = 1
drop table if exists user_submit;
CREATE TABLE `user_submit` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`profile` varchar(100) NOT NULL,
`blog_url` varchar(100) NOT NULL
);
INSERT INTO user_submit VALUES(1,2138,'180cm,75kg,27,male','http:/url/bisdgboy777');
INSERT INTO user_submit VALUES(1,3214,'165cm,45kg,26,female','http:/url/dkittycc');
INSERT INTO user_submit VALUES(1,6543,'178cm,65kg,25,male','http:/url/tigaer');
INSERT INTO user_submit VALUES(1,4321,'171cm,55kg,23,female','http:/url/uhsksd');
INSERT INTO user_submit VALUES(1,2131,'168cm,45kg,22,female','http:/url/sysdney');
要求统计:
分析:
substring_index(str,delim,count)
//说明
str:要处理的字符串
delm:分隔符
count:计数
如果count是正数,那么就是从左往右数
select substring_index(profile, ‘,’, 2)
from user_submit
如果count是负数,那么就是从右往左数
结果:
SELECT
substring_index(substring_index(profile, ',', -2), ',', 1) as age,
count(device_id) as number
from
user_submit
group by
age;
drop table if exists user_profile;
CREATE TABLE `user_profile` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`gender` varchar(14) NOT NULL,
`age` int ,
`university` varchar(32) NOT NULL,
`gpa` float,
`active_days_within_30` int ,
`question_cnt` int ,
`answer_cnt` int
);
INSERT INTO user_profile VALUES(1,2138,'male',21,'北京大学',3.4,7,2,12);
INSERT INTO user_profile VALUES(2,3214,'male',null,'复旦大学',4.0,15,5,25);
INSERT INTO user_profile VALUES(3,6543,'female',20,'北京大学',3.2,12,3,30);
INSERT INTO user_profile VALUES(4,2315,'female',23,'浙江大学',3.6,5,1,2);
INSERT INTO user_profile VALUES(5,5432,'male',25,'山东大学',3.8,20,15,70);
INSERT INTO user_profile VALUES(6,2131,'male',28,'山东大学',3.3,15,7,13);
INSERT INTO user_profile VALUES(7,4321,'male',28,'复旦大学',3.6,9,6,52);
要求统计:
先不用窗口函数实现
分析:
select university,min(gpa) min_gpa from user_profile group by university
select
up1.device_id,
up1.university,
up1.gpa
from
user_profile up1
inner join (
select
university,
min(gpa) min_gpa
from
user_profile
group by
university
) up2 On up1.university = up2.university
and up1.gpa = up2.min_gpa
select
up1.device_id,
up1.university,
up1.gpa
from
user_profile up1
inner join (
select
university,
min(gpa) min_gpa
from
user_profile
group by
university
) up2 On up1.university = up2.university
and up1.gpa = up2.min_gpa
order by up1.university asc
这种写法是能实现,但有漏洞,你发现了吗?有发现的话,评论区说下
分析:
说明:MySQL从8.0版本开始支持窗口函数
窗口函数的语法 :
<窗口函数> over (partition by <用于分组的列名>
order by <用于排序的列名>)
<窗口函数>的位置,可以放以下两种函数:
1) 专用窗口函数,包括后面要讲到的rank, dense_rank, row_number等专用窗口函数。
2) 聚合函数,如sum. avg, count, max, min等
如果有并列名次的行,会占用下一名次的位置。比如正常排名是1,2,3,4,但是现在前3名是并列的名次,结果是:1,1,1,4。
不考虑并列名次的情况。比如前3名是并列的名次,排名是正常的1,2,3,4。
比如正常排名是1,2,3,4,但是现在前3名是并列的名次,结果是:1,1,1,2。
SELECT
*,
ROW_NUMBER() over (
PARTITION BY university
ORDER BY
gpa
) AS rn
FROM
user_profile
SELECT
device_id,
university,
gpa
FROM
(
SELECT
*,
ROW_NUMBER() over (
PARTITION BY university
ORDER BY
gpa
) AS rn
FROM
user_profile
) AS temp
WHERE
temp.rn = 1
1)同时具有分组(partition by)和排序(order by)的功能
2)不会减少原表的行数
我是虚竹哥,我们明天见~