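This post continues my notes on the advanced SQL problems. The previous post had already reached nearly ten thousand characters and was getting long, so the remaining problems go into a new post, which also saves republishing the old one every time.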
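This problem asks for statistics over multiple rows, so it needs a GROUP BY query.
First the month has to be extracted by formatting submit_time with date_format. Then, for each month, compute the average number of active days (the sum of every user's active days / the number of distinct users)
and the monthly active users (the number of users in that month whose submit_time is not null).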
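At first I used count(distinct date_format(submit_time,'%Y-%m')) to do the counting by month, but the numbers did not match.
After repeated attempts I found that distinct does not belong here: inside a month group the number of distinct month values is always 1, while what needs counting is the rows. count(submit_time) does skip NULL values, but note that it does not remove duplicates, which matters for the failing test case further down.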
select date_format(submit_time,'%Y%m') AS month,round(count(submit_time)/count(distinct(uid)),2) AS avg_active_days,count(distinct(uid)) AS mau
from exam_record
where year(submit_time)=2021 AND submit_time is not null
group by date_format(submit_time,'%Y%m')
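A small sketch of how COUNT treats NULLs and duplicates may help here. It uses a hypothetical scratch table t_demo that is not part of the problem's schema, and the values in the comments assume MySQL.

```sql
-- Hypothetical scratch table, only for illustrating COUNT semantics (MySQL).
CREATE TEMPORARY TABLE t_demo (uid INT, submit_time DATETIME);
INSERT INTO t_demo VALUES
  (1001, '2021-07-01 10:00:00'),
  (1001, '2021-07-01 15:00:00'),  -- same user, same day, second submission
  (1002, '2021-07-02 09:00:00'),
  (1002, NULL);                   -- unfinished attempt

SELECT COUNT(*)            AS all_rows,        -- 4: every row
       COUNT(submit_time)  AS non_null_rows,   -- 3: NULL skipped, duplicates kept
       COUNT(DISTINCT uid) AS distinct_users,  -- 2
       COUNT(DISTINCT DATE_FORMAT(submit_time, '%Y%m')) AS distinct_months  -- 1
FROM t_demo;
```

COUNT(submit_time) divided by COUNT(DISTINCT uid) therefore gives submissions per user, not active days per user, whenever someone submits more than once on the same day.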
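One test case still failed. On inspection, a user had taken two different papers on the same day, so that day was counted twice.
The count therefore has to be deduplicated on the combination of user and day: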
select date_format(submit_time,'%Y%m') AS month,
round((count(distinct uid,date_format(submit_time,'%Y%m%d')))/count(distinct uid),2)
AS avg_active_days,
count(distinct uid) AS mau
from exam_record
where year(submit_time)=2021
group by date_format(submit_time,'%Y%m')
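distinct also works without parentheses here, since it is a keyword rather than a function.
The round(...) expression nests quite a few parentheses, so take care that they match.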
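The effect of the combined deduplication can be checked on a few rows. The sketch below assumes a hypothetical scratch table t_active with made-up uid and exam_id values. MySQL accepts a list of expressions inside COUNT(DISTINCT ...) and counts the distinct combinations that contain no NULL.

```sql
-- Hypothetical scratch table (MySQL); names and values are illustrative only.
CREATE TEMPORARY TABLE t_active (uid INT, exam_id INT, submit_time DATETIME);
INSERT INTO t_active VALUES
  (1001, 9001, '2021-07-01 10:00:00'),
  (1001, 9002, '2021-07-01 15:00:00'),  -- second paper on the same day
  (1001, 9001, '2021-07-03 11:00:00'),
  (1002, 9001, '2021-07-02 09:00:00');

SELECT COUNT(submit_time)  AS submissions,  -- 4
       COUNT(DISTINCT uid, DATE_FORMAT(submit_time, '%Y%m%d')) AS user_days,  -- 3
       COUNT(DISTINCT uid) AS users         -- 2
FROM t_active;
-- avg_active_days = user_days / users = 3 / 2 = 1.50
-- submissions / users = 4 / 2 = 2.00 would overcount user 1001's first day.
```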
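This one is similar to the previous problem: count the total number of questions answered each month and the daily average. group by is certainly needed, but the third row of the expected output asks for the overall total.
The daily average needs the number of days in the month, which this function gives: DAY(LAST_DAY(yourColumnName))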
select date_format(submit_time,'%Y%m') AS submit_month
,count(date_format(submit_time,'%Y%m')) AS month_q_cnt
,round(count(date_format(submit_time,'%Y%m'))/DAY(LAST_DAY(submit_time)),3) AS avg_day_q_cnt
from practice_record
where year(submit_time) = 2021
group by date_format(submit_time,'%Y%m')
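This raises an error:
SQL_ERROR_INFO: "Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'practice_record.submit_time' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by"
After looking it up: in /DAY(LAST_DAY(submit_time)), the expression day(last_day(submit_time)) is evaluated per row, so within a group it is still a column of values, one per submit_time. Only after wrapping it in avg(), min(), or max() does it become a single value per group that can serve as the denominator.
With that change the output is correct.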
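The two pieces involved, the days-in-month expression and the aggregate wrapper, can be tried in isolation. This is a minimal sketch on a hypothetical scratch table t_practice that copies only the submit_time column; the values in the comments assume MySQL.

```sql
-- LAST_DAY returns the last date of the month; DAY extracts its day number.
SELECT DAY(LAST_DAY('2021-02-10')) AS days_in_feb,  -- 28
       DAY(LAST_DAY('2021-08-15')) AS days_in_aug;  -- 31

-- Hypothetical scratch table mirroring practice_record's submit_time column.
CREATE TEMPORARY TABLE t_practice (submit_time DATETIME);
INSERT INTO t_practice VALUES
  ('2021-08-01 10:00:00'), ('2021-08-02 10:00:00'), ('2021-09-01 10:00:00');

-- Every row of a month group yields the same day count, so MIN, MAX, or AVG
-- all collapse it to that one value and satisfy only_full_group_by.
SELECT DATE_FORMAT(submit_time, '%Y%m')  AS submit_month,
       COUNT(*)                          AS month_q_cnt,
       MAX(DAY(LAST_DAY(submit_time)))   AS days_in_month
FROM t_practice
GROUP BY DATE_FORMAT(submit_time, '%Y%m')
ORDER BY submit_month;
-- 202108 | 2 | 31
-- 202109 | 1 | 30
```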
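Next, the total has to be output as the last row.
Most solutions I saw use union or union all. Both stack the results of two queries vertically into one table; union removes duplicate rows, union all keeps them.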
select date_format(submit_time,'%Y%m') AS submit_month
,count(date_format(submit_time,'%Y%m')) AS month_q_cnt
,round(count(date_format(submit_time,'%Y%m'))/avg(DAY(LAST_DAY(submit_time))),3) AS avg_day_q_cnt
from practice_record
where year(submit_time) = 2021
group by date_format(submit_time,'%Y%m')
union all
select '2021汇总' as submit_month,
count(submit_time) as month_q_cnt,
round(count(submit_time)/max(31),3) as avg_day_q_cnt
from practice_record
where year(submit_time) = 2021
order by submit_month
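The 31 is wrapped in max() here to serve as the denominator, although a bare constant 31 should work as well, since max() of a constant is that constant. Using 30 instead is rejected. The total cannot simply be appended at the bottom of the grouped query; it has to be computed separately and attached with union all.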
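The difference between the two set operators, and the point about the constant divisor, can be seen with literal rows. This is only a sketch with made-up values, not the problem's data; the results in the comments assume MySQL.

```sql
-- UNION ALL keeps every row from both queries; UNION removes duplicate rows.
SELECT 1 AS n UNION ALL SELECT 1;  -- two rows: 1, 1
SELECT 1 AS n UNION     SELECT 1;  -- one row: 1

-- With no GROUP BY the whole input is a single group, and a literal divisor
-- is allowed next to COUNT without any aggregate wrapper.
SELECT ROUND(COUNT(*) / 31, 3) AS avg_per_day
FROM (SELECT 1 AS x UNION ALL SELECT 2) AS d;  -- 2 rows / 31 = 0.065
```

For the summary row union all is the natural choice: the monthly rows and the yearly row can never be identical, so the duplicate check that union performs would be wasted work.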
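This problem involves statistics over two tables, so the approach is an inner join followed by group by.
My first attempt put the condition directly inside count, but the "is null" and "is not null" versions returned the same number, so the condition seemed to have no effect. The reason is that a condition evaluates to 0 or 1, never NULL, so count counts every row either way. After searching I found it can be written with if instead:
when the if condition holds it yields 1 and the row is counted. That seemed to run only in MySQL, though, and the site reported an error.
So I switched to sum: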
select r.uid
,sum(if(r.submit_time is null , 1 ,0) ) incomplete_cnt
,sum(if(r.submit_time is not null, 1 ,0) ) complete_cnt
,group_concat(distinct concat(date_format(r.start_time,'%Y-%m-%d'),':',i.tag) order by start_time separator ';') detail
from exam_record r inner join examination_info i
on r.exam_id = i.exam_id
where year(r.start_time)=2021
group by r.uid
having incomplete_cnt>1 AND incomplete_cnt<5 AND complete_cnt>=1
order by incomplete_cnt desc
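The different ways of counting rows that match a condition can be compared side by side. The sketch below uses a hypothetical scratch table t_cnt; the COUNT(IF(...)) form is my own illustration and not necessarily the exact form tried above. Values in the comments assume MySQL.

```sql
-- Hypothetical scratch table (MySQL) for comparing conditional counts.
CREATE TEMPORARY TABLE t_cnt (uid INT, submit_time DATETIME);
INSERT INTO t_cnt VALUES
  (1001, '2021-07-01 10:00:00'),
  (1001, NULL),
  (1001, NULL);

SELECT COUNT(submit_time IS NULL)               AS count_of_condition,  -- 3: 0/1 is never NULL
       COUNT(IF(submit_time IS NULL, 1, NULL))  AS count_if_null,       -- 2: NULL branch is skipped
       SUM(IF(submit_time IS NULL, 1, 0))       AS sum_if,              -- 2
       SUM(CASE WHEN submit_time IS NULL THEN 1 ELSE 0 END) AS sum_case -- 2: standard SQL form
FROM t_cnt;
```

The CASE WHEN form is the one to reach for if the query has to run on a database without MySQL's IF() function.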
Later problems will be posted here.