2019拼多多学霸批第一批笔试题:
问题一:
(考点:活动运营数据分析)
表1——订单表orders,大概字段有(user_id‘用户编号’, order_pay‘订单金额’ , order_time‘下单时间’)。
表2——活动报名表act_apply,大概字段有(act_id‘活动编号’, user_id‘报名用户’,act_time‘报名时间’)
要求:
1.select a.act_id,sum(a.order_pay),count(o.user_id)
from orders o inner join act_apply a on o.user_id=a.user_id and o.order_time>=a.act_time
group by a.act_id
2.select t1.act_id,count(*)/datediff(satrt_time,getdate())
from orders o left join
(select *,min(act_time) over (partition by act_id) as start_time from act_apply) t1
on o.user_id=t1.user_id
where o.order_time>=t1.act_time
group by t1.act_id
(考点:用户行为路径分析)
表1——用户行为表tracking_log,大概字段有(user_id‘用户编号’,opr_id‘操作编号’,log_time‘操作时间’)
要求:
1.先统计每个用户每天的总的操作次数,再按日期分组计算每天的访客数和平均操作次数。
select t1.date,count(distinct t1.user_id) as '访客数',avg(op) as '平均操作次数'
(select user_id,count(opr_id) as 'op', convert(log_time,date) as 'date' from tracking_log
group by user_id,date) t1
group by t1.date
2.select t1.date,count(*) as ‘A-B路径用户计数’ from
(select distinct user_id, convert(log_time,date) as 'date', lead(opr_id,1) over(partition by user_id,convert(log_time,date) order by log_time) as '2nd',opr_id as '1st from tracking_log) t1
where t1.1st='A' and t1.2nd='B'
group by t1.date
问题三:
(考点:用户新增留存分析)
表1——用户登陆表user_log,大概字段有(user_id‘用户编号’,log_time‘登陆时间’)
要求:
1.每天新增用户数,以及他们第2天、30天的回访比例
参考答案:
找出用户最早登陆的时间,聚合第一次登陆的数量
找出用户最早登陆的时间,聚合相距第一次登陆相隔1天登陆的数量
找出用户最早登陆的时间,聚合相距第一次登陆相隔30天登陆的数量
再三个表连接,计算回访比率。
select aa.first_time as '第一次登陆的时间',aa.new as'新增用户', concat(round(100*bb.2_back/a.new,2),'%')
as '第二天回访率',concat(round(100*cc.2_back/a.new,2),'%') as '第三十天回访率'
from
(select a.first_time,count(distinct a.user_id) as new from
(select user_id,convert(min(log_time) over(partition by user_id),date) as first_time from user_log) a
group by a.first_time) aa
left join
(select b.first_time,count(distinct b.user_id) as 2_back from
(select user_id,convert(min(log_time) over(partition by user_id),date) as first_time,logtime from user_log) b
where datediff(first_time,log_time)=1
group by b.first_time) bb
on aa.first_time=bb.first_time
left join
(select c.first_time,count(distinct c.user_id) as 30_back from
(select user_id,convert(min(log_time) over(partition by user_id),date) as first_time,logtime from user_log) c
where datediff(first_time,log_time)=1
group by c.first_time) cc
on aa.first_time=cc.first_time
2019年小红书数据分析笔试:
100家店铺,purchase表中有销售记录,统计5月和6月,总gmv中,两个月分别的贡献前50%gmv的店铺名,purchase表的字段有:id、dt、seller_id、sell_name、item_id、gmv
计算出每个月每个店铺的gmv,然后使用开窗函数得到贡献排名前50%的店铺。
select b.m,b.sell_name from
(select a.m,a.sell_name ,
sum(a.total_gmv) over(partition by a.m) as sum1,
sum(a.total_gmv) over(partition by a.m order by a.total_gmv desc) sum2
from
(select month(dt) as m,sell_name,sum(gmv) as total_gmv from purchase
where month(dt) in (5,6) group by month(dt),sell_name
order by m,total_gmv desc) a
) b
where b.sum2/b.sum1<=0.5