mysql案例学习总结

案例: 使用mysql 进行数据分析
老师指路->https://www.jianshu.com/u/1f32f227da5f
使用工具:MySQL、Navicat

一、准备好订单数据和用户数据:

a、打开命令窗口或者 navicat 、workbench 等数据库软件
b、任意选择一个数据库,或者创建一个新的数据库并进入数据库

create database data charset utf8;
use data;

c、执行以下source 命令,注意改成自己的路径并且不能包含中文

#将.sql文件拖到命令行source后 即可显示文件路径
source C:\Users\data\Desktop\Mysql_case\orderinfo.sql
source C:\Users\data\Desktop\Mysql_case\userinfo.sql

注意:后面不需要加分号,上面的命令会把 **.sql 文件的语句都执行一遍

desc orderinfo;--显示表结构

select *  from orderinfo limit 10; #显示10条记录

select  count(*) from orderinfo;#显示记录条数

二、题目要求:

1、统计不同月份的下单人数
2、统计用户三月份的回购率和复购率
3、统计男女用户消费频次是否有差异
4、统计多次消费的用户,第一次和最后一次消费间隔是多少天
5、统计不同年龄段,用户的消费金额是否有差异
6、统计消费的二八法则,消费的top20%用户,贡献了多少消费额

涉及表:
orderinfo 订单详情表
| orderid 订单id
| userid 用户id
| isPaid 是否支付
| price 付款价格
| paidTime 付款时间

userinfo 用户信息表
| userid 用户id
| sex 用户性别
| birth 用户出生日期

1、统计不同月份的下单人数

观察数据,过滤未支付数据,去重
不同月份下单人数

select year(paidTime)as year_ ,
month(paidTime) as month_ ,
count(distinct userid) as cons
from orderinfo 
where isPaid='已支付'
group by year(paidTime),month(paidTime);

2、统计用户三月份的回购率和复购率

三月份交易量

select count(orderid)as all_order from orderinfo  
where month(paidTime)=3;

三月份下单总人数

select 
count(distinct userid) as cons
from orderinfo 
where  month(paidTime)=3;

复购率:当月购买了多次用户占当月用户比例
回购率:上月购买用户中有多少用户本月又再次购买
(回购率:本月购买用户中有多少用户下个月又再次购买)
复购率:
a、先筛选出3月份消费情况

select 
*
from orderinfo
where isPaid='已支付'  and month(paidTime)=3;

b、统计每个用户在3月份消费了多少次

select 
userid, count(1) as cons
from orderinfo
where isPaid='已支付'  
and month(paidTime)=3

group by userid;

c、对购买次数进行判断,统计出消费多次的用户数(大于1次)
对用户购买次数进行判断,并sum求和,如果购买次数大于1,用户数加1
复购率:当月购买了多次用户占当月用户比例

select 

count(1) as userid_cons,
sum(if (cons>1,1,0)) as fugou_cons,
sum(if (cons>1,1,0))/count(1) as fugou_rate

from(select 
     userid,
     count(1) as cons
    from orderinfo
    where isPaid='已支付'  
    and month(paidTime)=3
    
    group by userid 
    ) a; 

回购率:本月购买用户中有多少用户下个月又再次购买
计算所有月份回购率
举例:三月份的回购率=3月用户中4月又再次购买的人数/3月用户总数
a、统计每年每月的用户消费情况

#分组:每个月 有哪些用户ID
select 
    userid,
    date_format(paidTime,'%Y-%m-01') as month_dt,
    count(1)  as cons

from orderinfo
where isPaid='已支付'
group by userid,date_format(paidTime,'%Y-%m-01');

b、相邻月份 进行关联,能关联上的用户为回购
自己与自己关联,用户对应,月份对应
此处使用left join ,保留用户总数;
inner join 会删掉关联不上的用户,三月用户总数会减少

select 
*
from(select 
    userid,
    date_format(paidTime,'%Y-%m-01') as month_dt,
    count(1)  as cons

from orderinfo
where isPaid='已支付'
group by userid,date_format(paidTime,'%Y-%m-01')
) a

left join (select 
    userid,
    date_format(paidTime,'%Y-%m-01') as month_dt,
    count(1)  as cons

from orderinfo
where isPaid='已支付'
group by userid,date_format(paidTime,'%Y-%m-01')


) b on a.userid=b.userid
and date_sub(b.month_dt,interval 1 month)=a.month_dt;

c、用统计出四月份有关联的 用户数据量 除以 三月份用户数据量
count(a.userid)>count(b.userid) 4月份有null情况

select 
    a.month_dt,
    count(a.userid),
    count(b.userid),
    count(b.userid)/ count(a.userid) as '回购率'

from(select 
    userid,
    date_format(paidTime,'%Y-%m-01') as month_dt,
    count(1)  as cons

from orderinfo
where isPaid='已支付'
group by userid,date_format(paidTime,'%Y-%m-01')

) a

left join (select 
    userid,
    date_format(paidTime,'%Y-%m-01') as month_dt,
    count(1)  as cons

from orderinfo
where isPaid='已支付'
group by userid,date_format(paidTime,'%Y-%m-01')


) b on a.userid=b.userid
and date_sub(b.month_dt,interval 1 month)=a.month_dt
group  by a.month_dt;

3、统计男女用户消费频次是否有差异

获取用户 性别 消费次数 男生用户消费总次数/男生人数
a、统计每个用户消费次数、带性别,筛选出性别不为空的用户

select   

a.userid,
sex,
count(1) as cons

from orderinfo a
inner join (select *  from userinfo where sex!='') b
on a.userid=b.userid
group  by a.userid,sex ;

b、对性别做一个消费次数平均计算

select 
    sex,
    avg(cons) as avg_cons
 
from(select   

a.userid,
sex,
count(1) as cons

from orderinfo a
inner join (select *  from userinfo where sex!='') b
on a.userid=b.userid
group  by a.userid,sex 
) c
group  by sex;

4、统计多次消费的用户,第一次和最后一次消费间隔是多少天

a、取出多次消费用户

select 
 userid
from orderinfo
 where isPaid='已支付'
 group by userid
 
 having count(1)>1;

b、取出第一次与第二次购物时间

计算时间差

select 
 
 userid,
 
 min(paidTime),
 max(paidTime),
 datediff(max(paidTime),min(paidTime)) as jiange

from orderinfo
where isPaid='已支付'
group by userid
 
having count(1)>1;

5、统计不同年龄段,用户的消费金额是否有差异

a、计算每个用户年龄,并对用户年龄进行分层
0-10:1 11-20:2 21-30:3 31-40 41-50 10岁为一层
ceil 向上取整

select 
userid,
birth,
now(),
ceil(timestampdiff(year,birth,now())/10) as age

from userinfo  
where  birth> '1900-01-01';

b、关联订单信息,获取不同年龄段的消费频次和消费数据

select 
  a.userid,
  age,
  count(1) as cons ,
  sum(price) as prices
from orderinfo a

inner join (
select 
userid,
birth,
now(),
ceil(timestampdiff(year,birth,now())/10) as age

from userinfo  
where  birth> '1900-01-01'

) b
on a.userid=b.userid
group by b.userid,b.age;

c、对不同年龄层 进行聚合 最终得到不同年龄层消费情况

select
age,
avg(cons),
avg(prices)

from (
select 
  a.userid,
  age,
  count(1) as cons ,
  sum(price) as prices

from orderinfo a

inner join (
select 
userid,
birth,
now(),
ceil(timestampdiff(year,birth,now())/10) as age

from userinfo  
where  birth> '1900-01-01'

) b

on a.userid=b.userid

group by b.userid,b.age
) c
 
group  by age ;

6、统计消费的二八法则,消费的top20%用户,贡献了多少消费额

用户消费占比,20%用户 消费贡献占80%

a、统计每个用户消费金额,并进行降序排序

select 
userid,
sum(price) as total_prices

from orderinfo a

where  isPaid='已支付'
group  by userid 
order by total_prices desc ;

b、统计一共有多少用户,以及用户总消费金额

select

count(1) as cons,
sum(total_prices) as all_prices

from ( select 
userid,
sum(price) as total_prices

from orderinfo a

where  isPaid='已支付'
group  by userid  ) b;

c、取出消费前20%用户,进行金额统计
8万用户去20%-17000名

select 

count(1) as cons,
sum(total_prices) as all_prices

from (
select 
userid,
sum(price) as total_prices

from orderinfo a

where  isPaid='已支付'
group  by userid  

order by total_prices desc
limit 17000) b;

select (2.7/3.1);

你可能感兴趣的:(mysql案例学习总结)