MySQL之用户行为分析项目

一、数据要求

涉及表:

  orderinfo  订单详情表

    | orderid  订单id

    | userid    用户id

    | isPaid    是否支付

    | price    付款价格

    | paidTime  付款时间


  userinfo  用户信息表

    | userid    用户id

    | sex        用户性别

    | birth      用户出生日期

二、数据分析目的

1、统计不同月份的下单人数

select   year(paidTime),month(paidTime),count(distinct userid) as cons

from orderinfo where isPaid="已支付"  and paidTime<>'0000-00-00 00:00:00'

group by year(paidTime),month(paidTime);

【粗体斜字部分为对脏数据进行清洗】




2、统计用户三月份的回购率和复购率

复购率:当月购买了多次的用户占当月用户的比例


a、先筛选出3月份的消费情况

select  *    from orderinfo  where isPaid="已支付"

and month(paidTime)="03";

b、统计一下每个用户在3月份消费了多少次

select  userid,count(1) as cons  from orderinfo

where isPaid="已支付"    and month(paidTime)="03"   group by userid;

c、对购买次数做一个判断,统计出来那些消费了多次(大于1次)的用户数

select  count(1) as userid_cons,

  sum(if(cons>1,1,0)) as fugou_cons,

  sum(if(cons>1,1,0))/count(1) as fugou_rate

from (select userid, count(1)  as cons    from  orderinfo

      where isPaid="已支付"    and month(paidTime)="03"   group by userid) a;



回购率:上月购买用户中有多少用户本月又再次购买

本月回购率:本月购买用户中有多少用户下个月又再次购买

3月份的回购率 =  3月用户中4月又再次购买的人数 / 3月的用户总数

a、统计每年每月的一个用户消费情况

select

  userid,

  date_format(paidTime,'%Y-%m-01') as month_dt,

  count(1) as cons

from orderinfo

where isPaid="已支付"

group by userid,date_format(paidTime,'%Y-%m-01');

b、相邻月份进行关联,能关联上的用户说明就是回购

select

  *

from (select

  userid,

  date_format(paidTime,'%Y-%m-01') as month_dt,

  count(1) as cons

from orderinfo

where isPaid="已支付"

group by userid,date_format(paidTime,'%Y-%m-01')) a

left join (select

  userid,

  date_format(paidTime,'%Y-%m-01') as month_dt,

  count(1) as cons

from orderinfo

where isPaid="已支付"

group by userid,date_format(paidTime,'%Y-%m-01')) b

on a.userid=b.userid

and date_sub(b.month_dt,interval 1 month)=a.month_dt;


c、统计每个月份的消费人数情况及格得到回购率

select

  a.month_dt,

  count(a.userid) ,

  count(b.userid) ,

  count(b.userid) / count(a.userid)

from (select

  userid,

  date_format(paidTime,'%Y-%m-01') as month_dt,

  count(1) as cons

from orderinfo

where isPaid="已支付"

group by userid,date_format(paidTime,'%Y-%m-01')) a

left join (select

  userid,

  date_format(paidTime,'%Y-%m-01') as month_dt,

  count(1) as cons

from orderinfo

where isPaid="已支付"

group by userid,date_format(paidTime,'%Y-%m-01')) b

on a.userid=b.userid

and date_sub(b.month_dt,interval 1 month)=a.month_dt

group by a.month_dt;

3、统计男女用户消费频次是否有差异

1、统计每个用户的消费次数,注意要带性别

select

  a.userid,

  sex,

  count(1) as cons

from orderinfo a

inner join (select * from userinfo where sex<>'') b

on a.userid=b.userid

group by a.userid,sex;

2、对性别做一个消费次数平均计算

select

  sex,

  avg(cons) as avg_cons

from (select

  a.userid,

  sex,

  count(1) as cons

from orderinfo a

inner join (select * from userinfo where sex<>'') b

on a.userid=b.userid

group by a.userid,sex) a

group by sex;


4、统计多次消费的用户,第一次和最后一次消费间隔是多少天

1、取出多次消费的用户

select

  userid

from orderinfo

where isPaid="已支付"

group by userid

having count(1)>1;

2、取出第一次和最后一次的时间

select

  userid,

  min(paidTime),

  max(paidTime),

  datediff(max(paidTime), min(paidTime))

from orderinfo

where isPaid="已支付"

group by userid

having count(1)>1;

5、统计不同年龄段,用户的消费金额是否有差异

a、计算每个用户的年龄,并对年龄进行分层:0-10:1,11-20:2,21-30:3

select

  userid,

  birth,

  now(),

  timestampdiff(year,birth,now()) as age

from userinfo

where birth>'1900-00-00';

select

  userid,

  birth,

  now(),

  ceil(timestampdiff(year,birth,now())/10) as age

from userinfo

where birth>'1901-00-00';


ceil函数返回不小于x的最小整数

b、关联订单信息,获取不同年龄段的一个消费频次和消费金额

select

  a.userid,

  age,

  count(1) as cons,

  sum(price) as prices

from orderinfo a

inner join (select

  userid,

  birth,

  now(),

  ceil(timestampdiff(year,birth,now())/10) as age

from userinfo

where birth>'1901-00-00') b

on a.userid=b.userid

group by a.userid,age;


c、再对年龄分层进行聚合,得到不同年龄层的消费情况

select

  age,

  avg(cons),

  avg(prices)

from (select

  a.userid,

  age,

  count(1) as cons,

  sum(price) as prices

from orderinfo a

inner join (select

  userid,

  birth,

  now(),

  ceil(timestampdiff(year,birth,now())/10) as age

from userinfo

where birth>'1901-00-00') b

on a.userid=b.userid

group by a.userid,age) a

group by age;


6、统计消费的二八法则,消费的top20%用户,贡献了多少消费额

1、统计每个用户的消费金额,并进行一个降序排序

select

  userid,

  sum(price) as total_price

from orderinfo a

where isPaid="已支付"

group by userid;


2、统计一下一共有多少用户,以及总消费金额是多少

select

  count(1) as cons,

  sum(total_price) as all_price

from (select

  userid,

  sum(price) as total_price

from orderinfo a

where isPaid="已支付"

group by userid) a;


3、取出前20%的用户进行金额统计

select

  count(1) as cons,

  sum(total_price) as all_price

from (

select

  userid,

  sum(price) as total_price

from orderinfo a

where isPaid="已支付"

group by userid

order by total_price desc

limit 17000) b ;



最后大概20%有2.7亿金额,总共金额大概3.18亿

select (2.7/3.18);

:前20%的用户贡献 了85%的金额,所以想要维持好运营要先维持好消费金额前20%的用户


你可能感兴趣的:(MySQL之用户行为分析项目)