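Environment: Windows, the cmd command line, MySQL 8.0. If loading data fails with an error, the usual cause is that local data loading is disabled; see the official documentation: https://dev.mysql.com/doc/refman/8.0/en/load-data-local.html
Here is how I fixed the error and loaded the data: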
mysql -u root -p shop --local-infile=1
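// --local-infile=1 enables local data loading for this client session, with shop as the default database. The server must allow it as well (the local_infile system variable).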
Enter password: ******
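// LOAD DATA LOCAL INFILE is the statement that loads a file from the client machine
mysql> LOAD DATA LOCAL INFILE 'C:/Users/tian/Desktop/order_info_utf.csv' into table shop.order
-> fields terminated by ',';
// fields terminated by ',' sets the field separator. CSV files use ',', but LOAD DATA defaults to tab, so it has to be stated.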
mysql> LOAD DATA LOCAL INFILE 'C:/Users/tian/Desktop/user_info_utf.csv' into table shop.user
-> fields terminated by ',';
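// The user table is loaded the same way, with ',' as the separator
SQL has many date and time functions. now() returns the current date and time; month(), year(), and day() extract the month, the year, and the day of the month from a field.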
select now()
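Common date formatting functions:
select paytime,date(paytime),date_format(paytime,'%Y-%m') ,date_add(paytime ,interval 1 day) from shop.order;
// To group by calendar month, keep the year in the grouping key. Only the key and aggregates are selected, because MySQL 8.0 enables ONLY_FULL_GROUP_BY by default.
select date_format(paytime,'%Y-%m') as ym,count(*) from shop.order
group by date_format(paytime,'%Y-%m');
// Grouping by month(paytime) instead looks only at the month number and merges the same month of different years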
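The difference between the two grouping keys can be checked outside the database. The following is a minimal Python sketch, not part of the original workflow; the timestamps are invented for illustration, and strftime("%Y-%m") and .month only stand in for date_format(paytime,'%Y-%m') and month(paytime).

```python
from collections import Counter
from datetime import datetime

# Hypothetical pay times, invented for illustration only
paytimes = [
    datetime(2016, 3, 5, 10, 0),
    datetime(2016, 3, 20, 18, 30),
    datetime(2017, 3, 1, 9, 15),
    datetime(2016, 4, 2, 12, 0),
]

# Key that keeps the year, like date_format(paytime, '%Y-%m')
by_year_month = Counter(t.strftime("%Y-%m") for t in paytimes)

# Key that drops the year, like month(paytime)
by_month = Counter(t.month for t in paytimes)

print(dict(by_year_month))
print(dict(by_month))
```

With the year in the key, March 2016 and March 2017 stay separate; with the month number alone they fall into the same group.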
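// distinct removes duplicates, so each user is counted once per month. '已支付' is the stored status value meaning "paid".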
select month(paytime),count(distinct useid) from shop.order
where inpay = '已支付'
group by month(paytime)
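To see what distinct changes in this count, here is a small Python sketch under stated assumptions: the (month, user id) rows are made up, and a set plays the role of count(distinct useid).

```python
from collections import defaultdict

# Hypothetical paid orders: (month number, user id), invented for illustration
orders = [(3, "u1"), (3, "u1"), (3, "u2"), (4, "u1"), (4, "u3"), (4, "u3")]

buyers_by_month = defaultdict(set)
rows_by_month = defaultdict(int)
for month, user in orders:
    buyers_by_month[month].add(user)  # a set keeps each user once, like distinct
    rows_by_month[month] += 1         # plain count(useid) would count every order

distinct_buyers = {m: len(users) for m, users in sorted(buyers_by_month.items())}
order_rows = dict(sorted(rows_by_month.items()))

print(distinct_buyers)
print(order_rows)
```

Without distinct, a user with several orders in one month would be counted once per order, so the figure would be the number of orders rather than the number of buyers.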
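The repurchase rate is the share of users who bought more than once within a given period. The return rate is the share of users who bought in one month and bought again in the following month.
select count(ct),count(if(ct>1,1,null)),count(if(ct>1,1,null))/count(ct) as repurchase_rate from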
(select useid,count(useid) as ct from shop.order
where inpay = '已支付'
and month(paytime) = 3
group by useid
order by useid) as t
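The repurchase rate logic of the query above can be mirrored in a few lines of Python. This is only a sketch on invented user ids; the function name repurchase_rate is hypothetical and chosen for the example.

```python
from collections import Counter

# Hypothetical paid orders for one month: one user id per order
paid_orders = ["u1", "u1", "u2", "u3", "u3", "u3", "u4"]


def repurchase_rate(user_ids):
    """Share of buyers with more than one order in the period."""
    orders_per_user = Counter(user_ids)  # like count(useid) ... group by useid
    buyers = len(orders_per_user)        # like count(ct)
    # like count(if(ct>1,1,null)): only users with ct > 1 are counted
    repeat_buyers = sum(1 for ct in orders_per_user.values() if ct > 1)
    return repeat_buyers / buyers if buyers else 0.0


rate = repurchase_rate(paid_orders)
print(rate)
```

Here two of the four buyers (u1 and u3) ordered more than once, which is the same ratio the SQL computes with count(if(ct>1,1,null))/count(ct).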
select t1.m,count(t1.m),count(t2.m),count(t2.m)/count(t1.m) as return_rate from
(select useid, date_format(paytime,'%Y-%m-01') as m from shop.order
where inpay = '已支付'
group by useid,date_format(paytime,'%Y-%m-01')
order by useid) as t1
left join
(select useid, date_format(paytime,'%Y-%m-01') as m from shop.order
where inpay = '已支付'
group by useid,date_format(paytime,'%Y-%m-01')
order by useid) as t2
on t1.useid = t2.useid and t1.m = date_sub(t2.m,interval 1 month)
group by t1.m
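The self join above pairs each (user, month) row with the same user's row for the following month, if there is one. A minimal Python sketch of that idea, assuming made-up data and hypothetical helper names (next_month, return_rates):

```python
from datetime import date

# Hypothetical (user id, first day of purchase month) pairs, already
# deduplicated, like the grouped subqueries t1 and t2
user_months = {
    ("u1", date(2016, 3, 1)), ("u1", date(2016, 4, 1)),
    ("u2", date(2016, 3, 1)),
    ("u3", date(2016, 4, 1)), ("u3", date(2016, 5, 1)),
}


def next_month(d):
    """First day of the month after d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def return_rates(pairs):
    buyers = {}    # month -> number of buyers, like count(t1.m)
    returned = {}  # month -> buyers seen again next month, like count(t2.m)
    for user, month in pairs:
        buyers[month] = buyers.get(month, 0) + 1
        if (user, next_month(month)) in pairs:  # the left join finds a match
            returned[month] = returned.get(month, 0) + 1
    return {m: returned.get(m, 0) / n for m, n in sorted(buyers.items())}


rates = return_rates(user_months)
for month, rate in rates.items():
    print(month, rate)
```

The last month in the data always shows a rate of 0, because no following month exists yet to match against; the same holds for the SQL result.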
select sex ,avg(ct) from (
select o.useid ,sex, count(1) as ct from shop.order o
inner join(
select * from shop.user
where sex <> '') t
on o.useid = t.userid
group by useid , sex) t2
group by sex
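The customer lifecycle is the period from when a customer first learns about a company, or the company first sets out to develop that customer, until the business relationship has fully ended and all related matters are settled.
(In other words, a customer goes through stages similar to a life: birth, growth, maturity, decline, and death.) Here it is approximated as the number of days between a user's first and last paid order.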
select useid,max(paytime),min(paytime),datediff(max(paytime),min(paytime)) from shop.order
where inpay = '已支付'
group by useid having count(1) > 1
// datediff() returns the number of days between two dates, using only the date part of each value
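The same first-to-last-order span can be sketched in Python. The orders below are invented, and lifecycle_days is a hypothetical name; the subtraction is done on calendar dates to match how datediff() ignores the time of day.

```python
from datetime import datetime

# Hypothetical paid orders: (user id, pay time), invented for illustration
orders = [
    ("u1", datetime(2016, 3, 1, 23, 50)),
    ("u1", datetime(2016, 3, 2, 0, 10)),
    ("u2", datetime(2016, 3, 5, 8, 0)),
    ("u3", datetime(2016, 3, 1, 12, 0)),
    ("u3", datetime(2016, 4, 15, 9, 0)),
]


def lifecycle_days(rows):
    spans = {}  # user -> (first pay time, last pay time, order count)
    for user, paid_at in rows:
        first, last, n = spans.get(user, (paid_at, paid_at, 0))
        spans[user] = (min(first, paid_at), max(last, paid_at), n + 1)
    # Keep users with more than one order (having count(1) > 1) and
    # subtract calendar dates only, as datediff() does
    return {
        user: (last.date() - first.date()).days
        for user, (first, last, n) in spans.items()
        if n > 1
    }


days = lifecycle_days(orders)
print(days)
```

Note that u1's two orders are only 20 minutes apart but fall on different dates, so the span counts as one day; u2 has a single order and is filtered out.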
select age,avg(ct) from
(
(select useid ,age , count(useid) as ct from shop.order o
inner join(
(select userid,ceil((year(now())-year(brithday))/10) as age from shop.user
where brithday>'1901-00-00')
) t
on o.useid = t.userid
group by useid,age)
) t1
group by age
order by age
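// ceil() rounds up to the nearest integer, so age here is a ten-year bucket rather than an age in years
Spending does differ across age groups, but the data may not be very clean, so the result is of limited reference value.
MySQL 8.0 does provide window functions such as ntile() for ranking within groups, but this example does not use them; the top 20% is instead approximated with order by and limit.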
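To make the bucket boundaries concrete, here is a short Python sketch of the same expression. The reference year 2016 and the birth years are assumptions picked for the example; the query itself uses year(now()).

```python
import math


def age_bucket(birth_year, current_year):
    """Mirror ceil((year(now()) - year(brithday)) / 10)."""
    return math.ceil((current_year - birth_year) / 10)


# Assumed reference year, chosen only for this example
for birth_year in (1996, 1995, 1986, 1985):
    print(birth_year, age_bucket(birth_year, 2016))
```

Because of the rounding up, a year difference of exactly 20 lands in bucket 2 while 21 lands in bucket 3, so bucket n covers year differences from 10(n-1)+1 to 10n.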
select count(useid),sum(total) from
(
select useid,sum(price) as total from shop.order o
// the subquery computes the total paid spend per user
where inpay = '已支付'
group by useid
order by total desc
) t
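The query above returns the total number of users and their total spend.
It counts 85649 users in total, so 20% is about 17130, rounded here to 17000.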
select count(useid),sum(total) from
(
select useid,sum(price) as total from shop.order o
where inpay = '已支付'
group by useid
order by total desc
limit 17000
// 17000 is the estimated size of the top 20% of users by spend
) t
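The query above returns the head count (about 17000) and the total spend of the top 20% of users.
So the share of spend contributed by the top 20% of users is:
272032633 / 318503081 * 100% ≈ 85.4%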
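The arithmetic can be checked in Python, and the order by ... limit approach can be sketched on a toy list. The two sums are the figures reported above; the per-user totals in toy and the function name top_share are hypothetical.

```python
def top_share(totals, fraction):
    """Share of total spend contributed by the top `fraction` of users."""
    ranked = sorted(totals, reverse=True)  # like order by total desc
    k = int(len(ranked) * fraction)        # like limit k
    return sum(ranked[:k]) / sum(ranked)


# Figures reported in the text: top-20% spend over total spend
share = 272032633 / 318503081
print(f"{share:.1%}")

# Hypothetical per-user totals, invented for illustration
toy = [500, 300, 50, 40, 30, 30, 20, 15, 10, 5]
print(top_share(toy, 0.2))
```

In the toy list the top two of ten users account for 800 of 1000, the same kind of ratio the two SQL queries produce when their sums are divided.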
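This example uses MySQL for data processing. It resolves the local data loading error and focuses on practicing date and time functions, joins, and similar features. It also practices SQL patterns that come up often in real business analysis:
1. count the users who placed orders in each month,
2. compute users' return rate and the March repurchase rate,
3. check whether order frequency differs between men and women,
4. check whether spending differs across age groups,
5. test the 80/20 rule on spending,
6. work out how much of total spend the top 20% of users contribute.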