Calculating the User Repurchase Cycle

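The user repurchase cycle is the time interval between two consecutive purchases by the same user.
I. First, compute it with SQL
Note: if a user makes several purchases on the same day, they count as a single purchase.
1. Group by user id and purchase date so that multiple orders placed on the same day are merged into one record.
DROP TABLE IF EXISTS member_Repurchase_cycle_01;
CREATE TABLE member_Repurchase_cycle_01
AS
-- member_Repurchase_cycle_all is the table holding all order data.
-- SELECT * with GROUP BY is rejected under ONLY_FULL_GROUP_BY, so keep the earliest order time of each day.
SELECT memberid, MIN(createdatetime) AS createdatetime
FROM member_Repurchase_cycle_all
GROUP BY memberid, DATE(createdatetime)
ORDER BY memberid, createdatetime;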
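For comparison, the same one-record-per-user-per-day rule can be sketched in pandas. This is only an illustration: the sample rows are made up, and the helper column `purchase_date` is a name introduced here, not part of the original tables.

```python
import pandas as pd

# Hypothetical sample orders; column names follow the SQL above
orders = pd.DataFrame({
    "memberid": [1, 1, 1, 2],
    "createdatetime": pd.to_datetime([
        "2023-01-01 09:00", "2023-01-01 18:30",
        "2023-01-05 10:00", "2023-01-02 12:00",
    ]),
})

# Calendar day of each order (time part set to midnight)
orders["purchase_date"] = orders["createdatetime"].dt.normalize()

# Keep the earliest order per member per calendar day
daily = (
    orders.sort_values(["memberid", "createdatetime"])
          .drop_duplicates(subset=["memberid", "purchase_date"], keep="first")
          .reset_index(drop=True)
)
print(daily)
```

Member 1's two orders on the same day collapse into a single row, which mirrors the GROUP BY on `memberid` and `DATE(createdatetime)`.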
2. Select the ids of users with repurchase behavior (repurchase means having purchase records on more than one day).
DROP TABLE IF EXISTS Repurchase_cycle_memberid;
CREATE TABLE Repurchase_cycle_memberid
AS
SELECT memberid, COUNT(memberid) AS purchase_days
FROM member_Repurchase_cycle_01
GROUP BY memberid
HAVING COUNT(memberid) > 1
ORDER BY memberid;
3. Collect the records of repurchasing users, including the purchase time.
DROP TABLE IF EXISTS member_Repurchase_cycle_duble;
CREATE TABLE member_Repurchase_cycle_duble
AS
SELECT * FROM member_Repurchase_cycle_01
WHERE memberid IN (SELECT memberid FROM Repurchase_cycle_memberid);
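Steps 2 and 3 can also be expressed in pandas as a rough sketch. The sample data and the variable names `purchase_days`, `repeat_buyers` and `repeat_ids` are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical de-duplicated data: one row per member per purchase day
daily = pd.DataFrame({
    "memberid": [1, 1, 2, 3, 3, 3],
    "createdatetime": pd.to_datetime([
        "2023-01-01", "2023-01-05", "2023-01-02",
        "2023-01-03", "2023-01-04", "2023-01-10",
    ]),
})

# Number of purchase days per member, broadcast back onto every row
purchase_days = daily.groupby("memberid")["createdatetime"].transform("count")

# Keep only members who bought on more than one day
repeat_buyers = daily[purchase_days > 1]
repeat_ids = sorted(repeat_buyers["memberid"].unique().tolist())
print(repeat_ids)
```

Using `transform("count")` avoids the intermediate id table: the per-member count is aligned with the original rows, so a single boolean filter does the work of the `HAVING` clause plus the `IN` subquery.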
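4. Compute the interval between each pair of adjacent purchases for repurchasing users.
DROP TABLE IF EXISTS member_Repurchase_cycle_priod;
CREATE TABLE member_Repurchase_cycle_priod
AS
SELECT a.memberid, a.createdatetime,
       -- days since the same user's previous purchase (NULL for the first purchase)
       DATEDIFF(a.createdatetime,
                (SELECT MAX(b.createdatetime)
                 FROM member_Repurchase_cycle_duble b
                 WHERE b.memberid = a.memberid
                   AND b.createdatetime < a.createdatetime)) AS priod_days  -- alias name is illustrative
FROM member_Repurchase_cycle_duble a
ORDER BY a.memberid, a.createdatetime;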
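5. Extract the interval between each user's first and second purchase (here "first" means the first purchase day and "second" means the second purchase day). The query keeps each user's two earliest rows; the second row carries the first-to-second interval.
DROP TABLE IF EXISTS member_Repurchase_cycle_priod_fir_sec;
CREATE TABLE member_Repurchase_cycle_priod_fir_sec
AS
SELECT * FROM member_Repurchase_cycle_priod AS a
WHERE (SELECT COUNT(*) FROM member_Repurchase_cycle_priod AS b
       WHERE b.memberid = a.memberid AND b.createdatetime <= a.createdatetime) <= 2;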
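A minimal pandas sketch of the first-to-second interval follows, under the assumption that the data is already one row per user per purchase day. Unlike the SQL above, it keeps only the second purchase row of each user, since that row holds the interval. The column names `prev` and `gap_days` are hypothetical.

```python
import pandas as pd

# Hypothetical repurchasing users, one row per purchase day
daily = pd.DataFrame({
    "memberid": [1, 1, 1, 3, 3],
    "createdatetime": pd.to_datetime([
        "2023-01-01", "2023-01-05", "2023-01-20",
        "2023-01-03", "2023-01-04",
    ]),
})

daily = daily.sort_values(["memberid", "createdatetime"])

# Previous purchase date within each member, then the gap in whole days
daily["prev"] = daily.groupby("memberid")["createdatetime"].shift(1)
daily["gap_days"] = (daily["createdatetime"] - daily["prev"]).dt.days

# cumcount numbers each member's rows 0, 1, 2, ...; row 1 is the second purchase day
first_to_second = daily[daily.groupby("memberid").cumcount() == 1]
print(first_to_second[["memberid", "gap_days"]])
```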
II. Processing with Python
1. Import the packages

import pandas as pd
import numpy as np
from pandas import DataFrame,Series

2. Load the Excel file

shift=pd.read_excel('D:/FBS/member_repurchase_cycle_duble.xlsx')

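3. Use the pandas groupby and shift functions. The argument passed to shift controls whether values are shifted forward or backward, and the positions left empty are filled with missing values (NaN, or NaT for datetime columns). The argument passed to groupby determines which field the shift is applied within. The rows should already be sorted by memberid and createdatetime so that shift(1) returns each user's previous purchase time.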

shift['time_diff']=shift.groupby('memberid')['createdatetime'].shift(1)
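To make the direction of the shift concrete, here is a small sketch on made-up data. The frame `demo` and the columns `prev` and `next` are names chosen for this illustration only.

```python
import pandas as pd

# Hypothetical data, already sorted by memberid and createdatetime
demo = pd.DataFrame({
    "memberid": [1, 1, 1, 2, 2],
    "createdatetime": pd.to_datetime([
        "2023-01-01", "2023-01-05", "2023-01-20",
        "2023-01-02", "2023-01-09",
    ]),
})

grouped = demo.groupby("memberid")["createdatetime"]
demo["prev"] = grouped.shift(1)    # previous purchase of the same member
demo["next"] = grouped.shift(-1)   # next purchase of the same member
print(demo)
```

Each member's first row has no previous purchase and each member's last row has no next purchase, so those cells are NaT. Values never leak from one member into another because the shift is applied within each group.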

4. Subtract the shifted time from the current purchase time

shift['diff']=pd.to_datetime(shift['createdatetime'])-pd.to_datetime(shift['time_diff'])

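5. Define a helper function to make the result easier to read

def y(x):
    # Convert a Timedelta to fractional days; leave missing values (NaT) unchanged
    if pd.isna(x):
        return x
    return x.days + x.seconds / 3600 / 24
shift['diff_s'] = shift['diff'].apply(y)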
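As an alternative to a row-by-row function, the same conversion can probably be done in one vectorized step with the `.dt.total_seconds()` accessor. The sketch below uses a made-up Series named `diff` rather than the real data.

```python
import pandas as pd

# Hypothetical intervals, including one missing value
diff = pd.Series(
    [pd.Timedelta(days=4, hours=12), pd.NaT, pd.Timedelta(days=1)],
    dtype="timedelta64[ns]",
)

# Fractional days; missing intervals become NaN
diff_days = diff.dt.total_seconds() / 86400
print(diff_days)
```

On the real data this would correspond to `shift['diff'].dt.total_seconds() / 86400`, assuming `shift['diff']` has a timedelta dtype.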

The result is shown in the figure:
[Figure: example]
