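1. Current-state analysis
How is the company's overall revenue doing? How are sales by product and by channel? Which channels perform best and deserve a larger ad budget?
2. Root-cause analysis of metrics
Why does this product sell best? Do women buy more? Is sales volume related to age?
3. Root-cause analysis of users
Why is the repurchase rate low? Which users are our core users, and how do they differ from churned users?
4. Predictive analysis with forecasting models
Given how much has been spent on ads, can we expect to break even, and roughly how long will that take?
5. Predictive analysis for target setting
How should next quarter's KPIs be set, and what target can we expect to reach?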
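1. Current-state analysis: data reports and routine monitoring.
2. Root-cause analysis: data drill-down and user mining.
3. Predictive analysis: forecasting models and target setting.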
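1. Growth metrics: the number of new users is a key measure of how well channel promotion works. It is usually broken down by basic user information and behavioral preferences (for example gender, age, occupation and income, city tier).
2. Activity metrics: these assess user retention (that is, stickiness). Depending on the counting period, active users are reported as daily (DAU), weekly (WAU) or monthly (MAU) active users.
3. Monetization metrics: these judge whether the product ultimately makes money, mainly through revenue, user count and average order value.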
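As a rough illustration of the activity metrics, the sketch below counts distinct active users per day, ISO week and month. The `events` log and the `active_users` helper are hypothetical and not part of the original data; a real log would come from the product's tracking tables.

```python
from collections import defaultdict
from datetime import date

# Hypothetical activity log: (user_id, activity_date) pairs.
events = [
    ("u1", date(2020, 12, 1)),
    ("u2", date(2020, 12, 1)),
    ("u1", date(2020, 12, 2)),
    ("u3", date(2020, 12, 15)),
    ("u1", date(2020, 12, 15)),
]


def active_users(events, key):
    """Count distinct users per period; `key` maps a date to its period label."""
    buckets = defaultdict(set)
    for user_id, day in events:
        buckets[key(day)].add(user_id)
    return {period: len(users) for period, users in buckets.items()}


dau = active_users(events, key=lambda d: d.isoformat())
# (ISO year, ISO week number) identifies a week unambiguously.
wau = active_users(events, key=lambda d: tuple(d.isocalendar())[:2])
mau = active_users(events, key=lambda d: (d.year, d.month))
```

The only difference between the three metrics is the period label, so one helper covers all of them; a user is counted once per period no matter how many events they produce.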
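1. Growth metrics extend into the ad-placement domain and the sales domain. Besides acquiring new users through ads, the sales domain covers reaching users through the sales team.
2. Activity metrics extend into the attendance domain and the practice domain.
3. Monetization metrics extend into the order domain.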
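1. The product manager or data analyst collects the relevant metrics and their definitions from the business requirements.
2. Engineering runs a technical review of the metrics and schedules their production; at the same time, product and engineering consolidate the metrics in the data middle platform.
3. The product manager packages the reviewed (or already existing) metrics and designs how they are presented in the product, again driven by the business requirements.
4. Once the metric dashboard is built, a data dictionary must be published.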
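1. Search engines and in-feed ads. What to watch: creative, copy, region, channel, course type.
2. Offline promotion. What to watch: region, user attributes, school, on-site staff.
3. New-media promotion. Low cost and fast spread, but user quality is unstable.
4. Third-party app stores. Good placement and recommendations both affect downloads and the size of the user base.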
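1. List every metric. Ad placement is optimized by analyzing metrics across several dimensions: CPC, CTR, CVR, cost, valid-lead rate, ROI, average order value and so on.
2. Dissect the data. Combine the data for all metrics and analyze them together so that nothing is missed; ideally everything sits in one base table.
3. Focus on the key steps. Spend as much time as possible on the actual problems and on optimizing them. Core formula: revenue = volume * lead conversion rate * ARPU (average order value)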
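The core formula can be tried out with a few numbers. The sketch below is only an illustration: the function name `revenue` and all input figures are made up, and "volume" is read here as the number of leads.

```python
def revenue(leads: int, conversion_rate: float, arpu: float) -> float:
    """Revenue = volume * lead conversion rate * ARPU."""
    return leads * conversion_rate * arpu


# Invented baseline: 10,000 leads, 2% conversion, 300 per paying user.
baseline = revenue(leads=10_000, conversion_rate=0.02, arpu=300.0)

# Two ways to lift revenue by 20%: buy more volume, or convert better.
more_leads = revenue(leads=12_000, conversion_rate=0.02, arpu=300.0)
better_conversion = revenue(leads=10_000, conversion_rate=0.024, arpu=300.0)
```

Because the factors multiply, a 20% gain in any one of them lifts revenue by the same 20%, so the practical question is which factor is cheapest to move.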
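A data set of about 1.8 million rows (roughly 400 MB) is used here.
It contains the following information:
A few simple aggregations over the table in MySQL: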
# Channel attributes (the alias 用户数 is the number of distinct users)
select channel, count(distinct userid) as 用户数
from `part4_ad_test原表`
group by channel
order by 用户数 desc;
# User behavior attributes (only type='show' rows carry cost; only type='buy' rows carry amount)
select type, count(distinct userid) as 用户数
from `part4_ad_test原表`
group by type;
# User attributes
select city, city_level, count(distinct userid) as 用户数
from `part4_ad_test原表`
group by city, city_level
order by 用户数 desc;
# Product attributes
select item_type, count(distinct userid) as 用户数
from `part4_ad_test原表`
group by item_type
order by 用户数 desc;
# Overall funnel metrics for December 2020
SELECT
sum(amount) amount,
round(sum(cost),2) cost,
sum(case when type='show' then 1 else 0 end) as ad_show,
sum(case when type='click' then 1 else 0 end) as ad_click,
sum(case when type='buy' then 1 else 0 end) as ad_buy,
Convert(sum(cost)/sum(case when type='show' then 1 else 0 end)*1000, decimal(18,2)) CPM,
Convert(sum(cost)/sum(case when type='click' then 1 else 0 end), decimal(18,2)) CPC,
# In MySQL, + is numeric addition, so the percent sign is appended with CONCAT
concat(round(sum(case when type='click' then 1 else 0 end)*100/NULLIF(sum(case when type='show' then 1 else 0 end),0),2),'%') AS CTR,
concat(round(sum(case when type='buy' then 1 else 0 end)*100/NULLIF(sum(case when type='click' then 1 else 0 end),0),2),'%') AS CVR,
# 订单成本 = ad cost per order
round(sum(cost)/sum(case when type='buy' then 1 else 0 end),2) as `订单成本`,
round(sum(amount)/sum(cost),2) ROI
FROM `part4_ad_test原表`
where substring(createtime,1,10) between '2020-12-01' and '2020-12-31';
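The same funnel metrics can be sketched in Python to make each formula explicit. This is a minimal sketch: the `funnel_metrics` and `safe_div` helpers and the sample rows are hypothetical, and only the `type`, `cost` and `amount` fields mirror the table used in the queries.

```python
def safe_div(numerator, denominator):
    """Return None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else None


def funnel_metrics(rows):
    """Aggregate ad events; each row is a dict with type, cost and amount."""
    rows = list(rows)
    shows = sum(1 for r in rows if r["type"] == "show")
    clicks = sum(1 for r in rows if r["type"] == "click")
    buys = sum(1 for r in rows if r["type"] == "buy")
    cost = sum(r["cost"] for r in rows)
    amount = sum(r["amount"] for r in rows)

    def pct(part, whole):
        ratio = safe_div(part, whole)
        return None if ratio is None else round(ratio * 100, 2)

    cost_per_show = safe_div(cost, shows)
    return {
        "CPM": None if cost_per_show is None else cost_per_show * 1000,
        "CPC": safe_div(cost, clicks),
        "CTR": pct(clicks, shows),       # clicks per 100 impressions
        "CVR": pct(buys, clicks),        # purchases per 100 clicks
        "cost_per_order": safe_div(cost, buys),
        "ROI": safe_div(amount, cost),
    }


# Made-up sample: 4 impressions at 0.5 each, 2 clicks, 1 purchase of 10.0.
sample = (
    [{"type": "show", "cost": 0.5, "amount": 0.0}] * 4
    + [{"type": "click", "cost": 0.0, "amount": 0.0}] * 2
    + [{"type": "buy", "cost": 0.0, "amount": 10.0}]
)
metrics = funnel_metrics(sample)
```

Returning `None` for a zero denominator plays the same role as `NULLIF(..., 0)` in the SQL: a channel with no impressions or clicks yields an empty cell rather than an error.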
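Following the core formula revenue = volume * lead conversion rate * ARPU (average order value), we assume it is best to start with impression volume and the conversion rate, measured here by CTR.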
# Impressions and clicks by channel
SELECT
channel,
sum(case when type='show' then 1 else 0 end) as ad_show,
sum(case when type='click' then 1 else 0 end) as ad_click
FROM `part4_ad_test原表`
GROUP BY channel
ORDER BY ad_show DESC;
# Click-through rate and ROI by channel, for channels with more than 1000 impressions
SELECT
channel,
sum(case when type='show' then 1 else 0 end) as ad_show,
sum(case when type='click' then 1 else 0 end) as ad_click,
# CONCAT appends the percent sign; + would be numeric addition in MySQL
concat(ROUND(sum(case when type='click' then 1 else 0 end)*100/sum(case when type='show' then 1 else 0 end),2),'%') as ctr,
round(sum(amount)/sum(cost),2) ROI
FROM `part4_ad_test原表`
GROUP BY channel
having sum(case when type='show' then 1 else 0 end)>1000
ORDER BY ROI DESC;
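The results show that Kuaishou (快手) has large volume and a high ROI, which makes it a channel worth more investment. Guangdiantong (广点通) has the highest ROI, but its volume is comparatively small, so it needs to be examined together with other data. Raising spend on the strength of ROI and CTR alone may lead to ROI falling as the user base grows.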
# Click-through rate by date
SELECT
substring(createtime,1,10) days,
sum(case when type='show' then 1 else 0 end) as ad_show,
sum(case when type='click' then 1 else 0 end) as ad_click,
concat(ROUND(sum(case when type='click' then 1 else 0 end)*100/sum(case when type='show' then 1 else 0 end),2),'%') AS ctr
FROM `part4_ad_test原表`
GROUP BY substring(createtime,1,10)
ORDER BY days DESC;
# Age attributes ('总计' labels the grand-total row produced by WITH ROLLUP)
# Combining ORDER BY with ROLLUP requires MySQL 8.0.12 or later
SELECT
coalesce(age,'总计') age,
sum(case when type='show' then 1 else 0 end) as ad_show,
sum(case when type='click' then 1 else 0 end) as ad_click,
CONCAT(ROUND(sum(case when type='click' then 1 else 0 end)*100/sum(case when type='show' then 1 else 0 end),2),'%') AS ctr
FROM `part4_ad_test原表`
GROUP BY age
with rollup
ORDER BY age;
# Gender and new vs. returning users
# Aliases: 消耗 = spend, 营收 = revenue, 展示pv / 点击pv / 购买pv = impression / click / purchase counts
SELECT
COALESCE(sex,'总计') sex,
COALESCE(user_type,'总计') user_type,
sum(cost) `消耗`,
SUM(amount) `营收`,
sum(case when type='show' then 1 else 0 end) `展示pv`,
sum(case when type='click' then 1 else 0 end) `点击pv`,
sum(case when type='buy' then 1 else 0 end) `购买pv`,
concat(ROUND(sum(case when type='click' then 1 else 0 end)*100/sum(case when type='show' then 1 else 0 end),2),'%') AS ctr
FROM `part4_ad_test原表`
GROUP BY
sex,
user_type
with ROLLUP
# Backticks are needed here: by default a double-quoted "消耗" is a string constant, so the sort would have no effect
ORDER BY `消耗` desc;
# Occupation attributes
SELECT
profession,
sum(case when type='show' then 1 else 0 end) as ad_show,
sum(case when type='click' then 1 else 0 end) as ad_click,
CONCAT(ROUND(sum(case when type='click' then 1 else 0 end)*100/sum(case when type='show' then 1 else 0 end),2),'%') AS ctr
FROM `part4_ad_test原表`
GROUP BY profession
ORDER BY ad_show desc
The syntax of Tableau calculated fields differs slightly from MySQL syntax.
cpc:sum([Cost])/sum(case [Type] when 'click' then 1 else 0 end)
cpm:sum([Cost])/sum(case [Type] when 'show' then 1 else 0 end)*1000
ctr:round(sum(case [Type] when 'click' then 1 else 0 end)*100/ifnull(sum(case [Type] when 'show' then 1 else 0 end),0),2)
cvr:round(sum(case [Type] when 'buy' then 1 else 0 end)*100/ifnull(sum(case [Type] when 'click' then 1 else 0 end),0),2)
roi:sum([Amount])/sum([Cost])
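RFM:
R (recency): how recently the user last purchased, a measure of stickiness;
F (frequency): how often the user purchases, a measure of loyalty;
M (monetary): how much the user spends, a measure of purchasing power.
An RFM table is prepared here:
Its three columns are the user, the purchase time and the purchase amount.
R is the time since the most recent purchase, so it is computed as a date difference; in MySQL this is done with the DATEDIFF function.
F is the purchase frequency, that is, the number of rows per user; count(*) is enough in MySQL.
M is the amount spent, that is, the sum of the user's purchase amounts; use the sum function in MySQL.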
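# Per-user R, F and M relative to the reference date 2021-01-01
# R is the gap to the most recent purchase, so take the smallest DATEDIFF per user
select user,
min(DATEDIFF('2021-1-1',pay_day)) as R,
count(*) as F,
sum(pay_amount) as M
from `part5_1.RFM原表`
group by user;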
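To make the three definitions concrete, the per-user aggregation can also be sketched in Python, taking R as the number of days since the most recent purchase. The `orders` rows and the `rfm` helper are invented for illustration; only the column meanings (user, purchase day, purchase amount) and the reference date come from the text.

```python
from datetime import date

# Hypothetical purchase records: (user, pay_day, pay_amount).
orders = [
    ("alice", date(2020, 12, 20), 120.0),
    ("alice", date(2020, 6, 1), 80.0),
    ("bob", date(2019, 3, 15), 40.0),
]
REFERENCE_DAY = date(2021, 1, 1)


def rfm(orders, reference_day):
    """Return {user: (R, F, M)} with R in days since the latest purchase."""
    table = {}
    for user, pay_day, pay_amount in orders:
        days_ago = (reference_day - pay_day).days
        r, f, m = table.get(user, (days_ago, 0, 0.0))
        # Keep the smallest gap (most recent order), count orders, sum amounts.
        table[user] = (min(r, days_ago), f + 1, m + pay_amount)
    return table


rfm_table = rfm(orders, REFERENCE_DAY)
```

For `alice` the older June order does not affect R, because recency only looks at the latest purchase; it still counts towards F and M.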
# Distribution of R on its own (用户数 = number of users)
SELECT R, count(*) '用户数'
from (
select user,
min(DATEDIFF('2021-1-1',pay_day)) as R,
count(*) as F,
sum(pay_amount) as M
from `part5_1.RFM原表`
group by user)a
GROUP BY R
ORDER BY R;
Query the distribution of F on its own:
SELECT F, count(*) '用户数'
from (
select user,
min(DATEDIFF('2021-1-1',pay_day)) as R,
count(*) as F,
sum(pay_amount) as M
from `part5_1.RFM原表`
group by user)a
GROUP BY F
ORDER BY F;
Query the distribution of M on its own:
SELECT M, count(*) '用户数'
from (
select user,
min(DATEDIFF('2021-1-1',pay_day)) as R,
count(*) as F,
sum(pay_amount) as M
from `part5_1.RFM原表`
group by user)a
GROUP BY M
ORDER BY M;
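Based on the R, F and M distributions, each value can be given a score from 1 to 5:
R score: under 150 days scores 5; 150 to 299 scores 4; 300 to 449 scores 3; 450 to 599 scores 2; 600 or more scores 1.
F score: 1 purchase scores 1; 2 purchases score 2; 3 to 10 score 3; 11 to 17 score 4; 18 or more score 5.
M score: under 50 scores 1; 50 to under 100 scores 2; 100 to under 500 scores 3; 500 to under 5000 scores 4; 5000 or more scores 5.
As a MySQL statement: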
# The WITH (CTE) syntax requires MySQL 8.0 or later
with t as (
select *,
(case
when R<150 then 5
when R between 150 and 299 then 4
when R between 300 and 449 then 3
when R between 450 and 599 then 2
when R>=600 then 1
else null end) as R_p,
(case
when F=1 then 1
when F=2 then 2
when F between 3 and 10 then 3
when F between 11 and 17 then 4
when F>=18 then 5
else null end) as F_p,
# Half-open bands, so fractional amounts such as 99.5 still get a score
(case
when M<50 then 1
when M<100 then 2
when M<500 then 3
when M<5000 then 4
when M>=5000 then 5
else null end) as M_p
from (
select
user,
# Recency: days since the most recent purchase
min(DATEDIFF('2021-1-1',pay_day)) as R,
count(*) as F,
sum(pay_amount) as M
from `part5_1.RFM原表`
group by user
)a
)
# Average score per dimension
select avg(R_p*1.0), avg(F_p*1.0), avg(M_p*1.0) from t;
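The averages computed at the end give the typical level of each of the R, F and M scores.
With them we can define:
an R score above 3.359 is rated A, otherwise B;
an F score above 2.678 is rated A, otherwise B;
an M score above 3.189 is rated A, otherwise B.
The thresholds 3.359, 2.678 and 3.189 are the averages obtained for this data set; they should be recomputed whenever the data or the scoring bands change.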
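All users can then be classified as follows (the three letters are the R, F and M ratings, in that order):
1. AAA: important value users (重要价值用户)
2. ABA: important development users (重要发展用户)
3. BAA: important retention users (重要保持用户)
4. BBA: important win-back users (重要挽留用户)
5. AAB: general value users (一般价值用户)
6. ABB: general development users (一般发展用户)
7. BAB: general retention users (一般保持用户)
8. BBB: general win-back users (一般挽留用户)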
Expressed in MySQL:
with t as (
select *,
(case
when R<150 then 5
when R between 150 and 299 then 4
when R between 300 and 449 then 3
when R between 450 and 599 then 2
when R>=600 then 1
else null end) as R_p,
(case
when F=1 then 1
when F=2 then 2
when F between 3 and 10 then 3
when F between 11 and 17 then 4
when F>=18 then 5
else null end) as F_p,
# Half-open bands, so fractional amounts such as 99.5 still get a score
(case
when M<50 then 1
when M<100 then 2
when M<500 then 3
when M<5000 then 4
when M>=5000 then 5
else null end) as M_p
from (
select
user,
# Recency: days since the most recent purchase
min(DATEDIFF('2021-1-1',pay_day)) as R,
count(*) as F,
sum(pay_amount) as M
from `part5_1.RFM原表`
group by user
)a
)
select
*,
(case
when R_class='B' and F_class='B' and M_class='B' then '一般挽留用户'
when R_class='B' and F_class='B' and M_class='A' then '重要挽留用户'
when R_class='B' and F_class='A' and M_class='B' then '一般保持用户'
when R_class='B' and F_class='A' and M_class='A' then '重要保持用户'
when R_class='A' and F_class='B' and M_class='B' then '一般发展用户'
when R_class='A' and F_class='B' and M_class='A' then '重要发展用户'
when R_class='A' and F_class='A' and M_class='B' then '一般价值用户'
when R_class='A' and F_class='A' and M_class='A' then '重要价值用户'
else null end) as user_type
from (
select
t.*,
# Compare the 1-5 scores (R_p, F_p, M_p), not the raw values, with the average scores
(case when R_p>=3.359 then 'A' else 'B' end) as R_class,
(case when F_p>=2.678 then 'A' else 'B' end) as F_class,
(case when M_p>=3.189 then 'A' else 'B' end) as M_class
from t) a;
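The scoring and A/B classification can also be sketched in Python, which makes it easy to check individual users by hand. The band edges and the three thresholds are taken from the queries above; the helper names `band` and `segment` and the English labels are assumptions made for this sketch.

```python
import bisect

# Lower edges of score bands 2..5, taken from the CASE expressions.
R_EDGES = [150, 300, 450, 600]
F_EDGES = [2, 3, 11, 18]
M_EDGES = [50, 100, 500, 5000]
THRESHOLDS = (3.359, 2.678, 3.189)  # average R, F, M scores quoted in the text

LABELS = {
    "AAA": "important value users", "ABA": "important development users",
    "BAA": "important retention users", "BBA": "important win-back users",
    "AAB": "general value users", "ABB": "general development users",
    "BAB": "general retention users", "BBB": "general win-back users",
}


def band(value, edges):
    """Return 1..5: one plus the number of band edges the value has reached."""
    return bisect.bisect_right(edges, value) + 1


def segment(days_since_last, order_count, total_amount):
    """Score one user and map the R/F/M ratings to a segment label."""
    scores = (
        6 - band(days_since_last, R_EDGES),  # fewer days means a higher R score
        band(order_count, F_EDGES),
        band(total_amount, M_EDGES),
    )
    code = "".join("A" if s > t else "B" for s, t in zip(scores, THRESHOLDS))
    return scores, code, LABELS[code]


recent_light_buyer = segment(12, 2, 200.0)
lapsed_big_spender = segment(700, 20, 6000.0)
```

Since the scores are whole numbers, "above the average" and "at least the average" give the same rating for these thresholds, so this sketch agrees with the `>=` comparisons in the SQL.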
Equivalent calculated fields in Tableau, assuming the view is aggregated per user:
R: MIN(DATEDIFF('day',[Pay Day],#2021-01-01#))
R_p: IF [R]<150 THEN 5 ELSEIF [R]<300 THEN 4 ELSEIF [R]<450 THEN 3 ELSEIF [R]<600 THEN 2 ELSEIF [R]>=600 THEN 1 ELSE NULL END
R_class: IF [R_p]>=3.359 THEN 'A' ELSE 'B' END
F: SUM(1)
F_p: IF [F]=1 THEN 1 ELSEIF [F]=2 THEN 2 ELSEIF [F]<=10 THEN 3 ELSEIF [F]<=17 THEN 4 ELSEIF [F]>=18 THEN 5 ELSE NULL END
F_class: IF [F_p]>=2.678 THEN 'A' ELSE 'B' END
M: SUM([Pay Amount])
M_p: IF [M]<50 THEN 1 ELSEIF [M]<100 THEN 2 ELSEIF [M]<500 THEN 3 ELSEIF [M]<5000 THEN 4 ELSEIF [M]>=5000 THEN 5 ELSE NULL END
M_class: IF [M_p]>=3.189 THEN 'A' ELSE 'B' END
用户划分 (user segment): IF [R_class]='B' and [F_class]='B' and [M_class]='B' then '一般挽留用户' ELSEIF [R_class]='B' and [F_class]='B' and [M_class]='A' then '重要挽留用户' ELSEIF [R_class]='B' and [F_class]='A' and [M_class]='B' then '一般保持用户' ELSEIF [R_class]='B' and [F_class]='A' and [M_class]='A' then '重要保持用户' ELSEIF [R_class]='A' and [F_class]='B' and [M_class]='B' then '一般发展用户' ELSEIF [R_class]='A' and [F_class]='B' and [M_class]='A' then '重要发展用户' ELSEIF [R_class]='A' and [F_class]='A' and [M_class]='B' then '一般价值用户' ELSEIF [R_class]='A' and [F_class]='A' and [M_class]='A' then '重要价值用户' END
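Uses of LTV:
1. Computing the payback period and validating the unit-economics (UE) model. If the payback period turns out too long, or the cost is hard to recover at all, the product features and business logic need rework.
2. Comparing channel quality and adjusting the placement strategy: channel evaluation, cost control, sensible budget allocation, product feature adjustments and so on.
3. Monitoring anomalies. A large deviation from the expected figures likely points to agency fraud or fake orders.
4. Measuring user quality. ROI = LTV / customer acquisition cost (CAC); if LTV/CAC < 1, the business loses money.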
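The payback period and the LTV/CAC check from the list above can be sketched as follows. All figures are invented, LTV is simplified to the plain sum of per-user revenue over the observed months, and `payback_month` is a hypothetical helper name.

```python
def payback_month(cac, monthly_revenue_per_user):
    """Return the first month (1-based) in which cumulative revenue per user
    covers the acquisition cost, or None if it never does."""
    cumulative = 0.0
    for month, value in enumerate(monthly_revenue_per_user, start=1):
        cumulative += value
        if cumulative >= cac:
            return month
    return None


# Invented cohort: acquisition cost of 100 per user, revenue decaying by month.
cac = 100.0
monthly_revenue_per_user = [40.0, 30.0, 20.0, 15.0, 10.0, 5.0]

ltv = sum(monthly_revenue_per_user)   # simplified LTV over the observed window
ltv_to_cac = ltv / cac                # below 1 means the cohort loses money
months_to_payback = payback_month(cac, monthly_revenue_per_user)
```

Running the same calculation per channel gives a direct way to compare channel quality: a channel whose ratio stays below 1, or whose payback month is `None`, is a candidate for reduced spend.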