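6. Key-path conversion rate analysis: the funnel model
Conversion: for a specified business flow, the number of visitors who complete each step, and that number as a percentage of the previous step.
6.1 Requirements analysis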
step    number    rate
1       101110    100%
2       40000     40%
3       20000     20%
4       1000      10%
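As a worked illustration of the definition, the following minimal sketch computes each step's rate relative to the previous step. The counts are made-up values chosen for easy arithmetic, and `step_rates` is a hypothetical helper, not part of the project code.

```python
# Hypothetical per-step visitor counts, ordered from funnel entry to exit.
step_counts = [1000, 400, 200, 20]


def step_rates(counts):
    """Return each step's share of the previous step, as a percentage.

    The first step has no predecessor, so it is reported as 100%.
    """
    rates = [100.0]
    for prev, cur in zip(counts, counts[1:]):
        # Guard against an empty previous step to avoid dividing by zero.
        rates.append(100.0 * cur / prev if prev else 0.0)
    return rates


for step, (n, r) in enumerate(zip(step_counts, step_rates(step_counts)), start=1):
    print(f"step{step}: {n} visitors, {r:.0f}% of previous step")
```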
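6.2 Model design
Define the page identifier for each step of the business flow. In this example each step is identified by a request-path prefix, matched as a prefix (for example with like '/item%'):
Step1: /item
Step2: /category
Step3: /index
Step4: /order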
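The step definitions above amount to a small lookup from request path to funnel step. A minimal sketch, assuming plain prefix matching; `FUNNEL_STEPS` and `step_of` are hypothetical names introduced only for illustration:

```python
# Request-path prefix for each funnel step, in funnel order.
FUNNEL_STEPS = [
    ("step1", "/item"),
    ("step2", "/category"),
    ("step3", "/index"),
    ("step4", "/order"),
]


def step_of(request):
    """Return the funnel step whose prefix matches the request path, else None."""
    for step, prefix in FUNNEL_STEPS:
        if request.startswith(prefix):
            return step
    return None


print(step_of("/item/123"))   # a product page belongs to step1
print(step_of("/about"))      # a page outside the funnel maps to None
```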
6.3 Implementation
Develop the analysis step by step:
1. Query the total number of distinct visitors for each step
-- one row per funnel step; the prefixes follow the step definitions in 6.2
create table dw_oute_numbs as
select 'step1' as step,count(distinct remote_addr) as numbs from ods_click_pageviews where request like '/item%'
union all
select 'step2' as step,count(distinct remote_addr) as numbs from ods_click_pageviews where request like '/category%'
union all
select 'step3' as step,count(distinct remote_addr) as numbs from ods_click_pageviews where request like '/index%'
union all
select 'step4' as step,count(distinct remote_addr) as numbs from ods_click_pageviews where request like '/order%';
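To see what the per-step counts look like on concrete rows, here is a minimal sketch that runs the same `count(distinct remote_addr) ... like 'prefix%'` pattern against an in-memory SQLite table. It uses SQLite rather than Hive, the sample page views are invented, and the prefixes follow the step definitions in 6.2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table ods_click_pageviews (remote_addr text, request text)")
# Made-up page views: (visitor address, requested path).
conn.executemany(
    "insert into ods_click_pageviews values (?, ?)",
    [
        ("10.0.0.1", "/item/1"), ("10.0.0.1", "/item/2"),  # same visitor twice
        ("10.0.0.2", "/item/1"), ("10.0.0.3", "/item/9"),
        ("10.0.0.1", "/category/a"), ("10.0.0.2", "/category/b"),
        ("10.0.0.1", "/index"),
        ("10.0.0.1", "/order/42"),
    ],
)

prefixes = {"step1": "/item", "step2": "/category", "step3": "/index", "step4": "/order"}


def count_visitors(prefix):
    """Count distinct visitor addresses whose request starts with the prefix."""
    row = conn.execute(
        "select count(distinct remote_addr) from ods_click_pageviews where request like ?",
        (prefix + "%",),
    ).fetchone()
    return row[0]


step_counts = {step: count_visitors(prefix) for step, prefix in prefixes.items()}
print(step_counts)
```

Because the count is over distinct addresses, the visitor who viewed two item pages is counted once for step1.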
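2. Query the ratio of each step's visitor count to the count at the funnel entry (step1)
Approach: use a self-join. Joining the table to itself without a join condition pairs every step with every other step:
select rn.step as rnstep,rn.numbs as rnnumbs,rr.step as rrstep,rr.numbs as rrnumbs from dw_oute_numbs rn
inner join
dw_oute_numbs rr;
Then keep only the pairs whose right-hand side is step1 and divide:
select tmp.rnstep,tmp.rnnumbs/tmp.rrnumbs as ratio
from
(
select rn.step as rnstep,rn.numbs as rnnumbs,rr.step as rrstep,rr.numbs as rrnumbs from dw_oute_numbs rn
inner join
dw_oute_numbs rr) tmp
where tmp.rrstep='step1';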
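The self-join can be checked on a handful of rows. The sketch below is an illustration on SQLite, not the Hive job itself: it loads hypothetical counts into `dw_oute_numbs` and moves the step1 filter into the `on` clause, which gives the same pairs as filtering the full cross product afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table dw_oute_numbs (step text, numbs integer)")
# Hypothetical per-step visitor counts.
conn.executemany(
    "insert into dw_oute_numbs values (?, ?)",
    [("step1", 1000), ("step2", 400), ("step3", 200), ("step4", 20)],
)

# Pair every step (rn) with the single step1 row (rr) via the ON condition.
# SQLite truncates integer division, hence the * 1.0; Hive's / returns a double.
abs_ratio = conn.execute(
    """
    select rn.step, rn.numbs * 1.0 / rr.numbs as ratio
    from dw_oute_numbs rn
    inner join dw_oute_numbs rr on rr.step = 'step1'
    order by rn.step
    """
).fetchall()

for step, ratio in abs_ratio:
    print(step, f"{ratio:.2%}")
```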
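3. Query each step's conversion rate relative to the previous step (the leak rate of a step is 1 minus this value)
select tmp.rrstep as rrstep,tmp.rrnumbs/tmp.rnnumbs as ratio
from
(
select rn.step as rnstep,rn.numbs as rnnumbs,rr.step as rrstep,rr.numbs as rrnumbs from dw_oute_numbs rn
inner join
dw_oute_numbs rr) tmp
-- rn is the previous step, rr the current one: the step numbers differ by exactly 1
where cast(substr(tmp.rnstep,5,1) as int)=cast(substr(tmp.rrstep,5,1) as int)-1;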
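The pairing condition compares the digit at position 5 of the step name, so it matches stepN with stepN+1. A minimal SQLite sketch with hypothetical counts shows the pairs it produces and derives the leak rate as 1 minus the ratio; note that `substr(..., 5, 1)` reads a single character, so this approach only works for step1 through step9.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table dw_oute_numbs (step text, numbs integer)")
# Hypothetical per-step visitor counts.
conn.executemany(
    "insert into dw_oute_numbs values (?, ?)",
    [("step1", 1000), ("step2", 400), ("step3", 200), ("step4", 20)],
)

# rn is the previous step and rr the current step; step1 has no previous
# step, so it does not appear in the result.
rel_ratio = conn.execute(
    """
    select rr.step, rr.numbs * 1.0 / rn.numbs as ratio
    from dw_oute_numbs rn
    inner join dw_oute_numbs rr
      on cast(substr(rn.step, 5, 1) as integer)
       = cast(substr(rr.step, 5, 1) as integer) - 1
    order by rr.step
    """
).fetchall()

# Share of visitors lost between the previous step and this one.
leak_rate = {step: 1.0 - ratio for step, ratio in rel_ratio}
print(rel_ratio)
print(leak_rate)
```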
4. Combine the two metrics
select abs.step,abs.numbs,abs.ratio as abs_ratio,rel.ratio as rel_ratio
from
(
-- ratio of each step to the funnel entry (step1)
select tmp.rnstep as step,tmp.rnnumbs as numbs,tmp.rnnumbs/tmp.rrnumbs as ratio
from
(
select rn.step as rnstep,rn.numbs as rnnumbs,rr.step as rrstep,rr.numbs as rrnumbs from dw_oute_numbs rn
inner join
dw_oute_numbs rr) tmp
where tmp.rrstep='step1'
) abs
left outer join
(
-- ratio of each step to the previous step; step1 has no row here, so its rel_ratio is null
select tmp.rrstep as step,tmp.rrnumbs/tmp.rnnumbs as ratio
from
(
select rn.step as rnstep,rn.numbs as rnnumbs,rr.step as rrstep,rr.numbs as rrnumbs from dw_oute_numbs rn
inner join
dw_oute_numbs rr) tmp
where cast(substr(tmp.rnstep,5,1) as int)=cast(substr(tmp.rrstep,5,1) as int)-1
) rel
on abs.step=rel.step;
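As a point of comparison, the same report can be produced in one pass with window functions instead of self-joins. This is a different technique from the query above, sketched here on SQLite (3.25 or later) with hypothetical counts; Hive also provides `first_value` and `lag`, but the statement below has only been reasoned through for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table dw_oute_numbs (step text, numbs integer)")
# Hypothetical per-step visitor counts.
conn.executemany(
    "insert into dw_oute_numbs values (?, ?)",
    [("step1", 1000), ("step2", 400), ("step3", 200), ("step4", 20)],
)

# first_value() supplies the step1 count to every row; lag() supplies the
# previous step's count and is NULL for step1, so its rel_ratio is NULL.
report = conn.execute(
    """
    select step,
           numbs,
           numbs * 1.0 / first_value(numbs) over (order by step) as abs_ratio,
           numbs * 1.0 / lag(numbs) over (order by step) as rel_ratio
    from dw_oute_numbs
    order by step
    """
).fetchall()

for row in report:
    print(row)
```

Ordering by the step name relies on the names sorting in funnel order, which holds for step1 through step9.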
8 Module development: result export
The report statistics are exported from the Hive tables with sqoop. An example follows; see the project code for details.
sqoop export \
--connect jdbc:mysql://hdp-node-01:3306/webdb --username root --password root \
--table click_stream_visit \
--export-dir /user/hive/warehouse/dw_click.db/click_stream_visit/datestr=2013-09-18 \
--input-fields-terminated-by '\001'
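Since the export directory is a date partition, the command is typically issued once per day. The sketch below only assembles the argument list for a given date, using exactly the options shown above; `build_sqoop_export` is a hypothetical helper and nothing is executed.

```python
import shlex


def build_sqoop_export(datestr):
    """Assemble the sqoop export command for one date partition as an argv list."""
    export_dir = (
        "/user/hive/warehouse/dw_click.db/click_stream_visit/datestr=" + datestr
    )
    return [
        "sqoop", "export",
        "--connect", "jdbc:mysql://hdp-node-01:3306/webdb",
        "--username", "root",
        "--password", "root",
        "--table", "click_stream_visit",
        "--export-dir", export_dir,
        # Hive's default field delimiter, passed literally as backslash-0-0-1.
        "--input-fields-terminated-by", "\\001",
    ]


cmd = build_sqoop_export("2013-09-18")
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the export on a machine where sqoop is installed.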