15. Data Warehouse Layering: DWS and ADS Layers -- Returning Users This Week

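A returning user this week is a user who was active this week, is not newly registered this week, and did not log in last week. So, treating each "-" as a set difference over device IDs (mid_id) rather than arithmetic on counts:
returning users this week = active users this week - new users this week - active users last week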
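
To make the set-difference reading concrete, here is a minimal HiveQL sketch on made-up data. The CTE names and the device IDs (m1 to m4, m9) are hypothetical and are not part of the gmall warehouse; the sketch assumes a Hive version that supports CTEs and `select` without a `from` clause.

```sql
-- Hypothetical inline data, for illustration only.
with active_this_week as (
    select 'm1' as mid_id union all
    select 'm2' as mid_id union all
    select 'm3' as mid_id union all
    select 'm4' as mid_id
),
new_this_week as (
    select 'm2' as mid_id
),
active_last_week as (
    select 'm3' as mid_id union all
    select 'm9' as mid_id
)
-- Keep devices active this week with no match in either of the other two sets.
select count(*) as back_count
from active_this_week a
left join new_this_week n on a.mid_id = n.mid_id
left join active_last_week l on a.mid_id = l.mid_id
where n.mid_id is null and l.mid_id is null;
```

Only m1 and m4 survive both filters, so the query returns 2. Subtracting raw counts would give 4 - 1 - 2 = 1, because m9 was active last week but is not active this week; this is why the real query joins on mid_id instead of subtracting totals.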
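  1. The DWS layer uses the weekly active device detail table dws_uv_detail_wk and the new device table dws_new_mid_day as its source data.
  2. In the ADS layer, create the table ads_back_count.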

    hive (gmall)>
    drop table if exists ads_back_count;
    create external table ads_back_count( 
       `dt` string COMMENT 'Statistics date',
       `wk_dt` string COMMENT 'Week containing the statistics date',
       `wastage_count` bigint COMMENT 'Number of returning devices'
    ) 
    row format delimited fields terminated by '\t'
    location '/warehouse/gmall/ads/ads_back_count';
  3. Insert the data

    hive (gmall)>
    insert into table ads_back_count
    select
       '2020-02-03' dt,
       concat(date_add(next_day('2020-02-03','mo'),-7),'_',date_add(next_day('2020-02-03','mo'),-1)) wk_dt,
       count(*)
    from
    (
       -- keep devices active this week that are neither new this week nor active last week
       select t1.mid_id
       from
       (
           -- t1: devices active this week
           select mid_id
           from dws_uv_detail_wk
           where wk_dt=concat(date_add(next_day('2020-02-03','mo'),-7),'_',date_add(next_day('2020-02-03','mo'),-1))
       )t1 left join
       (
           -- t2: devices created (first seen) this week
           select mid_id
           from dws_new_mid_day
           where create_date>=date_add(next_day('2020-02-03','mo'),-7) and create_date<=date_add(next_day('2020-02-03','mo'),-1)
       )t2 on t1.mid_id=t2.mid_id
       left join
       (
           -- t3: devices active last week
           select mid_id
           from dws_uv_detail_wk
           where wk_dt=concat(date_add(next_day('2020-02-03','mo'),-7*2),'_',date_add(next_day('2020-02-03','mo'),-1-7))
       )t3 on t1.mid_id=t3.mid_id
       -- a null on the right side of a left join means no match in that set
       where t2.mid_id is null and t3.mid_id is null
    )t4;
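
The week boundaries in the query above all come from the same `next_day` / `date_add` pattern. As a quick check, the expressions can be evaluated on their own; the column aliases below are hypothetical names chosen for readability, and the sketch assumes a Hive version that allows `select` without a `from` clause.

```sql
-- 2020-02-03 is a Monday, so next_day(..., 'mo') jumps to the following Monday.
-- Expected values, in column order:
--   2020-02-10, 2020-02-03, 2020-02-09, 2020-01-27, 2020-02-02
select
    next_day('2020-02-03','mo')                as next_monday,
    date_add(next_day('2020-02-03','mo'),-7)   as this_week_start,
    date_add(next_day('2020-02-03','mo'),-1)   as this_week_end,
    date_add(next_day('2020-02-03','mo'),-7*2) as last_week_start,
    date_add(next_day('2020-02-03','mo'),-1-7) as last_week_end;
```

With these values, the wk_dt filter for t1 matches '2020-02-03_2020-02-09' and the filter for t3 matches '2020-01-27_2020-02-02'. Because `next_day` always returns a date strictly after its argument, the same expressions yield the Monday-to-Sunday week containing any given date.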
  4. Query the result

    hive (gmall)> select * from ads_back_count;
