Hive综合案例练习(中级)第二十九题:求出商品连续售卖的时间区间

求出商品连续售卖的时间区间

题目需求

从订单详情表(order_detail)中,求出商品连续售卖的时间区间

结果如下(截取部分):

Sku_id(商品id) Start_date(起始时间) End_date(结束时间)
1 2021-09-27 2021-09-27
1 2021-09-30 2021-10-01
1 2021-10-03 2021-10-08
10 2021-10-02 2021-10-03
10 2021-10-05 2021-10-08
11 2021-10-02 2021-10-08
12 2021-09-30 2021-09-30
12 2021-10-02 2021-10-06
12 2021-10-08 2021-10-08

代码实现

  • 每个商品售卖的日期以及拿到按排序后日期的差值
select sku_id,
       create_date,
       date_sub(create_date, rank() over (partition by sku_id order by create_date)) ddrk
from order_detail
group by sku_id, create_date;
  • 拿到每次售卖的区间
select distinct sku_id,
                first_value(t1.create_date)
                            over (partition by t1.sku_id,t1.ddrk order by t1.create_date rows between unbounded preceding and unbounded following) start_date,
                last_value(t1.create_date)
                           over (partition by t1.sku_id,t1.ddrk order by t1.create_date rows between unbounded preceding and unbounded following)  end_date
from (
         select sku_id,
                create_date,
                date_sub(create_date, rank() over (partition by sku_id order by create_date)) ddrk
         from order_detail
         group by sku_id, create_date
     ) t1

你可能感兴趣的:(Hive综合案例练习,hive,hadoop,数据仓库)