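This problem generalizes to many similar ones: consecutive months of membership top-ups, consecutive days with at least one item sold, consecutive days of taking DiDi rides, consecutive overdue periods.
Target table: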
+---------+--------+-------------+-------------+
| uid     | times  | start_date  | end_date    |
+---------+--------+-------------+-------------+
| guid01  | 4      | 2018-03-04  | 2018-03-07  |
| guid02  | 3      | 2018-03-01  | 2018-03-03  |
+---------+--------+-------------+-------------+
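Approach:
Just write SQL. 1. Partition by uid, order by date, and assign row numbers. 2. Subtract the row number from the date and use the difference to check for consecutive days: within one run of consecutive dates, the date and the row number both grow by 1 per row, so the difference stays the same.
Full answer: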
select
uid,min(dt),max(dt),count(1) as counts
from
(
select
uid ,dt, date_sub(dt,rn) as dis
from
(
select
uid ,dt,row_number()over (partition by uid order by dt)rn
from continuous
)t1
)t2
group by uid ,dis having counts>2
Walkthrough:
1. Partition, sort, and assign row numbers
select
uid ,dt,row_number()over (partition by uid order by dt)rn
from continuous
Result table:
+------+-------------------+---+
|   uid|                 dt| rn|
+------+-------------------+---+
|guid02|2018-03-01 00:00:00|  1|
|guid02|2018-03-02 00:00:00|  2|
|guid02|2018-03-03 00:00:00|  3|
|guid02|2018-03-06 00:00:00|  4|
|guid01|2018-02-28 00:00:00|  1|
|guid01|2018-03-01 00:00:00|  2|
|guid01|2018-03-02 00:00:00|  3|
|guid01|2018-03-04 00:00:00|  4|
|guid01|2018-03-05 00:00:00|  5|
|guid01|2018-03-06 00:00:00|  6|
|guid01|2018-03-07 00:00:00|  7|
+------+-------------------+---+
2. Subtract the row number from the date; rows in the same consecutive run get the same difference
select
uid ,dt, date_sub(dt,rn) as dis
from
(
select
uid ,dt,row_number()over (partition by uid order by dt)rn
from continuous
)t1
Result table:
+------+-------------------+----------+
|   uid|                 dt|       dis|
+------+-------------------+----------+
|guid02|2018-03-01 00:00:00|2018-02-28|
|guid02|2018-03-02 00:00:00|2018-02-28|
|guid02|2018-03-03 00:00:00|2018-02-28|
|guid02|2018-03-06 00:00:00|2018-03-02|
|guid01|2018-02-28 00:00:00|2018-02-27|
|guid01|2018-03-01 00:00:00|2018-02-27|
|guid01|2018-03-02 00:00:00|2018-02-27|
|guid01|2018-03-04 00:00:00|2018-02-28|
|guid01|2018-03-05 00:00:00|2018-02-28|
|guid01|2018-03-06 00:00:00|2018-02-28|
|guid01|2018-03-07 00:00:00|2018-02-28|
+------+-------------------+----------+
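To see why the dis column lines up like this, here is a rough plain-Python sketch of the same "date minus row number" step. It is not the Hive/Spark query itself, only an illustration: the function name `date_minus_row_number` is made up for this example, and the rows are copied from the table above with the time part dropped.

```python
from datetime import date, timedelta

# Sample rows copied from the table above: (uid, dt)
rows = [
    ("guid02", date(2018, 3, 1)), ("guid02", date(2018, 3, 2)),
    ("guid02", date(2018, 3, 3)), ("guid02", date(2018, 3, 6)),
    ("guid01", date(2018, 2, 28)), ("guid01", date(2018, 3, 1)),
    ("guid01", date(2018, 3, 2)), ("guid01", date(2018, 3, 4)),
    ("guid01", date(2018, 3, 5)), ("guid01", date(2018, 3, 6)),
    ("guid01", date(2018, 3, 7)),
]


def date_minus_row_number(rows):
    """Return (uid, dt, dis) tuples, where dis = dt - rn days.

    rn is a 1-based counter per uid in date order, mirroring
    row_number() over (partition by uid order by dt).
    """
    out = []
    rn_by_uid = {}
    for uid, dt in sorted(rows):  # sort by uid, then by date
        rn = rn_by_uid.get(uid, 0) + 1
        rn_by_uid[uid] = rn
        out.append((uid, dt, dt - timedelta(days=rn)))
    return out


for uid, dt, dis in date_minus_row_number(rows):
    print(uid, dt, dis)
```

A gap in the dates makes dt jump by more than 1 while rn still grows by 1, so dis moves to a new value exactly where a run breaks.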
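3. Rows in the same consecutive run share the same difference, so group by uid and that difference and use count(1) to count the rows in each run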
select
uid,min(dt),max(dt),count(1) as counts
from
(
select
uid ,dt, date_sub(dt,rn) as dis
from
(
select
uid ,dt,row_number()over (partition by uid order by dt)rn
from continuous
)t1
)t2
group by uid ,dis having counts>2
Result table:
+------+-------------------+-------------------+------+
|   uid|            min(dt)|            max(dt)|counts|
+------+-------------------+-------------------+------+
|guid02|2018-03-01 00:00:00|2018-03-03 00:00:00|     3|
|guid01|2018-02-28 00:00:00|2018-03-02 00:00:00|     3|
|guid01|2018-03-04 00:00:00|2018-03-07 00:00:00|     4|
+------+-------------------+-------------------+------+
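The grouping step can also be sketched in plain Python. This is only an illustration of the idea, not the SQL engine's behavior: `consecutive_runs` and its `min_len` parameter are names invented here (`min_len=3` plays the role of `counts>2`), and the sketch assumes at most one row per uid per day, since a duplicated date would shift the row numbers and split a run.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows copied from the tables above: (uid, dt)
rows = [
    ("guid02", date(2018, 3, 1)), ("guid02", date(2018, 3, 2)),
    ("guid02", date(2018, 3, 3)), ("guid02", date(2018, 3, 6)),
    ("guid01", date(2018, 2, 28)), ("guid01", date(2018, 3, 1)),
    ("guid01", date(2018, 3, 2)), ("guid01", date(2018, 3, 4)),
    ("guid01", date(2018, 3, 5)), ("guid01", date(2018, 3, 6)),
    ("guid01", date(2018, 3, 7)),
]


def consecutive_runs(rows, min_len=3):
    """Return sorted (uid, start, end, counts) for runs of at least min_len days."""
    dates_by_uid = defaultdict(list)
    for uid, dt in rows:
        dates_by_uid[uid].append(dt)

    # Key each date by (uid, dt - rn days); one key corresponds to one run.
    groups = defaultdict(list)
    for uid, dts in dates_by_uid.items():
        for rn, dt in enumerate(sorted(dts), start=1):
            groups[(uid, dt - timedelta(days=rn))].append(dt)

    return sorted(
        (uid, min(dts), max(dts), len(dts))
        for (uid, _dis), dts in groups.items()
        if len(dts) >= min_len
    )


for run in consecutive_runs(rows):
    print(run)
```

The single guid02 row on 2018-03-06 forms a run of length 1, so the length filter drops it, which is why it does not appear in the table above.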
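4. In this result a user can still appear more than once. To show only the longest run for each user, wrap the query in two more levels of select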
select
*
from
(
select
*,
row_number() over(partition by uid order by counts desc) aa
from
(
select
uid,min(dt),max(dt),count(1) as counts
from
(
select
uid ,dt, date_sub(dt,rn) as dis
from
(
select
uid ,dt,row_number()over (partition by uid order by dt)rn
from continuous
)t1
)t2
group by uid ,dis having counts>2
)t3
)t4
where aa = 1
Result table:
+------+-------------------+-------------------+------+---+
|   uid|            min(dt)|            max(dt)|counts| aa|
+------+-------------------+-------------------+------+---+
|guid02|2018-03-01 00:00:00|2018-03-03 00:00:00|     3|  1|
|guid01|2018-03-04 00:00:00|2018-03-07 00:00:00|     4|  1|
+------+-------------------+-------------------+------+---+
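For comparison, picking the longest run per user can be sketched in plain Python as well. The function name `longest_run_per_uid` is hypothetical, and the input is simply the three runs from the step 3 result table, with the time part dropped.

```python
from datetime import date

# Runs copied from the step 3 result table: (uid, start, end, counts)
runs = [
    ("guid02", date(2018, 3, 1), date(2018, 3, 3), 3),
    ("guid01", date(2018, 2, 28), date(2018, 3, 2), 3),
    ("guid01", date(2018, 3, 4), date(2018, 3, 7), 4),
]


def longest_run_per_uid(runs):
    """Keep one run per uid: the one with the largest counts value."""
    best = {}
    for run in runs:
        uid, counts = run[0], run[3]
        # Strict '>' means the first run seen wins when lengths are equal.
        if uid not in best or counts > best[uid][3]:
            best[uid] = run
    return best


for uid, run in sorted(longest_run_per_uid(runs).items()):
    print(run)
```

Like `row_number()` with `aa = 1`, this keeps exactly one row per user even when two runs are equally long. If every tied run should be returned, `rank()` would be the usual choice on the SQL side.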
This is just my own take. If you know a better approach, please leave a comment and let's discuss it.
I look forward to your feedback.