SQL Interview Question: Latest Movie Genre Each User Watched in the Last 30 Days

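Users log in to a video site but do not watch a movie on every login day. To analyze which genres a user prefers, each login row needs to be filled in with the genre that user most recently watched within the last 30 days.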

User ID   Login date   Genre watched
1         2023-01-05   爱情 (romance)
1         2023-01-17   (none)
1         2023-01-29   恐怖 (horror)
1         2023-02-15   科幻 (sci-fi)
1         2023-03-05   (none)
2         2023-01-07   传记 (biography)
2         2023-02-25   记录 (documentary)
2         2023-02-26   (none)

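Key points: use a window function ordered by timestamp, and limit the time range with range between. The frame 2505600 preceding is 29 × 86400 seconds, so together with the current row it covers 30 calendar days. Concatenate the date with the genre; if the genre is null, the concatenated result is null, and max skips it. Because yyyy-MM-dd strings sort in chronological order, the maximum of the concatenated values is the most recent non-null entry in the window, and stripping the 10-character date prefix leaves the genre the user watched most recently.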
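The concatenation trick can be checked on its own. The following is a minimal sketch, assuming Hive semantics, where concat returns NULL if any argument is NULL; other engines may behave differently. The literals come from the sample data, and the column aliases are made up for illustration.

```sql
-- Assumes Hive: concat() yields NULL when any argument is NULL.
select
    concat('2023-01-17', cast(null as string)) as null_row,    -- NULL, so max() skips this row
    concat('2023-01-05', '爱情')               as tagged_row,  -- '2023-01-05爱情'
    substr('2023-01-05爱情', 11)               as genre_only;  -- '爱情': characters from position 11 onward
```

Since the date prefix always has 10 characters, position 11 is where the genre starts, whatever the genre's length.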

select
    id,
    dt,
    dt_timestamp,
    type,
    substr(last_30d_type, 11) as last_30d_type  -- drop the 10-character yyyy-MM-dd prefix
from
(
    select
        id,
        dt,
        unix_timestamp(dt,'yyyy-MM-dd') dt_timestamp,
        type,
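        -- 2505600 s = 29 days: the frame is the current day plus the 29 days before it.
        -- concat() is null when type is null, so max() keeps the latest non-null 'date + genre' string.
        max(concat(dt, type)) over (partition by id order by unix_timestamp(dt,'yyyy-MM-dd') range between 2505600 preceding and current row) as last_30d_type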
    from
    (
        select 1 as id, '2023-01-05' as dt, '爱情' as type
        union all select 1 as id, '2023-01-17' as dt, null   as type
        union all select 1 as id, '2023-01-29' as dt, '恐怖' as type
        union all select 1 as id, '2023-02-15' as dt, '科幻' as type
        union all select 1 as id, '2023-03-05' as dt, null   as type
        union all select 2 as id, '2023-01-07' as dt, '传记' as type
        union all select 2 as id, '2023-02-25' as dt, '记录' as type
        union all select 2 as id, '2023-02-26' as dt, null   as type
    ) a 
) b 
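One caveat about the seconds-based frame: unix_timestamp interprets the date in the session time zone, so a daylight-saving shift inside the window could in principle move the boundary by an hour. A possible variant is sketched below. It assumes Hive and that dt is always a valid yyyy-MM-dd string, and it orders by a day number from datediff so the frame is expressed in whole days. The dt_day alias is introduced here for illustration only.

```sql
-- Sketch: same 30-day window, expressed in days instead of seconds.
select
    id,
    dt,
    type,
    substr(last_30d_type, 11) as last_30d_type   -- drop the yyyy-MM-dd prefix
from
(
    select
        id,
        dt,
        type,
        max(concat(dt, type)) over (
            partition by id
            order by dt_day
            range between 29 preceding and current row   -- current day plus the 29 days before it
        ) as last_30d_type
    from
    (
        select
            id,
            dt,
            type,
            datediff(dt, '1970-01-01') as dt_day         -- whole days since 1970-01-01
        from
        (
            select 1 as id, '2023-01-05' as dt, '爱情' as type
            union all select 1 as id, '2023-01-17' as dt, null   as type
            union all select 1 as id, '2023-01-29' as dt, '恐怖' as type
            union all select 1 as id, '2023-02-15' as dt, '科幻' as type
            union all select 1 as id, '2023-03-05' as dt, null   as type
            union all select 2 as id, '2023-01-07' as dt, '传记' as type
            union all select 2 as id, '2023-02-25' as dt, '记录' as type
            union all select 2 as id, '2023-02-26' as dt, null   as type
        ) a
    ) b
) c;
```

On the sample data, the rows with no genre should be filled as follows: user 1 on 2023-01-17 gets 爱情, user 1 on 2023-03-05 gets 科幻, and user 2 on 2023-02-26 gets 记录. Rows that already have a genre keep their own value.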
