Hive 年龄区间判断优化

优化前


with movies as(
select
    t4l.user_id, name, age,t4l.url,
    case when age >= 0 and age <= 10
            then "0-10岁"
         when age > 10 and age <= 20
             then "10-20岁"
         when age > 20 and age <= 30
             then "20-30岁"
         when age > 30 and age <= 40
             then "30-40岁"
         when age > 40 and age <= 50
             then "40-50岁"
         when age > 50 and age <= 60
             then "50-60岁"
        else "60岁以上"
    end age_range
from test_sql.test4user t4u
join
    test_sql.test4log t4l
on
        t4u.user_id = t4l.user_id)
select
    age_range,count(url) cnt
from
    movies
group by
    age_range
order by
    age_range;

一大堆 case when 不光写起来索然无味,看起来也很 low。

优化后

with movies as(
select
    t4l.user_id, t4u.name, t4u.age,t4l.url,
    concat(`floor`(t4u.age/10)*10,'-',`floor`(t4u.age/10 + 1)*10,'岁') age_range
from test_sql.test4user t4u
join
    test_sql.test4log t4l
on
        t4u.user_id = t4l.user_id)
select
    age_range,count(url) cnt
from
    movies
group by
    age_range
order by
    age_range;

关键部分:concat(floor(t4u.age/10)*10,'-',floor(t4u.age/10 + 1)*10,'岁')

区间可以根据自己的实际业务去灵活调整,我这里只是提供一种思路。

你可能感兴趣的:(hive,数据库,sql)