HiveSQL: top N per group

 

Reference SQL:
-- Rank the cities within each province, for China only

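select b.*
from (
    -- number the cities inside each (country, province) by row count, largest first
    select country,
           province,
           city,
           cnt,
           row_number() over (partition by country, province order by cnt desc) as rank
    from (
        -- count the rows per city
        select country,
               province,
               city,
               count(1) as cnt
        from tb_pmp_region_report_hive_mapping
        where country = '中国'
        group by country, province, city
    ) a
) b
where b.rank <= 3  -- keep the top 3 cities per province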
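
row_number() always hands out distinct numbers, so when two cities in a province have the same cnt, which one lands inside the top 3 is arbitrary. The sketch below runs the same aggregation with row_number(), rank() and dense_rank() side by side to make the difference visible. The aliases rn, rk and drk are introduced here for illustration only.

```sql
-- Sketch: compare the three ranking functions on the same per-city counts.
select country,
       province,
       city,
       cnt,
       row_number() over (partition by country, province order by cnt desc) as rn,   -- 1, 2, 3 even on ties
       rank()       over (partition by country, province order by cnt desc) as rk,   -- ties share a number, gaps follow
       dense_rank() over (partition by country, province order by cnt desc) as drk   -- ties share a number, no gaps
from (
    select country,
           province,
           city,
           count(1) as cnt
    from tb_pmp_region_report_hive_mapping
    where country = '中国'
    group by country, province, city
) a
```

For counts of 10, 10 and 5 within one province, rn gives 1, 2, 3, rk gives 1, 1, 3 and drk gives 1, 1, 2. Filtering on rk <= 3 or drk <= 3 can therefore return more than three rows per province, which may or may not be what a report wants.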

Business SQL:

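select datacity_cn, industry_cn, guimo, tswqwt, count_date, amount
from (
    select datacity_cn, industry_cn, guimo, tswqwt, count_date, amount,
           -- number the rows inside each (datacity_cn, industry_cn, guimo) by amount, largest first
           row_number() over (partition by datacity_cn, industry_cn, guimo order by amount desc) as rank
    from tswq_dwfx_tswqwt
) a
where a.rank <= 10
  -- previous month as 'yyyy-MM'; note that this filter is applied after the ranking above
  and a.count_date = substr(date_sub(concat(substr(current_date, 1, 7), '-01'), 1), 1, 7)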
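
Because the count_date condition sits in the outer query, row_number() is computed over every month stored in the table, and only afterwards are the rows of the previous month kept. If the table holds several months, a group can end up with fewer than 10 rows. Assuming the goal is the top 10 within the previous month alone (an assumption, the original post does not say so), one possible variant moves the condition into the subquery. The alias rn is introduced here for illustration.

```sql
-- Sketch: restrict to the previous month first, then rank inside that month only.
select datacity_cn, industry_cn, guimo, tswqwt, count_date, amount
from (
    select datacity_cn, industry_cn, guimo, tswqwt, count_date, amount,
           row_number() over (partition by datacity_cn, industry_cn, guimo order by amount desc) as rn
    from tswq_dwfx_tswqwt
    -- previous month as 'yyyy-MM', same expression as in the business SQL
    where count_date = substr(date_sub(concat(substr(current_date, 1, 7), '-01'), 1), 1, 7)
) a
where a.rn <= 10
```

If the table only ever contains one month of data, both forms return the same rows.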

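Purpose of the SQL:
1. tswq_dwfx_tswqwt is already a result table and amount is a pre-aggregated field, so there is no need to add a group by.
2. After partitioning and sorting, the query takes the top 10 values of the tswqwt field, so tswqwt is not included in the partition by columns.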
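
Note 1 depends on the table really holding one pre-aggregated row per combination of dimensions. Assuming that grain is (datacity_cn, industry_cn, guimo, tswqwt, count_date), which is a guess based on the selected columns, a quick check along these lines would surface any duplicates. The alias row_cnt is introduced here for illustration.

```sql
-- Sketch: list dimension combinations that appear more than once in the result table.
select datacity_cn, industry_cn, guimo, tswqwt, count_date,
       count(1) as row_cnt
from tswq_dwfx_tswqwt
group by datacity_cn, industry_cn, guimo, tswqwt, count_date
having count(1) > 1
```

An empty result supports ranking amount directly. Any returned rows suggest that amount would need to be summed per combination before ranking.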
