彷徨 | Hive---报表统计

联级累计报表查询

有如下数据:

A,2015-01-08,5

A,2015-01-11,15

B,2015-01-12,5

A,2015-01-12,8

B,2015-01-13,25

A,2015-01-13,5

C,2015-01-09,10

C,2015-01-11,20

A,2015-02-10,4

A,2015-02-11,6

C,2015-01-12,30

C,2015-02-13,10

B,2015-02-10,10

B,2015-02-11,5

A,2015-03-20,14

A,2015-03-21,6

B,2015-03-11,20

B,2015-03-12,25

C,2015-03-10,10

C,2015-03-11,20

要求统计出如下累计报表:

用户

月份

月总额

累计到当月的总额

A

2015-01

33

33

A

2015-02

10

43

A

2015-03

30

73

B

2015-01

30

30

B

2015-02

15

45

1 建表

建表映射:

create table t_access_times(username string,month string,counts int)

row format delimited fields terminated by ',';

导入数据 :

2 月份数据切割聚合,只留到月份

select  username,substr(day,1,7) as month_sale,sum(msale) from t_sale
group by username,substr(day,1,7);

结果 :

彷徨 | Hive---报表统计_第1张图片

3 自关联

select t1.*,t2.* from
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t1
join
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t2
on t1.username=t2.username;

彷徨 | Hive---报表统计_第2张图片

结果 :

彷徨 | Hive---报表统计_第3张图片

4 筛选

select t1.*,t2.* from
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t1
join
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t2
on t1.username=t2.username
where t1.month_sale>=t2.month_sale;

彷徨 | Hive---报表统计_第4张图片

结果 : 

彷徨 | Hive---报表统计_第5张图片

5 聚合统计

select t1.username,t1.month_sale,sum(t2.cnt) from
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t1
join
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t2
on t1.username=t2.username
where t1.month_sale>=t2.month_sale
group by t1.username,t1.month_sale;

彷徨 | Hive---报表统计_第6张图片

结果 : 

彷徨 | Hive---报表统计_第7张图片

你可能感兴趣的:(数据库,Hive,大数据)