Hive面试题——累计求和

需求:

有如下访客访问次数统计表 t_access_times

访客

月份

访问次数

A

2015-01

5

A

2015-01

15

B

2015-01

5

A

2015-01

8

B

2015-01

25

A

2015-01

5

A

2015-02

4

A

2015-02

6

B

2015-02

10

B

2015-02

5

……

……

……

 

需要输出报表:t_access_times_accumulate

访客

月份

月访问总计

累计访问总计

A

2015-01

33

33

A

2015-02

10

43

…….

…….

…….

…….

B

2015-01

30

30

B

2015-02

15

45

…….

…….

…….

…….

思路:

1、第一步,先求个用户的月总金额

select username,month,sum(salary) salary from t_access_times group by username,month;

+-----------+----------+---------+--+
| username | month | salary |
+-----------+----------+---------+--+
| A | 2015-01 | 33 |
| A | 2015-02 | 10 |
| B | 2015-01 | 30 |
| B | 2015-02 | 15 |
+-----------+----------+---------+--+

2、第二步,将月总金额表 自己连接自己

select A.*,B.*
from 
(select username,month,sum(salary) salary from t_access_times group by username,month) A 
join 
(select username,month,sum(salary) salary from t_access_times group by username,month) B
on
A.username=B.username;

+-------------+----------+-----------+-------------+----------+-----------+--+
| A.username | A.month | A.salary | B.username | B.month | B.salary |
+-------------+----------+-----------+-------------+----------+-----------+--+
| A | 2015-01 | 33 | A | 2015-01 | 33 |
| A | 2015-01 | 33 | A | 2015-02 | 10 |
| A | 2015-02 | 10 | A | 2015-01 | 33 |
| A | 2015-02 | 10 | A | 2015-02 | 10 |
| B | 2015-01 | 30 | B | 2015-01 | 30 |
| B | 2015-01 | 30 | B | 2015-02 | 15 |
| B | 2015-02 | 15 | B | 2015-01 | 30 |
| B | 2015-02 | 15 | B | 2015-02 | 15 |
+-------------+----------+-----------+-------------+----------+-----------+--+

3、第三步,从上一步的结果中
进行分组查询,分组的字段是A.username,A.month
求月累计值: 将B.month <= A.month的所有B.salary求和即可

select A.username,A.month,max(A.salary) salary,sum(B.salary) accumulate
from 
(select username,month,sum(salary) salary from t_access_times group by username,month) A 
join 
(select username,month,sum(salary) salary from t_access_times group by username,month) B
on
A.username=B.username
where B.month <= A.month
group by A.username,A.month
order by A.username,A.month;

 

你可能感兴趣的:(大数据)