Some interesting SQL (Case 1: cumulative running-total query)

Cumulative running-total query (based on Hive)

Raw data

  • Raw data file jilianqiuhe.dat (columns: username, month, amount)
A,2015-01,5
A,2015-01,15
B,2015-01,5
A,2015-01,8
B,2015-01,25
A,2015-01,5
A,2015-02,4
A,2015-02,6
B,2015-02,10
B,2015-02,5
  • Create the data file
cd /opt
mkdir my_test
cd my_test
touch jilianqiuhe.dat
vi jilianqiuhe.dat		## paste the data above into the file
  • Create the table and load the data
----------------------------------
create table t_access_times(username string,month string,salary int)
row format delimited fields terminated by ',';

load data local inpath '/opt/my_test/jilianqiuhe.dat' into table t_access_times;
  • Query the table

hive> select * from t_access_times;
+--------------------------+-----------------------+------------------------+--+
| t_access_times.username  | t_access_times.month  | t_access_times.salary  |
+--------------------------+-----------------------+------------------------+--+
| A                        | 2015-01               | 5                      |
| A                        | 2015-01               | 15                     |
| B                        | 2015-01               | 5                      |
| A                        | 2015-01               | 8                      |
| B                        | 2015-01               | 25                     |
| A                        | 2015-01               | 5                      |
| A                        | 2015-02               | 4                      |
| A                        | 2015-02               | 6                      |
| B                        | 2015-02               | 10                     |
| B                        | 2015-02               | 5                      |
+--------------------------+-----------------------+------------------------+--+
  • Aggregate by month: the total amount per user per month
hive> select username,month,sum(salary) as salary from t_access_times group by username,month;
  • Result
    [Figure: result after aggregating by month]
  • Self inner join: join the monthly totals to themselves on username, so each monthly row of a user is paired with every monthly row of the same user
hive> select a.*,b.* from
(select username,month,sum(salary) as salary from t_access_times group by username,month) a
inner join
(select username,month,sum(salary) as salary from t_access_times group by username,month) b
on a.username=b.username;
  • Result
    [Figure: result of the self join]
  • Generate the running total: keep only the pairs where b.month <= a.month, then sum b.salary for each (username, month) of a
hive> select a.username,a.month,max(a.salary) as salary,sum(b.salary) as accumulate from
(select username,month,sum(salary) as salary from t_access_times group by username,month) a
inner join
(select username,month,sum(salary) as salary from t_access_times group by username,month) b
on a.username=b.username
where b.month <= a.month
group by a.username,a.month
order by a.username,a.month;
  • Result
    [Figure: final result with the running total]
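
As an alternative to the self join, the same running total can probably be written with a window function. This is only a sketch and not the approach the post uses. It assumes a Hive version with windowing support (0.11 or later) and relies on the month strings in 'YYYY-MM' format sorting chronologically. The alias t is introduced here for illustration.

```sql
-- Sketch: running total per user via a window function instead of a self join.
-- Assumes Hive 0.11+ (windowing support); the alias t is hypothetical.
select
    t.username,
    t.month,
    t.salary,
    -- sum every monthly total of the same user up to and including this month
    sum(t.salary) over (
        partition by t.username
        order by t.month
        rows between unbounded preceding and current row
    ) as accumulate
from (
    -- monthly total per user, same aggregation as in the steps above
    select username, month, sum(salary) as salary
    from t_access_times
    group by username, month
) t
order by t.username, t.month;
```

Working through the sample data by hand, the monthly totals are 33 and 10 for A and 30 and 15 for B, so the accumulate column should come out as 33 and 43 for A and 30 and 45 for B, matching what the self-join query computes. The window version reads the aggregated rows once, whereas the self join pairs every month of a user with every other month of that user before filtering.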