HQL Aggregation: Advanced Topics

Aggregate queries in HQL can use the GROUPING SETS, CUBE, and ROLLUP keywords in the GROUP BY clause.

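  1. GROUPING SETS
    This clause is equivalent to combining several GROUP BY queries with UNION ALL. Unlike a hand-written UNION ALL, it is processed in a single stage, so it is generally more efficient. An empty grouping set, written (), computes the aggregate over the whole table. Each top-level element of the GROUPING SETS list corresponds to one branch of the equivalent UNION ALL, so the number of elements determines the number of branches; the columns inside an element (a parenthesized group, or a single bare column) are the GROUP BY columns of that branch.
    Example 1.1 One element: a single two-column combination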
SELECT       
name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name, start_date       
GROUPING SETS((name, start_date));      
-- is equivalent to
SELECT      
name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name, start_date;      
+---------+------------+---------+      
| name    | start_date | sin_cnt |      
+---------+------------+---------+      
| Lucy    | 2010-01-03 | 1       |      
| Michael | 2014-01-29 | 1       |      
| Steven  | 2012-11-03 | 1       |      
| Will    | 2013-10-02 | 1       |      
+---------+------------+---------+      
4 rows selected (26.3 seconds)

Example 1.2 Two elements: two single columns

SELECT       
name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name, start_date       
GROUPING SETS(name, start_date);      
-- is equivalent to
SELECT       
name, null as start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name      
UNION ALL      
SELECT       
null as name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY start_date;      
+---------+------------+---------+
| name    | start_date | sin_cnt |      
+---------+------------+---------+      
| NULL    | 2010-01-03 | 1       |      
| NULL    | 2012-11-03 | 1       |      
| NULL    | 2013-10-02 | 1       |      
| NULL    | 2014-01-29 | 1       |      
| Lucy    | NULL       | 1       |      
| Michael | NULL       | 1       |      
| Steven  | NULL       | 1       |      
| Will    | NULL       | 1       |      
+---------+------------+---------+      
8 rows selected (22.658 seconds)

Example 1.3 Two elements: one two-column combination and one single column

SELECT       
name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name, start_date       
GROUPING SETS((name, start_date), name);      
-- is equivalent to
SELECT       
name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name, start_date      
UNION ALL      
SELECT       
name, null as start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name;      
+---------+------------+---------+      
| name    | start_date | sin_cnt |      
+---------+------------+---------+      
| Lucy    | NULL       | 1       |      
| Lucy    | 2010-01-03 | 1       |      
| Michael | NULL       | 1       |      
| Michael | 2014-01-29 | 1       |      
| Steven  | NULL       | 1       |      
| Steven  | 2012-11-03 | 1       |      
| Will    | NULL       | 1       |      
| Will    | 2013-10-02 | 1       |      
+---------+------------+---------+      
8 rows selected (22.503 seconds)

Example 1.4 Four elements: every combination of the two columns

SELECT       
name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name, start_date       
GROUPING SETS((name, start_date), name, start_date, ());      
-- is equivalent to
SELECT       
name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name, start_date      
UNION ALL      
SELECT       
name, null as start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY name      
UNION ALL      
SELECT       
null as name, start_date, count(sin_number) as sin_cnt       
FROM employee_hr      
GROUP BY start_date      
UNION ALL      
SELECT       
null as name, null as start_date, count(sin_number) as sin_cnt       
FROM employee_hr;
+---------+------------+---------+      
| name    | start_date | sin_cnt |      
+---------+------------+---------+      
| NULL    | NULL       | 4       |      
| NULL    | 2010-01-03 | 1       |      
| NULL    | 2012-11-03 | 1       |      
| NULL    | 2013-10-02 | 1       |      
| NULL    | 2014-01-29 | 1       |      
| Lucy    | NULL       | 1       |      
| Lucy    | 2010-01-03 | 1       |      
| Michael | NULL       | 1       |      
| Michael | 2014-01-29 | 1       |      
| Steven  | NULL       | 1       |      
| Steven  | 2012-11-03 | 1       |     
| Will    | NULL       | 1       |      
| Will    | 2013-10-02 | 1       |      
+---------+------------+---------+      
13 rows selected (24.916 seconds)
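The equivalence between GROUPING SETS and a UNION ALL of GROUP BY queries can also be illustrated outside Hive. The following Python sketch is only an emulation of the semantics, not how Hive executes the query. The four rows mirror the employees in the sample output; the sin_number values are hypothetical placeholders, and the helper name grouping_sets_count is an assumption made for this illustration.

```python
from collections import Counter

# Four rows mirroring the sample output; the sin_number values are made up.
EMPLOYEE_HR = [
    {"name": "Lucy", "start_date": "2010-01-03", "sin_number": "sin-1"},
    {"name": "Michael", "start_date": "2014-01-29", "sin_number": "sin-2"},
    {"name": "Steven", "start_date": "2012-11-03", "sin_number": "sin-3"},
    {"name": "Will", "start_date": "2013-10-02", "sin_number": "sin-4"},
]


def grouping_sets_count(rows, columns, grouping_sets, counted):
    """Emulate GROUP BY <columns> GROUPING SETS(...) with count(<counted>).

    Each grouping set is handled like one UNION ALL branch: columns that are
    outside the set are reported as None (NULL in Hive).
    """
    result = []
    for gset in grouping_sets:
        counts = Counter()
        for row in rows:
            key = tuple(row[c] if c in gset else None for c in columns)
            # count(col) ignores NULL values but the group itself still exists.
            counts[key] += 0 if row[counted] is None else 1
        result.extend(key + (n,) for key, n in counts.items())
    return result


# Same grouping sets as Example 1.4.
rows = grouping_sets_count(
    EMPLOYEE_HR,
    columns=("name", "start_date"),
    grouping_sets=[("name", "start_date"), ("name",), ("start_date",), ()],
    counted="sin_number",
)
for r in sorted(rows, key=lambda r: tuple(v or "" for v in r[:2])):
    print(r)
```

With the four sample rows this yields 13 result rows, the same number as Example 1.4: four per non-empty grouping set plus one overall total.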
  2. ROLLUP
    ROLLUP computes n+1 levels of aggregation, where n is the number of grouping columns. For example, GROUP BY a,b,c WITH ROLLUP is equivalent to GROUP BY a,b,c GROUPING SETS ((a,b,c),(a,b),(a),())
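The n+1 levels are simply the prefixes of the column list, from the full list down to the empty set. A minimal Python sketch of that expansion (the function name rollup_grouping_sets is hypothetical and not part of Hive):

```python
def rollup_grouping_sets(columns):
    """Return the grouping sets implied by WITH ROLLUP, longest prefix first."""
    return [tuple(columns[:i]) for i in range(len(columns), -1, -1)]


print(rollup_grouping_sets(["a", "b", "c"]))
# [('a', 'b', 'c'), ('a', 'b'), ('a',), ()]
```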
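  3. CUBE
    CUBE computes 2^n levels of aggregation, where n is the number of grouping columns; there is one level for every possible subset of the n columns. For example, GROUP BY a,b,c WITH CUBE is equivalent to GROUP BY a,b,c GROUPING SETS ((a,b,c),(a,b),(b,c),(a,c),(a),(b),(c),())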
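The 2^n levels can be enumerated as all subsets of the column list. A minimal Python sketch of that expansion (the function name cube_grouping_sets is hypothetical and not part of Hive; the order of the sets is an arbitrary choice of this sketch):

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Return every subset of the columns, largest subsets first."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets


sets = cube_grouping_sets(["a", "b", "c"])
print(len(sets))  # 8
print(sets)
```

For three columns this produces the same eight grouping sets as listed above, only in a slightly different order.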
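  4. GROUPING__ID and the GROUPING function
    GROUPING__ID takes no arguments (it is used like a column rather than called) and identifies the aggregation level of each row. Its value is the number obtained by reading the row's grouping-column bit vector as a binary number. Rows produced by the same combination of GROUP BY columns return the same ID.
    The GROUPING function reports whether a given column is part of the GROUP BY combination that produced the current row. In the sample output below, 0 means the column is one of that row's GROUP BY columns, and 1 means it is not (the column has been aggregated away and shows as NULL). See the following example: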
SELECT 
name, start_date, count(employee_id) as emp_id_cnt,
GROUPING__ID as gid,
grouping(name) as gp_name, 
grouping(start_date) as gp_sd
FROM employee_hr 
GROUP BY name, start_date 
WITH CUBE ORDER BY name, start_date;
+---------+------------+------------+-----+---------+-------+
| name    | start_date | emp_id_cnt | gid | gp_name | gp_sd |
+---------+------------+------------+-----+---------+-------+
| NULL    | NULL       | 4          | 3   | 1       | 1     |
| NULL    | 2010-01-03 | 1          | 2   | 1       | 0     |
| NULL    | 2012-11-03 | 1          | 2   | 1       | 0     |
| NULL    | 2013-10-02 | 1          | 2   | 1       | 0     |
| NULL    | 2014-01-29 | 1          | 2   | 1       | 0     |
| Lucy    | NULL       | 1          | 1   | 0       | 1     |
| Lucy    | 2010-01-03 | 1          | 0   | 0       | 0     |
| Michael | NULL       | 1          | 1   | 0       | 1     |
| Michael | 2014-01-29 | 1          | 0   | 0       | 0     |
| Steven  | NULL       | 1          | 1   | 0       | 1     |
| Steven  | 2012-11-03 | 1          | 0   | 0       | 0     |
| Will    | NULL       | 1          | 1   | 0       | 1     |
| Will    | 2013-10-02 | 1          | 0   | 0       | 0     |
+---------+------------+------------+-----+---------+-------+
13 rows selected (55.507 seconds)
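The relationship between the gid, gp_name and gp_sd columns can be checked with a short Python sketch. The encoding used here is inferred from the sample output above (flag 1 for a column that is not grouped on, first GROUP BY column as the highest bit) and may differ in other Hive versions; the function names are hypothetical.

```python
def grouping_flag(column, grouping_set):
    """Return 0 if the column is one of the row's GROUP BY columns, else 1."""
    return 0 if column in grouping_set else 1


def grouping_id(columns, grouping_set):
    """Read the per-column flags as a binary number (first column = highest bit)."""
    gid = 0
    for column in columns:
        gid = (gid << 1) | grouping_flag(column, grouping_set)
    return gid


columns = ("name", "start_date")
for gset in [("name", "start_date"), ("name",), ("start_date",), ()]:
    print(gset, grouping_id(columns, gset),
          grouping_flag("name", gset), grouping_flag("start_date", gset))
```

Under these assumptions the four grouping sets map to IDs 0, 1, 2 and 3, matching the gid values in the table.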
