HQL聚集计算之条件篇

自Hive 0.70之后,HAVING 子句可以用于对聚集结果进行条件过滤。该用法和SQL中类似,这样我们就不需要把聚集语句作为子查询来过滤相关结果。请看以下示例,

> SELECT 
> gender_age.age 
> FROM employee
> GROUP BY gender_age.age 
> HAVING count(*)=1;
+----------------+
| gender_age.age |
+----------------+
| 27             |
| 30             |
| 35             |
| 57             |
+----------------+
4 rows selected (25.829 seconds)
> SELECT 
> gender_age.age, 
> count(*) as cnt -- Support use column alias in HAVING, like ORDER BY
> FROM employee
 > GROUP BY gender_age.age HAVING cnt=1; 
+----------------+-----+
| gender_age.age | cnt |
+----------------+-----+
| 27             | 1   |
| 30             | 1   |
| 35             | 1   |
| 57             | 1   |
+----------------+-----+ 
4 rows selected (25.804 seconds)

此外,HAVING子句支持针对非聚集列进行过滤,这一功能在SQL语句中没有。不过,为了更好的性能,不推荐这样使用,推荐在WHERE子句中加入相关过滤。以下为不使用HAVING,而用子查询的示例,

> SELECT 
> a.age
> FROM (
> SELECT count(*) as cnt, gender_age.age 
> FROM employee GROUP BY gender_age.age
> ) a WHERE a.cnt <= 1;
+--------+| a.age  |+--------+
| 57     |
| 27     |
| 35     |
+--------+
3 rows selected (87.298 seconds)

你可能感兴趣的:(HQL聚集计算之条件篇)