Hive SQL: a pitfall with the GROUP BY clause

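Compared with MySQL, GROUP BY in Hive SQL enforces a rule that I find especially hard to accept: every column in the select list that is not wrapped in an aggregate function must appear in the GROUP BY clause. (MySQL accepts such queries only when its ONLY_FULL_GROUP_BY SQL mode is disabled.)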

For example, run the following statement in Hue:

select a.chnl, b.goods_id, b.goods_name, b.goods_spec, b.manufacturer,
       b.atc1_new, b.atc2_new, b.atc3_new, b.atc4_new,
       avg(a.salep), avg(a.cost_price), sum(a.paid_in_amt),
       sum(a.profit) / sum(a.paid_in_amt),
       sum(a.paid_in_amt - a.rec_amt) / sum(a.rec_amt),
       count(distinct a.bill_code), count(distinct a.org_no)
from (
    select goods_id, data_from, goods_name, goods_spec, manufacturer,
           atc1_new, atc2_new, atc3_new, atc4_new
    from gjst.dim_goods_category_manual
    where dt = '20200903'
) b
right join (
    select goods_id, salep, cost_price, data_from, paid_in_amt, profit,
           rec_amt, bill_code, org_no, chnl
    from gjdw.dw_sale_tr_goods_dt
    where dt = '20200903' and dates >= '2019-09-01' and dates <= '2020-08-31'
) a
on a.goods_id = b.goods_id and a.data_from = b.data_from
group by a.chnl, b.goods_id;

It fails with the following error:

Error while compiling statement: FAILED: SemanticException [Error 10002]: line 35:87 Invalid column reference 'goods_name'

The error occurs because b.goods_name, b.goods_spec, b.manufacturer, b.atc1_new, b.atc2_new, b.atc3_new and b.atc4_new are not wrapped in aggregate functions and do not appear in the GROUP BY clause.
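One way to satisfy the rule is to list every non-aggregated column in the GROUP BY clause. The statement below is a sketch of that fix for the query above, not a tested result: it reuses the same tables and columns and assumes that each goods_id maps to a single name, spec, manufacturer and ATC classification. If that assumption does not hold, the extra grouping columns will split a product into more groups than before.

```sql
-- Sketch: same query as above, with every non-aggregated select column
-- added to GROUP BY. Assumes the descriptive columns are constant per goods_id.
select a.chnl, b.goods_id, b.goods_name, b.goods_spec, b.manufacturer,
       b.atc1_new, b.atc2_new, b.atc3_new, b.atc4_new,
       avg(a.salep), avg(a.cost_price), sum(a.paid_in_amt),
       sum(a.profit) / sum(a.paid_in_amt),
       sum(a.paid_in_amt - a.rec_amt) / sum(a.rec_amt),
       count(distinct a.bill_code), count(distinct a.org_no)
from (
    select goods_id, data_from, goods_name, goods_spec, manufacturer,
           atc1_new, atc2_new, atc3_new, atc4_new
    from gjst.dim_goods_category_manual
    where dt = '20200903'
) b
right join (
    select goods_id, salep, cost_price, data_from, paid_in_amt, profit,
           rec_amt, bill_code, org_no, chnl
    from gjdw.dw_sale_tr_goods_dt
    where dt = '20200903' and dates >= '2019-09-01' and dates <= '2020-08-31'
) a
on a.goods_id = b.goods_id and a.data_from = b.data_from
group by a.chnl, b.goods_id, b.goods_name, b.goods_spec, b.manufacturer,
         b.atc1_new, b.atc2_new, b.atc3_new, b.atc4_new;
```

An alternative that keeps the original `group by a.chnl, b.goods_id` is to wrap each descriptive column in an aggregate, for example `max(b.goods_name)`. This also compiles, but when a group contains more than one distinct value it simply returns the largest one, so it is only appropriate when the values are known to be identical within each group.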
