核心内容:
1、两个实例
2、case语句中的the关键字可以理解为select关键字
3、两个group by的实例
4、线性维表的实现方式
优惠前金额:before_prefr_unit_price
实例1、从表gdm_m04_ord_det_sum提取如下数据: 求出在2014年11月11日这天的全部订单中,优惠前金额before_prefr_unit_price不同分布区间的订单的数量,按子订单统计。 输出结果字段如下: 100元以下单量,100至199元单量, 200至299元单量, 300元及以上单量。
思路1:复杂表示
select sum(1) as a1
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE' and before_prefr_unit_price < 100
select sum(1) as a2
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE' and
(before_prefr_unit_price >= 100 and before_prefr_unit_price < 200 )
select sum(1) as a3
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE' and
(before_prefr_unit_price >= 200 and before_prefr_unit_price < 300 )
select sum(1) as a4
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE' and
(before_prefr_unit_price >= 300 )
通过上面的语句,我们将形成4个表,现在我们通过case when…..then….else将4个表进行连接形成一个表。
代码1汇总:(where中写4个表中相同的语句进行过滤)
select sum(case when before_prefr_unit_price < 100 then 1 else 0 end ) as a1,
sum(case when before_prefr_unit_price >= 100 and before_prefr_unit_price < 200 then 1 else 0 end) as a2,
sum(case when before_prefr_unit_price >= 200 and before_prefr_unit_price < 300 then 1 else 0 end) as a3,
sum(case when before_prefr_unit_price >= 300 then 1 else 0 end) as a4
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE'
如何去理解case when before_prefr_unit_price >= 200 and before_prefr_unit_price < 300 then 1 else 0 end这句话呢?
如果before_prefr_unit_price >= 200 and before_prefr_unit_price < 300则select(echo、printf、then)输出1,否则的话什么也不做。
优惠前金额:before_prefr_amount 优惠后金额:after_prefr_amount 订单量:即记录的总和
实例2:从表gdm_m04_ord_det_sum中提取出如下数据: 求出在2014年11月11日下单的手机品类的优惠后金额,电脑产品和电脑办公这两个品类的优惠前金额,服饰内衣品类的有效完成订单量。手机,电脑产品,电脑办公,服饰内衣的一级代码item_first_cate_cd分别为9987 ,174 ,670,1315,有效订单 sale_ord_valid_flag=1.最终输出为:手机品类优惠后金额总和(phone_amount)、电脑产品和电脑办公这两个品类的优惠前金额之和(compute_amount),服饰内衣品类的有效完成订单量(clothing_ord_num),一共3列。
思路1:复杂表示
select sum(after_prefr_amount) as phone_amount
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE' and item_first_cate_cd = '9987'
select sum(before_prefr_amount ) as compute_amount
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE' and
(item_first_cate_cd = '174' or item_first_cate_cd = '670')
select sum(1) as clothing_ord_num
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE' and
item_first_cate_cd = '1315' and sale_ord_valid_flag = '1'
下面我们通过case when…..then…….else来进行表示,形成一张表
思路2:case语句
select sum(case when item_first_cate_cd = '9987' then after_prefr_amount else 0 end) as phone_amount,
sum(case when item_first_cate_cd = '174' or item_first_cate_cd = '670' then before_prefr_amount else 0 end) as compute_amount,
sum(case when item_first_cate_cd = '1315' and sale_ord_valid_flag = '1' then 1 else 0 end) as clothing_ord_num
from gdm_m04_ord_det_sum
where to_date(sale_ord_dt) = '2014-11-11' and dp = 'ACTIVE'
实例3、从表gdm_m04_ord_det_sum中,统计2014年11月11日有效订单商品销量最多的一级分类中,每个二级分类的销量,并降序排序。数据最终输出格式为: 商品一级分类代码(item_first_cate_cd),商品一级分类名称(item_first_cate_name),商品二级分类代码(item_second_cate_cd),商品二级分类名称(item_second_cate_name),销量(sale_qtty),排名(top)。
注释:
sale_ord_dt : 为销售订单订购日期,比如2016年11月11日
sale_ord_valid_flag : 有效订单的标志,标志位为1表示有效订单,0表示无效订单
dp : 分区的标志,在where中需要指定
代码:
select A.item_first_cate_cd,A.item_first_cate_name,A.item_second_cate_cd,A.item_second_cate_name,sum(A.sale_qtty) as sum_sal,
row_number() over(distribute by A.item_first_cate_cd,A.item_first_cate_name sort by sum(A.sale_qtty) desc) as top
from gdm_m04_ord_det_sum A
join (
select item_first_cate_cd,sum(sale_qtty) as sal_sum
from gdm_m04_ord_det_sum
where sale_ord_dt = '2014-11-11' and sale_ord_valid_flag = '1' and dp = 'ACTIVE'
group by item_first_cate_cd
order by sal_sum desc –这里不要用sum(sale_qtty)
limit 1
) B
on A.item_first_cate_cd = B.item_first_cate_cd
where sale_ord_dt = '2014-11-11' and sale_ord_valid_flag = '1' and dp = 'ACTIVE'
group by A.item_first_cate_cd,A.item_first_cate_name,A.item_second_cate_cd,A.item_second_cate_name
实例4:现有订单表 fdm_pek_orders_chain。
问题1:2014年12月15日(createdate)所产生的有效(yn=1)
订单最多的省份ID(uprovince)是哪个?
问题2:该省份有效订单最多的城市(ucity)TOP10是哪些?降序列出。
要求结果字段为:省份ID,城市ID,该城市的有效订单量。
要求:sql内容请复制一份到下方 (12)
问题1代码:
select uprovince,count(*) as sal_sum
from fdm_pek_orders_chain
where createdate like '2014-12-15%' and yn = '1' and dp = 'ACTIVE'
group by uprovince
order by sal_sum desc
limit 1
问题2代码:
select A.uprovince,A.ucity,count(*) as sal_sum
from fdm_pek_orders_chain A
join (
select uprovince,count(*) as sal_sum
from fdm_pek_orders_chain
where createdate like '2014-12-15%' and yn = '1' and dp = 'ACTIVE'
group by uprovince
order by sal_sum desc
limit 1
) B
on A.uprovince = B.uprovince
where yn = '1' and dp = 'ACTIVE' --此处没有加时间
group by A.uprovince,A.ucity
order by sal_sum desc
limit 10
实例5:线性维表的实现方式
有一张分类表exam_dim(id int ,father_id int, name string). 其中father_id 是id的父编号,当father_id为0时表示省份,现需要实现线性表,dim_area(province_id,province_name,city_id,city_name,county_id,county_name),
请写出生成此表的查询sql,并在IDE中创建任务。
要求数据结果字段:province_id,province_name,city_id,city_name,county_id,county_name。
代码:(本质上就是3张表的连接)
select t1.id as province_id,t1.name as province_name ,t2.id as city_id ,t2.name as city_name,
t3.id as county_id,t3.name as county_name
from (
select *
from exam_dim
where father_id = 0 --father_id = 0
) t1
join exam_dim t2
on t1.id = t2.father_id --t1.id = t2.father_id
join exam_dim t3
on t2.id = t3.father_id --t2.id = t3.father_id
从上面可以看出,线性维表实现很简单。
OK!