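I. Grouping
When GROUP BY lists several columns, rows fall into the same group only if all of the grouping columns are equal.
Note that grouping by several columns does not mean grouping by one column and then grouping again by the next. The grouping columns are treated as a whole, and rows are grouped together only when that whole combination matches. Do not confuse a multi-column GROUP BY with grouping by a single column and then computing grouped statistics on the result.
Note: a table is, in essence, just a collection of data.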
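The difference between grouping on a combination of columns and grouping on one column can be checked on a tiny data set. The sketch below is only an illustration: it uses Python's built-in sqlite3 module rather than Oracle, and the table `policy` and its columns `vin` and `id_number` are hypothetical names, not part of the real schema.

```python
import sqlite3

# Hypothetical demo table; the names are illustrative, not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policy (vin TEXT, id_number TEXT)")
conn.executemany(
    "INSERT INTO policy VALUES (?, ?)",
    [("car1", "id1"), ("car1", "id1"), ("car1", "id2"), ("car2", "id1")],
)

# GROUP BY vin, id_number: one group per distinct (vin, id_number) pair.
pair_groups = conn.execute(
    "SELECT vin, id_number, COUNT(*) FROM policy "
    "GROUP BY vin, id_number ORDER BY vin, id_number"
).fetchall()

# GROUP BY id_number alone: one group per distinct id_number.
single_groups = conn.execute(
    "SELECT id_number, COUNT(*) FROM policy "
    "GROUP BY id_number ORDER BY id_number"
).fetchall()

print(pair_groups)
print(single_groups)
```

The two rows for (car1, id1) collapse into one group under the two-column GROUP BY, while (car1, id2) stays separate even though it shares the same vehicle.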
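Example: find duplicated ID numbers. Extract every ID number that is linked to more than 3 vehicles. Output dimensions: insured name, insured ID number, VIN, policy end month, channel, business source. The time window (input time / underwriting time / signing time) is February to October 2018.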
select distinct
  InsuredName insured_name, IdentifyNumber insured_id_number, vinceno vin, to_char(enddate,'yyyy/mm') policy_end_month,
  case when a.TRANSNAME = 'A' then 'General channel'
       when a.TRANSNAME = 'B' then 'Dealer channel'
       when a.TRANSNAME = 'C' then 'Bancassurance channel'
       when a.TRANSNAME = 'D' then 'Agency channel'
       when a.TRANSNAME = 'E' then 'Broker / key-account channel'
       when a.TRANSNAME = 'F' then 'E-commerce channel'
       when a.TRANSNAME = 'G' then 'Rural network channel'
       when a.TRANSNAME = 'H' then 'Cross-selling channel'
       else 'Other channel'
  end as channel, '('||a.businesssource||')'||(select m.codename from prpdchannel m where m.codecode = a.businesssource) business_source
from renewalpolicy_view a where a.IdentifyNumber in
  (SELECT IdentifyNumber FROM
    (SELECT vinceno, IdentifyNumber FROM renewalpolicy_view a where a.InputDate between
       TO_DATE('2018-02-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and
       TO_DATE('2018-10-31 23:59:59', 'yyyy-mm-dd hh24:mi:ss')
     group by a.vinceno, IdentifyNumber) group by IdentifyNumber having count(1) > 3)
  and InputDate between
    TO_DATE('2018-02-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and
    TO_DATE('2018-10-31 23:59:59', 'yyyy-mm-dd hh24:mi:ss') and a.UseNature = '309001'
- Note on grouping: rows are grouped together only when their grouping column values are equal. This holds for a single grouping column as well.
SELECT IdentifyNumber FROM
  (SELECT vinceno, IdentifyNumber FROM renewalpolicy_view a where a.InputDate between
     TO_DATE('2018-02-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and
     TO_DATE('2018-10-31 23:59:59', 'yyyy-mm-dd hh24:mi:ss')
   group by a.vinceno, IdentifyNumber) group by IdentifyNumber having count(1) > 3
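The first grouping, on vehicle and ID number, collapses rows in which both fields are equal.
In the second grouping, even if rows like these exist:
(1) car 1, ID 1
(2) car 1, ID 2
(3) car 2, ID 1
rows (1) and (3) share the same ID number and form one group. count(1) counts the rows inside each group, that is, the rows whose grouping column is equal. It is a per-group count, not a count across groups.
Rows like the following cannot occur either:
(1) car 1, ID 1
(2) car 1, ID 2
(3) car 1, ID 1
because the first grouping has already merged rows whose two fields are both equal. Select only the columns you need, not every column in the table. With a multi-column GROUP BY, the selected columns can only be grouping columns (or aggregate expressions).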
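The two-step grouping described above (deduplicate the vehicle and ID pairs first, then count per ID) can be compared with a single GROUP BY on a small data set. This is a minimal sketch using Python's sqlite3 module instead of Oracle; the table `policy`, its columns, and the sample rows are hypothetical. The `COUNT(DISTINCT ...)` variant is shown as an alternative formulation, not as the method used in the query above.

```python
import sqlite3

# Hypothetical data: id1 has 4 distinct vehicles (car1 appears twice),
# id2 has 5 policy rows but only 1 distinct vehicle.
rows = [
    ("car1", "id1"), ("car1", "id1"), ("car2", "id1"),
    ("car3", "id1"), ("car4", "id1"),
] + [("car5", "id2")] * 5

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policy (vin TEXT, id_number TEXT)")
conn.executemany("INSERT INTO policy VALUES (?, ?)", rows)

# Step 1 collapses duplicate (vin, id_number) pairs; step 2 counts vehicles per ID.
two_step = conn.execute("""
    SELECT id_number FROM
        (SELECT vin, id_number FROM policy GROUP BY vin, id_number) t
    GROUP BY id_number HAVING COUNT(1) > 3
    ORDER BY id_number
""").fetchall()

# Grouping once counts policy rows, not vehicles, so id2 is wrongly included.
one_step = conn.execute("""
    SELECT id_number FROM policy
    GROUP BY id_number HAVING COUNT(1) > 3
    ORDER BY id_number
""").fetchall()

# Alternative formulation: count distinct vehicles directly.
count_distinct = conn.execute("""
    SELECT id_number FROM policy
    GROUP BY id_number HAVING COUNT(DISTINCT vin) > 3
    ORDER BY id_number
""").fetchall()

print(two_step, one_step, count_distinct)
```

Only the two-step query and the `COUNT(DISTINCT vin)` variant restrict the result to IDs with more than 3 different vehicles.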
II. Using PARTITION BY together with row_number()
- Excerpted from: http://www.cnblogs.com/linJie1930906722/p/6036053.html
Example: get the top 1 (or top N) student(s) in each class
select * from
  (select a.*, row_number() over (partition by Grade order by Score desc) as Sequence
   from Student a) T
where T.Sequence <= 1
Breakdown:
(1) Rank students by score without splitting them into classes
select a.*, row_number() over (order by Score desc) as Sequence from Student a
(2) Rank students by score within each class
select a.*, row_number() over (partition by Grade order by Score desc) as Sequence
from Student a
(3) Get the top 1 (or top N) student(s) in each class
select * from
  (select a.*, row_number() over (partition by Grade order by Score desc) as Sequence
   from Student a) T
where T.Sequence <= 1
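The top-1-per-class query can be tried on sample rows as follows. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), and the `Name` column and the sample students are made up for the demonstration. Scores are chosen without ties so the ranking is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Student table; only Grade and Score come from the query above.
conn.execute("CREATE TABLE Student (Name TEXT, Grade INTEGER, Score INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [("Ann", 1, 90), ("Bob", 1, 85), ("Cid", 2, 70), ("Dee", 2, 95), ("Eve", 2, 80)],
)

# Number the rows inside each Grade by descending Score, then keep Sequence <= 1.
top_per_class = conn.execute("""
    SELECT Name, Grade, Score, Sequence FROM
        (SELECT a.*,
                row_number() OVER (PARTITION BY Grade ORDER BY Score DESC) AS Sequence
         FROM Student a) T
    WHERE T.Sequence <= 1
    ORDER BY Grade
""").fetchall()

print(top_per_class)
```

Changing `T.Sequence <= 1` to `T.Sequence <= N` returns the top N rows of each class instead.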
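III. Grouping, ranking within partitions, and taking the top N
Example: extract household-vehicle data for the Guizhou branch with policy end dates from January to March 2019, as follows. For each salesperson in the Guizhou branch, pull 20 household vehicles at random (counted by VIN) together with the customer's contact phone number and the phone verification result. By salesperson ID and name, pull these dimensions: policy number, VIN, use nature, policy end date, and customer phone number (excluding numbers that came from recall).
Notes:
(1) Make sure the salesperson has at least 20 vehicles (distinct ones, of course).
(2) Among the salespeople who meet the first condition, sort by VIN and take the first 20 vehicles.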
select distinct t.policyno, t.vinceno, 'Family-use vehicle', trunc(t.enddate), b.phonenumber1
from
  (select * from
    (select bb.*, row_number() over (partition by bb.makeusercode order by bb.vinceno desc) as Sequence from
      (select a.policyno, a.vinceno, a.usenature, a.enddate, a.insuredname, a.identifynumber, a.makeusercode
       from wfflowmain a where a.usenature = '309001' and substr(a.makecomcode, 1, 2) = '52' and a.enddate between
         TO_DATE('2019-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and
         TO_DATE('2019-03-31 23:59:59', 'yyyy-mm-dd hh24:mi:ss') and a.makeusercode in (select bb.makeusercode from
           (select c.vinceno, c.makeusercode from wfflowmain c where c.usenature = '309001' and substr(c.makecomcode, 1, 2) = '52' and c.enddate between
              TO_DATE('2019-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and
              TO_DATE('2019-03-31 23:59:59', 'yyyy-mm-dd hh24:mi:ss')
            group by c.vinceno, c.makeusercode) bb group by makeusercode having count(1) >= 20)) bb) T
   where T.Sequence <= 20) t
left join (select * from prprinsured where validstatus = '1') b on t.identifynumber = b.identifynumber and t.insuredname = b.insuredname
-- '召回' means "recall"; the literal is kept as-is because it has to match the stored data
where b.phonesource1 not like '%召回%'
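The pattern in this query (keep only salespeople with enough distinct vehicles, then number the rows per salesperson and cut at N) can be exercised at a smaller scale. The sketch below assumes SQLite 3.25 or later through Python's sqlite3 module, a hypothetical table `policy(policy_no, vin, agent)`, and N = 2 instead of 20. It also illustrates one property worth knowing: `row_number()` numbers rows, so a vehicle with several policies uses several of the N slots, whereas `dense_rank()` gives equal VINs the same rank. The `dense_rank()` variant is an alternative for comparison, not what the query above does.

```python
import sqlite3

TOP_N = 2  # scaled down from 20 for the demonstration

conn = sqlite3.connect(":memory:")
# Hypothetical table and data: agent A has 3 distinct vehicles (v3 has two
# policies), agent B has two policies on a single vehicle.
conn.execute("CREATE TABLE policy (policy_no TEXT, vin TEXT, agent TEXT)")
conn.executemany(
    "INSERT INTO policy VALUES (?, ?, ?)",
    [
        ("p1", "v3", "A"), ("p2", "v3", "A"), ("p3", "v2", "A"), ("p4", "v1", "A"),
        ("p5", "v9", "B"), ("p6", "v9", "B"),
    ],
)


def top_vins(rank_function):
    """Return distinct (agent, vin) pairs ranked within TOP_N per agent.

    rank_function is a fixed string ('row_number' or 'dense_rank') chosen
    in this script, never user input.
    """
    sql = f"""
        SELECT DISTINCT agent, vin FROM
            (SELECT p.*,
                    {rank_function}() OVER (PARTITION BY agent ORDER BY vin DESC) AS seq
             FROM policy p
             WHERE agent IN
                 (SELECT agent FROM
                     (SELECT vin, agent FROM policy GROUP BY vin, agent) d
                  GROUP BY agent HAVING COUNT(1) >= ?)) t
        WHERE t.seq <= ?
        ORDER BY agent, vin DESC
    """
    return conn.execute(sql, (TOP_N, TOP_N)).fetchall()


with_row_number = top_vins("row_number")
with_dense_rank = top_vins("dense_rank")
print(with_row_number)
print(with_dense_rank)
```

Agent B is filtered out in both cases because it has only one distinct vehicle. For agent A, `row_number()` spends both slots on the two policies of v3, while `dense_rank()` returns v3 and v2.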
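IV. Rows with no match
1. Example: lapsed policies. Extract policies underwritten from 2016-10 to 2017-01 whose vehicle has no valid policy during 2017-10 to 2018-01. Policy-level dimensions: policy number, policy end date, whether the vehicle is new, and branch code (6 digits).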
select a.policyno,
       substr(a.makecomcode, 1, 6),
       trunc(a.enddate) policy_end_date,
       case
         when months_between(trunc(a.startdate),
                             to_date(a.firstTime, 'yyyy-mm-dd')) < 9 then
          'Yes'
         else
          'No'
       end is_new_vehicle
  from wfflowmain a
 where a.enddate between
       TO_DATE('2016-10-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and
       TO_DATE('2017-01-31 23:59:59', 'yyyy-mm-dd hh24:mi:ss')
   and not exists
       (select 1
          from wfflowmain b
         where b.enddate between
               TO_DATE('2017-10-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and
               TO_DATE('2018-01-31 23:59:59', 'yyyy-mm-dd hh24:mi:ss')
           and a.vinceno = b.vinceno)
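The NOT EXISTS pattern keeps a row from the first period only when the correlated subquery finds no row for the same vehicle in the second period. Below is a minimal sketch with Python's sqlite3 module; the table `policy`, its columns, and the sample dates are hypothetical, and dates are stored as ISO text so that BETWEEN compares them correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; end_date is ISO-formatted text (yyyy-mm-dd).
conn.execute("CREATE TABLE policy (policy_no TEXT, vin TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO policy VALUES (?, ?, ?)",
    [
        ("p1", "v1", "2016-11-15"),  # v1 also has a policy in the later window
        ("p2", "v1", "2017-11-15"),
        ("p3", "v2", "2016-12-01"),  # v2 has no policy in the later window
        ("p4", "v3", "2017-12-01"),  # v3 only appears in the later window
    ],
)

# Keep first-window policies whose vehicle has no row in the second window.
lapsed = conn.execute("""
    SELECT a.policy_no FROM policy a
    WHERE a.end_date BETWEEN '2016-10-01' AND '2017-01-31'
      AND NOT EXISTS
          (SELECT 1 FROM policy b
           WHERE b.end_date BETWEEN '2017-10-01' AND '2018-01-31'
             AND a.vin = b.vin)
    ORDER BY a.policy_no
""").fetchall()

print(lapsed)
```

Only p3 is returned: p1 is excluded because v1 has a matching later policy, and p4 never enters the first window.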