Getting the Maximum Number of Consecutive Overdue Periods per Account

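Tool: Oracle lag() and lead() functions

lag and lead are two offset-based analytic functions. Within a single query they return the value of a column from the row N rows before (lag) or N rows after (lead) the current row as a separate column, which makes it much easier to filter the data. They can replace a self-join on the table, and they are usually more efficient than one.

The over() clause defines the set of rows that lag() and lead() operate on. Inside it you can use partition by (to group rows) and order by (to sort them).

partition by a order by b means: group the rows by column a, then sort the rows of each group by column b before the function is evaluated.

  For example: lead(field, num, defaultvalue), where field is the column to read, num is how many rows ahead to look, and defaultvalue is the value returned when no such row exists (the offset runs past the end of the partition).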
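A minimal sketch of the two functions side by side, assuming a handful of inline rows and assuming the overdue date is stored as a DATE. The view name SAMPLE_ROWS and the aliases PREV_FLAG and NEXT_FLAG are made up for illustration; the column names follow the script later in this post.

```sql
-- Illustrative rows only: four periods for each of two accounts.
WITH SAMPLE_ROWS AS (
    SELECT 'acc01' AS ACC_NO, DATE '2018-01-01' AS OVER_DT, 'Y' AS OVER_FLAG FROM DUAL UNION ALL
    SELECT 'acc01', DATE '2018-02-01', 'Y' FROM DUAL UNION ALL
    SELECT 'acc01', DATE '2018-03-01', 'N' FROM DUAL UNION ALL
    SELECT 'acc01', DATE '2018-04-01', 'Y' FROM DUAL UNION ALL
    SELECT 'acc02', DATE '2018-01-15', 'N' FROM DUAL UNION ALL
    SELECT 'acc02', DATE '2018-02-15', 'Y' FROM DUAL UNION ALL
    SELECT 'acc02', DATE '2018-03-15', 'Y' FROM DUAL UNION ALL
    SELECT 'acc02', DATE '2018-04-15', 'N' FROM DUAL
)
SELECT ACC_NO,
       OVER_DT,
       OVER_FLAG,
       -- Flag of the previous row within the same account, ordered by date
       LAG(OVER_FLAG, 1, 'ROW_FIRST') OVER (PARTITION BY ACC_NO ORDER BY OVER_DT) AS PREV_FLAG,
       -- Flag of the next row within the same account, ordered by date
       LEAD(OVER_FLAG, 1, 'ROW_LAST') OVER (PARTITION BY ACC_NO ORDER BY OVER_DT) AS NEXT_FLAG
FROM SAMPLE_ROWS
ORDER BY ACC_NO, OVER_DT;
```

The first row of each account has no previous row, so PREV_FLAG falls back to the default 'ROW_FIRST'; likewise the last row's NEXT_FLAG falls back to 'ROW_LAST'. Every other row simply sees its neighbour's Y or N, and because of partition by, acc02 never sees a row that belongs to acc01.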

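Example:

| Account | Overdue date | Overdue flag |
|---|---|---|
| acc01 | 20180101 | Y |
| acc01 | 20180201 | Y |
| acc01 | 20180301 | Y |
| acc01 | 20180401 | Y |
| acc01 | 20180501 | N |
| acc01 | 20180601 | N |
| acc01 | 20180701 | Y |
| acc01 | 20180801 | Y |
| acc01 | 20180901 | N |
| acc01 | 20181001 | Y |
| acc01 | 20181101 | N |
| acc01 | 20181201 | N |
| acc02 | 20180115 | Y |
| acc02 | 20180215 | N |
| acc02 | 20180315 | Y |
| acc02 | 20180415 | Y |
| acc02 | 20180515 | N |
| acc02 | 20180615 | Y |
| acc02 | 20180715 | Y |
| acc02 | 20180815 | N |
| acc02 | 20180915 | N |
| acc02 | 20181015 | Y |
| acc02 | 20181115 | Y |
| acc02 | 20181215 | Y |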

 

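Processing logic

1.

Use lag() to read the previous row, with the default "ROW_FIRST" marking the first row of each account.

Use lead() to read the next row, with the default "ROW_LAST" marking the last row of each account.

2.

When the current row is both the first and the last row: if the current overdue flag = Y, consecutive-overdue flag = ONE_OVER; otherwise leave it unflagged. This case must be checked before the others.

When the current row is the first row: if the current overdue flag = Y, consecutive-overdue flag = START_OVER; otherwise leave it unflagged.

When the current row is the last row and the current overdue flag = Y: if the previous overdue flag = Y, consecutive-overdue flag = END_OVER; if the previous overdue flag = N, consecutive-overdue flag = ONE_OVER. If the current overdue flag = N, the middle-row rules below apply.

When the current row is a middle row: if the current overdue flag = Y and the previous overdue flag = N, consecutive-overdue flag = START_OVER; if the current overdue flag = N and the previous overdue flag = Y, consecutive-overdue flag = END_OVER; otherwise leave it unflagged.

3.

Filter out the unflagged rows, then use lag() ordered by date so that the start date of each overdue run lands on the same row as the date that closes it.

4.

When the consecutive-overdue flag = ONE_OVER, overdue periods = 1.

When the consecutive-overdue flag = END_OVER, overdue periods = closing date minus start date, measured in periods. If the closing row is itself overdue (a run that reaches the account's last row), that row counts as one more period.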
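Steps 3 and 4 are the least obvious part, so here is a minimal sketch of them in isolation. It assumes the flagged rows for acc01 have already been produced by step 2 (they are typed in by hand here), that the overdue date is a DATE, and that there is one row per month, so that MONTHS_BETWEEN gives the number of periods. The view name FLAGGED and the aliases RUN_START, RUN_CLOSE and OVER_CNT are hypothetical.

```sql
-- Hand-entered START_OVER / END_OVER rows for acc01, i.e. what remains
-- after the unflagged rows have been filtered out.
WITH FLAGGED AS (
    SELECT 'acc01' AS ACC_NO, DATE '2018-01-01' AS OVER_DT, 'START_OVER' AS SERIAL_OVER_FLAG FROM DUAL UNION ALL
    SELECT 'acc01', DATE '2018-05-01', 'END_OVER'   FROM DUAL UNION ALL
    SELECT 'acc01', DATE '2018-07-01', 'START_OVER' FROM DUAL UNION ALL
    SELECT 'acc01', DATE '2018-09-01', 'END_OVER'   FROM DUAL UNION ALL
    SELECT 'acc01', DATE '2018-10-01', 'START_OVER' FROM DUAL UNION ALL
    SELECT 'acc01', DATE '2018-11-01', 'END_OVER'   FROM DUAL
)
SELECT ACC_NO,
       OVER_DT_F AS RUN_START,
       OVER_DT   AS RUN_CLOSE,
       -- Assumption: monthly periods, so the month difference is the period count
       MONTHS_BETWEEN(OVER_DT, OVER_DT_F) AS OVER_CNT
FROM (
    SELECT ACC_NO,
           OVER_DT,
           SERIAL_OVER_FLAG,
           -- START and END rows alternate, so the row before an END_OVER row
           -- is the START_OVER row of the same run
           LAG(OVER_DT, 1, OVER_DT) OVER (PARTITION BY ACC_NO ORDER BY OVER_DT) AS OVER_DT_F
    FROM FLAGGED
)
WHERE SERIAL_OVER_FLAG = 'END_OVER'
ORDER BY ACC_NO, OVER_DT;
```

With these rows the query returns three runs of 4, 2 and 1 periods, which matches the Y streaks of acc01 in the sample data (January to April, July to August, October). The lag() has to sit in an inner query because the START_OVER rows must still be present when it is evaluated and can only be filtered out afterwards.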

 

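| Account | Overdue date | Overdue flag | Previous overdue date | Previous overdue flag | Current + previous | Consecutive-overdue flag |
|---|---|---|---|---|---|---|
| acc01 | 20180101 | Y | | | Y- | START_OVER |
| acc01 | 20180201 | Y | 20180101 | Y | Y-Y | |
| acc01 | 20180301 | Y | 20180201 | Y | Y-Y | |
| acc01 | 20180401 | Y | 20180301 | Y | Y-Y | |
| acc01 | 20180501 | N | 20180401 | Y | N-Y | END_OVER |
| acc01 | 20180601 | N | 20180501 | N | N-N | |
| acc01 | 20180701 | Y | 20180601 | N | Y-N | START_OVER |
| acc01 | 20180801 | Y | 20180701 | Y | Y-Y | |
| acc01 | 20180901 | N | 20180801 | Y | N-Y | END_OVER |
| acc01 | 20181001 | Y | 20180901 | N | Y-N | START_OVER |
| acc01 | 20181101 | N | 20181001 | Y | N-Y | END_OVER |
| acc01 | 20181201 | N | 20181101 | N | N-N | |
| acc02 | 20180115 | Y | | | Y- | START_OVER |
| acc02 | 20180215 | N | 20180115 | Y | N-Y | END_OVER |
| acc02 | 20180315 | Y | 20180215 | N | Y-N | START_OVER |
| acc02 | 20180415 | Y | 20180315 | Y | Y-Y | |
| acc02 | 20180515 | N | 20180415 | Y | N-Y | END_OVER |
| acc02 | 20180615 | Y | 20180515 | N | Y-N | START_OVER |
| acc02 | 20180715 | Y | 20180615 | Y | Y-Y | |
| acc02 | 20180815 | N | 20180715 | Y | N-Y | END_OVER |
| acc02 | 20180915 | N | 20180815 | N | N-N | |
| acc02 | 20181015 | Y | 20180915 | N | Y-N | START_OVER |
| acc02 | 20181115 | Y | 20181015 | Y | Y-Y | |
| acc02 | 20181215 | Y | 20181115 | Y | Y-Y | END_OVER |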

 

 

Script

 

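SELECT
    ACC_NO,
    -- Start date of the run; a ONE_OVER run starts and ends on the same row
    CASE WHEN SERIAL_OVER_FLAG='ONE_OVER' THEN OVER_DT ELSE OVER_DT_F END AS OVER_DT_F,
    OVER_DT,  -- Date of the row that closes the run
    CASE WHEN SERIAL_OVER_FLAG='ONE_OVER' THEN 1
         -- Assumption: OVER_DT is a DATE and each account has one row per month,
         -- so the month difference is the number of periods (convert with TO_DATE first
         -- if OVER_DT is stored as text or a number). A run that reaches the account's
         -- last row closes on an overdue row, which counts as one more period.
         WHEN SERIAL_OVER_FLAG='END_OVER' THEN MONTHS_BETWEEN(OVER_DT,OVER_DT_F)
                                               + CASE WHEN OVER_FLAG='Y' THEN 1 ELSE 0 END
         ELSE NULL  -- A numeric CASE cannot also return the string 'ERROR'
    END AS OVER_CNT
FROM (
    SELECT
        ACC_NO,
        OVER_DT,
        OVER_FLAG,
        SERIAL_OVER_FLAG,
        -- Evaluated after the NO_FLAG rows are filtered out, so the row before an END_OVER row is its START_OVER row
        LAG(OVER_DT,1,OVER_DT) OVER(PARTITION BY ACC_NO ORDER BY OVER_DT) AS OVER_DT_F
    FROM (
        SELECT
            ACC_NO,
            OVER_DT,
            OVER_FLAG,
            -- The single-row case must be tested first, otherwise the first-row rule shadows it
            CASE WHEN OVER_FLAG_F='ROW_FIRST' AND OVER_FLAG_B='ROW_LAST' AND OVER_FLAG='Y' THEN 'ONE_OVER'
                 WHEN OVER_FLAG_F='ROW_FIRST' AND OVER_FLAG='Y' THEN 'START_OVER'
                 WHEN OVER_FLAG_B='ROW_LAST'  AND OVER_FLAG='Y' AND OVER_FLAG_F='Y' THEN 'END_OVER'
                 WHEN OVER_FLAG_B='ROW_LAST'  AND OVER_FLAG='Y' AND OVER_FLAG_F='N' THEN 'ONE_OVER'
                 WHEN OVER_FLAG='Y' AND OVER_FLAG_F='N' THEN 'START_OVER'
                 WHEN OVER_FLAG='N' AND OVER_FLAG_F='Y' THEN 'END_OVER'
                 ELSE 'NO_FLAG'
            END AS SERIAL_OVER_FLAG
        FROM (
            SELECT
                ACC_NO,
                OVER_DT,
                OVER_FLAG,
                LAG(OVER_FLAG,1,'ROW_FIRST') OVER(PARTITION BY ACC_NO ORDER BY OVER_DT) AS OVER_FLAG_F,  -- lag() reads the previous row; 'ROW_FIRST' marks the first row
                LEAD(OVER_FLAG,1,'ROW_LAST') OVER(PARTITION BY ACC_NO ORDER BY OVER_DT) AS OVER_FLAG_B  -- lead() reads the next row; 'ROW_LAST' marks the last row
            FROM ACC_TAB
        )
    )
    WHERE SERIAL_OVER_FLAG<>'NO_FLAG'
) WHERE SERIAL_OVER_FLAG<>'START_OVER'
;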
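The script returns one row per overdue run, while the goal in the title is the longest run per account, so one more aggregation is needed. A minimal sketch of that last step, assuming the run lengths are available under a column called OVER_CNT (a hypothetical alias). To keep the example self-contained, the run lengths below were read off the sample data by hand rather than produced by the script, and the view name RUNS is made up.

```sql
-- One row per overdue run, counted by hand from the sample data:
-- acc01 has runs of 4, 2 and 1 periods; acc02 has runs of 1, 2, 2 and 3.
WITH RUNS AS (
    SELECT 'acc01' AS ACC_NO, 4 AS OVER_CNT FROM DUAL UNION ALL
    SELECT 'acc01', 2 FROM DUAL UNION ALL
    SELECT 'acc01', 1 FROM DUAL UNION ALL
    SELECT 'acc02', 1 FROM DUAL UNION ALL
    SELECT 'acc02', 2 FROM DUAL UNION ALL
    SELECT 'acc02', 2 FROM DUAL UNION ALL
    SELECT 'acc02', 3 FROM DUAL
)
SELECT ACC_NO,
       MAX(OVER_CNT) AS MAX_OVER_CNT  -- longest consecutive overdue run
FROM RUNS
GROUP BY ACC_NO
ORDER BY ACC_NO;
```

For the sample data this gives 4 for acc01 and 3 for acc02. Accounts that were never overdue produce no run rows at all, so they do not appear in the result; an outer join back to the account list would be needed if they should be reported with 0.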

 
