[DB2] Removing Duplicate Elements (DISTINCT) in the LISTAGG Function

Table of Contents

1. Sample data

2. Implementation in DB2 10.5 and earlier

3. The DISTINCT form in DB2 11.1 and later


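The LISTAGG function concatenates multiple string elements into one larger string, optionally separating the elements with a delimiter. A common requirement is to drop duplicate elements while that string is being built.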

In DB2 11.1 and later, LISTAGG supports the DISTINCT keyword for exactly this case (see https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.sql.ref.doc/doc/r0058709.html ).

In DB2 10.5 and earlier, LISTAGG does not yet support DISTINCT (see https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0058709.html ).

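This article first shows an implementation that works in DB2 10.5 and earlier, and then produces the same result with the DISTINCT form available in DB2 11.1 and later.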

1. Sample data

First, the sample data used in the examples:

SELECT *
FROM (VALUES 
  (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
  (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
  (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN')) AS emp(deptno,job)
ORDER BY deptno, job
WITH ur;

DEPTNO|JOB      
------|---------
    10|MANAGER  
    10|PRESIDENT
    10|PRESIDENT
    20|ANALYST  
    20|CLERK    
    20|MANAGER  
    30|CLERK    
    30|MANAGER  
    30|SALESMAN 

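To make the problem concrete, here is a minimal sketch that runs LISTAGG over the sample rows without any de-duplication. It reuses the emp, deptno and job names from the query above; the jobs alias is only illustrative. Because department 10 contains PRESIDENT twice, its result should be PRESIDENT,PRESIDENT,MANAGER, which is the duplication the following sections remove.

```sql
-- Sketch only: plain LISTAGG, no de-duplication.
-- Every input row contributes one element to the result string.
SELECT
  deptno,
  LISTAGG(job, ',') WITHIN GROUP (ORDER BY job DESC) AS jobs
FROM (VALUES
  (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
  (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
  (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN')) AS emp(deptno,job)
GROUP BY deptno
WITH ur;
```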
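2. Implementation in DB2 10.5 and earlier

Now aggregate with deptno as the grouping key: all job values that share a deptno are combined into one string by LISTAGG, with duplicate elements removed: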

WITH 
  emp(deptno,job) AS(
	SELECT *
	FROM (VALUES 
	  (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
	  (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
	  (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
),
  fld(deptno, job) AS(
	SELECT 
	  deptno,
	  DECODE(ROW_NUMBER () OVER (PARTITION BY deptno,job), 1, job) job
	FROM emp 
)
SELECT 
  deptno,
  LISTAGG(job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM fld
GROUP BY deptno
WITH ur;

DEPTNO|JOBS                  
------|----------------------
    10|PRESIDENT,MANAGER     
    20|MANAGER,CLERK,ANALYST 
    30|SALESMAN,MANAGER,CLERK

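The intermediate result of the de-duplication step can be inspected on its own. The following sketch is not part of the original solution: it moves the numbering into a separate CTE (the names numbered, rn and kept_job are made up for illustration) and lists each row next to the value that DECODE would hand to LISTAGG. Rows whose rn is greater than 1, such as the second (10,'PRESIDENT') row, should show NULL in kept_job and therefore contribute nothing to the aggregated string.

```sql
-- Sketch only: show what the DECODE/ROW_NUMBER step yields before aggregation.
WITH
  emp(deptno, job) AS (
    SELECT *
    FROM (VALUES
      (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
      (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
      (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN')) AS t(deptno, job)
  ),
  -- Number the rows inside each (deptno, job) partition: 1, 2, ...
  numbered(deptno, job, rn) AS (
    SELECT deptno, job, ROW_NUMBER() OVER (PARTITION BY deptno, job)
    FROM emp
  )
SELECT
  deptno,
  job,
  rn,
  -- DECODE has no ELSE argument here, so any rn other than 1 maps to NULL.
  DECODE(rn, 1, job) AS kept_job
FROM numbered
ORDER BY deptno, job, rn
WITH ur;
```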
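This approach removes duplicates by combining the DECODE function with the ROW_NUMBER window expression: within each (deptno, job) partition only the row numbered 1 keeps its job value, DECODE returns NULL for every other row, and LISTAGG skips NULL values. In the final SELECT, the WITHIN GROUP clause of LISTAGG sorts the remaining elements, here in descending order. Next, the same technique applied to two columns: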

WITH 
  emp(deptno,job) AS(
	SELECT *
	FROM (VALUES 
	  (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
	  (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
	  (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
),
  fld(deptno, job) AS(
	SELECT 
	  DECODE(ROW_NUMBER () OVER (PARTITION BY deptno), 1, deptno) deptno,
	  DECODE(ROW_NUMBER () OVER (PARTITION BY job), 1, job) job
	FROM emp 
)
SELECT 
  LISTAGG(deptno, ',') WITHIN GROUP (ORDER BY deptno) deptnos,
  LISTAGG(job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM fld
WITH ur;

DEPTNOS |JOBS                                    
--------|----------------------------------------
10,20,30|SALESMAN,PRESIDENT,MANAGER,CLERK,ANALYST

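For the single-column case, a plainer alternative that should also work on DB2 10.5 and earlier is to remove duplicate rows with SELECT DISTINCT before aggregating. This is a different technique from the DECODE and ROW_NUMBER combination shown in this section and is sketched here only for comparison; the CTE name dedup is an assumption. It does not carry over directly to the two-column query, because deptno and job each need their own set of distinct values there, which is where the DECODE approach is more convenient.

```sql
-- Sketch only: de-duplicate rows first, then aggregate.
WITH
  emp(deptno, job) AS (
    SELECT *
    FROM (VALUES
      (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
      (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
      (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN')) AS t(deptno, job)
  ),
  -- One row per distinct (deptno, job) pair.
  dedup(deptno, job) AS (
    SELECT DISTINCT deptno, job
    FROM emp
  )
SELECT
  deptno,
  LISTAGG(job, ',') WITHIN GROUP (ORDER BY job DESC) AS jobs
FROM dedup
GROUP BY deptno
WITH ur;
```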
3. The DISTINCT form in DB2 11.1 and later

With the DISTINCT keyword available in DB2 11.1 and later, the implementation becomes very simple:

WITH 
  emp(deptno,job) AS(
	SELECT *
	FROM (VALUES 
	  (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
	  (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
	  (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
)
SELECT 
  deptno, 
  LISTAGG(DISTINCT job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM emp 
GROUP BY deptno
WITH ur;

DEPTNO|JOBS                  
------|----------------------
    10|PRESIDENT,MANAGER     
    20|MANAGER,CLERK,ANALYST 
    30|SALESMAN,MANAGER,CLERK

Next, the case where both columns are processed in the same query:

WITH 
  emp(deptno,job) AS(
	SELECT *
	FROM (VALUES 
	  (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
	  (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
	  (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
)
SELECT 
  LISTAGG(DISTINCT deptno, ',') deptnos,
  LISTAGG(DISTINCT job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM emp 
WITH ur;

DEPTNOS |JOBS                                    
--------|----------------------------------------
10,20,30|SALESMAN,PRESIDENT,MANAGER,CLERK,ANALYST

 
