Table of Contents
1. Source data
2. Implementation in DB2 10.5 and earlier
3. Implementation in DB2 11.1 and later
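The LISTAGG function aggregates multiple string elements into one long string, optionally separating them with a delimiter. A common requirement is to drop duplicate elements while building that string.
In DB2 11.1 and later, LISTAGG supports the DISTINCT keyword for exactly this purpose. Reference: https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.sql.ref.doc/doc/r0058709.html
In DB2 10.5 and earlier, LISTAGG does not support DISTINCT. Reference: https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0058709.html
This post first shows an implementation that works in DB2 10.5 and earlier, then repeats it with the DISTINCT form available in DB2 11.1 and later.
First, the source data used in the examples: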
SELECT *
FROM (VALUES
(10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
(20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
(20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN')) AS emp(deptno,job)
ORDER BY deptno, job
WITH ur;
DEPTNO|JOB
------|---------
10|MANAGER
10|PRESIDENT
10|PRESIDENT
20|ANALYST
20|CLERK
20|MANAGER
30|CLERK
30|MANAGER
30|SALESMAN
Now aggregate with deptno as the grouping key: LISTAGG collects all job values that share a deptno into one string, with duplicate elements removed:
WITH
emp(deptno,job) AS(
SELECT *
FROM (VALUES
(10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
(20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
(20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
),
fld(deptno, job) AS(
SELECT
deptno,
DECODE(ROW_NUMBER () OVER (PARTITION BY deptno,job), 1, job) job
FROM emp
)
SELECT
deptno,
LISTAGG(job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM fld
GROUP BY deptno
WITH ur;
DEPTNO|JOBS
------|----------------------
10|PRESIDENT,MANAGER
20|MANAGER,CLERK,ANALYST
30|SALESMAN,MANAGER,CLERK
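To see what the fld step hands to LISTAGG, the intermediate rows can be inspected directly. The following is a minimal sketch that reuses the sample data; the column aliases original_job, rn, and kept_job are illustrative names and not part of the original query.

```sql
-- Sketch: show the row number and the DECODE result side by side.
-- original_job, rn and kept_job are hypothetical aliases for illustration.
WITH
emp(deptno,job) AS(
  SELECT *
  FROM (VALUES
    (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
    (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
    (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
)
SELECT
  deptno,
  job AS original_job,
  -- numbers the rows inside each (deptno, job) partition
  ROW_NUMBER() OVER (PARTITION BY deptno, job) AS rn,
  -- keeps job only for the first row of a partition, NULL otherwise
  DECODE(ROW_NUMBER() OVER (PARTITION BY deptno, job), 1, job) AS kept_job
FROM emp
ORDER BY deptno, original_job, rn
WITH ur;
```

For deptno 10, one of the two PRESIDENT rows should come back with rn = 2 and a NULL kept_job. Which of the two rows that is does not matter, since both carry the same value.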
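The query for DB2 10.5 and earlier removes duplicates by combining the DECODE function with the ROW_NUMBER window expression: within each (deptno, job) partition only the row numbered 1 keeps its job value, every other row yields NULL, and LISTAGG skips NULL values. The WITHIN GROUP clause of LISTAGG then sorts the elements, here in descending order. Next, the case of two columns: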
WITH
emp(deptno,job) AS(
SELECT *
FROM (VALUES
(10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
(20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
(20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
),
fld(deptno, job) AS(
SELECT
DECODE(ROW_NUMBER () OVER (PARTITION BY deptno), 1, deptno) deptno,
DECODE(ROW_NUMBER () OVER (PARTITION BY job), 1, job) job
FROM emp
)
SELECT
LISTAGG(deptno, ',') WITHIN GROUP (ORDER BY deptno) deptnos,
LISTAGG(job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM fld
WITH ur;
DEPTNOS |JOBS
--------|----------------------------------------
10,20,30|SALESMAN,PRESIDENT,MANAGER,CLERK,ANALYST
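For the single-column case, a different technique can also be used on versions without LISTAGG(DISTINCT ...): remove the duplicates in a SELECT DISTINCT step before aggregating. This is a sketch of that alternative, not the method used above; the CTE name uniq is a hypothetical name chosen for illustration.

```sql
-- Sketch: deduplicate (deptno, job) pairs first, then aggregate.
-- uniq is a hypothetical CTE name.
WITH
emp(deptno,job) AS(
  SELECT *
  FROM (VALUES
    (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
    (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
    (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
),
uniq(deptno, job) AS(
  -- each (deptno, job) pair appears exactly once here
  SELECT DISTINCT deptno, job
  FROM emp
)
SELECT
  deptno,
  LISTAGG(job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM uniq
GROUP BY deptno
WITH ur;
```

This variant only fits when a single column is aggregated per group. When two columns must be deduplicated independently in the same result row, as in the deptnos/jobs example, the DECODE and ROW_NUMBER approach keeps everything in one pass over emp.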
With the DISTINCT keyword available in DB2 11.1 and later, the implementation becomes very simple.
WITH
emp(deptno,job) AS(
SELECT *
FROM (VALUES
(10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
(20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
(20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
)
SELECT
deptno,
LISTAGG(DISTINCT job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM emp
GROUP BY deptno
WITH ur;
DEPTNO|JOBS
------|----------------------
10|PRESIDENT,MANAGER
20|MANAGER,CLERK,ANALYST
30|SALESMAN,MANAGER,CLERK
Next, the case where two columns are handled at the same time:
WITH
emp(deptno,job) AS(
SELECT *
FROM (VALUES
(10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
(20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
(20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
)
SELECT
LISTAGG(DISTINCT deptno, ',') deptnos,
LISTAGG(DISTINCT job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM emp
WITH ur;
DEPTNOS |JOBS
--------|----------------------------------------
10,20,30|SALESMAN,PRESIDENT,MANAGER,CLERK,ANALYST
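In the query above, the first LISTAGG has no WITHIN GROUP clause, so the order of deptnos is not spelled out. The sketch below adds an explicit ordering. It rests on two assumptions that should be checked against the LISTAGG reference for the version in use: that with DISTINCT the sort key has to match the aggregated expression, and that VARCHAR(11) is wide enough for the deptno values. To stay on the safe side, the same explicit CAST is written in both places.

```sql
-- Sketch: explicit, matching CAST in the aggregated expression and the sort key.
-- VARCHAR(11) is an assumed width for an INTEGER deptno.
WITH
emp(deptno,job) AS(
  SELECT *
  FROM (VALUES
    (10,'PRESIDENT'),(30,'MANAGER'),(10,'PRESIDENT'),
    (20,'ANALYST'),(30,'CLERK'),(20,'MANAGER'),
    (20,'CLERK'),(10,'MANAGER'),(30,'SALESMAN'))
)
SELECT
  LISTAGG(DISTINCT CAST(deptno AS VARCHAR(11)), ',')
    WITHIN GROUP (ORDER BY CAST(deptno AS VARCHAR(11))) deptnos,
  LISTAGG(DISTINCT job, ',') WITHIN GROUP (ORDER BY job DESC) jobs
FROM emp
WITH ur;
```

Note that the sort is then a string sort. It coincides with numeric order for the sample values 10, 20, and 30, but would not for values with different digit counts, such as 9 and 10.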