行转列,通用做法是使用 case when 。today,介绍一下新的方法 -- crosstab
crosstab 函数被用来生成 pivot
(扭曲,转动)展示,即通过横向而不是下拉展示(行转列)。如下面这个例子:
源表:
row1 val11
row1 val12
row1 val13
...
row2 val21
row2 val22
row2 val23
...
期望的结果:
row1 val11 val12 val13 ...
row2 val21 val22 val23 ...
...
tablefunc
PostgreSQL 提供了一个 tablefunc 模块,内置了多个函数,其中就有 crosstab(交叉表,又叫行转列,或者长表转宽表)
--prerequisite: 安装插件tablefunc
create extension tablefunc;
crosstab声明该函数返回setof record,因此必须在FROM在调用SELECT语句的子句中定义输出列的实际名称和类型,例如:
SELECT * FROM crosstab('...') AS ct(row_name text, _col1 text, _col2 text);
产生一个类似于的集合:
row_name _col1 _col2
----------+------------+------------
row1 val1 val2
row2 val3 val4
需求:在动态行转列不清楚生成的宽表有多少列的情况下,或者列太多,手动敲很不友好,那么可以借助创建函数(pivotcode)来简化工作:
创建函数pivotcode:
---pivotcode (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar)
---pivotcode(tablename:要透视的源表的名称,rowc:要成为行的源表中的列的名称,colc:想成为列的源表中的列的名称,cellc:确定将如何创建单元格值的聚合表达式,celldatatype:单元格所需的数据类型)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
create or replace function pivotcode (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct ''_''||'||colc||'||'' '||celldatatype||''','','' order by ''_''||'||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as newtable (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$
创建表并插入数据:
create table table_to_pivot (
rowname varchar,
colname varchar,
cellval numeric
);
insert into table_to_pivot values ('row1','col1',11);
insert into table_to_pivot values ('row1','col2',12);
insert into table_to_pivot values ('row1','col3',13);
insert into table_to_pivot values ('row2','col1',21);
insert into table_to_pivot values ('row2','col2',22);
insert into table_to_pivot values ('row2','col3',23);
insert into table_to_pivot values ('row3','col1',31);
insert into table_to_pivot values ('row3','col2',32);
insert into table_to_pivot values ('row3','col3',33);
insert into table_to_pivot values ('row1','col1',141);
insert into table_to_pivot values ('row1','col2',122);
insert into table_to_pivot values ('row4','col3',131);
insert into table_to_pivot values ('row1','col1',211);
insert into table_to_pivot values ('row3','col2',222);
insert into table_to_pivot values ('row5','col3',163);
insert into table_to_pivot values ('row7','col8',316);
insert into table_to_pivot values ('row5','col2',324);
insert into table_to_pivot values ('row33','col3',233);
使用函数pivotcode:
select pivotcode('table_to_pivot','rowname','colname','max(cellval)','integer')
执行 pivotcode 生成的 SQL :
select * from crosstab (
'select rowname,colname,max(cellval) from table_to_pivot group by 1,2 order by 1,2',
'select distinct colname from table_to_pivot order by 1'
)
as newtable (
rowname varchar,_col1 integer,_col2 integer,_col3 integer,_col8 integer
);
运行SQL得到透视表:
函数pivotcode,有以下缺点:
如果返回的列有空格,需要自己手工修改
可能会有大量的 Null 值,而不是 0
返回的是一个 SQL,您需要复制出来后自己再执行
update版(实现pivotcode函数返回 json数据):
直接通过pivotcode函数返回结果,而不是返回一个待执行的 SQL 语句
-- PL/pgSQL code to create pivot tables with automatic column names
-- prerequisite: install the tablefunc module
create extension if not exists tablefunc;
-- tablename: name of source table you want to pivot
-- rowc: the name of the column in source table you want to be the rows
-- colc: the name of the column in source table you want to be the columns
-- cellc: an aggregate expression determining how the cell values will be created
-- celldatatype: desired data type for the cells
create or replace function pivot (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar)
returns json language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
stmt TEXT;
result json;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct ''_''||'||colc||'||'' '||celldatatype||''','','' order by ''_''||'||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as newtable (
'||rowc||' varchar,'||columnlist||'
)';
stmt = E' select array_to_json(array_agg(row_to_json(t))) from ('|| dynsql2 ||') t ';
execute stmt into result;
return result;
end
$$
select pivot('table_to_pivot','rowname','colname','max(cellval)','integer')
[
{"rowname":"row1","_col1":211,"_col2":122,"_col3":13,"_col8":null},
{"rowname":"row2","_col1":21,"_col2":22,"_col3":23,"_col8":null},
{"rowname":"row3","_col1":31,"_col2":222,"_col3":33,"_col8":null},
{"rowname":"row33","_col1":null,"_col2":null,"_col3":233,"_col8":null},
{"rowname":"row4","_col1":null,"_col2":null,"_col3":131,"_col8":null},
{"rowname":"row5","_col1":null,"_col2":324,"_col3":163,"_col8":null},
{"rowname":"row7","_col1":null,"_col2":null,"_col3":null,"_col8":316}
]
参考文献 :
postgresql11:https://www.postgresql.org/docs/11/tablefunc.html
Pivot函数: https://madlib.apache.org/docs/latest/group__grp__pivot.html
pivotmytable: https://github.com/jtornero/pivotmytable
crosstabview: https://wiki.postgresql.org/wiki/Crosstabview
dynamic_pivot: https://postgresql.verite.pro/blog/2018/06/19/crosstab-pivot.html