MySQL的group_concat 函数经常实用,功能蛮强大的。
MySQL创建表:
create table kpi(emp_no varchar(8),performance varchar(32),month varchar(32));
clickhouse 创建表:
create table kpi(emp_no varchar(8),performance varchar(32),month varchar(32)) engine=Memory;
分别在MySQL和Clickhouse中插入数据:
insert into kpi values (10,'A','2020-01'),(10,'A','2020-02'),(10,'C','2020-03'),(10,'B','2020-04'),(10,'A','2020-05'),(10,'A','2020-06');
insert into kpi values (20,'A','2020-01'),(20,'B','2020-02'),(20,'C','2020-03'),(20,'C','2020-04'),(20,'A','2020-05'),(20,'D','2020-06');
insert into kpi values (30,'C','2020-03'),(30,'C','2020-04'),(30,'B','2020-05'),(30,'B','2020-06');
数据查询如下:
MySQL的group_concat 功能展示:
mysql> select emp_no,group_concat(performance order by month separator '-') kpi_list,group_concat(distinct performance order by month separator '-') kpi_uniq,group_concat(distinct performance order by month desc separator '-') kpi_uniq_desc from kpi group by emp_no;
+--------+-------------+----------+---------------+
| emp_no | kpi_list | kpi_uniq | kpi_uniq_desc |
+--------+-------------+----------+---------------+
| 10 | A-A-C-B-A-A | A-C-B | B-C-A |
| 20 | A-B-C-C-A-D | A-B-C-D | D-C-B-A |
| 30 | C-C-B-B | C-B | B-C |
+--------+-------------+----------+---------------+
3 rows in set (0.00 sec)
kpi_list:按照月份依次显示每个月的绩效
kpi_uniq:上半年获得的绩效 等级(绩效去重)
kpi_uniq_desc :去重后的绩效反向排序
Clickhouse的Group_concat 实现:
SELECT
emp_no,
groupArray(performance) AS kpi_asc,
arrayStringConcat(kpi_asc, '-') AS kpi_list,
arrayReverse(kpi_asc) AS kpi_desc,
groupUniqArray(performance) AS kpis,
arraySort(kpis) AS kpi_uniq,
countEqual(kpi_asc, 'A') AS A_cnt,
countEqual(kpi_asc, 'B') AS B_cnt,
countEqual(kpi_asc, 'C') AS C_cnt,
countEqual(kpi_asc, 'D') AS D_cnt
FROM kpi
GROUP BY emp_no
ORDER BY emp_no ASC
┌─emp_no─┬─kpi_asc───────────────────┬─kpi_list────┬─kpi_desc──────────────────┬─kpis──────────────┬─kpi_uniq──────────┬─A_cnt─┬─B_cnt─┬─C_cnt─┬─D_cnt─┐
│ 10 │ ['A','A','C','B','A','A'] │ A-A-C-B-A-A │ ['A','A','B','C','A','A'] │ ['B','A','C'] │ ['A','B','C'] │ 4 │ 1 │ 1 │ 0 │
│ 20 │ ['A','B','C','C','A','D'] │ A-B-C-C-A-D │ ['D','A','C','C','B','A'] │ ['B','A','D','C'] │ ['A','B','C','D'] │ 2 │ 1 │ 2 │ 1 │
│ 30 │ ['C','C','B','B'] │ C-C-B-B │ ['B','B','C','C'] │ ['B','C'] │ ['B','C'] │ 0 │ 2 │ 2 │ 0 │
└────────┴───────────────────────────┴─────────────┴───────────────────────────┴───────────────────┴───────────────────┴───────┴───────┴───────┴───────┘
3 rows in set. Elapsed: 0.007 sec.
结论:要等同于MySQL中的group_concat 功能需要使用到Clickhouse中的几个函数组合使用:
groupArray 行专列
groupUniqArray
等同于mysql中的 group_concat(distinct ..)
arrayStringConcat
等同于group_concat 子语句中的separator '-' 语句
countEqual 则统计数组中元素出现的次数,MySQL实现成功要麻烦点。