ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)

前言

初始化

测试数据生成

  • 创建库

    CREATE DATABASE qbit;
  • 创建表

    CREATE TABLE download (
      userid UInt32,
      when DateTime,
      size Float32
    ) ENGINE = MergeTree 
    PARTITION BY toYYYYMM(when)
    ORDER BY(userid, when);
  • 插入数据

    INSERT INTO qbit.download
    SELECT
      rand() % 5,
      now() + number * 60 as when,
      rand() % 100000000
    FROM system.numbers
    LIMIT 10000;
  • 查看数据

    SELECT * FROM qbit.download LIMIT 10;

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第1张图片

  • 查看每天的下载量(扫描 10000 条数据

    SELECT
      userid,
      toYYYYMMDD(when) AS day,
      count() AS cnt,
      sum(size) AS size
    FROM qbit.download
    GROUP BY
      userid,
      day
    ORDER BY cnt DESC
    LIMIT 10

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第2张图片

SummingMergeTree 准实时看板

方式一(view1)

  • 创建视图 view1

    CREATE MATERIALIZED VIEW qbit.view1
    ENGINE = SummingMergeTree((cnt, size))
    ORDER BY (userid, day) POPULATE AS
    SELECT
      userid,
      toYYYYMMDD(when) AS day,
      toInt64(1) AS cnt,
      toFloat64(size) AS size
    FROM qbit.download

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第3张图片

  • 查询视图 view1(可以看到跟前面的查询结果是一致的)

    SELECT *
    FROM qbit.view1
    ORDER BY cnt DESC
    LIMIT 10

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第4张图片

方式二(view2)

  • 创建视图 view2

    CREATE MATERIALIZED VIEW qbit.view2
    ENGINE = SummingMergeTree
    PARTITION BY day
    ORDER BY (userid, day) POPULATE AS
    SELECT
      userid,
      toYYYYMMDD(when) AS day,
      count() AS cnt,
      sum(size) AS size
    FROM qbit.download
    GROUP BY
      userid,
      day

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第5张图片

  • 查询视图 view2(可以看到跟前面的查询结果是一致的)

    SELECT *
    FROM qbit.view2
    ORDER BY cnt DESC
    LIMIT 10

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第6张图片

AggregatingMergeTree 准实时看板

  • 创建视图(view3)

    CREATE MATERIALIZED VIEW qbit.view3
    ENGINE = AggregatingMergeTree
    PARTITION BY day
    ORDER BY (userid, day) POPULATE AS
    SELECT
      userid,
      toYYYYMMDD(when) AS day,
      countState() AS cnt,
      sumState(size) AS size
    FROM qbit.download
    GROUP BY
      userid,
      day

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第7张图片

  • 查询视图 view3(可以看到跟前面的查询结果是一致的)

    SELECT
      userid,
      day,
      countMerge(cnt) AS cnt,
      sumMerge(size) AS size
    FROM qbit.view3
    GROUP BY
      userid,
      day
    ORDER BY cnt DESC
    LIMIT 10

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第8张图片

插入更新数据

  • 再插入 20000 条数据

    INSERT INTO qbit.download SELECT
      rand() % 5,
      now() + (number * 60) AS when,
      rand() % 100000000
    FROM system.numbers
    LIMIT 20000
  • 查看底表数据量
    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第9张图片
  • 直接查询底表统计

    SELECT
      userid,
      toYYYYMMDD(when) AS day,
      count() AS cnt,
      sum(size) AS size
    FROM qbit.download
    GROUP BY
      userid,
      day
    ORDER BY cnt DESC
    LIMIT 10

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第10张图片

  • 查询 view1 统计

    SELECT *
    FROM qbit.view1
    ORDER BY cnt DESC
    LIMIT 10

    查询结果与直接查询底表不一致,OPTIMIZE 后与底表查询结果一致

    OPTIMIZE TABLE qbit.view1 FINAL

    如果不想强制合并汇总,也可以改变查询语句

    SELECT
      userid,
      day,
      sum(cnt) AS cnt,
      sum(size) AS size
    FROM qbit.view1
    GROUP BY
      userid,
      day
    ORDER BY cnt DESC
    LIMIT 10

    可以直接 select count 验证,即使没有合并汇总,视图的数据条数也会比底表的数据条数少很多
    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第11张图片

  • 查询 view2 统计

    SELECT *
    FROM qbit.view2
    ORDER BY cnt DESC
    LIMIT 10

    查询结果与直接查询底表不一致,OPTIMIZE 后与底表查询结果一致

    OPTIMIZE TABLE qbit.view2 FINAL

    如果不想强制合并汇总,也可以改变查询语句

    SELECT
      userid,
      day,
      sum(cnt) AS cnt,
      sum(size) AS size
    FROM qbit.view2
    GROUP BY
      userid,
      day
    ORDER BY cnt DESC
    LIMIT 10
  • 查询 view3 统计。AggregatingMergeTree 本身用了 group by,所以不用 OPTIMIZE 可以得到最终结果。

    SELECT
      userid,
      day,
      countMerge(cnt) AS cnt,
      sumMerge(size) AS size
    FROM qbit.view3
    GROUP BY
      userid,
      day
    ORDER BY cnt DESC
    LIMIT 10
    

    ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit)_第12张图片

参考文献

本文出自 qbit snap

你可能感兴趣的:(ClickHouse 之 SummingMergeTree/AggregatingMergeTree 实现准实时看板(qbit))