通过定义一段select查询语句,从base表进行数据查询,形成一张物化视图表。物化视图表的数据独立于base表储存。对base表进行数据的插入,更新、删除也会原子性的同步到物化视图表。进行数据查询时,Doris引擎会自动以最优方式进行查询,可能查询base表,可能查询物化视图表
我们都指导rollup主要有两个功能,第一对Aggregate表提供一种更粗粒度的聚合;第二提供一种key,提高前缀索引的命中率
但是rollup不能对Duplicate表进行聚合操作。所以就有了物化视图,物化视图能对所有表模型提供一种更粗粒度的聚合,而且支持更多的聚合函数;而且也能提供一种key,提高前缀索引的命中率
创建物化视图的select语句,如果完全和我们进行数据查询的select语句一样,则物化视图表就不能被其它数据查询select语句使用
所以最好对创建物化视图的select语句进行抽象,能被多个数据查询的select语句命中,达到一个时间和空间的平衡
创建物化视图的语法:
create materialized view mv_name as
select select_expr[, select_expr, ......]
from table_name
[group by column_name[, column_name, ......]]
[order by column_name[, column_name, ......]]
[properties ('key' = 'value', ......)]
说明如下:
支持的聚合函数有:
创建物化视图是一个异步的操作
示例:
创建base表
mysql> create table advertiser_record(
-> click_date date,
-> advertiser varchar(32),
-> channel varchar(16),
-> user_id int
-> ) distributed by hash(click_date)
-> properties('replication_num' = '3');
Query OK, 0 rows affected (0.35 sec)
mysql>
创建物化视图。这里base表的tablet正在被调度,所以需要等调度完,才能创建物化视图成功
mysql> create materialized view advertiser_view as
-> select advertiser, channel, bitmap_union(to_bitmap(user_id))
-> from advertiser_record
-> group by advertiser, channel;
ERROR 1105 (HY000): errCode = 2, detailMessage = table [advertiser_record] is not stable. Some tablets of this table may not be healthy or are being scheduled. You need to repair the table first or stop cluster balance. See 'help admin;'.
mysql>
mysql> create materialized view advertiser_view as select advertiser, channel, bitmap_union(to_bitmap(user_id)) from advertiser_record group by advertiser, channel;
Query OK, 0 rows affected (1.58 sec)
mysql>
查看物化视图创建的进度
mysql> show alter table materialized view from test_db;
+-------+------------------------+---------------------+---------------------+------------------------+------------------+----------+---------------+----------+------+----------+---------+
| JobId | TableName | CreateTime | FinishTime | BaseIndexName | RollupIndexName | RollupId | TransactionId | State | Msg | Progress | Timeout |
+-------+------------------------+---------------------+---------------------+------------------------+------------------+----------+---------------+----------+------+----------+---------+
| 15398 | advertiser_record | 2022-08-02 16:32:23 | 2022-08-02 16:32:53 | advertiser_record | advertiser_view | 15399 | 1020 | FINISHED | | NULL | 86400 |
+-------+------------------------+---------------------+---------------------+------------------------+------------------+----------+---------------+----------+------+----------+---------+
1 rows in set (0.02 sec)
mysql>
查看当前表都有哪些物化视图,以及他们的表结构都是什么样的
mysql> desc advertiser_record all;
+-------------------+---------------+----------------------+-------------+------+-------+---------+--------------+---------+
| IndexName | IndexKeysType | Field | Type | Null | Key | Default | Extra | Visible |
+-------------------+---------------+----------------------+-------------+------+-------+---------+--------------+---------+
| advertiser_record | DUP_KEYS | click_date | DATE | Yes | true | NULL | | true |
| | | advertiser | VARCHAR(32) | Yes | true | NULL | | true |
| | | channel | VARCHAR(16) | Yes | false | NULL | NONE | true |
| | | user_id | INT | Yes | false | NULL | NONE | true |
| | | | | | | | | |
| advertiser_view | AGG_KEYS | advertiser | VARCHAR(32) | Yes | true | NULL | | true |
| | | channel | VARCHAR(16) | Yes | true | NULL | | true |
| | | to_bitmap(`user_id`) | BITMAP | No | false | | BITMAP_UNION | true |
+-------------------+---------------+----------------------+-------------+------+-------+---------+--------------+---------+
8 rows in set (0.07 sec)
mysql>
mysql> drop materialized view if exists advertiser_view on advertiser_record;
Query OK, 0 rows affected (0.05 sec)
mysql>
物化视图聚合 | 查询中聚合 |
---|---|
sum | sum |
min | min |
max | max |
count | count |
bitmap_union | bitmap_union, bitmap_union_count, count(distinct) |
hll_union | hll_raw_agg, hll_union_agg, ndv, approx_count_distinct |
在Doris中,count(distinct)聚合的结果和bitmap_union_count聚合的结果是完全一致的。而bitmap_union_count等于bitmap_union的结果求 count, 所以如果查询中涉及到count(distinct),则通过创建带bitmap_union聚合的物化视图可加快查询
对于物化视图的聚合函数bitmap_union和hll_union,select查询匹配到物化视图后,会将select查询的聚合函数转换成物化视图的聚合函数bitmap_union和hll_union
通过explain命令来检查当前查询是否使用了物化视图。注意要先往表中插入一条数据,不然不知道从那个表查询数据最优
mysql> insert into advertiser_record(click_date, advertiser, channel, user_id) values(date('2022-08-02'), 'car_advertiser', '小程序', 1);
Query OK, 1 row affected (2.79 sec)
{'label':'insert_f3aa0cfc2c8741bd-b567f96aff6f6629', 'status':'VISIBLE', 'txnId':'1021'}
mysql>
mysql> explain select advertiser, channel, count(distinct user_id)
-> from advertiser_record
-> group by advertiser, channel;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: `advertiser` | `channel` | bitmap_union_count(`default_cluster:test_db`.`advertiser_record`.`mv_bitmap_union_user_id`) |
| PARTITION: UNPARTITIONED |
| |
| VRESULT SINK |
......省略部分......
| 0:VOlapScanNode |
| TABLE: advertiser_record(advertiser_view), PREAGGREGATION: ON |
| partitions=1/1, tablets=10/10, tabletList=15400,15404,15408 ... |
| cardinality=0, avgRowSize=48.0, numNodes=1 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
41 rows in set (0.04 sec)
mysql>
看VOlapScanNode的TABLE部分,这里命中了advertiser_view物化视图。如果命中base表,则是TABLE: advertiser_record(advertiser_record)