mysql>
mysql> alter table table1 add rollup rollup_city(name, city_code, pv);
Query OK, 0 rows affected (0.01 sec)
mysql>
mysql>
mysql> show alter table rollup;
+-------+-----------+---------------------+---------------------+---------------+-----------------+----------+---------------+----------+------+----------+---------+
| JobId | TableName | CreateTime | FinishTime | BaseIndexName | RollupIndexName | RollupId | TransactionId | State | Msg | Progress | Timeout |
+-------+-----------+---------------------+---------------------+---------------+-----------------+----------+---------------+----------+------+----------+---------+
| 10741 | table1 | 2021-10-24 14:22:50 | 2021-10-24 14:23:15 | table1 | rollup_city | 10742 | 15 | FINISHED | | NULL | 86400 |
+-------+-----------+---------------------+---------------------+---------------+-----------------+----------+---------------+----------+------+----------+---------+
1 row in set (0.00 sec)
mysql>
A rollup job that has not finished yet can be cancelled with:
CANCEL ALTER TABLE ROLLUP FROM table1;
mysql>
mysql> desc table1 all;
+-------------+---------------+-----------+-------------+------+-------+---------+-------+---------+
| IndexName | IndexKeysType | Field | Type | Null | Key | Default | Extra | Visible |
+-------------+---------------+-----------+-------------+------+-------+---------+-------+---------+
| table1 | AGG_KEYS | id | INT | Yes | true | 0 | | true |
| | | name | VARCHAR(32) | Yes | true | | | true |
| | | city_code | INT | Yes | true | NULL | | true |
| | | pv | BIGINT | Yes | false | 0 | SUM | true |
| | | | | | | | | |
| rollup_city | AGG_KEYS | name | VARCHAR(32) | Yes | true | | | true |
| | | city_code | INT | Yes | true | NULL | | true |
| | | pv | BIGINT | Yes | false | 0 | SUM | true |
+-------------+---------------+-----------+-------------+------+-------+---------+-------+---------+
8 rows in set (0.01 sec)
mysql>
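The following queries reference only columns contained in rollup_city:
select name, city_code, sum(pv) from table1 group by name, city_code
select city_code, name, sum(pv) from table1 group by city_code, name
select city_code, sum(pv) from table1 group by city_code
The following query does not satisfy condition 2, because the columns it references come from different tables:
select a.name, a.city_code, sum(a.pv) from table1 a join table2 b on a.name = b.name group by a.name, a.city_code
In the following query, table b does not satisfy condition 1 and cannot hit the rollup, while table a can:
select a.name, a.city_code, sum(a.pv) from table1 a left join table1 b on a.name = b.name group by a.name, a.city_code
Use explain select_sql to analyze a query, as shown below:
mysql>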
mysql> explain select name, city_code, sum(pv) from table1 group by name, city_code;
+-----------------------------------------------------------------------------+
| Explain String |
+-----------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: `name` | `city_code` | sum(`pv`) |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 4:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: `name`, `city_code` |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 04 |
| UNPARTITIONED |
| |
| 3:AGGREGATE (merge finalize) |
| | output: sum( sum(`pv`)) |
| | group by: `name`, `city_code` |
| | |
| 2:EXCHANGE |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 02 |
| HASH_PARTITIONED: `name`, `city_code` |
| |
| 1:AGGREGATE (update serialize) |
| | STREAMING |
| | output: sum(`pv`) |
| | group by: `name`, `city_code` |
| | |
| 0:OlapScanNode |
| TABLE: table1 |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: rollup_city |
| tabletRatio=10/10 |
| tabletList=10743,10747,10751,10755,10759,10763,10767,10771,10775,10779 |
| cardinality=5 |
| avgRowSize=518.6 |
| numNodes=3 |
+-----------------------------------------------------------------------------+
45 rows in set (0.00 sec)
mysql>
Check the PREAGGREGATION field to see whether pre-aggregation is in effect, and the rollup field to see which rollup the query hit.
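When many queries need checking, the two fields can be pulled out of the plan text programmatically. The following is a minimal sketch, assuming the plan has been captured as plain text in the layout shown above; the helper name `scan_node_summary` is hypothetical and not part of any Doris client library.

```python
import re

def scan_node_summary(explain_text):
    """Collect TABLE, PREAGGREGATION and rollup for each OlapScanNode.

    Assumes the text layout of the EXPLAIN output shown above
    (one plan line per row, optionally wrapped in '|' borders).
    """
    summaries = []
    current = None
    for raw in explain_text.splitlines():
        # Drop the table borders and surrounding whitespace.
        line = raw.strip().strip("|").strip()
        node = re.match(r"^(\d+):(\w+)", line)
        if node:
            # A new plan node starts; only scan nodes are of interest.
            current = {} if node.group(2) == "OlapScanNode" else None
            if current is not None:
                summaries.append(current)
            continue
        if current is not None:
            field = re.match(r"^(TABLE|PREAGGREGATION|rollup):\s*(.+)$", line)
            if field:
                current[field.group(1)] = field.group(2).strip()
    return summaries

sample = """
| 1:AGGREGATE (update serialize) |
| | output: sum(`pv`) |
| 0:OlapScanNode |
| TABLE: table1 |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: rollup_city |
"""
print(scan_node_summary(sample))
```

For the sample, the result holds one entry describing table1 with pre-aggregation on and rollup_city as the chosen rollup.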
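Adding a rollup to a Duplicate table has no aggregation effect. It only adds another set of key columns, which indirectly gives queries one more way to hit a prefix index.
Doris stores data in an ordered structure, sorted by the specified key columns. The sparse index is sorted as well: the index is used to locate a position, and a binary search is then performed within the data.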
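The lookup described above can be sketched as follows. This is an illustration only, not Doris's actual storage format: the block size, the unique-key assumption, and the function names `build_sparse_index` and `lookup` are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

def build_sparse_index(sorted_keys, block_size):
    """Keep only the first key of every block of block_size rows."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), block_size)]

def lookup(sorted_keys, sparse_index, block_size, key):
    """Return the row position of key, or -1 if it is absent.

    Assumes sorted_keys is sorted and contains no duplicate keys.
    """
    # Step 1: use the sparse index to find the last block whose
    # first key is <= the search key.
    block = bisect_right(sparse_index, key) - 1
    if block < 0:
        return -1
    # Step 2: binary search inside that block only.
    start = block * block_size
    end = min(start + block_size, len(sorted_keys))
    pos = bisect_left(sorted_keys, key, start, end)
    if pos < end and sorted_keys[pos] == key:
        return pos
    return -1

# Composite keys compare column by column, like sorted key columns.
rows = [(1, 10), (1, 20), (2, 5), (3, 1), (3, 7), (4, 2), (5, 9)]
index = build_sparse_index(rows, block_size=3)
print(lookup(rows, index, 3, (3, 7)))
```

The sparse index holds one entry per block rather than one per row, so it stays small while still narrowing each search to a single block.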
key1 example
| Field | Type | Key | Bytes |
|---|---|---|---|
| id | bigint | true | 8 |
| age | bigint | true | 8 |
| weight | int | true | 4 |
| name | varchar(32) | true | 32 |
| msg | varchar(128) | false | 128 |

So the prefix index consists of: id + age + weight + the first 12 bytes of name
key2 example
| Field | Type | Key | Bytes |
|---|---|---|---|
| name | varchar(32) | true | 32 |
| id | bigint | true | 8 |
| age | bigint | true | 8 |
| weight | int | true | 4 |
| msg | varchar(128) | false | 128 |

So the prefix index consists of: the first 20 bytes of name
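The difference between ON conditions and WHERE conditions in a join: the ON conditions are applied first, then the intermediate join result is produced, and finally the WHERE conditions are applied to that result.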
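The order of evaluation matters most for outer joins. The sketch below models a LEFT JOIN as a nested loop in plain Python to show the same condition giving different results in ON and in WHERE. The tables and the `left_join` helper are made up for the illustration and say nothing about how Doris executes joins internally.

```python
def left_join(left_rows, right_rows, on):
    """Nested-loop LEFT JOIN: unmatched left rows are kept, paired with None."""
    joined = []
    for left in left_rows:
        matches = [right for right in right_rows if on(left, right)]
        if matches:
            joined.extend((left, right) for right in matches)
        else:
            joined.append((left, None))
    return joined

table_a = [{"name": "x"}, {"name": "y"}]
table_b = [{"name": "x", "pv": 1}, {"name": "y", "pv": 5}]

# Condition placed in ON: evaluated while rows are being matched,
# so the left row "x" survives with no partner.
in_on = left_join(
    table_a, table_b,
    on=lambda a, b: a["name"] == b["name"] and b["pv"] > 2,
)

# Same condition placed in WHERE: evaluated on the joined result,
# so the row for "x" is filtered out.
joined = left_join(table_a, table_b, on=lambda a, b: a["name"] == b["name"])
in_where = [(a, b) for a, b in joined if b is not None and b["pv"] > 2]

print(len(in_on), len(in_where))
```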
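Apache Doris applies the prefix index to both ON and WHERE conditions. The condition operators must be =, <, >, <=, >=, in, or between, and the conditions must be combined with and.
Only WHERE is discussed here; ON works the same way. The first condition column in WHERE is compared with the first column of the prefix index. If they are the same, the column matches and the comparison moves on to the next column; if they differ, matching stops. Later columns are matched the same way as the first. The examples below illustrate this.
Suppose table tb1 has the following prefix indexes:
Base(k1, k2, k3, k4, k5, k6, k7)
rollup_index(k1, k2, k5)

select * from tb1 where k2 = xxx
No match.
select * from tb1 where k1 = xxx and k2 < xxx and k4 = xxx
Matches k1 and k2 of Base.
select * from tb1 where k1 = xxx and k2 > xxx and k3 in(xxx)
Matches k1, k2 and k3 of Base.
select * from tb1 where k1 = xxx and k2 <= xxx and k5 between xxx and xxx and k6 = xxx
Matches k1, k2 and k5 of rollup_index.
select * from tb1 where k1 = xxx and k2 >= xxx and k5 = xxx
Fully matches rollup_index.
select * from tb1 where k1 = xxx and k2 = xxx
Matches k1 and k2 of Base. When several indexes match equally well, the one created first takes priority.
select * from tb1 where k1 = xxx and k2 = xxx and k3 = xxx and k4 not in xxx
Matches k1, k2 and k3 of Base, because not in is not a supported operator.
select * from tb1 where (k1 = xxx and k2 = xxx) or k3 = xxx
No match, because the conditions are combined with or.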
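The matching rule in the examples above can be modeled in a few lines. This is a simplified sketch of the rule as stated in this section, not the planner's actual implementation: it handles only conditions combined with and, listed in the order they appear in the WHERE clause, and the names `matched_prefix` and `pick_index` are hypothetical.

```python
SUPPORTED_OPS = {"=", "<", ">", "<=", ">=", "in", "between"}

def matched_prefix(index_columns, conditions):
    """Return the leading index columns matched by the conditions.

    conditions is a list of (column, operator) pairs in WHERE order,
    all assumed to be combined with AND.
    """
    matched = []
    for index_col, (cond_col, op) in zip(index_columns, conditions):
        # Stop at the first column mismatch or unsupported operator.
        if cond_col != index_col or op not in SUPPORTED_OPS:
            break
        matched.append(index_col)
    return matched

def pick_index(indexes, conditions):
    """Pick the index with the longest matched prefix.

    indexes maps index name to its columns, in creation order; the
    strict '>' below means the earliest-created index wins ties.
    """
    best_name, best_match = None, []
    for name, columns in indexes.items():
        match = matched_prefix(columns, conditions)
        if len(match) > len(best_match):
            best_name, best_match = name, match
    return best_name, best_match

indexes = {
    "Base": ["k1", "k2", "k3", "k4", "k5", "k6", "k7"],
    "rollup_index": ["k1", "k2", "k5"],
}
print(pick_index(indexes, [("k1", "="), ("k2", "<="), ("k5", "between"), ("k6", "=")]))
```

A query whose conditions are combined with or falls outside this model; per the last example above, it matches no prefix index.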