从Mysql8.0.17开始,支持在json列上添加多值索引。多值索引会为一条记录添加多条索引记录,查找时,通过索引能快速定位到记录。
要使用多值索引,先通过select version()看一下版本是否支持。
Mysql的json类型是schemaless的,可以插入任意符合json格式的数据,包括数组和对象。
例如,有一张用户表,包括用户标签(tag)、扩展信息(extInfo)都是json类型。tag存放的是字符串数组,extInfo存放的是一个对象,包含age(数值)和address(字符串)。
create table demo_user
(
name varchar(64),
tag json,
extInfo json
);
在不加索引的情况下,可以插入任意符合json格式的数据,每行记录的内容都可以不一样。
insert into demo_user(name, tag, extInfo) values ('小明', '["t1","t2"]', '{"age": 20, "address": "a1"}');
insert into demo_user(name, tag, extInfo) values ('小红', '["t1","t2"]', '{"age": 20, "address": 20}');
insert into demo_user(name, tag, extInfo) values ('小李', '["t1","t2"]', '{"age": "aaa", "address": "a1"}');
insert into demo_user(name, tag, extInfo) values ('小花', '{"age": "aaa", "address": "a1"}', '{"age": "aaa", "address": "a1"}');
select * from demo_user;
+------+-------------------------------+-------------------------------+
|name |tag |extInfo |
+------+-------------------------------+-------------------------------+
|小明 |["t1", "t2"] |{"age": 20, "address": "a1"} |
|小红 |["t1", "t2"] |{"age": 20, "address": 20} |
|小李 |["t1", "t2"] |{"age": "aaa", "address": "a1"}|
|小花 |{"age": "aaa", "address": "a1"}|{"age": "aaa", "address": "a1"}|
+------+-------------------------------+-------------------------------+
还没添加索引,索引按照tag、age、adress查询是会走全表扫描。
explain select * from demo_user where json_contains(tag, '"t1"');
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra |
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|1 |SIMPLE |demo_user|null |ALL |null |null|null |null|4 |100 |Using where|
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
explain select * from demo_user where json_contains(extInfo->'$.age', '20');
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra |
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|1 |SIMPLE |demo_user|null |ALL |null |null|null |null|4 |100 |Using where|
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
explain select * from demo_user where json_contains(extInfo->'$.address', '"a1"');
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra |
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
|1 |SIMPLE |demo_user|null |ALL |null |null|null |null|4 |100 |Using where|
+--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
添加多值索引的方式和添加普通索引类似,可以通过create table、alter table或create index添加索引,唯一的区别是对索引列的描述有差别。json是一个schemaless,在创建索引的时候要明确要为哪个key建立索引,并且告知列是什么类型。
通过cast(expr AS type [ARRAY])的方式将json转成指定类型的ARRAY,column是需要加索引的json列,type是json里key的类型。
type的可选值有很多,我们需要用到字符串和整形,其他的在官方文档里有描述[cast函数]
-- 将tag里的值转成字符类型
alter table demo_user add index tag( (cast(tag as char(64) array)) );
-- 将extInfo里的age转成无符号整形
alter table demo_user add index age( (cast(extInfo->'$.age' as unsigned array)) );
-- 将extInfo里的address转成字符类型
alter table demo_user add index address( (cast(extInfo->'$.address' as char(255) array)) );
这时候,因为之前插入的数据可能无法转成对应的类型,就会在建立索引的时候报错。
例如,在给tag添加索引时,因为有第四条记录的tag值是{"age": "aaa", "address": "a1"},所以提示JSON Object类型无法转成array类型。
alter table demo_user add index tag( (cast(tag as char(64) array)) );
---
[42000][1235] This version of MySQL doesn't yet support 'CAST-ing JSON OBJECT type to array'
给age添加索引时,提示数据截断,因为第三条里age值为"aaa",转成整数是为空。
alter table demo_user add index age( (cast(extInfo->'$.age' as unsigned array)) );
---
[22001][3903] Data truncation: Invalid JSON value for CAST for functional index 'age'.
当我们把数据修正后就能正常添加上索引,再看看执行explain看一下查询是否走索引:
explain select * from demo_user where json_contains(tag, '"t1"');
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|id|select_type|table |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra |
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|1 |SIMPLE |demo_user|null |range|tag |tag|259 |null|4 |100 |Using where|
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
explain select * from demo_user where json_contains(extInfo->'$.age', '20');
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|id|select_type|table |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra |
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|1 |SIMPLE |demo_user|null |range|age |age|9 |null|2 |100 |Using where|
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
explain select * from demo_user where json_contains(extInfo->'$.address', '"a1"');
+--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+
|id|select_type|table |partitions|type |possible_keys|key |key_len|ref |rows|filtered|Extra |
+--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+
|1 |SIMPLE |demo_user|null |range|address |address|1023 |null|3 |100 |Using where|
+--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+
可以看到3个查询语句都走到了索引。
除了json_contains能走索引,member of()和json_overlaps()也能用到索引,但是这几个函数的功能是不一样的,具体要查询文档说明。
explain select * from demo_user where 't1' member of (tag);
+--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+
|id|select_type|table |partitions|type|possible_keys|key|key_len|ref |rows|filtered|Extra |
+--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+
|1 |SIMPLE |demo_user|null |ref |tag |tag|259 |const|4 |100 |Using where|
+--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+
explain select * from demo_user where JSON_OVERLAPS(tag, '"t1"');
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|id|select_type|table |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra |
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
|1 |SIMPLE |demo_user|null |range|tag |tag|259 |null|4 |100 |Using where|
+--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
json列能添加多值索引,提高了json的查询性能。同时因为索引要指定列的类型,索引列就不能使用任意类型,这会失去一部分schemaless的能力。需要注意的是,使用cast函数添加索引时,要选择合适type,不然会出现类型转换失败的异常。
JSON类型文档:MySQL :: MySQL 8.3 Reference Manual :: 11.5 The JSON Data Type
多值索引文档:MySQL :: MySQL 8.3 Reference Manual :: 13.1.15 CREATE INDEX Statement
cast函数文档:MySQL :: MySQL 8.3 Reference Manual :: 12.10 Cast Functions and Operators