JSON(JavaScript Object Notation)主要用于互联网应用服务之间的数据交换。MySQL 支持JSON 对象
和JSON 数组
两种类型,JSON 类型是从 MySQL 5.7 版本开始支持的功能(强烈建议MySQL8下使用JSON,性能更佳
),MySQL中使用JSON有以下好处:
无须预定义字段
:字段可以无限拓展,避免了ALTER ADD COLUMN的操作,使用更加灵活。
处理稀疏字段
:避免了稀疏字段的NULL值,避免冗余存储。
支持索引
:相比于字符串格式的JSON,JSON类型支持索引做特定的查询优化。
总体而言,JSON 类型比较适合存储一些列不固定、修改较少、相对静态的数据
,比如用户画像、商品标签、接口数据等,原先大家可能会用Mongo
来存储此类数据,现在MySQL也支持使用了。
/*建表SQL*/
CREATE TABLE student (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
info JSON DEFAULT NULL
);
/*插入json对象,包含多组键值对*/
mysql> INSERT student (info) VALUES ('{"sex": "F", "age": 13, "city": "beijing"}');
mysql> INSERT student (info) VALUES ('{"sex": "M", "age": 14, "city": "suzhou"}');
mysql> INSERT student (info) VALUES ('{"sex": "F", "age": 23, "city": "shenzhen"}');
mysql> select * from student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*插入JSON数组,一般有如下2种形式,也可以为更加复杂的嵌套JSON*/
mysql> INSERT student (info) VALUES ('[1,2,3,4]');
mysql> INSERT student (info) VALUES ('[{"sex": "M"},{"sex":"F", "city":"nanjing"}]');
mysql> select * from student where id in (4,5);
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
因为支持了新的JSON类型,MySQL 配套提供了丰富的 JSON 字段处理函数,用于方便地操作 JSON 数据,最常见的就是函数 JSON_EXTRACT
,它用来从 JSON 数据中提取所需要的字段内容。
2.1. 提取JSON对象
主要是JSON_EXTRACT(根据键提取值)
,JSON_UNQUOTE(去除最外侧的双引号)
,或者->
和 ->>
表达式。
/*JSON_EXTRACT适用于json对象提取*/
mysql> select * from student;
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
/*JSON数组无法直接用JSON_EXTRACT提取数据*/
mysql> select id,
JSON_UNQUOTE(JSON_EXTRACT(info,'$.sex')) ex,
JSON_UNQUOTE(JSON_EXTRACT(info,'$.age')) age,
JSON_UNQUOTE(JSON_EXTRACT(info,'$.city')) city
from student;
+----+------+------+----------+
| id | ex | age | city |
+----+------+------+----------+
| 1 | F | 13 | beijing |
| 2 | M | 14 | suzhou |
| 3 | F | 23 | shenzhen |
| 4 | NULL | NULL | NULL |
| 5 | NULL | NULL | NULL |
+----+------+------+----------+
/*JSON_UNQUOTE不加则不会取出最外侧的双引号*/
mysql> select id,JSON_EXTRACT(info,'$.sex') sex from student;
+----+------+
| id | sex |
+----+------+
| 1 | "F" |
| 2 | "M" |
| 3 | "F" |
| 4 | NULL |
| 5 | NULL |
+----+------+
MySQL 还提供了->
和 ->>
表达式,和上述 SQL 效果一样,其中->
能得到提取结果但是不去除外面的双引号,与JSON_EXTRACT对应, ->>
能得到提取结果且会去除外面的符号,与 JSON_UNQUOTE(JSON_EXTRACT)对应。
->
= JSON_EXTRACT
->>
= JSON_UNQUOTE(JSON_EXTRACT)
mysql> select * from student;
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
/*->>去除最外侧双引号*/
mysql> select id,info->>'$.sex' sex,info->>'$.age' age,info->>'$.city' city from student;
+----+------+------+----------+
| id | sex | age | city |
+----+------+------+----------+
| 1 | F | 13 | beijing |
| 2 | M | 14 | suzhou |
| 3 | F | 23 | shenzhen |
| 4 | NULL | NULL | NULL |
| 5 | NULL | NULL | NULL |
+----+------+------+----------+
/*->不去除最外层双引号*/
mysql> select id,info->'$.sex' sex,info->'$.age' age,info->'$.city' city from student;
+----+------+------+------------+
| id | sex | age | city |
+----+------+------+------------+
| 1 | "F" | 13 | "beijing" |
| 2 | "M" | 14 | "suzhou" |
| 3 | "F" | 23 | "shenzhen" |
| 4 | NULL | NULL | NULL |
| 5 | NULL | NULL | NULL |
+----+------+------+------------+
2.2. 提取JSON数组
mysql> select * from student;
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
/*使用JSON_EXTRACT提取数组*/
mysql> select id,JSON_EXTRACT(info, '$[0]') first from student;
+----+---------------------------------------------+
| id | first |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | 1 |
| 5 | {"sex": "M"} |
+----+---------------------------------------------+
/*使用->>符号提取数组,如果索引位置不存在则返回NULl*/
mysql> select id,info->>'$[0]' first,info->>'$[1]' second from student;
+----+---------------------------------------------+---------------------------------+
| id | first | second |
+----+---------------------------------------------+---------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} | NULL |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} | NULL |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} | NULL |
| 4 | 1 | 2 |
| 5 | {"sex": "M"} | {"sex": "F", "city": "nanjing"} |
+----+---------------------------------------------+---------------------------------+
/*使用->>符号提取数组中的值*/
mysql> select id,info->>'$[0].sex' first from student;
+----+-------+
| id | first |
+----+-------+
| 1 | F |
| 2 | M |
| 3 | F |
| 4 | NULL |
| 5 | M |
+----+-------+
提取JSON后不能用新命名的字段做筛选过滤,需要调用把JSON函数或者符号再写一遍。
3.1. JSON对象过滤
这种方式与原先的非JSON类型条件过滤类似,写法比较简单明了,但是只能用于过滤JSON对象,无法过滤JSON数组。
mysql> select * from student;
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
/*筛选sex是F,age大于14的*/
mysql> select id,info from student WHERE info->>'$.age' > 14 and info->>'$.sex' = 'F';
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
3.2. JSON函数过滤(高级)
MySQL8中针对JOSN类型,新增了部分JSON函数用于数据过滤(包括JSON对象和JSON数组):
MEMBER OF
:匹配某个元素是否存在,返回1表示元素存在,返回0表示元素不存在。
JSON_CONTAINS
:对JSON数组检查一个元素或者多个元素是否存在,对于JSON对象检查指定KEY是否有某个值。
JSON_OVERLAP
:比较两个JSON数组是否至少有一个元素一致,如果是返回1,否则返回0,如果是JSON对象,判断是否是有一对key value一致。
以上函数可以在前面加上NOT关键字就可以取反
MEMBER OF
/*MEMBER OF对JSON数组使用*/
mysql> select * from student;
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
| 2 | [2, 5, 6] |
| 3 | [1, 3] |
+----+----------------+
/*数组中包含3,对嵌套数组不生效*/
mysql> SELECT * FROM student WHERE 3 MEMBER OF(info);
+----+--------+
| id | info |
+----+--------+
| 3 | [1, 3] |
+----+--------+
/*NOT取反,数组中不包含3,对嵌套数组不生效*/
mysql> SELECT * FROM student WHERE not 3 MEMBER OF(info);
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
| 2 | [2, 5, 6] |
+----+----------------+
/*过滤包含[3,4]子数组,JSON_ARRAY(3,4)会生成一个JSON数组[3,4]*/
mysql> SELECT * FROM student WHERE JSON_ARRAY(3, 4) MEMBER OF(info);
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
+----+----------------+
/*MEMBER OF对JSON对象使用*/
mysql> SELECT * FROM student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*过滤出age包含13的数据,对于JSON对象来说,value相当于只有一个元素的数组,因此相当于是个等值匹配*/
mysql> SELECT * FROM student2 WHERE 13 MEMBER OF(info->'$.age');
+----+--------------------------------------------+
| id | info |
+----+--------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
+----+--------------------------------------------+
JSON_CONTAINS
mysql> select * from student;
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
| 2 | [2, 5, 6] |
| 3 | [1, 3] |
+----+----------------+
/*JSON_CONTAINS过滤JSON数组,同时包含2和6*/
mysql> SELECT * FROM student WHERE JSON_CONTAINS(info, '[2, 6]');
+----+-----------+
| id | info |
+----+-----------+
| 2 | [2, 5, 6] |
+----+-----------+
mysql> select * from student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*JSON_CONTAINS过滤JSON对象,字符串要在内加一层双引号,数值不需要*/
mysql> SELECT * FROM student WHERE JSON_CONTAINS(info, '"F"', '$.sex');
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
mysql> SELECT * FROM student WHERE JSON_CONTAINS(info, '13', '$.age');
+----+--------------------------------------------+
| id | info |
+----+--------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
+----+--------------------------------------------+
JSON_OVERLAP
mysql> select * from student;
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
| 2 | [2, 5, 6] |
| 3 | [1, 3] |
+----+----------------+
/*JSON数组中包含3或5元素*/
mysql> SELECT * FROM student WHERE JSON_OVERLAPS(info, '[3, 5]');
+----+-----------+
| id | info |
+----+-----------+
| 2 | [2, 5, 6] |
| 3 | [1, 3] |
+----+-----------+
mysql> select * from student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*JSON对象中包含age=13或sex='F'的键值对*/
mysql> select * from student where JSON_OVERLAPS(info,'{"age": 13,"sex": "F"}');
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
修改数据主要是JSON_SET
,JSON_INSERT
,JSON_REPLACE
三个方法,同时也支持完整的JSON列更新,但是不建议,因为需要把整个JSON拼成一个字符串,相对复杂,而前三种JSON方法可以只针对某个key做更新,相对简单。
JSON_SET:替换现有key的值,插入不存在的key的值。
JSON_INSERT:插入不存在的key的值,已经存在的不修改。
JSON_REPLACE:只替换已存在的key的值,不存在的不做插入。
删除使用JSON_REMOVE
,在JSON对象中指定key删除,或在JSON数组中指定下标删除。
mysql> select * from student where id = 1;
+----+--------------------------------------------+
| id | info |
+----+--------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
+----+--------------------------------------------+
/*完整的列更新*/
mysql> update student set info='{"age": 14, "sex": "F", "city": "beijing"}' where id = 1;
mysql> select * from student where id = 1;
+----+--------------------------------------------+------+
| id | info | age |
+----+--------------------------------------------+------+
| 1 | {"age": 14, "sex": "F", "city": "beijing"} | 14 |
+----+--------------------------------------------+------+
/*JSON_SET,不存在则插入,有则替换*/
mysql> UPDATE student SET info = JSON_SET(info, "$.city", 'wuxi', "$.height", 123) WHERE id = 1;
mysql> select * from student where id = 1;
+----+--------------------------------------------------------+
| id | info |
+----+--------------------------------------------------------+
| 1 | {"age": 14, "sex": "F", "city": "wuxi", "height": 123} |
+----+--------------------------------------------------------+
/*JSON_INSERT,只会插入不存在的值*/
mysql> select * from student where id = 2;
+----+-------------------------------------------+
| id | info |
+----+-------------------------------------------+
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
+----+-------------------------------------------+
mysql> UPDATE student SET info = JSON_INSERT(info, '$.city', 'wuxi', '$.height', 123) WHERE id = 2;
mysql> select * from student where id = 2;
+----+----------------------------------------------------------+
| id | info |
+----+----------------------------------------------------------+
| 2 | {"age": 14, "sex": "M", "city": "suzhou", "height": 123} |
+----+----------------------------------------------------------+
/*JSON_REPLACE,只会替换已有值*/
mysql> select * from student where id = 3;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
mysql> UPDATE student SET info = JSON_REPLACE(info, '$.city', 'wuxi', '$.height', 123) WHERE id = 3;
mysql> select * from student where id = 3;
+----+-----------------------------------------+
| id | info |
+----+-----------------------------------------+
| 3 | {"age": 23, "sex": "F", "city": "wuxi"} |
+----+-----------------------------------------+
/*在JSON对象中指定key删除*/
mysql> select * from student where id = 3;
+----+-----------------------------------------+
| id | info |
+----+-----------------------------------------+
| 3 | {"age": 23, "sex": "F", "city": "wuxi"} |
+----+-----------------------------------------+
mysql> UPDATE student SET info = JSON_REMOVE(info, '$.age') WHERE id = 3;
mysql> select * from student where id = 3;
+----+------------------------------+
| id | info |
+----+------------------------------+
| 3 | {"sex": "F", "city": "wuxi"} |
+----+------------------------------+
/*在JSON数组中指定下标删除*/
mysql> select * from student where id in (4,5);
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
mysql> UPDATE student SET info = JSON_REMOVE(info, '$[1]') WHERE id in (4,5);
mysql> select * from student where id in (4,5);
+----+----------------+
| id | info |
+----+----------------+
| 4 | [1, 3, 4] |
| 5 | [{"sex": "M"}] |
+----+----------------+
5.1. 虚拟列索引
当 JSON 数据量非常大,用户希望对 JSON 数据进行有效检索时,可以利用 MySQL 的函数索引功能对 JSON 中的某个字段进行索引,具体方式是先创建一个虚拟列,再对虚拟列创建索引
。
/*原本执行计划走的全表*/
mysql> explain select * from student where info->>"$.age" = 13 ;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 5 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
/*创建age虚拟列*/
mysql> ALTER TABLE student ADD COLUMN age INT as (info->>"$.age");
/*创建age索引*/
mysql> create index idx_age on student(age);
/*执行计划走新加的索引*/
mysql> explain select * from student where info->>"$.age" = 13 ;
+----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | student | NULL | ref | idx_age | idx_age | 5 | const | 1 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+
5.2. Multi-Valued Index
从MySQL8.0.17开始,InnoDB支持多值索引(Multi-Valued Index)
。多值索引是在存储JSON数组
的列上定义的辅助索引,对于MEMBER OF,JSON_CONTAINS,JSON_OVERLAPS 等函数可以利用多值索引进行性能优化。
JSON对象的使用
(多值索引是官方针对JSON数组的辅助索引,但是根据实践也可以针对JSON对象使用,但是只适用于member of`函数)
mysql> select * from student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*没创建多值索引前走的是全表扫描*/
mysql> explain select * from student where 13 member of(info->'$.age');
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student2 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
/*创建info->'$.age'的多值索引*/
mysql> alter table student add index idx_age((cast(info->'$.age' as unsigned array)));
/*memberof函数可以走多值索引*/
mysql> explain select * from student where 13 member of(info->'$.age');
+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | student2 | NULL | ref | idx_age | idx_age | 9 | const | 1 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
/*指定JSON对象过滤无法利用多值索引*/
mysql> explain select * from student where info->'$.age' = 13;
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student2 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
/*JSON_CONTAINS函数无法利用多值索引过滤JSON对象*/
mysql> explain SELECT * FROM student2 where JSON_CONTAINS(info, '13', '$.age');
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student2 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
/*JSON_OVERLAPS函数也无法利用多值索引过滤JSON对象*/
mysql> explain select * from student where JSON_OVERLAPS(info,'{"age": 13}');
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------
JSON数组的使用
:三种JSON函数都可以利用多值索引。
mysql> select * from student;
+----+--------------+
| id | info |
+----+--------------+
| 1 | [1, 2, 5, 9] |
| 2 | [2, 5, 6, 8] |
| 3 | [5, 3, 8, 9] |
| 4 | [1, 2, 7, 8] |
+----+--------------+
/*没创建多值索引前走的是全表扫描*/
mysql> EXPLAIN SELECT * FROM student WHERE JSON_CONTAINS(info, '[3]');
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
/*创建info列的多值索引,注意如果这个地方改为idx_info((cast((info->"$") as unsigned array))),则后续所有的函数都要是info->"$",否则走不了索引*/
mysql> ALTER TABLE student ADD INDEX idx_info((cast(info as unsigned array)));
/*JSON_CONTAINS函数可以走新建的多值索引*/
mysql> EXPLAIN SELECT * FROM student WHERE JSON_CONTAINS(info, '[3]');
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | range | idx_info | idx_info | 9 | NULL | 4 | 100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
/*JSON_CONTAINS函数可以走新建的多值索引*/
mysql> explain SELECT * FROM student WHERE JSON_CONTAINS(info, '[3]');
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | range | idx_info | idx_info | 9 | NULL | 1 | 100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
/*MEMBER OF函数可以走新建的多值索引*/
mysql> explain SELECT * FROM student WHERE 3 MEMBER OF(info);
+----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | ref | idx_info | idx_info | 9 | const | 1 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------------+
这里列举一个MySQL JSON最常见的应用场景,即用户画像(也就是给用户打标签)
,大家可以作为参考,看下实际业务场景中是否有此类需求。
错误的设计
:用逗号作为分隔符存储用户标签。
正确的设计
:用JSON存储用户标签。
利用函数MEMBER OF、JSON_CONTAINS、JSON_OVERLAP进行用户画像的搜索。
--查询80后且爱看电影的用户
select * from usertag where json_contains(userTags,'[2,10]');
--查询画像为80后、90后或常看电影的用户
select * from usertag where json_overlaps(userTags,'[2,3,10]');
7.1. sysbench
我们这里用sysbench压测工具,对原表和JSON表(原表的k,c,pad三个列用一个json格式的info列替代,另外添加k的虚拟列索引
)分别进行压测,看下JSON的性能表现。
--服务器:8核64G SSD
--压测工具:sysbench(10张表,单表10w数据,8线程,4个压测场景 )
--原表
CREATE TABLE `sbtest1` (
`id` int NOT NULL AUTO_INCREMENT,
`k` int NOT NULL DEFAULT '0',
`c` char(120) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
`pad` char(60) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
--json表
CREATE TABLE `sbtest1` (
`id` int NOT NULL AUTO_INCREMENT,
`info` json DEFAULT NULL,
`k` int GENERATED ALWAYS AS (json_unquote(json_extract(`info`,_utf8mb4'$.k'))) VIRTUAL,
PRIMARY KEY (`id`),
KEY `idx_k` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
--场景一:point-select 根据主键索引查询JSON
SELECT c FROM sbtest1 WHERE id=?
SELECT info->'$.c' FROM sbtest1_json WHERE id=?
--场景二:select_random_points 根据虚拟列索引查询(json类型用prepare会导致索引失效,所以这里用虚拟列替代)
SELECT id, k, c, pad FROM sbtest1 WHERE k in(?)
SELECT id, info FROM sbtest1_json WHERE k in(?)
--场景三:oltp_update_index 根据主键更新JSON
UPDATE sbtest1 SET k=k+1 WHERE id=?
UPDATE sbtest1_json SET info = JSON_SET(info, "$.k", convert(info->'$.k',signed) + 1) WHERE id=?
--场景四:oltp_insert json数据写入
INSERT INTO sbtest1 (id, k, c, pad) VALUES (?,?,?,?)
INSERT INTO sbtest1_json (id, info) VALUES (?,'{"k":?,"c":?,"pad":?}')
场景 | 原表(qps) | JSON表(qps) |
point-select | 100973 | 92521 |
select_random_points | 77922 | 77331 |
oltp_update_index | 1332 | 1301 |
oltp_insert | 1420 | 1400 |
总结:整体性能测试下来,JSON性能与原表相比无特大出入。
7.2. 大量KV测试
JSON列中包含大量KV的读写性能测试:
--表结构
CREATE TABLE `sbtest1` (
`id` int NOT NULL AUTO_INCREMENT,
`info` json DEFAULT NULL,
`col1` int GENERATED ALWAYS AS (json_unquote(json_extract(`info`,_utf8mb4'$.col1'))) VIRTUAL,
PRIMARY KEY (`id`),
KEY `idx_col1` (`col1`)
) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
--场景二:select_random_points 根据虚拟列索引查询
select id,info->'$.col1',info->'$.col2',info->'$.col3',info->'$.col4',info->'$.col5' FROM sbtest1 WHERE col1 in (?)
--场景三:oltp_update_index 根据主键更新JSON
UPDATE sbtest1 SET info = JSON_SET(info, "$.col1", convert(info->'$.col1',signed) + 1) WHERE id=?
场景 | 100列 | 200列 | 300列 |
select_random_points | 76494 | 75519 | 74766 |
oltp_update_index | 1013 | 927 | 869 |
总结:JSON读写性能会随着KV数增加而略微下降。