MySQL JSON数据类型

一. JSON数据类型

JSON(JavaScript Object Notation)主要用于互联网应用服务之间的数据交换。MySQL 支持JSON 对象JSON 数组两种类型,JSON 类型是从 MySQL 5.7 版本开始支持的功能(强烈建议MySQL8下使用JSON,性能更佳),MySQL中使用JSON有以下好处:

  • 无须预定义字段:字段可以无限拓展,避免了ALTER ADD COLUMN的操作,使用更加灵活。

  • 处理稀疏字段:避免了稀疏字段的NULL值,避免冗余存储。

  • 支持索引:相比于字符串格式的JSON,JSON类型支持索引做特定的查询优化。

总体而言,JSON 类型比较适合存储一些列不固定、修改较少、相对静态的数据,比如用户画像、商品标签、接口数据等,原先大家可能会用Mongo来存储此类数据,现在MySQL也支持使用了。

/*建表SQL*/
CREATE TABLE student (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
info JSON DEFAULT NULL
);
/*插入json对象,包含多组键值对*/
mysql> INSERT student (info) VALUES ('{"sex": "F", "age": 13, "city": "beijing"}');
mysql> INSERT student (info) VALUES ('{"sex": "M", "age": 14, "city": "suzhou"}');
mysql> INSERT student (info) VALUES ('{"sex": "F", "age": 23, "city": "shenzhen"}');
mysql> select * from student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*插入JSON数组,一般有如下2种形式,也可以为更加复杂的嵌套JSON*/
mysql> INSERT student (info) VALUES ('[1,2,3,4]');
mysql> INSERT student (info) VALUES ('[{"sex": "M"},{"sex":"F", "city":"nanjing"}]');
mysql> select * from student where id in (4,5);
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+

二. JSON数据查询

因为支持了新的JSON类型,MySQL 配套提供了丰富的 JSON 字段处理函数,用于方便地操作 JSON 数据,最常见的就是函数 JSON_EXTRACT,它用来从 JSON 数据中提取所需要的字段内容。

2.1. 提取JSON对象

主要是JSON_EXTRACT(根据键提取值)JSON_UNQUOTE(去除最外侧的双引号),或者-> 和 ->> 表达式。

/*JSON_EXTRACT适用于json对象提取*/
mysql> select * from student;
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
/*JSON数组无法直接用JSON_EXTRACT提取数据*/
mysql> select id,
JSON_UNQUOTE(JSON_EXTRACT(info,'$.sex')) ex,
JSON_UNQUOTE(JSON_EXTRACT(info,'$.age')) age,
JSON_UNQUOTE(JSON_EXTRACT(info,'$.city')) city
from student;
+----+------+------+----------+
| id | ex | age | city |
+----+------+------+----------+
| 1 | F | 13 | beijing |
| 2 | M | 14 | suzhou |
| 3 | F | 23 | shenzhen |
| 4 | NULL | NULL | NULL |
| 5 | NULL | NULL | NULL |
+----+------+------+----------+
/*JSON_UNQUOTE不加则不会取出最外侧的双引号*/
mysql> select id,JSON_EXTRACT(info,'$.sex') sex from student;
+----+------+
| id | sex |
+----+------+
| 1 | "F" |
| 2 | "M" |
| 3 | "F" |
| 4 | NULL |
| 5 | NULL |
+----+------+

MySQL 还提供了-> 和 ->> 表达式,和上述 SQL 效果一样,其中->能得到提取结果但是不去除外面的双引号,与JSON_EXTRACT对应, ->>能得到提取结果且会去除外面的符号,与 JSON_UNQUOTE(JSON_EXTRACT)对应。

  • -> = JSON_EXTRACT

  • ->> = JSON_UNQUOTE(JSON_EXTRACT)

mysql> select * from student;
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
/*->>去除最外侧双引号*/
mysql> select id,info->>'$.sex' sex,info->>'$.age' age,info->>'$.city' city from student;
+----+------+------+----------+
| id | sex | age | city |
+----+------+------+----------+
| 1 | F | 13 | beijing |
| 2 | M | 14 | suzhou |
| 3 | F | 23 | shenzhen |
| 4 | NULL | NULL | NULL |
| 5 | NULL | NULL | NULL |
+----+------+------+----------+
/*->不去除最外层双引号*/
mysql> select id,info->'$.sex' sex,info->'$.age' age,info->'$.city' city from student;
+----+------+------+------------+
| id | sex | age | city |
+----+------+------+------------+
| 1 | "F" | 13 | "beijing" |
| 2 | "M" | 14 | "suzhou" |
| 3 | "F" | 23 | "shenzhen" |
| 4 | NULL | NULL | NULL |
| 5 | NULL | NULL | NULL |
+----+------+------+------------+

2.2. 提取JSON数组

mysql> select * from student;
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
/*使用JSON_EXTRACT提取数组*/
mysql> select id,JSON_EXTRACT(info, '$[0]') first from student;
+----+---------------------------------------------+
| id | first |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | 1 |
| 5 | {"sex": "M"} |
+----+---------------------------------------------+
/*使用->>符号提取数组,如果索引位置不存在则返回NULl*/
mysql> select id,info->>'$[0]' first,info->>'$[1]' second from student;
+----+---------------------------------------------+---------------------------------+
| id | first | second |
+----+---------------------------------------------+---------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} | NULL |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} | NULL |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} | NULL |
| 4 | 1 | 2 |
| 5 | {"sex": "M"} | {"sex": "F", "city": "nanjing"} |
+----+---------------------------------------------+---------------------------------+
/*使用->>符号提取数组中的值*/
mysql> select id,info->>'$[0].sex' first from student;
+----+-------+
| id | first |
+----+-------+
| 1 | F |
| 2 | M |
| 3 | F |
| 4 | NULL |
| 5 | M |
+----+-------+

三. JSON数据过滤

提取JSON后不能用新命名的字段做筛选过滤,需要调用把JSON函数或者符号再写一遍。

3.1. JSON对象过滤

这种方式与原先的非JSON类型条件过滤类似,写法比较简单明了,但是只能用于过滤JSON对象,无法过滤JSON数组。

mysql> select * from student;
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+
/*筛选sex是F,age大于14的*/
mysql> select id,info from student WHERE info->>'$.age' > 14 and info->>'$.sex' = 'F';
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+

3.2. JSON函数过滤(高级)

MySQL8中针对JOSN类型,新增了部分JSON函数用于数据过滤(包括JSON对象和JSON数组):

  • MEMBER OF:匹配某个元素是否存在,返回1表示元素存在,返回0表示元素不存在。

  • JSON_CONTAINS:对JSON数组检查一个元素或者多个元素是否存在,对于JSON对象检查指定KEY是否有某个值。

  • JSON_OVERLAP:比较两个JSON数组是否至少有一个元素一致,如果是返回1,否则返回0,如果是JSON对象,判断是否是有一对key value一致。

以上函数可以在前面加上NOT关键字就可以取反

MEMBER OF

/*MEMBER OF对JSON数组使用*/
mysql> select * from student;
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
| 2 | [2, 5, 6] |
| 3 | [1, 3] |
+----+----------------+
/*数组中包含3,对嵌套数组不生效*/
mysql> SELECT * FROM student WHERE 3 MEMBER OF(info);
+----+--------+
| id | info |
+----+--------+
| 3 | [1, 3] |
+----+--------+
/*NOT取反,数组中不包含3,对嵌套数组不生效*/
mysql> SELECT * FROM student WHERE not 3 MEMBER OF(info);
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
| 2 | [2, 5, 6] |
+----+----------------+
/*过滤包含[3,4]子数组,JSON_ARRAY(3,4)会生成一个JSON数组[3,4]*/
mysql> SELECT * FROM student WHERE JSON_ARRAY(3, 4) MEMBER OF(info);
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
+----+----------------+
/*MEMBER OF对JSON对象使用*/
mysql> SELECT * FROM student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*过滤出age包含13的数据,对于JSON对象来说,value相当于只有一个元素的数组,因此相当于是个等值匹配*/
mysql> SELECT * FROM student2 WHERE 13 MEMBER OF(info->'$.age');
+----+--------------------------------------------+
| id | info |
+----+--------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
+----+--------------------------------------------+

JSON_CONTAINS

mysql> select * from student;
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
| 2 | [2, 5, 6] |
| 3 | [1, 3] |
+----+----------------+
/*JSON_CONTAINS过滤JSON数组,同时包含2和6*/
mysql> SELECT * FROM student WHERE JSON_CONTAINS(info, '[2, 6]');
+----+-----------+
| id | info |
+----+-----------+
| 2 | [2, 5, 6] |
+----+-----------+
mysql> select * from student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*JSON_CONTAINS过滤JSON对象,字符串要在内加一层双引号,数值不需要*/
mysql> SELECT * FROM student WHERE JSON_CONTAINS(info, '"F"', '$.sex');
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
mysql> SELECT * FROM student WHERE JSON_CONTAINS(info, '13', '$.age');
+----+--------------------------------------------+
| id | info |
+----+--------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
+----+--------------------------------------------+

JSON_OVERLAP
​​​​​​​

mysql> select * from student;
+----+----------------+
| id | info |
+----+----------------+
| 1 | [1, 2, [3, 4]] |
| 2 | [2, 5, 6] |
| 3 | [1, 3] |
+----+----------------+
/*JSON数组中包含3或5元素*/
mysql> SELECT * FROM student WHERE JSON_OVERLAPS(info, '[3, 5]');
+----+-----------+
| id | info |
+----+-----------+
| 2 | [2, 5, 6] |
| 3 | [1, 3] |
+----+-----------+
mysql> select * from student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+
/*JSON对象中包含age=13或sex='F'的键值对*/
mysql> select * from student where JSON_OVERLAPS(info,'{"age": 13,"sex": "F"}');
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+

四. JSON数据修改

修改数据主要是JSON_SETJSON_INSERTJSON_REPLACE三个方法,同时也支持完整的JSON列更新,但是不建议,因为需要把整个JSON拼成一个字符串,相对复杂,而前三种JSON方法可以只针对某个key做更新,相对简单。

  • JSON_SET:替换现有key的值,插入不存在的key的值。

  • JSON_INSERT:插入不存在的key的值,已经存在的不修改。

  • JSON_REPLACE:只替换已存在的key的值,不存在的不做插入。

删除使用JSON_REMOVE,在JSON对象中指定key删除,或在JSON数组中指定下标删除。

mysql> select * from student where id = 1;
+----+--------------------------------------------+
| id | info |
+----+--------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
+----+--------------------------------------------+

/*完整的列更新*/
mysql> update student set info='{"age": 14, "sex": "F", "city": "beijing"}' where id = 1;
mysql> select * from student where id = 1;
+----+--------------------------------------------+------+
| id | info | age |
+----+--------------------------------------------+------+
| 1 | {"age": 14, "sex": "F", "city": "beijing"} | 14 |
+----+--------------------------------------------+------+

/*JSON_SET,不存在则插入,有则替换*/
mysql> UPDATE student SET info = JSON_SET(info, "$.city", 'wuxi', "$.height", 123) WHERE id = 1;
mysql> select * from student where id = 1;
+----+--------------------------------------------------------+
| id | info |
+----+--------------------------------------------------------+
| 1 | {"age": 14, "sex": "F", "city": "wuxi", "height": 123} |
+----+--------------------------------------------------------+


/*JSON_INSERT,只会插入不存在的值*/
mysql> select * from student where id = 2;
+----+-------------------------------------------+
| id | info |
+----+-------------------------------------------+
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
+----+-------------------------------------------+

mysql> UPDATE student SET info = JSON_INSERT(info, '$.city', 'wuxi', '$.height', 123) WHERE id = 2;

mysql> select * from student where id = 2;
+----+----------------------------------------------------------+
| id | info |
+----+----------------------------------------------------------+
| 2 | {"age": 14, "sex": "M", "city": "suzhou", "height": 123} |
+----+----------------------------------------------------------+


/*JSON_REPLACE,只会替换已有值*/
mysql> select * from student where id = 3;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+

mysql> UPDATE student SET info = JSON_REPLACE(info, '$.city', 'wuxi', '$.height', 123) WHERE id = 3;

mysql> select * from student where id = 3;
+----+-----------------------------------------+
| id | info |
+----+-----------------------------------------+
| 3 | {"age": 23, "sex": "F", "city": "wuxi"} |
+----+-----------------------------------------+


/*在JSON对象中指定key删除*/
mysql> select * from student where id = 3;
+----+-----------------------------------------+
| id | info |
+----+-----------------------------------------+
| 3 | {"age": 23, "sex": "F", "city": "wuxi"} |
+----+-----------------------------------------+

mysql> UPDATE student SET info = JSON_REMOVE(info, '$.age') WHERE id = 3;

mysql> select * from student where id = 3;
+----+------------------------------+
| id | info |
+----+------------------------------+
| 3 | {"sex": "F", "city": "wuxi"} |
+----+------------------------------+


/*在JSON数组中指定下标删除*/
mysql> select * from student where id in (4,5);
+----+-------------------------------------------------+
| id | info |
+----+-------------------------------------------------+
| 4 | [1, 2, 3, 4] |
| 5 | [{"sex": "M"}, {"sex": "F", "city": "nanjing"}] |
+----+-------------------------------------------------+

mysql> UPDATE student SET info = JSON_REMOVE(info, '$[1]') WHERE id in (4,5);

mysql> select * from student where id in (4,5);
+----+----------------+
| id | info |
+----+----------------+
| 4 | [1, 3, 4] |
| 5 | [{"sex": "M"}] |
+----+----------------+

五. JSON索引使用

5.1. 虚拟列索引

当 JSON 数据量非常大,用户希望对 JSON 数据进行有效检索时,可以利用 MySQL 的函数索引功能对 JSON 中的某个字段进行索引,具体方式是先创建一个虚拟列,再对虚拟列创建索引

/*原本执行计划走的全表*/

mysql> explain select * from student where info->>"$.age" = 13 ;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 5 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+


/*创建age虚拟列*/

mysql> ALTER TABLE student ADD COLUMN age INT as (info->>"$.age");


/*创建age索引*/

mysql> create index idx_age on student(age);


/*执行计划走新加的索引*/

mysql> explain select * from student where info->>"$.age" = 13 ;
+----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | student | NULL | ref | idx_age | idx_age | 5 | const | 1 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+

5.2. Multi-Valued Index

从MySQL8.0.17开始,InnoDB支持多值索引(Multi-Valued Index)。多值索引是在存储JSON数组的列上定义的辅助索引,对于MEMBER OF,JSON_CONTAINS,JSON_OVERLAPS 等函数可以利用多值索引进行性能优化。

JSON对象的使用(多值索引是官方针对JSON数组的辅助索引,但是根据实践也可以针对JSON对象使用,但是只适用于member of`函数)

mysql> select * from student;
+----+---------------------------------------------+
| id | info |
+----+---------------------------------------------+
| 1 | {"age": 13, "sex": "F", "city": "beijing"} |
| 2 | {"age": 14, "sex": "M", "city": "suzhou"} |
| 3 | {"age": 23, "sex": "F", "city": "shenzhen"} |
+----+---------------------------------------------+


/*没创建多值索引前走的是全表扫描*/
mysql> explain select * from student where 13 member of(info->'$.age');
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student2 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+


/*创建info->'$.age'的多值索引*/
mysql> alter table student add index idx_age((cast(info->'$.age' as unsigned array)));


/*memberof函数可以走多值索引*/
mysql> explain select * from student where 13 member of(info->'$.age');
+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | student2 | NULL | ref | idx_age | idx_age | 9 | const | 1 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------------+


/*指定JSON对象过滤无法利用多值索引*/
mysql> explain select * from student where info->'$.age' = 13;
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student2 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+


/*JSON_CONTAINS函数无法利用多值索引过滤JSON对象*/
mysql> explain SELECT * FROM student2 where JSON_CONTAINS(info, '13', '$.age');
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student2 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+


/*JSON_OVERLAPS函数也无法利用多值索引过滤JSON对象*/
mysql> explain select * from student where JSON_OVERLAPS(info,'{"age": 13}');
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------

JSON数组的使用:三种JSON函数都可以利用多值索引。

mysql> select * from student;
+----+--------------+
| id | info |
+----+--------------+
| 1 | [1, 2, 5, 9] |
| 2 | [2, 5, 6, 8] |
| 3 | [5, 3, 8, 9] |
| 4 | [1, 2, 7, 8] |
+----+--------------+


/*没创建多值索引前走的是全表扫描*/
mysql> EXPLAIN SELECT * FROM student WHERE JSON_CONTAINS(info, '[3]');
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+


/*创建info列的多值索引,注意如果这个地方改为idx_info((cast((info->"$") as unsigned array))),则后续所有的函数都要是info->"$",否则走不了索引*/
mysql> ALTER TABLE student ADD INDEX idx_info((cast(info as unsigned array)));


/*JSON_CONTAINS函数可以走新建的多值索引*/
mysql> EXPLAIN SELECT * FROM student WHERE JSON_CONTAINS(info, '[3]');
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | range | idx_info | idx_info | 9 | NULL | 4 | 100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+


/*JSON_CONTAINS函数可以走新建的多值索引*/
mysql> explain SELECT * FROM student WHERE JSON_CONTAINS(info, '[3]');
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | range | idx_info | idx_info | 9 | NULL | 1 | 100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-------------+


/*MEMBER OF函数可以走新建的多值索引*/
mysql> explain SELECT * FROM student WHERE 3 MEMBER OF(info);
+----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | student | NULL | ref | idx_info | idx_info | 9 | const | 1 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------------+

六. JSON应用场景

这里列举一个MySQL JSON最常见的应用场景,即用户画像(也就是给用户打标签),大家可以作为参考,看下实际业务场景中是否有此类需求。

错误的设计:用逗号作为分隔符存储用户标签。

MySQL JSON数据类型_第1张图片

正确的设计:用JSON存储用户标签。

MySQL JSON数据类型_第2张图片

利用函数MEMBER OF、JSON_CONTAINS、JSON_OVERLAP进行用户画像的搜索。

--查询80后且爱看电影的用户
select * from usertag where json_contains(userTags,'[2,10]');
--查询画像为80后、90后或常看电影的用户
select * from usertag where json_overlaps(userTags,'[2,3,10]');

七. JSON性能测试

7.1. sysbench

我们这里用sysbench压测工具,对原表和JSON表(原表的k,c,pad三个列用一个json格式的info列替代,另外添加k的虚拟列索引)分别进行压测,看下JSON的性能表现。

--服务器:8核64G SSD

--压测工具:sysbench(10张表,单表10w数据,8线程,4个压测场景 )

--原表

CREATE TABLE `sbtest1` (
`id` int NOT NULL AUTO_INCREMENT,
`k` int NOT NULL DEFAULT '0',
`c` char(120) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
`pad` char(60) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

--json表

CREATE TABLE `sbtest1` (
`id` int NOT NULL AUTO_INCREMENT,
`info` json DEFAULT NULL,
`k` int GENERATED ALWAYS AS (json_unquote(json_extract(`info`,_utf8mb4'$.k'))) VIRTUAL,
PRIMARY KEY (`id`),
KEY `idx_k` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

--场景一:point-select 根据主键索引查询JSON

SELECT c FROM sbtest1 WHERE id=?

SELECT info->'$.c' FROM sbtest1_json WHERE id=?

--场景二:select_random_points 根据虚拟列索引查询(json类型用prepare会导致索引失效,所以这里用虚拟列替代)

SELECT id, k, c, pad FROM sbtest1 WHERE k in(?)

SELECT id, info FROM sbtest1_json WHERE k in(?)

--场景三:oltp_update_index 根据主键更新JSON ​​​​​​​

UPDATE sbtest1 SET k=k+1 WHERE id=?

UPDATE sbtest1_json SET info = JSON_SET(info, "$.k", convert(info->'$.k',signed) + 1) WHERE id=?

--场景四:oltp_insert json数据写入

INSERT INTO sbtest1 (id, k, c, pad) VALUES (?,?,?,?)

INSERT INTO sbtest1_json (id, info) VALUES (?,'{"k":?,"c":?,"pad":?}')
场景 原表(qps) JSON表(qps)
point-select 100973 92521
select_random_points 77922 77331
oltp_update_index 1332 1301
oltp_insert 1420 1400

总结:整体性能测试下来,JSON性能与原表相比无特大出入。

7.2. 大量KV测试

JSON列中包含大量KV的读写性能测试:

--表结构

CREATE TABLE `sbtest1` (
`id` int NOT NULL AUTO_INCREMENT,
`info` json DEFAULT NULL,
`col1` int GENERATED ALWAYS AS (json_unquote(json_extract(`info`,_utf8mb4'$.col1'))) VIRTUAL,
PRIMARY KEY (`id`),
KEY `idx_col1` (`col1`)
) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

--场景二:select_random_points 根据虚拟列索引查询

select id,info->'$.col1',info->'$.col2',info->'$.col3',info->'$.col4',info->'$.col5' FROM sbtest1 WHERE col1 in (?)

--场景三:oltp_update_index 根据主键更新JSON

UPDATE sbtest1 SET info = JSON_SET(info, "$.col1", convert(info->'$.col1',signed) + 1) WHERE id=?
场景 100列 200列 300列
select_random_points 76494 75519 74766
oltp_update_index 1013 927 869

总结:JSON读写性能会随着KV数增加而略微下降。

你可能感兴趣的:(数据库,mysql,json,数据库)