作者:俊达
在数据库设计和管理过程中,表结构中的Comment字段乱码可能由DDL实际编码和character_set_client设置不一致引起。下面是一个详细的例子:
$ cat test_comment_utf8.sql
create table test_comment_utf8(a int comment '列列列列列AAA')
comment 'BBBB表表表表表BBBB';
$ od -t x1 test_comment_utf8.sql
0000000 63 72 65 61 74 65 20 74 61 62 6c 65 20 74 65 73
0000020 74 5f 63 6f 6d 6d 65 6e 74 5f 75 74 66 38 28 61
0000040 20 69 6e 74 20 63 6f 6d 6d 65 6e 74 20 27 e5 88
0000060 97 e5 88 97 e5 88 97 e5 88 97 e5 88 97 41 41 41
0000100 27 29 20 63 6f 6d 6d 65 6e 74 20 27 42 42 42 42
0000120 e8 a1 a8 e8 a1 a8 e8 a1 a8 e8 a1 a8 e8 a1 a8 42
0000140 42 42 42 27 3b 0a
0000146
上述SQL文件以UTF8编码,其中“列”的utf8编码为e5 88 97,“表”的utf8编码为e8 a1 a8。现在,将character_set_client设置为gbk,执行上述SQL:
mysql> set character_set_client=gbk;
Query OK, 0 rows affected (0.00 sec)
mysql> drop table test_comment_utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> source test_comment_utf8.sql;
Query OK, 0 rows affected (0.00 sec)
表和字段的注释出现了乱码,元数据存储的数据也不对:
mysql> show create table test_comment_utf8\G
*************************** 1. row ***************************
Table: test_comment_utf8
Create Table: CREATE TABLE `test_comment_utf8` (
`a` int DEFAULT NULL COMMENT '鍒楀垪鍒楀垪鍒桝AA'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
COMMENT='BBBB琛ㄨ〃琛ㄨ〃琛˙BBB'
1 row in set (0.00 sec)
mysql> select table_name, table_comment, hex(table_comment) from information_schema.tables where table_name = 'test_comment_utf8';
+-------------------+-------------------------+--------------------------------------------------------------+
| TABLE_NAME | TABLE_COMMENT | hex(table_comment) |
+-------------------+-------------------------+--------------------------------------------------------------+
| test_comment_utf8 | BBBB琛ㄨ〃琛ㄨ〃琛˙BBB | 42424242E7909BE384A8E38083E7909BE384A8E38083E7909BCB99424242 |
+-------------------+-------------------------+--------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select column_name, column_comment, hex(column_comment) from information_schema.columns where table_name = 'test_comment_utf8';
+-------------+--------------------+------------------------------------------------------+
| COLUMN_NAME | COLUMN_COMMENT | hex(column_comment) |
+-------------+--------------------+------------------------------------------------------+
| a | 鍒楀垪鍒楀垪鍒桝AA | E98D92E6A580E59EAAE98D92E6A580E59EAAE98D92E6A19D4141 |
+-------------+--------------------+------------------------------------------------------+
1 row in set (0.01 sec)
mysql> set names 'utf8';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> show create table test_comment_utf8\G
*************************** 1. row ***************************
Table: test_comment_utf8
Create Table: CREATE TABLE `test_comment_utf8` (
`a` int DEFAULT NULL COMMENT '鍒楀垪鍒楀垪鍒桝AA'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
COMMENT='BBBB琛ㄨ〃琛ㄨ〃琛˙BBB'
1 row in set (0.00 sec)
mysql> set names 'gbk';
Query OK, 0 rows affected (0.00 sec)
mysql> show create table test_comment_utf8\G
*************************** 1. row ***************************
Table: test_comment_utf8
Create Table: CREATE TABLE `test_comment_utf8` (
`a` int DEFAULT NULL COMMENT '列列列列列AAA'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
COMMENT='BBBB表表表表表BBBB'
1 row in set (0.00 sec)
mysql> set names 'utf8';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> show create table test_comment_utf8\G
*************************** 1. row ***************************
Table: test_comment_utf8
Create Table: CREATE TABLE `test_comment_utf8` (
`a` int DEFAULT NULL COMMENT '閸掓鍨崚妤鍨崚妗滱A'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
COMMENT='BBBB鐞涖劏銆冪悰銊ㄣ冪悰藱BBB'
1 row in set (0.00 sec)
mysql> set names 'gbk';
Query OK, 0 rows affected (0.00 sec)
mysql> show create table test_comment_utf8\G
*************************** 1. row ***************************
Table: test_comment_utf8
Create Table: CREATE TABLE `test_comment_utf8` (
`a` int DEFAULT NULL COMMENT '鍒楀垪鍒楀垪鍒桝AA'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
COMMENT='BBBB琛ㄨ〃琛ㄨ〃琛˙BBB'
1 row in set (0.00 sec)
在数据库设计和表创建过程中,确保正确设置字符集相关参数是至关重要的。如果在建表时字符集相关参数设置不正确,可能导致表的元数据出现乱码,包括表名、列名、表和列的注释,以及存储过程中的字符。为了维护数据的完整性和可读性,应该确保在整个数据库生命周期中都使用一致的字符集设置。在处理不同字符集的情况下,谨慎选择并正确配置字符集相关参数,以防止出现乱码问题。及时检查和修复字符集不一致的情况,是确保数据库正常运作的关键步骤。
更多技术信息请查看云掣官网https://yunche.pro/?t=yrgw