The problem is that MySQL ignores trailing spaces when comparing strings. See
http://dev.mysql.com/doc/refman/5.7/en/char.html
All MySQL collations are of type PADSPACE. This means that all CHAR, VARCHAR, and TEXT values in MySQL are compared without regard to any trailing spaces.
…
For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.
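(This information applies to 5.7; it has changed for 8.0, see below.)
The section on the LIKE operator gives an example of this behavior (and shows that LIKE does take trailing spaces into account):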
mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
|          1 |             0 |
+------------+---------------+
1 row in set (0.00 sec)
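Unfortunately, the UNIQUE index seems to use the standard string comparison to check whether such a value already exists, and therefore ignores trailing spaces.
This does not depend on whether you use VARCHAR or CHAR; in both cases the insert is rejected because the unique check fails. If there is a way to make the UNIQUE check use LIKE semantics, I am not aware of it.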
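To make the failure concrete, here is a minimal sketch. The table name test_ws_pad is hypothetical, and the collation is set explicitly to utf8mb4_general_ci (a PAD SPACE collation) so that the result does not depend on the server default:

```sql
-- Hypothetical table; utf8mb4_general_ci is a PAD SPACE collation.
CREATE TABLE test_ws_pad (
  `value` VARCHAR(100) UNIQUE
) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

-- First insert is accepted.
INSERT INTO test_ws_pad (`value`) VALUES ('a');

-- Differs only by a trailing space, so the unique check treats it as
-- equal to 'a' and the insert should fail with a duplicate-key error.
INSERT INTO test_ws_pad (`value`) VALUES ('a ');
```

The same thing should happen if the column is declared CHAR instead of VARCHAR.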
What you can do is store the value as VARBINARY:
mysql> create table test_ws ( `value` varbinary(255) UNIQUE );
Query OK, 0 rows affected (0.13 sec)
mysql> insert into test_ws (`value`) VALUES ('a');
Query OK, 1 row affected (0.08 sec)
mysql> insert into test_ws (`value`) VALUES ('a ');
Query OK, 1 row affected (0.06 sec)
mysql> SELECT CONCAT( '(', value, ')' ) FROM test_ws;
+---------------------------+
| CONCAT( '(', value, ')' ) |
+---------------------------+
| (a)                       |
| (a )                      |
+---------------------------+
2 rows in set (0.00 sec)
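You had better not sort alphabetically on this column, because the sort will happen on byte values, which is not what users expect (most users, anyway).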
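If you do need a human-friendly order, one possible workaround is to convert the bytes back to a character string for sorting only. This is a sketch, assuming the stored bytes are valid utf8mb4 and that utf8mb4_unicode_ci is the ordering you want; neither assumption comes from the question:

```sql
-- Sort the VARBINARY column from above by a character collation.
-- The second sort key breaks ties byte-wise, so values that differ only
-- by trailing spaces still come out in a stable order.
SELECT `value`
FROM test_ws
ORDER BY CONVERT(`value` USING utf8mb4) COLLATE utf8mb4_unicode_ci, `value`;
```

Keep in mind that an ORDER BY on a converted expression like this cannot use the index on the column, so it may be slow on large tables.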
Another approach would be to patch MySQL and write your own collation of type NO PAD. Not sure anyone wants to do that, but if you do, please let me know ;)
For MySQL 8.0, the documentation now reads:
Most MySQL collations have a pad attribute of PAD SPACE. The exceptions are Unicode collations based on UCA 9.0.0 and higher, which have a pad attribute of NO PAD.
Unicode collations based on UCA versions later than 4.0.0 include the version in the collation name. Thus, utf8mb4_unicode_520_ci is based on UCA 5.2.0 weight keys, whereas utf8mb4_0900_ai_ci is based on UCA 9.0.0 weight keys.
So if you try:
create table test_ws ( `value` varchar(255) UNIQUE )
character set utf8mb4 collate utf8mb4_0900_ai_ci;
you can insert values both with and without trailing spaces.
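A minimal sketch of such a session, assuming MySQL 8.0 and using a hypothetical table name test_ws_nopad so it does not collide with the earlier example. The column is a character type here, because the collation only matters for character columns:

```sql
-- Requires MySQL 8.0: utf8mb4_0900_ai_ci is a NO PAD collation.
CREATE TABLE test_ws_nopad (
  `value` VARCHAR(100) UNIQUE
) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

INSERT INTO test_ws_nopad (`value`) VALUES ('a');
-- Under NO PAD the trailing space is significant, so this should be accepted.
INSERT INTO test_ws_nopad (`value`) VALUES ('a ');

-- Show both rows with delimiters so the trailing space is visible.
SELECT CONCAT('(', `value`, ')') FROM test_ws_nopad;
```

Unlike the VARBINARY approach, the column keeps the usual collation-based sorting and case-insensitive comparison.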
You can list all available NO PAD collations with:
show collation where Pad_attribute='NO PAD';