声明:本文关于MySQL中文乱码问题的解决方案均基于Windows 10操作系统,如果是Linux系统会有较多不适用之处,请谨慎参考。
一、MySQL中文乱码情况
1. sqlDevelper远程登陆MySQL客户端中文乱码
sqlDeveloper操作MySQL中文乱码
2. command-line登陆本地MySQL客户端中文乱码
控制台操作MySQL中文乱码
二、MySQL中文乱码产生原因
Windwos中文系统默认的字符编码集是gbk(扩展国标码,包括简体中文、繁体中文、朝鲜语、日本语等东亚语言),Linux系统默认的字符编码集为utf-8。然而当我们在安装MySQL的时候往往没注意,直接使用了MySQL的默认字符编码集latin1(仅支持英语即其它西欧语言),这就是导致我们在使用MySQL数据库查看记录时出现中文乱码的根本原因所在。
默认 MySQL Server Instance Configuration
三、MySQL中文乱码可能解决方案
1. 重新安装MySQL
说说产生这种想法的原因:通过google搜索,尝试了大致10种左右互联网上提到的解决MySQL中文乱码方法,但是没有一种是能够完美解决我遇到问题的合适方案。经过不知道多少次修改MySQL配置文件my.ini文件的字符集,总是出现莫名奇妙的问题,用一句话说就是"按下葫芦起了瓢"。问题详细描述见 修改MySQL配置文件
我们重新配置数据库,当然这种重新配置数据库实例是不会删除原来已存在的数据库数据,只是修改一下原有的数据库配置信息。
自定义 MySQL Server Instance Configuration 图1
自定义MySQL Server Instance Configuration 图2
我们选择reconfigurate如上MySQL Server Instance Configuration图2所示,可以选择Best Support For Multilingualism(对多语言的最佳支持)或者Manual Selected Default Character Set/Collation(手动选择默认字符集/排序方式)中选择gbk或utf-8。本次选择的字符集是:utf-8
然后经过与原来配置MySQL相同的步骤,然后启动MySQL服务(即运行mysqld)。接下去运行MySQL客户端,查看相应数据库情况,观察中文乱码问题是否解决。
MySQL客户端中文乱码
不幸的是我们获得的仍然乱码。真的是我们的方案有问题?分析下本次尝试失败的原因: 虽然重新设置了默认的字符集为utf-8,即修改了Index.xml预编译字符集配置文件,也修改了my.ini启动配置文件(包括default-charater-set=utf-8和default-set-server=utf-8),但是需要读者明白的是原来数据库的字符集已经设定,并没有被改变。
上述理论如何才能判断是正确的呢?那我就尝试一下重新建立一个MySQL数据库,数据库命名为teacher,表命名为therinfo。
呵呵,又出错了。正当黔驴技穷的时候,又想到了命令行并不支持utf-8格式(中文版windows的命令行默认支持GBK或ascii)。那么明显,如果我们从命令行输入中文字符的时候,在MySQL服务器看来就是驴唇不对马嘴,存的内容天知道是什么玩意。为了验证中文windows命令行真的不支持utf-8这个理论,我又尝试了JDBC操作MySQL数据库,然后再命令行输出。输入情况如下所示。
JDBC操作数据库在中文版windows命令行输出数据库内容
可以输出,只不过输出的中文是乱码。显然这是命令行数据库客户端与数据库服务器端所支持编码不同导致。那么可以说,用JDBC操作数据库后,在MySQL数据中的内容确实是utf-8数据格式的,但是在命令行输出看成了GBK编码,导致仍然是乱码。
(此部分建议先看MySQL原始配置文件)这样看来,似乎只要把客户端编码格式改为GBK就能够顺利解决乱码问题。为了避免修改配置文件出现的错误,我特意查看了Index.xml预编译字符集是否支持GBK 。查询结果如下所示。存在gbk编码,说明重新启动MySQL服务过程中是绝对不会出现1067错误的。
修改的情况为[mysql] default-character-set=gbk,然后保持[mysqld] default-server-set=utf-8不变或改成gbk编码。显示的结果为正确,问题顺利解决。
命令行客户端中文乱码问题解决结果
从上述步骤中,我们小结一下情况:
a) [mysql] default-character-set 是客户端默认字符集。如果采用命令行作为客户端,此字符集必须和命令行默认字符集能够匹配,否则出现中文乱码。
b) [mysql]和[mysqld]数据字符集可以不同,因为MySQL服务器配置文件Index.xml如果有预编译编码,服务器是能够识别客户端数据编码并转换成服务器端编码储存在相应的表中。关键是要在Index.xml配置文件中存在此预编译字符集。
基于这套经过实践验证的理论,开始尝试解决sqlDevelopor远程登陆MySQL客户端出现中文乱码问题。关于sqlDevelopor远程登陆MySQL数据库步骤详情参考文章《sqlDeveloper远程登陆MySQL数据管理系统》。必须明确一点,sqlDeveloper首先连接MySQL客户端,然后通过客户端再连接MySQL服务器。因此,必须保证sqlDeveloper和MySQL客户端字符编码集相同。
首先查看sqldevelopor的默认字符编码集:[帮助]-->[关于]-->[属性]-->[file.encoding],查到sqlDevelopor默认显示字符编码集为GBK。情况具体如下图所示。
Oracle SQL Developer 默认字符编码集
Oracle SQL Developer客户端和其要远程登陆的Command line客户端编码相同,均为gbk,因此数据通讯以及显示没有问题。查询结果如下图所示。
Oracle SQL Developer客户端中文乱码问题解决结果
2. 修改MySQL原始配置文件
# MySQL Server Instance Configuration File
# 翻译:MySQL服务器实例配置文件 # ---------------------------------------------------------------------- # Generated by the MySQL Server Instance Configuration Wizard # 翻译:由MySQL服务器实例配置向导自动生成 # # Installation Instructions
# 翻译:安装指南 # ---------------------------------------------------------------------- # # On Linux you can copy this file to /etc/my.cnf to set global options, # mysql-data-dir/my.cnf to set server-specific options # (@localstatedir@ for this installation) or to # ~/.my.cnf to set user-specific options. # # On Windows you should keep this file in the installation directory # of your server (e.g. C:\Program Files\MySQL\MySQL Server X.Y). To # make sure the server reads the config file use the startup option # "--defaults-file". # 翻译:在Windows中必须将该文件(my.ini)保存在MySQL服务器的安装目录中。
# 确保服务器在使用启动项"--defaults-file"时入读该配置文件。
# # To run the server from the command line, execute this in a # command line shell, e.g.
# 翻译:为了能命令行运行服务器,需在命令行shell中执行如下命令。 # mysqld --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini" # 注意:不同的安装路径,参数字符串内容一般不同。主要是找到my.ini绝对路径即可。
# # To install the server as a Windows service manually, execute this in a # command line shell, e.g.
# 翻译:为了能够将MySQL服务器手动安装成Windows服务,需在命令行shell中执行以下命令。 # mysqld --install MySQLXY --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini" # # And then execute this in a command line shell to start the server, e.g. # 翻译:然后在命令行shell中执行如下命令启动MySQL服务器。 # net start MySQLXY
#
# # Guildlines for editing this file
# 翻译:编辑本配置文件的指导路线 # ---------------------------------------------------------------------- # # In this file, you can use all long options that the program supports. # If you want to know the options a program supports, start the program # with the "--help" option. # # More detailed information about the individual options can also be # found in the manual. # # # CLIENT SECTION
# 翻译:客户端部分 # ---------------------------------------------------------------------- # # The following options will be read by MySQL client applications. # Note that only client applications shipped by MySQL are guaranteed # to read this section. If you want your own MySQL client program to # honor these values, you need to specify it as an option during the # MySQL client library initialization.
#
# [client] port=3306
[mysql] default-character-set=latin1 # 注意:此处需修改为gbk或utf-8 # SERVER SECTION
# 翻译:服务器端部分 # ---------------------------------------------------------------------- # # The following options will be read by the MySQL Server. Make sure that # you have installed the server correctly (see above) so it reads this # file. # 翻译:下列选项将会被MySQL服务器读取。
# 为了MySQL服务器能够读取本文件,确保已经正确安装MySQL服务器。 [mysqld] # The TCP/IP Port the MySQL Server will listen on
# 翻译:MySQL服务器将会监听的TCP/IP端口 port=3306 #Path to installation directory. All paths are usually resolved relative to this.
# 翻译:安装目录路径。所有的路径解析通常相对的就是安装目录路径。 basedir="C:/Program Files/MySQL/MySQL Server 5.5/" # 注意:basedir基础路径,即所有相对路径的基础路径。不同安装情况,具体有略微区别。
#Path to the database root
# 翻译:数据库root用户路径 datadir="C:/ProgramData/MySQL/MySQL Server 5.5/Data/" # The default character set that will be used when a new schema or table is # created and no character set is defined
# 翻译:当一个新的schema或者tableb被创建,并且没有字符集被定义时将使用如下默认字符集定义。 character-set-server=latin1 # 注意:此处需要改动为gbk或utf-8。
# The default storage engine that will be used when create new tables when
# 翻译:在创建新的表时,如下默认储存引擎将会被使用 default-storage-engine=INNODB # 注意:INNODB是MySQL的数据库引擎之一,为MySQL AB公司发布binary标准之一,原本由Innobase Oy公司所开发,后被Oracle并购。
# Set the SQL mode to strict sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION" # The maximum amount of concurrent sessions the MySQL server will # allow. One of these connections will be reserved for a user with # SUPER privileges to allow the administrator to login even if the # connection limit has been reached. max_connections=100 # Query cache is used to cache SELECT results and later return them # without actual executing the same query once again. Having the query # cache enabled may result in significant speed improvements, if your # have a lot of identical queries and rarely changing tables. See the # "Qcache_lowmem_prunes" status variable to check if the current value # is high enough for your load. # Note: In case your tables change very often or if your queries are # textually different every time, the query cache may result in a # slowdown instead of a performance improvement. query_cache_size=0 # The number of open tables for all threads. Increasing this value # increases the number of file descriptors that mysqld requires. # Therefore you have to make sure to set the amount of open files # allowed to at least 4096 in the variable "open-files-limit" in # section [mysqld_safe] table_cache=256 # Maximum size for internal (in-memory) temporary tables. If a table # grows larger than this value, it is automatically converted to disk # based table This limitation is for a single table. There can be many # of them. tmp_table_size=18M # How many threads we should keep in a cache for reuse. When a client # disconnects, the client's threads are put in the cache if there aren't # more than thread_cache_size threads from before. This greatly reduces # the amount of thread creations needed if you have a lot of new # connections. (Normally this doesn't give a notable performance # improvement if you have a good thread implementation.) thread_cache_size=8 #*** MyISAM Specific options # The maximum size of the temporary file MySQL is allowed to use while # recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE. # If the file-size would be bigger than this, the index will be created # through the key cache (which is slower). myisam_max_sort_file_size=100G # If the temporary file used for fast index creation would be bigger # than using the key cache by the amount specified here, then prefer the # key cache method. This is mainly used to force long character keys in # large tables to use the slower key cache method to create the index. myisam_sort_buffer_size=35M # Size of the Key Buffer, used to cache index blocks for MyISAM tables. # Do not set it larger than 30% of your available memory, as some memory # is also required by the OS to cache rows. Even if you're not using # MyISAM tables, you should still set it to 8-64M as it will also be # used for internal temporary disk tables. key_buffer_size=25M # Size of the buffer used for doing full table scans of MyISAM tables. # Allocated per thread, if a full scan is needed. read_buffer_size=64K read_rnd_buffer_size=256K # This buffer is allocated when MySQL needs to rebuild the index in # REPAIR, OPTIMZE, ALTER table statements as well as in LOAD DATA INFILE # into an empty table. It is allocated per thread so be careful with # large settings. sort_buffer_size=256K #*** INNODB Specific options *** # Use this option if you have a MySQL server with InnoDB support enabled # but you do not plan to use it. This will save memory and disk space # and speed up some things. # skip-innodb # Additional memory pool that is used by InnoDB to store metadata # information. If InnoDB requires more memory for this purpose it will # start to allocate it from the OS. As this is fast enough on most # recent operating systems, you normally do not need to change this # value. SHOW INNODB STATUS will display the current amount used. innodb_additional_mem_pool_size=2M # If set to 1, InnoDB will flush (fsync) the transaction logs to the # disk at each commit, which offers full ACID behavior. If you are # willing to compromise this safety, and you are running small # transactions, you may set this to 0 or 2 to reduce disk I/O to the # logs. Value 0 means that the log is only written to the log file and # the log file flushed to disk approximately once per second. Value 2 # means the log is written to the log file at each commit, but the log # file is only flushed to disk approximately once per second. innodb_flush_log_at_trx_commit=1 # The size of the buffer InnoDB uses for buffering log data. As soon as # it is full, InnoDB will have to flush it to disk. As it is flushed # once per second anyway, it does not make sense to have it very large # (even with long transactions). innodb_log_buffer_size=1M # InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and # row data. The bigger you set this the less disk I/O is needed to # access data in tables. On a dedicated database server you may set this # parameter up to 80% of the machine physical memory size. Do not set it # too large, though, because competition of the physical memory may # cause paging in the operating system. Note that on 32bit systems you # might be limited to 2-3.5G of user level memory per process, so do not # set it too high. innodb_buffer_pool_size=47M # Size of each log file in a log group. You should set the combined size # of log files to about 25%-100% of your buffer pool size to avoid # unneeded buffer pool flush activity on log file overwrite. However, # note that a larger logfile size will increase the time needed for the # recovery process. innodb_log_file_size=24M # Number of threads allowed inside the InnoDB kernel. The optimal value # depends highly on the application, hardware as well as the OS # scheduler properties. A too high value may lead to thread thrashing. innodb_thread_concurrency=10
当默认字符集/排序方式为latin1时,修改配置文件中遇到的问题集锦:
2.1)MySQL默认字符集,客户端、服务器端均设置为gbk
[mysql] default-charactter-set=gbk [mysqld] default-set-server=gbk
关闭MySQL服务后,再次启动MySQL服务正常。但是输入SQL语句SELECT * FROM XXX后显示的仍然为乱码。
2.2)MySQL默认字符集,客户端设置为utf-8,服务器设置为gbk
[mysql] default-charactter-set=utf-8 [mysqld] default-set-server=gbk
关闭MySQL服务后,再次启动MySQL服务正常。但是在进入命令行客户端时报错,报错信息如下所示:
mysql: Character set 'utf-8' is not a compiled character set and is not specified in the 'C:\Program Files\MySQL\MySQL Server 5.5\share\charsets\Index.xml' file ERROR 2019 (HY000): Can't initialize character set utf-8 (path: C:\Program Files\MySQL\MySQL Server 5.5\share\charsets\)
mysql客户端(即用mysql -u root -p登录的客户端):字符集'utf-8'非已编译过字符集,在Index.xml配置文件中没有被指定。错误的原因是无法初始化'utf-8'。
2.3)MySQL默认字符集,客户端设置为utf-8,服务器设置为utf-8
[mysql] default-charactter-set=utf-8 [mysqld] default-set-server=utf-8
关闭MySQL服务后,再次尝试启动MySQL服务,此时无法启动Windows MySQL服务,出现错误1067。连查询的是否为乱码的机会都没有给。
MySQL服务启动失败
2.1~2.3问题分析详见 MySQL预编译字符集问题
注意:本文所有修改配置文件的问题都是基于在修改配置文件重新启动(先关闭,再启动。启动的方式手动或自动)MySQL服务(即启动mysqld程序)。如下图所示:
MySQL服务
3. 命令行修改MySQL字符集
如果只需临时使用MySQL,还可以在命令行临时修改配置。这种方式在每次重新启动命令行后都必须重新修改,因此不推荐。google搜索出来的解决方案,很大一部分采用这种治标不治本的方法,当然这也不失为一种解决方案,因此文本也将其列为一种方法。再次强调:本文不推荐此方法。
四、修改配置文件问题及解决方案
1. 修改配置文件my.ini权限问题
在修改完毕后,保存过程中出现"拒绝访问"问题。由于MySQL默认被安装在C盘,因此首先查看了权限,发现权限是运行写入的。就在这时候灵机一动,大不了把文件移到桌面修改,改完之后再放到相应文件夹下,这就省事多了。
2. MySQL预编译字符集问题
如果用默认的安装,即默认的字符集为latin1时,诸如gbk和utf-8在Index.xml是没有被定义的。当时我在没有重新配置MySQL前,就想在网上下载一份utf-8的Index.xml,可惜没找到。当经过我重新配置MySQL后,采用utf-8后,有了utf-8和gbk等字符集。采用utf-8后Index.xml如下所示情况。
1 <?xml version='1.0' encoding="utf-8"?> 2 3 <charsets max-id="99"> 4 5 <copyright> 6 Copyright (c) 2003-2005 MySQL AB 7 Use is subject to license terms 8 9 This program is free software; you can redistribute it and/or modify 10 it under the terms of the GNU General Public License as published by 11 the Free Software Foundation; version 2 of the License. 12 13 This program is distributed in the hope that it will be useful, 14 but WITHOUT ANY WARRANTY; without even the implied warranty of 15 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 16 GNU General Public License for more details. 17 18 You should have received a copy of the GNU General Public License 19 along with this program; if not, write to the Free Software 20 Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA 21 </copyright> 22 23 <description> 24 This file lists all of the available character sets. 25 To make maintaining easier please: 26 - keep records sorted by collation number. 27 - change charsets.max-id when adding a new collation. 28 </description> 29 30 <charset name="big5"> 31 <family>Traditional Chinese</family> 32 <description>Big5 Traditional Chinese</description> 33 <alias>big-5</alias> 34 <alias>bigfive</alias> 35 <alias>big-five</alias> 36 <alias>cn-big5</alias> 37 <alias>csbig5</alias> 38 <collation name="big5_chinese_ci" id="1" order="Chinese"> 39 <flag>primary</flag> 40 <flag>compiled</flag> 41 </collation> 42 <collation name="big5_bin" id="84" order="Binary"> 43 <flag>binary</flag> 44 <flag>compiled</flag> 45 </collation> 46 </charset> 47 48 <charset name="latin2"> 49 <family>Central European</family> 50 <description>ISO 8859-2 Central European</description> 51 <alias>csisolatin2</alias> 52 <alias>iso-8859-2</alias> 53 <alias>iso-ir-101</alias> 54 <alias>iso_8859-2</alias> 55 <alias>iso_8859-2:1987</alias> 56 <alias>l2</alias> 57 <collation name="latin2_czech_cs" id="2" order="Czech" flag="compiled"/> 58 <collation name="latin2_general_ci" id="9" flag="primary"> 59 <order>Hungarian</order> 60 <order>Polish</order> 61 <order>Romanian</order> 62 <order>Croatian</order> 63 <order>Slovak</order> 64 <order>Slovenian</order> 65 <order>Sorbian</order> 66 </collation> 67 <collation name="latin2_hungarian_ci" id="21" order="Hungarian"/> 68 <collation name="latin2_croatian_ci" id="27" order="Croatian"/> 69 <collation name="latin2_bin" id="77" order="Binary" flag="binary"/> 70 </charset> 71 72 <charset name="dec8"> 73 <family>Western</family> 74 <description>DEC West European</description> 75 <collation name="dec8_bin" id="69" order="Binary" flag="binary"/> 76 <collation name="dec8_swedish_ci" id="3" flag="primary"> 77 <order>Dutch</order> 78 <order>English</order> 79 <order>French</order> 80 <order>German Duden</order> 81 <order>Italian</order> 82 <order>Latin</order> 83 <order>Portuguese</order> 84 <order>Spanish</order> 85 </collation> 86 </charset> 87 88 <charset name="cp850"> 89 <family>Western</family> 90 <description>DOS West European</description> 91 <alias>850</alias> 92 <alias>cspc850multilingual</alias> 93 <alias>ibm850</alias> 94 <collation name="cp850_general_ci" id="4" flag="primary"> 95 <order>Dutch</order> 96 <order>English</order> 97 <order>French</order> 98 <order>German Duden</order> 99 <order>Italian</order> 100 <order>Latin</order> 101 <order>Portuguese</order> 102 <order>Spanish</order> 103 </collation> 104 <collation name="cp850_bin" id="80" order="Binary" flag="binary"/> 105 </charset> 106 107 <charset name="latin1"> 108 <family>Western</family> 109 <description>cp1252 West European</description> 110 <alias>csisolatin1</alias> 111 <alias>iso-8859-1</alias> 112 <alias>iso-ir-100</alias> 113 <alias>iso_8859-1</alias> 114 <alias>iso_8859-1:1987</alias> 115 <alias>l1</alias> 116 <alias>latin1</alias> 117 <collation name="latin1_german1_ci" id="5" order="German Duden"/> 118 <collation name="latin1_swedish_ci" id="8" order="Finnish, Swedish"> 119 <flag>primary</flag> 120 <flag>compiled</flag> 121 </collation> 122 <collation name="latin1_danish_ci" id="15" order="Danish"/> 123 <collation name="latin1_german2_ci" id="31" order="German Phonebook" flag="compiled"/> 124 <collation name="latin1_spanish_ci" id="94" order="Spanish"/> 125 <collation name="latin1_bin" id="47" order="Binary"> 126 <flag>binary</flag> 127 <flag>compiled</flag> 128 </collation> 129 <collation name="latin1_general_ci" id="48"> 130 <order>Dutch</order> 131 <order>English</order> 132 <order>French</order> 133 <order>German Duden</order> 134 <order>Italian</order> 135 <order>Latin</order> 136 <order>Portuguese</order> 137 <order>Spanish</order> 138 </collation> 139 <collation name="latin1_general_cs" id="49"> 140 <order>Dutch</order> 141 <order>English</order> 142 <order>French</order> 143 <order>German Duden</order> 144 <order>Italian</order> 145 <order>Latin</order> 146 <order>Portuguese</order> 147 <order>Spanish</order> 148 </collation> 149 </charset> 150 151 <charset name="hp8"> 152 <family>Western</family> 153 <description>HP West European</description> 154 <alias>hproman8</alias> 155 <collation name="hp8_bin" id="72" order="Binary" flag="binary"/> 156 <collation name="hp8_english_ci" id="6" flag="primary"> 157 <order>Dutch</order> 158 <order>English</order> 159 <order>French</order> 160 <order>German Duden</order> 161 <order>Italian</order> 162 <order>Latin</order> 163 <order>Portuguese</order> 164 <order>Spanish</order> 165 </collation> 166 </charset> 167 168 <charset name="koi8r"> 169 <family>Cyrillic</family> 170 <description>KOI8-R Relcom Russian</description> 171 <alias>koi8-r</alias> 172 <alias>cskoi8r</alias> 173 <collation name="koi8r_general_ci" id="7" order="Russian" flag="primary"/> 174 <collation name="koi8r_bin" id="74" order="Binary" flag="binary"/> 175 </charset> 176 177 <charset name="swe7"> 178 <family>Western</family> 179 <description>7bit Swedish</description> 180 <alias>iso-646-se</alias> 181 <collation name="swe7_swedish_ci" id="10" order="Swedish" flag="primary"/> 182 <collation name="swe7_bin" id="82" order="Binary" flag="binary"/> 183 </charset> 184 185 <charset name="ascii"> 186 <family>Western</family> 187 <description>US ASCII</description> 188 <alias>us</alias> 189 <alias>us-ascii</alias> 190 <alias>csascii</alias> 191 <alias>iso-ir-6</alias> 192 <alias>iso646-us</alias> 193 <collation name="ascii_general_ci" id="11" order="English" flag="primary"/> 194 <collation name="ascii_bin" id="65" order="Binary" flag="binary"/> 195 </charset> 196 197 <charset name="ujis"> 198 <family>Japanese</family> 199 <description>EUC-JP Japanese</description> 200 <alias>euc-jp</alias> 201 <collation name="ujis_japanese_ci" id="12" order="Japanese"> 202 <flag>primary</flag> 203 <flag>compiled</flag> 204 </collation> 205 <collation name="ujis_bin" id="91" order="Japanese"> 206 <flag>binary</flag> 207 <flag>compiled</flag> 208 </collation> 209 </charset> 210 211 <charset name="sjis"> 212 <family>Japanese</family> 213 <description>Shift-JIS Japanese</description> 214 <alias>s-jis</alias> 215 <alias>shift-jis</alias> 216 <alias>x-sjis</alias> 217 <collation name="sjis_japanese_ci" id="13" order="Japanese"> 218 <flag>primary</flag> 219 <flag>compiled</flag> 220 </collation> 221 <collation name="sjis_bin" id="88" order="Binary"> 222 <flag>binary</flag> 223 <flag>compiled</flag> 224 </collation> 225 </charset> 226 227 <charset name="cp1251"> 228 <family>Cyrillic</family> 229 <description>Windows Cyrillic</description> 230 <alias>windows-1251</alias> 231 <alias>ms-cyr</alias> 232 <alias>ms-cyrillic</alias> 233 <collation name="cp1251_bulgarian_ci" id="14"> 234 <order>Belarusian</order> 235 <order>Bulgarian</order> 236 <order>Macedonian</order> 237 <order>Russian</order> 238 <order>Serbian</order> 239 <order>Mongolian</order> 240 <order>Ukrainian</order> 241 </collation> 242 <collation name="cp1251_ukrainian_ci" id="23" order="Ukrainian"/> 243 <collation name="cp1251_bin" id="50" order="Binary" flag="binary"/> 244 <collation name="cp1251_general_ci" id="51" flag="primary"> 245 <order>Belarusian</order> 246 <order>Bulgarian</order> 247 <order>Macedonian</order> 248 <order>Russian</order> 249 <order>Serbian</order> 250 <order>Mongolian</order> 251 <order>Ukrainian</order> 252 </collation> 253 <collation name="cp1251_general_cs" id="52"> 254 <order>Belarusian</order> 255 <order>Bulgarian</order> 256 <order>Macedonian</order> 257 <order>Russian</order> 258 <order>Serbian</order> 259 <order>Mongolian</order> 260 <order>Ukrainian</order> 261 </collation> 262 </charset> 263 264 <charset name="hebrew"> 265 <family>Hebrew</family> 266 <description>ISO 8859-8 Hebrew</description> 267 <alias>csisolatinhebrew</alias> 268 <alias>iso-8859-8</alias> 269 <alias>iso-ir-138</alias> 270 <collation name="hebrew_general_ci" id="16" order="Hebrew" flag="primary"/> 271 <collation name="hebrew_bin" id="71" order="Binary" flag="binary"/> 272 </charset> 273 274 <charset name="tis620"> 275 <family>Thai</family> 276 <description>TIS620 Thai</description> 277 <alias>tis-620</alias> 278 <collation name="tis620_thai_ci" id="18" order="Thai"> 279 <flag>primary</flag> 280 <flag>compiled</flag> 281 </collation> 282 <collation name="tis620_bin" id="89" order="Binary"> 283 <flag>binary</flag> 284 <flag>compiled</flag> 285 </collation> 286 </charset> 287 288 <charset name="euckr"> 289 <family>Korean</family> 290 <description>EUC-KR Korean</description> 291 <alias>euc_kr</alias> 292 <alias>euc-kr</alias> 293 <collation name="euckr_korean_ci" id="19" order="Korean"> 294 <flag>primary</flag> 295 <flag>compiled</flag> 296 </collation> 297 <collation name="euckr_bin" id="85"> 298 <flag>binary</flag> 299 <flag>compiled</flag> 300 </collation> 301 </charset> 302 303 <charset name="latin7"> 304 <family>Baltic</family> 305 <description>ISO 8859-13 Baltic</description> 306 <alias>BalticRim</alias> 307 <alias>iso-8859-13</alias> 308 <alias>l7</alias> 309 <collation name="latin7_estonian_cs" id="20"> 310 <order>Estonian</order> 311 </collation> 312 <collation name="latin7_general_ci" id="41"> 313 <order>Latvian</order> 314 <order>Lithuanian</order> 315 <flag>primary</flag> 316 </collation> 317 <collation name="latin7_general_cs" id="42"> 318 <order>Latvian</order> 319 <order>Lithuanian</order> 320 </collation> 321 <collation name="latin7_bin" id="79" order="Binary" flag="binary"/> 322 </charset> 323 324 <charset name="koi8u"> 325 <family>Cyrillic</family> 326 <description>KOI8-U Ukrainian</description> 327 <alias>koi8-u</alias> 328 <collation name="koi8u_general_ci" id="22" order="Ukranian" flag="primary"/> 329 <collation name="koi8u_bin" id="75" order="Binary" flag="binary"/> 330 </charset> 331 332 <charset name="gb2312"> 333 <family>Simplified Chinese</family> 334 <description>GB2312 Simplified Chinese</description> 335 <alias>chinese</alias> 336 <alias>iso-ir-58</alias> 337 <collation name="gb2312_chinese_ci" id="24" order="Chinese"> 338 <flag>primary</flag> 339 <flag>compiled</flag> 340 </collation> 341 <collation name="gb2312_bin" id="86"> 342 <flag>binary</flag> 343 <flag>compiled</flag> 344 </collation> 345 </charset> 346 347 <charset name="greek"> 348 <family>Greek</family> 349 <description>ISO 8859-7 Greek</description> 350 <alias>csisolatingreek</alias> 351 <alias>ecma-118</alias> 352 <alias>greek8</alias> 353 <alias>iso-8859-7</alias> 354 <alias>iso-ir-126</alias> 355 <collation name="greek_general_ci" id="25" order="Greek" flag="primary"/> 356 <collation name="greek_bin" id="70" order="Binary" flag="binary"/> 357 </charset> 358 359 <charset name="cp1250"> 360 <family>Central European</family> 361 <description>Windows Central European</description> 362 <alias>ms-ce</alias> 363 <alias>windows-1250</alias> 364 <collation name="cp1250_general_ci" id="26" flag="primary"> 365 <order>Hungarian</order> 366 <order>Polish</order> 367 <order>Romanian</order> 368 <order>Croatian</order> 369 <order>Slovak</order> 370 <order>Slovenian</order> 371 <order>Sorbian</order> 372 </collation> 373 <collation name="cp1250_croatian_ci" id="44"> 374 <order>Croatian</order> 375 </collation> 376 <collation name="cp1250_polish_ci" id="99"> 377 <order>Polish</order> 378 </collation> 379 <collation name="cp1250_czech_cs" id="34" order="Czech"> 380 <flag>compiled</flag> 381 </collation> 382 <collation name="cp1250_bin" id="66" order="Binary" flag="binary"/> 383 </charset> 384 385 <charset name="gbk"> 386 <family>East Asian</family> 387 <description>GBK Simplified Chinese</description> 388 <alias>cp936</alias> 389 <collation name="gbk_chinese_ci" id="28" order="Chinese"> 390 <flag>primary</flag> 391 <flag>compiled</flag> 392 </collation> 393 <collation name="gbk_bin" id="87" order="Binary"> 394 <flag>binary</flag> 395 <flag>compiled</flag> 396 </collation> 397 </charset> 398 399 <charset name="cp1257"> 400 <family>Baltic</family> 401 <description>Windows Baltic</description> 402 <alias>WinBaltRim</alias> 403 <alias>windows-1257</alias> 404 <collation name="cp1257_lithuanian_ci" id="29" order="Lithuanian"/> 405 <collation name="cp1257_bin" id="58" order="Binary" flag="binary"/> 406 <collation name="cp1257_general_ci" id="59" flag="primary"> 407 <order>Latvian</order> 408 <order>Lithuanian</order> 409 </collation> 410 <!--collation name="cp1257_ci" id="60"/--> 411 <!--collation name="cp1257_cs" id="61"/--> 412 </charset> 413 414 <charset name="latin5"> 415 <family>South Asian</family> 416 <description>ISO 8859-9 Turkish</description> 417 <alias>csisolatin5</alias> 418 <alias>iso-8859-9</alias> 419 <alias>iso-ir-148</alias> 420 <alias>l5</alias> 421 <alias>latin5</alias> 422 <alias>turkish</alias> 423 <collation name="latin5_turkish_ci" id="30" order="Turkish" flag="primary"/> 424 <collation name="latin5_bin" id="78" order="Binary" flag="binary"/> 425 </charset> 426 427 <charset name="armscii8"> 428 <family>South Asian</family> 429 <description>ARMSCII-8 Armenian</description> 430 <alias>armscii-8</alias> 431 <collation name="armscii8_general_ci" id="32" order="Armenian" flag="primary"/> 432 <collation name="armscii8_bin" id="64" order="Binary" flag="binary"/> 433 </charset> 434 435 <charset name="utf8"> 436 <family>Unicode</family> 437 <description>UTF-8 Unicode</description> 438 <alias>utf-8</alias> 439 <collation name="utf8_general_ci" id="33"> 440 <flag>primary</flag> 441 <flag>compiled</flag> 442 </collation> 443 <collation name="utf8_bin" id="83"> 444 <flag>binary</flag> 445 <flag>compiled</flag> 446 </collation> 447 </charset> 448 449 <charset name="ucs2"> 450 <family>Unicode</family> 451 <description>UCS-2 Unicode</description> 452 <collation name="ucs2_general_ci" id="35"> 453 <flag>primary</flag> 454 <flag>compiled</flag> 455 </collation> 456 <collation name="ucs2_bin" id="90"> 457 <flag>binary</flag> 458 <flag>compiled</flag> 459 </collation> 460 </charset> 461 462 <charset name="cp866"> 463 <family>Cyrillic</family> 464 <description>DOS Russian</description> 465 <alias>866</alias> 466 <alias>csibm866</alias> 467 <alias>ibm866</alias> 468 <alias>DOSCyrillicRussian</alias> 469 <collation name="cp866_general_ci" id="36" order="Russian" flag="primary"/> 470 <collation name="cp866_bin" id="68" order="Binary" flag="binary"/> 471 </charset> 472 473 <charset name="keybcs2"> 474 <family>Central European</family> 475 <description>DOS Kamenicky Czech-Slovak</description> 476 <collation name="keybcs2_general_ci" id="37" order="Czech" flag="primary"/> 477 <collation name="keybcs2_bin" id="73" order="Binary" flag="binary"/> 478 </charset> 479 480 <charset name="macce"> 481 <family>Central European</family> 482 <description>Mac Central European</description> 483 <alias>MacCentralEurope</alias> 484 <collation name="macce_general_ci" id="38" flag="primary"> 485 <order>Hungarian</order> 486 <order>Polish</order> 487 <order>Romanian</order> 488 <order>Croatian</order> 489 <order>Slovak</order> 490 <order>Slovenian</order> 491 <order>Sorbian</order> 492 </collation> 493 <collation name="macce_bin" id="43" order="Binary" flag="binary"/> 494 </charset> 495 496 <charset name="macroman"> 497 <family>Western</family> 498 <description>Mac West European</description> 499 <alias>Mac</alias> 500 <alias>Macintosh</alias> 501 <alias>csmacintosh</alias> 502 <collation name="macroman_general_ci" id="39" flag="primary"> 503 <order>Dutch</order> 504 <order>English</order> 505 <order>French</order> 506 <order>German Duden</order> 507 <order>Italian</order> 508 <order>Latin</order> 509 <order>Portuguese</order> 510 <order>Spanish</order> 511 </collation> 512 <collation name="macroman_bin" id="53" order="Binary" flag="binary"/> 513 <!--collation name="macroman_ci" id="54"/--> 514 <!--collation name="macroman_ci_ai" id="55"/--> 515 <!--collation name="macroman_cs" id="56"/--> 516 </charset> 517 518 <charset name="cp852"> 519 <family>Central European</family> 520 <description>DOS Central European</description> 521 <alias>852</alias> 522 <alias>cp852</alias> 523 <alias>ibm852</alias> 524 <collation name="cp852_general_ci" id="40" flag="primary"> 525 <order>Hungarian</order> 526 <order>Polish</order> 527 <order>Romanian</order> 528 <order>Croatian</order> 529 <order>Slovak</order> 530 <order>Slovenian</order> 531 <order>Sorbian</order> 532 </collation> 533 <collation name="cp852_bin" id="81" order="Binary" flag="binary"/> 534 </charset> 535 536 <charset name="cp1256"> 537 <family>Arabic</family> 538 <description>Windows Arabic</description> 539 <alias>ms-arab</alias> 540 <alias>windows-1256</alias> 541 <collation name="cp1256_bin" id="67" order="Binary" flag="binary"/> 542 <collation name="cp1256_general_ci" id="57" order="Arabic" flag="primary"> 543 <order>Arabic</order> 544 <order>Persian</order> 545 <order>Pakistani</order> 546 <order>Urdu</order> 547 </collation> 548 </charset> 549 550 <charset name="geostd8"> 551 <family>South Asian</family> 552 <description>GEOSTD8 Georgian</description> 553 <collation name="geostd8_general_ci" id="92" order="Georgian" flag="primary"/> 554 <collation name="geostd8_bin" id="93" order="Binary" flag="binary"/> 555 </charset> 556 557 <charset name="binary"> 558 <description>Binary pseudo charset</description> 559 <collation name="binary" id="63" order="Binary"> 560 <flag>primary</flag> 561 <flag>compiled</flag> 562 </collation> 563 </charset> 564 565 <charset name="cp932"> 566 <family>Japanese</family> 567 <description>SJIS for Windows Japanese</description> 568 <alias>ms_cp932</alias> 569 <alias>sjis_cp932</alias> 570 <alias>sjis_ms</alias> 571 <collation name="cp932_japanese_ci" id="95" order="Japanese"> 572 <flag>primary</flag> 573 <flag>compiled</flag> 574 </collation> 575 <collation name="cp932_bin" id="96" order="Binary"> 576 <flag>binary</flag> 577 <flag>compiled</flag> 578 </collation> 579 </charset> 580 581 <charset name="eucjpms"> 582 <family>Japanese</family> 583 <description>UJIS for Windows Japanese</description> 584 <alias>eucjpms</alias> 585 <alias>eucJP_ms</alias> 586 <alias>ujis_ms</alias> 587 <alias>ujis_cp932</alias> 588 <collation name="eucjpms_japanese_ci" id="97" order="Japanese"> 589 <flag>primary</flag> 590 <flag>compiled</flag> 591 </collation> 592 <collation name="eucjpms_bin" id="98" order="Japanese"> 593 <flag>binary</flag> 594 <flag>compiled</flag> 595 </collation> 596 </charset> 597 598 </charsets>
五、思考总结
经过大致3~4天的理论积累以及实践操作,终于完成了这篇文章。可以这么说吧,这篇博客可能是我查询文献最多以及写作时间最长的一篇。处女座的我,最受不了网上那些只告诉你这么干,不告诉为什么这么干的,或者只是照搬照抄没有探索精神的文章。写下本文望能与后来者共勉。
文章的排版可能不是特别合理,因为一直没有搞定博客园html的跳转问题,略微抱歉。
1. 字符编码
utf-8字符集包含GBK字符集,但是这种包含关系并不是说utf-8的字符集包含的GBK字符集完全是一致的,而是指GBK的所有字符集都能够在utf-8中找到相应的映射。例如,在GBK中中文字符占2字节,而在变长字符集utf-8中中文字符占3字节。不同字符集在MySQL数据管理系统内部能够自动映射,但是在客户端中显示时必须匹配合适的字符编码集,否则无法正确显示,即出现乱码问题。
2. mysql与mysqld区别
mysql指客户端;mysqld=mysql daemon指mysql守护进程,即mysql后台服务器端。当我们在命令行输入mysqld -install后,即安装完成mysqld,此时在服务中出现MySQL服务。启动MySQL服务,等价于启动mysqld,即启动mysql服务器。此后,才能在mysql客户端登陆操作mysql数据库。
3. 数据管理系统字符集层次
MySQL数据库字符集具有多层次。在配置文件my.ini中仅定义default-character-set、default-set-server,然而MySQL服务器中却存在多层次的字符集结构。
character-set-client 客户端发送给服务器SQL语句字符集
character-set-connection sokect连接字符集
character-set-results 客户端通过SELECT语句显示数据库查询结果字符集
这三种字符集一般是与default-character-set字符集相同的。当然也可以自定义设置。
character-set-database 数据库储存的数据字符集
character-set-server 服务器字符集
character-set-system 系统字符集
这三种字符集一般是与default-set-server字符集相同的。当然也可以自定义设置。
character-set-filesystem 文件系统字符集
MySQL数据库字符集
可以通过在命令行模糊查询:SHOW variables LIKE 'char%'; 显示上述图所示类似内容。
本文系作者原创,转载需要通过作者本人同意。