All strings sent from the JDBC driver to the server are converted automatically from native Java Unicode form to the client character encoding, including all queries sent using Statement.execute(), Statement.executeUpdate(), and Statement.executeQuery(), as well as all PreparedStatement and CallableStatement parameters, with the exclusion of parameters set using setBytes(), setBinaryStream(), setAsciiStream(), setUnicodeStream(), and setBlob().
In MySQL Server 4.1 and higher, Connector/J supports a single character encoding between client and server, and any number of character encodings for data returned by the server to the client in ResultSets.
Prior to MySQL Server 4.1, Connector/J supported a single character encoding per connection, which could either be automatically detected from the server configuration, or could be configured by the user through the useUnicode and characterEncoding properties.
The character encoding between client and server is automatically detected upon connection. You specify the encoding on the server using the character_set_server system variable for server versions 4.1.0 and newer, and the character_set system variable for server versions older than 4.1.0. The driver automatically uses the encoding specified by the server. For more information, see Section 10.1.3.1, “Server Character Set and Collation”.
For example, to use 4-byte UTF-8 character sets with Connector/J, configure the MySQL server with character_set_server=utf8mb4, and leave characterEncoding out of the Connector/J connection string. Connector/J will then autodetect the UTF-8 setting.
To override the automatically detected encoding on the client side, use the characterEncoding property in the URL used to connect to the server.
To allow multiple character sets to be sent from the client, use the UTF-8 encoding, either by configuring utf8 as the default server character set, or by configuring the JDBC driver to use UTF-8 through the characterEncoding property.
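As a rough illustration of the client-side override, the sketch below assembles a connection URL that carries the characterEncoding property. The class and method names (JdbcUrlSketch, urlWithEncoding) are hypothetical helpers invented for this example, and the host, port, and database are placeholders. The check against the local JVM is only a sanity check and does not tell you what the server accepts.

```java
import java.nio.charset.Charset;

class JdbcUrlSketch {
    // Hypothetical helper: builds a Connector/J URL that sets
    // characterEncoding explicitly, overriding the encoding the
    // driver would otherwise autodetect from the server.
    static String urlWithEncoding(String host, int port, String database, String javaEncoding) {
        // Reject names the local JVM does not know. This is a local
        // sanity check only; it says nothing about the server.
        if (!Charset.isSupported(javaEncoding)) {
            throw new IllegalArgumentException("Unsupported Java encoding: " + javaEncoding);
        }
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=" + javaEncoding;
    }

    public static void main(String[] args) {
        // Placeholder host, port, and database name.
        System.out.println(urlWithEncoding("localhost", 3306, "test", "UTF-8"));
    }
}
```

The resulting string would be passed to DriverManager.getConnection() together with the user name and password. Note that the property takes a Java-style encoding name, not a MySQL character set name.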
When specifying character encodings on the client side, use Java-style names. The following table lists MySQL character set names and the corresponding Java-style names:
Table 22.26. MySQL to Java Encoding Name Translations
MySQL Character Set Name | Java-Style Character Encoding Name
ascii | US-ASCII
big5 | Big5
gbk | GBK
sjis | SJIS (or Cp932 or MS932 for MySQL Server < 4.1.11)
cp932 | Cp932 or MS932 (MySQL Server > 4.1.11)
gb2312 | EUC_CN
ujis | EUC_JP
euckr | EUC_KR
latin1 | Cp1252
latin2 | ISO8859_2
greek | ISO8859_7
hebrew | ISO8859_8
cp866 | Cp866
tis620 | TIS620
cp1250 | Cp1250
cp1251 | Cp1251
cp1257 | Cp1257
macroman | MacRoman
macce | MacCentralEurope
utf8 | UTF-8
ucs2 | UnicodeBig
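For application code that has to turn a MySQL character set name into the Java-style name expected by characterEncoding, the table can be carried as a simple lookup map. The following is only a sketch: the class MysqlToJavaEncoding and its toJava() method are hypothetical and are not part of Connector/J, and for sjis and cp932 the sketch picks a single name where the table lists version-dependent alternatives.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class MysqlToJavaEncoding {
    // MySQL character set name -> Java-style encoding name,
    // copied from the translation table above.
    static final Map<String, String> NAMES = new LinkedHashMap<>();
    static {
        NAMES.put("ascii", "US-ASCII");
        NAMES.put("big5", "Big5");
        NAMES.put("gbk", "GBK");
        NAMES.put("sjis", "SJIS");    // table also lists Cp932/MS932 for old servers
        NAMES.put("cp932", "Cp932");  // table also lists MS932
        NAMES.put("gb2312", "EUC_CN");
        NAMES.put("ujis", "EUC_JP");
        NAMES.put("euckr", "EUC_KR");
        NAMES.put("latin1", "Cp1252");
        NAMES.put("latin2", "ISO8859_2");
        NAMES.put("greek", "ISO8859_7");
        NAMES.put("hebrew", "ISO8859_8");
        NAMES.put("cp866", "Cp866");
        NAMES.put("tis620", "TIS620");
        NAMES.put("cp1250", "Cp1250");
        NAMES.put("cp1251", "Cp1251");
        NAMES.put("cp1257", "Cp1257");
        NAMES.put("macroman", "MacRoman");
        NAMES.put("macce", "MacCentralEurope");
        NAMES.put("utf8", "UTF-8");
        NAMES.put("ucs2", "UnicodeBig");
    }

    // Looks up the Java-style name; the MySQL name is matched case-insensitively.
    static String toJava(String mysqlName) {
        String javaName = NAMES.get(mysqlName.toLowerCase(Locale.ROOT));
        if (javaName == null) {
            throw new IllegalArgumentException("No mapping for MySQL character set: " + mysqlName);
        }
        return javaName;
    }

    public static void main(String[] args) {
        System.out.println("latin1 -> " + toJava("latin1"));
        System.out.println("utf8 -> " + toJava("utf8"));
    }
}
```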
Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:
The server character set and collation are the values of the character_set_server and collation_server system variables.
The character set and collation of the default database are the values of the character_set_database and collation_database system variables.
Additional character set and collation system variables are involved in handling traffic for the connection between a client and the server. Every client has connection-related character set and collation system variables.
A “connection” is what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets or error messages, over the connection back to the client. This leads to several questions about character set and collation handling for client connections, each of which can be answered in terms of system variables:
What character set is the statement in when it leaves the client?
The server takes the character_set_client system variable to be the character set in which statements are sent by the client.
What character set should the server translate a statement to after receiving it?
For this, the server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8). collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.
What character set should the server translate to before shipping result sets or error messages back to the client?
The character_set_results system variable indicates the character set in which the server returns query results to the client. This includes result data such as column values, and result metadata such as column names and error messages.
Clients can fine-tune the settings for these variables, or depend on the defaults (in which case, you can skip the rest of this section). If you do not use the defaults, you must change the character settings for each connection to the server.
Two statements affect the connection-related character set variables as a group:
SET NAMES 'charset_name' [COLLATE 'collation_name']
SET NAMES indicates what character set the client will use to send SQL statements to the server. Thus, SET NAMES 'cp1251' tells the server, “future incoming messages from this client are in character set cp1251.” It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use a SELECT statement.)
A SET NAMES 'charset_name' statement is equivalent to these three statements:
SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET character_set_connection = charset_name;
Setting character_set_connection to charset_name also implicitly sets collation_connection to the default collation for charset_name. It is unnecessary to set that collation explicitly. To specify a particular collation, use the optional COLLATE clause:
SET NAMES 'charset_name' COLLATE 'collation_name'
SET CHARACTER SET charset_name
SET CHARACTER SET is similar to SET NAMES but sets character_set_connection and collation_connection to character_set_database and collation_database. A SET CHARACTER SET charset_name statement is equivalent to these three statements:
SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET collation_connection = @@collation_database;
Setting collation_connection also implicitly sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is unnecessary to set character_set_connection explicitly.
ucs2, utf16, and utf32 cannot be used as a client character set, which means that they do not work for SET NAMES or SET CHARACTER SET.
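A JDBC application that issues these statements itself might build them with a small helper like the one sketched below. The class SetNamesSketch and its methods are hypothetical names made up for this example. The sketch only assembles the statement text described above, refuses the three character sets that cannot be used as a client character set, and restricts names to letters, digits, and underscores (an assumption of this sketch, chosen so that the names can be embedded in SQL safely).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class SetNamesSketch {
    // Character sets that cannot be used as a client character set.
    private static final Set<String> NOT_ALLOWED_FOR_CLIENT =
            new HashSet<>(Arrays.asList("ucs2", "utf16", "utf32"));

    // Sketch-level rule: accept only plain identifiers.
    private static String checkName(String name) {
        if (name == null || !name.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Not a plain identifier: " + name);
        }
        return name;
    }

    // Builds SET NAMES 'charset' [COLLATE 'collation']; pass null for no COLLATE clause.
    static String setNames(String charset, String collation) {
        checkName(charset);
        if (NOT_ALLOWED_FOR_CLIENT.contains(charset.toLowerCase(Locale.ROOT))) {
            throw new IllegalArgumentException(charset + " cannot be used as a client character set");
        }
        String sql = "SET NAMES '" + charset + "'";
        if (collation != null) {
            sql += " COLLATE '" + checkName(collation) + "'";
        }
        return sql;
    }

    // The three statements that SET NAMES 'charset' is equivalent to.
    static List<String> expanded(String charset) {
        checkName(charset);
        return Arrays.asList(
                "SET character_set_client = " + charset,
                "SET character_set_results = " + charset,
                "SET character_set_connection = " + charset);
    }

    public static void main(String[] args) {
        System.out.println(setNames("cp1251", null));
        System.out.println(setNames("utf8", "utf8_unicode_ci"));
        for (String statement : expanded("cp1251")) {
            System.out.println(statement);
        }
    }
}
```

With Connector/J it is usually simpler to let the driver negotiate the encoding, or to set characterEncoding in the URL, than to execute SET NAMES by hand.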
The MySQL client programs mysql, mysqladmin, mysqlcheck, mysqlimport, and mysqlshow determine the default character set to use as follows:
In the absence of other information, the programs use the compiled-in default character set, usually latin1.
The programs can autodetect which character set to use based on the operating system setting, such as the value of the LANG or LC_ALL locale environment variable on Unix systems or the code page setting on Windows systems. For systems on which the locale is available from the OS, the client uses it to set the default character set rather than using the compiled-in default. For example, setting LANG to ru_RU.KOI8-R causes the koi8r character set to be used. Thus, users can configure the locale in their environment for use by MySQL clients.
The OS character set is mapped to the closest MySQL character set if there is no exact match. If the client does not support the matching character set, it uses the compiled-in default. For example, ucs2 is not supported as a connection character set.
C applications can use character set autodetection based on the OS setting by invoking mysql_options() as follows before connecting to the server:
mysql_options(mysql, MYSQL_SET_CHARSET_NAME, MYSQL_AUTODETECT_CHARSET_NAME);
The programs support a --default-character-set option, which enables users to specify the character set explicitly to override whatever default the client otherwise determines.
Before MySQL 5.5, in the absence of other information, the MySQL client programs used the compiled-in default character set, usually latin1. An implication of this difference is that if your environment is configured to use a non-latin1 locale, MySQL client programs will use a different connection character set than previously, as though you had issued an implicit SET NAMES statement. If the previous behavior is required, start the client with the --default-character-set=latin1 option.
When a client connects to the server, it sends the name of the character set that it wants to use. The server uses the name to set the character_set_client, character_set_results, and character_set_connection system variables. In effect, the server performs a SET NAMES operation using the character set name.
With the mysql client, to use a character set different from the default, you could explicitly execute SET NAMES every time you start up. To accomplish the same result more easily, add the --default-character-set option setting to your mysql command line or to your option file. For example, the following option file setting causes the three connection-related character set variables to be set to koi8r each time you invoke mysql:
[mysql]
default-character-set=koi8r
If you are using the mysql client with auto-reconnect enabled (which is not recommended), it is preferable to use the charset command rather than SET NAMES. For example:
mysql> charset utf8
Charset changed
The charset command issues a SET NAMES statement, and also changes the default character set that mysql uses when it reconnects after the connection has dropped.
Example: Suppose that column1 is defined as CHAR(5) CHARACTER SET latin2. If you do not say SET NAMES or SET CHARACTER SET, then for SELECT column1 FROM t, the server sends back all the values for column1 using the character set that the client specified when it connected. On the other hand, if you say SET NAMES 'latin1' or SET CHARACTER SET latin1 before issuing the SELECT statement, the server converts the latin2 values to latin1 just before sending results back. Conversion may be lossy if there are characters that are not in both character sets.
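The kind of loss described here can be reproduced on the Java side without a server. The sketch below is only an analogy, not what the MySQL server executes: it uses Java's ISO-8859-1 charset as a stand-in for latin1 (the translation table actually maps latin1 to Cp1252) and a sample word with two characters that exist in ISO 8859-2 but not in ISO 8859-1. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LossyConversionSketch {
    // True if every character of the text exists in the target charset.
    static boolean fitsIn(String text, Charset target) {
        return target.newEncoder().canEncode(text);
    }

    // Encodes to the target charset and decodes again. Characters the
    // target cannot represent are replaced by the charset's replacement
    // byte, so the result shows what a lossy conversion leaves behind.
    static String convert(String text, Charset target) {
        return new String(text.getBytes(target), target);
    }

    public static void main(String[] args) {
        // A Polish place name written with Unicode escapes: the first and
        // last letters (U+0141, U+017A) are in ISO 8859-2 but not in
        // ISO 8859-1; the middle letters are in both.
        String sample = "\u0141\u00f3d\u017a";
        Charset latin1StandIn = StandardCharsets.ISO_8859_1;

        System.out.println("fits without loss: " + fitsIn(sample, latin1StandIn));
        System.out.println("after conversion:  " + convert(sample, latin1StandIn));
    }
}
```

Checking with canEncode() before choosing a narrower connection character set is one way to notice early that data would not survive the conversion.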
If you want the server to perform no conversion of result sets or error messages, set character_set_results to NULL or binary:
SET character_set_results = NULL;
To see the values of the character set and collation system variables that apply to your connection, use these statements:
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
You must also consider the environment within which your MySQL applications execute. See Section 10.1.5, “Configuring the Character Set and Collation for Applications”.
For more information about character sets and error messages, see Section 10.1.6, “Character Set for Error Messages”.
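In MySQL, set the character encoding of the database and of the table columns to utf8 (utf8_general_ci).
pageEncoding="UTF-8"
request.setCharacterEncoding("UTF-8");
Set the URL to jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8
Once the first three steps are done, things generally work. If they still do not, modify the MySQL configuration file.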
## UTF 8 Settings
#init-connect=\'SET NAMES utf8\'
#collation_server=utf8_unicode_ci
#character_set_server=utf8
#skip-character-set-client-handshake
#character_sets-dir="D:/xampp/mysql/share/charsets"
Restart MySQL, then restart Tomcat.
The following is reposted.