TYPO3 Garbled Characters Problem

UTF8 and TYPO3

Getting a real UTF8 TYPO3 installation can be difficult. "Real" in this case means that everything is UTF8: TYPO3 and the database!

Getting a working UTF8 database may be difficult, and it is not possible in every case (e.g. if you don't have permission to configure the database).

Additionally, you have to be careful if you want to migrate an existing TYPO3 installation to UTF8, which may be very difficult if you already have different charsets in your database.

This article aims to provide an in-depth overview of the subject.

What is charset and UTF8?
OK, just a little background on the topic: every char is represented in your computer by bits and bytes (1 byte = 8 bits). Every application needs to know the mapping between these bit codes and the characters (= the character set). For that purpose different standards exist, for example the well-known ASCII. Most charsets use one byte for each char, but with such one-byte charsets it is only possible to encode 256 different chars. This is the reason why there are so many different charsets: every language may need its own set of chars. The problem gets even bigger if you think of Chinese or of the languages written in Cyrillic script.

Therefore the Unicode standard was born. It assigns every char a unique number (its code point), and its original design used a fixed two bytes per char. To be more compatible with other charsets, UTF8 was defined: in UTF8 a char is encoded with one to four bytes. The trick is simple: if the first bit of a byte is '0', it is a one-byte char. So the first 128 chars could be encoded exactly as in ASCII (and in the ISO-8859 charsets, which share that range).
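The "first bit" trick extends to longer sequences: the leading bits of the first byte tell a decoder how many bytes belong to the char. The following is a minimal Python sketch of that rule, not taken from the original article; the function name utf8_sequence_length is made up for illustration. It also covers the four-byte sequences that UTF8 as standardized today allows.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return how many bytes a UTF-8 sequence occupies, judged by its first byte.

    Hypothetical helper for illustration only.
    """
    if (first_byte & 0b10000000) == 0b00000000:   # 0xxxxxxx: plain ASCII
        return 1
    if (first_byte & 0b11100000) == 0b11000000:   # 110xxxxx: two-byte sequence
        return 2
    if (first_byte & 0b11110000) == 0b11100000:   # 1110xxxx: three-byte sequence
        return 3
    if (first_byte & 0b11111000) == 0b11110000:   # 11110xxx: four-byte sequence
        return 4
    raise ValueError("continuation byte or invalid lead byte")


# 'a', 'ä', the euro sign, and an emoji (written as escapes to stay ASCII-safe)
for ch in ("a", "\u00e4", "\u20ac", "\U0001F600"):
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}", encoded.hex(), utf8_sequence_length(encoded[0]))
```

The length reported by the function matches len(encoded) for each sample char.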

An example:

'ä' is encoded in ISO-8859-1 (latin1) with these bits: 11100100 (=228)
It is encoded in UTF8 with these bits: 11000011 10100100 (195 and 164)
This means that if you interpret this UTF8 char as latin1, you will get two chars: "Ã¤".

Got it?
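
The byte values in this example can be checked with a few lines of Python. This is only an illustrative sketch; it assumes the one-byte encoding in question is latin-1 (ISO-8859-1), since plain ASCII has no 'ä'.

```python
text = "\u00e4"  # the character 'ä'

latin1_bytes = text.encode("latin-1")  # one byte
utf8_bytes = text.encode("utf-8")      # two bytes

print(list(latin1_bytes))                       # [228]
print(list(utf8_bytes))                         # [195, 164]
print([format(b, "08b") for b in utf8_bytes])   # ['11000011', '10100100']

# Reading the UTF-8 bytes with the wrong charset yields two characters.
misread = utf8_bytes.decode("latin-1")
print(len(misread), ascii(misread))             # 2 '\xc3\xa4'
```

The two misread code points, U+00C3 and U+00A4, are exactly the "Ã¤" pair that shows up on pages with a charset mismatch.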

Collations
A collation is a set of rules for comparing characters, so that a DBMS can sort and compare string values.
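
To give a feeling for what "a set of rules for comparing characters" means, here is a rough Python analogy. It is not MySQL's actual collation algorithm; the two key functions are hypothetical and only mimic the difference between a binary comparison and a case- and accent-insensitive one.

```python
import unicodedata


def binary_key(s: str) -> str:
    # Compare by raw code points, similar in spirit to a binary collation.
    return s


def folded_key(s: str) -> str:
    # Ignore case and accents: decompose, drop combining marks, then casefold.
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


words = ["Zebra", "apple", "\u00c4rger", "banana"]  # "\u00c4rger" is "Ärger"

print([ascii(w) for w in sorted(words, key=binary_key)])
print([ascii(w) for w in sorted(words, key=folded_key)])
print(folded_key("\u00e4") == folded_key("A"))
```

With the binary rule "Zebra" sorts first and "Ärger" last; with the folded rule the order is apple, Ärger, banana, Zebra, and 'ä' compares equal to 'A'.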

TYPO3 Settings
Enabling UTF8 support in TYPO3 is simple: just go to the Install Tool and set the option forceCharset to "utf-8".

MySQL Settings
This is the difficult part. The MySQL DBMS has six different character set settings.

You can see the current settings by executing the query:

SHOW VARIABLES LIKE 'character_set%';
-----
| character_set_client            | latin1
| character_set_connection        | latin1
| character_set_database          | utf8
| character_set_results           | latin1
| character_set_server            | utf8
| character_set_system            | utf8
| character_sets_dir              | /usr/share/mysql-500/charsets/


Read more: http://dev.mysql.com/doc/refman/5.0/en/charset.html
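
If you want to spot the offending settings quickly, a listing like the one above can be checked mechanically. The sketch below is a hypothetical helper, not part of TYPO3 or MySQL: it parses pasted output (it does not connect to a database) and reports every character_set_* variable whose value is not utf8.

```python
# Sample output as shown in the listing; paste your own output here.
SAMPLE = """\
| character_set_client            | latin1
| character_set_connection        | latin1
| character_set_database          | utf8
| character_set_results           | latin1
| character_set_server            | utf8
| character_set_system            | utf8
"""


def non_utf8_settings(show_variables_output: str) -> dict:
    """Return the character_set_* variables whose value is not utf8.

    Hypothetical helper for illustration only.
    """
    mismatches = {}
    for line in show_variables_output.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue  # not a "name | value" row
        name, value = cells
        if name.startswith("character_set_") and not value.startswith("utf8"):
            mismatches[name] = value
    return mismatches


print(non_utf8_settings(SAMPLE))
```

For the sample above this reports character_set_client, character_set_connection and character_set_results, which are the three settings the following sections try to force to utf8.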

A.) Configure the MySQL character sets
This needs special rights on the server; you can find more information in the MySQL reference. Normally this has to be done:

Check the MySQL server settings (recompile, or start the server with the parameter "--character-set-server=utf8" to force utf8).

Check the MySQL client settings: edit my.cnf and make sure that there is a line like:

[client]
default-character-set=utf8
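
A misspelled value in my.cnf is easy to overlook, so a small check can help. The following Python sketch is only an assumption-laden illustration: the helper names are made up, and it uses Python's generic INI parser, which handles simple files like the fragment above but may reject real my.cnf files that contain directives such as !includedir.

```python
import configparser


def client_charset(my_cnf_text: str):
    """Return the [client] default-character-set value, or None if absent.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare options without a value
        strict=False,          # tolerate repeated sections or options
        interpolation=None,    # treat '%' literally
    )
    parser.read_string(my_cnf_text)
    return parser.get("client", "default-character-set", fallback=None)


def client_uses_utf8(my_cnf_text: str) -> bool:
    value = client_charset(my_cnf_text)
    return value is not None and value.strip().lower() in ("utf8", "utf8mb4")


print(client_uses_utf8("[client]\ndefault-character-set=utf8\n"))    # True
print(client_uses_utf8("[client]\ndefault-character-set=latin1\n"))  # False
```

To check a real file, read its contents (for example from /etc/my.cnf, if that is where your installation keeps it) and pass the text to client_uses_utf8.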

B.) Force Charset by changing class.t3lib_db.php
Often it is easier to force the charset by executing the query:

SET CHARACTER SET utf8;

To do so, it is necessary to modify the TYPO3 database class "class.t3lib_db.php" and insert the line:

$this->admin_query('SET CHARACTER SET utf8');

You have to insert this after the mysql_pconnect() call, around line 897.

It is also possible to use the SQL statement "SET NAMES utf8", which, in addition to what the statement above does, also sets the character set of the connection to utf8. (This may cause problems in some environments.) Read more:
http://dev.mysql.com/doc/refman/5.1/de/charset-connection.html
http://dev.mysql.com/doc/refman/5.1/en/charset-connection.html

C.) Force Charset with setDBinit configuration (>TYPO3 4.0)
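Since TYPO3 4.0 it is no longer necessary to patch class.t3lib_db.php. You can use the configuration option "setDBinit" instead. It holds SQL statements that TYPO3 sends to the database right after connecting, for example the "SET NAMES utf8" statement mentioned above.

(Thanks to "pavel" for the tip)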

Change Charset in an existing project
1. Open a shell on your TYPO3 server.
2. Make a database backup using mysqldump:
   mysqldump -u user -p database > backupfile.sql
3. Drop the existing database.
4. Create a new database.
5. Make sure this new, empty database is utf8, e.g. by executing:
   ALTER DATABASE databasename DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
6. Set the TYPO3 forceCharset option (see above).
7. Modify the backup file if required! (*1)
   a. Change the charset to utf8, for example by using the external tool recode:
      recode latin1..utf8 backupfile.sql
   b. Change all the "CREATE TABLE" statements in the dump file: replace "CHARSET=latin1" with "CHARSET=utf8". This can be done with the command-line tool sed:
      sed 's/CHARSET=latin1/CHARSET=utf8/' backupfile.sql > backupfile_utf8.sql
8. Import the changed database dump:
   mysql -u user -p database < backupfile_utf8.sql

There may be some problems with TemplaVoila mapping or with special chars in some plugins. Normally these can be solved by recoding the relevant template files.

(*1) Note from Ries van Twisk:
"What I wanted to mention is you don't have to recode
a MySQL dump since recent versions of mysqldump
already dump in utf-8."
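
If recode or sed is not available on your server, the two modifications of the backup file can be sketched in Python. This is a hedged illustration, not the author's method: the function names are made up, and it assumes the dump really is latin1. As the note above says, a dump that is already utf-8 must not be recoded again, or every special char ends up double-encoded. Unlike the sed call without the g flag, it replaces every occurrence of the string, not just the first one per line.

```python
def convert_dump(latin1_dump: bytes) -> bytes:
    """Recode a latin1 SQL dump to UTF-8 and rewrite its table charsets.

    Hypothetical helper for illustration only.
    """
    text = latin1_dump.decode("latin-1")                    # the recode step
    text = text.replace("CHARSET=latin1", "CHARSET=utf8")   # the sed step
    return text.encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    # Read and write in binary mode so Python does not guess an encoding.
    with open(src_path, "rb") as src:
        converted = convert_dump(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(converted)


# A tiny stand-in for a real dump; \xe4 and \xfc are latin1 'ä' and 'ü'.
sample = (
    b"CREATE TABLE pages (title varchar(255)) DEFAULT CHARSET=latin1;\n"
    b"INSERT INTO pages VALUES ('\xe4hm \xfcbung');\n"
)
converted = convert_dump(sample)
print(converted)
```

For real files you would call convert_file("backupfile.sql", "backupfile_utf8.sql") and then import the result as in the last step of the list.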

Testing
Go to the backend. First check which charset the browser has selected; it should be "Unicode (UTF-8)".

Then create a new page with special chars, e.g. "ähm übung", and save it.
Open the tool phpMyAdmin and look up this record in the table "pages". If you see exactly the same title, everything works fine! (If not, go to the MySQL settings again :-))
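
If the stored title looks wrong, it helps to know whether it is the typical "UTF8 read as latin1" damage. The following Python heuristic is a sketch under that single assumption; the function name is hypothetical, and it will not recognize other kinds of charset mix-ups.

```python
def looks_double_encoded(value: str) -> bool:
    """Heuristic: True if value looks like UTF-8 bytes that were read as latin1.

    Hypothetical helper for illustration only.
    """
    try:
        # Undo the suspected misreading: back to bytes, then decode properly.
        repaired = value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False  # the round trip is impossible, so it is not this damage
    return repaired != value


good = "\u00e4hm \u00fcbung"                    # "ähm übung"
bad = good.encode("utf-8").decode("latin-1")    # what a charset mismatch stores

print(looks_double_encoded(good), looks_double_encoded(bad))
```

For the test title this prints False for the intact string and True for the damaged one.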
