Summary: in most cases, using MS932 instead of SHIFT_JIS reduces garbled characters (mojibake).
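A minimal sketch of why the swap helps, assuming a JDK that ships both charsets (they normally live in the `jdk.charsets` module). The class name `Ms932VersusShiftJis` and the sample bytes are illustrative: 0x8740 is CIRCLED DIGIT ONE from the NEC extension area, which Windows-produced text often contains.

```java
import java.nio.charset.Charset;

class Ms932VersusShiftJis {
    // Decodes bytes with the named charset. Input the charset cannot decode
    // is replaced with U+FFFD by String's default behaviour.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // 0x8740: CIRCLED DIGIT ONE (U+2460) in the NEC extension area.
        byte[] necCircledOne = {(byte) 0x87, (byte) 0x40};

        String asMs932 = decode(necCircledOne, "MS932");
        String asShiftJis = decode(necCircledOne, "Shift_JIS");

        System.out.println("MS932 keeps the character:    " + asMs932.equals("\u2460"));
        System.out.println("Shift_JIS loses the character: " + asShiftJis.contains("\uFFFD"));
    }
}
```

The same idea applies when reading files: passing `Charset.forName("MS932")` to a reader is usually the safer choice for text that originated on Windows.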
-----------------------------------------------------------------------------
Reference: http://www.asteria.com/tutorial/asbook320_application_read.html
As we mentioned earlier, Shift_JIS and Windows-31J employ different character sets and codes. This means that you must use different mapping converters when converting between them and Unicode.
The table below gives you the differences between Shift_JIS and Windows-31J at a glance:
・Mapping from Shift_JIS/Windows-31J to Unicode
JIS X 0208 characters | Shift_JIS/Windows-31J codes | Shift_JIS→Unicode | Windows-31J→Unicode |
---|---|---|---|
~ (1-33, WAVE DASH) | 8160 | U+301C | U+FF5E |
∥ (1-34, DOUBLE VERTICAL LINE) | 8161 | U+2016 | U+2225 |
- (1-61, MINUS SIGN) | 817C | U+2212 | U+FF0D |
¢ (1-81, CENT SIGN) | 8191 | U+00A2 | U+FFE0 |
£ (1-82, POUND SIGN) | 8192 | U+00A3 | U+FFE1 |
¬ (2-44, NOT SIGN) | 81CA | U+00AC | U+FFE2 |
IBM extensions | | No | Yes |
NEC extensions | | No | Yes |
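The decoding table above can be reproduced with a short Java program. This is a sketch that assumes the JDK's own `Shift_JIS` and `windows-31j` converters are available; the class name `DecodeDifferences` is made up for illustration.

```java
import java.nio.charset.Charset;

class DecodeDifferences {
    // Decodes one double-byte code (for example 0x8160) with the named
    // charset and returns the first UTF-16 code unit of the result.
    static int decode(int code, String charsetName) {
        byte[] bytes = {(byte) (code >> 8), (byte) code};
        return new String(bytes, Charset.forName(charsetName)).charAt(0);
    }

    public static void main(String[] args) {
        // The six codes listed in the table.
        int[] codes = {0x8160, 0x8161, 0x817C, 0x8191, 0x8192, 0x81CA};
        for (int code : codes) {
            System.out.printf("%04X  Shift_JIS -> U+%04X  windows-31j -> U+%04X%n",
                    code, decode(code, "Shift_JIS"), decode(code, "windows-31j"));
        }
    }
}
```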
User-defined characters are mapped into the Unicode Private Use Area as shown in the table below:
Converter | Shift_JIS range | Unicode range |
---|---|---|
Windows-31J | F040~F9FC | E000~E757 |
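The two ranges line up arithmetically: ten lead bytes (F0 to F9) times 188 valid trail bytes (40 to 7E and 80 to FC) gives 1880 = 0x758 code points, which is exactly E000 to E757. The sketch below assumes the mapping is linear within that range; `UserDefinedArea` and `toPrivateUse` are hypothetical names, and `main` only prints the JDK decoder's result for comparison.

```java
import java.nio.charset.Charset;

class UserDefinedArea {
    // Maps a Windows-31J user-defined code (lead F0..F9, trail 40..7E or
    // 80..FC) to a Private Use Area code point, assuming a linear layout.
    // Returns -1 for codes outside the user-defined range.
    static int toPrivateUse(int code) {
        int lead = code >> 8;
        int trail = code & 0xFF;
        boolean trailOk = (trail >= 0x40 && trail <= 0x7E) || (trail >= 0x80 && trail <= 0xFC);
        if (lead < 0xF0 || lead > 0xF9 || !trailOk) {
            return -1;
        }
        int column = trail - 0x40 - (trail > 0x7F ? 1 : 0); // 0x7F is never a trail byte
        return 0xE000 + (lead - 0xF0) * 188 + column;
    }

    public static void main(String[] args) {
        for (int code : new int[] {0xF040, 0xF9FC}) {
            byte[] bytes = {(byte) (code >> 8), (byte) code};
            int decoded = new String(bytes, Charset.forName("windows-31j")).charAt(0);
            System.out.printf("%04X  computed U+%04X  decoder U+%04X%n",
                    code, toPrivateUse(code), decoded);
        }
    }
}
```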
・Mapping from Unicode to Shift_JIS/Windows-31J
Unicode characters | Unicode codes | Shift_JIS | Windows-31J |
---|---|---|---|
∥ (DOUBLE VERTICAL LINE) | U+2016 | 8161 | × |
- (MINUS SIGN) | U+2212 | 817C | × |
~ (WAVE DASH) | U+301C | 8160 | × |
- (FULLWIDTH HYPHEN-MINUS) | U+FF0D | × | 817C |
~ (FULLWIDTH TILDE) | U+FF5E | × | 8160 |
¢ (FULLWIDTH CENT SIGN) | U+FFE0 | × | 8191 |
£ (FULLWIDTH POUND SIGN) | U+FFE1 | × | 8192 |
¬ (FULLWIDTH NOT SIGN) | U+FFE2 | × | 81CA |
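The × entries in the encoding table can be checked with `CharsetEncoder.canEncode`. This is a sketch against the JDK's built-in converters; the class name `EncodeDifferences` is illustrative. A character an encoder cannot map is normally written out as `?`, which is the typical source of garbled output.

```java
import java.nio.charset.Charset;

class EncodeDifferences {
    // Reports whether the named charset has a mapping for the character.
    static boolean canEncode(char c, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        // Code points from the table, written as escapes so the result does
        // not depend on the encoding of this source file.
        char[] chars = {'\u2016', '\u2212', '\u301C', '\uFF0D', '\uFF5E',
                        '\uFFE0', '\uFFE1', '\uFFE2'};
        for (char c : chars) {
            System.out.printf("U+%04X  Shift_JIS: %b  windows-31j: %b%n",
                    (int) c, canEncode(c, "Shift_JIS"), canEncode(c, "windows-31j"));
        }
    }
}
```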
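To sum up, Shift_JIS and Windows-31J differ in the following ways: they map six JIS X 0208 symbols (WAVE DASH, DOUBLE VERTICAL LINE, MINUS SIGN, CENT SIGN, POUND SIGN, NOT SIGN) to different Unicode code points, and only Windows-31J covers the IBM extensions, the NEC extensions, and the user-defined range.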
------------------------------ JDK source code excerpt --------------------------------------
charset("Shift_JIS", "SJIS",
        new String[] {
            // IANA aliases
            "sjis", // historical
            "shift_jis",
            "shift-jis",
            "ms_kanji",
            "x-sjis",
            "csShiftJIS"
        });

// The definition of this charset may be overridden by the init method,
// below, if the sun.nio.cs.map property is defined.
//
charset("windows-31j", "MS932",
        new String[] {
            "MS932", // JDK historical
            "windows-932",
            "csWindows31J"
        });

charset("JIS_X0201", "JIS_X_0201",
        new String[] {
            "JIS0201", // JDK historical
            // IANA aliases
            "JIS_X0201",
            "X0201",
            "csHalfWidthKatakana"
        });
------------------------------
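The alias registrations in the excerpt above are what make `Charset.forName` accept several spellings for the same converter. A small sketch, assuming a JDK where these registrations are unchanged; `CharsetAliases` and `canonicalName` are illustrative names.

```java
import java.nio.charset.Charset;

class CharsetAliases {
    // Resolves an alias to the canonical charset name registered in the JDK.
    static String canonicalName(String alias) {
        return Charset.forName(alias).name();
    }

    public static void main(String[] args) {
        String[] aliases = {"sjis", "ms_kanji", "x-sjis", "MS932", "windows-932", "csWindows31J"};
        for (String alias : aliases) {
            System.out.println(alias + " -> " + canonicalName(alias));
        }
        // The full alias set of a charset is also available at run time.
        System.out.println(Charset.forName("windows-31j").aliases());
    }
}
```

In other words, `MS932` and `windows-31j` name the same charset, while `sjis` and `ms_kanji` resolve to `Shift_JIS`, not to the Microsoft variant.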