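Fixing Garbled Chinese Text in Java
When writing browser/server (B/S) applications there are many things to watch out for. Chinese character encoding and path handling are two of the most important and most annoying. I ran into both recently when a project required me to move from .NET to Java. After some time spent studying, experimenting, and practicing, here is a summary of what I learned.
The project is built in Eclipse and follows the mainstream stack: Hibernate + Spring + Struts.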
1. Garbled text in JSP pages
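The difference between contentType and pageEncoding, and tips for setting up Chinese JSP pages:
contentType -- specifies the encoding of the page content as finally seen by the browser (the client). It is what Mozilla calls "Character encoding" and IE6 calls "encoding". For example, the JSPtw Forum uses Big5 as its contentType.
pageEncoding -- specifies the encoding in which the JSP file itself was written and saved.
If you write the JSP with Notepad on Windows 98 or ME, the file is saved in the usual local encoding, Big5 or gb2312. With Notepad on Windows 2000 or XP you can choose the encoding when saving: ANSI (Big5/GB2312), UTF-8, or Unicode (presumably a 16-bit encoding such as UTF-16).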
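To make the effect of the save encoding concrete, here is a minimal sketch that measures how many bytes the same two Chinese characters occupy under different encodings. The class name SaveEncodings and the helper savedSize are made up for illustration, and the sketch assumes the JVM ships the GB2312 charset (standard JDK builds normally do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class SaveEncodings {
    // Number of bytes the text occupies when saved in the given encoding.
    static int savedSize(String text, Charset encoding) {
        return text.getBytes(encoding).length;
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as Unicode escapes so the source
        // file itself is immune to encoding problems.
        String text = "\u4e2d\u6587";
        System.out.println(savedSize(text, Charset.forName("GB2312"))); // 4
        System.out.println(savedSize(text, StandardCharsets.UTF_8));    // 6
        // Java's UTF-16 encoder writes a 2-byte byte-order mark first.
        System.out.println(savedSize(text, StandardCharsets.UTF_16));   // 6
    }
}
```

The same characters produce different byte sequences in each case, which is why the reader of the file has to be told which encoding was used.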
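A JSP goes through two rounds of encoding conversion before its output reaches the browser, which gives three stages: stage one uses pageEncoding, stage two goes from UTF-8 to UTF-8, and stage three is the page served by Tomcat, which uses contentType.
Stage one is the translation of the JSP into Java source (.java) by the JSP compiler (JSPC). It reads the JSP according to the pageEncoding setting, so a JSP written in the declared pageEncoding (utf-8, Big5, gb2312) is translated into uniformly UTF-8 Java source. If pageEncoding is set wrongly, or not set at all (the default is ISO-8859-1), the Chinese text is already garbled at this stage.
Stage two is the compilation of the Java source into Java bytecode by javac. Whatever encoding the JSP was written in (utf-8, Big5, gb2312), the result of stage one is always UTF-8 Java source. javac reads that source as UTF-8 and compiles it into binary code (.class) whose strings are stored in a UTF-8 form. This is how the Java Virtual Machine specification represents constant strings in bytecode.
Stage three is when Tomcat (or another application container) loads and executes the bytecode from stage two and produces the output that the browser (the client) sees. This is where the contentType parameter, carried along silently through stages one and two, finally takes effect.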
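The stage-one failure can be reproduced with the standard library alone. The sketch below is not how a JSP compiler is actually implemented. It only mimics the two decisions that matter: decoding the file bytes with pageEncoding, and encoding the output with the contentType charset. The names JspEncodingStages, readJspSource, and writeResponse are assumptions made for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical model of the encoding decisions in stages one and three.
class JspEncodingStages {
    // Stage one: the translator decodes the file's bytes using pageEncoding.
    static String readJspSource(byte[] fileBytes, Charset pageEncoding) {
        return new String(fileBytes, pageEncoding);
    }

    // Stage three: the container encodes the output using the contentType charset.
    static byte[] writeResponse(String text, Charset contentTypeCharset) {
        return text.getBytes(contentTypeCharset);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two Chinese characters
        // Pretend the JSP file was saved as UTF-8.
        byte[] savedFile = chinese.getBytes(StandardCharsets.UTF_8);

        // pageEncoding matches the file: the text survives.
        String correct = readJspSource(savedFile, StandardCharsets.UTF_8);
        // pageEncoding missing, so the ISO-8859-1 default is used: garbled.
        String garbled = readJspSource(savedFile, StandardCharsets.ISO_8859_1);

        System.out.println(correct.equals(chinese)); // true
        System.out.println(garbled.equals(chinese)); // false
        System.out.println(garbled.length());        // 6, one char per byte
        System.out.println(writeResponse(correct, StandardCharsets.UTF_8).length); // 6
    }
}
```

Once the text has been misread in stage one, no contentType setting in stage three can repair it, because the characters held in memory are already wrong.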
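My solution is to do the following on every JSP page:
1. When creating the JSP page, add a page directive at the top of the file that sets both contentType and pageEncoding (gb2312 in this project).
2. In the HTML code, add the matching Content-Type meta declaration inside the head element.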
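2. Garbled text on the back end
As established above, the pages use GB2312, while the database commonly stores data in the ISO-8859-1 character set. Java strings themselves are Unicode, but unless told otherwise the servlet container decodes incoming request data as ISO-8859-1. Text submitted from a GB2312 page is therefore decoded with the wrong character set and is garbled by the time it is saved. To work around this, convert GB2312 to ISO-8859-1 when storing. Text can also come back garbled when it is read. In that case do the reverse and convert ISO-8859-1 to GB2312.
My solutions:
1. Convert directly with helper functions: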
// Requires: import java.io.UnsupportedEncodingException;
private static final String inCode = "ISO-8859-1";
private static final String outCode = "gb2312";

/**
 * Converts a string from ISO-8859-1 to gb2312.
 *
 * @param inputString the input string
 * @return the converted string
 */
public static String readString(String inputString) {
    try {
        byte[] tempByte = inputString.getBytes(inCode);
        return new String(tempByte, outCode);
    } catch (UnsupportedEncodingException ex) {
        // Do not return from a finally block here: that would silently
        // discard this exception.
        throw new RuntimeException("Unsupported encoding type.", ex);
    }
}

/**
 * Converts a string from gb2312 to ISO-8859-1.
 *
 * @param inputString the input string
 * @return the converted string
 */
public static String writeString(String inputString) {
    try {
        byte[] tempByte = inputString.getBytes(outCode);
        return new String(tempByte, inCode);
    } catch (UnsupportedEncodingException ex) {
        throw new RuntimeException("Unsupported encoding type.", ex);
    }
}
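As a quick check that the two conversions really are inverses, here is a small self-contained sketch. The class EncodingRoundTrip is hypothetical and restates the two helpers with Charset objects. It assumes the JVM provides GB2312.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical round-trip check for the read/write helpers.
class EncodingRoundTrip {
    private static final Charset IN = StandardCharsets.ISO_8859_1;
    private static final Charset OUT = Charset.forName("GB2312");

    // ISO-8859-1 view of the bytes -> real GB2312 text.
    static String readString(String s) {
        return new String(s.getBytes(IN), OUT);
    }

    // Real text -> GB2312 bytes carried as ISO-8859-1 characters.
    static String writeString(String s) {
        return new String(s.getBytes(OUT), IN);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        String stored = writeString(original);
        String restored = readString(stored);
        System.out.println(stored.length());           // 4, one char per GB2312 byte
        System.out.println(restored.equals(original)); // true
    }
}
```

The round trip is lossless because ISO-8859-1 maps every byte value to exactly one character, so no GB2312 byte is lost while the text travels in disguise.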
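2. Force the conversion when reading values posted back from a page
1) When retrieving a value, use String str = new String(strName.getBytes("iso-8859-1"), "GB2312");
2) Before reading any values, call request.setCharacterEncoding("gb2312");
Written as-is, this code usually fails to compile with the error "Unhandled exception type UnsupportedEncodingException". Wrap it in a try/catch block and the problem goes away.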
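The compile error comes from the String overloads that take a charset name, which declare a checked exception. A minimal sketch of two ways to handle it follows. The class and method names are made up for this example, and the second variant swaps in the Charset-based overloads (available since Java 6), which is a substitution rather than what the original code does.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing both ways around the checked exception.
class ParameterRecoding {
    // Variant 1: charset names as strings, so try/catch is mandatory.
    static String recodeChecked(String raw) {
        try {
            return new String(raw.getBytes("ISO-8859-1"), "GB2312");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("GB2312 is not supported by this JVM", e);
        }
    }

    // Variant 2: Charset objects, whose overloads throw no checked exception.
    private static final Charset GB2312 = Charset.forName("GB2312");

    static String recodeUnchecked(String raw) {
        return new String(raw.getBytes(StandardCharsets.ISO_8859_1), GB2312);
    }
}
```

Either variant can be called as ParameterRecoding.recodeChecked(request.getParameter("name")), where the parameter name is again only an example.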
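3. Handling every case in Java code as above becomes impractical once the project grows. At that point a filter is the better choice.
To do this, first register a filter class in the web.xml file: a filter declaration that supplies the encoding init parameter, plus a filter mapping.
The class com.stonecai.service.SetCharacterEncodingFilter is as follows: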
package com.stonecai.service;

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

/**
 * Example filter that sets the character encoding to be used in parsing the
 * incoming request, either unconditionally or only if the client did not
 * specify a character encoding. Configuration of this filter is based on
 * the following initialization parameters:
 *
 * - encoding - The character encoding to be configured
 *   for this request, either conditionally or unconditionally based on
 *   the {@code ignore} initialization parameter. This parameter
 *   is required, so there is no default.
 * - ignore - If set to "true", any character encoding
 *   specified by the client is ignored, and the value returned by the
 *   {@code selectEncoding()} method is set. If set to "false",
 *   {@code selectEncoding()} is called only if the
 *   client has not already specified an encoding. By default, this
 *   parameter is set to "true".
 *
 * Although this filter can be used unchanged, it is also easy to
 * subclass it and make the {@code selectEncoding()} method more
 * intelligent about what encoding to choose, based on characteristics of
 * the incoming request (such as the values of the {@code Accept-Language}
 * and {@code User-Agent} headers, or a value stashed in the current
 * user's session).
 *
 * @author Craig McClanahan
 * @version $Revision: 1.2 $ $Date: 2006/10/28 08:51:08 $
 */
public class SetCharacterEncodingFilter implements Filter {

    // ----------------------------------------------------- Instance Variables

    /**
     * The default character encoding to set for requests that pass through
     * this filter.
     */
    protected String encoding = null;

    /**
     * The filter configuration object we are associated with. If this value
     * is null, this filter instance is not currently configured.
     */
    protected FilterConfig filterConfig = null;

    /**
     * Should a character encoding specified by the client be ignored?
     */
    protected boolean ignore = true;

    // --------------------------------------------------------- Public Methods

    /**
     * Take this filter out of service.
     */
    public void destroy() {
        this.encoding = null;
        this.filterConfig = null;
    }

    /**
     * Select and set (if specified) the character encoding to be used to
     * interpret request parameters for this request.
     *
     * @param request The servlet request we are processing
     * @param response The servlet response we are creating
     * @param chain The filter chain we are processing
     *
     * @exception IOException if an input/output error occurs
     * @exception ServletException if a servlet error occurs
     */
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
            throws IOException, ServletException {
        // Conditionally select and set the character encoding to be used
        if (ignore || (request.getCharacterEncoding() == null)) {
            String encoding = selectEncoding(request);
            if (encoding != null) {
                request.setCharacterEncoding(encoding);
            }
        }
        // Pass control on to the next filter
        chain.doFilter(request, response);
    }

    /**
     * Place this filter into service.
     *
     * @param filterConfig The filter configuration object
     */
    public void init(FilterConfig filterConfig) throws ServletException {
        this.filterConfig = filterConfig;
        this.encoding = filterConfig.getInitParameter("encoding");
        String value = filterConfig.getInitParameter("ignore");
        // A missing value, "true", or "yes" means ignore; anything else does not.
        this.ignore = value == null
                || value.equalsIgnoreCase("true")
                || value.equalsIgnoreCase("yes");
    }

    // ------------------------------------------------------ Protected Methods

    /**
     * Select an appropriate character encoding to be used, based on the
     * characteristics of the current request and/or filter initialization
     * parameters. If no character encoding should be set, return
     * {@code null}.
     *
     * The default implementation unconditionally returns the value configured
     * by the encoding initialization parameter for this
     * filter.
     *
     * @param request The servlet request we are processing
     */
    protected String selectEncoding(ServletRequest request) {
        return this.encoding;
    }
}
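To see why setting the request encoding before parameters are parsed makes the difference, the following standard-library sketch imitates form decoding with URLEncoder and URLDecoder. It is only a stand-in for what a servlet container does internally. The class FormDecodingDemo and its methods are assumptions made for illustration, and GB2312 is assumed to be available in the JVM.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical stand-in for the container's form-parameter parsing.
class FormDecodingDemo {
    // What a browser sends for a form on a GB2312 page.
    static String encodeAsBrowser(String typed, String pageCharset) {
        try {
            return URLEncoder.encode(typed, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // What parameter parsing yields for a given request encoding.
    static String decodeParameter(String rawValue, String requestEncoding) {
        try {
            return URLDecoder.decode(rawValue, requestEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "\u4e2d\u6587"; // two Chinese characters
        String onTheWire = encodeAsBrowser(typed, "GB2312");
        System.out.println(onTheWire); // %D6%D0%CE%C4
        // Request encoding set by the filter: the text is recovered.
        System.out.println(decodeParameter(onTheWire, "GB2312").equals(typed));     // true
        // Default ISO-8859-1 decoding: the text is garbled.
        System.out.println(decodeParameter(onTheWire, "ISO-8859-1").equals(typed)); // false
    }
}
```

The filter applies the first of these two decodings to every request in one place, which is why the per-parameter conversions from the earlier approaches are no longer needed.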
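With that, the problem is solved.
Garbled Chinese text in Java really is a complicated subject. I will have to dig deeper through project practice, covering not only the fixes but also the underlying knowledge of Chinese character encodings.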