protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    PrintWriter pw = response.getWriter();
    response.setCharacterEncoding("utf-8");
    response.setContentType("text/html; charset=utf-8");
    pw.print("中文");
}
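The output is garbled. Why? The character encoding has been set, so is the setting somehow being ignored?
The API documentation describes the method as follows: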
PrintWriter getWriter() throws IOException

Returns a PrintWriter object that can send character text to the client. The PrintWriter uses the character encoding returned by getCharacterEncoding(). If the response's character encoding has not been specified as described in getCharacterEncoding (i.e., the method just returns the default value ISO-8859-1), getWriter updates it to ISO-8859-1.
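In other words, by the time a PrintWriter is returned, the character encoding has already been decided and the character set is fixed. So when exactly is it fixed? Looking at the implementation of setCharacterEncoding (apparently Tomcat's, judging by the coyoteResponse field), we find the following code: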
public void setCharacterEncoding(String charset) {
    if (isCommitted())
        return;

    // Ignore any call from an included servlet
    if (included)
        return;

    // Ignore any call made after the getWriter has been invoked
    // The default should be used
    if (usingWriter)
        return;

    coyoteResponse.setCharacterEncoding(charset);
    isCharacterEncodingSet = true;
}
The usingWriter flag is set in getWriter(), so the control logic is clear: once a PrintWriter has been returned, setCharacterEncoding no longer has any effect.
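The guard can be reproduced without a servlet container. The following is only a minimal sketch: MockResponse and EncodingOrderDemo are hypothetical names, not container classes, and the mock keeps just the usingWriter check and the ISO-8859-1 default described above. The string "中文" is written with Unicode escapes so that the example does not depend on the source file encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a servlet response; not a real container class. */
class MockResponse {
    private final ByteArrayOutputStream body = new ByteArrayOutputStream();
    private Charset encoding = StandardCharsets.ISO_8859_1; // default named in the API text
    private boolean usingWriter = false;
    private PrintWriter writer;

    void setCharacterEncoding(String charset) {
        // Same idea as the guard quoted above: ignored once getWriter() was called.
        if (usingWriter) {
            return;
        }
        encoding = Charset.forName(charset);
    }

    PrintWriter getWriter() {
        if (writer == null) {
            usingWriter = true;
            // The encoder is bound to the current encoding here and never changes.
            writer = new PrintWriter(new OutputStreamWriter(body, encoding));
        }
        return writer;
    }

    byte[] bytes() {
        if (writer != null) {
            writer.flush();
        }
        return body.toByteArray();
    }
}

class EncodingOrderDemo {
    static final String TEXT = "\u4e2d\u6587"; // "中文"

    /** Same order as the failing doGet: writer first, encoding second. */
    static byte[] writerFirst() {
        MockResponse response = new MockResponse();
        PrintWriter pw = response.getWriter();
        response.setCharacterEncoding("utf-8"); // silently ignored
        pw.print(TEXT);
        return response.bytes();
    }

    /** Corrected order: encoding first, writer second. */
    static byte[] encodingFirst() {
        MockResponse response = new MockResponse();
        response.setCharacterEncoding("utf-8");
        PrintWriter pw = response.getWriter();
        pw.print(TEXT);
        return response.bytes();
    }

    public static void main(String[] args) {
        System.out.println(writerFirst().length);   // 2: both characters became '?'
        System.out.println(encodingFirst().length); // 6: three UTF-8 bytes per character
    }
}
```

In the first order the ISO-8859-1 encoder cannot represent either character and substitutes '?', which is the kind of garbling seen in the browser. In the second order the same print call produces valid UTF-8.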
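With ServletOutputStream the behavior is different:

ServletOutputStream out = response.getOutputStream();
// print(String) in the standard ServletOutputStream accepts only ISO-8859-1
// characters, so the UTF-8 bytes are written directly here.
out.write("中文".getBytes("UTF-8"));
// Case 1: displays correctly; the browser renders the page as UTF-8
// response.setContentType("text/html; charset=utf-8");
// Case 2: the browser defaults to Simplified Chinese; after manually switching it to UTF-8 the page displays correctly
// response.setCharacterEncoding("utf-8");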
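Note: with this approach the character set does not have to be set before calling getOutputStream(); setting it even after the output has been written still works, provided the response has not been committed yet. The stream sends raw bytes, and the charset only affects the Content-Type header that tells the browser how to decode them.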
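The two cases are consistent with the idea that a byte stream carries already-encoded bytes and that the charset label only tells the client how to decode them. A minimal sketch in plain Java, with no servlet container involved: ByteStreamDecodingDemo and its methods are hypothetical names, and ISO-8859-1 stands in for a wrong guess by the client (the browser in the original test guessed Simplified Chinese instead).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteStreamDecodingDemo {
    static final String TEXT = "\u4e2d\u6587"; // "中文"

    /** What ends up on the wire when the servlet writes UTF-8 bytes itself. */
    static byte[] encodeBody(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** What a client shows when it decodes the body with the given charset. */
    static String decodeAs(byte[] body, Charset clientCharset) {
        return new String(body, clientCharset);
    }

    public static void main(String[] args) {
        byte[] body = encodeBody(TEXT);
        // Client told (or manually switched) to use UTF-8: text is recovered.
        System.out.println(decodeAs(body, StandardCharsets.UTF_8).equals(TEXT)); // true
        // Client guessing a single-byte charset: six unrelated characters.
        System.out.println(decodeAs(body, StandardCharsets.ISO_8859_1).length()); // 6
    }
}
```

The bytes are identical in both calls; only the decoding side changes, which matches the observation that switching the browser to UTF-8 by hand fixes the display.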
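Conclusions:
1. To output Chinese text from a servlet through a PrintWriter, call setContentType or setCharacterEncoding before calling getWriter(). Output through a ServletOutputStream is not subject to this restriction.
2. setContentType and setCharacterEncoding have the same effect on the server as far as the character encoding is concerned, so there is no need to call both. For text output, response.setContentType("text/html; charset=utf-8"); seems more convenient.
3. PrintWriter itself is not responsible for encoding. It is better viewed as a decorator: it exists to make output more convenient by providing print, println, printf and similar helper methods. To choose an encoding, set it on the underlying Writer (an OutputStreamWriter in this case), for example: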
new PrintWriter(new OutputStreamWriter(new FileOutputStream("yourfilepath"), "UTF-8"));
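The decorator point in conclusion 3 can be checked with an in-memory stream instead of a file. This is a sketch only: PrintWriterEncodingDemo and printWith are hypothetical names, and ByteArrayOutputStream replaces the FileOutputStream from the line above so that the resulting bytes can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class PrintWriterEncodingDemo {
    /** Prints text through a PrintWriter whose underlying Writer uses the given charset. */
    static byte[] printWith(Charset charset, String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // PrintWriter only adds print/println/printf; the OutputStreamWriter does the encoding.
        try (PrintWriter pw = new PrintWriter(new OutputStreamWriter(sink, charset))) {
            pw.print(text);
        } // close() flushes the buffered characters into the sink
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // "中文"
        System.out.println(printWith(StandardCharsets.UTF_8, text).length);    // 6
        System.out.println(printWith(StandardCharsets.UTF_16BE, text).length); // 4
    }
}
```

The same print call yields different bytes depending only on the charset given to the OutputStreamWriter, which is why the encoding has to be settled before the PrintWriter is created.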