Converting HTML Pages to PDF in Java: A Solution

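The first step, of course, is to find a good component for generating PDFs. Baidu and Google both point to the same answer: iText. Among open-source components, iText is indeed a first choice. If you only need to turn images into a PDF, or you generate the HTML yourself from Velocity or FreeMarker templates, using iText directly is highly recommended. If, like me, you have inherited HTML pages that someone else already wrote, or you would rather not write FreeMarker templates, skip to the next paragraph.
Since the HTML pages were already written and already displayed well, what I needed was a PDF generator that handles HTML + CSS properly. That is where flying-saucer came into consideration.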
http://code.google.com/p/flying-saucer/
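The URL above is the project page; the tool is hosted on Google Code. Its authors describe what it does as follows:
Flying Saucer takes XML or XHTML and applies CSS 2.1-compliant stylesheets to it, in order to render to PDF (via iText), images, and on-screen using Swing or SWT.
The working principle is easy to see: it parses XML or XHTML, applies the CSS stylesheets, and renders the result to PDF through iText (Swing and SWT are used for on-screen rendering). This solves the page display problem. iText on its own has serious trouble parsing CSS, and Flying Saucer addresses that. Below is the code that uses Flying Saucer: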

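// Requires java.io.File, java.io.FileOutputStream, java.io.OutputStream,
// org.xhtmlrenderer.pdf.ITextRenderer, org.xhtmlrenderer.pdf.ITextFontResolver,
// and iText's BaseFont (com.lowagie.text.pdf.BaseFont in the iText 2.x line).
public boolean convertHtmlToPdf(String inputFile, String outputFile)
        throws Exception {
    // try-with-resources closes the stream even if rendering fails
    try (OutputStream os = new FileOutputStream(outputFile)) {
        ITextRenderer renderer = new ITextRenderer();
        String url = new File(inputFile).toURI().toURL().toString();

        renderer.setDocument(url);

        // Chinese text support: register a font that contains the glyphs
        ITextFontResolver fontResolver = renderer.getFontResolver();
        fontResolver.addFont("C:/Windows/Fonts/SIMSUN.TTC", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
        // Base URL against which relative image paths are resolved
        renderer.getSharedContext().setBaseURL("file:/D:/");
        renderer.layout();
        renderer.createPDF(os);

        os.flush();
    }
    return true;
}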

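The code above works as follows: given the location of an HTML file (inputFile, which is converted to a URL) and an output location, it generates the PDF at the output path.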

Notes:

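1. The input HTML page must be a standard XHTML page. The top of the page must declare the document as XHTML, and the HTML syntax must be strict: every tag must be closed, and so on. Flying Saucer parses the page as XML, so sloppy markup causes an error. This is the first requirement for the page.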

2. Wherever an image is used, write its path in relative form; the location of the images must then be specified in the Java code.

renderer.getSharedContext().setBaseURL("file:/D:/");

Another option is to write the absolute path directly in the tag.

3. Flying Saucer throws an error when it processes images in TIFF format. I have not found the exact cause yet; pointers are welcome.

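4. If the page contains Chinese text, the styles in the HTML must set a font via CSS, and the font name must be written in English. The Java code then registers the corresponding font file.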

ITextFontResolver fontResolver = renderer.getFontResolver();
fontResolver.addFont("C:/Windows/Fonts/SIMSUN.TTC", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);

The code above adds SimSun (宋体). Other fonts can be added the same way.
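
Notes 1 and 2 can be checked before any rendering happens. The following is a minimal sketch that uses only the JDK and is not part of Flying Saucer: the class name RenderInputs and its methods isWellFormed, baseUrlFor, and resolve are hypothetical helpers introduced here for illustration. It tests whether markup parses as XML, builds a base URL string from a directory instead of hard-coding one, and shows how a relative image path combines with that base under standard URI resolution, which is assumed here to approximate what the renderer does.

```java
import java.io.File;
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; none of these names come from Flying Saucer.
class RenderInputs {

    /** Returns true if the markup parses as well-formed XML (all tags closed, one root). */
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            // A DOCTYPE naming an external DTD may trigger a fetch here;
            // handling that is left out of this sketch.
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    /** Builds a file: URL string for a directory, always ending in a slash. */
    static String baseUrlFor(File directory) {
        String url = directory.getAbsoluteFile().toURI().toString();
        return url.endsWith("/") ? url : url + "/";
    }

    /** Resolves a relative path against a base URL using standard URI rules. */
    static String resolve(String baseUrl, String relativePath) {
        return URI.create(baseUrl).resolve(relativePath).toString();
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<html><body><p>ok</p></body></html>")); // true
        System.out.println(isWellFormed("<html><body><br></body></html>"));      // false: <br> is never closed
        System.out.println(resolve("file:/D:/", "images/logo.png"));             // file:/D:/images/logo.png
    }
}
```

The string returned by baseUrlFor could be passed to setBaseURL in place of the literal "file:/D:/", and a page that fails isWellFormed can be rejected before the renderer is created.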

That is the solution.

The download link for the required packages is given below; they can be downloaded directly.

http://download.csdn.net/detail/jasonchris/4403777

A related problem found online
I am trying to use flyingsaucer to serve pdf generated from xhtml but I am having trouble getting the servlet example to run.

All the other flyingsaucer examples work fine for me but I need this to work as a servlet to incorporate into a webapp.

The full code for the servlet is as follows

import java.io.*;
import java.net.*;

import javax.servlet.*;
import javax.servlet.http.*;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xhtmlrenderer.pdf.ITextRenderer;

public class PDFServlet extends HttpServlet {

    protected void processRequest(HttpServletRequest request,
        HttpServletResponse response) throws ServletException, IOException {
        response.setContentType("application/pdf");

        // Note: tag names and attributes in the markup strings below are assumed;
        // only the text content and the href/media attributes are certain.
        StringBuffer buf = new StringBuffer();
        buf.append("<html>");

        String css = getServletContext().getRealPath("/PDFservlet.css");
        System.out.println("css url 2= " + css);
        // put in some style
        buf.append("<head><link rel='stylesheet' type='text/css' "+
                "href='"+css+"' media='print'/></head>");

        buf.append("<body>");
        buf.append("<h1>Quarterly Reports for " + request.getParameter("username")+"</h1>");

        buf.append("<table>");
        buf.append("<tr><th>Sales</th><th>Profit</th><th>Bonus</th></tr>");

        // generate sales data
        int totalSales = 0;
        int totalProfit = 0;
        int totalBonus = 0;
        for(int i=0; i<10; i++) {
            int currentSales = (int)(Math.random()*10000);
            int currentProfit = (int)(currentSales*0.2);
            int currentBonus = (int)(currentProfit*0.33);
            buf.append("<tr><td>"+currentSales+"$</td><td>"+
                    currentProfit+"$</td><td>"+currentBonus+"$</td></tr>");
            totalSales += currentSales;
            totalProfit += currentProfit;
            totalBonus += currentBonus;
        }
        buf.append("<tr><td colspan='3'>totals</td></tr>");
        buf.append("<tr><td>"+totalSales+"$</td><td>"+
                totalProfit+"$</td><td>"+totalBonus+"$</td></tr>");
        buf.append("</table>");

        buf.append("</body>");
        buf.append("</html>");

        byte[] byteArray = buf.toString().getBytes("ISO-8859-1");

        // parse our markup into an xml Document
        DocumentBuilder builder;
        try {
            builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            ByteArrayInputStream baos = new ByteArrayInputStream(byteArray);
            Document doc = builder.parse(baos);
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(doc, null);
            OutputStream os = response.getOutputStream();
            renderer.layout();
            renderer.createPDF(os);
            os.flush();
            os.close();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
        processRequest(request, response);
    }

    protected void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
        processRequest(request, response);
    }

    public String getServletInfo() {
        return "Short description";
    }
}

Running it produces the following exception:

Jan 17, 2013 7:55:23 PM org.xhtmlrenderer.util.XRLog log
WARNING: Unhandled exception. IOException on parsing style seet from a Reader; don't know the URI.
java.io.IOException: Stream closed
    at java.io.BufferedInputStream.getInIfOpen(Unknown Source)
    at java.io.BufferedInputStream.read1(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    at sun.nio.cs.StreamDecoder.readBytes(Unknown Source)
    at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
    at sun.nio.cs.StreamDecoder.read(Unknown Source)
    at java.io.InputStreamReader.read(Unknown Source)
    at org.xhtmlrenderer.css.parser.Lexer.zzRefill(Lexer.java:1527)
    ...
    at org.xhtmlrenderer.context.StyleReference.readAndParseAll(StyleReference.java:122)
    at org.xhtmlrenderer.context.StyleReference.setDocumentContext(StyleReference.java:106)
    at org.xhtmlrenderer.pdf.ITextRenderer.setDocument(ITextRenderer.java:130)
    at org.xhtmlrenderer.pdf.ITextRenderer.setDocument(ITextRenderer.java:106)
    at PDFServlet.processRequest(PDFServlet.java:73)
    at PDFServlet.doGet(PDFServlet.java:75)
    ...

Solution

String css = getServletContext().getRealPath("/PDFservlet.css");

This is not right. It has to be a URL, not a local disk file system path. The renderer attempts to download the stylesheet by URL "the usual way", just as a web browser would.

One of the ways to construct the proper URL would be this:

StringBuffer url = req.getRequestURL();
String base = url.substring(0, url.length() - req.getRequestURI().length() + req.getContextPath().length());
String css = base + "/PDFservlet.css";
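
To see what the substring arithmetic above produces, here is a small sketch that takes the three servlet values as plain strings so it runs without a servlet container. The class name StylesheetUrl, its method names, and the sample host and context path are assumptions made for illustration; in a real servlet the arguments would come from getRequestURL(), getRequestURI(), and getContextPath().

```java
// Hypothetical helper: same arithmetic as the answer above, on plain strings.
class StylesheetUrl {

    /** Strips the servlet path from the request URL, keeping scheme, host, port and context path. */
    static String baseUrl(String requestUrl, String requestUri, String contextPath) {
        // requestUrl ends with requestUri, and requestUri starts with contextPath,
        // so cutting (requestUri length - contextPath length) characters leaves the app root.
        int end = requestUrl.length() - requestUri.length() + contextPath.length();
        return requestUrl.substring(0, end);
    }

    /** Builds the absolute URL of the stylesheet under the application root. */
    static String cssUrl(String requestUrl, String requestUri, String contextPath) {
        return baseUrl(requestUrl, requestUri, contextPath) + "/PDFservlet.css";
    }

    public static void main(String[] args) {
        // Sample values for an application deployed under /myapp (assumed).
        System.out.println(cssUrl("http://localhost:8080/myapp/pdf", "/myapp/pdf", "/myapp"));
        // prints http://localhost:8080/myapp/PDFservlet.css
    }
}
```

With an application deployed at the root context, the context path is the empty string and the base reduces to scheme, host, and port.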
