Java实现HTML页面转PDF解决方案

首先,当然是找到能够解析PDF的完美组件,百度和谷歌不约而同的告诉我们。IText是王道。而目前开源的组件中,Itext的确是一个First Choice,如果各位单纯是做把图片转成PDF或者自己写了Velocity或者FreeMarker模板生成了HTML是非常推荐直接用Itext来进行的。而如果,大家像我这样已经有前人写好了HTML页面或者懒得写FreeMarker模板的话。可以直接看下一段。
由于他们已经写好了HTML页面,而且显示已经很完美了。那我要做的就是能完美解析HTML+CSS的PDF生成工具。这时候flying-saucer进入了我的选择范围中。
http://code.google.com/p/flying-saucer/
上面是网址,这个工具托管在GoogleCode上面,作者做他们能够做下面的工作:
Flying Saucer takes XML or XHTML and applies CSS 2.1-compliant stylesheets to it, in order to render to PDF (via iText), images, and on-screen using Swing or SWT。
不难看出工作原理,就是解析XML或者XHTML并且包括css样式表,并且用Swing或者SWT的组件生成PDF的功能。这解决了页面的显示问题。IText自身的一个很严重的问题就是解析CSS有很大的问题。而这个解决了。下面就是用Flying Saucer来实现的代码:

public boolean convertHtmlToPdf(String inputFile, String outputFile)
    throws Exception {

        OutputStream os = new FileOutputStream(outputFile); 
        ITextRenderer renderer = new ITextRenderer(); 
        String url = new File(inputFile).toURI().toURL().toString(); 

        renderer.setDocument(url); 

        // 解决中文支持问题     
        ITextFontResolver fontResolver = renderer.getFontResolver(); 
        fontResolver.addFont("C:/Windows/Fonts/SIMSUN.TTC", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED); 
        //解决图片的相对路径问题
        renderer.getSharedContext().setBaseURL("file:/D:/");
        renderer.layout(); 
        renderer.createPDF(os); 

        os.flush();
        os.close();
        return true;
    }

上面这段代码是这样的,输入一个HTML地址URL = inputFile,输入一个要输出的地址,就可以在输出的PDF地址中生成这个PDF。

注意事项:

1.输入的HTML页面必须是标准的XHTML页面。页面的顶上必须是这样的
TML页面的语法必须是非常严谨的,所有标签都必须闭合等等(由于flying-Saucer做了XML解析的工作,不严谨会报错的。),这是对页面的第一个要求。

2.要用到图片的地方写相对路径的形式,323图片位置则必须在Java代码中指定。

renderer.getSharedContext().setBaseURL(“file:/D:/”);

也有另一种方法就是直接在标签中写绝对路径。

3.Flying-Saucer在解析tiff格式的图片的时候会报错。具体原因我还没找到。希望大家能够指点我。

4.如果在页面中有中文字体的话。必须在HTML代码中的样式中写上某种字体的css,并且必须是用英文的,然后在Java代码中写上对应的文件位置。

ITextFontResolver fontResolver = renderer.getFontResolver();

           fontResolver.addFont("C:/Windows/Fonts/SIMSUN.TTC", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);

上面的方法是添加了宋体。也可以添加其他字体。

以上就是解决方案。

下面给出这几个包的下载地址。大家可以直接下载。

http://download.csdn.net/detail/jasonchris/4403777

网上遇到一个问题
I am trying to use flyingsaucer to serve pdf generated from xhtml but I am having trouble getting the servlet example to run.

All the other flyingsaucer examples work fine for me but I need this to work as a servlet to incorporate into a webapp.

The full code for the servlet is as follows

import java.io.*;
import java.net.*;

import javax.servlet.*;
import javax.servlet.http.*;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xhtmlrenderer.pdf.ITextRenderer;

public class PDFServlet extends HttpServlet {

    protected void processRequest(HttpServletRequest request, 
        HttpServletResponse response) throws ServletException, IOException {
        response.setContentType("application/pdf");

        StringBuffer buf = new StringBuffer();
        buf.append("<html>");

        String css = getServletContext().getRealPath("/PDFservlet.css");
        System.out.println("css url 2= " + css);
        // put in some style
        buf.append("<head><link rel='stylesheet' type='text/css' "+
                "href='"+css+"' media='print'/></head>");

        buf.append("<body>");
        buf.append("<h1>Quarterly Reports for " +
            request.getParameter("username")+"</h1>");

        buf.append("<table cellspacing='0'>");
        buf.append("<tr><th>Sales</th><th>Profit</th><th>Bonus</th></tr>");

        // generate sales data
        int totalSales = 0;
        int totalProfit = 0;
        int totalBonus = 0;
        for(int i=0; i<10; i++) {
            int currentSales = (int)(Math.random()*10000);
            int currentProfit = (int)(currentSales*0.2);
            int currentBonus = (int)(currentProfit*0.33);
            buf.append("<tr><td>"+currentSales+"$</td><td>"+
                currentProfit+"$</td><td>"+currentBonus+"$</td></tr>");
            totalSales  += currentSales;
            totalProfit += currentProfit;
            totalBonus  += currentBonus;
        }

        buf.append("<tr class='total-header'><td colspan='3'>totals</td></tr>");
        buf.append("<tr class='total'><td>"+totalSales+"$</td><td>"+
            totalProfit+"$</td><td>"+totalBonus+"$</td></tr>");
        buf.append("</table>");

        buf.append("</body>");
        buf.append("</html>");

        byte[] byteArray = buf.toString().getBytes("ISO-8859-1"); 

        // parse our markup into an xml Document
        DocumentBuilder builder;
        try {
            builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            ByteArrayInputStream baos = new ByteArrayInputStream(byteArray); 
            Document doc = builder.parse(baos);

            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(doc, null);

            OutputStream os = response.getOutputStream();
            renderer.layout();
            renderer.createPDF(os);
            os.flush();
            os.close();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    protected void doGet(HttpServletRequest request, 
        HttpServletResponse response) throws ServletException, IOException {
        processRequest(request, response);
    }

    protected void doPost(HttpServletRequest request, 
        HttpServletResponse response) throws ServletException, IOException {
        processRequest(request, response);
    }

    public String getServletInfo() {
        return "Short description";
    }
}
Jan 17, 2013 7:55:23 PM org.xhtmlrenderer.util.XRLog log
WARNING: Unhandled exception. IOException on parsing style seet from a Reader; don't know the URI.
java.io.IOException: Stream closed
    at java.io.BufferedInputStream.getInIfOpen(Unknown Source)
    at java.io.BufferedInputStream.read1(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    at sun.nio.cs.StreamDecoder.readBytes(Unknown Source)
    at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
    at sun.nio.cs.StreamDecoder.read(Unknown Source)
    at java.io.InputStreamReader.read(Unknown Source)
    at org.xhtmlrenderer.css.parser.Lexer.zzRefill(Lexer.java:1527)
    ...
    at org.xhtmlrenderer.context.StyleReference.readAndParseAll(StyleReference.java:122)
    at org.xhtmlrenderer.context.StyleReference.setDocumentContext(StyleReference.java:106)
    at org.xhtmlrenderer.pdf.ITextRenderer.setDocument(ITextRenderer.java:130)
    at org.xhtmlrenderer.pdf.ITextRenderer.setDocument(ITextRenderer.java:106)
    at PDFServlet.processRequest(PDFServlet.java:73)
    at PDFServlet.doGet(PDFServlet.java:75)
    ...

解决方法

String css = getServletContext().getRealPath("/PDFservlet.css");

This is not right. It has to be an URL, not a local disk file system path. IText is attempting to download it by an URL “the usual way”, like as a webbrowser would do.

One of the ways to construct the proper URL would be this:

ringBuffer url = req.getRequestURL();
String base = url.substring(0, url.length() - req.getRequestURI().length() + req.getContextPath().length());
String css = base + "/PDFservlet.css";

你可能感兴趣的:(java,html,pdf)