Exporting HTML Directly to PDF in Java
There is plenty of open-source code for exporting HTML directly to PDF in Java; here I use iText for the conversion.
First, the following jars are required:
core-renderer-1.0.jar
core-renderer-R8pre1.jar
core-renderer.jar
iText-2.0.8.jar
jtidy-4aug2000r7-dev.jar
Tidy.jar
iTextAsian.jar
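The Java code itself is fairly simple. The idea is to first use Tidy to convert the HTML into XHTML; XHTML can then be converted into various other formats, and the conversion to PDF is again done with iText. The code is as follows: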
    // Imports needed by the code below. BaseException is the project's own
    // exception class; LocaleContextHolder is assumed to be Spring's.
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.URL;
    import java.net.URLConnection;
    import com.lowagie.text.DocumentException;
    import com.lowagie.text.pdf.BaseFont;
    import org.springframework.context.i18n.LocaleContextHolder;
    import org.w3c.tidy.Configuration;
    import org.w3c.tidy.Tidy;
    import org.xhtmlrenderer.pdf.ITextFontResolver;
    import org.xhtmlrenderer.pdf.ITextRenderer;

    // In the Struts 1.x action:
    else if ("Html2Pdf".equalsIgnoreCase(action)) {
        exportPdfFile("http://localhost:8080/jsp/test.jsp");
        return null;
    }

    // Export a PDF. Added by huangt, 2012.6.1
    public File exportPdfFile(String urlStr) throws BaseException {
        // String outputFile = this.fileRoot + "/" +
        //     ServiceConstants.DIR_PUBINFO_EXPORT + "/" + getFileName() + ".pdf";
        String outputFile = "d:/test3.pdf";
        // try-with-resources closes the file even if rendering fails
        try (OutputStream os = new FileOutputStream(outputFile)) {
            ITextRenderer renderer = new ITextRenderer();
            String str = getHtmlFile(urlStr);
            renderer.setDocumentFromString(str);
            ITextFontResolver fontResolver = renderer.getFontResolver();
            fontResolver.addFont("C:/WINDOWS/Fonts/SimSun.ttc",
                    BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED); // SimSun (Chinese text)
            fontResolver.addFont("C:/WINDOWS/Fonts/Arial.ttf",
                    BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED); // Arial
            renderer.layout();
            renderer.createPDF(os);
            os.flush();
            System.out.println("Conversion succeeded!");
            return new File(outputFile);
        } catch (FileNotFoundException e) {
            // logger.error("File does not exist! " + e.getMessage());
            throw new BaseException(e);
        } catch (DocumentException e) {
            // logger.error("Error while generating the PDF! " + e.getMessage());
            throw new BaseException(e);
        } catch (IOException e) {
            // logger.error("PDF error! " + e.getMessage());
            throw new BaseException(e);
        }
    }

    // Read the page content. Added by huangt, 2012.6.1
    public String getHtmlFile(String urlStr) throws BaseException {
        try {
            // Pass the current locale to the page as a query parameter
            if (urlStr.indexOf("?") != -1) {
                urlStr = urlStr + "&locale=" + LocaleContextHolder.getLocale().toString();
            } else {
                urlStr = urlStr + "?locale=" + LocaleContextHolder.getLocale().toString();
            }
            URL url = new URL(urlStr);
            URLConnection uc = url.openConnection();
            Tidy tidy = new Tidy();
            ByteArrayOutputStream os2 = new ByteArrayOutputStream();
            tidy.setXHTML(true);                      // output XHTML (XML output is also possible)
            tidy.setCharEncoding(Configuration.UTF8); // set the encoding so Chinese converts correctly
            tidy.setTidyMark(false);                  // otherwise Tidy adds a meta tag to the output
            tidy.setXmlPi(true);                      // emit <?xml version="1.0"?>
            tidy.setIndentContent(true);              // indentation; optional, only for readability
            try (InputStream is = uc.getInputStream()) {
                tidy.parse(is, os2);
            }
            // Avoid garbled characters: decode the converted output as UTF-8.
            // Decoding the whole buffer also keeps the line breaks, which the
            // original readLine() loop dropped.
            return os2.toString("UTF-8");
        } catch (IOException e) {
            // logger.error("Error while reading the page content: " + e.getMessage());
            throw new BaseException(e);
        }
    }
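Two steps of getHtmlFile depend only on the JDK: choosing `?` or `&` when appending the locale parameter, and decoding Tidy's output bytes as UTF-8. The sketch below isolates them so they can be tried without jTidy or iText on the classpath. The class name HtmlFetchHelpers, the method names, and the `id=1` query parameter are hypothetical and only serve the illustration; they are not part of the original code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the original code.
public class HtmlFetchHelpers {

    // Appends locale=<value>, using '&' when the URL already has a
    // query string and '?' otherwise.
    public static String withLocale(String urlStr, String locale) {
        String separator = urlStr.indexOf('?') != -1 ? "&" : "?";
        return urlStr + separator + "locale=" + locale;
    }

    // Decodes the buffered bytes as UTF-8 in one step, so line breaks
    // in the converted document are preserved.
    public static String decodeUtf8(ByteArrayOutputStream buffer) {
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // URL without and with an existing query string ("id=1" is made up)
        System.out.println(withLocale("http://localhost:8080/jsp/test.jsp", "zh_CN"));
        System.out.println(withLocale("http://localhost:8080/jsp/test.jsp?id=1", "zh_CN"));

        // Simulate Tidy having written UTF-8 bytes into the buffer
        byte[] bytes = "<p>中文</p>".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffer.write(bytes, 0, bytes.length);
        System.out.println(decodeUtf8(buffer));
    }
}
```

In the real method the locale string would come from LocaleContextHolder.getLocale().toString() and the buffer would be the one passed to tidy.parse.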
To resolve the jar dependencies, add the following to the Maven pom.xml:
    <!-- PDF export -->
    <dependency>
        <groupId>com.lowagie</groupId>
        <artifactId>itext</artifactId>
        <version>2.1.7</version>
    </dependency>
    <dependency>
        <groupId>org.xhtmlrenderer.flyingsaucer</groupId>
        <artifactId>pdf-renderer</artifactId>
        <version>1.0</version>
    </dependency>
    <dependency>
        <groupId>jtidy</groupId>
        <artifactId>jtidy</artifactId>
        <version>4aug2000r7-dev</version>
        <type>jar</type>
        <scope>compile</scope>
    </dependency>
    <dependency>
        <groupId>net.sf.barcode4j</groupId>
        <artifactId>barcode4j-light</artifactId>
        <version>2.0</version>
    </dependency>
    <dependency>
        <groupId>avalon-framework</groupId>
        <artifactId>avalon-framework-impl</artifactId>
        <version>4.2.0</version>
    </dependency>
    <!-- pdf -->
A somewhat more complex PDFUtils.java file is also attached. I have not had time to tidy it up or explain it; see the download attachment.