This article is part of the author's series on getting started with Java and its best practices.
DOCX2PDF
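Converting DOCX documents to PDF is a common requirement in projects. The mainstream approaches fall into two categories. The first drives an office application to do the conversion, for example Microsoft Office, WPS, or LibreOffice. The second reads the document through an Office-document API offered by the language ecosystem (for example Apache POI) and then builds the PDF with a dedicated PDF generation library such as iText. In general, going through an office application preserves styles better, but it is comparatively slow. Microsoft Office involves licensing, so it cannot be used casually (the author's company was caught using it). WPS is widely used at present, but it has a hyperlink truncation problem: hyperlinks longer than 256 characters are truncated. LibreOffice's styling and layout are comparatively loose. Reading and generating through the POI API performs better and suits cases where formatting requirements are not strict. There are also ready-made online and command-line tools, such as docx2pdf and OfficeToPDF.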
Microsoft Office
The core code of this section is shown below; for the full code, see here:
private ActiveXComponent oleComponent = null;
private Dispatch activeDoc = null;
private final static String APP_ID = "Word.Application";
// Constants that map onto Word's WdSaveOptions enumeration and that
// may be passed to the close(int) method
public static final int DO_NOT_SAVE_CHANGES = 0;
public static final int PROMPT_TO_SAVE_CHANGES = -2;
public static final int SAVE_CHANGES = -1;
// These constant values determine whether or not the application
// instance will be displayed on the user's screen.
public static final boolean VISIBLE = true;
public static final boolean HIDDEN = false;
/**
* Create a new instance of the OfficeConverter class using the following
* parameters.
*
* @param visibility A primitive boolean whose value will determine whether
* or not the Word application will be visible to the user. Pass true
* to display Word, false otherwise.
*/
public OfficeConverter(boolean visibility) {
this.oleComponent = new ActiveXComponent(OfficeConverter.APP_ID);
this.oleComponent.setProperty("Visible", new Variant(visibility));
}
/**
* Open an existing Word document.
*
* @param docName An instance of the String class that encapsulates the
* path to and name of a valid Word file. Note that there are a few
* limitations applying to the format of this String; it must specify
* the absolute path to the file and it must not use the single forward
* slash to specify the path separator.
*/
public void openDoc(String docName) {
Dispatch disp = null;
Variant var = null;
// First get a Dispatch object referencing the Documents collection - for
// collections, think of ArrayLists of objects.
var = Dispatch.get(this.oleComponent, "Documents");
disp = var.getDispatch();
// Now call the Open method on the Documents collection Dispatch object
// to both open the file and add it to the collection. It would be possible
// to open a series of files and access each from the Documents collection
// but for this example, it is simpler to store a reference to the
// active document in a private instance variable.
var = Dispatch.call(disp, "Open", docName);
this.activeDoc = var.getDispatch();
}
/**
* There is more than one way to convert the document into PDF format, you
* can either explicitly use a FileConverter object or call the
* ExportAsFixedFormat method on the active document. This method opts for
* the latter and calls the ExportAsFixedFormat method passing the name
* of the file along with the integer value of 17. This value maps onto one
* of Word's constants called wdExportFormatPDF and causes the application
* to convert the file into PDF format. If you wanted to do so, for testing
* purposes, you could add another value to the args array, a Boolean value
* of true. This would open the newly converted document automatically.
*
* @param filename The path to and name of the PDF file to create.
*/
public void publishAsPDF(String filename) {
// The code to export as a PDF is 17 (wdExportFormatPDF)
//Object[] args = new Object[]{filename, Integer.valueOf(17), Boolean.TRUE};
Object[] args = new Object[] {
filename, Integer.valueOf(17)
};
Dispatch.call(this.activeDoc, "ExportAsFixedFormat", args);
}
/**
* Called to close the active document. Note that this method simply
* calls the overloaded closeDoc(int) method passing the value 0 which
* instructs Word to close the document and discard any changes that may
* have been made since the document was opened or edited.
*/
public void closeDoc() {
this.closeDoc(OfficeConverter.DO_NOT_SAVE_CHANGES);
}
/**
* Called to close the active document. It is possible with this overloaded
* version of the close() method to specify what should happen if the user
* has made changes to the document that have not been saved. There are three
* possible values defined by the following manifest constants:
* DO_NOT_SAVE_CHANGES - Close the document and discard any changes
* the user may have made.
* PROMPT_TO_SAVE_CHANGES - Display a prompt to the user asking them
* how to proceed.
* SAVE_CHANGES - Save the changes the user has made to the document.
*
* @param saveOption A primitive integer whose value indicates how the close
* operation should proceed if the user has made changes to the active
* document. Note that no checks are made on the value passed to
* this argument.
*/
public void closeDoc(int saveOption) {
Object[] args = {Integer.valueOf(saveOption)};
Dispatch.call(this.activeDoc, "Close", args);
}
/**
* Called once processing has completed in order to close down the instance
* of Word.
*/
public void quit() {
Dispatch.call(this.oleComponent, "Quit");
}
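The Javadoc of openDoc above notes that the path must be absolute and must not use forward slashes. A small pre-check along those lines could be sketched as follows. WordPaths and toWordPath are hypothetical names that do not appear in the original code, and the check is plain string handling, so it does not depend on JACOB or on running under Windows.

```java
/**
 * Hypothetical helper (not part of the original OfficeConverter class):
 * prepares a path string before it is handed to openDoc(String).
 */
final class WordPaths {

    private WordPaths() {
    }

    /**
     * Replaces forward slashes with backslashes and rejects paths that are
     * neither drive-absolute (C:\...) nor UNC (\\server\share\...).
     */
    static String toWordPath(String path) {
        String normalized = path.replace('/', '\\');
        // Drive-absolute: a letter, a colon, then a backslash
        boolean driveAbsolute = normalized.matches("^[A-Za-z]:\\\\.*");
        // UNC: starts with two backslashes
        boolean uncAbsolute = normalized.startsWith("\\\\");
        if (!driveAbsolute && !uncAbsolute) {
            throw new IllegalArgumentException("Not an absolute Windows path: " + path);
        }
        return normalized;
    }
}
```

A caller would then write something like converter.openDoc(WordPaths.toWordPath(userSuppliedPath)), so that a relative path fails early with a clear message instead of inside the COM call.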
WPS
Calling the COM interface of WPS or PDFCreator from Java to convert DOC to PDF
The core code of this section is shown below; for the full code, see here:
@Override
public boolean convert(String word, String pdf) {
File pdfFile = new File(pdf);
File wordFile = new File(word);
boolean convertSuccessfully = false;
ActiveXComponent wps = null;
ActiveXComponent doc = null;
try {
wps = new ActiveXComponent("KWPS.Application");
// Dispatch docs = wps.getProperty("Documents").toDispatch();
// Dispatch d = Dispatch.call(docs, "Open", wordFile.getAbsolutePath(), false, true).toDispatch();
// Dispatch.call(d, "SaveAs", pdfFile.getAbsolutePath(), 17);
// Dispatch.call(d, "Close", false);
doc = wps.invokeGetComponent("Documents")
.invokeGetComponent("Open", new Variant(wordFile.getAbsolutePath()));
try {
// 17 is the PDF format code; write to the path the caller asked for
doc.invoke("SaveAs",
new Variant(pdfFile.getAbsolutePath()),
new Variant(17));
convertSuccessfully = true;
} catch (Exception e) {
logger.warning("Failed to generate the PDF");
e.printStackTrace();
}
File saveAsFile = new File("C:\\Users\\lotuc\\Documents\\saveasfile.doc");
try {
doc.invoke("SaveAs", saveAsFile.getAbsolutePath());
logger.info("Saved a copy as " + saveAsFile.getAbsolutePath());
} catch (Exception e) {
logger.info("Failed to save a copy as " + saveAsFile.getAbsolutePath());
e.printStackTrace();
}
} finally {
if (doc == null) {
logger.info("Failed to open file " + wordFile.getAbsolutePath());
} else {
try {
logger.info("Releasing file " + wordFile.getAbsolutePath());
doc.invoke("Close");
doc.safeRelease();
} catch (Exception e1) {
logger.info("Failed to release file " + wordFile.getAbsolutePath());
}
}
if (wps == null) {
logger.info("Failed to load the WPS component");
} else {
try {
logger.info("Releasing the WPS component");
wps.invoke("Quit");
wps.safeRelease();
} catch (Exception e1) {
logger.info("Failed to release the WPS component");
}
}
}
return convertSuccessfully;
}
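The convert(String word, String pdf) method above carries @Override, so it presumably implements a converter interface that is not shown here. Because the approaches in this article differ in fidelity and availability, one option is to try several converters in order and stop at the first success. The sketch below assumes a hypothetical DocConverter interface with that same signature; both DocConverter and FallbackConverter are illustrative names, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface matching the convert(String, String) signature above. */
interface DocConverter {
    boolean convert(String word, String pdf);
}

/** Tries each delegate in order and stops at the first one that succeeds. */
final class FallbackConverter implements DocConverter {

    private final List<DocConverter> delegates;

    FallbackConverter(List<DocConverter> delegates) {
        // Defensive copy so later changes to the caller's list have no effect
        this.delegates = new ArrayList<>(delegates);
    }

    @Override
    public boolean convert(String word, String pdf) {
        for (DocConverter delegate : delegates) {
            try {
                if (delegate.convert(word, pdf)) {
                    return true;
                }
            } catch (RuntimeException e) {
                // A failing backend (for example a missing COM component)
                // is treated like a failed conversion; move on to the next one.
            }
        }
        return false;
    }
}
```

With this shape, a WPS-based converter could be listed first and a POI-based one last, so that the slower but more faithful path is preferred whenever it is installed.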
LibreOffice
Convert Microsoft Word to PDF - using Java and LibreOffice (UNO API)
LibreOffice itself ships a command-line tool for conversion. Once LibreOffice is installed, run:
/usr/local/bin/soffice --convert-to pdf:writer_pdf_Export /Users/lotuc/Downloads/test.doc
If a LibreOffice instance is already open, pass the -env option to specify a separate working directory:
/usr/local/bin/soffice "-env:UserInstallation=file:///tmp/LibreOffice_Conversion_abc" --convert-to pdf:writer_pdf_Export /Users/lotuc/Downloads/test.doc
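The two commands above can also be launched from Java with ProcessBuilder. The following is a minimal sketch under a few assumptions: SofficeCli and its method names are made up for illustration, the soffice path and profile URL are supplied by the caller, and --headless and --outdir are added on top of the options shown above to suppress the UI and to control where the PDF is written.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper around the soffice command line shown above. */
final class SofficeCli {

    private SofficeCli() {
    }

    /**
     * Builds the argument list. userInstallationUrl may be null; when set
     * (for example "file:///tmp/LibreOffice_Conversion_abc") it is passed
     * through the -env option so that a running instance does not interfere.
     */
    static List<String> buildCommand(String sofficePath, String inputPath,
                                     String outDir, String userInstallationUrl) {
        List<String> command = new ArrayList<>();
        command.add(sofficePath);
        if (userInstallationUrl != null) {
            command.add("-env:UserInstallation=" + userInstallationUrl);
        }
        command.add("--headless");
        command.add("--convert-to");
        command.add("pdf:writer_pdf_Export");
        command.add("--outdir");
        command.add(outDir);
        command.add(inputPath);
        return command;
    }

    /** Runs the conversion and reports whether soffice exited with code 0 in time. */
    static boolean convert(String sofficePath, String inputPath, String outDir,
                           String userInstallationUrl, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                buildCommand(sofficePath, inputPath, outDir, userInstallationUrl))
                .inheritIO()
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            // Kill a hung conversion instead of blocking the caller forever
            process.destroyForcibly();
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

Keeping the argument list in its own method makes it easy to log or unit-test the exact command without starting LibreOffice.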
To use the Java API instead, first install LibreOffice and then add the JAR files it depends on to the classpath:
Install LibreOffice
Create a Java project in your favorite editor and add these to your class path:
[Libre Office Dir]/URE/java/juh.jar
[Libre Office Dir]/URE/java/jurt.jar
[Libre Office Dir]/URE/java/ridl.jar
[Libre Office Dir]/program/classes/unoil.jar
Next, start a LibreOffice process:
import java.util.Date;
import java.io.File;
import com.sun.star.beans.PropertyValue;
import com.sun.star.comp.helper.Bootstrap;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.frame.XDesktop;
import com.sun.star.frame.XStorable;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.text.XTextDocument;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import com.sun.star.util.XReplaceDescriptor;
import com.sun.star.util.XReplaceable;
public class MailMergeExample {
public static void main(String[] args) throws Exception {
// Initialise
XComponentContext xContext = Bootstrap.bootstrap();
XMultiComponentFactory xMCF = xContext.getServiceManager();
Object oDesktop = xMCF.createInstanceWithContext(
"com.sun.star.frame.Desktop", xContext);
XDesktop xDesktop = (XDesktop) UnoRuntime.queryInterface(
XDesktop.class, oDesktop);
Next, load the target DOC document:
// Load the Document
String workingDir = "C:/projects/";
String myTemplate = "letterTemplate.doc";
if (!new File(workingDir + myTemplate).canRead()) {
throw new RuntimeException("Cannot load template:" + new File(workingDir + myTemplate));
}
XComponentLoader xCompLoader = (XComponentLoader) UnoRuntime
.queryInterface(com.sun.star.frame.XComponentLoader.class, xDesktop);
String sUrl = "file:///" + workingDir + myTemplate;
// Load the document without showing a window
PropertyValue[] propertyValues = new PropertyValue[1];
propertyValues[0] = new PropertyValue();
propertyValues[0].Name = "Hidden";
propertyValues[0].Value = Boolean.TRUE;
XComponent xComp = xCompLoader.loadComponentFromURL(
sUrl, "_blank", 0, propertyValues);
Then the content can be replaced as follows:
// Search and replace
XReplaceDescriptor xReplaceDescr = null;
XReplaceable xReplaceable = null;
XTextDocument xTextDocument = (XTextDocument) UnoRuntime
.queryInterface(XTextDocument.class, xComp);
xReplaceable = (XReplaceable) UnoRuntime
.queryInterface(XReplaceable.class, xTextDocument);
xReplaceDescr = (XReplaceDescriptor) xReplaceable
.createReplaceDescriptor();
// mail merge the date (the template is assumed to contain the placeholder <date>)
xReplaceDescr.setSearchString("<date>");
xReplaceDescr.setReplaceString(new Date().toString());
xReplaceable.replaceAll(xReplaceDescr);
// mail merge the addressee (assumed placeholder <addressee>)
xReplaceDescr.setSearchString("<addressee>");
xReplaceDescr.setReplaceString("Best Friend");
xReplaceable.replaceAll(xReplaceDescr);
// mail merge the signatory (assumed placeholder <signatory>)
xReplaceDescr.setSearchString("<signatory>");
xReplaceDescr.setReplaceString("Your New Boss");
xReplaceable.replaceAll(xReplaceDescr);
Finally, export the result to PDF:
// save as a PDF
XStorable xStorable = (XStorable) UnoRuntime
.queryInterface(XStorable.class, xComp);
propertyValues = new PropertyValue[2];
propertyValues[0] = new PropertyValue();
propertyValues[0].Name = "Overwrite";
propertyValues[0].Value = Boolean.TRUE;
propertyValues[1] = new PropertyValue();
propertyValues[1].Name = "FilterName";
propertyValues[1].Value = "writer_pdf_Export";
// Appending the favoured extension to the origin document name
String myResult = workingDir + "letterOutput.pdf";
xStorable.storeToURL("file:///" + myResult, propertyValues);
System.out.println("Saved " + myResult);
}
}
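Both loadComponentFromURL and storeToURL in the example take file URLs that are built by concatenating "file:///" with a path. That works for the simple path used there, but it can break once a path contains spaces or other characters that need escaping. A possible alternative, sketched here with the hypothetical names FileUrls and toFileUrl, is to let java.nio build the URL; whether a particular LibreOffice version accepts every URL produced this way has not been verified here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for building the file URLs passed to the UNO API. */
final class FileUrls {

    private FileUrls() {
    }

    /**
     * Resolves name against dir, makes the result absolute and returns it
     * as a file URL with reserved characters percent-encoded.
     */
    static String toFileUrl(String dir, String name) {
        Path path = Paths.get(dir, name).toAbsolutePath().normalize();
        return path.toUri().toString();
    }
}
```

In the example above, this would replace the expression "file:///" + workingDir + myTemplate with FileUrls.toFileUrl(workingDir, myTemplate).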
xdocreport
The core code of this section is shown below; for the full code, see here:
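/**
 * Generates a PDF file from the input DOCX stream using Apache POI.
 *
 * @param inputFile the input stream of the DOCX file
 * @param outFile   the output file object
 */
@SneakyThrows
public static void convertWithPOI(InputStream inputFile, File outFile) {
    // Create the document object from the input stream
    XWPFDocument document = new XWPFDocument(inputFile);
    // Create the PDF options
    PdfOptions pdfOptions = PdfOptions.create();//.fontEncoding("windows-1250")
    // Create the directory for the output file
    outFile.getParentFile().mkdirs();
    // Run the PDF conversion; try-with-resources closes the output stream
    try (FileOutputStream out = new FileOutputStream(outFile)) {
        PdfConverter.getInstance().convert(document, out, pdfOptions);
    }
}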
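/**
 * Fills the render parameters into the DOCX template and then generates the PDF.
 *
 * @param inputFile    the input stream of the DOCX template
 * @param outFile      the output file object
 * @param renderParams the parameters to render into the template
 */
@SneakyThrows
public static void convertFromTemplateWithFreemarker(InputStream inputFile, File outFile, Map<String, Object> renderParams) {
    // Create the report instance
    IXDocReport report = XDocReportRegistry.getRegistry().loadReport(
            inputFile, TemplateEngineKind.Freemarker);
    // Create the context
    IContext context = report.createContext();
    // Fill in the render parameters
    renderParams.forEach((s, o) -> {
        context.put(s, o);
    });
    // Create the directory for the output file
    outFile.getParentFile().mkdirs();
    // Create the conversion options
    Options options = Options.getTo(ConverterTypeTo.PDF).via(
            ConverterTypeVia.XWPF);
    // Run the conversion; try-with-resources closes the output stream
    try (FileOutputStream out = new FileOutputStream(outFile)) {
        report.convert(context, options, out);
    }
}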
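Both methods above call outFile.getParentFile().mkdirs(), which throws a NullPointerException when outFile is a bare file name without a parent directory. A null-safe way to prepare the output could look like the sketch below. OutputFiles and open are hypothetical names introduced only for this illustration, and the helper uses nothing beyond the JDK.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for opening the PDF output file safely. */
final class OutputFiles {

    private OutputFiles() {
    }

    /**
     * Creates any missing parent directories and opens the file for writing.
     * The caller is responsible for closing the returned stream.
     */
    static OutputStream open(File outFile) throws IOException {
        // Resolving to an absolute path gives relative names a parent as well
        Path target = outFile.toPath().toAbsolutePath();
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.newOutputStream(target);
    }
}
```

The returned stream fits directly into a try-with-resources statement around PdfConverter.getInstance().convert(...) or report.convert(...).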