2020-02-23
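I tried converting Office documents directly to images with LibreOffice, but only the first page came out, and the help output shows no way to convert a whole document straight to images. The workaround is to convert the document to PDF first and then turn the PDF into images.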
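There are two ways to do the conversion, each with its own trade-offs; choose whichever fits your case.
The asynchronous way invokes LibreOffice through an operating-system command. How long the conversion takes depends on the file size, so if the document has to be previewable immediately after upload, use the synchronous way instead.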
The synchronous way uses JodConverter: GitHub - sbraconnier/jodconverter: JODConverter automates document conversions using LibreOffice or Apache OpenOffice.
The contents of libre.properties are as follows:
```properties
# LibreOffice home directory
libreOfficeHome=C:/dev/LibreOffice6.4
# Start several LibreOffice processes, one process per port
# portNumbers=2002,2003
portNumbers=2002
# Task execution timeout: 5 minutes
taskExecutionTimeoutMinutes=5
# Task queue timeout: 1 hour
taskQueueTimeoutHours=1
```
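As a rough illustration of how these settings turn into the values an office manager needs, the sketch below parses the port list and converts the two timeouts to milliseconds using only the JDK. The class and method names (`LibreSettingsSketch`, `parsePorts`, `minutesToMillis`, `hoursToMillis`, `load`) are hypothetical and are not part of JodConverter or of the original project.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Properties;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class LibreSettingsSketch {

    /** Loads properties from text; a real application would read libre.properties from the classpath. */
    static Properties load(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    /** Parses a comma-separated list such as "2002,2003" into an int array, ignoring blanks. */
    static int[] parsePorts(String raw) {
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    /** Converts a whole number of minutes, given as text, to milliseconds. */
    static long minutesToMillis(String minutes) {
        return TimeUnit.MINUTES.toMillis(Long.parseLong(minutes.trim()));
    }

    /** Converts a whole number of hours, given as text, to milliseconds. */
    static long hoursToMillis(String hours) {
        return TimeUnit.HOURS.toMillis(Long.parseLong(hours.trim()));
    }

    public static void main(String[] args) {
        Properties p = load("portNumbers=2002,2003\n"
                + "taskExecutionTimeoutMinutes=5\n"
                + "taskQueueTimeoutHours=1\n");
        System.out.println(Arrays.toString(parsePorts(p.getProperty("portNumbers", ""))));
        System.out.println(minutesToMillis(p.getProperty("taskExecutionTimeoutMinutes", "5")));
        System.out.println(hoursToMillis(p.getProperty("taskQueueTimeoutHours", "1")));
    }
}
```

Doing the unit conversion in `long` arithmetic avoids any risk of `int` overflow if someone later configures a very large timeout.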
```java
package com.example.demo;

import com.example.factory.OfficeManagerInstance;
import org.jodconverter.JodConverter;

import java.io.File;

public class LibreOfficeUtil {
    /**
     * Converts an Office document to PDF with JodConverter (requires LibreOffice).
     * The conversion is synchronous: it has finished by the time this method returns.
     */
    public static boolean convertOffice2PDFSyncIsSuccess(File sourceFile, File targetFile) {
        try {
            OfficeManagerInstance.start();
            JodConverter.convert(sourceFile).to(targetFile).execute();
        } catch (Exception e) {
            e.printStackTrace();
            return false;
        }

        return true;
    }

    /**
     * Converts an Office document to PDF by calling LibreOffice on the command line.
     * The conversion is asynchronous: it may still be running when this method returns,
     * and any conversion error goes unnoticed.
     *
     * @param filePath       directory containing the source file (on Linux it is
     *                       concatenated with fileName, so it must end with "/")
     * @param fileName       name of the source file
     * @param targetFilePath output directory; if empty, LibreOffice's default is used
     * @return exit status of the child process
     */
    public static int convertOffice2PDFAsync(String filePath, String fileName, String targetFilePath) throws Exception {
        String command;
        String osName = System.getProperty("os.name");
        String outDir = targetFilePath.length() > 0 ? " --outdir " + targetFilePath : "";

        if (osName.contains("Windows")) {
            command = "cmd /c cd /d " + filePath + " && start soffice --headless --invisible --convert-to pdf ./" + fileName + outDir;
        } else {
            command = "libreoffice6.3 --headless --invisible --convert-to pdf:writer_pdf_Export " + filePath + fileName + outDir;
        }

        return executeOSCommand(command);
    }

    /**
     * Runs the given command through the operating system.
     * This method does not wait for the conversion to finish. An exit status of 0 only
     * means that the command was dispatched correctly; it says nothing about the result
     * of the conversion or whether it failed. The synchronous conversion is therefore
     * the better choice.
     */
    private static int executeOSCommand(String command) throws Exception {
        Process process = Runtime.getRuntime().exec(command);

        // A conversion takes time (roughly 8 seconds for a document of about 3 MB), yet in
        // testing the next line ran as soon as the command had been sent, without waiting
        // for the conversion to finish.
        int exitStatus = process.waitFor();

        // Destroy the child process.
        process.destroy();
        return exitStatus;
    }
}
```
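If the command-line route is still preferred, one possible variation is to launch `soffice` directly with `ProcessBuilder` instead of going through `cmd /c ... start`, and to wait on the process with a timeout. The sketch below assumes Java 11+, assumes the binary name passed in is resolvable on the PATH, and uses hypothetical names (`SofficeProcessSketch`, `buildCommand`, `runAndWait`). Whether waiting on the launcher really covers the whole conversion can depend on the platform and on whether another LibreOffice instance is already running, so treat it as a starting point rather than a guarantee.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class SofficeProcessSketch {

    /** Builds the argument list; passing arguments separately avoids quoting issues with spaces. */
    static List<String> buildCommand(String sofficeBinary, Path source, Path outDir) {
        return List.of(
                sofficeBinary,
                "--headless",
                "--convert-to", "pdf",
                "--outdir", outDir.toString(),
                source.toString());
    }

    /**
     * Starts the command and waits up to timeoutSeconds for it to exit.
     * Returns true only if the process exited in time with status 0.
     */
    static boolean runAndWait(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                // Discard output so a full pipe buffer cannot block the child process.
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();

        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            return false;
        }
        return process.exitValue() == 0;
    }

    public static void main(String[] args) {
        // Only prints the command; call runAndWait(...) on a machine with LibreOffice installed.
        System.out.println(buildCommand("soffice", Path.of("in.docx"), Path.of("out")));
    }
}
```

Even with a zero exit status it is safer to check that the expected PDF file actually exists before serving a preview.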
```java
package com.example.factory;

import org.jodconverter.office.LocalOfficeManager;
import org.jodconverter.office.OfficeManager;
import org.springframework.core.io.support.PropertiesLoaderUtils;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct;
import java.io.IOException;
import java.util.Properties;

/**
 * github https://github.com/uncleAndyChen
 * homepage https://www.lovesofttech.com/
 * author andyChen
 * since 2020/02/29
 */
@Component
public class OfficeManagerInstance {
    private static OfficeManager INSTANCE = null;

    public static synchronized void start() {
        officeManagerStart();
    }

    @PostConstruct
    private void init() {
        try {
            Properties properties = PropertiesLoaderUtils.loadAllProperties("libre.properties");
            String[] portNumbers = properties.getProperty("portNumbers", "").split(",");
            int[] ports = new int[portNumbers.length];

            for (int i = 0; i < portNumbers.length; i++) {
                // trim() tolerates spaces after the commas in the port list
                ports[i] = Integer.parseInt(portNumbers[i].trim());
            }

            LocalOfficeManager.Builder builder = LocalOfficeManager.builder().install();
            builder.officeHome(properties.getProperty("libreOfficeHome", ""));
            builder.portNumbers(ports);
            // minutes -> milliseconds
            builder.taskExecutionTimeout(Integer.parseInt(properties.getProperty("taskExecutionTimeoutMinutes", "")) * 60L * 1000L);
            // hours -> milliseconds
            builder.taskQueueTimeout(Integer.parseInt(properties.getProperty("taskQueueTimeoutHours", "")) * 60L * 60L * 1000L);
            INSTANCE = builder.build();
            officeManagerStart();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void officeManagerStart() {
        // INSTANCE stays null if init() has not run yet or failed to load the properties
        if (INSTANCE == null) {
            throw new IllegalStateException("OfficeManager has not been initialized");
        }

        if (INSTANCE.isRunning()) {
            return;
        }

        try {
            INSTANCE.start();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```
https://github.com/sbraconnier/jodconverter/wiki/Getting-Started
Configuration · sbraconnier/jodconverter Wiki · GitHub
Java Library · sbraconnier/jodconverter Wiki · GitHub
See also: How to troubleshoot runtime errors caused by JAR dependency conflicts in a Maven project
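There is no detailed official online documentation for converting documents with libreoffice6.3, but the -h option prints detailed help that is sufficient for development.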
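For example, to convert a file to PDF:
libreoffice6.3 --headless --invisible --convert-to pdf:writer_pdf_Export ./奇妙的记忆力.pptx
An output directory for the PDF can be appended with --outdir; if it is omitted, the PDF is saved to the current directory.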
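Because the command line does not report where the PDF ended up, calling code usually has to work out the output path itself. The sketch below assumes LibreOffice names the result after the source file with its last extension replaced by `.pdf`, placed in the output directory. The names `PdfTargetSketch` and `expectedPdf` are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class PdfTargetSketch {

    /** Returns outDir/<source base name>.pdf, replacing only the last extension of the source name. */
    static Path expectedPdf(Path source, Path outDir) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // dot > 0 keeps names like ".hidden" intact instead of producing an empty base name
        String base = dot > 0 ? name.substring(0, dot) : name;
        return outDir.resolve(base + ".pdf");
    }

    public static void main(String[] args) {
        Path pdf = expectedPdf(Path.of("docs", "slides.pptx"), Path.of("out"));
        System.out.println(pdf + " exists: " + Files.exists(pdf));
    }
}
```

Checking `Files.exists` on this path after the command returns gives a simple signal that the conversion actually produced a file.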
```
[root@ebs-60027 lib64]# libreoffice6.3 -h
Usage: soffice [argument...]
       argument - switches, switch parameters and document URIs (filenames).

Using without special arguments:
Opens the start center, if it is used without any arguments.
   {file}              Tries to open the file (files) in the components
                       suitable for them.
   {file} {macro:///Library.Module.MacroName}
                       Opens the file and runs specified macros from
                       the file.

Getting help and information:
   --help | -h | -?    Shows this help and quits.
   --helpwriter        Opens built-in or online Help on Writer.
   --helpcalc          Opens built-in or online Help on Calc.
   --helpdraw          Opens built-in or online Help on Draw.
   --helpimpress       Opens built-in or online Help on Impress.
   --helpbase          Opens built-in or online Help on Base.
   --helpbasic         Opens built-in or online Help on Basic scripting
                       language.
   --helpmath          Opens built-in or online Help on Math.
   --version           Shows the version and quits.
   --nstemporarydirectory
                       (MacOS X sandbox only) Returns path of the temporary
                       directory for the current user and exits. Overrides
                       all other arguments.

General arguments:
   --quickstart[=no]   Activates[Deactivates] the Quickstarter service.
   --nolockcheck       Disables check for remote instances using one
                       installation.
   --infilter={filter} Force an input filter type if possible. For example:
                       --infilter="Calc Office Open XML"
                       --infilter="Text (encoded):UTF8,LF,,,"
   --pidfile={file}    Store soffice.bin pid to {file}.
   --display {display} Sets the DISPLAY environment variable on UNIX-like
                       platforms to the value {display} (only supported by a
                       start script).

User/programmatic interface control:
   --nologo            Disables the splash screen at program start.
   --minimized         Starts minimized. The splash screen is not displayed.
   --nodefault         Starts without displaying anything except the splash
                       screen (do not display initial window).
   --invisible         Starts in invisible mode. Neither the start-up logo nor
                       the initial program window will be visible.
                       Application can be controlled, and documents and
                       dialogs can be controlled and opened via the API.
                       Using the parameter, the process can only be ended
                       using the taskmanager (Windows) or the kill command
                       (UNIX-like systems). It cannot be used in conjunction
                       with --quickstart.
   --headless          Starts in "headless mode" which allows using the
                       application without GUI. This special mode can be used
                       when the application is controlled by external clients
                       via the API.
   --norestore         Disables restart and file recovery after a system crash.
   --safe-mode         Starts in a safe mode, i.e. starts temporarily with a
                       fresh user profile and helps to restore a broken
                       configuration.
   --accept={connect-string}
                       Specifies a UNO connect-string to create a UNO acceptor
                       through which other programs can connect to access the
                       API. Note that API access allows execution of arbitrary
                       commands.
                       The syntax of the {connect-string} is:
                         connection-type,params;protocol-name,params
                         e.g. pipe,name={some name};urp
                           or socket,host=localhost,port=54321;urp
   --unaccept={connect-string}
                       Closes an acceptor that was created with --accept. Use
                       --unaccept=all to close all acceptors.
   --language={lang}   Uses specified language, if language is not selected
                       yet for UI. The lang is a tag of the language in IETF
                       language tag.

Developer arguments:
   --terminate_after_init
                       Exit after initialization complete (no documents loaded)
   --eventtesting      Exit after loading documents.

New document creation arguments:
The arguments create an empty document of specified kind. Only one of them may
be used in one command line. If filenames are specified after an argument,
then it tries to open those files in the specified component.
   --writer            Creates an empty Writer document.
   --calc              Creates an empty Calc document.
   --draw              Creates an empty Draw document.
   --impress           Creates an empty Impress document.
   --base              Creates a new database.
   --global            Creates an empty Writer master (global) document.
   --math              Creates an empty Math document (formula).
   --web               Creates an empty HTML document.

File open arguments:
The arguments define how following filenames are treated. New treatment begins
after the argument and ends at the next argument. The default treatment is to
open documents for editing, and create new documents from document templates.
   -n                  Treats following files as templates for creation of new
                       documents.
   -o                  Opens following files for editing, regardless whether
                       they are templates or not.
   --pt {Printername}  Prints following files to the printer {Printername},
                       after which those files are closed. The splash screen
                       does not appear. If used multiple times, only last
                       {Printername} is effective for all documents of all
                       --pt runs. Also, --printer-name argument of
                       --print-to-file switch interferes with {Printername}.
   -p                  Prints following files to the default printer, after
                       which those files are closed. The splash screen does
                       not appear. If the file name contains spaces, then it
                       must be enclosed in quotation marks.
   --view              Opens following files in viewer mode (read-only).
   --show              Opens and starts the following presentation documents
                       of each immediately. Files are closed after the showing.
                       Files other than Impress documents are opened in
                       default mode , regardless of previous mode.
   --convert-to OutputFileExtension[:OutputFilterName] \
     [--outdir output_dir] [--convert-images-to]
                       Batch convert files (implies --headless). If --outdir
                       isn't specified, then current working directory is used
                       as output_dir. If --convert-images-to is given, its
                       parameter is taken as the target filter format for *all*
                       images written to the output format. If --convert-to is
                       used more than once, the last value of
                       OutputFileExtension[:OutputFilterName] is effective. If
                       --outdir is used more than once, only its last value is
                       effective. For example:
                   --convert-to pdf *.odt
                   --convert-to epub *.doc
                   --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc
                   --convert-to "html:XHTML Writer File:UTF8" \
                                --convert-images-to "jpg" *.doc
                   --convert-to "txt:Text (encoded):UTF8" *.doc
   --print-to-file [--printer-name printer_name] [--outdir output_dir]
                       Batch print files to file. If --outdir is not specified,
                       then current working directory is used as output_dir.
                       If --printer-name or --outdir used multiple times, only
                       last value of each is effective. Also, {Printername} of
                       --pt switch interferes with --printer-name.
   --cat               Dump text content of the following files to console
                       (implies --headless). Cannot be used with --convert-to.
   --script-cat        Dump text content of any scripts embedded in the files
                       to console (implies --headless). Cannot be used with
                       --convert-to.
   -env:[=
```
Reference
Java: converting Office documents (.doc/.docx, .ppt/.pptx) to PDF with LibreOffice/OpenOffice, then to images, to implement online preview | 安迪陈技术日志 (Andy Chen's tech log)