Chinese text renders as square boxes when converting PDF to images with PDFBox: a simple source patch that fixes it

Reference article: "Troubleshooting: garbled Chinese in the STSong-Light font when converting PDF to image with pdfbox"

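The PDFBox version is 2.0.
If the log contains a line like this one (example: Using fallback XXX for CID-keyed font STSong-Light), the system does not have the STSong-Light font installed and PDFBox is using font XXX in its place. If square boxes appear instead of characters, the font is missing and no substitute font was found either; in that case the log contains other related messages as well.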
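To find out quickly which fonts are being replaced, the application log can be scanned for that message. The following is a minimal sketch using only the Java standard library. The class name FallbackLogScanner and the regular expression are assumptions for illustration; the pattern is modelled on the example message quoted above, and the exact wording may differ between PDFBox versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
public class FallbackLogScanner {
    // Assumed message shape, copied from the example log line;
    // adjust it if your PDFBox version words the message differently.
    private static final Pattern FALLBACK =
            Pattern.compile("Using fallback (\\S+) for CID-keyed font (\\S+)");

    /** Returns one "requestedFont -> fallbackFont" entry per matching log line. */
    public static List<String> findFallbacks(List<String> logLines) {
        List<String> result = new ArrayList<String>();
        for (String line : logLines) {
            Matcher m = FALLBACK.matcher(line);
            if (m.find()) {
                // group(2) is the font the PDF asked for, group(1) the replacement
                result.add(m.group(2) + " -> " + m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<String>();
        sample.add("WARN Using fallback XXX for CID-keyed font STSong-Light");
        sample.add("INFO rendering page 1");
        for (String entry : findFallbacks(sample)) {
            System.out.println(entry); // prints "STSong-Light -> XXX"
        }
    }
}
```

Any font name that shows up on the left-hand side is a candidate for an extra entry in the substitutes map.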

The normal fix is to install the missing STSong-Light font, but all I could find online was the STSong font, and installing it had no effect. I am on Windows 10.

Following the article above (please do read it), modify FontMapperImpl and add a font mapping STSong-Light -> STFangsong to substitutes.
Download the PDFBox source from Apache and edit \pdfbox\src\main\java\org\apache\pdfbox\pdmodel\font\FontMapperImpl.java

final class FontMapperImpl implements FontMapper
{
    private static final FontCache fontCache = new FontCache(); // todo: static cache isn't ideal
    private FontProvider fontProvider;
    private Map<String, FontInfo> fontInfoByName;
    private final TrueTypeFont lastResortFont;

    /** Map of PostScript name substitutes, in priority order. */
    private final Map<String, List<String>> substitutes = new HashMap<String, List<String>>();

    FontMapperImpl()
    {
        // substitutes for standard 14 fonts
        substitutes.put("Courier",
                Arrays.asList("CourierNew", "CourierNewPSMT", "LiberationMono", "NimbusMonL-Regu"));
        substitutes.put("Courier-Bold",
                Arrays.asList("CourierNewPS-BoldMT", "CourierNew-Bold", "LiberationMono-Bold",
                        "NimbusMonL-Bold"));
        substitutes.put("Courier-Oblique",
                Arrays.asList("CourierNewPS-ItalicMT","CourierNew-Italic",
                        "LiberationMono-Italic", "NimbusMonL-ReguObli"));
        substitutes.put("Courier-BoldOblique",
                Arrays.asList("CourierNewPS-BoldItalicMT","CourierNew-BoldItalic",
                        "LiberationMono-BoldItalic", "NimbusMonL-BoldObli"));
        substitutes.put("Helvetica",
                Arrays.asList("ArialMT", "Arial", "LiberationSans", "NimbusSanL-Regu"));
        substitutes.put("Helvetica-Bold",
                Arrays.asList("Arial-BoldMT", "Arial-Bold", "LiberationSans-Bold",
                        "NimbusSanL-Bold"));
        substitutes.put("Helvetica-Oblique",
                Arrays.asList("Arial-ItalicMT", "Arial-Italic", "Helvetica-Italic",
                        "LiberationSans-Italic", "NimbusSanL-ReguItal"));
        substitutes.put("Helvetica-BoldOblique",
                Arrays.asList("Arial-BoldItalicMT", "Helvetica-BoldItalic",
                        "LiberationSans-BoldItalic", "NimbusSanL-BoldItal"));
        substitutes.put("Times-Roman",
                Arrays.asList("TimesNewRomanPSMT", "TimesNewRoman", "TimesNewRomanPS",
                        "LiberationSerif", "NimbusRomNo9L-Regu"));
        substitutes.put("Times-Bold",
                Arrays.asList("TimesNewRomanPS-BoldMT", "TimesNewRomanPS-Bold",
                        "TimesNewRoman-Bold", "LiberationSerif-Bold",
                        "NimbusRomNo9L-Medi"));
        substitutes.put("Times-Italic",
                Arrays.asList("TimesNewRomanPS-ItalicMT", "TimesNewRomanPS-Italic",
                        "TimesNewRoman-Italic", "LiberationSerif-Italic",
                        "NimbusRomNo9L-ReguItal"));
        substitutes.put("Times-BoldItalic",
                Arrays.asList("TimesNewRomanPS-BoldItalicMT", "TimesNewRomanPS-BoldItalic",
                        "TimesNewRoman-BoldItalic", "LiberationSerif-BoldItalic",
                        "NimbusRomNo9L-MediItal"));
        substitutes.put("Symbol", Arrays.asList("Symbol", "SymbolMT", "StandardSymL"));
        substitutes.put("ZapfDingbats", Arrays.asList("ZapfDingbatsITC", "Dingbats", "MS-Gothic"));
        // FIXME believelelf STSong-Light->STFangsong, SimSun(宋体)
        substitutes.put("STSong-Light", Arrays.asList("STFangsong", "SIMFANG", "SimSun"));
        substitutes.put("STSong-Light-UniGB-UCS2-H", Arrays.asList("SimSun"));
        // Acrobat also uses alternative names for Standard 14 fonts, which we map to those above
        // these include names such as "Arial" and "TimesNewRoman"
...
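To make the effect of the two added entries concrete, here is a simplified, self-contained model of a priority-ordered substitute lookup. It is only a sketch: the class SubstituteLookupSketch, the resolve method and the "installed" set are hypothetical, and the real FontMapperImpl does more than this (it works with a font provider and a last-resort font, as the fields above suggest). The map entries are the ones added in the patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the lookup idea; not PDFBox code.
public class SubstituteLookupSketch {
    /** PostScript name substitutes, in priority order (entries from the patch). */
    private final Map<String, List<String>> substitutes =
            new HashMap<String, List<String>>();

    public SubstituteLookupSketch() {
        substitutes.put("STSong-Light", Arrays.asList("STFangsong", "SIMFANG", "SimSun"));
        substitutes.put("STSong-Light-UniGB-UCS2-H", Arrays.asList("SimSun"));
    }

    /**
     * Returns the requested font if it is installed, otherwise the first
     * installed substitute, otherwise null (no usable font found).
     */
    public String resolve(String requested, Set<String> installed) {
        if (installed.contains(requested)) {
            return requested;
        }
        List<String> candidates = substitutes.get(requested);
        if (candidates != null) {
            for (String candidate : candidates) {
                if (installed.contains(candidate)) {
                    return candidate;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed set of installed PostScript names, for illustration only.
        Set<String> installed = new HashSet<String>(Arrays.asList("SimSun", "ArialMT"));
        SubstituteLookupSketch sketch = new SubstituteLookupSketch();
        System.out.println(sketch.resolve("STSong-Light", installed)); // prints "SimSun"
    }
}
```

The order of the list matters: STFangsong is tried first, and SimSun is only used when neither STFangsong nor SIMFANG is available.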

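Build and package with Maven following the instructions on the official site. I used JDK 7.
With the modified jar, the garbled-text problem was solved.
Baidu Netdisk
Extraction code: m4eo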

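Some notes on downloading the source from the official site:

The path I checked out with svn is http://svn.apache.org/repos/asf/pdfbox/tags/2.0.0/
Newer versions have to be compiled with JDK 1.8. My project runs on 1.7, so I downloaded version 2.0.0.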

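Notes on building and packaging with Maven

Following the instructions, run mvn clean install in the .\pdfbox folder.
This also runs the tests, which failed on my machine, so I skipped them with mvn clean install -DskipTests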

I am only a beginner, so if you have a better solution or idea, please leave a comment.
I hope this article helps. Thanks.
