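I. Overview
I recently worked on a requirement that involved previewing PPT files. There are plenty of write-ups on this online, and I tried several of them, but most of what a Baidu search turns up starts strong and then trails off. The content is not made up, but it is not accurate or rigorous enough. With that off my chest, here is the overall approach. There are two common ways to implement a preview: (1) convert the PPT to Flash and play the Flash animation in the front end; (2) convert the PPT to images and show them in a front-end carousel. I chose the second, using Apache POI to convert each slide to a PNG image.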
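For the image-based approach, the front end needs the slide images in page order. A minimal sketch, assuming the images are named 1.png, 2.png, and so on (the naming used by the conversion code in section IV). The class and method names here are hypothetical and not part of the original code. Sorting by the numeric part matters because a plain string sort would place 10.png before 2.png.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SlideImageOrder {

    /** Returns the page number encoded in a name such as "12.png", or -1 if the name does not match. */
    public static int pageNumber(String fileName) {
        if (!fileName.matches("\\d+\\.png")) {
            return -1;
        }
        try {
            return Integer.parseInt(fileName.substring(0, fileName.length() - 4));
        } catch (NumberFormatException e) {
            return -1; // too many digits to be a real page number
        }
    }

    /** Keeps only names of the form N.png (N >= 1) and sorts them by N. */
    public static List<String> inPageOrder(List<String> fileNames) {
        List<String> pages = new ArrayList<>();
        for (String name : fileNames) {
            if (pageNumber(name) > 0) {
                pages.add(name);
            }
        }
        pages.sort(Comparator.comparingInt(SlideImageOrder::pageNumber));
        return pages;
    }

    public static void main(String[] args) {
        // prints [1.png, 2.png, 10.png]
        System.out.println(inPageOrder(Arrays.asList("10.png", "2.png", "1.png", "notes.txt")));
    }
}
```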
II. Runtime Environment
- JDK 1.8
- Linux/Windows
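III. Maven Dependencies
I used Apache POI 3.16. The specific artifacts are:
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.16</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.16</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.16</version>
</dependency>
poi is the core jar, poi-ooxml handles the .pptx format, and poi-scratchpad handles the .ppt format.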
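Because the two formats go through different POI modules, the caller has to decide which converter to use for an uploaded file. File extensions can be wrong, so one option is to look at the first bytes of the stream. A minimal sketch using only the JDK: the class and method names are hypothetical, and the check only identifies the container type (legacy OLE2 versus ZIP), not that the file is really a presentation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class PptFormatSniffer {

    public enum Container { OLE2, ZIP, UNKNOWN }

    // Signature of OLE2 compound documents (.ppt, but also .doc and .xls).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // Local file header of ZIP archives (.pptx, but also .docx, .xlsx and plain zips).
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Classifies a file by its leading bytes. */
    public static Container sniffHeader(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(head, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    /** Peeks at the first 8 bytes and rewinds, so the stream can still be handed to POI. */
    public static Container sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("wrap the stream in a BufferedInputStream first");
        }
        byte[] head = new byte[8];
        in.mark(head.length);
        int n = 0;
        try {
            while (n < head.length) {
                int read = in.read(head, n, head.length - n);
                if (read < 0) {
                    break;
                }
                n += read;
            }
        } finally {
            in.reset();
        }
        return sniffHeader(Arrays.copyOf(head, n));
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this, an OLE2 result would be routed to the .ppt converter and a ZIP result to the .pptx converter, with UNKNOWN rejected up front.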
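IV. Core Code
English text and images in a PPT need no special handling. The part to watch is Chinese text: if Chinese and other special characters are not handled, the generated images are very likely to contain garbled text. The core code for dealing with this is below.
1. Rendering a .pptx file that contains Chinese to images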
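public void converPPTXtoImage(InputStream pptFileIn, String targetDir) {
    try (XMLSlideShow oneSlideShow = new XMLSlideShow(pptFileIn)) {
        // DrawingML fragment used to override the Latin typeface of each text run.
        // ASSUMPTION: the markup inside this string was lost when the post was published;
        // it is restored here from the commonly circulated form of this snippet. Verify before use.
        String xmlFontFormat = "<xml-fragment xmlns:a=\"http://schemas.openxmlformats.org/drawingml/2006/main\""
                + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\""
                + " xmlns:p=\"http://schemas.openxmlformats.org/presentationml/2006/main\">"
                + "<a:rPr lang=\"zh-CN\" altLang=\"en-US\" dirty=\"0\" smtClean=\"0\">"
                + "<a:latin typeface=\"+mj-ea\"/>"
                + "</a:rPr>"
                + "</xml-fragment>";
        Dimension onePPTPageSize = oneSlideShow.getPageSize();
        List<XSLFSlide> pptPageXSLFSLiseList = oneSlideShow.getSlides();
        for (int i = 0; i < pptPageXSLFSLiseList.size(); i++) {
            // Set the font to fix garbled Chinese text
            CTGroupShape oneCTGroupShape = pptPageXSLFSLiseList.get(i).getXmlObject().getCSld().getSpTree();
            for (CTShape ctShape : oneCTGroupShape.getSpList()) {
                CTTextBody oneCTTextBody = ctShape.getTxBody();
                if (null == oneCTTextBody) {
                    continue;
                }
                CTTextParagraph[] oneCTTextParagraph = oneCTTextBody.getPArray();
                CTTextFont oneCTTextFont = null;
                try {
                    oneCTTextFont = CTTextFont.Factory.parse(xmlFontFormat);
                } catch (XmlException e) {
                    // logger is the enclosing class's logger field, as used in converPPTtoImage
                    logger.warn("could not parse font fragment", e);
                }
                if (oneCTTextFont == null) {
                    continue;
                }
                for (CTTextParagraph ctTextParagraph : oneCTTextParagraph) {
                    CTRegularTextRun[] onrCTRegularTextRunArray = ctTextParagraph.getRArray();
                    for (CTRegularTextRun ctRegularTextRun : onrCTRegularTextRunArray) {
                        CTTextCharacterProperties oneCTTextCharacterProperties = ctRegularTextRun.getRPr();
                        if (oneCTTextCharacterProperties == null) {
                            // A run without explicit properties would otherwise cause a NullPointerException
                            oneCTTextCharacterProperties = ctRegularTextRun.addNewRPr();
                        }
                        oneCTTextCharacterProperties.setLatin(oneCTTextFont);
                    }
                }
            }
            for (XSLFShape shape : pptPageXSLFSLiseList.get(i).getShapes()) {
                if (shape instanceof XSLFTextShape) {
                    XSLFTextShape txtshape = (XSLFTextShape) shape;
                    for (XSLFTextParagraph textPara : txtshape.getTextParagraphs()) {
                        List<XSLFTextRun> textRunList = textPara.getTextRuns();
                        for (XSLFTextRun textRun : textRunList) {
                            textRun.setFontFamily("simsun");
                        }
                    }
                }
            }
            // Render the slide into an in-memory image
            BufferedImage oneBufferedImage = new BufferedImage(onePPTPageSize.width, onePPTPageSize.height, BufferedImage.TYPE_INT_RGB);
            Graphics2D oneGraphics2D = oneBufferedImage.createGraphics();
            try {
                pptPageXSLFSLiseList.get(i).draw(oneGraphics2D);
            } finally {
                oneGraphics2D.dispose();
            }
            // Pages are written as 1.png, 2.png, ...; File(dir, name) works with or without a trailing separator
            String imgName = (i + 1) + ".png";
            try (OutputStream imageOut = new FileOutputStream(new File(targetDir, imgName))) {
                ImageIO.write(oneBufferedImage, "png", imageOut);
            }
        }
    } catch (Exception e) {
        logger.error("converPPTXtoImage error", e);
    }
}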
The part that actually fixes the garbled text is:
// Set the font to fix garbled Chinese text
CTGroupShape oneCTGroupShape = pptPageXSLFSLiseList.get(i).getXmlObject().getCSld().getSpTree();
for (CTShape ctShape : oneCTGroupShape.getSpList()) {
    CTTextBody oneCTTextBody = ctShape.getTxBody();
    if (null == oneCTTextBody) {
        continue;
    }
    CTTextParagraph[] oneCTTextParagraph = oneCTTextBody.getPArray();
    CTTextFont oneCTTextFont = null;
    try {
        oneCTTextFont = CTTextFont.Factory.parse(xmlFontFormat);
    } catch (XmlException e) {
        logger.warn("could not parse font fragment", e);
    }
    if (oneCTTextFont == null) {
        continue;
    }
    for (CTTextParagraph ctTextParagraph : oneCTTextParagraph) {
        CTRegularTextRun[] onrCTRegularTextRunArray = ctTextParagraph.getRArray();
        for (CTRegularTextRun ctRegularTextRun : onrCTRegularTextRunArray) {
            CTTextCharacterProperties oneCTTextCharacterProperties = ctRegularTextRun.getRPr();
            if (oneCTTextCharacterProperties == null) {
                // A run without explicit properties would otherwise cause a NullPointerException
                oneCTTextCharacterProperties = ctRegularTextRun.addNewRPr();
            }
            oneCTTextCharacterProperties.setLatin(oneCTTextFont);
        }
    }
}
for (XSLFShape shape : pptPageXSLFSLiseList.get(i).getShapes()) {
    if (shape instanceof XSLFTextShape) {
        XSLFTextShape txtshape = (XSLFTextShape) shape;
        for (XSLFTextParagraph textPara : txtshape.getTextParagraphs()) {
            List<XSLFTextRun> textRunList = textPara.getTextRuns();
            for (XSLFTextRun textRun : textRunList) {
                textRun.setFontFamily("simsun");
            }
        }
    }
}
2. Rendering a .ppt file that contains Chinese to images
public void converPPTtoImage(InputStream pptStream, String targetImageFileDir) {
    try (HSLFSlideShow oneSlideShow = new HSLFSlideShow(pptStream)) {
        List<HSLFSlide> pptPageXSLFSLiseList = oneSlideShow.getSlides();
        for (int i = 0; i < pptPageXSLFSLiseList.size(); i++) {
            // Set the font to fix garbled Chinese text
            for (List<HSLFTextParagraph> list : pptPageXSLFSLiseList.get(i).getTextParagraphs()) {
                for (HSLFTextParagraph hslfTextParagraph : list) {
                    for (HSLFTextRun textRun : hslfTextParagraph.getTextRuns()) {
                        // getFontSize() returns a boxed Double, so guard against null before comparing
                        Double size = textRun.getFontSize();
                        if (size == null || size <= 0 || size >= 26040) {
                            textRun.setFontSize(20.0);
                        }
                        textRun.setFontFamily("simsun");
                    }
                }
            }
            String imgName = (i + 1) + ".png";
            BufferedImage oneBufferedImage = new BufferedImage(oneSlideShow.getPageSize().width, oneSlideShow.getPageSize().height, BufferedImage.TYPE_INT_RGB);
            Graphics2D oneGraphics2D = oneBufferedImage.createGraphics();
            try {
                pptPageXSLFSLiseList.get(i).draw(oneGraphics2D);
            } finally {
                oneGraphics2D.dispose();
            }
            // File(dir, name) works with or without a trailing separator on the directory
            try (OutputStream imageOut = new FileOutputStream(new File(targetImageFileDir, imgName))) {
                ImageIO.write(oneBufferedImage, "png", imageOut);
            }
        }
    } catch (Exception e) {
        logger.error("converPPTtoImage error", e);
    }
}
The part that actually fixes the garbled text is:
for (List<HSLFTextParagraph> list : pptPageXSLFSLiseList.get(i).getTextParagraphs()) {
    for (HSLFTextParagraph hslfTextParagraph : list) {
        for (HSLFTextRun textRun : hslfTextParagraph.getTextRuns()) {
            // getFontSize() returns a boxed Double, so guard against null before comparing
            Double size = textRun.getFontSize();
            if (size == null || size <= 0 || size >= 26040) {
                textRun.setFontSize(20.0);
            }
            textRun.setFontFamily("simsun");
        }
    }
}
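The font-size check in the .ppt code can be pulled out into a small helper so it is easy to test on its own. A minimal sketch: the class, method, and constant names are hypothetical, the 26040 threshold and the fallback of 20 are taken as-is from the code above (the original does not explain where 26040 comes from), and treating a missing size as invalid is my own assumption.

```java
public class FontSizeGuard {

    // Fallback used by the article's code when a run reports an unusable size.
    public static final double FALLBACK_SIZE = 20.0;
    // Upper bound copied from the article's code; its origin is not explained there.
    public static final double MAX_PLAUSIBLE_SIZE = 26040;

    /** Returns the size unchanged if it looks usable, otherwise the fallback. */
    public static double sanitize(Double size) {
        if (size == null || size.isNaN() || size <= 0 || size >= MAX_PLAUSIBLE_SIZE) {
            return FALLBACK_SIZE;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(sanitize(18.0));  // prints 18.0
        System.out.println(sanitize(null));  // prints 20.0
        System.out.println(sanitize(-1.0));  // prints 20.0
    }
}
```

Inside the loop, the call would then read textRun.setFontSize(FontSizeGuard.sanitize(textRun.getFontSize())).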
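Both methods above contain this line:
textRun.setFontFamily("simsun");
This sets the font to SimSun (宋体). If the environment your program runs in already has SimSun installed, you can refer to it by its Chinese name instead:
textRun.setFontFamily("宋体");
It does not have to be SimSun. Any font that can render Chinese characters will do. The fonts bundled with the JVM cannot render Chinese.
I developed on Windows, which ships with SimSun, so I initially used textRun.setFontFamily("宋体"), and in my own tests the images generated from an uploaded .pptx file showed Chinese correctly. After deploying to the test environment, however, .pptx files containing Chinese came out garbled. The test environment runs Linux, which does not ship SimSun or any other font that renders Chinese.
There are two fixes. One is to add SimSun to the JVM's font directory in the test environment and restart the Java application, after which textRun.setFontFamily("宋体") takes effect. The other is to register the SimSun font file with the JVM when the application starts. In that case the name passed to textRun has to match the font you registered. I used simsun, the same as the file name. As far as I can tell this works because the family name of simsun.ttf is SimSun and Java looks font names up case-insensitively; calling getFamily() on the registered Font shows the exact name. The registration code is: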
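try (InputStream fontFile = Application.class.getClassLoader().getResourceAsStream("static/fonts/simsun.ttf")) {
    if (fontFile == null) {
        throw new FileNotFoundException("static/fonts/simsun.ttf is not on the classpath");
    }
    Font dynamicFont = Font.createFont(Font.TRUETYPE_FONT, fontFile);
    // registerFont returns false when the font cannot be registered
    if (!GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont(dynamicFont)) {
        logger.warn("simsun.ttf was not registered");
    }
} catch (Exception e) {
    logger.error("Failed to register simsun.ttf", e);
}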
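This code reads simsun.ttf, creates a font from it, and registers it with the GraphicsEnvironment. It can run during application startup. I use Spring, so I implemented the InitializingBean interface and put the code in its afterPropertiesSet method. The complete helper class is: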
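import java.awt.Dimension;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.GraphicsEnvironment;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.poi.hslf.usermodel.HSLFSlide;
import org.apache.poi.hslf.usermodel.HSLFSlideShow;
import org.apache.poi.hslf.usermodel.HSLFTextParagraph;
import org.apache.poi.hslf.usermodel.HSLFTextRun;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextParagraph;
import org.apache.poi.xslf.usermodel.XSLFTextRun;
import org.apache.poi.xslf.usermodel.XSLFTextShape;
import org.apache.xmlbeans.XmlException;
import org.openxmlformats.schemas.drawingml.x2006.main.CTRegularTextRun;
import org.openxmlformats.schemas.drawingml.x2006.main.CTTextBody;
import org.openxmlformats.schemas.drawingml.x2006.main.CTTextCharacterProperties;
import org.openxmlformats.schemas.drawingml.x2006.main.CTTextFont;
import org.openxmlformats.schemas.drawingml.x2006.main.CTTextParagraph;
import org.openxmlformats.schemas.presentationml.x2006.main.CTGroupShape;
import org.openxmlformats.schemas.presentationml.x2006.main.CTShape;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.InitializingBean;

public class POIPowerPointHelper implements InitializingBean {
    // ASSUMPTION: SLF4J is used for logging; swap in your project's logger if it differs
    private static final Logger logger = LoggerFactory.getLogger(POIPowerPointHelper.class);
    private static GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();

    @Override
    public void afterPropertiesSet() throws Exception {
        // Register SimSun once at startup so that setFontFamily("simsun") can resolve it
        try (InputStream fontFile = POIPowerPointHelper.class.getClassLoader().getResourceAsStream("fonts/simsun.ttf")) {
            if (fontFile == null) {
                throw new FileNotFoundException("fonts/simsun.ttf is not on the classpath");
            }
            Font dynamicFont = Font.createFont(Font.TRUETYPE_FONT, fontFile);
            if (!ge.registerFont(dynamicFont)) {
                logger.warn("simsun.ttf was not registered");
            }
        } catch (Exception e) {
            logger.error("Failed to register simsun.ttf", e);
        }
    }

    public void converPPTXtoImage(InputStream pptFileIn, String targetDir) {
        try (XMLSlideShow oneSlideShow = new XMLSlideShow(pptFileIn)) {
            // DrawingML fragment used to override the Latin typeface of each text run.
            // ASSUMPTION: the markup inside this string was lost when the post was published;
            // it is restored here from the commonly circulated form of this snippet. Verify before use.
            String xmlFontFormat = "<xml-fragment xmlns:a=\"http://schemas.openxmlformats.org/drawingml/2006/main\""
                    + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\""
                    + " xmlns:p=\"http://schemas.openxmlformats.org/presentationml/2006/main\">"
                    + "<a:rPr lang=\"zh-CN\" altLang=\"en-US\" dirty=\"0\" smtClean=\"0\">"
                    + "<a:latin typeface=\"+mj-ea\"/>"
                    + "</a:rPr>"
                    + "</xml-fragment>";
            Dimension onePPTPageSize = oneSlideShow.getPageSize();
            List<XSLFSlide> pptPageXSLFSLiseList = oneSlideShow.getSlides();
            for (int i = 0; i < pptPageXSLFSLiseList.size(); i++) {
                // Set the font to fix garbled Chinese text
                CTGroupShape oneCTGroupShape = pptPageXSLFSLiseList.get(i).getXmlObject().getCSld().getSpTree();
                for (CTShape ctShape : oneCTGroupShape.getSpList()) {
                    CTTextBody oneCTTextBody = ctShape.getTxBody();
                    if (null == oneCTTextBody) {
                        continue;
                    }
                    CTTextParagraph[] oneCTTextParagraph = oneCTTextBody.getPArray();
                    CTTextFont oneCTTextFont = null;
                    try {
                        oneCTTextFont = CTTextFont.Factory.parse(xmlFontFormat);
                    } catch (XmlException e) {
                        logger.warn("could not parse font fragment", e);
                    }
                    if (oneCTTextFont == null) {
                        continue;
                    }
                    for (CTTextParagraph ctTextParagraph : oneCTTextParagraph) {
                        CTRegularTextRun[] onrCTRegularTextRunArray = ctTextParagraph.getRArray();
                        for (CTRegularTextRun ctRegularTextRun : onrCTRegularTextRunArray) {
                            CTTextCharacterProperties oneCTTextCharacterProperties = ctRegularTextRun.getRPr();
                            if (oneCTTextCharacterProperties == null) {
                                // A run without explicit properties would otherwise cause a NullPointerException
                                oneCTTextCharacterProperties = ctRegularTextRun.addNewRPr();
                            }
                            oneCTTextCharacterProperties.setLatin(oneCTTextFont);
                        }
                    }
                }
                for (XSLFShape shape : pptPageXSLFSLiseList.get(i).getShapes()) {
                    if (shape instanceof XSLFTextShape) {
                        XSLFTextShape txtshape = (XSLFTextShape) shape;
                        for (XSLFTextParagraph textPara : txtshape.getTextParagraphs()) {
                            List<XSLFTextRun> textRunList = textPara.getTextRuns();
                            for (XSLFTextRun textRun : textRunList) {
                                textRun.setFontFamily("simsun");
                            }
                        }
                    }
                }
                // Render the slide into an in-memory image
                BufferedImage oneBufferedImage = new BufferedImage(onePPTPageSize.width, onePPTPageSize.height, BufferedImage.TYPE_INT_RGB);
                Graphics2D oneGraphics2D = oneBufferedImage.createGraphics();
                try {
                    pptPageXSLFSLiseList.get(i).draw(oneGraphics2D);
                } finally {
                    oneGraphics2D.dispose();
                }
                // Pages are written as 1.png, 2.png, ...; File(dir, name) works with or without a trailing separator
                String imgName = (i + 1) + ".png";
                try (OutputStream imageOut = new FileOutputStream(new File(targetDir, imgName))) {
                    ImageIO.write(oneBufferedImage, "png", imageOut);
                }
            }
        } catch (Exception e) {
            logger.error("converPPTXtoImage error", e);
        }
    }

    public void converPPTtoImage(InputStream pptStream, String targetImageFileDir) {
        try (HSLFSlideShow oneSlideShow = new HSLFSlideShow(pptStream)) {
            List<HSLFSlide> pptPageXSLFSLiseList = oneSlideShow.getSlides();
            for (int i = 0; i < pptPageXSLFSLiseList.size(); i++) {
                // Set the font to fix garbled Chinese text
                for (List<HSLFTextParagraph> list : pptPageXSLFSLiseList.get(i).getTextParagraphs()) {
                    for (HSLFTextParagraph hslfTextParagraph : list) {
                        for (HSLFTextRun textRun : hslfTextParagraph.getTextRuns()) {
                            // getFontSize() returns a boxed Double, so guard against null before comparing
                            Double size = textRun.getFontSize();
                            if (size == null || size <= 0 || size >= 26040) {
                                textRun.setFontSize(20.0);
                            }
                            textRun.setFontFamily("simsun");
                        }
                    }
                }
                String imgName = (i + 1) + ".png";
                BufferedImage oneBufferedImage = new BufferedImage(oneSlideShow.getPageSize().width, oneSlideShow.getPageSize().height, BufferedImage.TYPE_INT_RGB);
                Graphics2D oneGraphics2D = oneBufferedImage.createGraphics();
                try {
                    pptPageXSLFSLiseList.get(i).draw(oneGraphics2D);
                } finally {
                    oneGraphics2D.dispose();
                }
                try (OutputStream imageOut = new FileOutputStream(new File(targetImageFileDir, imgName))) {
                    ImageIO.write(oneBufferedImage, "png", imageOut);
                }
            }
        } catch (Exception e) {
            logger.error("converPPTtoImage error", e);
        }
    }
}
PS: The imports above assume POI 3.16, Spring, and SLF4J; adjust them to match your project. The code is fairly rough, so corrections are welcome. Reference: https://blog.csdn.net/yushuai_it/article/details/65445898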
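As a follow-up to the font registration step, it can help to have the registration return the name to pass to setFontFamily instead of hard-coding it. A minimal sketch using only the JDK: the class and method names are hypothetical, and the resource path is whatever your project uses. It returns an empty Optional when the font file is missing rather than failing silently, and the second helper picks the first preferred family that is actually installed, ignoring case.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.awt.GraphicsEnvironment;
import java.io.IOException;
import java.io.InputStream;
import java.util.Optional;

public class ClasspathFontRegistrar {

    /**
     * Registers a TrueType font found on the classpath and returns its family name,
     * or an empty Optional if the resource does not exist.
     */
    public static Optional<String> register(String resourcePath) {
        try (InputStream in = ClasspathFontRegistrar.class.getClassLoader().getResourceAsStream(resourcePath)) {
            if (in == null) {
                return Optional.empty();
            }
            Font font = Font.createFont(Font.TRUETYPE_FONT, in);
            // registerFont returns false if the font cannot be registered;
            // the family name is reported either way.
            GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont(font);
            // getFamily() is locale-dependent and may return a localized name.
            return Optional.of(font.getFamily());
        } catch (IOException | FontFormatException e) {
            throw new IllegalStateException("Could not load font " + resourcePath, e);
        }
    }

    /** Returns the first preferred family present in the available list, compared case-insensitively. */
    public static Optional<String> pick(String[] available, String... preferred) {
        for (String wanted : preferred) {
            for (String family : available) {
                if (family.equalsIgnoreCase(wanted)) {
                    return Optional.of(family);
                }
            }
        }
        return Optional.empty();
    }
}
```

At startup, register("fonts/simsun.ttf") could be tried first, falling back to pick(GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames(), "SimSun", "宋体") when the file is not bundled.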