背景: 系统支持excel的两种文件上传 xlsx 和 xls
问题:今天上传了一个 xlsx的文档 ,系统错误为
org.apache.poi.openxml4j.exceptions.OLE2NotOfficeXmlFileException: The supplied data appears to be in the OLE2 Format.
You are calling the part of POI that deals with OOXML (Office Open XML) Documents. You need to call a different part of POI
to process this data (eg HSSF instead of XSSF)
at org.apache.poi.openxml4j.opc.internal.ZipHelper.verifyZipHeader(ZipHelper.java:147) ~[poi-ooxml-4.1.0.jar!/:4.1.0]
at org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipStream(ZipHelper.java:175) ~[poi-ooxml-4.1.0.jar!/:4.1.0]
at org.apache.poi.openxml4j.opc.ZipPackage.(ZipPackage.java:104) ~[poi-ooxml-4.1.0.jar!/:4.1.0]
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:301) ~[poi-ooxml-4.1.0.jar!/:4.1.0]
at org.apache.poi.ooxml.util.PackageHelper.open(PackageHelper.java:37) ~[poi-ooxml-4.1.0.jar!/:4.1.0]
at org.apache.poi.xssf.usermodel.XSSFWorkbook.(XSSFWorkbook.java:303) ~[poi-ooxml-4.1.0.jar!/:4.1.0
正常系统逻辑处理为:
private Workbook getWorkbook(String filePath, InputStream is) throws IOException {
String fileExt = this.fileExt(filePath);
return "xlsx".equals(fileExt) ? new XSSFWorkbook(is) : "xls".equals(fileExt) ? new HSSFWorkbook(is) : null;
}
这是线上一个问题,客户说上传错误 但是不肯将文件发我们,通过分析错误原因 看到,是后缀名称和解析方式出了问题,
也就是手动修改文件名后不匹配导致的。
找到原因就好说了,尝试catch里面处理异常
if( "xlsx".equals(fileExt)){
try {
workbook = new XSSFWorkbook(is);
}catch (Exception e){
log.error("我草,xlsx是手动改的后缀 而不是通过另存为的方式");
workbook = new HSSFWorkbook(is);
}
} else if ("xls".equals(fileExt) ){
try {
workbook = new HSSFWorkbook(is);
}catch (Exception e){
log.error("我草,xls是手动改的后缀 而不是通过另存为的方式");
workbook = new XSSFWorkbook(is);
}
}
尝试仍然不好用,其实这种事指标不治本,文件的本身字节码并没有改变,但是通过改变workboor获取方式,仍是存在问题的。
再次优化一下,使用工厂模式创建Workbook:
// 通过文件流工厂方式来处理
Workbook workbook = WorkbookFactory.create(is);
看一下POI的源码:
public static Workbook create(InputStream inp, String password) throws IOException, InvalidFormatException, EncryptedDocumentException {
InputStream is = FileMagic.prepareToCheckMagic(inp);
FileMagic fm = FileMagic.valueOf(is);
switch(fm) {
case OLE2:
NPOIFSFileSystem fs = new NPOIFSFileSystem(is);
return create(fs, password);
case OOXML:
return new XSSFWorkbook(OPCPackage.open(is));
default:
throw new InvalidFormatException("Your InputStream was neither an OLE2 stream, nor an OOXML stream");
}
}
通过debug调试 可得出
xlsx 对应的FileMagic为 OOXML
xls 对应的FileMagic为 OLE2
FileMagic 类的代码可参考如下:
public enum FileMagic {
/** OLE2 / BIFF8+ stream used for Office 97 and higher documents */
OLE2(HeaderBlockConstants._signature),
/** OOXML / ZIP stream */
OOXML(OOXML_FILE_HEADER),
/** XML file */
XML(RAW_XML_FILE_HEADER),
/** BIFF2 raw stream - for Excel 2 */
BIFF2(new byte[]{
0x09, 0x00, // sid=0x0009
0x04, 0x00, // size=0x0004
0x00, 0x00, // unused
0x70, 0x00 // 0x70 = multiple values
}),
/** BIFF3 raw stream - for Excel 3 */
BIFF3(new byte[]{
// 省略
}
OLE2类型的orkbook实现方式:
里面有个password 默认为 null ,但是我也没有找到 何时调用password不为空的时候 估计是预留的吧,有兴趣的可以自己看看
// 第一步先转化为POIFSFileSystem
POIFSFileSystem fs = new POIFSFileSystem(is);
// 第二步调用this.create(fs, password);
public static Workbook create(final DirectoryNode root, String password) throws IOException {
// Encrypted OOXML files go inside OLE2 containers, is this one?
if (root.hasEntry(Decryptor.DEFAULT_POIFS_ENTRY)) {
InputStream stream = null;
try {
stream = DocumentFactoryHelper.getDecryptedStream(root, password);
return createXSSFWorkbook(stream);
} finally {
IOUtils.closeQuietly(stream);
// as we processed the full stream already, we can close the filesystem here
// otherwise file handles are leaked
root.getFileSystem().close();
}
}
// If we get here, it isn't an encrypted PPTX file
// So, treat it as a regular HSLF PPT one
boolean passwordSet = false;
if (password != null) {
Biff8EncryptionKey.setCurrentUserPassword(password);
passwordSet = true;
}
try {
return createHSSFWorkbook(root);
} finally {
if (passwordSet) {
Biff8EncryptionKey.setCurrentUserPassword(null);
}
}
}
OOXML类型的orkbook实现方式:
private static Workbook createWorkbook(String factoryClass, Object[] args) throws IOException, EncryptedDocumentException {
try {
Class> clazz = WorkbookFactory.class.getClassLoader().loadClass(factoryClass);
Class>[] argsClz = new Class>[args.length];
int i=0;
for (Object o : args) {
Class> c = o.getClass();
if (Boolean.class.isAssignableFrom(c)) {
c = boolean.class;
} else if (InputStream.class.isAssignableFrom(c)) {
c = InputStream.class;
} else if (File.class.isAssignableFrom(c)) {
c = File.class;
}
argsClz[i++] = c;
}
Method m = clazz.getMethod("createWorkbook", argsClz);
return (Workbook)m.invoke(null, args);
} catch (InvocationTargetException e) {
Throwable t = e.getCause();
if (t instanceof IOException) {
throw (IOException)t;
} else if (t instanceof EncryptedDocumentException) {
throw (EncryptedDocumentException)t;
} else if (t instanceof OldFileFormatException) {
throw (OldFileFormatException)t;
} else if (t instanceof RuntimeException) {
throw (RuntimeException)t;
} else {
throw new IOException(t.getMessage(), t);
}
} catch (Exception e) {
throw new IOException(e);
}
}