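### 1. Introduction to POI
Apache POI provides Java APIs for working with file formats based on OOXML and OLE2. OLE2 is used by the XLS, DOC, and PPT formats; OOXML is the format introduced with Office 2007 and covers XLSX, DOCX, and PPTX. With POI, MS Office files can be read and written from Java.
Official site: https://poi.apache.org/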
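### 2. Writing Excel files
Apart from the low-level interfaces, POI offers three main APIs for working with Excel files: HSSF, XSSF, and SXSSF.
HSSF handles the xls format and XSSF the xlsx format. SXSSF is a low-memory API built on top of XSSF: it reduces memory usage by saving part of the data to disk, at the cost of some features such as moving cells and formula support.
The official documentation lists the feature differences between the three APIs.
Code example: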
@Test
public void hssfBasicUse() throws IOException, InterruptedException {
Workbook wb = new HSSFWorkbook();
// Workbook wb = new XSSFWorkbook();
// Workbook wb = new SXSSFWorkbook();
Sheet sh = wb.createSheet();
for(int rownum = 0; rownum < 100; rownum++){
Row row = sh.createRow(rownum);
for(int cellnum = 0; cellnum < 10; cellnum++){
Cell cell = row.createCell(cellnum);
cell.setCellValue("test");
}
}
FileOutputStream out = new FileOutputStream("D:\\desktop\\sxssftest\\hssf.xls");
// FileOutputStream out = new FileOutputStream("D:\\desktop\\sxssftest\\xssf.xlsx");
// FileOutputStream out = new FileOutputStream("D:\\desktop\\sxssftest\\sxssf.xlsx");
wb.write(out);
out.close();
}
More official examples: https://poi.apache.org/spreadsheet/quick-guide.html
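
As a rough guide to choosing between the three APIs, the sketch below suggests one from the target file extension and the expected row count. It uses only the JDK. The class `PoiApiChooser`, the method `choose`, and the 10,000-row cut-over to SXSSF are assumptions made for illustration, not something POI prescribes; the row limits are those of the file formats themselves.

```java
import java.util.Locale;

/** Hypothetical helper: suggests which POI spreadsheet API fits an output file. */
class PoiApiChooser {
    static final int XLS_MAX_ROWS = 65_536;        // row limit of the binary .xls format
    static final int XLSX_MAX_ROWS = 1_048_576;    // row limit of the .xlsx format
    static final int STREAMING_THRESHOLD = 10_000; // assumed cut-over point; tune for your heap

    static String choose(String fileName, int expectedRows) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".xls")) {
            if (expectedRows > XLS_MAX_ROWS) {
                throw new IllegalArgumentException("too many rows for .xls: " + expectedRows);
            }
            return "HSSF";
        }
        if (name.endsWith(".xlsx")) {
            if (expectedRows > XLSX_MAX_ROWS) {
                throw new IllegalArgumentException("too many rows for .xlsx: " + expectedRows);
            }
            // large sheets benefit from the streaming API, small ones keep full XSSF features
            return expectedRows > STREAMING_THRESHOLD ? "SXSSF" : "XSSF";
        }
        throw new IllegalArgumentException("unsupported extension: " + fileName);
    }

    public static void main(String[] args) {
        System.out.println(choose("report.xls", 100));
        System.out.println(choose("report.xlsx", 100));
        System.out.println(choose("big.xlsx", 90_000));
    }
}
```

The returned name maps to the workbook class to instantiate (`HSSFWorkbook`, `XSSFWorkbook`, or `SXSSFWorkbook`); the rest of the writing code can stay on the shared `Workbook` interface, as in the example above.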
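### 3. SXSSF
SXSSF is built on top of XSSF and reduces memory usage by saving part of the data to disk. A sliding window (of configurable size) limits how many rows are accessible in memory; rows that are no longer accessible are written to a temporary file on disk.
The official example is as follows: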
import java.io.FileOutputStream;
import junit.framework.Assert;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
public static void main(String[] args) throws Throwable {
SXSSFWorkbook wb = new SXSSFWorkbook(100); // keep 100 rows in memory; rows beyond that are flushed to disk
Sheet sh = wb.createSheet();
for(int rownum = 0; rownum < 1000; rownum++){
Row row = sh.createRow(rownum);
for(int cellnum = 0; cellnum < 10; cellnum++){
Cell cell = row.createCell(cellnum);
String address = new CellReference(cell).formatAsString();
cell.setCellValue(address);
}
}
// rows with rownum < 900 have been flushed to disk and are no longer accessible
for(int rownum = 0; rownum < 900; rownum++){
Assert.assertNull(sh.getRow(rownum));
}
// the last 100 rows are still in memory
for(int rownum = 900; rownum < 1000; rownum++){
Assert.assertNotNull(sh.getRow(rownum));
}
FileOutputStream out = new FileOutputStream("/temp/sxssf.xlsx");
wb.write(out);
out.close();
// delete the temporary files that back this workbook on disk
wb.dispose();
}
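
To make the window behaviour concrete without the POI jar, here is a toy model of a row access window. It is only a sketch: `WindowedSheet` and its methods are hypothetical names, the `flushed` list stands in for the temporary file, and none of this reflects how POI implements the window internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a row access window; NOT how POI is implemented internally. */
class WindowedSheet {
    private final int windowSize;
    private final TreeMap<Integer, String> inMemory = new TreeMap<>();
    private final List<String> flushed = new ArrayList<>(); // stands in for the temp file

    WindowedSheet(int windowSize) {
        this.windowSize = windowSize;
    }

    void createRow(int rowNum, String value) {
        inMemory.put(rowNum, value);
        // once the window is full, the row with the lowest index is flushed
        while (inMemory.size() > windowSize) {
            flushed.add(inMemory.pollFirstEntry().getValue());
        }
    }

    /** Returns null for rows that have already left the window. */
    String getRow(int rowNum) {
        return inMemory.get(rowNum);
    }

    int flushedCount() {
        return flushed.size();
    }

    public static void main(String[] args) {
        WindowedSheet sh = new WindowedSheet(100);
        for (int r = 0; r < 1000; r++) {
            sh.createRow(r, "row " + r);
        }
        System.out.println(sh.getRow(899));    // null: already flushed
        System.out.println(sh.getRow(900));    // row 900: still in the window
        System.out.println(sh.flushedCount()); // 900
    }
}
```

With a window of 100 and 1,000 rows, the model reproduces the two assertions of the official example: rows 0 to 899 are gone, rows 900 to 999 remain.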
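#### 3.1 SXSSF pitfalls
**Scenario 1:** The summary information sits at the very top of the sheet, but its content can only be determined after the whole sheet has been written. Code for this is often written as follows (it does not throw any exception):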
@Test
public void slideWinSizeTrap() throws IOException {
SXSSFWorkbook wb = new SXSSFWorkbook(100);// setting this to 50 changes the content of the generated Excel file
Sheet sh = wb.createSheet();
Cell summaryCell = null;
for(int rownum = 0; rownum < 100; rownum++){
Row row = sh.createRow(rownum);
for(int cellnum = 0; cellnum < 10; cellnum++){
Cell cell = row.createCell(cellnum);
if (rownum == 0 && cellnum == 0) {
summaryCell = cell;// keep a reference to cell (0,0) in summaryCell
}
cell.setCellValue("test");
}
}
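summaryCell.setCellValue("summary information");// set the cell content after all rows have been created
FileOutputStream out = new FileOutputStream("D:\\desktop\\sxssftest\\sxssf.xlsx");
wb.write(out);
out.close();
wb.dispose();
}
With the window size set to 50 and to 100, the cell contains `test` and `summary information` respectively. With a window of 50, row 0 has already been flushed to disk by the time the summary is set, so the late update never reaches the file.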
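
The effect can be reproduced with a small JDK-only model, assuming that a row is serialized at the moment it leaves the window. `LateWriteDemo`, its nested `Cell` class, and the `StringBuilder` standing in for the temporary file are all hypothetical; this is not POI code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model: a cell is serialized when its row leaves the window. */
class LateWriteDemo {
    static final class Cell {
        String value;
        Cell(String value) { this.value = value; }
    }

    /** Writes single-cell rows, updates the first cell at the end, returns the first line of the "file". */
    static String firstCellAfterLateUpdate(int windowSize, int rows) {
        Deque<Cell> window = new ArrayDeque<>();
        StringBuilder file = new StringBuilder(); // stands in for the temp file
        Cell summary = null;
        for (int r = 0; r < rows; r++) {
            Cell c = new Cell("test");
            if (r == 0) {
                summary = c; // keep a reference to the first cell
            }
            window.addLast(c);
            if (window.size() > windowSize) {
                // the oldest row leaves the window; its value is frozen here
                file.append(window.removeFirst().value).append('\n');
            }
        }
        summary.value = "summary"; // too late if row 0 was already flushed
        while (!window.isEmpty()) {
            file.append(window.removeFirst().value).append('\n');
        }
        return file.substring(0, file.indexOf("\n"));
    }

    public static void main(String[] args) {
        System.out.println(firstCellAfterLateUpdate(50, 100));  // test
        System.out.println(firstCellAfterLateUpdate(100, 100)); // summary
    }
}
```

Two ways around the pitfall follow from the model: compute the summary before row 0 is created, or choose a window large enough that the summary row is still in memory when it is updated.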
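**Scenario 2:** A cell near the top needs a hyperlink to another cell in the current sheet or in another sheet. How can the hyperlink address be set once the row has moved outside the sliding window?
Code for this is usually written as follows:
/**
* Sets the hyperlink through a retained Cell reference.
* warning: incorrect approach
*/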
@Test
public void setHyperLinkByCell() throws IOException {
SXSSFWorkbook wb = new SXSSFWorkbook(50);
CreationHelper createHelper = wb.getCreationHelper();
Sheet sh = wb.createSheet("Target Sheet");
CellStyle hlink_style = wb.createCellStyle();
Font hlink_font = wb.createFont();
hlink_font.setUnderline(Font.U_SINGLE);
hlink_font.setColor(IndexedColors.BLUE.getIndex());
hlink_style.setFont(hlink_font);
Hyperlink link2 = null;
Cell neededSetHyperLinkCell = null;// hold the cell in a variable and set the hyperlink later
for (int rownum = 0; rownum < 100; rownum++) {
Row row = sh.createRow(rownum);
for (int cellnum = 0; cellnum < 5; cellnum++) {
Cell cell = row.createCell(cellnum);
cell.setCellValue("a");
if (cellnum == 0 && rownum == 0) {
neededSetHyperLinkCell = cell;
cell.setCellStyle(hlink_style);
}
}
}
link2 = createHelper.createHyperlink(HyperlinkType.DOCUMENT);
link2.setAddress("'Target Sheet'!A155");
neededSetHyperLinkCell.setHyperlink(link2);// set the hyperlink on the cell; at this point the cell has already been written to the temp file on disk and is no longer accessible
FileOutputStream out = new FileOutputStream("D:\\desktop\\创建超链接.xlsx");
wb.write(out);
out.close();
wb.dispose();
}
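The code above generates the file normally and throws no exception at run time, but Excel reports an error when the generated file is opened.
Once the file has been opened, the cell content and style are displayed correctly, but the hyperlink does not work.
SXSSF reduces memory usage by writing rows that fall outside the sliding window to disk, but not all related content goes to disk: merged regions, hyperlinks, and similar objects are not written to the temporary file.
The official documentation puts it this way: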
Please note that there are still things that still may consume a large amount of memory based on which features you are using, e.g. merged regions, hyperlinks, comments, … are still only stored in memory and thus may require a lot of memory if used extensively.
The correct approach is as follows:
@Test
public void createHyperLink() throws IOException {
SXSSFWorkbook wb = new SXSSFWorkbook(50);
CreationHelper createHelper = wb.getCreationHelper();
Sheet sh = wb.createSheet("Target Sheet");
CellStyle hlink_style = wb.createCellStyle();
Font hlink_font = wb.createFont();
hlink_font.setUnderline(Font.U_SINGLE);
hlink_font.setColor(IndexedColors.BLUE.getIndex());
hlink_style.setFont(hlink_font);
Hyperlink link2 = null;
for (int rownum = 0; rownum < 100; rownum++) {
Row row = sh.createRow(rownum);
for (int cellnum = 0; cellnum < 5; cellnum++) {
Cell cell = row.createCell(cellnum);
cell.setCellValue("a");
if (cellnum == 0 && rownum == 0) {
link2 = createHelper.createHyperlink(HyperlinkType.DOCUMENT);
cell.setHyperlink(link2);
cell.setCellStyle(hlink_style);
}
}
}
link2.setAddress("'Target Sheet'!A155");
FileOutputStream out = new FileOutputStream("D:\\desktop\\创建超链接.xlsx");
wb.write(out);
out.close();
wb.dispose();
}
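
Why the second version works can be sketched with plain Java objects, under the assumption stated above that hyperlinks stay in memory until the workbook is written. `DeferredLinkDemo` and its `Link` class are hypothetical stand-ins, not POI types.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: link objects stay in memory, so a retained reference can be completed later. */
class DeferredLinkDemo {
    static final class Link {
        String cellRef;
        String address;
    }

    /** Stands in for the final write step that serializes all in-memory links. */
    static String serialize(List<Link> links) {
        StringBuilder sb = new StringBuilder();
        for (Link l : links) {
            sb.append(l.cellRef).append(" -> ").append(l.address).append('\n');
        }
        return sb.toString();
    }

    static String run() {
        List<Link> sheetLinks = new ArrayList<>(); // kept in memory until the final write
        Link link = new Link();
        link.cellRef = "A1";
        sheetLinks.add(link); // attached while row 0 is still inside the window

        // ... many more rows are created and flushed here ...

        link.address = "'Target Sheet'!A155"; // completed later through the retained reference
        return serialize(sheetLinks);
    }

    public static void main(String[] args) {
        System.out.print(run()); // A1 -> 'Target Sheet'!A155
    }
}
```

The point is the order of operations: attach the link object to the cell while the cell is still accessible, keep the link rather than the cell, and fill in the address whenever it becomes known.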
#### 3.2 Memory usage: SXSSF vs. XSSF
XSSF and SXSSF were each used to build an Excel file with a single sheet of 1,000 rows and 30 columns, every cell filled with "test":
@Test
public void writeExcelHeapTest() throws IOException, InterruptedException {
System.out.println("Attach the memory profiler now");
Thread.sleep(30*1000);
System.out.println("Starting");
int totalRow = 1000;
Workbook wb = new SXSSFWorkbook(100);
// Workbook wb = new XSSFWorkbook();
Sheet sh = wb.createSheet();
for (int rownum = 0; rownum < totalRow; rownum++) {
Row row = sh.createRow(rownum);
for (int cellnum = 0; cellnum < 30; cellnum++) {
Cell cell = row.createCell(cellnum);
cell.setCellValue("test");
}
}
FileOutputStream out = new FileOutputStream("D:\\desktop\\sxssftest\\heaptest.xlsx");
wb.write(out);
out.close();
System.out.println("Write finished");
}
In the example above, heap usage differs by nearly a factor of five between the two APIs.
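
When no profiler is at hand, a coarse reading can be taken from inside the JVM. The sketch below is an assumption-laden stand-in for the measurement above: `HeapProbe` is a hypothetical helper, the list of string arrays merely imitates 1,000 rows by 30 cells, and `System.gc()` is only a hint, so the numbers are approximate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for coarse heap readings from inside the JVM. */
class HeapProbe {
    /** Heap currently in use, in bytes (total minus free). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.gc(); // only a hint; the JVM may ignore it
        long before = usedHeapBytes();

        // placeholder workload: 1,000 rows x 30 cells, imitating the test data
        List<String[]> rows = new ArrayList<>();
        for (int r = 0; r < 1000; r++) {
            String[] row = new String[30];
            Arrays.fill(row, "test");
            rows.add(row);
        }

        long after = usedHeapBytes();
        System.out.printf("approx. heap growth: %d KB for %d rows%n",
                (after - before) / 1024, rows.size());
    }
}
```

To compare the two APIs this way, replace the placeholder loop with the workbook-building code, once with `XSSFWorkbook` and once with `SXSSFWorkbook`, and run each in a fresh JVM.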
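#### 3.3 Other tests
SXSSF lowers memory usage by limiting how many rows are accessible and writing the rows that are no longer accessible to a file on disk first. The temporary files are created under C:\Users***\AppData\Local\Temp\poifiles.
// compression is off by default; the temp files are plain XML: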
Workbook wb = new SXSSFWorkbook(winSize);
// with compression enabled, the temp files are gzip-compressed:
Workbook wb = new SXSSFWorkbook(null,winSize,true);
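Both variants were run separately.
The test data is 90,000 rows by 30 columns, every cell containing "test".
1. Sliding window size versus elapsed time (figure omitted)
2. Temporary file size with and without compression:
Without compression: about 135 MB
With compression: about 5.66 MB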
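
The large gap is plausible because the temporary file is highly repetitive XML. The JDK-only sketch below gzips some XML-like row data and compares sizes. `GzipRatioDemo` is a hypothetical class and the element layout in `fakeSheetXml` is an assumption for illustration, not the exact format of POI's temporary files.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Shows how well repetitive, XML-like sheet data compresses with gzip. */
class GzipRatioDemo {
    /** Builds XML-like text for rows x cols cells that all contain "test" (assumed layout). */
    static byte[] fakeSheetXml(int rows, int cols) {
        StringBuilder sb = new StringBuilder();
        for (int r = 1; r <= rows; r++) {
            sb.append("<row r=\"").append(r).append("\">");
            for (int c = 0; c < cols; c++) {
                sb.append("<c t=\"inlineStr\"><is><t>test</t></is></c>");
            }
            sb.append("</row>\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Compresses the bytes in memory with gzip. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = fakeSheetXml(1000, 30);
        byte[] packed = gzip(raw);
        System.out.printf("raw: %d bytes, gzip: %d bytes%n", raw.length, packed.length);
    }
}
```

Compression trades disk space for extra CPU work on every flushed row, so it is mainly worth enabling when temporary disk space is the constraint.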