Java POI 读取带有空单元格或者缺失单元格的excel文件

参考文档: http://poi.apache.org/spreadsheet/quick-guide.html#Iterator

请注意MissingCellPolicy定义了如何处理缺失的单元格/空单元格:
    /** Missing cells are returned as null, Blank cells are returned as normal */
    public static final MissingCellPolicy RETURN_NULL_AND_BLANK = new MissingCellPolicy();
    /** Missing cells are returned as null, as are blank cells */
    public static final MissingCellPolicy RETURN_BLANK_AS_NULL = new MissingCellPolicy();
    /** A new, blank cell is created for missing cells. Blank cells are returned as normal */
    public static final MissingCellPolicy CREATE_NULL_AS_BLANK = new MissingCellPolicy();


程序中通过Row.RETURN_NULL_AND_BLANK访问。

这里有一tricky的地方,如果用CellIterator的方式,当碰到空(没有定义,没有编辑过)的单元格时,CellIterator.next()读取的是空单元格右边有内容的单元格,也就是说跳过了空的单元格,如果你的程序是跟位置有关系的话,这就会出错了。比如下面的代码就会跳过空的单元格,
while (rowIterator.hasNext()) {
	row = rowIterator.next();
	Iterator<Cell> cellIterator = row.cellIterator();
	while(cellIterator.hasNext()) {
		Cell cell = cellIterator.next();
		switch (cell.getCellType()) {
			case Cell.CELL_TYPE_BLANK:
			case Cell.CELL_TYPE_STRING:
			   //your logic ...
			   break;
		}
        }
}


Iterate over rows and cells
Sometimes, you'd like to just iterate over all the rows in a sheet, or all the cells in a row. This is possible with a simple for loop.

Luckily, this is very easy. Row defines a CellIterator inner class to handle iterating over the cells (get one with a call to row.cellIterator()), and Sheet provides a rowIterator() method to give an iterator over all the rows. These implement the java.lang.Iterable interface to allow foreach loops.
这种方式本质就是使用CellIterator,跟上面的例子一样会跳过没有定义过的单元格

    
Sheet sheet = wb.getSheetAt(0);
    for (Row row : sheet) {
      for (Cell cell : row) {
        // Do something here
      }
    }


Iterate over cells, with control of missing / blank cells
In some cases, when iterating, you need full control over how missing or blank rows and cells are treated, and you need to ensure you visit every cell and not just those defined in the file. ( The CellIterator will only return the cells defined in the file, which is largely those with values or stylings, but it depends on Excel).

In cases such as these, you should fetch the first and last column information for a row, then call getCell(int, MissingCellPolicy) to fetch the cell. Use a MissingCellPolicy to control how blank or null cells are handled.

    
// Decide which rows to process
    int rowStart = Math.min(15, sheet.getFirstRowNum());
    int rowEnd = Math.max(1400, sheet.getLastRowNum());

    for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
       Row r = sheet.getRow(rowNum);

       int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);

       for (int cn = 0; cn < lastColumn; cn++) {
          Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
          if (c == null) {
             // The spreadsheet is empty in this cell
          } else {
             // Do something useful with the cell's contents
          }
       }
    }


所以如果你需要处理所有的单元格,即不管该单元格有没有定义,有没有内容,你的程序都会依赖这些单元格,那么你应该使用普通的for循环配合getRow(), getCell()方法和MissingCellPolicy策略来循环整个Excel文件的单元格。

你可能感兴趣的:(poi,Excel,xssf,hssf,sheet)