Parsing and Reading Complex Word, Excel, and PPT Files on Android

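  A while ago I tried building a "universal player" for Android, an app that can open files in all sorts of formats, and that naturally included the most common Office formats. After some research I found that the traditional way to parse Word and Excel files directly on Android relies on third-party Java libraries such as tm-extractors-0.4.jar and jxl.jar. The code and results are below.

        Reading Word files uses tm-extractors-0.4.jar. The code: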

  

package com.example.readword;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;

import org.textmining.text.extraction.WordExtractor;

import android.app.Activity;
import android.os.Bundle;
import android.os.Environment;
import android.widget.TextView;


public class MainActivity extends Activity {
    /** Called when the activity is first created. */

    private TextView text;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        text = (TextView) findViewById(R.id.text);

        String str = readWord("/storage/emulated/0/ArcGIS/localtilelayer/11.doc");
        // readWord() returns null when extraction fails, so guard before trimming
        text.setText(str == null ? "" : str.trim());
    }

    public String readWord(String file) {
        String text = null;
        // Open the .doc file; try-with-resources closes the stream when done
        try (FileInputStream in = new FileInputStream(new File(file))) {
            // WordExtractor pulls the plain text out of the binary .doc
            WordExtractor extractor = new WordExtractor();
            text = extractor.extractText(in);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return text;
    }
}

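        The result:



       This is just a document I downloaded at random. As you can see, it can be read, but the formatting is poor. It also reads only doc, not docx, and cannot read the images embedded in the doc. On top of that, once the doc has been opened in WPS it can no longer be read (a disaster for me, since WPS is the only office suite I have installed).

       Next is the code that reads Excel with jxl. It is not complete: it only does the parsing, reading every cell of every row and column, and you can build on it from there. The code: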

     

package com.readexl;

import java.io.FileInputStream;
import java.io.InputStream;

import android.os.Bundle;
import android.os.Environment;
import android.app.Activity;
import android.text.method.ScrollingMovementMethod;
import android.view.Menu;
import android.widget.TextView;

import jxl.*;

public class MainActivity extends Activity {
    TextView txt = null;
    public String filePath_xls = Environment.getExternalStorageDirectory()
            + "/case.xls";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        txt = (TextView)findViewById(R.id.txt_show);
        txt.setMovementMethod(ScrollingMovementMethod.getInstance());
        readExcel();
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.main, menu);
        return true;
    }

    public void readExcel() {
        // Still to do: images and other cell data types in the workbook
        try (InputStream is = new FileInputStream(filePath_xls)) {
            Workbook book = Workbook.getWorkbook(is);

            int num = book.getNumberOfSheets();
            txt.setText("the num of sheets is " + num + "\n");
            // Get the first sheet
            Sheet sheet = book.getSheet(0);
            int rows = sheet.getRows();
            int cols = sheet.getColumns();
            txt.append("the name of sheet is " + sheet.getName() + "\n");
            txt.append("total rows is " + rows + "\n");
            txt.append("total cols is " + cols + "\n");
            // Walk the sheet column by column
            for (int i = 0; i < cols; ++i) {
                for (int j = 0; j < rows; ++j) {
                    // jxl's getCell takes (column, row)
                    txt.append("contents:" + sheet.getCell(i, j).getContents() + "\n");
                }
            }
            book.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}

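The result:

        

       OK, this is only half-finished, but the approach definitely works.

       As you can tell from the above, I am not really satisfied with either method. Before going on, let me explain the difference between doc and docx (xls vs. xlsx and ppt vs. pptx differ in the same way).

       As is well known, doc is the format saved by Word 2003 and earlier, and docx is the format saved by Word 2007 and later. Put simply, doc uses a binary storage format, while docx is XML-based: a docx file is really a ZIP archive. Unpacking a doc yields extension-less file fragments, whereas unpacking a docx yields an XML file and several folders holding the content. The upshot is that docx files are smaller and their images are easier to read. (Reference: http://www.zhihu.com/question/21547795)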
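
To make the "docx is a ZIP archive" point concrete, here is a minimal sketch in plain Java. It does not read a real Word file: `buildFakeDocx` is a hypothetical helper that builds a tiny in-memory ZIP mimicking the usual package layout (`word/document.xml`, `word/media/...`), and `listEntries` lists what is inside. The class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class DocxZipDemo {

    /** Builds a tiny ZIP in memory that mimics the layout of a .docx package (illustrative only). */
    static byte[] buildFakeDocx() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("[Content_Types].xml"));
            zip.write("<Types/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();

            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write("<document/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();

            zip.putNextEntry(new ZipEntry("word/media/image1.png"));
            zip.write(new byte[] {1, 2, 3}); // placeholder bytes, not a real image
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the entry names found in the given ZIP bytes, in archive order. */
    static List<String> listEntries(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : listEntries(buildFakeDocx())) {
            System.out.println(name);
        }
    }
}
```

Pointing `listEntries` at the bytes of a real docx or xlsx should work the same way, which is why the docx and xlsx readers later in this post can get by with `ZipFile` and an XML parser.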

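  Back to the topic. How can we parse Word, Excel, and similar files while keeping the original formatting and also extracting the images, tables, attachments, and so on? With HTML, of course! HTML really is powerful at rendering pages and tables, and the same result is hard to achieve with native views. I went through a lot of material and other people's code online, modified it for my needs, and the result is pretty good.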

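  The library used is POI (a powerful set of packages that can parse nearly every Office format; doc, docx, xls, and xlsx serve as examples here). In the code below, doc and xls go through POI's HWPF and HSSF classes, while docx and xlsx are unzipped and parsed directly with XmlPullParser.


After opening the file, handle it according to its type: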

public void read() {
    // Convert only when the HTML output does not exist yet
    if (!myFile.exists()) {
        if (this.nameStr.endsWith(".doc")) {
            this.getRange();
            this.makeFile();
            this.readDOC();
        } else if (this.nameStr.endsWith(".docx")) {
            this.makeFile();
            this.readDOCX();
        } else if (this.nameStr.endsWith(".xls")) {
            try {
                this.makeFile();
                this.readXLS();
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else if (this.nameStr.endsWith(".xlsx")) {
            try {
                this.makeFile();
                this.readXLSX();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
    // URL of the generated HTML, ready to be loaded into a WebView
    returnPath = "file:///" + myFile;
    // this.view.loadUrl("file:///" + this.htmlPath);
    System.out.println("htmlPath" + this.htmlPath);
}
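
The extension checks above are case-sensitive, so a file named `REPORT.DOCX` would fall through every branch. As a small sketch of one way to make the dispatch more robust (the `OfficeType` enum and `detect` method are hypothetical names, not part of the original code), the file name can be lower-cased once and mapped to a type:

```java
import java.util.Locale;

class OfficeTypeDemo {

    /** The formats handled in this post, plus a fallback. */
    enum OfficeType { DOC, DOCX, XLS, XLSX, UNKNOWN }

    /** Maps a file name to its type, ignoring the case of the extension. */
    static OfficeType detect(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".docx")) return OfficeType.DOCX;
        if (lower.endsWith(".doc")) return OfficeType.DOC;
        if (lower.endsWith(".xlsx")) return OfficeType.XLSX;
        if (lower.endsWith(".xls")) return OfficeType.XLS;
        return OfficeType.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(detect("/sdcard/Report.DOCX"));
        System.out.println(detect("/sdcard/case.xls"));
        System.out.println(detect("/sdcard/notes.txt"));
    }
}
```

A `switch` on the returned value can then call the matching reader, and the `UNKNOWN` case gives one obvious place to report unsupported files.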
First the shared helper, which mainly sets up where the generated HTML file is saved:

public void makeFile() {
    String sdStateString = android.os.Environment.getExternalStorageState(); // external storage state
    // MEDIA_MOUNTED means external storage is present and mounted with read/write access
    if (sdStateString.equals(android.os.Environment.MEDIA_MOUNTED)) {
        try {
            File sdFile = android.os.Environment
                    .getExternalStorageDirectory(); // root of external storage
            String path = sdFile.getAbsolutePath() + File.separator
                    + "library"; // <external storage>/library
            File dirFile = new File(path);
            if (!dirFile.exists()) { // create the folder if it is missing
                dirFile.mkdir();
            }
            File myFile = new File(path + File.separator + filename + ".html"); // <filename>.html
            if (!myFile.exists()) { // create the file if it is missing
                myFile.createNewFile();
            }
            this.htmlPath = myFile.getAbsolutePath(); // remember where the HTML goes
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
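
The same "make sure the folder and file exist" step can be sketched in plain Java, without the Android `Environment` calls, so it can be tried on a desktop JVM. `HtmlTargetDemo` and `ensureHtmlFile` are hypothetical names; the sketch assumes the caller passes in the base directory that `getExternalStorageDirectory()` would return on a device. It uses `mkdirs()` so that missing parent folders are created too, and fails loudly instead of swallowing errors.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class HtmlTargetDemo {

    /** Ensures baseDir/library/baseName.html exists and returns it. */
    static File ensureHtmlFile(File baseDir, String baseName) {
        File dir = new File(baseDir, "library");
        // mkdirs() also creates missing parents; it returns false if nothing was created
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IllegalStateException("cannot create directory " + dir);
        }
        File html = new File(dir, baseName + ".html");
        try {
            html.createNewFile(); // no-op (returns false) when the file already exists
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return html;
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(ensureHtmlFile(base, "demo").getAbsolutePath());
    }
}
```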
Then reading doc:

private void getRange() {
    // Load the binary .doc through POI; the stream is closed automatically
    try (FileInputStream in = new FileInputStream(nameStr)) {
        hwpf = new HWPFDocument(new POIFSFileSystem(in));
    } catch (IOException e) { // also covers FileNotFoundException
        e.printStackTrace();
        return; // nothing was loaded, so there is no range to read
    }

    range = hwpf.getRange(); // the whole document body

    pictures = hwpf.getPicturesTable().getAllPictures(); // every embedded picture

    tableIterator = new TableIterator(range); // iterates over the tables in the body
}
public void readDOC() {
    try {
        myFile = new File(htmlPath);
        output = new FileOutputStream(myFile);
        presentPicture = 0;
        String head = "<html><meta charset=\"utf-8\"><body>";
        String tagBegin = "<p>";
        String tagEnd = "</p>";
        output.write(head.getBytes());
        int numParagraphs = range.numParagraphs(); // number of paragraphs in the document
        for (int i = 0; i < numParagraphs; i++) { // walk every paragraph
            Paragraph p = range.getParagraph(i);
            if (p.isInTable()) {
                int temp = i;
                if (tableIterator.hasNext()) {
                    String tableBegin = "<table style=\"border-collapse:collapse\" border=1 bordercolor=\"black\">";
                    String tableEnd = "</table>";
                    String rowBegin = "<tr>";
                    String rowEnd = "</tr>";
                    String colBegin = "<td>";
                    String colEnd = "</td>";
                    Table table = tableIterator.next();
                    output.write(tableBegin.getBytes());
                    int rows = table.numRows();
                    for (int r = 0; r < rows; r++) {
                        output.write(rowBegin.getBytes());
                        TableRow row = table.getRow(r);
                        int cols = row.numCells();
                        int rowNumParagraphs = row.numParagraphs();
                        int colsNumParagraphs = 0;
                        for (int c = 0; c < cols; c++) {
                            output.write(colBegin.getBytes());
                            TableCell cell = row.getCell(c);
                            int max = temp + cell.numParagraphs();
                            colsNumParagraphs = colsNumParagraphs + cell.numParagraphs();
                            for (int cp = temp; cp < max; cp++) {
                                Paragraph p1 = range.getParagraph(cp);
                                output.write(tagBegin.getBytes());
                                writeParagraphContent(p1);
                                output.write(tagEnd.getBytes());
                                temp++;
                            }
                            output.write(colEnd.getBytes());
                        }
                        // Skip the paragraphs that belong to the row but to none of its cells
                        int max1 = temp + rowNumParagraphs;
                        for (int m = temp + colsNumParagraphs; m < max1; m++) {
                            temp++;
                        }
                        output.write(rowEnd.getBytes());
                    }
                    output.write(tableEnd.getBytes());
                }
                i = temp; // continue after the table
            } else {
                output.write(tagBegin.getBytes());
                writeParagraphContent(p);
                output.write(tagEnd.getBytes());
            }
        }
        String end = "</body></html>";
        output.write(end.getBytes());
        output.close();
    } catch (Exception e) {
        System.out.println("readAndWrite Exception:" + e.getMessage());
        e.printStackTrace();
    }
}
Reading docx:

public void readDOCX() {
    String river = "";
    try {
        this.myFile = new File(this.htmlPath); // the HTML file to generate
        this.output = new FileOutputStream(this.myFile); // stream that writes the HTML
        presentPicture = 0;
        String head = "<html><meta charset=\"utf-8\"><body>"; // declare utf-8, otherwise the text comes out garbled
        String end = "</body></html>";
        String tagBegin = "<p>"; // paragraph start
        String tagEnd = "</p>"; // paragraph end
        String tableBegin = "<table style=\"border-collapse:collapse\" border=1 bordercolor=\"black\">";
        String tableEnd = "</table>";
        String rowBegin = "<tr>";
        String rowEnd = "</tr>";
        String colBegin = "<td>";
        String colEnd = "</td>";
        this.output.write(head.getBytes()); // write the header
        ZipFile docxFile = new ZipFile(new File(this.nameStr));
        ZipEntry documentXML = docxFile.getEntry("word/document.xml");
        InputStream inputStream = docxFile.getInputStream(documentXML);
        XmlPullParser xmlParser = Xml.newPullParser();
        xmlParser.setInput(inputStream, "utf-8");
        int evtType = xmlParser.getEventType();
        boolean isTable = false; // currently inside a table
        boolean isSize = false; // a font-size tag is open
        boolean isColor = false; // a color tag is open
        boolean isCenter = false; // centered
        boolean isRight = false; // right-aligned
        boolean isItalic = false; // italic
        boolean isUnderline = false; // underlined
        boolean isBold = false; // bold
        boolean isR = false; // currently inside a run (r)
        int pictureIndex = 1; // images in the package are named image1, image2, ..., so start at 1
        while (evtType != XmlPullParser.END_DOCUMENT) {
            switch (evtType) {
            // start tag
            case XmlPullParser.START_TAG:
                String tag = xmlParser.getName();
                if (tag.equalsIgnoreCase("r")) {
                    isR = true;
                }
                if (tag.equalsIgnoreCase("u")) { // underline
                    isUnderline = true;
                }
                if (tag.equalsIgnoreCase("jc")) { // alignment
                    String align = xmlParser.getAttributeValue(0);
                    if (align.equals("center")) {
                        this.output.write("<center>".getBytes());
                        isCenter = true;
                    }
                    if (align.equals("right")) {
                        this.output.write("<div align=\"right\">".getBytes());
                        isRight = true;
                    }
                }
                if (tag.equalsIgnoreCase("color")) { // color
                    String color = xmlParser.getAttributeValue(0);
                    this.output.write(("<span style=\"color:" + color + ";\">").getBytes());
                    isColor = true;
                }
                if (tag.equalsIgnoreCase("sz")) { // font size
                    if (isR == true) {
                        int size = decideSize(Integer.valueOf(xmlParser.getAttributeValue(0)));
                        this.output.write(("<font size=" + size + ">").getBytes());
                        isSize = true;
                    }
                }
                // table handling
                if (tag.equalsIgnoreCase("tbl")) { // table start
                    this.output.write(tableBegin.getBytes());
                    isTable = true;
                }
                if (tag.equalsIgnoreCase("tr")) { // row
                    this.output.write(rowBegin.getBytes());
                }
                if (tag.equalsIgnoreCase("tc")) { // cell
                    this.output.write(colBegin.getBytes());
                }
                if (tag.equalsIgnoreCase("pic")) { // picture
                    String entryName_jpeg = "word/media/image" + pictureIndex + ".jpeg";
                    String entryName_png = "word/media/image" + pictureIndex + ".png";
                    String entryName_gif = "word/media/image" + pictureIndex + ".gif";
                    String entryName_wmf = "word/media/image" + pictureIndex + ".wmf";
                    ZipEntry sharePicture = null;
                    InputStream pictIS = null;
                    // Try each known extension until the entry is found
                    sharePicture = docxFile.getEntry(entryName_jpeg);
                    if (sharePicture == null) {
                        sharePicture = docxFile.getEntry(entryName_png);
                    }
                    if (sharePicture == null) {
                        sharePicture = docxFile.getEntry(entryName_gif);
                    }
                    if (sharePicture == null) {
                        sharePicture = docxFile.getEntry(entryName_wmf);
                    }
                    if (sharePicture != null) {
                        // Read the picture from the package into a byte array
                        pictIS = docxFile.getInputStream(sharePicture);
                        ByteArrayOutputStream pOut = new ByteArrayOutputStream();
                        byte[] b = new byte[1000];
                        int len = 0;
                        while ((len = pictIS.read(b)) != -1) {
                            pOut.write(b, 0, len);
                        }
                        pictIS.close();
                        pOut.close();
                        writeDOCXPicture(pOut.toByteArray());
                    }
                    pictureIndex++; // move on to the next picture
                }
                if (tag.equalsIgnoreCase("b")) { // bold
                    isBold = true;
                }
                if (tag.equalsIgnoreCase("p")) { // paragraph; ignored inside a table
                    if (isTable == false) {
                        this.output.write(tagBegin.getBytes());
                    }
                }
                if (tag.equalsIgnoreCase("i")) { // italic
                    isItalic = true;
                }
                // text node
                if (tag.equalsIgnoreCase("t")) {
                    if (isBold == true) {
                        this.output.write("<b>".getBytes());
                    }
                    if (isUnderline == true) {
                        this.output.write("<u>".getBytes());
                    }
                    if (isItalic == true) {
                        output.write("<i>".getBytes());
                    }
                    river = xmlParser.nextText();
                    this.output.write(river.getBytes()); // write the text itself
                    // Close the tags in reverse order and reset the flags
                    if (isItalic == true) {
                        this.output.write("</i>".getBytes());
                        isItalic = false;
                    }
                    if (isUnderline == true) {
                        this.output.write("</u>".getBytes());
                        isUnderline = false;
                    }
                    if (isBold == true) {
                        this.output.write("</b>".getBytes());
                        isBold = false;
                    }
                    if (isSize == true) {
                        this.output.write("</font>".getBytes());
                        isSize = false;
                    }
                    if (isColor == true) {
                        this.output.write("</span>".getBytes());
                        isColor = false;
                    }
                    if (isCenter == true) {
                        this.output.write("</center>".getBytes());
                        isCenter = false;
                    }
                    if (isRight == true) { // right alignment via div may misbehave; good enough for now
                        this.output.write("</div>".getBytes());
                        isRight = false;
                    }
                }
                break;
            // end tag
            case XmlPullParser.END_TAG:
                String tag2 = xmlParser.getName();
                if (tag2.equalsIgnoreCase("tbl")) { // table end
                    this.output.write(tableEnd.getBytes());
                    isTable = false;
                }
                if (tag2.equalsIgnoreCase("tr")) { // row end
                    this.output.write(rowEnd.getBytes());
                }
                if (tag2.equalsIgnoreCase("tc")) { // cell end
                    this.output.write(colEnd.getBytes());
                }
                if (tag2.equalsIgnoreCase("p")) { // paragraph end; ignored inside a table
                    if (isTable == false) {
                        this.output.write(tagEnd.getBytes());
                    }
                }
                if (tag2.equalsIgnoreCase("r")) {
                    isR = false;
                }
                break;
            default:
                break;
            }
            evtType = xmlParser.next();
        }
        this.output.write(end.getBytes());
        this.output.close();
        docxFile.close();
    } catch (ZipException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (XmlPullParserException e) {
        e.printStackTrace();
    }
    if (river == null) {
        river = "Failed to parse the file";
    }
}

Reading xls:

public StringBuffer readXLS() throws Exception {
    myFile = new File(htmlPath);
    output = new FileOutputStream(myFile);
    lsb.append("<html>");
    lsb.append("<body>");
    HSSFSheet sheet = null;

    String excelFileName = nameStr;
    try {
        HSSFWorkbook workbook = new HSSFWorkbook(new FileInputStream(excelFileName)); // the whole workbook

        for (int sheetIndex = 0; sheetIndex < workbook.getNumberOfSheets(); sheetIndex++) {
            sheet = workbook.getSheetAt(sheetIndex); // each sheet in turn
            if (sheet == null) {
                continue;
            }
            int firstRowNum = sheet.getFirstRowNum(); // first row
            int lastRowNum = sheet.getLastRowNum(); // last row
            lsb.append("<table width=\"100%\" style=\"border:1px solid #000;border-width:1px 0 0 1px;margin:2px 0 2px 0;border-collapse:collapse;\">");

            for (int rowNum = firstRowNum; rowNum <= lastRowNum; rowNum++) {
                HSSFRow row = sheet.getRow(rowNum);
                if (row == null) { // skip missing rows
                    continue;
                }
                short firstCellNum = row.getFirstCellNum(); // first cell of the row
                short lastCellNum = row.getLastCellNum(); // index of the last cell plus one
                int height = (int) (row.getHeight() / 15.625); // row height
                lsb.append("<tr height=\""
                        + height
                        + "\" style=\"border:1px solid #000;border-width:0 1px 1px 0;margin:2px 0 2px 0;\">");
                for (short cellNum = firstCellNum; cellNum < lastCellNum; cellNum++) { // every cell of the row
                    HSSFCell cell = row.getCell(cellNum);
                    if (cell == null || cell.getCellType() == HSSFCell.CELL_TYPE_BLANK) {
                        continue;
                    }
                    StringBuffer tdStyle = new StringBuffer(
                            "<td style=\"border:1px solid #000; border-width:0 1px 1px 0;margin:2px 0 2px 0; ");
                    HSSFCellStyle cellStyle = cell.getCellStyle();
                    HSSFPalette palette = workbook.getCustomPalette(); // HSSFPalette resolves color indexes to RGB
                    HSSFColor hColor = palette.getColor(cellStyle.getFillForegroundColor());
                    HSSFColor hColor2 = palette.getColor(cellStyle.getFont(workbook).getColor());

                    String bgColor = convertToStardColor(hColor); // background color
                    short boldWeight = cellStyle.getFont(workbook).getBoldweight(); // font weight
                    short fontHeight = (short) (cellStyle.getFont(workbook).getFontHeight() / 2); // font size
                    String fontColor = convertToStardColor(hColor2); // font color
                    if (bgColor != null && !"".equals(bgColor.trim())) {
                        tdStyle.append(" background-color:" + bgColor + "; ");
                    }
                    if (fontColor != null && !"".equals(fontColor.trim())) {
                        tdStyle.append(" color:" + fontColor + "; ");
                    }
                    tdStyle.append(" font-weight:" + boldWeight + "; ");
                    tdStyle.append(" font-size: " + fontHeight + "%;");
                    lsb.append(tdStyle + "\"");

                    int width = (int) (sheet.getColumnWidth(cellNum) / 35.7);
                    int cellReginCol = getMergerCellRegionCol(sheet, rowNum, cellNum); // merged columns (colspan)
                    int cellReginRow = getMergerCellRegionRow(sheet, rowNum, cellNum); // merged rows (rowspan)
                    String align = convertAlignToHtml(cellStyle.getAlignment()); // horizontal alignment
                    String vAlign = convertVerticalAlignToHtml(cellStyle.getVerticalAlignment());

                    lsb.append(" align=\"" + align + "\" valign=\"" + vAlign
                            + "\" width=\"" + width + "\" ");
                    // Only emit span attributes for cells that really are merged
                    if (cellReginCol > 1) {
                        lsb.append(" colspan=\"" + cellReginCol + "\"");
                    }
                    if (cellReginRow > 1) {
                        lsb.append(" rowspan=\"" + cellReginRow + "\"");
                    }
                    lsb.append(">" + getCellValue(cell) + "</td>");
                }
                lsb.append("</tr>");
            }
            lsb.append("</table>");
        }
        lsb.append("</body></html>");
        output.write(lsb.toString().getBytes());
        output.close();
    } catch (FileNotFoundException e) {
        throw new Exception("File " + excelFileName + " not found!");
    } catch (IOException e) {
        throw new Exception("Error processing file " + excelFileName + " ("
                + e.getMessage() + ")!");
    }
    return lsb;
}

Reading xlsx:

public void readXLSX() {
    try {
        this.myFile = new File(this.htmlPath); // the HTML file to generate
        this.output = new FileOutputStream(this.myFile); // stream that writes the HTML
        String head = "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"\"http://www.w3.org/TR/html4/loose.dtd\"><html><meta charset=\"utf-8\"><body>"; // declare utf-8, otherwise the text comes out garbled
        String tableBegin = "<table style=\"border-collapse:collapse\" border=1 bordercolor=\"black\">";
        String tableEnd = "</table>";
        String rowBegin = "<tr>";
        String rowEnd = "</tr>";
        String colBegin = "<td>";
        String colEnd = "</td>";
        String end = "</body></html>";
        this.output.write(head.getBytes());
        this.output.write(tableBegin.getBytes());
        String str = "";
        String v = null;
        boolean flat = false; // true when the current cell stores a shared-string index
        List<String> ls = new ArrayList<String>(); // shared strings, in file order
        try {
            ZipFile xlsxFile = new ZipFile(new File(this.nameStr));
            // Pass 1: collect the shared strings (the entry is absent when the workbook has no text)
            ZipEntry sharedStringXML = xlsxFile.getEntry("xl/sharedStrings.xml");
            if (sharedStringXML != null) {
                InputStream inputStream = xlsxFile.getInputStream(sharedStringXML);
                XmlPullParser xmlParser = Xml.newPullParser();
                xmlParser.setInput(inputStream, "utf-8");
                int evtType = xmlParser.getEventType();
                while (evtType != XmlPullParser.END_DOCUMENT) {
                    if (evtType == XmlPullParser.START_TAG
                            && xmlParser.getName().equalsIgnoreCase("t")) {
                        ls.add(xmlParser.nextText());
                    }
                    evtType = xmlParser.next();
                }
                inputStream.close();
            }
            // Pass 2: walk the first sheet and emit one table row per <row>
            ZipEntry sheetXML = xlsxFile.getEntry("xl/worksheets/sheet1.xml");
            InputStream inputStreamsheet = xlsxFile.getInputStream(sheetXML);
            XmlPullParser xmlParsersheet = Xml.newPullParser();
            xmlParsersheet.setInput(inputStreamsheet, "utf-8");
            int evtTypesheet = xmlParsersheet.getEventType();
            while (evtTypesheet != XmlPullParser.END_DOCUMENT) {
                switch (evtTypesheet) {
                case XmlPullParser.START_TAG:
                    String tag = xmlParsersheet.getName();
                    if (tag.equalsIgnoreCase("row")) {
                        this.output.write(rowBegin.getBytes());
                    } else if (tag.equalsIgnoreCase("c")) {
                        // t="s" marks a shared-string cell; other cells carry the value inline
                        String t = xmlParsersheet.getAttributeValue(null, "t");
                        flat = "s".equals(t);
                        // Open the cell here so that empty cells still produce a <td>
                        this.output.write(colBegin.getBytes());
                    } else if (tag.equalsIgnoreCase("v")) {
                        v = xmlParsersheet.nextText();
                        if (v != null) {
                            if (flat) {
                                str = ls.get(Integer.parseInt(v));
                            } else {
                                str = v;
                            }
                            this.output.write(str.getBytes());
                        }
                    }
                    break;
                case XmlPullParser.END_TAG:
                    String endTag = xmlParsersheet.getName();
                    if (endTag.equalsIgnoreCase("c")) {
                        this.output.write(colEnd.getBytes());
                    } else if (endTag.equalsIgnoreCase("row")) {
                        this.output.write(rowEnd.getBytes());
                    }
                    break;
                default:
                    break;
                }
                evtTypesheet = xmlParsersheet.next();
            }
            inputStreamsheet.close();
            xlsxFile.close();
            System.out.println(str);
        } catch (ZipException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (XmlPullParserException e) {
            e.printStackTrace();
        }
        this.output.write(tableEnd.getBytes());
        this.output.write(end.getBytes());
        this.output.close();
    } catch (Exception e) {
        System.out.println("readAndWrite Exception");
    }
}
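
The key idea in the xlsx reader is the shared-string lookup: text cells store only an index, and the actual text lives in `xl/sharedStrings.xml`. The following is a minimal desktop-JVM sketch of that lookup. It swaps Android's `XmlPullParser` for the standard StAX API (`javax.xml.stream`) so that it runs without Android, and it works on two small hand-written XML strings rather than a real workbook. The class name, method names, and sample XML are all made up for illustration, and the sketch assumes one `<t>` per shared string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SharedStringsDemo {

    static final String NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Hand-written stand-ins for xl/sharedStrings.xml and xl/worksheets/sheet1.xml
    static final String SHARED_XML =
            "<sst xmlns=\"" + NS + "\"><si><t>Name</t></si><si><t>Age</t></si></sst>";
    static final String SHEET_XML =
            "<worksheet xmlns=\"" + NS + "\"><sheetData>"
            + "<row r=\"1\"><c r=\"A1\" t=\"s\"><v>0</v></c><c r=\"B1\" t=\"s\"><v>1</v></c></row>"
            + "<row r=\"2\"><c r=\"A2\"><v>42</v></c><c r=\"B2\"/></row>"
            + "</sheetData></worksheet>";

    /** Collects the text of every <t> element, in document order. */
    static List<String> readSharedStrings(String xml) {
        List<String> strings = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "t".equals(reader.getLocalName())) {
                    strings.add(reader.getElementText());
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return strings;
    }

    /** Returns the sheet as rows of display strings, resolving t="s" cells via the shared strings. */
    static List<List<String>> readRows(String sheetXml, List<String> shared) {
        List<List<String>> rows = new ArrayList<>();
        List<String> current = new ArrayList<>();
        boolean isShared = false;
        String value = "";
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(sheetXml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("row".equals(name)) {
                        current = new ArrayList<>();
                    } else if ("c".equals(name)) {
                        isShared = "s".equals(reader.getAttributeValue(null, "t"));
                        value = ""; // stays empty when the cell has no <v>
                    } else if ("v".equals(name)) {
                        String raw = reader.getElementText();
                        value = isShared ? shared.get(Integer.parseInt(raw)) : raw;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("c".equals(name)) {
                        current.add(value);
                    } else if ("row".equals(name)) {
                        rows.add(current);
                    }
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(readRows(SHEET_XML, readSharedStrings(SHARED_XML)));
    }
}
```

Emitting the cell on the closing `c` tag rather than on `v` is a deliberate choice in this sketch: an empty cell has no `v` child at all, so waiting for the end of the cell keeps the columns aligned.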
Cell helpers:

/**
 * Returns the value of a cell.
 *
 * @param cell the cell to read
 * @return the cell content as a display string, or "" for blank cells
 * @throws IOException
 */
private static Object getCellValue(HSSFCell cell) throws IOException {
    Object value = "";
    if (cell.getCellType() == HSSFCell.CELL_TYPE_STRING) {
        value = cell.getRichStringCellValue().toString();
    } else if (cell.getCellType() == HSSFCell.CELL_TYPE_NUMERIC) {
        if (HSSFDateUtil.isCellDateFormatted(cell)) {
            Date date = cell.getDateCellValue();
            SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
            value = sdf.format(date);
        } else {
            // Show at most three decimal places and drop trailing zeros
            DecimalFormat format = new DecimalFormat("#0.###");
            value = format.format(cell.getNumericCellValue());
        }
    }
    // Blank cells and any other cell types fall through as ""
    return value;
}

/**
 * Checks whether a cell lies inside a merged region and, if so, returns
 * the number of columns that region spans.
 *
 * @param sheet
 *            the worksheet
 * @param cellRow
 *            row index of the cell being checked
 * @param cellCol
 *            column index of the cell being checked
 * @return the number of merged columns, or 1 if the cell is not merged
 * @throws IOException
 */
private static int getMergerCellRegionCol(HSSFSheet sheet, int cellRow,
        int cellCol) throws IOException {
    int retVal = 1; // an unmerged cell spans exactly one column
    int sheetMergerCount = sheet.getNumMergedRegions();
    for (int i = 0; i < sheetMergerCount; i++) {
        CellRangeAddress cra = (CellRangeAddress) sheet.getMergedRegion(i);
        int firstRow = cra.getFirstRow(); // first row of the merged region
        int firstCol = cra.getFirstColumn(); // first column of the merged region
        int lastRow = cra.getLastRow(); // last row of the merged region
        int lastCol = cra.getLastColumn(); // last column of the merged region
        if (cellRow >= firstRow && cellRow <= lastRow) { // is the cell inside this region?
            if (cellCol >= firstCol && cellCol <= lastCol) {
                retVal = lastCol - firstCol + 1; // number of merged columns
                break;
            }
        }
    }
    return retVal;
}

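/**
 * Checks whether a cell lies inside a merged region and, if so, returns
 * the number of rows that region spans.
 *
 * @param sheet
 *            the worksheet
 * @param cellRow
 *            row index of the cell being checked
 * @param cellCol
 *            column index of the cell being checked
 * @return the number of merged rows, or 1 if the cell is not merged
 * @throws IOException
 */
private static int getMergerCellRegionRow(HSSFSheet sheet, int cellRow,
        int cellCol) throws IOException {
    int retVal = 1; // an unmerged cell spans exactly one row
    int sheetMergerCount = sheet.getNumMergedRegions();
    for (int i = 0; i < sheetMergerCount; i++) {
        CellRangeAddress cra = (CellRangeAddress) sheet.getMergedRegion(i);
        int firstRow = cra.getFirstRow(); // first row of the merged region
        int firstCol = cra.getFirstColumn(); // first column of the merged region
        int lastRow = cra.getLastRow(); // last row of the merged region
        int lastCol = cra.getLastColumn(); // last column of the merged region
        if (cellRow >= firstRow && cellRow <= lastRow) { // is the cell inside this region?
            if (cellCol >= firstCol && cellCol <= lastCol) {
                retVal = lastRow - firstRow + 1; // number of merged rows
                break;
            }
        }
    }
    return retVal; // the original returned 0 here, discarding the computed span
}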
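
One thing the two helpers above do not distinguish is which cell of a merged region is being asked about. In an HTML table, normally only the top-left cell of a merged block carries `colspan`/`rowspan`, and the cells it covers are not emitted at all. Below is a small sketch of that rule in plain Java, with no POI dependency. `Region` is a hypothetical stand-in for the bounds of a `CellRangeAddress`, and `spanFor` is a made-up helper, so treat this as one possible refinement rather than what the original code does.

```java
import java.util.ArrayList;
import java.util.List;

class MergedSpanDemo {

    /** Inclusive bounds of one merged region (stand-in for POI's CellRangeAddress). */
    static final class Region {
        final int firstRow, lastRow, firstCol, lastCol;

        Region(int firstRow, int lastRow, int firstCol, int lastCol) {
            this.firstRow = firstRow;
            this.lastRow = lastRow;
            this.firstCol = firstCol;
            this.lastCol = lastCol;
        }
    }

    /**
     * Returns {rowspan, colspan} for the cell at (row, col):
     * {1, 1} for an unmerged cell, the region size for the top-left cell of a
     * merged region, and null for a cell covered by a region (emit no td for it).
     */
    static int[] spanFor(List<Region> regions, int row, int col) {
        for (Region m : regions) {
            boolean inside = row >= m.firstRow && row <= m.lastRow
                    && col >= m.firstCol && col <= m.lastCol;
            if (!inside) {
                continue;
            }
            if (row == m.firstRow && col == m.firstCol) {
                return new int[] {m.lastRow - m.firstRow + 1, m.lastCol - m.firstCol + 1};
            }
            return null; // covered by the region's top-left cell
        }
        return new int[] {1, 1};
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 1, 0, 2)); // rows 0-1, columns 0-2 merged
        int[] span = spanFor(regions, 0, 0);
        System.out.println("rowspan=" + span[0] + " colspan=" + span[1]);
        System.out.println("covered cell skipped: " + (spanFor(regions, 1, 1) == null));
    }
}
```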

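/**
 * Converts a cell color to an HTML hex color string.
 *
 * @param hc the POI color, may be null
 * @return "#rrggbb", "" if hc is null, or null for the automatic color
 */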
private String convertToStardColor(HSSFColor hc) {
        StringBuffer sb = new StringBuffer("");
        if (hc != null) {
        int a = HSSFColor.AUTOMATIC.index;
        int b = hc.getIndex();
        if (a == b) {
        return null;
        }
        sb.append("#");
        for (int i = 0; i < hc.getTriplet().length; i++) {
        String str;
        String str_tmp = Integer.toHexString(hc.getTriplet()[i]);
        if (str_tmp != null && str_tmp.length() < 2) {
        str = "0" + str_tmp;
        } else {
        str = str_tmp;
        }
        sb.append(str);
        }
        }
        return sb.toString();
        }
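
The hex conversion in the loop above pads each component by hand. As a compact alternative sketch, `String.format` with `%02x` does the zero-padding. `HexColorDemo` and `toHex` are hypothetical names; the `short[]` parameter mirrors the shape of the triplet returned by `getTriplet()`.

```java
class HexColorDemo {

    /** Formats an RGB triplet (values 0-255) as a lowercase "#rrggbb" string. */
    static String toHex(short[] rgb) {
        // & 0xFF keeps each component within one byte before formatting
        return String.format("#%02x%02x%02x", rgb[0] & 0xFF, rgb[1] & 0xFF, rgb[2] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(toHex(new short[] {255, 0, 10})); // prints "#ff000a"
    }
}
```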

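/**
 * Converts a cell's horizontal alignment to its HTML value.
 *
 * @param alignment the HSSFCellStyle alignment constant
 * @return "left", "center", or "right"
 */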
private String convertAlignToHtml(short alignment) {
        String align = "left";
        switch (alignment) {
        case HSSFCellStyle.ALIGN_LEFT:
        align = "left";
        break;
        case HSSFCellStyle.ALIGN_CENTER:
        align = "center";
        break;
        case HSSFCellStyle.ALIGN_RIGHT:
        align = "right";
        break;
default:
        break;
        }
        return align;
        }

/**
 * Converts a cell's vertical alignment to its HTML value.
 *
 * @param verticalAlignment the HSSFCellStyle vertical alignment constant
 * @return "top", "middle", or "bottom"
 */
private String convertVerticalAlignToHtml(short verticalAlignment) {
    String valign = "middle";
    switch (verticalAlignment) {
    case HSSFCellStyle.VERTICAL_BOTTOM:
        valign = "bottom";
        break;
    case HSSFCellStyle.VERTICAL_CENTER:
        valign = "middle"; // HTML's valign uses "middle", not "center"
        break;
    case HSSFCellStyle.VERTICAL_TOP:
        valign = "top";
        break;
    default:
        break;
    }
    return valign;
}

Saving and reading images:

/* Creates the image file on external storage */
public void makePictureFile() {
    String sdString = android.os.Environment.getExternalStorageState(); // external storage state
    // MEDIA_MOUNTED means external storage is present and mounted with read/write access
    if (sdString.equals(android.os.Environment.MEDIA_MOUNTED)) {
        try {
            File picFile = android.os.Environment
                    .getExternalStorageDirectory(); // root of external storage
            String picPath = picFile.getAbsolutePath() + File.separator
                    + "library"; // same folder as the generated HTML
            File picDirFile = new File(picPath);
            if (!picDirFile.exists()) {
                picDirFile.mkdir();
            }
            // One file per picture: <document name><index>.jpg
            File pictureFile = new File(picPath + File.separator
                    + getFileName(nameStr) + presentPicture + ".jpg");
            if (!pictureFile.exists()) {
                pictureFile.createNewFile();
            }
            this.picturePath = pictureFile.getAbsolutePath(); // absolute path of the image file
        } catch (Exception e) {
            System.out.println("PictureFile Catch Exception");
        }
    }
}

public String getFileName(String pathandname){

        int start=pathandname.lastIndexOf("/");
        int end=pathandname.lastIndexOf(".");
        if(start!=-1 && end!=-1){
        return pathandname.substring(start+1,end);
        }else{
        return null;
        }

        }


public void writePicture() {
    Picture picture = (Picture) pictures.get(presentPicture);
    byte[] pictureBytes = picture.getContent();

    makePictureFile();
    presentPicture++;

    // Save the raw picture bytes next to the HTML file
    File myPicture = new File(picturePath);
    try {
        FileOutputStream outputPicture = new FileOutputStream(myPicture);
        outputPicture.write(pictureBytes);
        outputPicture.close();
    } catch (Exception e) {
        System.out.println("outputPicture Exception");
    }

    // Reference the saved image from the HTML
    String imageString = "<img src=\"" + picturePath + "\"";
    imageString = imageString + ">";

    try {
        output.write(imageString.getBytes());
    } catch (Exception e) {
        System.out.println("output Exception");
    }
}
public void writeDOCXPicture(byte[] pictureBytes) {
    makePictureFile();
    this.presentPicture++;
    // Save the raw picture bytes next to the HTML file
    File myPicture = new File(this.picturePath);
    try {
        FileOutputStream outputPicture = new FileOutputStream(myPicture);
        outputPicture.write(pictureBytes);
        outputPicture.close();
    } catch (Exception e) {
        System.out.println("outputPicture Exception");
    }
    // Reference the saved image from the HTML
    String imageString = "<img src=\"" + this.picturePath + "\"";
    imageString = imageString + ">";
    try {
        this.output.write(imageString.getBytes());
    } catch (Exception e) {
        System.out.println("output Exception");
    }
}

public void writeParagraphContent(Paragraph paragraph) {
    Paragraph p = paragraph;
    int pnumCharacterRuns = p.numCharacterRuns();

    for (int j = 0; j < pnumCharacterRuns; j++) {
        CharacterRun run = p.getCharacterRun(j);

        // Picture offsets of 0 or >= 1000 are treated as picture runs
        if (run.getPicOffset() == 0 || run.getPicOffset() >= 1000) {
            if (presentPicture < pictures.size()) {
                writePicture();
            }
        } else {
            try {
                String text = run.text();
                if (text.length() >= 2 && pnumCharacterRuns < 2) {
                    // A paragraph with a single run is written without styling
                    output.write(text.getBytes());
                } else {
                    int size = run.getFontSize();
                    int color = run.getColor();
                    String fontSizeBegin = "<font size=\""
                            + decideSize(size) + "\">";
                    String fontColorBegin = "<font color=\""
                            + decideColor(color) + "\">";
                    String fontEnd = "</font>";
                    String boldBegin = "<b>";
                    String boldEnd = "</b>";
                    String islaBegin = "<i>";
                    String islaEnd = "</i>";

                    output.write(fontSizeBegin.getBytes());
                    output.write(fontColorBegin.getBytes());

                    if (run.isBold()) {
                        output.write(boldBegin.getBytes());
                    }
                    if (run.isItalic()) {
                        output.write(islaBegin.getBytes());
                    }

                    output.write(text.getBytes());

                    // Close in reverse order so the tags nest properly
                    if (run.isItalic()) {
                        output.write(islaEnd.getBytes());
                    }
                    if (run.isBold()) {
                        output.write(boldEnd.getBytes());
                    }
                    output.write(fontEnd.getBytes()); // closes the color font tag
                    output.write(fontEnd.getBytes()); // closes the size font tag
                }
            } catch (Exception e) {
                System.out.println("Write File Exception");
            }
        }
    }
}
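
One caveat with writing run text straight into the HTML: characters such as `<` or `&` in the document would be interpreted as markup. A minimal sketch of an escaping helper follows. `HtmlEscapeDemo` and `escapeHtml` are hypothetical names and are not part of the original code; the idea would be to pass the text through such a helper before each `output.write(text.getBytes())`.

```java
class HtmlEscapeDemo {

    /** Replaces the characters that are special in HTML with their entities. */
    static String escapeHtml(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
            case '&':
                sb.append("&amp;");
                break;
            case '<':
                sb.append("&lt;");
                break;
            case '>':
                sb.append("&gt;");
                break;
            case '"':
                sb.append("&quot;");
                break;
            default:
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("1 < 2 & 3 > 2")); // prints "1 &lt; 2 &amp; 3 &gt; 2"
    }
}
```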
Some formatting helpers:

public int decideSize(int size) {

        if (size >= 1 && size <= 8) {
        return 1;
        }
        if (size >= 9 && size <= 11) {
        return 2;
        }
        if (size >= 12 && size <= 14) {
        return 3;
        }
        if (size >= 15 && size <= 19) {
        return 4;
        }
        if (size >= 20 && size <= 29) {
        return 5;
        }
        if (size >= 30 && size <= 39) {
        return 6;
        }
        if (size >= 40) {
        return 7;
        }
        return 3;
        }

private String decideColor(int a) {
        int color = a;
        switch (color) {
        case 1:
        return "#000000";
        case 2:
        return "#0000FF";
        case 3:
        case 4:
        return "#00FF00";
        case 5:
        case 6:
        return "#FF0000";
        case 7:
        return "#FFFF00";
        case 8:
        return "#FFFFFF";
        case 9:
        return "#CCCCCC";
        case 10:
        case 11:
        return "#00FF00";
        case 12:
        return "#080808";
        case 13:
        case 14:
        return "#FFFF00";
        case 15:
        return "#CCCCCC";
        case 16:
        return "#080808";
default:
        return "#000000";
        }
        }

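OK, all done. Let's look at the results.

I set up a dialog-style activity to display the output, with zooming in and out:

Reading doc:


Reading docx:

                                             

Reading xls and xlsx:


The demo code has been uploaded: http://download.csdn.NET/detail/bit_kaki/9676592 (it costs two points). If you don't have enough points, send me a private message or leave your address in a comment, and I will reply when I see it.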
