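Processing Documents with Aspose.Words for .NET

1. Introduction

Aspose.Words for .NET is a class library that lets your application perform a wide range of document processing tasks.
Aspose.Words supports most popular document formats, such as DOC, DOCX, RTF, HTML, Markdown, PDF, XPS, and EPUB.
With Aspose.Words for .NET you can generate, modify, convert, render, and print documents without third-party applications or Office automation.
Aspose is a commercial library; for a commercial license, see the official website.
The basics of Aspose are not covered here; for those, see the official website.
Official documentation: Aspose.Words for .NET | Documentation

Basic usage: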

using System;
using System.IO;
using Aspose.Words;

var baseDirectory = AppDomain.CurrentDomain.BaseDirectory;
// To start from a Word template, load it instead:
//var fileName = "template";
//string extension = ".docx";
//Document doc = new Document(Path.Combine(baseDirectory, fileName + extension));
// When no Word template is used, simply create a new Document object.
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

builder.Writeln("Just a test");
doc.Save(Path.Combine(baseDirectory, "output.docx"));

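2. Methods commonly used in projects

Our business works with Word templates (documents prepared in advance with business-variable placeholders), which requires a lot of template manipulation. The helper methods below came out of that work:
Common business helper methods for processing documents with Aspose.Words

2.1 Text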

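// Requires: using System.Text.RegularExpressions;

/// <summary>
/// Text replacement helper.
/// </summary>
/// <param name="doc">The document to modify.</param>
/// <param name="oldString">Placeholder name; it is wrapped in {} automatically.</param>
/// <param name="newString">Replacement text; null is treated as an empty string.</param>
public static void ReplaceHelper(this Document doc, string oldString, string newString)
{
    doc.Range.Replace(R("{" + oldString + "}"), newString ?? string.Empty);
}
/// <summary>
/// Creates a regular expression that matches the given text literally.
/// </summary>
/// <param name="pattern">The literal text to match.</param>
/// <returns>A regex matching exactly that text.</returns>
private static Regex R(string pattern)
{
    // Escape the text so characters such as '{' or '.' are not read as regex syntax.
    return new Regex(Regex.Escape(pattern));
}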

2.2 Paragraphs

/// <summary>
/// Inserts a paragraph after the given node.
/// </summary>
/// <param name="documentBuilder">The builder used to position and insert.</param>
/// <param name="refNode">The reference node to insert after.</param>
/// <param name="text">The text of the new paragraph.</param>
public static void InsertParagraphAfter(this DocumentBuilder documentBuilder, Node refNode, string text)
{
    var doc = documentBuilder.Document;
    if (documentBuilder.CurrentNode != refNode)
    {
        documentBuilder.MoveTo(refNode);
    }
    var p = documentBuilder.InsertParagraph();
    Run run = new Run(doc, text);
    if (refNode is Paragraph refParagraph)
    {
        // Copy the font settings of the reference paragraph's first run, if it has one.
        var font = refParagraph.Runs.OfType<Run>().FirstOrDefault()?.Font;
        if (font != null)
        {
            run.Font.Bidi = font.Bidi;
            run.Font.Color = font.Color;
            run.Font.Bold = font.Bold;
            run.Font.BoldBi = font.BoldBi;
            run.Font.Italic = font.Italic;
            run.Font.ItalicBi = font.ItalicBi;
            run.Font.NoProofing = font.NoProofing;
            run.Font.Outline = font.Outline;
            run.Font.Scaling = font.Scaling;
            run.Font.Shadow = font.Shadow;
            run.Font.Size = font.Size;
            run.Font.SizeBi = font.SizeBi;
            run.Font.SmallCaps = font.SmallCaps;
            run.Font.Spacing = font.Spacing;
            run.Font.Style = font.Style;
            run.Font.TextEffect = font.TextEffect;
        }
    }
    p.AppendChild(run);
}

2.3 Tables

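/// <summary>
/// Gets all tables in the document.
/// </summary>
/// <param name="doc">The document to search.</param>
/// <returns>Every table in the document, nested tables included.</returns>
public static List<Table> GetAllTables(this Document doc)
{
    NodeCollection tables = doc.GetChildNodes(NodeType.Table, true);
    return tables.OfType<Table>().ToList();
}

/// <summary>
/// Gets a table by index.
/// </summary>
/// <param name="doc">The document to search.</param>
/// <param name="index">Zero-based index of the table.</param>
/// <returns>The table at that index.</returns>
public static Table GetTableByIndex(this Document doc, int index)
{
    Table table = (Table)doc.GetChild(NodeType.Table, index, true);
    return table;
}

/// <summary>
/// Gets a table by its title.
/// </summary>
/// <param name="doc">The document to search.</param>
/// <param name="title">The table title (set as alt text in Word).</param>
/// <returns>The first table with that title, or null if none matches.</returns>
public static Table GetTableByTitle(this Document doc, string title)
{
    return doc.GetAllTables().FirstOrDefault(t => t.Title == title);
}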

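/// <summary>
/// Clones the last row of the table.
/// </summary>
/// <param name="table">The table whose last row is cloned.</param>
/// <returns>A deep copy of the last row, not yet attached to the table.</returns>
public static Row CloneLastRow(this Table table)
{
    Row cloneRow = (Row)table.LastRow.Clone(true);
    return cloneRow;
}

/// <summary>
/// Appends a row to the end of the table.
/// </summary>
/// <param name="table">The table to extend.</param>
/// <param name="row">The row to append.</param>
/// <returns>The appended row.</returns>
public static Row AppendRow(this Table table, Row row)
{
    Row r = (Row)table.AppendChild(row);
    return r;
}

/// <summary>
/// Clones the last row, appends the clone to the end of the table, and returns the appended row.
/// </summary>
/// <param name="table">The table to extend.</param>
/// <returns>The appended row.</returns>
public static Row CloneLastRowAndAppend(this Table table)
{
    Row r = (Row)table.AppendChild(table.CloneLastRow());
    return r;
}

/// <summary>
/// Removes a row from the table.
/// </summary>
/// <param name="table">The table to modify.</param>
/// <param name="index">Zero-based index of the row to remove.</param>
public static void RemoveRowByIndex(this Table table, int index)
{
    table.Rows.RemoveAt(index);
}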

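/// <summary>
/// Sets the text of a cell.
/// </summary>
/// <param name="row">The row containing the cell.</param>
/// <param name="col">Zero-based column index within the row.</param>
/// <param name="text">The replacement text.</param>
public static void AppendCellText(this Row row, int col, string text)
{
    Cell cell = row.Cells[col];
    var rs = cell.FirstParagraph.Runs;

    if (rs.FirstOrDefault() is not Run rf)
    {
        // The first paragraph has no run yet: create one that carries the text.
        rf = new Run(row.Document, text);
        cell.FirstParagraph.Runs.Add(rf);
    }
    else
    {
        // Reuse the first run so its formatting is kept.
        rf.Text = text;
    }
}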

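/// <summary>
/// Merges cells horizontally.
/// </summary>
/// <param name="table">The table to modify.</param>
/// <param name="rowIndex">Zero-based index of the row to merge in.</param>
/// <param name="colStartIndex">First column of the merged range.</param>
/// <param name="colEndIndex">Last column of the merged range (inclusive).</param>
/// <param name="alignment">Vertical alignment of the merged cell; null leaves it unchanged.</param>
public static void HorizontalMerge(this Table table, int rowIndex, int colStartIndex, int colEndIndex, CellVerticalAlignment? alignment = CellVerticalAlignment.Center)
{
    var mergeRow = table.Rows[rowIndex];
    mergeRow.Cells[colStartIndex].CellFormat.HorizontalMerge = CellMerge.First;
    if (alignment != null)
    {
        mergeRow.Cells[colStartIndex].CellFormat.VerticalAlignment = alignment.Value;
    }

    for (int i = colStartIndex + 1; i <= colEndIndex; i++)
    {
        mergeRow.Cells[i].CellFormat.HorizontalMerge = CellMerge.Previous;
    }
}

/// <summary>
/// Merges cells vertically.
/// </summary>
/// <param name="table">The table to modify.</param>
/// <param name="colIndex">Zero-based index of the column to merge in.</param>
/// <param name="rowStartIndex">First row of the merged range.</param>
/// <param name="rowEndIndex">Last row of the merged range (inclusive).</param>
/// <param name="alignment">Vertical alignment of the merged cell; null leaves it unchanged.</param>
public static void VerticalMerge(this Table table, int colIndex, int rowStartIndex, int rowEndIndex, CellVerticalAlignment? alignment = CellVerticalAlignment.Center)
{
    var rows = table.Rows;
    rows[rowStartIndex].Cells[colIndex].CellFormat.VerticalMerge = CellMerge.First;

    if (alignment != null)
    {
        rows[rowStartIndex].Cells[colIndex].CellFormat.VerticalAlignment = alignment.Value;
    }
    for (int i = rowStartIndex + 1; i <= rowEndIndex; i++)
    {
        rows[i].Cells[colIndex].CellFormat.VerticalMerge = CellMerge.Previous;
    }
}

/// <summary>
/// Merges a rectangular range of cells.
/// </summary>
/// <param name="table">The table to modify.</param>
/// <param name="colStartIndex">First column of the range.</param>
/// <param name="colEndIndex">Last column of the range (inclusive).</param>
/// <param name="rowStartIndex">First row of the range.</param>
/// <param name="rowEndIndex">Last row of the range (inclusive).</param>
/// <param name="alignment">Vertical alignment of the merged cell; null leaves it unchanged.</param>
public static void RangeMerge(this Table table, int colStartIndex, int colEndIndex, int rowStartIndex, int rowEndIndex, CellVerticalAlignment? alignment = CellVerticalAlignment.Center)
{
    // First merge each row of the range horizontally.
    for (int i = rowStartIndex; i <= rowEndIndex; i++)
    {
        table.HorizontalMerge(i, colStartIndex, colEndIndex, alignment);
    }

    // Then vertically merge the first column of the rows merged above.
    table.VerticalMerge(colStartIndex, rowStartIndex, rowEndIndex, alignment);
}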

2.4 Images

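/// <summary>
/// Adds an image to the given paragraph.
/// </summary>
/// <param name="paragraph">The Word paragraph.</param>
/// <param name="imagePath">Full path of the image file.</param>
/// <param name="width">Image width.</param>
/// <param name="height">Image height.</param>
public static void AddImage(this Paragraph paragraph, string imagePath, int width = 100, int height = 100)
{
    Document doc = (Document)paragraph.Document;
    Shape shape = new Shape(doc, ShapeType.Image);
    shape.ImageData.SetImage(imagePath);
    shape.Width = width;
    shape.Height = height;
    paragraph.AppendChild(shape);
}

/// <summary>
/// Replaces an image.
/// </summary>
/// <param name="doc">The document to modify.</param>
/// <param name="imageAlternativeText">The image's alt text, used to identify the image.</param>
/// <param name="imagePath">Path of the new image file.</param>
/// <param name="documentBuilder">Optional builder; a new one is created when null.</param>
/// <returns>True if the image was found and replaced; otherwise false.</returns>
public static bool ReplaceImage(this Document doc, string imageAlternativeText, string imagePath, DocumentBuilder documentBuilder = null)
{
    documentBuilder ??= new DocumentBuilder(doc);
    var shapes = doc.GetChildNodes(NodeType.Shape, true).OfType<Shape>().ToList();
    var imageShape = shapes.FirstOrDefault(s => s.AlternativeText == imageAlternativeText);
    if (imageShape == null) return false;
    // Insert the new image at the old image's position with the same size, then drop the old one.
    documentBuilder.MoveTo(imageShape);
    documentBuilder.InsertImage(imagePath, imageShape.Width, imageShape.Height);
    imageShape.Remove();
    return true;
}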

2.5 Miscellaneous

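/// <summary>
/// Removes the content between two markers, markers included.
/// </summary>
/// <param name="doc">The document to modify.</param>
/// <param name="flagName">The marker name.</param>
/// <returns>Always true.</returns>
public static bool RemoveFlagContent(this Document doc, string flagName)
{
    var flag = false; // becomes true once the opening marker has been found
    foreach (var c in GetNodeTypeAny(doc))
    {
        // Only paragraphs count as markers. Otherwise the runs inside the opening
        // marker paragraph would be mistaken for the closing marker.
        if (c.NodeType == NodeType.Paragraph && c.Range.Text.StartsWith(flagName))
        {
            if (flag)
            {
                // Closing marker: remove it and stop.
                c.Remove();
                break;
            }
            flag = true;
        }
        if (flag) c.Remove();
    }
    return true;
}

/// <summary>
/// Gets all nodes in the document.
/// </summary>
/// <param name="doc">The document to search.</param>
/// <returns>A snapshot list of every node, in document order.</returns>
public static List<Node> GetNodeTypeAny(this Document doc)
{
    return doc.GetChildNodes(NodeType.Any, true).Cast<Node>().ToList();
}

/// <summary>
/// Removes all remaining markers.
/// </summary>
/// <param name="doc">The document to modify.</param>
/// <param name="keys">Static fields whose values are the marker names (without braces).</param>
/// <returns>Always true.</returns>
public static bool RemoveAllFlag(this Document doc, FieldInfo[] keys)
{
    foreach (var node in GetNodeTypeAny(doc))
    {
        foreach (var flag in keys)
        {
            // Read the static field's value and wrap it in braces to form the marker.
            var value = "{" + flag.GetValue(null) + "}";
            if (node.Range.Text.StartsWith(value))
            {
                node.Remove();
                break; // the node is gone, so the remaining keys need no check
            }
        }
    }
    return true;
}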

3. Working with Word templates

3.1 Text

For text replacement, use brace-wrapped variables such as {replace_name}.
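
To illustrate why the braces help, here is a minimal sketch on plain strings, with no Aspose involved. `PlaceholderDemo` and `Fill` are hypothetical names introduced only for this example; the point is that a brace-wrapped key can be matched unambiguously, while the same word appearing in ordinary text is left alone.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // Replaces every {key} placeholder with its value from the dictionary.
    // Placeholders whose key is not in the dictionary are left untouched.
    public static string Fill(string text, IReadOnlyDictionary<string, string> values)
    {
        return Regex.Replace(text, @"\{(\w+)\}", m =>
            values.TryGetValue(m.Groups[1].Value, out var v) ? (v ?? string.Empty) : m.Value);
    }
}
```

With `name` mapped to `Ann`, the input `Dear {name}, your name is on file.` becomes `Dear Ann, your name is on file.`: only the brace-wrapped occurrence is replaced.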

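3.2 Paragraphs

To copy a paragraph downward at a specific position, first use DocumentBuilder to move to that node, insert a paragraph, then apply text replacement; if the source paragraph has styling, apply that styling as well.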
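
As an alternative to the DocumentBuilder route, a paragraph can also be duplicated by cloning it, which carries the formatting along. This is only a sketch of that swapped-in technique, not the method used in this post; `ParagraphCopyDemo`, `DuplicateBelow`, and the `{item}` placeholder are hypothetical, and the Aspose.Words package is assumed to be referenced.

```csharp
using Aspose.Words;

public static class ParagraphCopyDemo
{
    // Clones the source paragraph (runs and formatting included), inserts the
    // clone directly after it, and replaces a placeholder inside the clone only.
    public static Paragraph DuplicateBelow(Paragraph source, string placeholder, string value)
    {
        Paragraph copy = (Paragraph)source.Clone(true);
        source.ParentNode.InsertAfter(copy, source);
        copy.Range.Replace(placeholder, value ?? string.Empty);
        return copy;
    }
}
```

Because the replacement runs on the clone's own range, the source paragraph keeps its placeholder and can be duplicated again for the next item.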

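3.3 Tables

Exact lookup: add alt text in the table's properties; the alt text title identifies the table precisely (see GetTableByTitle).
Tables are usually extended downward, so cloning the last row and then changing its cell text is enough to add a new row.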
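
The two points above can be combined into one self-contained sketch. It uses only the raw Aspose.Words API so it does not depend on the helper methods from section 2; `TableRowDemo` and `AppendDataRow` are hypothetical names, and the Aspose.Words package is assumed to be referenced.

```csharp
using System.Linq;
using Aspose.Words;
using Aspose.Words.Tables;

public static class TableRowDemo
{
    // Finds the table whose alt-text title equals tableTitle, clones its last row,
    // appends the clone, and writes the given texts into the clone's cells.
    // Returns false when no such table (or no row to clone) exists.
    public static bool AppendDataRow(Document doc, string tableTitle, params string[] cellTexts)
    {
        Table table = doc.GetChildNodes(NodeType.Table, true)
            .OfType<Table>()
            .FirstOrDefault(t => t.Title == tableTitle);
        if (table == null || table.LastRow == null) return false;

        // A deep clone keeps the cell and run formatting of the last row.
        Row newRow = (Row)table.LastRow.Clone(true);
        table.AppendChild(newRow);

        for (int i = 0; i < cellTexts.Length && i < newRow.Cells.Count; i++)
        {
            Cell cell = newRow.Cells[i];
            cell.EnsureMinimum(); // guarantees the cell has a paragraph
            Paragraph para = cell.FirstParagraph;
            if (para.Runs.Count > 0)
                para.Runs[0].Text = cellTexts[i];   // reuse the run to keep its font
            else
                para.AppendChild(new Run(doc, cellTexts[i]));
        }
        return true;
    }
}
```

Only the first run of each cell is overwritten here, so a template row whose cells contain several runs would need the extra runs cleared as well.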

3.4 Images

Exact lookup: right-click the image, choose Edit Alt Text, and set the alt text to a custom variable; then find the node by its AlternativeText property.
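
A minimal sketch of that lookup follows. It also shows one possible variation on ReplaceImage: swapping the image data in place so the shape keeps its size and position. `ImageLookupDemo` and both method names are hypothetical, and the Aspose.Words package is assumed to be referenced.

```csharp
using System.Linq;
using Aspose.Words;
using Aspose.Words.Drawing;

public static class ImageLookupDemo
{
    // Returns the first image shape whose alt text equals altText, or null.
    public static Shape FindImageByAltText(Document doc, string altText)
    {
        return doc.GetChildNodes(NodeType.Shape, true)
            .OfType<Shape>()
            .FirstOrDefault(s => s.HasImage && s.AlternativeText == altText);
    }

    // Swaps the picture of the matching shape, leaving the shape itself in place.
    // Returns false when no image carries that alt text.
    public static bool SwapImage(Document doc, string altText, string imagePath)
    {
        Shape shape = FindImageByAltText(doc, altText);
        if (shape == null) return false;
        shape.ImageData.SetImage(imagePath);
        return true;
    }
}
```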

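3.5 Miscellaneous

To delete custom paragraphs, add a marker before and after the paragraphs to delete, use doc to fetch all child nodes (note that the node type must be NodeType.Any), then remove every node between the two markers, markers included.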
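
For the simple case where both markers are paragraphs under the same parent, the same idea can be sketched by walking siblings instead of scanning every node. This is a simplified variant under that stated assumption, not the NodeType.Any approach described above; `MarkerDemo` and `RemoveBetweenMarkers` are hypothetical names, and the Aspose.Words package is assumed to be referenced.

```csharp
using System;
using System.Linq;
using Aspose.Words;

public static class MarkerDemo
{
    // Removes the two marker paragraphs and every sibling node between them.
    // Returns the number of nodes removed, or 0 if the markers are not both
    // found under the same parent.
    public static int RemoveBetweenMarkers(Document doc, string marker)
    {
        var markers = doc.GetChildNodes(NodeType.Paragraph, true)
            .OfType<Paragraph>()
            .Where(p => p.GetText().StartsWith(marker, StringComparison.Ordinal))
            .Take(2)
            .ToList();
        if (markers.Count < 2 || markers[0].ParentNode != markers[1].ParentNode) return 0;

        int removed = 0;
        Node current = markers[0];
        while (current != null)
        {
            Node next = current.NextSibling; // read before the node is detached
            bool isEnd = current == markers[1];
            current.Remove();
            removed++;
            if (isEnd) break;
            current = next;
        }
        return removed;
    }
}
```

Tables and other block-level nodes sitting between the markers are siblings too, so they are removed along with the paragraphs.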

4. The official Aspose.Words demos on GitHub

Official GitHub repository: Aspose.Words for .NET examples, plugins and showcases (github.com)
It contains usage examples for the various APIs and can be downloaded for reference.

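5. Viewing document structure with the Aspose.DocumentExplorer application

The official Aspose.Words demo repository on GitHub already includes the DocumentExplorer source code. Build and run it to open a WinForms application that shows the node structure of a docx file.

[Screenshot: Aspose.DocumentExplorer]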

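6. Closing remarks

My understanding of and experience with Aspose.Words is limited, so the approaches here may not be best practice. I will update this post as I find better ones, and I welcome your feedback. Thank you.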

Author: wwmin
WeChat official account: DotNet技术说
Link: https://www.jianshu.com/p/fa9756c6d20b
Copyright notice: please credit the source when reposting.