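Processing docx files with Python 3 and displaying them with Flask

Preface:

I recently needed to process docx files and show their content on a web page so that Word documents can be read online. I used Flask together with the Document class from python-docx to parse the docx file and render it. The code follows:

Processing with Document:

First install the library that provides Document. Try the latest python-docx first; if that does not work, pin version 1.1.0 instead: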

pip install python-docx
pip install python-docx==1.1.0
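
To see which version pip actually installed before deciding whether to pin 1.1.0, a small standard-library check along these lines may help. The helper name installed_version is made up for this sketch and is not part of python-docx:

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing.

    installed_version is a hypothetical helper name used only in this sketch.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the python-docx version, or None when it is not installed
print(installed_version("python-docx"))
```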

The code that processes the docx file is as follows:

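import base64
import os
import re

from docx import Document

# vaReportDir is assumed to be defined elsewhere in this module as the
# root directory that holds one sub-folder of reports per project.

def ReadVADocx(ProjectName, DocxName):
    # Build <vaReportDir>/<ProjectName>/<DocxName> in a platform-independent way
    docxfilepath = os.path.join(vaReportDir, ProjectName, DocxName)
    paragraphs = ReadDocx(docxfilepath)
    return paragraphs

def ReadDocx(docxfilepath):
    doc = Document(docxfilepath)
    paragraphs = list()
    # Relationship IDs (e.g. rId7) link a run to an embedded part such as an image
    pattern = re.compile(r'rId\d+')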
    for graph in doc.paragraphs:
        level = graph.style.name.split(' ')[-1]
        if level == "Normal":
            level = None
        elif level == "Preformatted":
            level = None
        paragraph = {
            'text': graph.text,
            'level': level,
            'images': ""
        }
        paragraphs.append(paragraph)
        for run in graph.runs:
            if run.text == '':
                contentID = pattern.search(run.element.xml)
                if contentID:
                    contentID = contentID.group(0)
                    try:
                        contentType = doc.part.related_parts[contentID].content_type
                    except KeyError as e:
                        print(e)
                        continue
                    if not contentType.startswith('image'):
                        continue
                    imgData = doc.part.related_parts[contentID].blob
                    image_base64 = base64.b64encode(imgData).decode('utf-8')
                    paragraph = {
                        'text':  run.text,
                        'level': run.style.name.split(' ')[-1] if run.style.name.startswith('Heading') else None,
                        'images': image_base64
                    }
                    paragraphs.append(paragraph)
    # Return the collected entries; without this the caller receives None
    return paragraphs

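The code above walks through the docx file paragraph by paragraph and appends each paragraph's text and heading level to a list. Embedded images are appended as separate entries holding the base64-encoded image data.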
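
The two string-handling steps used during that traversal can be tried on their own, without a docx file. The sketch below mirrors the logic of the code above; the function names level_from_style_name and first_relationship_id are invented for illustration, and the sample inputs are hand-written rather than taken from a real document:

```python
import re

# Same pattern as in ReadDocx: relationship IDs look like rId1, rId27, ...
RID_PATTERN = re.compile(r'rId\d+')


def level_from_style_name(style_name):
    """Mirror the level logic above: keep the last word of the style name,
    but treat 'Normal' and '... Preformatted' styles as body text (None)."""
    level = style_name.split(' ')[-1]
    if level in ("Normal", "Preformatted"):
        return None
    return level


def first_relationship_id(run_xml):
    """Return the first rId found in a run's XML string, or None."""
    match = RID_PATTERN.search(run_xml)
    return match.group(0) if match else None


for name in ("Heading 1", "Heading 2", "Title", "Normal", "List Paragraph"):
    print(name, "->", level_from_style_name(name))

# Hand-written XML fragment standing in for run.element.xml
print(first_relationship_id('<a:blip r:embed="rId7"/>'))
```

One thing this makes visible: any style other than Normal or Preformatted gets a non-empty level, so a style such as "List Paragraph" yields "Paragraph" and is handled like a heading by anything that only checks whether level is set.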

The Flask view that calls it looks like this:

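from flask import render_template, request

# app (the Flask instance), engine (the module containing ReadVADocx) and
# user are assumed to be defined elsewhere in the application.

@app.route('/ViewVADocx', methods=['GET'])
def ViewVADocx():
    try:
        DocxName = request.args.get('docx')
        ProjectName = request.args.get('name')
        paragraphs = engine.ReadVADocx(ProjectName, DocxName)
        return render_template("viewdocx.html", n_getname=ProjectName, n_user=user, paragraphs=paragraphs)
    except Exception as e:
        # Log the failure, then show the error page with a matching status code
        print(e)
        return render_template('error-500.html'), 500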
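
Because the project name and file name come straight from the query string, it may be worth checking that the resulting path stays inside the report directory before opening it. This check is not part of the original code; the following is a minimal standard-library sketch, and the helper name resolve_report_path and the example directory are hypothetical:

```python
import os


def resolve_report_path(base_dir, project_name, docx_name):
    """Return the absolute path of the requested report, or None when the
    request is incomplete, is not a .docx file, or escapes base_dir.

    resolve_report_path is a hypothetical helper, not part of Flask.
    """
    if not project_name or not docx_name:
        return None
    if not docx_name.lower().endswith(".docx"):
        return None
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, project_name, docx_name))
    try:
        inside = os.path.commonpath([base, candidate]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives
        return None
    return candidate if inside else None


# "/srv/reports" is an example directory chosen for this sketch
print(resolve_report_path("/srv/reports", "demo", "scan.docx"))
print(resolve_report_path("/srv/reports", "..", "scan.docx"))
```

In the view, a None result could be turned into an error page instead of being passed on to ReadDocx.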

Writing the HTML:

Next, the parsed content has to be shown on the page. The HTML template is listed below:

{% extends "mould.html" %}

{% block head %}
{% endblock %}

{% block body %}

↑ Back to top ↑

{{ n_getname }}: scan node line

Quick navigation:

{% for paragraph in paragraphs %}
  {% if paragraph.level == "1" %}
    {{ paragraph.text }}
  {% elif paragraph.level == "2" %}
    {{ paragraph.text }}
  {% endif %}
{% endfor %}

{% for paragraph in paragraphs %}
  {% if paragraph.level %}
    {% if paragraph.level == "Title" %}
    {% elif paragraph.level == "1" %}
      {{ paragraph.text }}
    {% else %}
      {{ paragraph.text }}
    {% endif %}
  {% else %}
    {% if paragraph.images %}
      Image
    {% else %}
      {{ paragraph.text }}
    {% endif %}
  {% endif %}
{% endfor %}
{% endblock %}

{% block list %}
{% endblock %}
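
The template relies on two pieces of data preparation that can also be done in Python: picking the level 1 and level 2 headings for the quick navigation, and turning a base64 string into something an img tag can display. The sketch below assumes the image is embedded as a data URI; the original dictionaries do not store the content type, so the content_type argument, its image/png default, and both function names are assumptions made for this example:

```python
import base64


def build_navigation(paragraphs):
    """Collect (level, text) pairs for the headings shown in the quick navigation."""
    return [(p["level"], p["text"]) for p in paragraphs if p["level"] in ("1", "2")]


def to_data_uri(image_b64, content_type="image/png"):
    """Wrap base64 image data in a data URI usable as an img src value.

    content_type is an assumption: ReadDocx above does not record it.
    """
    return f"data:{content_type};base64,{image_b64}"


# Hand-made sample in the same shape as the list returned by ReadDocx
sample = [
    {"text": "Report", "level": "Title", "images": ""},
    {"text": "Overview", "level": "1", "images": ""},
    {"text": "Some body text", "level": None, "images": ""},
    {"text": "Details", "level": "2", "images": ""},
    {"text": "", "level": None, "images": base64.b64encode(b"abc").decode("utf-8")},
]

print(build_navigation(sample))
print(to_data_uri(sample[-1]["images"]))
```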

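The template also includes styling and small conveniences such as a back-to-top link to make reading easier. The final result looks like this: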

[Image 1: screenshot of the docx content rendered by the Flask page]

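Afterword:

The code only displays the content of a docx file, both text and images, and groups it by heading level. It does not support editing the docx file; if you are interested, you can explore that yourself.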

 

 
