Study Notes: Python XML Parsing

Contents

  • Python XML Parsing
    • xml.etree.ElementTree
    • xml.dom
      • xml.dom.minidom
    • xml.sax

Python XML Parsing

References:

Runoob tutorial: https://www.runoob.com/python/python-xml.html
Official documentation: https://docs.python.org/3.6/library/markup.html

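Python supports XML through the xml package (Lib/xml) in the standard library.

Python has two main models for processing XML. The xml.sax and xml.dom packages define the interfaces for these two models, respectively:

  • Event-driven model: SAX (Simple API for XML). The parser fires a series of events while it reads the XML and calls user-defined callback functions, which do the actual processing.
  • Document object model: DOM (Document Object Model). The XML data is parsed into a tree held in memory, and the XML is manipulated by operating on that tree.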
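
The difference between the two models can be sketched on a small in-memory document. This is only an illustration: the sample XML and the names SAMPLE and ItemCollector are made up for this example and do not come from the original notes.

```python
import xml.dom.minidom
import xml.sax

# Made-up sample document used only for this comparison.
SAMPLE = b"<root><item id='1'>a</item><item id='2'>b</item></root>"


class ItemCollector(xml.sax.ContentHandler):
    """SAX: the parser calls startElement() for every opening tag it reads."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        if name == "item":
            self.ids.append(attrs.getValue("id"))


# Event-driven: only what the handler chooses to store is kept.
handler = ItemCollector()
xml.sax.parseString(SAMPLE, handler)
sax_ids = handler.ids

# Tree-based: the whole document is parsed first, then queried.
dom = xml.dom.minidom.parseString(SAMPLE)
dom_ids = [e.getAttribute("id") for e in dom.getElementsByTagName("item")]

print(sax_ids, dom_ids)
```

Both approaches yield the same ids here; SAX tends to suit large inputs that are read once, while DOM is more convenient when the document has to be modified.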

The XML handling submodules are:

  • xml.etree.ElementTree: the ElementTree API, a simple and lightweight XML processor

  • xml.dom: the DOM API definition

  • xml.dom.minidom: a minimal DOM implementation

  • xml.dom.pulldom: support for building partial DOM trees

  • xml.sax: SAX2 base classes and convenience functions

  • xml.parsers.expat: the Expat parser binding

xml.etree.ElementTree
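
The notes leave this section empty. As a minimal sketch of the ElementTree API, assuming a made-up sample document (SAMPLE and its element and attribute names are hypothetical), reading attributes and adding a child element might look like this:

```python
import xml.etree.ElementTree as ET

# Made-up sample document; the element and attribute names are illustrative.
SAMPLE = "<config><option name='debug' value='1'/><option name='opt' value='2'/></config>"

root = ET.fromstring(SAMPLE)

# iter() walks the tree; get() reads an attribute of an element.
values = {opt.get("name"): opt.get("value") for opt in root.iter("option")}

# Add a new child element and serialize the tree back to text.
ET.SubElement(root, "option", {"name": "extra", "value": "3"})
serialized = ET.tostring(root, encoding="unicode")

print(values)
print(serialized)
```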

xml.dom

Manual: https://docs.python.org/3.6/library/xml.dom.html

Interface              Section                        Purpose
DOMImplementation      DOMImplementation Objects      Interface to the underlying implementation.
Node                   Node Objects                   Base interface for most objects in a document.
NodeList               NodeList Objects               Interface for a sequence of nodes.
DocumentType           DocumentType Objects           Information about the declarations needed to process a document.
Document               Document Objects               Object which represents an entire document.
Element                Element Objects                Element nodes in the document hierarchy.
Attr                   Attr Objects                   Attribute value nodes on element nodes.
Comment                Comment Objects                Representation of comments in the source document.
Text                   Text and CDATASection Objects  Nodes containing textual content from the document.
ProcessingInstruction  ProcessingInstruction Objects  Processing instruction representation.
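
To see several of these interfaces in practice, the sketch below parses a tiny document with xml.dom.minidom and inspects the node types it produces. The sample XML and the KIND lookup table are assumptions made for this illustration.

```python
import xml.dom.minidom
from xml.dom import Node

# Made-up sample containing a comment, text, and a processing instruction.
SAMPLE = "<book lang='en'><!-- note -->Title<?render fast?></book>"

# Hypothetical helper table mapping nodeType constants to interface names.
KIND = {
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
    Node.COMMENT_NODE: "Comment",
    Node.PROCESSING_INSTRUCTION_NODE: "ProcessingInstruction",
}

doc = xml.dom.minidom.parseString(SAMPLE)   # Document
book = doc.documentElement                  # Element
attr = book.getAttributeNode("lang")        # Attr

# childNodes is a NodeList; every entry implements the Node interface.
kinds = [KIND[child.nodeType] for child in book.childNodes]

print(kinds)
print(attr.name, attr.value)
```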

xml.dom.minidom

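import subprocess
import xml.dom.minidom

# Assumption: the commit id and branch name are read from the git working
# tree in the current directory (the original snippet left them undefined).
git_commit = subprocess.check_output(
    ['git', 'rev-parse', 'HEAD'], universal_newlines=True).strip()
git_branch = subprocess.check_output(
    ['git', 'rev-parse', '--abbrev-ref', 'HEAD'], universal_newlines=True).strip()

cproject = r'.cproject'
dom = xml.dom.minidom.parse(cproject)
options = dom.getElementsByTagName('option')
for opt in options:
    if opt.getAttribute('superClass') == 'gnu.c.compiler.option.preprocessor.def.symbols':
        # Assumes the first two children are a whitespace text node and an
        # existing symbol entry; both are reused as templates.
        nod_text = opt.childNodes[0]
        nod_elem = opt.childNodes[1]
        last_child = opt.lastChild

        # Insert a copy of the entry that defines GIT_COMMIT.
        nod_text = nod_text.cloneNode(False)
        nod_elem = nod_elem.cloneNode(False)
        nod_elem.setAttribute('value', 'GIT_COMMIT="%s"' % git_commit)
        opt.insertBefore(nod_text, last_child)
        opt.insertBefore(nod_elem, last_child)

        # Insert another copy that defines GIT_BRANCH.
        nod_text = nod_text.cloneNode(False)
        nod_elem = nod_elem.cloneNode(False)
        nod_elem.setAttribute('value', 'GIT_BRANCH="%s"' % git_branch)
        opt.insertBefore(nod_text, last_child)
        opt.insertBefore(nod_elem, last_child)
        break

# Write the result to a separate '.new' file so the original is preserved.
with open(cproject + '.new', 'w', encoding='utf-8') as f:
    dom.writexml(f, encoding='utf-8')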

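This code modifies an Eclipse project file (.cproject), adding the macro definitions GIT_COMMIT and GIT_BRANCH so that the git commit ID and branch name of the source being built are compiled into the release.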
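
The same clone-and-insert idea can be tried without a real project file. The sketch below is a variant, not the code above: it works on an in-memory string, picks the template by tag name instead of by child index, and appends the new entries at the end. The sample fragment, the listOptionValue element name, and the add_symbols helper are assumptions made for illustration.

```python
import xml.dom.minidom

SUPER_CLASS = "gnu.c.compiler.option.preprocessor.def.symbols"

# Hypothetical, heavily simplified stand-in for a .cproject fragment.
SAMPLE = """<option superClass="gnu.c.compiler.option.preprocessor.def.symbols">
  <listOptionValue builtIn="false" value="DEBUG"/>
</option>"""


def add_symbols(xml_text, symbols):
    """Return xml_text with one extra listOptionValue entry per symbol."""
    dom = xml.dom.minidom.parseString(xml_text)
    for opt in dom.getElementsByTagName("option"):
        if opt.getAttribute("superClass") != SUPER_CLASS:
            continue
        # Use the first existing entry as a template for the new ones.
        template = opt.getElementsByTagName("listOptionValue")[0]
        for symbol in symbols:
            entry = template.cloneNode(False)  # shallow clone keeps attributes
            entry.setAttribute("value", symbol)
            opt.appendChild(entry)
        break
    return dom.documentElement.toxml()


result = add_symbols(SAMPLE, ['GIT_COMMIT="abc123"', 'GIT_BRANCH="main"'])
print(result)
```

Selecting the template by tag name avoids depending on where whitespace text nodes happen to sit among the children.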

xml.sax
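
The notes leave this section empty. As a minimal sketch of a SAX handler, assuming a made-up sample document (SAMPLE and TitleHandler are hypothetical names), the text of every title element can be collected with the startElement, characters, and endElement callbacks:

```python
import xml.sax

# Made-up sample document used only for this illustration.
SAMPLE = (b"<books><book><title>Dive In</title></book>"
          b"<book><title>Fluent</title></book></books>")


class TitleHandler(xml.sax.ContentHandler):
    """Collect the text content of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None means "not inside a <title>"

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._buffer = None


handler = TitleHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.titles)
```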
