The configuration file, named config.xml, is as follows:
<?xml version="1.0"?>
<config>
<server>server1</server>
<server>server2</server>
<account>account</account>
<pwd>pwd</pwd>
</config>
The Python code is as follows:
from xml.dom.minidom import parse

def getText(nodelist):
    # Concatenate the data of every text node in the list.
    rc = ""
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc = rc + node.data
    return rc

if __name__ == "__main__":
    dom1 = parse('config.xml')  # parse an XML file by name
    config_element = dom1.getElementsByTagName("config")[0]
    servers = config_element.getElementsByTagName("server")
    for server in servers:
        # Print the text content of each <server> element.
        print(getText(server.childNodes))
Output:
server1
server2
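The script above only prints the server entries. As a minimal sketch of reading every setting from the same file, the following collects them into a dict. The names CONFIG_XML, get_text and load_config are hypothetical helpers introduced here for illustration, and the XML is inlined with parseString so the example runs without a file on disk.

```python
from xml.dom.minidom import parseString

# Inline copy of config.xml so the sketch runs without a file on disk.
CONFIG_XML = """<?xml version="1.0"?>
<config>
<server>server1</server>
<server>server2</server>
<account>account</account>
<pwd>pwd</pwd>
</config>"""


def get_text(nodelist):
    """Concatenate the data of all text nodes in nodelist."""
    return "".join(n.data for n in nodelist if n.nodeType == n.TEXT_NODE)


def load_config(xml_text):
    """Return the settings as a dict (hypothetical helper, not from the article)."""
    config = parseString(xml_text).getElementsByTagName("config")[0]

    def first(tag):
        # Text of the first element with this tag name.
        return get_text(config.getElementsByTagName(tag)[0].childNodes)

    return {
        "servers": [get_text(s.childNodes)
                    for s in config.getElementsByTagName("server")],
        "account": first("account"),
        "pwd": first("pwd"),
    }


settings = load_config(CONFIG_XML)
print(settings["servers"])
```

To read from the real file instead, parse('config.xml') can be used in place of parseString(xml_text).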
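Reading an XML configuration file in Python is fairly simple. The key piece is the getElementsByTagName() method, called on the document returned by parse(); it returns a NodeList object.
The Python Library Reference describes NodeList as follows: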
A NodeList represents a sequence of nodes. These objects are used in two ways in the DOM Core recommendation: the Element objects provides one as its list of child nodes, and the getElementsByTagName() and getElementsByTagNameNS() methods of Node return objects with this interface to represent query results.
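For each Node in the NodeList, getText is called on its child nodes. The data of every child of type TEXT_NODE is concatenated, and the result is printed.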
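To make the "sequence of nodes" description concrete, here is a small sketch of how a minidom NodeList can be used. It assumes an inline two-server document rather than config.xml; the variable names are illustrative only.

```python
from xml.dom.minidom import parseString

dom = parseString(
    "<config><server>server1</server><server>server2</server></config>"
)
servers = dom.getElementsByTagName("server")

count = len(servers)          # Python-style length
also_count = servers.length   # DOM-style length attribute
first = servers[0]            # Python-style indexing
second = servers.item(1)      # DOM-style item() access
missing = servers.item(5)     # item() gives None when the index is out of range

# A NodeList can be iterated like any other sequence.
names = [s.firstChild.data for s in servers]
print(count, names)
```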
The childNodes attribute of Node is documented as follows:
childNodes
    A list of nodes contained within this node. This is a read-only attribute.
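One thing worth knowing about childNodes is that it contains every child, including the whitespace between elements, which minidom keeps as text nodes. The sketch below assumes a shortened inline version of the config to show this and then filters down to element children.

```python
from xml.dom.minidom import parseString

# The line breaks between the elements become text nodes of their own.
dom = parseString("""<config>
<server>server1</server>
<account>account</account>
</config>""")
config = dom.documentElement

# Node types of all children: text (3) and element (1) nodes alternate.
kinds = [child.nodeType for child in config.childNodes]

# Keep only the element children.
elements = [c for c in config.childNodes if c.nodeType == c.ELEMENT_NODE]
tags = [e.tagName for e in elements]
print(kinds, tags)
```

This is why getText checks nodeType before reading data, and why iterating childNodes directly usually needs a similar filter.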
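Common DOM node types:
Node type               Example
Document type           <!DOCTYPE food SYSTEM "food.dtd">
Processing instruction  <?xml version="1.0"?>
Element                 <drink type="beer">Carlsberg</drink>
Attribute               type="beer"
Text                    Carlsberg
(Strictly speaking, <?xml version="1.0"?> is the XML declaration. It uses processing-instruction syntax, but the XML specification does not treat it as a processing instruction.)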
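As a rough illustration of the element, attribute and text rows of the table, the sketch below parses the drink example and compares each node's nodeType with the constants defined on xml.dom.Node. The variable names are illustrative assumptions.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

dom = parseString('<drink type="beer">Carlsberg</drink>')
drink = dom.documentElement            # the <drink> element
attr = drink.getAttributeNode("type")  # the type="beer" attribute node
text = drink.firstChild                # the text node "Carlsberg"

node_types = {
    "document": dom.nodeType == Node.DOCUMENT_NODE,
    "element": drink.nodeType == Node.ELEMENT_NODE,
    "attribute": attr.nodeType == Node.ATTRIBUTE_NODE,
    "text": text.nodeType == Node.TEXT_NODE,
}
print(node_types)
print(attr.value, text.data)
```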
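Node also has a nodeValue attribute. At first I did not know how it differs from a node's data attribute, so I looked it up in the DOM documentation, which explains it as follows: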
The node value of the XML object. If the XML object is a text node, nodeType is 3 and nodeValue is the text of the node. If the XML object is an XML element (nodeType is 1), nodeValue is null and read-only.
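I tried this in Python. For an element such as "<server>mail.</server>", nodeType is 1 and nodeValue is None; the text itself is held in a child text node whose nodeType is 3. To display that text, the .data attribute and the .nodeValue attribute of the text node give the same result, for example: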
rc = ""
# node is assumed to be an element such as <server>
for child in node.childNodes:
    if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
        rc = rc + child.nodeValue
print(rc)
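The comparison between data and nodeValue can be checked with a short self-contained sketch. It assumes an inline <server> element that mixes plain text with a CDATA section; the host name fragment "example" is made up for illustration.

```python
from xml.dom.minidom import parseString

dom = parseString("<server>mail.<![CDATA[example]]></server>")
server = dom.documentElement

# An element has no value of its own.
element_value = server.nodeValue

# For text and CDATA children, data and nodeValue hold the same string.
pairs = [(c.data, c.nodeValue) for c in server.childNodes
         if c.nodeType in (c.TEXT_NODE, c.CDATA_SECTION_NODE)]

text = "".join(value for _, value in pairs)
print(element_value, text)
```

Either attribute works for reading text; nodeValue has the small advantage of existing on every node type, so it can be read without first checking that the node has a data attribute.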