The handy Python cElementTree

ElementTree is Python's XML parsing module, and cElementTree is the C implementation of ElementTree. Both ElementTree and cElementTree are included in the Python 2.5 standard library.
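As a rough sketch of how the import might look across Python versions: the module paths below are the standard-library ones, but the fallback order and the tiny sample document are my own assumptions. Note that since Python 3.3 xml.etree.ElementTree picks up the C accelerator automatically, and xml.etree.cElementTree was removed in Python 3.9.

```python
try:
    # Python 2.5 through 3.8: the C implementation under its own name
    import xml.etree.cElementTree as ET
except ImportError:
    # Python 3.9+: cElementTree is gone; ElementTree uses the C accelerator itself
    import xml.etree.ElementTree as ET

# Hypothetical sample document, just to show the module works after import
root = ET.fromstring("<doc><item>hello</item></doc>")
print(root.find("item").text)
```

Either branch exposes the same API, so the rest of the code does not need to know which one was imported.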

The following benchmark data comes from the cElementTree website:

Here are some benchmark figures, using a number of popular XML toolkits to parse a 3405k document-style XML file, from disk to memory:

library                        time     space    notes
xml.dom.minidom (Python 2.1)   6.3 s    80000k   (1)
gnosis.objectify               2.0 s    22000k   (5)
xml.dom.minidom (Python 2.4)   1.4 s    53000k   (1)
ElementTree 1.2                1.6 s    14500k
ElementTree 1.2.4/1.3          1.1 s    14500k
cDomlette (C extension)        0.540 s  20500k   (1)
PyRXPU (C extension)           0.175 s  10850k   (2)
lxml.etree (C extension)       (4)      (4)      (3)
libxml2 (C extension)          0.098 s  16000k   (3)
readlines (read as utf-8)      0.093 s  8850k
cElementTree (C extension)     0.047 s  4900k
readlines (read as ascii)      0.032 s  5050k


library                   time     throughput
xml.sax (Python 2.1)      0.330 s  10300 k/s
xml.sax (Python 2.4)      0.292 s  11700 k/s
xml.parsers.expat         0.184 s  18500 k/s
cElementTree XMLParser    0.124 s  27500 k/s
sgmlop                    0.092 s  37000 k/s
cElementTree iterparse    0.071 s  48000 k/s
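For readers who have not met iterparse, the fastest entry in the table above, here is a minimal sketch of incremental parsing. It uses xml.etree.ElementTree from a current Python rather than the original cElementTree package, and the sample XML and tag names are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; a real use case would pass a large file instead
xml_data = b"<log><entry>one</entry><entry>two</entry><entry>three</entry></log>"

texts = []
# iterparse yields (event, element) pairs while the document is being read
for event, elem in ET.iterparse(io.BytesIO(xml_data), events=("end",)):
    if elem.tag == "entry":
        texts.append(elem.text)
        elem.clear()  # drop the element's contents once it has been processed

print(texts)
```

Clearing each element after use is what keeps memory low on large files, since the tree is otherwise still built in full.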

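An ElementTree is a tree made of element nodes. Text content is exposed through an element's text or tail attribute, e.g. ele.text. This is much simpler and more convenient than the DOM approach, which treats both elements and text as nodes. An element supports some dictionary and list operations: attributes are accessed dictionary-style, and child nodes list-style. Searching is done with the find or findall functions.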
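The difference between text and tail is easiest to see on mixed content. The snippet below is only an illustrative sketch: the sample markup is invented, and it uses xml.etree.ElementTree from a current Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical mixed-content sample
root = ET.fromstring("<p>Hello <b>bold</b> world<br/>end</p>")

b = root.find("b")    # find returns the first matching child
br = root.find("br")

print(repr(root.text))  # text before the first child: 'Hello '
print(repr(b.text))     # text inside <b>: 'bold'
print(repr(b.tail))     # text after </b>, up to the next tag: ' world'
print(repr(br.text))    # empty element has no text: None
print(repr(br.tail))    # text after <br/>: 'end'

# findall returns a list of all matching children
print(len(root.findall("b")))
```

So text that follows a closing tag belongs to that element's tail, not to the parent.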

 

Operation                    Result
elem[n]                      Returns n'th child element.
elem[m:n]                    Returns list of m'th through n'th child elements.
len(elem)                    Returns number of child elements.
list(elem)                   Returns list of child elements.
elem.append(elem2)           Adds elem2 as a child.
elem.insert(index, elem2)    Inserts elem2 at the specified location.
del elem[n]                  Deletes n'th child element.
elem.keys()                  Returns list of attribute names.
elem.get(name)               Returns value of attribute name.
elem.set(name, value)        Sets new value for attribute name.
elem.attrib                  Retrieves the dictionary containing attributes.
del elem.attrib[name]        Deletes attribute name.
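The operations in the table can be tried on a small tree built in memory. This is a sketch with made-up tag and attribute names, run against xml.etree.ElementTree; note that slicing follows normal Python semantics, so the end index is exclusive.

```python
import xml.etree.ElementTree as ET

# Build a small hypothetical tree: <root name="demo"><a/><b/><c/></root>
root = ET.Element("root", {"name": "demo"})
for child_tag in ("a", "b", "c"):
    ET.SubElement(root, child_tag)

# List-style access to children
print(len(root))                    # 3
print(root[0].tag)                  # a
print([e.tag for e in root[1:3]])   # ['b', 'c']

root.append(ET.Element("d"))        # add at the end
root.insert(0, ET.Element("first")) # add at the front
del root[1]                         # removes <a/>
child_tags = [e.tag for e in list(root)]
print(child_tags)                   # ['first', 'b', 'c', 'd']

# Dictionary-style access to attributes
print(root.get("name"))             # demo
root.set("user", "alice")
print(sorted(root.keys()))          # ['name', 'user']
del root.attrib["name"]
print(root.attrib)                  # {'user': 'alice'}
```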

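It really is a great tool, and very convenient to use. Here are a few lines of code to try it out:
# Code for Python 2.4 (standalone cElementTree package;
# on Python 2.5+ use: import xml.etree.cElementTree as ET)
import cElementTree as ET

# Parse the file
tree = ET.parse('test.xml')

# Get the root node
root = tree.getroot()

# Find the first tagformat tag
tag = root.find('tagformat')
# Iterate over all opt tags
for ele in tag.findall('opt'):
    print ele.text

# Get an attribute
print root.get('name')
# Modify or create an attribute
root.set('user', 'liujunzhi')

# Save with utf-8 encoding
f = open('output.xml', 'w')
tree.write(f, encoding='utf-8')
f.close()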
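The walkthrough above targets Python 2.4. For a current Python 3, roughly the same steps might look like the sketch below. The contents of test.xml are not shown in this post, so the SAMPLE document is a hypothetical stand-in, and in-memory buffers replace the real input and output files.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for test.xml; the real file is not shown in the post
SAMPLE = """<config name="demo">
  <tagformat>
    <opt>first</opt>
    <opt>second</opt>
  </tagformat>
</config>"""

# Parse the document and get the root node
tree = ET.parse(io.StringIO(SAMPLE))
root = tree.getroot()

# Find the first tagformat tag, then collect the text of every opt tag
tag = root.find("tagformat")
opts = [ele.text for ele in tag.findall("opt")]
print(opts)

# Read an attribute, then modify or create another one
print(root.get("name"))
root.set("user", "liujunzhi")

# Serialize as utf-8; with an encoding given, write() produces bytes
buf = io.BytesIO()
tree.write(buf, encoding="utf-8")
output = buf.getvalue().decode("utf-8")
print(output)
```

When writing to a real file on Python 3, open it in binary mode ('wb') or pass the file name directly to tree.write.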

