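Source: http://bluecrystal.iteye.com/blog/116915
Small Python Examples, Part 3 -- Parsing XML Text
Topic: Parsing XML text
Environment: Windows XP Pro + SP2 + Python 2.5
Note: If a source file contains Chinese characters, it is best to save it as UTF-8.
The test file sample.xml should also be saved as UTF-8.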
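To illustrate why the note above asks for UTF-8, here is a minimal sketch, written for Python 3 rather than the Python 2.5 used in this post. The helper name collect_text and the two-character sample text are made up for the illustration. It feeds expat the same document twice: once encoded as UTF-8, matching the encoding declaration, and once encoded as GBK, which no longer matches the declaration and is expected to be rejected.

```python
import xml.parsers.expat

def collect_text(xml_bytes):
    """Return all character data found in xml_bytes as one string."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver text in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)
    return "".join(chunks)

# Hypothetical sample: \u4e2d\u6587 are the two characters of the word "Chinese".
document = '<?xml version="1.0" encoding="UTF-8"?><name>\u4e2d\u6587</name>'

# Bytes that match the declared encoding parse cleanly.
utf8_text = collect_text(document.encode("utf-8"))
print(utf8_text == "\u4e2d\u6587")

# Bytes saved in another encoding contradict the declaration.
try:
    collect_text(document.encode("gbk"))
    mismatch_detected = False
except xml.parsers.expat.ExpatError:
    mismatch_detected = True
print(mismatch_detected)
```

The same reasoning applies to sample.xml: the bytes on disk have to agree with the encoding named in its first line.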
Code:
# parsexml.py
# This example is adapted from the Python online documentation, with some changes and additions.
import xml.parsers.expat

# Controls the print indentation
level = 0

# Report an element's name and its attribute dictionary
def start_element(name, attrs):
    global level
    print ' '*level, 'Start element:', name, attrs
    level = level + 1

# Report the name of the element being closed
def end_element(name):
    global level
    level = level - 1
    print ' '*level, 'End element:', name

# Report the text between an element's tags
def char_data(data):
    # Skip whitespace-only chunks, such as the line breaks between tags
    if data.isspace():
        return
    print ' '*level, 'Character data:', data

p = xml.parsers.expat.ParserCreate()
p.StartElementHandler = start_element
p.EndElementHandler = end_element
p.CharacterDataHandler = char_data
# Pass UTF-8 encoded byte strings to the handlers instead of unicode objects
p.returns_unicode = False
f = open('sample.xml')
p.ParseFile(f)
f.close()
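The listing above targets Python 2.5: the print statement and the returns_unicode attribute are not available in Python 3. As a rough sketch of how the same three handlers might look in Python 3, the version below collects the lines in a list instead of printing them directly. The function name trace_events, the two-space indent, and the shortened sample document are assumptions made for this illustration, not part of the original script.

```python
import xml.parsers.expat

def trace_events(xml_bytes):
    """Return one indented line per start tag, end tag, or text chunk."""
    lines = []
    level = 0  # current nesting depth, used for indentation

    def start_element(name, attrs):
        nonlocal level
        lines.append("%sStart element: %s %r" % ("  " * level, name, attrs))
        level += 1

    def end_element(name):
        nonlocal level
        level -= 1
        lines.append("%sEnd element: %s" % ("  " * level, name))

    def char_data(data):
        # Ignore whitespace-only chunks between tags.
        if data.isspace():
            return
        lines.append("%sCharacter data: %s" % ("  " * level, data))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(xml_bytes, True)
    return lines

# Shortened, hypothetical sample in the spirit of sample.xml.
sample = b'<item name="keen"><telephone type="phone">222222222</telephone></item>'
for line in trace_events(sample):
    print(line)
```

Keeping the depth counter inside the function avoids the module-level global, so the function can be called more than once without resetting anything.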
Test case:
XML code: sample.xml
<?xml version="1.0" encoding="UTF-8"?>
<contacts id="bluecrystal">
    <item name="keen" fff="ddd">
        <telephone type="phone">222222222</telephone>
        <telephone type="mobile">134567890</telephone>
    </item>
    <item name="bcm">
        <telephone type="phone">11111111</telephone>
        <telephone type="mobile">15909878909</telephone>
    </item>
</contacts>

Test results:
 Start element: contacts {'id': 'bluecrystal'}
  Start element: item {'fff': 'ddd', 'name': 'keen'}
   Start element: telephone {'type': 'phone'}
    Character data: 222222222
   End element: telephone
   Start element: telephone {'type': 'mobile'}
    Character data: 134567890
   End element: telephone
  End element: item
  Start element: item {'name': 'bcm'}
   Start element: telephone {'type': 'phone'}
    Character data: 11111111
   End element: telephone
   Start element: telephone {'type': 'mobile'}
    Character data: 15909878909
   End element: telephone
  End element: item
 End element: contacts
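The output above shows the order in which expat reports events, and a handler can rely on that order to build a data structure instead of printing. Below is a sketch of one way to do that in Python 3. The function name load_contacts and the shape of the resulting dictionary are choices made for this illustration, and the element and attribute names are taken from sample.xml.

```python
import xml.parsers.expat

def load_contacts(xml_bytes):
    """Map each item's name to a {telephone type: number} dictionary."""
    contacts = {}
    # Parsing state: the current item name, the current telephone type,
    # and the text chunks seen so far inside the current telephone element.
    state = {"item": None, "type": None, "text": []}

    def start_element(name, attrs):
        if name == "item":
            state["item"] = attrs["name"]
            contacts[state["item"]] = {}
        elif name == "telephone":
            state["type"] = attrs["type"]
            state["text"] = []

    def char_data(data):
        # Only keep text that appears inside a telephone element.
        if state["type"] is not None:
            state["text"].append(data)

    def end_element(name):
        if name == "telephone":
            number = "".join(state["text"]).strip()
            contacts[state["item"]][state["type"]] = number
            state["type"] = None
        elif name == "item":
            state["item"] = None

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(xml_bytes, True)
    return contacts

sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<contacts id="bluecrystal">
    <item name="keen" fff="ddd">
        <telephone type="phone">222222222</telephone>
        <telephone type="mobile">134567890</telephone>
    </item>
    <item name="bcm">
        <telephone type="phone">11111111</telephone>
        <telephone type="mobile">15909878909</telephone>
    </item>
</contacts>"""

contacts = load_contacts(sample)
print(contacts["keen"]["mobile"])
```

Text chunks are buffered until the closing tag because expat is free to split one text node across several CharacterDataHandler calls.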