Reading an XML file with Python

 

Sample data:

<Result>
	<weibo id="1">
		<sentence id="1" opinionated="N">I am a sentence</sentence>
		<sentence id="2" opinionated="N">I am a sentence</sentence>
		<sentence id="3" opinionated="Y" polarity="NEG">I am also a sentence</sentence>
	</weibo>
	<weibo id="5">
		<sentence id="1" opinionated="Y" polarity="NEG">Sentence sentence</sentence>
		<sentence id="2" opinionated="N">Still a sentence</sentence>
		<sentence id="3" opinionated="Y" polarity="POS">The last sentence</sentence>
	</weibo>
</Result>

Python code:

import xml.etree.ElementTree as et   # standard-library XML parser (cElementTree was removed in Python 3.9)
import pandas as pd

# Read the XML file and load it into the DataFrame df_xml
xml_tree = et.ElementTree(file='xxx.xml')  # path to the XML file
dfcols = ['sentence', 'opinionated', 'polarity']
root = xml_tree.getroot()

rows = []
for sub_node in root:          # each <weibo> element
    for node in sub_node:      # each <sentence> element
        # print(node, node.tag, node.attrib, node.text)
        sentence = node.text
        opinionated = node.attrib.get('opinionated')
        polarity = node.attrib.get('polarity')  # None when the attribute is absent
        rows.append([sentence, opinionated, polarity])

# DataFrame.append was removed in pandas 2.0, so build the frame once from the collected rows
df_xml = pd.DataFrame(rows, columns=dfcols)
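
As a self-contained variant of the loop above, the following sketch parses a shortened English version of the sample data from an in-memory string with `et.fromstring` instead of a file, and also keeps the parent `<weibo>` id for each sentence. The `parse_sentences` helper and the `weibo_id` / `sentence_id` field names are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as et

# Shortened copy of the sample data, so the example runs without a file on disk
SAMPLE_XML = """<Result>
    <weibo id="1">
        <sentence id="1" opinionated="N">I am a sentence</sentence>
        <sentence id="3" opinionated="Y" polarity="NEG">I am also a sentence</sentence>
    </weibo>
    <weibo id="5">
        <sentence id="3" opinionated="Y" polarity="POS">The last sentence</sentence>
    </weibo>
</Result>"""


def parse_sentences(xml_text):
    """Return one dict per <sentence>, tagged with its parent <weibo> id."""
    root = et.fromstring(xml_text)
    rows = []
    for weibo in root.iter('weibo'):
        for sentence in weibo.iter('sentence'):
            rows.append({
                'weibo_id': weibo.get('id'),
                'sentence_id': sentence.get('id'),
                'sentence': sentence.text,
                'opinionated': sentence.get('opinionated'),
                'polarity': sentence.get('polarity'),  # None if the attribute is missing
            })
    return rows


rows = parse_sentences(SAMPLE_XML)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed directly to `pd.DataFrame(rows)` to get a table with the same three columns as `df_xml`, plus the two id columns.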
