Python Tricks -- Converting XML to Excel with pandas

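For work, I needed to pull the values of specific nodes out of an XML file and tabulate them in Excel. So I wrote a Python script to speed things up, and it can be reused later.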

The complete example follows, for anyone who needs it.

Sample XML



<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>Strawberry Belgian Waffles</name>
    <price>$7.95</price>
    <description>Light Belgian waffles covered with strawberries and whipped cream</description>
    <calories>900</calories>
  </food>
  <food>
    <name>Berry-Berry Belgian Waffles</name>
    <price>$8.95</price>
    <description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
    <calories>900</calories>
  </food>
  <food>
    <name>French Toast</name>
    <price>$4.50</price>
    <description>Thick slices made from our homemade sourdough bread</description>
    <calories>600</calories>
  </food>
  <food>
    <name>Homestyle Breakfast</name>
    <price>$6.95</price>
    <description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
    <calories>950</calories>
  </food>
</breakfast_menu>
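To make the structure concrete before the full script, here is a minimal sketch of pulling one row per food element out of this kind of document. Note that it swaps in the standard library's xml.etree.ElementTree instead of lxml, so it runs without extra packages; the helper name extract_rows and the shortened SAMPLE_XML are only for illustration and are not part of the script.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the sample menu, inlined so the sketch is self-contained.
SAMPLE_XML = """
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>French Toast</name>
    <price>$4.50</price>
    <description>Thick slices made from our homemade sourdough bread</description>
    <calories>600</calories>
  </food>
</breakfast_menu>
"""


def extract_rows(xml_text):
    """Return one [name, price, calories, description] list per <food> element.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text.strip())
    rows = []
    for food in root.iter("food"):
        # findtext returns the default when a child element is missing,
        # so every row keeps the same number of columns.
        rows.append([
            food.findtext("name", default=""),
            food.findtext("price", default=""),
            food.findtext("calories", default=""),
            food.findtext("description", default=""),
        ])
    return rows


rows = extract_rows(SAMPLE_XML)
for row in rows:
    print(row)
```

Using a default for missing children is a deliberate choice here: a food entry without, say, a calories node still produces a four-column row, so the columns stay aligned.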

Python script

from lxml import etree
import pandas as pd


def read_data_from_xml(xml_path):
    """Read the XML file and return a header row followed by one row per <food>."""
    # Read as bytes so lxml can work out the encoding itself.
    with open(xml_path, 'rb') as f:
        xml_content = f.read()

    # The first row is the header.
    excel_data = [["Food", "Price", "Calories", "Description"]]

    xml_data = etree.XML(xml_content)
    foods = xml_data.xpath("//food")
    for food in foods:
        # Each relative XPath returns a list of text nodes; extend() adds them to the row.
        excel_row_data = []
        excel_row_data.extend(food.xpath("name/text()"))
        excel_row_data.extend(food.xpath("price/text()"))
        excel_row_data.extend(food.xpath("calories/text()"))
        excel_row_data.extend(food.xpath("description/text()"))
        excel_data.append(excel_row_data)

    return excel_data




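def to_csv(writer, excel_data, sheet_name):
    """Write the rows to a worksheet (despite the name, the output is Excel, not CSV)."""
    # The first row holds the column names; the remaining rows are data.
    data_df = pd.DataFrame(excel_data[1:], columns=excel_data[0])
    data_df.to_excel(writer, float_format='%.10f', index=False, sheet_name=sheet_name)
    worksheet = writer.sheets[sheet_name]

    # Set the column width (set_column is an XlsxWriter method; this range only works up to column Z)
    cols = "%s:%s" % ('A', chr(ord('A') + len(data_df.columns) - 1))
    worksheet.set_column(cols, 30)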

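# Read the XML
excel_data_ = read_data_from_xml("food.xml")

# set_column is an XlsxWriter method, so request that engine explicitly.
# The with block saves the file on exit; writer.save() was removed in pandas 2.0.
with pd.ExcelWriter('food.xlsx', engine='xlsxwriter') as writer:
    to_csv(writer, excel_data_, "food1")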
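One caveat about the column-width step: building the range with chr(ord('A') + n - 1) only works for up to 26 columns, because after Z the column names continue with AA, AB, and so on. Below is a rough sketch of a helper that handles wider sheets. The names column_letter and column_range are hypothetical, not part of pandas or XlsxWriter.

```python
def column_letter(index):
    """Convert a 0-based column index to an Excel column name (0 -> 'A', 26 -> 'AA')."""
    if index < 0:
        raise ValueError("column index must be non-negative")
    letters = ""
    n = index + 1  # work with a 1-based number; Excel names have no zero digit
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def column_range(column_count):
    """Build a range such as 'A:D' covering the first column_count columns."""
    if column_count < 1:
        raise ValueError("column_count must be at least 1")
    return "%s:%s" % ("A", column_letter(column_count - 1))


print(column_range(4))   # A:D
print(column_range(27))  # A:AA
```

For the four columns in this example the result is the same "A:D" as before. XlsxWriter's set_column also accepts numeric indices, e.g. worksheet.set_column(0, 3, 30), which avoids building the letters at all.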

Final result
[Screenshot of the generated Excel file]
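One thing to keep in mind about the output: the values pulled out with text() are all strings, so the price and calories cells end up as text in the sheet, and float_format has nothing to act on since it only applies to float values. If numeric cells are wanted, the rows can be converted before they go into the DataFrame. A minimal sketch, assuming every price has the "$5.95" form and every calories value is a whole number; parse_price and coerce_row are hypothetical helper names, and the sample rows mirror the shape of excel_data from the script.

```python
def parse_price(text):
    """Turn a price string such as '$5.95' into a float (assumes a leading '$')."""
    return float(text.strip().lstrip("$"))


def coerce_row(row):
    """Convert [name, price, calories, description] strings into typed values."""
    name, price, calories, description = row
    return [name, parse_price(price), int(calories), description]


# Same layout as excel_data in the script: a header row, then data rows.
excel_data = [
    ["Food", "Price", "Calories", "Description"],
    ["Belgian Waffles", "$5.95", "650",
     "Two of our famous Belgian Waffles with plenty of real maple syrup"],
    ["French Toast", "$4.50", "600",
     "Thick slices made from our homemade sourdough bread"],
]

# Keep the header as-is and convert only the data rows.
typed_data = [excel_data[0]] + [coerce_row(row) for row in excel_data[1:]]

for row in typed_data:
    print(row)
```

The trade-off is that the "$" sign is lost from the cell value; it could be restored with a number format on the Excel side if needed.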
