[Python Tips] Comparing two headers

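I ran into this problem at work: when importing files, the header of a CSV file is determined by a separate XML file. If the two differ at import time, the tool only reports that the header does not match the XML definition, without saying where the difference is. With many fields, comparing them one by one is very tedious, so I wrote this small script to solve the problem.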

How it works:

  1. Read the header from the CSV file
  2. Read the header defined in the XML file
  3. Compare the two and print the fields that differ
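
The comparison in step 3 can be sketched on its own, independent of where the two lists come from. This is a minimal sketch, not the script itself: `compare_headers` is a hypothetical helper name, and the field names are made-up sample data. It reports the difference in both directions, so a field missing from either side shows up.

```python
def compare_headers(csv_headers, model_headers):
    """Return the fields that appear on only one side, keeping their order."""
    csv_set = set(csv_headers)
    model_set = set(model_headers)
    only_in_csv = [h for h in csv_headers if h not in model_set]
    only_in_model = [h for h in model_headers if h not in csv_set]
    return only_in_csv, only_in_model


# Hypothetical sample data: the CSV header contains a misspelled field.
only_in_csv, only_in_model = compare_headers(
    ["name", "age", "emial"],
    ["name", "age", "email"],
)
print("Only in CSV:", only_in_csv)
print("Only in XML:", only_in_model)
```

Note that this only detects missing or extra fields; two headers with the same fields in a different order would still need a plain `==` check on the lists.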

The code is as follows:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import csv
import xml.etree.ElementTree as ET

import easygui

### Read the header from the CSV file
print("Please select your CSV file, then select the data model")
csv_path = easygui.fileopenbox()
with open(csv_path, newline="") as f:
    headers_in_csv = next(csv.reader(f))
# Drop the first column; it is not part of the header comparison
del headers_in_csv[0]

# The next column holds the id of the background element; the rest are data fields
background_element_id_in_csv = headers_in_csv.pop(0)
print(headers_in_csv)

### Read the header configuration from the data model
model_path = easygui.fileopenbox()
root = ET.parse(model_path).getroot()
headers_in_datamodel = []
for background_element in root.findall("background-element[@type-id='24']"):
    if background_element.get("id") == background_element_id_in_csv:
        for child in background_element:
            if "id" in child.attrib:
                headers_in_datamodel.append(child.attrib["id"])
print(headers_in_datamodel)
print(len(headers_in_csv), len(headers_in_datamodel))

### Compare the two headers and print every field that differs,
### even when the two headers have different lengths
if headers_in_csv == headers_in_datamodel:
    print("same header")
else:
    for header in headers_in_csv:
        if header not in headers_in_datamodel:
            print("Only in CSV:", header)
    for header in headers_in_datamodel:
        if header not in headers_in_csv:
            print("Only in data model:", header)
    print("Headers do not match, please check your CSV file and data model!")
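
To try the same logic without the file dialogs, the two inputs can be replaced by in-memory strings. This is only a sketch under assumptions: the sample CSV and XML below are invented, the element names `data-model` and `data-field` are hypothetical, and the layout (first column skipped, second column holding the background element id, `type-id` 24) is inferred from the script above rather than from any real data model.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; the layout is inferred from the script above.
CSV_TEXT = "action,BE_1,name,age,emial\nadd,x,Tom,30,t@example.com\n"
XML_TEXT = """
<data-model>
  <background-element id="BE_1" type-id="24">
    <data-field id="name"/>
    <data-field id="age"/>
    <data-field id="email"/>
  </background-element>
</data-model>
"""


def csv_headers(text):
    """Return (element id, field names) taken from the first CSV row."""
    row = next(csv.reader(io.StringIO(text)))
    # Column 0 is skipped, column 1 holds the element id, the rest are fields.
    return row[1], row[2:]


def model_headers(text, element_id):
    """Collect the child ids of the matching background-element."""
    root = ET.fromstring(text.strip())
    for element in root.findall("background-element[@type-id='24']"):
        if element.get("id") == element_id:
            return [c.get("id") for c in element if c.get("id") is not None]
    return []


element_id, in_csv = csv_headers(CSV_TEXT)
in_model = model_headers(XML_TEXT, element_id)
print("Only in CSV:", sorted(set(in_csv) - set(in_model)))
print("Only in data model:", sorted(set(in_model) - set(in_csv)))
```

Keeping the two readers as separate functions makes it easy to check each half against a small sample before pointing the script at real files.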
