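Comparing the differences between two files with pandas

Read the files with pandas, then compare them for differences

Straight to the code.

This article uses the datacompy library, which can be installed directly with pip install datacompy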

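import pandas as pd
import datacompy


if __name__ == "__main__":
    # Read the first file, skipping its first line and naming the three columns
    a = pd.read_csv('./sot2/ICX01.SOT2', skiprows=1, names=["X", "Y", "BIN"])
    # print(a)
    # Keep only rows with a non-zero BIN, then drop BIN so only the coordinates remain
    aa = a[a['BIN'] != 0].drop(columns='BIN')
    # aa.to_csv('./aa.csv', index=False)

    # Same treatment for the second file
    b = pd.read_csv('./sot2user/ICX.SOT2', skiprows=1, names=["X", "Y", "BIN"])
    bb = b[b['BIN'] != 0].drop(columns='BIN')
    # bb.to_csv('./bb.csv', index=False)

    # Earlier attempts with plain pandas, kept for reference:
    # c = a[a != b]
    # c = c.drop_duplicates(['X', 'Y', "BIN"])
    # c.to_csv('./sot.SOT2', index=False)
    # c.to_excel('./sot.xlsx', index=False)
    # print(a.equals(b))
    # print(b)
    # print(a.merge(b))

    # Match rows on (X, Y) and report what differs between the two files
    compare = datacompy.Compare(bb, aa, join_columns=["X", "Y"])
    print(compare.matches())
    print(compare.report())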

The datacompy library needs to be installed for the comparison, because the output of pandas' own comparison methods is not concise.
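For contrast, here is a rough sketch of how the same "which rows are only in one file" question could be answered with pandas alone, using an outer merge with indicator=True. The two small DataFrames are made-up stand-ins for the article's files, and the variable names only_in_bb and only_in_aa are chosen here for illustration.

```python
import pandas as pd

# Hypothetical in-memory stand-ins for the two files (not the article's data)
aa = pd.DataFrame({"X": [1, 2, 3], "Y": [1, 1, 1]})
bb = pd.DataFrame({"X": [2, 3, 4], "Y": [1, 1, 1]})

# Outer merge on the key columns; the _merge column records
# whether each row came from the left frame, the right frame, or both
merged = bb.merge(aa, on=["X", "Y"], how="outer", indicator=True)

# Rows present only in bb (left) and only in aa (right)
only_in_bb = merged[merged["_merge"] == "left_only"]
only_in_aa = merged[merged["_merge"] == "right_only"]

print(only_in_bb[["X", "Y"]])
print(only_in_aa[["X", "Y"]])
```

This works, but you have to filter and format the result yourself, which is the part datacompy takes care of.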

compare = datacompy.Compare(bb, aa, join_columns=["X","Y"])

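This line runs the comparison with datacompy. bb and aa are the DataFrames read by pandas, and ["X","Y"] are the key columns used to match rows between the two. Any other columns would be compared value by value; here BIN has already been dropped, so the comparison comes down to which (X, Y) pairs appear in both files and which appear in only one.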

print(compare.matches())

This prints the boolean result of the comparison: True if the two DataFrames match, False otherwise.

print(compare.report())

This prints the detailed information about the differences.
