pandas: Matching two DataFrames

Given two DataFrames, tb1 and tb2, the goal is to find the rows of tb1 that also appear in tb2.

According to the answer at https://stackoverflow.com/questions/29464234/compare-python-pandas-dataframes-for-matching-rows:

One possible solution to your problem would be to use merge. Checking whether any row (all columns) from another dataframe (df2) is present in df1 is equivalent to determining the intersection of the two dataframes. This can be accomplished using the following function:

pd.merge(df1, df2, on=['A', 'B', 'C', 'D'], how='inner')

For example, if df1 was

    A           B           C           D
0   0.403846    0.312230    0.209882    0.397923
1   0.934957    0.731730    0.484712    0.734747
2   0.588245    0.961589    0.910292    0.382072
3   0.534226    0.276908    0.323282    0.629398
4   0.259533    0.277465    0.043652    0.925743
5   0.667415    0.051182    0.928655    0.737673
6   0.217923    0.665446    0.224268    0.772592
7   0.023578    0.561884    0.615515    0.362084
8   0.346373    0.375366    0.083003    0.663622
9   0.352584    0.103263    0.661686    0.246862

and df2 was defined as:

    A           B           C           D
0   0.259533    0.277465    0.043652    0.925743
1   0.667415    0.051182    0.928655    0.737673
2   0.217923    0.665446    0.224268    0.772592
3   0.023578    0.561884    0.615515    0.362084
4   0.346373    0.375366    0.083003    0.663622
5   2.000000    3.000000    4.000000    5.000000
6   14.000000   15.000000   16.000000   17.000000

The function pd.merge(df1, df2, on=['A', 'B', 'C', 'D'], how='inner') produces:

    A           B           C           D
0   0.259533    0.277465    0.043652    0.925743
1   0.667415    0.051182    0.928655    0.737673
2   0.217923    0.665446    0.224268    0.772592
3   0.023578    0.561884    0.615515    0.362084
4   0.346373    0.375366    0.083003    0.663622

The results are all of the rows (all columns) that are both in df1 and df2.
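
The merge described above can be reproduced with a small runnable sketch. The frames below use only a few of the rows shown earlier, and the second half (the `indicator=True` left merge that keeps df1's original index) is an added illustration, not part of the quoted answer. Note that the merge compares floating-point values for exact equality, so it assumes the values in both frames are bit-for-bit identical.

```python
import pandas as pd

# Illustrative data: a subset of the rows from the example above.
df1 = pd.DataFrame({
    "A": [0.403846, 0.259533, 0.667415],
    "B": [0.312230, 0.277465, 0.051182],
    "C": [0.209882, 0.043652, 0.928655],
    "D": [0.397923, 0.925743, 0.737673],
})
df2 = pd.DataFrame({
    "A": [0.259533, 0.667415, 2.0],
    "B": [0.277465, 0.051182, 3.0],
    "C": [0.043652, 0.928655, 4.0],
    "D": [0.925743, 0.737673, 5.0],
})

cols = ["A", "B", "C", "D"]

# Inner merge on every column keeps only the rows present in both frames.
common = pd.merge(df1, df2, on=cols, how="inner")
print(common)

# Added illustration: a left merge with indicator=True marks, for each row
# of df1, whether a match exists in df2. Dropping duplicates from df2 first
# keeps the result the same length as df1, so the mask lines up row by row.
matched = df1.merge(df2.drop_duplicates(), on=cols, how="left", indicator=True)
in_df2 = (matched["_merge"] == "both").to_numpy()
rows_of_df1 = df1[in_df2]  # matching rows, with df1's original index labels
print(rows_of_df1)
```

The inner merge returns a fresh 0-based index, while the mask-based variant preserves the index labels of df1, which may be convenient when the matching rows need to be traced back to their positions in tb1.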

See the pandas API documentation (http://pandas.pydata.org/pandas-docs/stable/merging.html):

In [47]: result = pd.merge(left, right, how='inner', on=['key1', 'key2'])
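
To make the documentation call above concrete, here is a minimal sketch. The `left` and `right` frames are hypothetical, modeled on the style of the pandas merging guide; only the `pd.merge` call itself is taken from the line above.

```python
import pandas as pd

# Hypothetical frames sharing two key columns.
left = pd.DataFrame({
    "key1": ["K0", "K0", "K1", "K2"],
    "key2": ["K0", "K1", "K0", "K1"],
    "A": ["A0", "A1", "A2", "A3"],
    "B": ["B0", "B1", "B2", "B3"],
})
right = pd.DataFrame({
    "key1": ["K0", "K1", "K1", "K2"],
    "key2": ["K0", "K0", "K0", "K0"],
    "C": ["C0", "C1", "C2", "C3"],
    "D": ["D0", "D1", "D2", "D3"],
})

# Inner join: keep only the (key1, key2) pairs that occur in both frames.
# A key pair that occurs twice in `right` yields two output rows.
result = pd.merge(left, right, how="inner", on=["key1", "key2"])
print(result)
```

Because the pair (K1, K0) appears twice in `right`, the matching row of `left` is repeated in the result. When the aim is only to test membership, as in the tb1/tb2 case, deduplicating the right-hand frame first avoids such repeats.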
 
