Checking whether a given column in two DataFrames shares any values (numbers or objects)

>>> data=pd.merge(data,userfeature,on='uid',how='inner')
>>> data.info()
<class 'pandas.core.frame.DataFrame'>
Index: 0 entries
Data columns (total 33 columns):
aid                   0 non-null int64
uid                   0 non-null object
label                 0 non-null int64
advertiserId          0 non-null int64
campaignId            0 non-null int64
creativeId            0 non-null int64
creativeSize          0 non-null int64
adCategoryId          0 non-null int64
productId             0 non-null int64
productType           0 non-null int64
LBS                   0 non-null object
age                   0 non-null object
appIdAction           0 non-null object
appIdInstall          0 non-null object
carrier               0 non-null object
consumptionAbility    0 non-null object
ct                    0 non-null object
education             0 non-null object
gender                0 non-null object
house                 0 non-null object
interest1             0 non-null object
interest2             0 non-null object
interest3             0 non-null object
interest4             0 non-null object
interest5             0 non-null object
kw1                   0 non-null object
kw2                   0 non-null object
kw3                   0 non-null object
marriageStatus        0 non-null object
os                    0 non-null object
topic1                0 non-null object
topic2                0 non-null object
topic3                0 non-null object
dtypes: int64(9), object(24)
memory usage: 0.0+ bytes 

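As shown above, an inner merge can tell you whether a given column in two DataFrames has any values (or objects) in common. An inner merge keeps only the rows whose key appears in both frames, so a result with 0 entries means the two columns do not overlap. Here the merge on 'uid' returned 0 entries, so data and userfeature share no uid. The small example below shows both cases: no shared value, then one shared value.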
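Reading the entry count off info() works, but the same check can be reduced to a single boolean. The following is a minimal sketch, not the author's original code: the helper name has_overlap and the two toy frames ads and users are made up for illustration, and it assumes the key column has the same dtype in both frames.

```python
import pandas as pd


def has_overlap(left, right, col):
    """Return True if `col` has at least one value present in both frames.

    Hypothetical helper, for illustration only.
    """
    # Merge only the key column: an inner merge keeps just the rows whose
    # key exists on both sides, so an empty result means nothing is shared.
    merged = pd.merge(left[[col]], right[[col]], on=col, how="inner")
    return not merged.empty


# Toy data (made up): two frames keyed by 'uid'.
ads = pd.DataFrame({"uid": ["u1", "u2", "u3"], "label": [1, 0, 1]})
users = pd.DataFrame({"uid": ["u7", "u8"], "age": [20, 31]})
users_with_match = pd.DataFrame({"uid": ["u3", "u9"], "age": [45, 18]})

no_match = has_overlap(ads, users, "uid")               # no uid in common
one_match = has_overlap(ads, users_with_match, "uid")   # 'u3' is in both
```

Merging only the key column keeps the intermediate result small, which helps when the real frames have dozens of columns like the 33-column result above.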

>>> import pandas as pd
>>> ojbk=pd.DataFrame({'x1':[11,22,33],
...                    'x2':[55,66,77]})
>>> ojb=pd.DataFrame({'x1':[44,55,66],
...                   'x2':[88,99,100]})
>>> ooo=pd.merge(ojbk,ojb,on='x2',how='inner')
>>> ooo.info()
<class 'pandas.core.frame.DataFrame'>
Index: 0 entries
Data columns (total 3 columns):
x1_x    0 non-null int64
x2      0 non-null int64
x1_y    0 non-null int64
dtypes: int64(3)
memory usage: 0.0+ bytes
>>> ojb=pd.DataFrame({'x1':[44,55,66],
...                   'x2':[77,99,100]})
>>> ooo=pd.merge(ojbk,ojb,on='x2',how='inner')
>>> ooo
   x1_x  x2  x1_y
0    33  77    44
>>> ooo.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1 entries, 0 to 0
Data columns (total 3 columns):
x1_x    1 non-null int64
x2      1 non-null int64
x1_y    1 non-null int64
dtypes: int64(3)
memory usage: 32.0 bytes
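If the goal is only to find out which values overlap, building the merged frame is not strictly necessary. As an alternative to the merge shown above (a sketch, not a replacement for it), Series.isin gives a boolean mask over one column. The variable names mask, any_shared and shared_values are introduced here for illustration.

```python
import pandas as pd

ojbk = pd.DataFrame({"x1": [11, 22, 33], "x2": [55, 66, 77]})
ojb = pd.DataFrame({"x1": [44, 55, 66], "x2": [77, 99, 100]})

# True for every row of ojbk whose x2 value also occurs in ojb['x2'].
mask = ojbk["x2"].isin(ojb["x2"])

# Is there any overlap at all?
any_shared = bool(mask.any())

# Which distinct values are shared?
shared_values = ojbk.loc[mask, "x2"].unique().tolist()
```

With the frames above, only 77 appears in both x2 columns, which matches the single row returned by the inner merge.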

