Extracting specific values from a column in Python. Python Pandas: get the index of rows whose column matches a specific value

Given a DataFrame with a column "BoolCol", we want to find the indexes of the rows in which "BoolCol" == True.

I currently do it by iterating, which works perfectly:

for i in range(100, 3000):
    if df.iloc[i]['BoolCol'] == True:
        print(i, df.iloc[i]['BoolCol'])

But this is not the idiomatic pandas way to do it.

After some research, I am currently using this code:

df[df['BoolCol'] == True].index.tolist()

This gives me a list of indexes, but they don't match when I check them by doing:

df.iloc[i]['BoolCol']

The result is actually False!!

Which would be the correct Pandas way to do this?

Solution

df.iloc[i] returns the ith row of df. Here i does not refer to the index label; it is a 0-based ordinal position.

In contrast, the attribute index returns actual index labels, not numeric row-indices:

df.index[df['BoolCol'] == True].tolist()

or equivalently,

df.index[df['BoolCol']].tolist()
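The mismatch described in the question can be reproduced with a minimal sketch. The data below is hypothetical: it assumes integer index labels that are a shuffled version of the row positions, so that iloc raises no error but silently reads a different row.

```python
import pandas as pd

# Hypothetical data: integer labels that are a shuffled version of the positions.
df = pd.DataFrame({'BoolCol': [True, False, True, False, False]},
                  index=[3, 1, 4, 0, 2])

# Index labels of the rows where BoolCol is True.
labels = df.index[df['BoolCol']].tolist()

# iloc treats each number as a position, so it may read a different row.
by_position = [bool(df.iloc[i]['BoolCol']) for i in labels]

# loc treats each number as a label, so it reads the rows that were selected.
by_label = [bool(df.loc[i, 'BoolCol']) for i in labels]

print(labels)       # [3, 4]
print(by_position)  # [False, False]
print(by_label)     # [True, True]
```

With these labels, checking the result through iloc reports False for every selected row, which matches the symptom in the question, while loc confirms the selection.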

You can see the difference quite clearly by playing with a DataFrame with an "unusual" index:

df = pd.DataFrame({'BoolCol': [True, False, False, True, True]},
                  index=[10, 20, 30, 40, 50])

In [53]: df
Out[53]:
    BoolCol
10     True
20    False
30    False
40     True
50     True

[5 rows x 1 columns]

In [54]: df.index[df['BoolCol']].tolist()

Out[54]: [10, 40, 50]

If you want to use the index,

In [56]: idx = df.index[df['BoolCol']]

In [57]: idx

Out[57]: Int64Index([10, 40, 50], dtype='int64')

then you can select the rows using loc instead of iloc:

In [58]: df.loc[idx]
Out[58]:
    BoolCol
10     True
40     True
50     True

[3 rows x 1 columns]

Note that loc can also accept boolean arrays:

In [55]: df.loc[df['BoolCol']]
Out[55]:
    BoolCol
10     True
40     True
50     True

[3 rows x 1 columns]
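The same pattern should carry over to a column that is not boolean: comparing it with a specific value yields a boolean mask, which can be passed to either df.index or df.loc. The following is a sketch only; the column name 'Color' and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical data; 'Color' and its values are made up for illustration.
df = pd.DataFrame({'Color': ['red', 'blue', 'red', 'green']},
                  index=[10, 20, 30, 40])

# Boolean Series aligned with df.index: True where the value matches.
mask = df['Color'] == 'red'

red_labels = df.index[mask].tolist()  # index labels of the matching rows
red_rows = df.loc[mask]               # the matching rows themselves

print(red_labels)  # [10, 30]
```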

If you have a boolean array, mask, and need ordinal index values, you can compute them using np.flatnonzero:

In [110]: np.flatnonzero(df['BoolCol'])

Out[110]: array([0, 3, 4])

Use df.iloc to select rows by ordinal index:

In [113]: df.iloc[np.flatnonzero(df['BoolCol'])]
Out[113]:
    BoolCol
10     True
40     True
50     True
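To tie the two kinds of index together, here is a short sketch that converts ordinal positions to labels and back, reusing the example DataFrame from above. The use of Index.get_indexer for the reverse direction is an addition not found in the original answer, and it assumes the index labels are unique.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'BoolCol': [True, False, False, True, True]},
                  index=[10, 20, 30, 40, 50])

# Ordinal positions of the True rows.
positions = np.flatnonzero(df['BoolCol'].to_numpy())

# Positions -> labels: index the Index object by position.
labels = df.index[positions]

# Labels -> positions: assumes unique labels (not part of the original answer).
back = df.index.get_indexer(labels)

print(positions.tolist())  # [0, 3, 4]
print(labels.tolist())     # [10, 40, 50]
print(back.tolist())       # [0, 3, 4]
```

Positions are meant for iloc and labels for loc; mixing the two is what produced the unexpected False in the question.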
