Python data processing: selecting data
iloc, loc, ix
import pandas as pd
from pandas import DataFrame
Create a DataFrame
data = {'a':[11,22,33,44],
'b':['aa','bb','cc','dd'],
'c':[9,8,7,6],
'd':[1,2,3,4]
}
df = DataFrame(data)
df
    a   b  c  d
0  11  aa  9  1
1  22  bb  8  2
2  33  cc  7  3
3  44  dd  6  4
iloc: select data by row/column position
df.iloc[0] # select the row at position 0
a 11
b aa
c 9
d 1
Name: 0, dtype: object
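df.iloc[0:2] # select multiple rows
    a   b  c  d
0  11  aa  9  1
1  22  bb  8  2
df.iloc[:,[1]] # columns can also be selected by position; this selects the second column (position 1)
    b
0  aa
1  bb
2  cc
3  dd
df.iloc[0:1,[1]] # select by row and column position: row 0, second column (position 1)
    b
0  aa
df.iloc[0:2,[0,1]] # select multiple rows and columns by position: rows 0-1 and columns 0-1 (the end of a slice is excluded)
    a   b
0  11  aa
1  22  bb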
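The iloc calls above all pass a list for the columns, which is why they return DataFrames. As a small sketch (the variable names are only for illustration and are not from the original post), the block below shows how the return type changes with a scalar position, and that iloc slices exclude their end position:

```python
import pandas as pd

df = pd.DataFrame({
    'a': [11, 22, 33, 44],
    'b': ['aa', 'bb', 'cc', 'dd'],
    'c': [9, 8, 7, 6],
    'd': [1, 2, 3, 4],
})

# A scalar column position returns a Series;
# a one-element list returns a one-column DataFrame.
col_as_series = df.iloc[:, 1]
col_as_frame = df.iloc[:, [1]]

# Slices are end-exclusive: 0:2 covers positions 0 and 1 only.
first_two_rows = df.iloc[0:2]

# Two scalar positions return a single value.
cell = df.iloc[0, 1]

# Negative positions count from the end, as with Python lists.
last_row = df.iloc[-1]
```

Use the list form when later code expects a DataFrame, and the scalar form when a Series or a single value is enough.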
loc: select data by label
df.loc[0] # select the first row; its label is 0, so the result matches iloc
a 11
b aa
c 9
d 1
Name: 0, dtype: object
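loc and iloc agree here only because the default index labels happen to equal the row positions. A minimal sketch with a hypothetical frame (df2 is not part of the original post) whose integer labels are deliberately out of order shows the two diverging:

```python
import pandas as pd

# Hypothetical frame: the integer labels do not match the row positions.
df2 = pd.DataFrame({'a': [11, 22, 33]}, index=[2, 0, 1])

by_label = df2.loc[0]      # the row whose label is 0 (the second row)
by_position = df2.iloc[0]  # the first row, whose label is 2
```

After filtering or sorting, integer labels often end up in this state, so it is safer to choose loc or iloc based on whether a label or a position is meant.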
data = {'a':[11,22,33,44],
'b':['aa','bb','cc','dd'],
'c':[9,8,7,6],
'd':[1,2,3,4]
}
df1 = DataFrame(data,index = ['a','b','c','d'])
df1
    a   b  c  d
a  11  aa  9  1
b  22  bb  8  2
c  33  cc  7  3
d  44  dd  6  4
df1.loc['b'] # select the row labeled 'b'
a 22
b bb
c 8
d 2
Name: b, dtype: object
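df1.loc['b':] # select multiple rows: from label 'b' to the end
    a   b  c  d
b  22  bb  8  2
c  33  cc  7  3
d  44  dd  6  4
df1.loc[:,['a']] # select a single column by label
    a
a  11
b  22
c  33
d  44
df1.loc[:,['a','b']] # select multiple columns by label
    a   b
a  11  aa
b  22  bb
c  33  cc
d  44  dd
df1.loc['a',['b','c']] # select specific columns of one row by label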
b aa
c 9
Name: a, dtype: object
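One detail the loc examples do not spell out is that label slices include their end label, unlike iloc slices. The following is a sketch under the same df1 definition (the variable names are illustrative only):

```python
import pandas as pd

df1 = pd.DataFrame(
    {'a': [11, 22, 33, 44], 'b': ['aa', 'bb', 'cc', 'dd'],
     'c': [9, 8, 7, 6], 'd': [1, 2, 3, 4]},
    index=['a', 'b', 'c', 'd'],
)

by_label = df1.loc['a':'c']   # rows a, b, c: the end label is included
by_position = df1.iloc[0:2]   # rows a, b: the end position is excluded

# Label slices work on columns as well.
sub = df1.loc['b':'c', 'a':'b']

# Two scalar labels return a single value.
cell = df1.loc['a', 'b']

# A label that does not exist raises KeyError.
try:
    df1.loc['z']
    missing_raises = False
except KeyError:
    missing_raises = True
```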
Select data by condition
df1.loc[df1['a']==11] # select rows with a single condition
    a   b  c  d
a  11  aa  9  1
df1.loc[(df1['a']==11)&(df1['d']==1)] # select rows with multiple conditions
    a   b  c  d
a  11  aa  9  1
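The conditions passed to loc are boolean Series, so they can be combined in other ways too. A short sketch, assuming the same df1 as above (variable names are illustrative): | means "or", ~ means "not", and each comparison needs its own parentheses because & and | bind more tightly than ==.

```python
import pandas as pd

df1 = pd.DataFrame(
    {'a': [11, 22, 33, 44], 'b': ['aa', 'bb', 'cc', 'dd'],
     'c': [9, 8, 7, 6], 'd': [1, 2, 3, 4]},
    index=['a', 'b', 'c', 'd'],
)

# "or": rows where a is 11 or d is 4.
either = df1.loc[(df1['a'] == 11) | (df1['d'] == 4)]

# "not": every row except those where a is 11.
not_11 = df1.loc[~(df1['a'] == 11)]

# Membership test against a list of values.
in_set = df1.loc[df1['b'].isin(['bb', 'cc'])]

# A condition for the rows combined with column labels.
cols = df1.loc[df1['c'] > 7, ['a', 'b']]
```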
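ix: quick and dirty, mixing both styles
In other words, ix combines the syntax of iloc and loc, so you can use whichever style you like, but it raises a FutureWarning. Note that .ix was deprecated in pandas 0.20.0 and removed in pandas 1.0, so the examples below only run on older versions; use loc or iloc in new code.
df # look at the DataFrame again
    a   b  c  d
0  11  aa  9  1
1  22  bb  8  2
2  33  cc  7  3
3  44  dd  6  4
df.ix[1] # select by row number, like iloc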
/Users/anaconda/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: FutureWarning:
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing
See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
"""Entry point for launching an IPython kernel.
a 22
b bb
c 8
d 2
Name: 1, dtype: object
df1
    a   b  c  d
a  11  aa  9  1
b  22  bb  8  2
c  33  cc  7  3
d  44  dd  6  4
df1.ix['a'] # select by label, like loc
a 11
b aa
c 9
d 1
Name: a, dtype: object
df1.ix[3,3] # select the value at a given position by row and column number
4
df1.ix['a','a'] # select the value at a given position by label
11
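Since the warning recommends loc and iloc, each ix call in this section can be rewritten with one of them. The block below is a sketch of those equivalents on the same df and df1; the mixed label-plus-position lookups at the end are one possible approach, not something taken from the original post.

```python
import pandas as pd

data = {'a': [11, 22, 33, 44], 'b': ['aa', 'bb', 'cc', 'dd'],
        'c': [9, 8, 7, 6], 'd': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df1 = pd.DataFrame(data, index=['a', 'b', 'c', 'd'])

row_by_position = df.iloc[1]        # replaces df.ix[1]
row_by_label = df1.loc['a']         # replaces df1.ix['a']
cell_by_position = df1.iloc[3, 3]   # replaces df1.ix[3, 3]
cell_by_label = df1.loc['a', 'a']   # replaces df1.ix['a', 'a']

# Mixing a row label with a column position:
# either translate the position into a label ...
mixed_via_loc = df1.loc['b', df1.columns[2]]
# ... or translate the label into a position.
mixed_via_iloc = df1.iloc[df1.index.get_loc('b'), 2]

# at and iat are scalar-only accessors for single values.
fast_by_label = df1.at['a', 'a']
fast_by_position = df1.iat[3, 3]
```

Being explicit about label versus position removes the ambiguity that led to ix being deprecated.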