Basic pandas functionality
import pandas as pd
import numpy as np
from pandas import Series, DataFrame
Using reindex on a Series
obj = Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'c', 'a'])
obj
d 4.5
b 7.2
c -5.3
a 3.6
dtype: float64
obj.reindex(['a','b','Y','Z'])
a 3.6
b 7.2
Y NaN
Z NaN
dtype: float64
obj.reindex(['a','b','Y','Z'], fill_value=0)
a 3.6
b 7.2
Y 0.0
Z 0.0
dtype: float64
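The behavior shown above can be checked with a minimal sketch. It assumes only that reindex returns a new Series rather than modifying obj in place; the names with_nan and with_zero are illustrative and not part of the original notes.

```python
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'c', 'a'])

# reindex builds a new Series aligned to the requested labels.
with_nan = obj.reindex(['a', 'b', 'Y', 'Z'])
with_zero = obj.reindex(['a', 'b', 'Y', 'Z'], fill_value=0)

# The original object keeps its own index and values.
print(list(obj.index))
# True marks labels that were absent from obj.
print(with_nan.isna().tolist())
# With fill_value=0 the missing labels get 0 instead of NaN.
print(with_zero.tolist())
```

Labels that do not exist in the original index ('Y' and 'Z' here) are the only ones affected by fill_value; existing values are copied over unchanged.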
Using reindex on a DataFrame: it can change the row index, the column index, or both at the same time.
frame = DataFrame(np.arange(9).reshape((3,3)), index=['a','b','d'], columns=['cal','china','cannada'])
frame
   cal  china  cannada
a    0      1        2
b    3      4        5
d    6      7        8
frame1 = frame.reindex(['a','b','c','d'])
frame1
   cal  china  cannada
a  0.0    1.0      2.0
b  3.0    4.0      5.0
c  NaN    NaN      NaN
d  6.0    7.0      8.0
frame2 = frame.reindex(columns=['cal','y','china','w'])
frame2
   cal   y  china   w
a    0 NaN      1 NaN
b    3 NaN      4 NaN
d    6 NaN      7 NaN
frame3 = frame.reindex(index=['a','b','c','d'],columns=['cal','y','china','w'])
frame3
   cal   y  china   w
a  0.0 NaN    1.0 NaN
b  3.0 NaN    4.0 NaN
c  NaN NaN    NaN NaN
d  6.0 NaN    7.0 NaN
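The outputs above switch between integers and floats depending on which axis is reindexed. The sketch below illustrates why: NaN is a floating-point value, so a column that receives a NaN is upcast to float64, while a column copied over intact keeps its integer dtype. The frame is rebuilt here so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'b', 'd'],
                     columns=['cal', 'china', 'cannada'])

# Adding row 'c' introduces NaN into every column, so all become float64.
frame1 = frame.reindex(['a', 'b', 'c', 'd'])
print(frame1.dtypes)

# Reindexing only the columns leaves existing columns as integers;
# the brand-new columns 'y' and 'w' contain nothing but NaN.
frame2 = frame.reindex(columns=['cal', 'y', 'china', 'w'])
print(frame2.dtypes)
```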
Dropping entries from an axis: drop() only needs an index label, or an array or list of labels.
frame
   cal  china  cannada
a    0      1        2
b    3      4        5
d    6      7        8
frame.drop('a')
   cal  china  cannada
b    3      4        5
d    6      7        8
frame.drop('china',axis=1)
   cal  cannada
a    0        2
b    3        5
d    6        8
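As a small sketch of the drop() calls above: drop accepts a list of labels as well as a single label, and it returns a new object instead of changing frame. The columns= keyword used here is an alternative spelling of axis=1; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'b', 'd'],
                     columns=['cal', 'china', 'cannada'])

# Drop several row labels at once.
rows_dropped = frame.drop(['a', 'b'])
# Drop a column; equivalent to frame.drop('china', axis=1).
cols_dropped = frame.drop(columns=['china'])

print(rows_dropped.index.tolist())
print(cols_dropped.columns.tolist())
# The original frame still has all three rows and columns.
print(frame.shape)
```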
Indexing, selection, and filtering
se = Series(np.arange(6), index=['a','b','c','d','e','f'])
se
a 0
b 1
c 2
d 3
e 4
f 5
dtype: int32
se['b':'c']
b 1
c 2
dtype: int32
se[['c','e','d']]
c 2
e 4
d 3
dtype: int32
se[se<3]
a 0
b 1
c 2
dtype: int32
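One detail in the slice above is easy to miss: slicing by label includes the end label, unlike ordinary Python slicing by position. A minimal sketch of the difference, using iloc for the positional case:

```python
import numpy as np
import pandas as pd

se = pd.Series(np.arange(6), index=['a', 'b', 'c', 'd', 'e', 'f'])

# Label slice: both endpoints 'b' and 'c' are included.
by_label = se['b':'c']
# Positional slice: the stop position (2) is excluded, so only 'b' remains.
by_position = se.iloc[1:2]

print(by_label.index.tolist())
print(by_position.index.tolist())
```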
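Indexing a DataFrame with a column name or a list of names retrieves one or more columns; a slice or a boolean array selects rows instead.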
frame
   cal  china  cannada
a    0      1        2
b    3      4        5
d    6      7        8
frame['cal']
a 0
b 3
d 6
Name: cal, dtype: int32
frame[['cal','china']]
   cal  china
a    0      1
b    3      4
d    6      7
frame[:2]
   cal  china  cannada
a    0      1        2
b    3      4        5
frame[frame['china'] > 1]
   cal  china  cannada
b    3      4        5
d    6      7        8
frame.loc['d']  # .ix was deprecated and removed in pandas 1.0; use .loc for labels, .iloc for positions
cal 6
china 7
cannada 8
Name: d, dtype: int32
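Row selection by label or by position can be written with .loc and .iloc. The sketch below rebuilds the same frame and shows both, including selecting rows and columns in one step; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'b', 'd'],
                     columns=['cal', 'china', 'cannada'])

# Select one row by label, and the same row by integer position.
row_by_label = frame.loc['d']
row_by_position = frame.iloc[2]

# Select rows and columns together, first by label and then by position.
subset_by_label = frame.loc[['a', 'd'], ['cal', 'china']]
subset_by_position = frame.iloc[:2, [0, 2]]

print(row_by_label)
print(subset_by_label)
print(subset_by_position)
```

Keeping the two accessors separate avoids ambiguity when the index itself holds integers.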
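Arithmetic and data alignment
- One of the most important pandas features is that it can perform arithmetic between objects that have different indexes.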
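A minimal sketch of that alignment, using two made-up Series (s1 and s2 are illustrative and do not come from the original notes): the result is indexed by the union of both indexes, and labels present in only one operand become NaN unless a fill value is given through the add method.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])
s2 = pd.Series([10.0, 20.0, 30.0], index=['b', 'c', 'd'])

# Values are matched by label; 'a' and 'd' exist on one side only, so they are NaN.
total = s1 + s2
# With fill_value=0 a label missing from one side is treated as 0 there.
filled = s1.add(s2, fill_value=0)

print(total)
print(filled)
```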