pandas Basic Functionality (Part 1)

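Basic functionality of pandas

  • Reindexing: reindex() creates a new object conformed to a new index (the older ix indexer is deprecated; use loc or iloc for selection)

  • Dropping entries from an axis

  • Indexing, selection, and filtering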

import pandas as pd
import numpy as np
from pandas import Series, DataFrame

Using reindex on a Series

obj = Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'c', 'a'])
obj
d    4.5
b    7.2
c   -5.3
a    3.6
dtype: float64
obj.reindex(['a', 'b', 'Y', 'Z'])  # labels found in the old index keep their values; new labels are filled with NaN
a    3.6
b    7.2
Y    NaN
Z    NaN
dtype: float64
obj.reindex(['a','b','Y','Z'], fill_value=0) 
a    3.6
b    7.2
Y    0.0
Z    0.0
dtype: float64
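# obj.reindex(['a', 'b', 'Y', 'Z'], method='ffill')  # left commented out: method='ffill' requires a monotonic (sorted) index, and obj's index is not sorted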
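
The commented-out call above uses method='ffill', which only works when the existing index is sorted. A minimal sketch of forward filling on a sorted index follows; the Series named colors and its values are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Hypothetical data: values exist only at positions 0, 2 and 4.
colors = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

# Reindex to 0..5; method='ffill' carries the last known value forward
# into each newly created label.
filled = colors.reindex(range(6), method='ffill')
print(filled)
```

Positions 1, 3 and 5 take the value of the nearest preceding label rather than NaN.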

Using reindex on a DataFrame: it can change the row index, the column index, or both at once.

frame = DataFrame(np.arange(9).reshape((3, 3)), index=['a', 'b', 'd'], columns=['cal', 'china', 'cannada'])
frame
   cal  china  cannada
a    0      1        2
b    3      4        5
d    6      7        8
frame1 = frame.reindex(['a', 'b', 'c', 'd'])  # passing a single sequence changes only the row index
frame1
   cal  china  cannada
a  0.0    1.0      2.0
b  3.0    4.0      5.0
c  NaN    NaN      NaN
d  6.0    7.0      8.0
frame2 = frame.reindex(columns=['cal', 'y', 'china', 'w'])  # the columns keyword changes the column index
frame2
   cal   y  china   w
a    0 NaN      1 NaN
b    3 NaN      4 NaN
d    6 NaN      7 NaN
frame3 = frame.reindex(index=['a', 'b', 'c', 'd'], columns=['cal', 'y', 'china', 'w'])  # change rows and columns at the same time
frame3
   cal   y  china   w
a  0.0 NaN    1.0 NaN
b  3.0 NaN    4.0 NaN
c  NaN NaN    NaN NaN
d  6.0 NaN    7.0 NaN
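
In the outputs above, adding row 'c' turns the integer columns into floats, because NaN cannot be stored in an integer column. As a sketch of one way around this, the fill_value argument shown earlier for Series can also be passed when reindexing a DataFrame; the variable names with_nan and with_zero are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'b', 'd'],
                     columns=['cal', 'china', 'cannada'])

# Without fill_value the new row 'c' is NaN, so every column becomes float.
with_nan = frame.reindex(['a', 'b', 'c', 'd'])

# With fill_value=0 the new row holds zeros and the columns stay integer.
with_zero = frame.reindex(['a', 'b', 'c', 'd'], fill_value=0)

print(with_nan.dtypes)
print(with_zero.dtypes)
```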

Dropping entries from an axis: drop() only needs a label or a list of labels

frame
   cal  china  cannada
a    0      1        2
b    3      4        5
d    6      7        8
frame.drop('a')
   cal  china  cannada
b    3      4        5
d    6      7        8
frame.drop('china', axis=1)  # dropping a column requires axis=1
   cal  cannada
a    0        2
b    3        5
d    6        8
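
The drop() calls above return a new object and leave frame itself untouched. A short sketch, assuming the same frame as above, that drops several rows at once and uses the columns keyword as an alternative to axis=1:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'b', 'd'],
                     columns=['cal', 'china', 'cannada'])

rows_dropped = frame.drop(['a', 'd'])          # drop several rows with a list of labels
cols_dropped = frame.drop(columns=['china'])   # equivalent to frame.drop('china', axis=1)

print(rows_dropped)
print(cols_dropped)
print(frame.shape)  # the original frame still has all 3 rows and 3 columns
```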

Indexing, selection, and filtering

# Slicing by label works like ordinary slicing, except that the end label is included
se = Series(np.arange(6), index=['a','b','c','d','e','f'])
se
a    0
b    1
c    2
d    3
e    4
f    5
dtype: int32
se['b':'c']  # a label slice includes both endpoints
b    1
c    2
dtype: int32
se[['c', 'e', 'd']]  # select by a list of index labels, in the order given
c    2
e    4
d    3
dtype: int32
se[se<3]
a    0
b    1
c    2
dtype: int32
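
To make the difference between label slices and positional slices concrete, here is a small sketch using the same se Series. It uses the loc and iloc indexers to make the intent explicit; the variable names are chosen here for illustration.

```python
import numpy as np
import pandas as pd

se = pd.Series(np.arange(6), index=list('abcdef'))

by_label = se.loc['b':'c']    # label slice: the end label 'c' is included
by_position = se.iloc[1:2]    # positional slice: position 2 is excluded

print(by_label)
print(by_position)
```

The label slice returns two elements (b and c), while the positional slice over the same starting point returns only b.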

Indexing a DataFrame with [] retrieves one or more columns

frame
   cal  china  cannada
a    0      1        2
b    3      4        5
d    6      7        8
frame['cal']
a    0
b    3
d    6
Name: cal, dtype: int32
frame[['cal', 'china']]  # a list of names selects several columns
   cal  china
a    0      1
b    3      4
d    6      7
frame[:2]  # a slice, by contrast, selects rows (here the first two)
   cal  china  cannada
a    0      1        2
b    3      4        5
frame[frame['china'] > 1]
   cal  china  cannada
b    3      4        5
d    6      7        8
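
The boolean filter above works in two steps: the comparison produces a boolean Series aligned with the rows, and indexing with that Series keeps the rows marked True. A minimal sketch that spells the steps out, with mask, selected and same as illustrative names:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'b', 'd'],
                     columns=['cal', 'china', 'cannada'])

mask = frame['china'] > 1    # boolean Series: a False, b True, d True
selected = frame[mask]       # keeps only the rows where mask is True
same = frame.loc[mask]       # the same selection written with .loc

print(mask)
print(selected)
```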

frame.loc['d']  # the original call was frame.ix['d']; .ix is deprecated (and removed in pandas 1.0), so use .loc for labels or .iloc for positions
# See http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated
cal        6
china      7
cannada    8
Name: d, dtype: int32
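
Because .ix is deprecated, row selection is better written with .loc (labels) or .iloc (integer positions). A minimal sketch that rebuilds the frame used throughout this post; the result variable names are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'b', 'd'],
                     columns=['cal', 'china', 'cannada'])

row_by_label = frame.loc['d']      # select the row labelled 'd'
row_by_position = frame.iloc[2]    # select the third row by position

# .loc also accepts row labels and column labels together.
subset = frame.loc[['a', 'd'], ['cal', 'china']]

print(row_by_label)
print(subset)
```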

Arithmetic and data alignment

  • One of the most important features of pandas is that it can perform arithmetic on objects with different indexes
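
As a sketch of what this alignment looks like in practice, the example below adds two Series whose indexes only partly overlap. The Series s1 and s2 and their values are made up for illustration.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])
s2 = pd.Series([10.0, 20.0, 30.0], index=['b', 'c', 'd'])

# The result uses the union of both indexes; labels present in only
# one operand ('a' and 'd') come out as NaN.
total = s1 + s2

# add() with fill_value=0 treats the missing side as 0 instead.
filled = s1.add(s2, fill_value=0)

print(total)
print(filled)
```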
