Day 1 2019-05-29 Introduction to pandas data structures
Series
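A Series is a one-dimensional, array-like object. It consists of a sequence of values (of any NumPy data type) and an associated set of data labels, called its index.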
from pandas import Series, DataFrame
obj = Series([4, 7, -5, 3])
obj
Out[217]:
0    4
1    7
2   -5
3    3
dtype: int64
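The two parts of a Series described above (the values and the index) can be inspected separately. The following is a minimal sketch; the letter labels used for obj2 are made up for illustration and do not come from the original notes.

```python
import pandas as pd

# Same values as the example above, with the default integer index 0..3.
obj = pd.Series([4, 7, -5, 3])

values = obj.values   # the underlying NumPy array holding the data
index = obj.index     # the labels; a RangeIndex when none is given

# Hypothetical labels, chosen only to show a non-default index.
obj2 = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])
picked = obj2["a"]    # select a single value by its label

print(values)
print(list(index))
print(picked)
```

With an explicit index, values are looked up by label rather than by position.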
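A Series can also be thought of as a fixed-length, ordered dict that maps index values to data values. A Python dict can be passed directly to create a Series.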
sdata = {'Ohio': 35000,
         'Texas': 71000,
         'Oregon': 16000,
         'Utah': 5000}
obj3 = Series(sdata)
obj3
Out[232]:
Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64
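The dict analogy can be checked directly: membership tests and lookups on a Series work on its index labels, much like keys in a dict. A small sketch using the same sdata as above (the variable names has_texas, has_california, ohio and round_trip are only for illustration):

```python
import pandas as pd

sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000}
obj3 = pd.Series(sdata)

has_texas = "Texas" in obj3            # `in` tests the index labels, like dict keys
has_california = "California" in obj3  # a label that is not in the index
ohio = obj3["Ohio"]                    # lookup by label, like dict[key]
round_trip = obj3.to_dict()            # convert back to a plain Python dict

print(has_texas, has_california, ohio)
```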
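# Passing an explicit index selects and orders the dict entries by label.
# 'California' has no entry in sdata, so its value is NaN (and the dtype becomes float64);
# 'Utah' is not in states, so it is dropped.
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj4 = Series(sdata, index=states)
obj4
Out[240]:
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64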
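The NaN in the output above marks a label that had no matching entry in the dict. As a rough sketch, missing entries like this can be located with Series.isna and removed with Series.dropna; the variable names below are illustrative only.

```python
import pandas as pd

sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000}
states = ["California", "Ohio", "Oregon", "Texas"]
obj4 = pd.Series(sdata, index=states)

missing = obj4.isna()                           # boolean Series, True where the value is NaN
missing_labels = obj4[missing].index.tolist()   # labels that had no entry in sdata
present = obj4.dropna()                         # keep only the labels that had data

print(missing_labels)
print(present)
```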
DataFrame
data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
        'year': [2000, 2001, 2002, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
frame = DataFrame(data)
frame
Out[250]:
    state  year  pop
0    Ohio  2000  1.5
1    Ohio  2001  1.7
2    Ohio  2002  3.6
3  Nevada  2001  2.4
4  Nevada  2002  2.9
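To connect this back to the Series section: each column of the frame built above can be pulled out as a Series, and so can a single row. This is a short sketch, not part of the original notes; the variable names years, population and row are illustrative.

```python
import pandas as pd

data = {
    "state": ["Ohio", "Ohio", "Ohio", "Nevada", "Nevada"],
    "year": [2000, 2001, 2002, 2001, 2002],
    "pop": [1.5, 1.7, 3.6, 2.4, 2.9],
}
frame = pd.DataFrame(data)

columns = list(frame.columns)  # column labels, in the order of the dict keys
years = frame["year"]          # a single column comes back as a Series
# frame.pop is the DataFrame.pop method, so the "pop" column needs bracket access.
population = frame["pop"]
row = frame.loc[3]             # one row, selected by index label, also a Series

print(columns)
print(row)
```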