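A Series consists of a set of data values and the index associated with them. A Series can be created from the following kinds of input (a list, a scalar, a dict, an ndarray, or a range):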
In [1]: import pandas as pd
In [2]: list_a = [2,4,5,6]
In [3]: pd.Series(list_a)
Out[3]:
0 2
1 4
2 5
3 6
dtype: int64
In [4]: pd.Series(1,index = [1,2,3])
Out[4]:
1 1
2 1
3 1
dtype: int64
In [5]: pd.Series({'a':1,'b':3})
Out[5]:
a 1
b 3
dtype: int64
# If a label in the given index exists in the dict, it keeps the dict's value; a label with no match in the dict yields NaN
In [11]: pd.Series({'a':1,'b':3},index = ['b','a','c'])
Out[11]:
b 3.0
a 1.0
c NaN
dtype: float64
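The label-matching rule above can be checked directly. The following is a minimal sketch, assuming a reasonably recent pandas version; the names `data` and `s` are illustrative and not part of the original session.

```python
import pandas as pd

data = {'a': 1, 'b': 3}

# Each label in `index` is looked up in the dict.
# 'c' has no entry, so its value becomes NaN.
s = pd.Series(data, index=['b', 'a', 'c'])

print(s.isna().tolist())  # marks which labels had no match
print(s.dtype)            # NaN forces the integers to be stored as floats
```

This also explains why the result above has dtype float64 even though the dict holds only integers.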
In [8]: import numpy as np
In [9]: list_b = np.arange(6)
In [10]: pd.Series(list_b)
Out[10]:
0 0
1 1
2 2
3 3
4 4
5 5
dtype: int32
In [12]: pd.Series(range(3))
Out[12]:
0 0
1 1
2 2
dtype: int32
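The int32 shown in the last two outputs most likely reflects the default integer width of the platform the session ran on; other systems commonly report int64. A minimal sketch, assuming the goal is a platform-independent result, is to pass `dtype` explicitly. The names `from_array` and `from_range` are illustrative.

```python
import numpy as np
import pandas as pd

# Requesting the dtype explicitly avoids relying on the platform default.
from_array = pd.Series(np.arange(6), dtype='int64')
from_range = pd.Series(range(3), dtype='int64')

print(from_array.dtype)
print(from_range.dtype)
```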
Basic operations on a Series:
In [14]: a = pd.Series({'a':1,'b':5})
In [15]: a.index
Out[15]: Index(['a', 'b'], dtype='object')
In [16]: a.values # returns the values as a NumPy ndarray
Out[16]: array([1, 5], dtype=int64)
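# The automatic (positional) index and the custom index coexist, but they cannot be mixed in one selection
In [17]: a[0] # automatic (positional) index; recent pandas versions prefer a.iloc[0]
Out[17]: 1
# Custom index
In [18]: a['a']
Out[18]: 1
# Mixing does not work: here the 1 is treated as a label, matches nothing, and yields NaN (recent pandas versions raise KeyError instead)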
In [20]: a[['a',1]]
Out[20]:
a 1.0
1 NaN
dtype: float64
# Membership tests go through the custom index
# The in keyword checks index labels, not values
In [21]: 'a' in a
Out[21]: True
In [22]: 1 in a
Out[22]: False
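The distinction between positional access, label access, and membership tests can be made explicit with `.iloc`, `.loc`, and `.values`. This is a minimal sketch, assuming a recent pandas version; the result variable names are illustrative.

```python
import pandas as pd

a = pd.Series({'a': 1, 'b': 5})

first = a.iloc[0]       # position-based access
by_label = a.loc['a']   # label-based access

label_present = 'a' in a           # tests the index labels
value_present = 1 in a.values      # tests the stored values
also_present = a.isin([1]).any()   # the same value test, staying within pandas

print(first, by_label, label_present, value_present, also_present)
```

Using `.iloc` and `.loc` states the intent explicitly, so the code does not depend on how plain `a[...]` interprets an integer key.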
During arithmetic, a Series automatically aligns data by index label:
In [29]: a = pd.Series([1,3,5],index = ['a','b','c'])
In [30]: b = pd.Series([2,4,5,6],index = ['c,','d','e','b'])
In [31]: a+b
Out[31]:
a NaN
b 9.0
c NaN
c, NaN
d NaN
e NaN
dtype: float64
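Labels that appear in only one operand produce NaN, as the output above shows. When that is not wanted, one option is `Series.add` with `fill_value`, which substitutes a default for the missing side. The sketch below reuses the same two Series, including the `'c,'` label exactly as typed in the session; `plain` and `filled` are illustrative names.

```python
import pandas as pd

a = pd.Series([1, 3, 5], index=['a', 'b', 'c'])
b = pd.Series([2, 4, 5, 6], index=['c,', 'd', 'e', 'b'])

# Plain + yields NaN for every label missing from either side.
plain = a + b

# With fill_value=0, a label missing from one side is treated as 0 there.
filled = a.add(b, fill_value=0)

print(plain)
print(filled)
```

Only the label `'b'` is shared, so it is the only non-NaN entry in `plain`, while `filled` carries every value through.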
A Series object can be modified at any time, and the change takes effect immediately:
In [32]: a.index = ['c','d','e']
In [33]: a
Out[33]:
c 1
d 3
e 5
dtype: int64
In [34]: a+b
Out[34]:
b NaN
c NaN
c, NaN
d 7.0
e 10.0
dtype: float64
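The effect of relabeling on alignment can be seen by comparing the same sum before and after the index is reassigned. This is a minimal sketch; the Series `b` here is a hypothetical one with cleanly matching labels, not the one from the session.

```python
import pandas as pd

a = pd.Series([1, 3, 5], index=['a', 'b', 'c'])
b = pd.Series([10, 20, 30], index=['c', 'd', 'e'])  # hypothetical data

before = a + b             # only 'c' is shared: 5 + 10

a.index = ['c', 'd', 'e']  # relabel in place; the values are untouched
after = a + b              # all three labels now match

a['d'] = 100               # a value can be reassigned by label as well

print(before)
print(after)
print(a)
```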
Summary: basic Series operations resemble those of an ndarray and a dict, and arithmetic aligns data by index.