Introduction to pandas Data Structures

Series

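A Series is a one-dimensional, array-like object made up of a set of data (of any NumPy data type) and an associated set of data labels, called its index. The simplest Series can be created from just a set of data: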

import pandas as pd
obj = pd.Series([4, 7, -5, 3])

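The string representation of a Series shows the index on the left and the values on the right. Since we did not specify an index for the data, a default integer index from 0 to N-1 (where N is the length of the data) is created automatically.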
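As a quick check of the default index, the sketch below rebuilds the same Series, prints it, and compares its labels with the integers 0 through N-1. The names `n` and `expected` are illustrative only.

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

# Printing a Series shows the index on the left and the values on the right.
print(obj)

# With no index supplied, the labels are the integers 0 .. N-1.
n = len(obj)
expected = list(range(n))
print(obj.index.tolist() == expected)
```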

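The array representation and the index object of a Series are available through its values and index attributes, respectively. (Recent pandas versions display the default index as a RangeIndex rather than the Int64Index shown here.)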

In [6]: obj.values 
Out[6]: array([ 4, 7, -5, 3]) 
In [7]: obj.index 
Out[7]: Int64Index([0, 1, 2, 3])

A Series can also be created with an index that labels each data point:

In [8]: obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c']) 
In [9]: obj2 
Out[9]: 
d    4
b    7
a   -5
c    3
In [10]: obj2.index 
Out[10]: Index([d, b, a, c], dtype=object)

You can select a single value or a set of values from a Series by index label.
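A minimal sketch of label-based selection, reusing the `obj2` Series defined earlier. The variable names `single` and `subset` are illustrative only.

```python
import pandas as pd

obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])

# A single label returns the scalar stored under that label.
single = obj2['a']

# A list of labels returns a new Series, in the order requested.
subset = obj2[['c', 'a', 'd']]

print(single)
print(subset)
```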

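NumPy array operations, such as filtering with a boolean array, scalar multiplication, or applying math functions, preserve the link between index and value: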

obj2[obj2>0]
d    4
b    7
c    3
dtype: int64

obj2*2
d     8
b    14
a   -10
c     6
dtype: int64
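The two examples above cover boolean filtering and scalar multiplication. For the third case, applying a math function, a short sketch using `np.exp` (chosen here only as an example of a NumPy function) could look like this:

```python
import numpy as np
import pandas as pd

obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])

# NumPy ufuncs operate element-wise and return a Series with the same index.
result = np.exp(obj2)
print(result)
```

Each label still points at the transformed version of its own value, so `result['a']` is the exponential of `obj2['a']`.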

If the data is stored in a Python dict, a Series can be created directly from that dict:

In [20]: sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000} 
In [21]: obj3 = pd.Series(sdata)
In [22]: obj3 
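Out[22]: 
Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64

When only a dict is passed, the index of the resulting Series is made up of the dict's keys. Recent pandas versions keep the keys in insertion order; older versions sorted them. Passing an explicit index selects and orders the keys instead: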

In [23]: states = ['California', 'Ohio', 'Oregon', 'Texas']
In [24]: obj4 = pd.Series(sdata, index=states)  # data from sdata, index labels from states
In [25]: obj4
Out[25]: 
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64

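Because sdata has no value for 'California', that entry appears as NaN (not a number), which pandas uses to mark missing or NA values. 'Utah' is not in states, so it is left out of the result. I will use "missing" or "NA" to refer to missing data. The pandas isnull and notnull functions can be used to detect missing data; Series also provides them as instance methods: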

In [26]: pd.isnull(obj4)
In [27]: pd.notnull(obj4)

In [28]: obj4.isnull()
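The calls above are shown without their results. The following sketch, which rebuilds `obj4` from the same data, illustrates what they return; `missing` and `present` are illustrative names.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj4 = pd.Series(sdata, index=states)

missing = pd.isnull(obj4)    # True where the value is NaN
present = pd.notnull(obj4)   # element-wise opposite of isnull

print(missing)
print(obj4[present])         # keep only the non-missing entries
```

Both functions return a boolean Series with the same index as the input, so the result can be used directly as a filter.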

One of the most important features of a Series is that it automatically aligns data by index label in arithmetic operations:

obj3
Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64

obj4
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64

obj3+obj4
California         NaN
Ohio           70000.0
Oregon         32000.0
Texas         142000.0
Utah               NaN
dtype: float64
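The result above can be reproduced with the sketch below. It also shows `Series.add` with `fill_value`, which is not covered in the text and is included only as one possible way to avoid NaN for labels present in just one operand.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj3 = pd.Series(sdata)
obj4 = pd.Series(sdata, index=['California', 'Ohio', 'Oregon', 'Texas'])

# Values are matched by label, not by position; the result index is the
# union of both indexes.
total = obj3 + obj4

# Labels without a value in either operand produce NaN.
print(total)

# Series.add with fill_value substitutes 0 when only one side is missing.
filled = obj3.add(obj4, fill_value=0)
print(filled['Utah'])
```

'California' stays NaN even with `fill_value`, because it has no value in either Series.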

Both the Series object itself and its index have a name attribute:

In [32]: obj4.name = 'population' 
In [33]: obj4.index.name = 'state' 
In [34]: obj4 
Out[34]: 
state
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
Name: population, dtype: float64

The index of a Series can be modified in place by assignment.
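A minimal sketch of in-place index assignment, using the first `obj` Series from this section. The new labels are arbitrary placeholder names.

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

# Assigning to .index replaces the labels in place; the new list must have
# the same length as the Series.
obj.index = ['Bob', 'Steve', 'Jeff', 'Ryan']
print(obj)
```

The values are untouched; only the labels change, so `obj['Jeff']` now returns the value that used to sit at position 2.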
