Some Simple Operations on Pandas Series

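A Series is a one-dimensional data structure, similar to a one-dimensional list or a NumPy array but more capable. It has two parts: an index and the values.
The index does not have to be numeric. It can hold meaningful labels such as names or class identifiers.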
Creating a Series

import pandas as pd
s = pd.Series([1,2,3],index = ['A','B','C'],name = 'First_Series')
'''
A    1
B    2
C    3
Name: First_Series, dtype: int64
'''
s.values
# array([1, 2, 3], dtype=int64)
type(s.values)
# numpy.ndarray
s.index
# Index(['A', 'B', 'C'], dtype='object')
s.name
# 'First_Series'
# Access values by index label
s['A']
# 1
s['A': 'B']
'''
A    1
B    2
Name: First_Series, dtype: int64
'''
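
The slice `s['A': 'B']` above returns both `A` and `B`, which differs from ordinary Python slicing. A minimal sketch of that difference, using the explicit `.loc` (label-based) and `.iloc` (position-based) accessors; the variable names `by_label` and `by_position` are illustrative only:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["A", "B", "C"], name="First_Series")

# Label-based slicing with .loc includes BOTH endpoints.
by_label = s.loc["A":"B"]

# Position-based slicing with .iloc excludes the stop position,
# just like slicing a Python list.
by_position = s.iloc[0:1]

print(by_label.tolist())       # [1, 2]
print(by_position.tolist())    # [1]
print(s.loc["C"], s.iloc[-1])  # 3 3
```

Spelling out `.loc` or `.iloc` makes it clear to the reader whether a label or a position is meant, which matters most when the index itself contains integers.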

A Series looks a lot like a dictionary, and in fact we can create one from a dict:

# Assign the result to g7_pop so it can be reused in the boolean examples later on
g7_pop = pd.Series({
    'Canada': 35.467,
    'France': 63.951,
    'Germany': 80.94,
    'Italy': 60.665,
    'Japan': 127.061,
    'United Kingdom': 64.511,
    'United States': 318.523
}, name='G7 Population in millions')
'''
Canada             35.467
France             63.951
Germany            80.940
Italy              60.665
Japan             127.061
United Kingdom     64.511
United States     318.523
Name: G7 Population in millions, dtype: float64
'''
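
The dictionary analogy extends to a few everyday operations. The following is a small sketch on a shortened three-country version of the data (the shortening and the missing label `"Spain"` are assumptions made for illustration):

```python
import pandas as pd

g7_pop = pd.Series(
    {"Canada": 35.467, "France": 63.951, "Germany": 80.94},
    name="G7 Population in millions",
)

# Membership tests look at the index labels (the "keys"), not the values.
has_canada = "Canada" in g7_pop
has_value = 35.467 in g7_pop
print(has_canada, has_value)  # True False

# .get returns a default instead of raising KeyError for a missing label.
spain = g7_pop.get("Spain", "n/a")
print(spain)  # n/a

# .items() yields (label, value) pairs; .to_dict() converts back to a dict.
for country, population in g7_pop.items():
    print(country, population)

as_dict = g7_pop.to_dict()
```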

Creating a time series

dates = pd.date_range("20220101", periods=6)
'''
DatetimeIndex(['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04',
               '2022-01-05', '2022-01-06'],
              dtype='datetime64[ns]', freq='D')
'''
pd.Series(data=range(0, 10, 2), index=pd.date_range("20220101", periods=5, freq="MS"))
'''
2022-01-01    0
2022-02-01    2
2022-03-01    4
2022-04-01    6
2022-05-01    8
Freq: MS, dtype: int64
'''
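
Once a Series has a `DatetimeIndex`, date strings can be used to select from it. A brief sketch, assuming the same month-start series as above (the names `ts` and `feb_to_apr` are illustrative):

```python
import pandas as pd

ts = pd.Series(
    data=range(0, 10, 2),
    index=pd.date_range("20220101", periods=5, freq="MS"),
)

# Slicing with partial date strings covers whole months,
# and both ends of the slice are included.
feb_to_apr = ts.loc["2022-02":"2022-04"]
print(feb_to_apr.tolist())  # [2, 4, 6]

# A full date string that matches an index entry returns a single value.
march_value = ts.loc["2022-03-01"]
print(march_value)  # 4
```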

Boolean operations on a Series

g7_pop
'''
Canada             35.467
France             63.951
Germany            80.940
Italy              60.665
Japan             127.061
United Kingdom     64.511
United States     318.523
Name: G7 Population in millions, dtype: float64
'''
g7_pop > 70
'''
Canada            False
France            False
Germany            True
Italy             False
Japan              True
United Kingdom    False
United States      True
Name: G7 Population in millions, dtype: bool
'''
g7_pop[g7_pop > 70]
'''
Germany           80.940
Japan            127.061
United States    318.523
Name: G7 Population in millions, dtype: float64
'''
g7_pop[g7_pop > g7_pop.mean()]
'''
Japan            127.061
United States    318.523
Name: G7 Population in millions, dtype: float64
'''
# Operators for combining boolean masks (wrap each comparison in parentheses):
# ~  not
# |  or
# &  and
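
The three operators listed above can be combined with the comparison masks shown earlier. A minimal sketch, where the thresholds (60, 100, 40, 200) are arbitrary values chosen for illustration:

```python
import pandas as pd

g7_pop = pd.Series({
    "Canada": 35.467,
    "France": 63.951,
    "Germany": 80.94,
    "Italy": 60.665,
    "Japan": 127.061,
    "United Kingdom": 64.511,
    "United States": 318.523,
}, name="G7 Population in millions")

# Parentheses are required: & and | bind more tightly than > and <.
mid_sized = g7_pop[(g7_pop > 60) & (g7_pop < 100)]
print(mid_sized.index.tolist())
# ['France', 'Germany', 'Italy', 'United Kingdom']

extremes = g7_pop[(g7_pop < 40) | (g7_pop > 200)]
print(extremes.index.tolist())
# ['Canada', 'United States']

# ~ inverts a mask: everything that is NOT above 70.
not_large = g7_pop[~(g7_pop > 70)]
print(not_large.index.tolist())
# ['Canada', 'France', 'Italy', 'United Kingdom']
```

Python's `and`, `or`, and `not` keywords do not work element-wise on a Series, so the symbolic operators are used instead.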
