## Series
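A Series is a one-dimensional, array-like object made up of a sequence of values and an associated set of labels, called its index. Create one by calling the pandas Series constructor.
- Creating a Series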
import pandas as pd
s = pd.Series([2,3,4,1,5])
print(s)
>>>
0 2
1 3
2 4
3 1
4 5
dtype: int64
s.values
>>>array([2, 3, 4, 1, 5])
s.index
>>>RangeIndex(start=0, stop=5, step=1)
s2 = pd.Series([3,2,4,1,5],index = ['a','b','c','d','e'])
print(s2)
>>>
a 3
b 2
c 4
d 1
e 5
dtype: int64
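# Create a Series from a dictionary (the keys become the index)
# The variable is named info rather than dict to avoid shadowing the built-in dict type
info = {'name':'joha','sex':'male','age':'18'}
s3 = pd.Series(info)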
print(s3)
>>>
name joha
sex male
age 18
dtype: object
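The three construction routes above can be collected into one runnable sketch. The variable names (from_list, from_labels, from_dict) are illustrative and do not come from the notes themselves.

```python
import pandas as pd

# Route 1: a plain list gets a default RangeIndex (0, 1, 2, ...)
from_list = pd.Series([2, 3, 4, 1, 5])

# Route 2: an explicit index supplies one label per value
from_labels = pd.Series([3, 2, 4, 1, 5], index=['a', 'b', 'c', 'd', 'e'])

# Route 3: a dict uses its keys as the index, in insertion order
from_dict = pd.Series({'name': 'joha', 'sex': 'male', 'age': '18'})

print(list(from_list.index))
print(list(from_labels.index))
print(list(from_dict.index))
```

Whichever route is used, the values and the index stay paired, so a value can always be looked up by its label.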
- Selecting one or more values from a Series by index label
s2 = pd.Series([3,2,4,1,5],index = ['a','b','c','d','e'])
# Select a single value
s2['a']
>>>3
# Select multiple values by passing a list of labels
s2[['a','c','e']]
>>>
a 3
c 4
e 5
dtype: int64
# Filter with a boolean condition: keep only the values greater than 3
s2[s2>3]
>>>
c 4
e 5
dtype: int64
# Scalar arithmetic is applied to every element
s2*3
>>>
a 9
b 6
c 12
d 3
e 15
dtype: int64
# Check whether a label is present in the index
'c' in s2
>>>True
'f' in s2
>>>False
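One detail worth spelling out: the in operator tests index labels, not values. The sketch below also uses .loc, the explicit label-based indexer in pandas, as an alternative to plain brackets. Using .loc is a suggestion here, not something the notes above do, and the result variable names are illustrative.

```python
import pandas as pd

s2 = pd.Series([3, 2, 4, 1, 5], index=['a', 'b', 'c', 'd', 'e'])

single = s2.loc['a']                 # one label -> a scalar
several = s2.loc[['a', 'c', 'e']]    # list of labels -> a new Series
large = s2[s2 > 3]                   # boolean mask keeps the matching entries

label_present = 'c' in s2            # 'c' is an index label
value_as_label = 3 in s2             # 3 is a value, not a label
value_present = 3 in s2.values       # search the underlying values explicitly

print(single, label_present, value_as_label, value_present)
```

To test for a value rather than a label, search s2.values (or use s2.isin) instead of applying in to the Series itself.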
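In arithmetic operations, Series automatically align data by index label. A label that appears in only one operand produces NaN, which is why the result below has dtype float64.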
s1 = pd.Series([3,2,4,1,5],index = ['a','b','c','d','e'])
s2 = pd.Series([3,-5,1],index = ['a','c','e'])
print(s1+s2)
>>>
a 6.0
b NaN
c -1.0
d NaN
e 6.0
dtype: float64
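When NaN for unmatched labels is not wanted, one option is the Series.add method with its fill_value parameter, which substitutes a default for a label missing from one side before adding. Choosing 0 as the fill value is an assumption made for illustration.

```python
import pandas as pd

s1 = pd.Series([3, 2, 4, 1, 5], index=['a', 'b', 'c', 'd', 'e'])
s2 = pd.Series([3, -5, 1], index=['a', 'c', 'e'])

aligned = s1 + s2                    # 'b' and 'd' exist only in s1 -> NaN
filled = s1.add(s2, fill_value=0)    # treat the missing side as 0 instead

print(aligned)
print(filled)
```

With fill_value=0, the labels 'b' and 'd' keep their values from s1 instead of becoming NaN.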
The index of a Series can be replaced in place by assigning to its index attribute.
s2 = pd.Series([3,-5,1],index = ['a','c','e'])
s2.index = [1,2,3]
print(s2)
>>>
1 3
2 -5
3 1
dtype: int64
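The new index must supply exactly one label per value. A short sketch of both the valid case and the failure case follows. The flag mismatch_rejected is a hypothetical name used only for this illustration.

```python
import pandas as pd

s2 = pd.Series([3, -5, 1], index=['a', 'c', 'e'])

# Valid: three labels for three values; the values stay in place
s2.index = [1, 2, 3]

# Invalid: two labels for three values; pandas raises ValueError
try:
    s2.index = ['x', 'y']
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(list(s2.index))
print(mismatch_rejected)
```

Assigning to index relabels the existing values without reordering them, so it is suited to renaming rather than to realigning data.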
## DataFrame
- Creating a DataFrame
```
import pandas as pd

# Each key becomes a column; each list holds that column's values
test_dict = {'id':[1,2,3,4,5,6],
             'name':['Alice','Bob','Cindy','Eric','Helen','Grace'],
             'math':[90,89,99,78,97,93],
             'english':[89,94,80,94,94,90]}
#[1]. Pass test_dict directly as a positional argument
test_dict_df = pd.DataFrame(test_dict)
print(test_dict_df)
>>>
   id   name  math  english
0   1  Alice    90       89
1   2    Bob    89       94
2   3  Cindy    99       80
3   4   Eric    78       94
4   5  Helen    97       94
5   6  Grace    93       90
#[2]. Pass the dictionary through the data keyword argument (same result)
test_dict_df = pd.DataFrame(data=test_dict)
print(test_dict_df)
>>>
   id   name  math  english
0   1  Alice    90       89
1   2    Bob    89       94
2   3  Cindy    99       80
3   4   Eric    78       94
4   5  Helen    97       94
5   6  Grace    93       90
#[3]. Use the columns argument to set the column order
test_dict_df = pd.DataFrame(test_dict,columns=['name','math','english','id'])
print(test_dict_df)
>>>
    name  math  english  id
0  Alice    90       89   1
1    Bob    89       94   2
2  Cindy    99       80   3
3   Eric    78       94   4
4  Helen    97       94   5
5  Grace    93       90   6