1. Constructing a DataFrame from a dictionary
import pandas as pd
import numpy as np
d1 = {
    'a': list(range(5)),              # a list
    'b': (i for i in range(4, 9)),    # a generator expression
    'c': np.arange(11, 16)            # a NumPy array
}
d2 = pd.DataFrame(d1)
# The index and columns parameters of DataFrame can be used to specify the row and column labels
print(d2)
Output:
   a  b   c
0  0  4  11
1  1  5  12
2  2  6  13
3  3  7  14
4  4  8  15
d2.index returns the row index
d2.columns returns the column index
d2.values returns the data as a NumPy array
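The index and columns parameters mentioned in the comment above are not demonstrated, so here is a minimal sketch of how they might be used together with the three attributes. The names data and df and the row labels 'v' to 'z' are illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd

# Illustrative data, similar in shape to the dictionary used above.
data = {
    'a': list(range(5)),
    'b': list(range(4, 9)),
    'c': np.arange(11, 16),
}

# index sets the row labels; its length must match the number of rows.
# For dict input, columns selects which keys to keep and in what order.
df = pd.DataFrame(data, index=list('vwxyz'), columns=['c', 'a'])

print(df.index.tolist())    # row labels
print(df.columns.tolist())  # column labels
print(df.values.shape)      # shape of the underlying NumPy array
```

Note that these attributes belong to the DataFrame, not to the dictionary it was built from.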
2. Constructing a DataFrame from Series objects
import pandas as pd
import numpy as np
d1 = {
    'a': pd.Series(np.arange(1, 5)),
    'b': pd.Series(np.arange(5, 9))
}
d2 = pd.DataFrame(d1)
print(d2)
Output:
   a  b
0  1  5
1  2  6
2  3  7
3  4  8
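In the example above both Series share the default integer index, so the rows line up trivially. When the indexes differ, pandas aligns the Series by label and fills the gaps with NaN. The sketch below illustrates this with made-up labels; s_a, s_b and the labels 'x', 'y', 'z' are assumptions for illustration only.

```python
import pandas as pd

# Two Series whose indexes only partly overlap (illustrative labels).
s_a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
s_b = pd.Series([10, 20], index=['y', 'z'])

# The resulting row index is the union of both indexes;
# 'b' has no value for label 'x', so that cell becomes NaN.
df = pd.DataFrame({'a': s_a, 'b': s_b})
print(df)
```

Because NaN is a floating-point value, column 'b' is stored as float rather than int in this case.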
3. Constructing a DataFrame from a nested dictionary
import pandas as pd
d = {
    'a': {
        '1': '123',
        '2': '789'
    },
    'b': {
        '1': 'asd',
        '2': '456'
    }
}
d2 = pd.DataFrame(d)
print(d2)
Output:
     a    b
1  123  asd
2  789  456
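As the output shows, the outer keys become column labels and the inner keys become row labels. If the opposite orientation is wanted, one option is DataFrame.from_dict with orient='index'. The following is a small sketch using the same dictionary; the names by_column and by_row are illustrative.

```python
import pandas as pd

d = {
    'a': {'1': '123', '2': '789'},
    'b': {'1': 'asd', '2': '456'},
}

# Default constructor: outer keys -> columns, inner keys -> rows.
by_column = pd.DataFrame(d)

# orient='index': outer keys -> rows, inner keys -> columns.
by_row = pd.DataFrame.from_dict(d, orient='index')

print(by_column)
print(by_row)
```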
4. Creating a DataFrame from a two-dimensional array
import pandas as pd
import numpy as np
n1 = np.random.randint(0, 12, size=(3, 4))
d1 = pd.DataFrame(n1)
print(d1)
Output (the values are random and will differ between runs):
   0  1   2  3
0  5  6  11  6
1  4  3   2  0
2  5  7   1  5
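Without further arguments, the rows and columns of an array-based DataFrame are numbered from 0. The sketch below shows one way to attach labels at construction time. It uses np.arange instead of random numbers so that the result is reproducible; the labels 'r0' to 'r2' and 'w' to 'z' are made up for illustration.

```python
import numpy as np
import pandas as pd

# A deterministic 3x4 array: rows [0..3], [4..7], [8..11].
arr = np.arange(12).reshape(3, 4)

# index needs one label per row, columns one label per column.
df = pd.DataFrame(arr, index=['r0', 'r1', 'r2'], columns=['w', 'x', 'y', 'z'])
print(df)
```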
5. Creating a DataFrame from a list of dictionaries
import pandas as pd
n1 = [
    {'apple': 3.5, 'banana': 2.6},
    {'apple': 3.1, 'banana': 2.5}
]
d1 = pd.DataFrame(n1)
print(d1)
Output:
   apple  banana
0    3.5     2.6
1    3.1     2.5
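Each dictionary in the list becomes one row, and the keys become column labels. The dictionaries do not need identical keys: a key missing from a row is filled with NaN. A minimal sketch, where the extra key 'cherry' is an invented example:

```python
import pandas as pd

# The second record has no 'banana' key, and the first has no 'cherry' key.
rows = [
    {'apple': 3.5, 'banana': 2.6},
    {'apple': 3.1, 'cherry': 4.0},
]

# The columns are the union of all keys; missing cells become NaN.
df = pd.DataFrame(rows)
print(df)
```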
6. Creating a DataFrame from a list of Series
import pandas as pd
import numpy as np
n1 = [
    pd.Series(np.random.rand(2)),
    pd.Series(np.random.rand(3)),
]
d1 = pd.DataFrame(n1)
print(d1)
Output (the values are random and will differ between runs):
          0         1         2
0  0.422105  0.899413       NaN
1  0.527694  0.312743  0.307411
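Unlike a dictionary of Series, where each Series becomes a column, a list of Series produces one row per Series, and shorter Series are padded with NaN. The sketch below uses fixed values to make this visible. Giving each Series a name (here the hypothetical names 'first' and 'second') should also make that name the row label.

```python
import pandas as pd

# Two Series of different lengths; the names are illustrative.
rows = [
    pd.Series([1.0, 2.0], name='first'),
    pd.Series([3.0, 4.0, 5.0], name='second'),
]

# Each Series becomes a row; the first row has no value in column 2.
df = pd.DataFrame(rows)
print(df)
```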
7. Basic DataFrame operations
(1) Transposing a DataFrame
import pandas as pd
import numpy as np
d1 = pd.DataFrame(np.random.randint(12, size=(3, 3)), index=['a', 'b', 'c'], columns=['1', '2', '3'])
print(d1.T)  # transpose the DataFrame: rows become columns and vice versa
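Since the snippet above uses random values and shows no output, here is a small deterministic sketch of what transposition does to the labels and the shape. The 2x3 layout and the names df and t are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# A 2x3 frame, so the change in shape after transposing is easy to see.
df = pd.DataFrame(np.arange(6).reshape(2, 3),
                  index=['a', 'b'], columns=['1', '2', '3'])

t = df.T  # the row labels and column labels swap places

print(df.shape, t.shape)
print(t)
```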
(2) Selecting a column by its label, which returns a Series
import pandas as pd
import numpy as np
d1 = pd.DataFrame(np.random.randint(12, size=(3, 3)), index=['a', 'b', 'c'], columns=['1', '2', '3'])
print(d1['1'])
Output (the values are random and will differ between runs):
a    1
b    5
c    1
Name: 1, dtype: int32
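A related detail worth knowing: indexing with a single label returns a Series, while indexing with a list of labels returns a DataFrame, even if the list has only one entry. A short sketch with deterministic data (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3),
                  index=['a', 'b', 'c'], columns=['1', '2', '3'])

col = df['1']          # single label: a Series
sub = df[['1', '3']]   # list of labels: a DataFrame

print(type(col).__name__)
print(type(sub).__name__)
```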
(3) Adding and deleting columns
import pandas as pd
import numpy as np
d1 = pd.DataFrame(np.random.randint(12, size=(3, 3)), index=['a', 'b', 'c'], columns=['1', '2', '3'])
d1['4'] = 10  # add column '4', with every value set to 10
d1['5'] = [1, 2, 3]  # add column '5' with the values 1, 2, 3
del d1['5']  # delete column '5'
print(d1)
Output (the values are random and will differ between runs):
    1  2  3   4
a   9  7  0  10
b   9  2  5  10
c  10  2  9  10
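Besides assignment and del, pandas offers a few methods for the same tasks. The sketch below shows insert, pop and drop on deterministic data. Which one fits best depends on whether the column should land at a specific position, be returned, or be removed without modifying the original. The names popped and trimmed are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3),
                  index=['a', 'b', 'c'], columns=['1', '2', '3'])

# insert places a new column at a given position and modifies df in place.
df.insert(0, '0', [7, 8, 9])

# pop removes a column in place and returns it as a Series.
popped = df.pop('3')

# drop returns a new DataFrame and leaves df unchanged.
trimmed = df.drop(columns=['2'])

print(df.columns.tolist())
print(trimmed.columns.tolist())
```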