This post introduces five ways to create a pandas DataFrame in Python:
1. Create a DataFrame from a dict of arrays/lists
import pandas as pd
import numpy as np
In [1]: data = pd.DataFrame({'name': ['wencky', 'stany', 'barbio'],
                             'age': [29, 29, 3],
                             'gender': ['w', 'm', 'm']})
Out[1]:
   age gender    name
0   29      w  wencky
1   29      m   stany
2    3      m  barbio
In [2]: data1 = pd.DataFrame({'a': [1, 2, 3],
                              'b': [3, 4, 5],
                              'c': [5, 6, 7]})
Out[2]:
   a  b  c
0  1  3  5
1  2  4  6
2  3  5  7
In [3]: data2 = pd.DataFrame({'one': np.random.rand(3),
                              'two': np.random.rand(3)})
Out[3]:
        one       two
0  0.001011  0.497746
1  0.088072  0.167826
2  0.583451  0.764435
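The constructor also accepts optional `columns` and `index` arguments, and it rejects lists of different lengths. The following is a minimal sketch of that behavior; the variable names and the row labels `x`, `y`, `z` are made up for illustration.

```python
import pandas as pd

# Same dict-of-lists input as in the first example.
records = {'name': ['wencky', 'stany', 'barbio'],
           'age': [29, 29, 3],
           'gender': ['w', 'm', 'm']}

# `columns` fixes the column order; `index` replaces the default 0..n-1 labels.
# The labels 'x', 'y', 'z' are illustrative only.
df = pd.DataFrame(records, columns=['name', 'age', 'gender'],
                  index=['x', 'y', 'z'])
print(df)

# Lists of unequal length cannot be aligned and raise a ValueError.
length_error = None
try:
    pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5]})
except ValueError as exc:
    length_error = exc
print('ValueError:', length_error)
```

Passing `columns` explicitly is a simple way to make the column order independent of the pandas version in use.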
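2. Create a DataFrame from a dict of Series
When a DataFrame is built from a dict of Series, the dict keys become the columns and the Series labels become the index (if a Series has no explicit labels, the default integer labels are used). A label missing from one Series is filled with NaN in that column.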
In [1]: data3 = pd.DataFrame({'one': pd.Series(np.random.rand(2), index=['a', 'b']),
                              'two': pd.Series(np.random.rand(3), index=['a', 'b', 'c'])})
Out[1]:
        one       two
a  0.470947  0.122659
b  0.584577  0.136429
c       NaN  0.396825
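To make the label alignment easier to follow than with random numbers, here is a small sketch using fixed values. The variable names are illustrative; the second frame shows what happens when the Series carry no explicit labels.

```python
import numpy as np
import pandas as pd

# Two Series with partly overlapping labels.
one = pd.Series([1.0, 2.0], index=['a', 'b'])
two = pd.Series([10.0, 20.0, 30.0], index=['a', 'b', 'c'])

# The resulting index is the union of the labels; 'one' has no value at 'c'.
aligned = pd.DataFrame({'one': one, 'two': two})
print(aligned)

# Without explicit labels, each Series gets 0, 1, 2, ... and alignment
# happens on those integer labels instead.
unlabeled = pd.DataFrame({'one': pd.Series([1.0, 2.0]),
                          'two': pd.Series([10.0, 20.0, 30.0])})
print(unlabeled)
```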
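3. Create a DataFrame from a list of dicts
When a DataFrame is built from a list of dicts, each dict becomes one row and the columns are the dict keys; if no index is specified, the default integer labels are used. Keys missing from a dict are filled with NaN.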
In [1]: data = pd.DataFrame([{'one': 1, 'two': 2}, {'one': 5, 'two': 10, 'three': 20}])
Out[1]:
   one  three  two
0    1    NaN    2
1    5   20.0   10
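The `20.0` in the output above hints at a side effect: a column that contains NaN is stored as floating point. The sketch below checks the dtypes and also shows the optional `index` and `columns` arguments; the labels `r1` and `r2` are made up for illustration.

```python
import pandas as pd

rows = [{'one': 1, 'two': 2}, {'one': 5, 'two': 10, 'three': 20}]

# Each dict is one row; a key missing from a row becomes NaN.
df = pd.DataFrame(rows)
print(df.dtypes)  # 'three' is float because it holds a NaN

# `index` labels the rows, `columns` selects and orders the keys to keep.
labeled = pd.DataFrame(rows, index=['r1', 'r2'], columns=['one', 'two'])
print(labeled)
```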
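4. Create a DataFrame from a dict of dicts
When a DataFrame is built from a dict of dicts, the outer keys become the columns and the keys of the inner dicts become the index.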
In [1]: data = pd.DataFrame({'Jack': {'math': 90, 'english': 89, 'art': 78},
                             'Marry': {'math': 82, 'english': 95, 'art': 92},
                             'Tom': {'math': 78, 'english': 67}})
Out[1]:
         Jack  Marry   Tom
art        78     92   NaN
english    89     95  67.0
math       90     82  78.0
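If the outer keys should become rows rather than columns, one option is `pd.DataFrame.from_dict` with `orient='index'`. This is a sketch of an alternative layout, not part of the original example; the variable names are illustrative.

```python
import pandas as pd

scores = {'Jack': {'math': 90, 'english': 89, 'art': 78},
          'Marry': {'math': 82, 'english': 95, 'art': 92},
          'Tom': {'math': 78, 'english': 67}}

# Default constructor: outer keys are columns, inner keys are the index.
by_subject = pd.DataFrame(scores)

# orient='index': outer keys are the index, inner keys are the columns.
by_student = pd.DataFrame.from_dict(scores, orient='index')
print(by_student)
```

In both layouts Tom's missing `art` score appears as NaN.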
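5. Create a DataFrame directly from a two-dimensional array
Creating a DataFrame directly from a two-dimensional array yields data with the same shape as the array; if index and columns are not specified, both default to integer labels.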
In [1]: data = pd.DataFrame(np.random.rand(9).reshape(3, 3), index=['a', 'b', 'c'], columns=['one', 'two', 'three'])
Out[1]:
        one       two     three
a  0.544923  0.289562  0.465923
b  0.304807  0.129171  0.387577
c  0.251819  0.135445  0.139304
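The example above passes both `index` and `columns`. The sketch below contrasts that with the default integer labels, using `np.arange` instead of random numbers so the values are reproducible; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = np.arange(9).reshape(3, 3)  # [[0 1 2], [3 4 5], [6 7 8]]

# No labels given: both index and columns default to 0, 1, 2.
default_df = pd.DataFrame(values)
print(default_df)

# Explicit labels must match the array shape (3 rows, 3 columns).
labeled_df = pd.DataFrame(values, index=['a', 'b', 'c'],
                          columns=['one', 'two', 'three'])
print(labeled_df)

# A label list of the wrong length raises a ValueError.
shape_error = None
try:
    pd.DataFrame(values, index=['a', 'b'])
except ValueError as exc:
    shape_error = exc
print('ValueError:', shape_error)
```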