2. The pandas DataFrame object

1. Constructing a DataFrame from a dictionary
import pandas as pd
import numpy as np

d1 = {
    'a': list(range(5)),
    'b': list(range(4, 9)),  # a list rather than a generator, so the column has a known length
    'c': np.arange(11, 16)
}
d2 = pd.DataFrame(d1)
# The index and columns parameters of DataFrame set the row labels and column labels
print(d2)
Output
   a  b   c
0  0  4  11
1  1  5  12
2  2  6  13
3  3  7  14
4  4  8  15
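
The comment in the code above mentions the index and columns parameters without showing them. The following is a minimal sketch of how they might be used; the labels 'x', 'y', 'z' and the extra column name 'c' are made up for illustration and do not come from the original example.

```python
import pandas as pd

data = {
    'a': [0, 1, 2],
    'b': [4, 5, 6],
}

# index supplies the row labels; its length must match the number of rows
df = pd.DataFrame(data, index=['x', 'y', 'z'])

# With dict input, columns selects and orders the dict keys.
# A name that is not a key ('c' here) becomes a column filled with NaN.
df2 = pd.DataFrame(data, index=['x', 'y', 'z'], columns=['b', 'a', 'c'])

print(df)
print(df2)
```

Note that with a dictionary as input, columns acts as a selection of keys rather than a way to rename them.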

d2.index returns the row index
d2.columns returns the column index
d2.values returns the values as a NumPy array
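
As a short sketch of these three attributes, the example below builds a small DataFrame (the data is illustrative, not taken from the example above) and prints each one. It also calls to_numpy(), which the pandas documentation recommends over values.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1], 'b': [4, 5]})

print(df.index)       # the row labels, a RangeIndex by default
print(df.columns)     # the column labels 'a' and 'b'
print(df.values)      # the cell values as a 2D NumPy array
print(df.to_numpy())  # same data; the form recommended by the pandas docs
```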

2. Constructing a DataFrame from Series objects
import pandas as pd
import numpy as np

d1 = {
    'a': pd.Series(np.arange(1, 5)),
    'b': pd.Series(np.arange(5, 9))
}
d2 = pd.DataFrame(d1)
print(d2)
Output
   a  b
0  1  5
1  2  6
2  3  7
3  4  8
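
In the example above both Series share the default integer index. When the indexes differ, pandas aligns the Series by label and fills the gaps with NaN. The sketch below illustrates this with made-up labels.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
s2 = pd.Series([10, 20], index=['y', 'z'])

# Rows are the union of both indexes; s2 has no value for 'x', so that cell is NaN
df = pd.DataFrame({'a': s1, 'b': s2})
print(df)
```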
3. Constructing a DataFrame from nested dictionaries
import pandas as pd

d = {
    'a': {
        '1': '123',
        '2': '789'
    },
    'b': {
        '1': 'asd',
        '2': '456'
    }
}

d2 = pd.DataFrame(d)
print(d2)
Output
     a    b
1  123  asd
2  789  456
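
As the output shows, the outer keys become columns and the inner keys become row labels. If the opposite layout is wanted, one option is DataFrame.from_dict with orient='index'. The sketch below reuses the same nested dictionary; the variable names are illustrative.

```python
import pandas as pd

nested = {
    'a': {'1': '123', '2': '789'},
    'b': {'1': 'asd', '2': '456'},
}

by_col = pd.DataFrame(nested)                            # outer keys -> columns
by_row = pd.DataFrame.from_dict(nested, orient='index')  # outer keys -> rows

print(by_col)
print(by_row)
```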
4. Creating a DataFrame from a 2D array
import pandas as pd
import numpy as np

n1 = np.random.randint(0, 12, size=(3, 4))
d1 = pd.DataFrame(n1)
print(d1)
Output
   0  1   2  3
0  5  6  11  6
1  4  3   2  0
2  5  7   1  5
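
Because the array is random, the values above change on every run. A seeded generator is one way to make such an example repeatable. The sketch below assumes NumPy's default_rng interface; the seed 0 and the column names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)           # fixed seed, so reruns give the same array
arr = rng.integers(0, 12, size=(3, 4))   # integers in [0, 12), like randint(0, 12)

# Without index/columns the labels default to 0, 1, 2, ...; here columns are named
df = pd.DataFrame(arr, columns=['w', 'x', 'y', 'z'])
print(df)
```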
5. Creating a DataFrame from a list of dictionaries
import pandas as pd

n1 = [
    {'apple': 3.5, 'banana': 2.6},
    {'apple': 3.1, 'banana': 2.5}
]
d1 = pd.DataFrame(n1)
print(d1)
Output
   apple  banana
0    3.5     2.6
1    3.1     2.5
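
Each dictionary in the list becomes one row, and its keys become column labels. The dictionaries do not need identical keys. In the sketch below the key 'cherry' is made up to show what happens when one is missing.

```python
import pandas as pd

rows = [
    {'apple': 3.5, 'banana': 2.6},
    {'apple': 3.1, 'cherry': 4.0},  # no 'banana' here, and a new 'cherry' key
]

# Columns are the union of all keys; cells with no value are NaN
df = pd.DataFrame(rows)
print(df)
```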
6. Creating a DataFrame from a list of Series
import pandas as pd
import numpy as np

n1 = [
    pd.Series(np.random.rand(2)),
    pd.Series(np.random.rand(3)),

]
d1 = pd.DataFrame(n1)
print(d1)
Output
          0         1         2
0  0.422105  0.899413       NaN
1  0.527694  0.312743  0.307411
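
Unlike a dictionary of Series, where each Series becomes a column, a list of Series places each Series in its own row, and shorter Series are padded with NaN. The sketch below uses fixed values, and the names 'r1' and 'r2' are illustrative; a Series name, when set, is used as the row label.

```python
import pandas as pd

rows = [
    pd.Series([1, 2], name='r1'),
    pd.Series([3, 4, 5], name='r2'),
]

# Each Series is one row; the Series index (0, 1, 2) supplies the column labels
df = pd.DataFrame(rows)
print(df)
```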
7. Basic DataFrame operations

(1) Transposing a DataFrame

import pandas as pd
import numpy as np

d1 = pd.DataFrame(np.random.randint(12, size=(3, 3)), index=['a', 'b', 'c'], columns=['1', '2', '3'])
print(d1.T)  # .T returns the transpose: rows become columns and columns become rows
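
The transpose example above prints random values and shows no output. The sketch below uses a fixed 2x3 array (an assumption made for illustration) so that the swap of row and column labels is easy to follow.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=['a', 'b'],
    columns=['1', '2', '3'],
)

t = df.T  # shape goes from (2, 3) to (3, 2); index and columns swap places
print(df)
print(t)
```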

(2) Selecting a column by its label, which returns a Series object

import pandas as pd
import numpy as np

d1 = pd.DataFrame(np.random.randint(12, size=(3, 3)), index=['a', 'b', 'c'], columns=['1', '2', '3'])
print(d1['1'])
Output
a    1
b    5
c    1
Name: 1, dtype: int32
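
A single label returns a Series, as shown above. Passing a list of labels returns a DataFrame instead, even if the list holds only one label. The sketch below uses fixed data for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(9).reshape(3, 3),
    index=['a', 'b', 'c'],
    columns=['1', '2', '3'],
)

col = df['1']         # single label -> Series
sub = df[['1', '3']]  # list of labels -> DataFrame with those columns

print(type(col).__name__)
print(type(sub).__name__)
```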

(3) Adding and deleting columns

import pandas as pd
import numpy as np

d1 = pd.DataFrame(np.random.randint(12, size=(3, 3)), index=['a', 'b', 'c'], columns=['1', '2', '3'])
d1['4'] = 10  # add column '4' with every value set to 10
d1['5'] = [1, 2, 3]  # add column '5' with the values 1, 2, 3
del d1['5']  # delete column '5'
print(d1)
Output
    1  2  3   4
a   9  7  0  10
b   9  2  5  10
c  10  2  9  10
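
Besides del, pandas offers drop and pop for removing columns. The sketch below, using fixed data chosen for illustration, shows one possible use of each: drop returns a new DataFrame and leaves the original alone, while pop removes the column in place and returns it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(9).reshape(3, 3),
    index=['a', 'b', 'c'],
    columns=['1', '2', '3'],
)
df['4'] = 10  # add a constant column

trimmed = df.drop(columns=['4'])  # new DataFrame without '4'; df itself is unchanged
popped = df.pop('3')              # removes '3' from df and returns it as a Series

print(trimmed)
print(df)
print(popped)
```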
