Sorting a DataFrame

Prerequisite: import numpy and pandas (pandas provides Series and DataFrame).

Create a DataFrame with an explicit row index and column names:

import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.arange(20).reshape(4,5),index = ['First','Second','Third','Fourth'],columns=['d','b','a','c','e'])

out[1]:
        d   b   a   c   e
First   0   1   2   3   4
Second  5   6   7   8   9
Third   10  11  12  13  14
Fourth  15  16  17  18  19

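Several ways to sort a DataFrame.

To sort df1 by its row index or by its column names, use df1.sort_index() and df1.sort_index(axis=1) respectively. The index labels are strings, so they are ordered alphabetically: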

df1.sort_index()

out[2]:

        d   b   a   c   e
First   0   1   2   3   4
Fourth  15  16  17  18  19
Second  5   6   7   8   9
Third   10  11  12  13  14

df1.sort_index(axis=1)
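
The output of the column-wise call is not shown above, so here is a minimal sketch of what it does. It rebuilds df1 exactly as defined earlier; the name by_columns is only an illustrative choice.

```python
import numpy as np
import pandas as pd

# Same df1 as in the article: columns are deliberately out of order.
df1 = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=['First', 'Second', 'Third', 'Fourth'],
    columns=['d', 'b', 'a', 'c', 'e'],
)

# axis=1 sorts the column labels; the row order is left untouched.
by_columns = df1.sort_index(axis=1)

print(by_columns.columns.tolist())  # ['a', 'b', 'c', 'd', 'e']
print(by_columns)
```

Each value stays attached to its column label, so the row First becomes 2, 1, 3, 0, 4 under the columns a, b, c, d, e.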

To sort df1 in descending order, just add the parameter ascending = False:

df1.sort_index(ascending=False)

out[3]:
        d   b   a   c   e
Third   10  11  12  13  14
Second  5   6   7   8   9
Fourth  15  16  17  18  19
First   0   1   2   3   4
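
The two parameters can also be combined. The sketch below, which again rebuilds df1 as defined earlier, sorts the column labels in descending order and checks that sort_index returns a new DataFrame rather than modifying df1. The variable name desc_cols is illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=['First', 'Second', 'Third', 'Fourth'],
    columns=['d', 'b', 'a', 'c', 'e'],
)

# Sort the column labels from largest to smallest.
desc_cols = df1.sort_index(axis=1, ascending=False)
print(desc_cols.columns.tolist())  # ['e', 'd', 'c', 'b', 'a']

# The original DataFrame keeps its column order, because
# sort_index returns a sorted copy by default.
print(df1.columns.tolist())        # ['d', 'b', 'a', 'c', 'e']
```

To keep the sorted result, assign it to a name, for example df1 = df1.sort_index().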

To make it easier to show how a DataFrame is sorted by one or more columns, create a second DataFrame named df2:

df2 = pd.DataFrame({'c':[6,3,8,-2,0],'a':[2,2,3,1,4],'b':['Jan','May','Sep','Feb','Aug']})
df2

out[4]:
        c   a   b
0   6   2   Jan
1   3   2   May
2   8   3   Sep
3   -2  1   Feb
4   0   4   Aug

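Now apply each of the following:

df2.sort_values(by = 'b') sorts df2 by column b.

df2.sort_values(by = ['b','a']) sorts df2 by column b and, where b has equal values, by column a.

df2.sort_values(by = ['a','b']) sorts df2 by column a and, where a has equal values, by column b.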
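
The three calls above can be checked with a short sketch that prints the resulting row order. It rebuilds df2 as defined earlier; the variable names and the final mixed-direction call are illustrative additions, not part of the original walkthrough.

```python
import pandas as pd

df2 = pd.DataFrame({
    'c': [6, 3, 8, -2, 0],
    'a': [2, 2, 3, 1, 4],
    'b': ['Jan', 'May', 'Sep', 'Feb', 'Aug'],
})

# Strings sort alphabetically (Aug, Feb, Jan, May, Sep), not by calendar month.
by_b = df2.sort_values(by='b')
print(by_b.index.tolist())      # [4, 3, 0, 1, 2]

# Every value in b is unique, so a never gets to break a tie.
by_b_a = df2.sort_values(by=['b', 'a'])
print(by_b_a.index.tolist())    # [4, 3, 0, 1, 2]

# Rows 0 and 1 share a == 2; b decides their order (Jan before May).
by_a_b = df2.sort_values(by=['a', 'b'])
print(by_a_b.index.tolist())    # [3, 0, 1, 2, 4]

# ascending accepts one flag per key: a ascending, b descending.
by_a_b_mixed = df2.sort_values(by=['a', 'b'], ascending=[True, False])
print(by_a_b_mixed.index.tolist())  # [3, 1, 0, 2, 4]
```

The original index labels travel with their rows, which makes it easy to see how the rows were reordered.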

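Ranking a DataFrame:

rank() assigns each value its position in sorted order instead of reordering the rows. df2.rank() ranks the values within each column (down the rows), and df2.rank(axis = 1) ranks the values within each row (across the columns):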

df2.rank()

out[5]:
    c   a   b
0   4.0 2.5 3.0
1   3.0 2.5 4.0
2   5.0 4.0 5.0
3   1.0 1.0 2.0
4   2.0 5.0 1.0
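
The 2.5 values in column a come from a tie: rows 0 and 1 both hold 2, and by default rank() gives tied values the average of the positions they occupy (2 and 3). The sketch below compares that default with the other tie-handling options of the method parameter on column a alone. It rebuilds df2 as defined earlier; the variable names are illustrative.

```python
import pandas as pd

df2 = pd.DataFrame({
    'c': [6, 3, 8, -2, 0],
    'a': [2, 2, 3, 1, 4],
    'b': ['Jan', 'May', 'Sep', 'Feb', 'Aug'],
})

# Default: tied values share the average of their positions.
avg = df2['a'].rank()
print(avg.tolist())        # [2.5, 2.5, 4.0, 1.0, 5.0]

# 'min': tied values all take the lowest position in the group.
lowest = df2['a'].rank(method='min')
print(lowest.tolist())     # [2.0, 2.0, 4.0, 1.0, 5.0]

# 'first': ties are broken by order of appearance.
first = df2['a'].rank(method='first')
print(first.tolist())      # [2.0, 3.0, 4.0, 1.0, 5.0]

# 'dense': like 'min', but ranks increase by 1 with no gaps.
dense = df2['a'].rank(method='dense')
print(dense.tolist())      # [2.0, 2.0, 3.0, 1.0, 4.0]
```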

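Column b holds strings, which cannot be compared with the integers in c and a, so the row-wise ranking is restricted to the numeric columns with numeric_only = True. Older pandas versions dropped b silently; recent versions may raise a TypeError without this parameter.

df2.rank(axis = 1,numeric_only = True,ascending = True)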

out[6]:
        c   a
0   2.0 1.0
1   2.0 1.0
2   2.0 1.0
3   1.0 2.0
4   1.0 2.0

df2.rank(axis = 1,numeric_only = True,ascending = False)

out[7]:

        c   a
0   1.0 2.0
1   1.0 2.0
2   1.0 2.0
3   2.0 1.0
4   2.0 1.0
