Pandas Index Explained

Overview

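The main index operations in pandas are:
1. DataFrame.rename
2. DataFrame.rename_axis
3. DataFrame.reindex
4. DataFrame.reindex_axis
5. DataFrame.reset_index
6. pandas.Index.reindex
7. pandas.Index.set_names
Items 1 and 2 relabel an existing index: the labels change, the data in each row stays the same, and the original object is left unmodified unless inplace=True. Items 3 and 4 add or remove labels: a label that already exists keeps its original values, and a label that does not exist is filled with missing values (NaN). Item 5 replaces the index with a new default integer index.
Items 1 to 5 return a DataFrame.
Items 6 and 7 return an index (Index.reindex returns it together with an indexer array).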
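
The difference in return types can be checked directly. The following is a minimal sketch, assuming a reasonably recent pandas release; the index name "row" is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# DataFrame methods return a new DataFrame and leave df untouched.
renamed = df.rename(index={0: 3, 1: 4, 2: 5})
print(type(renamed).__name__)   # DataFrame
print(list(df.index))           # [0, 1, 2]

# Index methods return index objects instead of a DataFrame.
named_index = df.index.set_names("row")            # "row" is a hypothetical name
new_index, indexer = df.index.reindex([1, 2, 3])   # a (new index, indexer) pair
print(isinstance(named_index, pd.Index))           # True
print(isinstance(new_index, pd.Index))             # True
```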

rename

DataFrame.rename(index=None, columns=None, **kwargs)

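Parameters

  1. index, columns : scalar, list-like, dict-like or function, optional (function / dict values must be unique, i.e. 1-to-1)
  2. copy : boolean, default True (also copy the underlying data)
  3. inplace : boolean, default False (if True, modify the original object instead of returning a new one)
  4. level : int or level name, default None (for a MultiIndex, rename labels only in the specified level)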

Returns

  1. renamed : DataFrame (new object)

Example

In [1]: import pandas as pd
   ...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
   ...: df
Out[1]: 
   A  B
0  1  4
1  2  5
2  3  6

In [2]: df.rename(index={0:3,1:4,2:5}, columns={"A": "a", "C": "c"})
Out[2]: 
   a  B
3  1  4
4  2  5
5  3  6
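
Besides a dict, index and columns also accept a function, and inplace=True modifies the object directly. A short sketch of both, with the variable names chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# A function is applied to every label of the chosen axis.
lowered = df.rename(columns=str.lower)        # columns become "a", "b"
shifted = df.rename(index=lambda i: i + 3)    # index becomes 3, 4, 5

# With inplace=True the call returns None and changes the object itself.
df_inplace = df.copy()
result = df_inplace.rename(columns={"A": "a"}, inplace=True)

print(list(lowered.columns))
print(list(shifted.index))
print(result, list(df_inplace.columns))
```

Keys that do not match any label (such as "C" in the example above) are ignored by default.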

rename_axis

DataFrame.rename_axis(mapper, axis=0, copy=True, inplace=False)

Parameters

  1. mapper : scalar, list-like, dict-like or function, optional
  2. axis : int or string, default 0
  3. copy : boolean, default True
  4. inplace : boolean, default False

Returns

  1. renamed : type of caller

Example

In [1]: import pandas as pd
   ...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
   ...: df
Out[1]: 
   A  B
0  1  4
1  2  5
2  3  6

In [2]: df.rename_axis({0:3,1:4,2:5})
Out[2]: 
   A  B
3  1  4
4  2  5
5  3  6

In [3]: df.rename_axis({"A": "a", "C": "c"},axis=1)
Out[3]: 
   a  B
0  1  4
1  2  5
2  3  6
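
The dict-based calls above reflect an older pandas release. In more recent releases, as far as I can tell, rename_axis is meant for setting the name of an axis, and relabeling with a dict is left to rename. A minimal sketch under that assumption; the names "row" and "field" are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# A scalar mapper sets the name of the axis; the labels stay as they are.
named_rows = df.rename_axis("row")             # names the index
named_cols = df.rename_axis("field", axis=1)   # names the columns axis

# Changing the labels themselves is done with rename.
relabeled = df.rename(index={0: 3, 1: 4, 2: 5})

print(named_rows.index.name, list(named_rows.index))
print(named_cols.columns.name)
print(list(relabeled.index))
```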

reindex

DataFrame.reindex(index=None, columns=None, **kwargs)

Parameters

  1. index, columns : array-like, optional (can be specified in order, or as keywords)
  2. method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional (how to fill the rows of newly added labels)
  3. copy : boolean, default True
  4. level : int or name
  5. fill_value : scalar, default np.NaN
  6. limit : int, default None
  7. tolerance : optional

Returns

  1. reindexed : DataFrame

Example

In [1]: import pandas as pd
   ...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
   ...: df
Out[1]: 
   A  B
0  1  4
1  2  5
2  3  6

In [2]: df.reindex(index=(1,2,3))
Out[2]: 
     A    B
1  2.0  5.0
2  3.0  6.0
3  NaN  NaN

In [3]: df.reindex(columns=("B","C"))
Out[3]: 
   B   C
0  4 NaN
1  5 NaN
2  6 NaN
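
The fill_value and method parameters control what goes into the rows of new labels. A small sketch of both, using the same data as above:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# fill_value replaces the default NaN for labels that did not exist.
filled = df.reindex(index=[1, 2, 3], fill_value=0)

# method="ffill" copies the values of the previous existing label instead
# (this needs a monotonic index, which the default integer index is).
padded = df.reindex(index=[1, 2, 3], method="ffill")

print(filled)
print(padded)
```

With fill_value=0 the new row 3 holds zeros, and with method="ffill" it repeats the values of row 2.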

reindex_axis

DataFrame.reindex_axis(labels, axis=0, method=None, level=None, copy=True, limit=None, fill_value=nan)

Parameters

  1. labels : array-like
  2. axis : {0 or 'index', 1 or 'columns'}
  3. method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional
  4. copy : boolean, default True
  5. level : int or name
  6. limit : int, default None
  7. tolerance : optional

Returns

  1. reindexed : DataFrame

Example

In [1]: import pandas as pd
   ...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
   ...: df
Out[1]: 
   A  B
0  1  4
1  2  5
2  3  6

In [2]: df.reindex_axis((1,2,3))
Out[2]: 
     A    B
1  2.0  5.0
2  3.0  6.0
3  NaN  NaN

In [3]: df.reindex_axis(("B","C"),axis=1)
Out[3]: 
   B   C
0  4 NaN
1  5 NaN
2  6 NaN
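
reindex_axis was deprecated in pandas 0.21 and is no longer available from pandas 1.0 on. The same results should be obtainable from reindex with an explicit axis argument; a sketch under that assumption:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Counterpart of df.reindex_axis((1, 2, 3))
rows = df.reindex([1, 2, 3], axis=0)

# Counterpart of df.reindex_axis(("B", "C"), axis=1)
cols = df.reindex(["B", "C"], axis=1)

print(rows)
print(cols)
```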

reset_index

DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

Parameters

  1. level : int, str, tuple, or list, default None
  2. drop : boolean, default False
  3. inplace : boolean, default False
  4. col_level : int or str, default 0
  5. col_fill : object, default ''

Returns

  1. resetted : DataFrame

Example

In [1]: import pandas as pd
   ...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
   ...: df=df.reindex_axis((1,2,3))
   ...: df
Out[1]: 
     A    B
1  2.0  5.0
2  3.0  6.0
3  NaN  NaN

In [2]: df.reset_index()
Out[2]: 
   index    A    B
0      1  2.0  5.0
1      2  3.0  6.0
2      3  NaN  NaN
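
The drop parameter decides whether the old index survives as a column. A brief sketch comparing the two settings, building the frame with reindex rather than reindex_axis so that it also runs on newer pandas:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}).reindex([1, 2, 3])

# Default: the old index is kept as a column named "index".
kept = df.reset_index()

# drop=True: the old index is discarded.
dropped = df.reset_index(drop=True)

print(list(kept.columns))
print(list(dropped.columns), list(dropped.index))
```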

set_index

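set_index turns one of the columns into the index, whereas reset_index replaces the current index with a new integer index counting up from 0. Unless drop=True is passed, reset_index keeps the old index as a regular column.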
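
The two methods undo each other, which the following short sketch illustrates on the same data as the earlier examples:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Column "A" becomes the index and is no longer a regular column.
indexed = df.set_index("A")
print(indexed.index.name, list(indexed.columns))

# Rows can now be looked up by the values that used to be in "A".
print(indexed.loc[2, "B"])

# reset_index moves "A" back into the columns and restores 0..n-1.
restored = indexed.reset_index()
print(list(restored.columns), list(restored.index))
```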

Index.reindex

Index.reindex(target, method=None, level=None, limit=None, tolerance=None)

Parameters

  1. target : an iterable

Returns

  1. new_index : pd.Index
  2. indexer : np.ndarray or None

Example

In [1]: import pandas as pd
   ...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
   ...: df
Out[1]: 
   A  B
0  1  4
1  2  5
2  3  6

In [2]: df.index.reindex((1,2,3))
Out[2]: (Int64Index([1, 2, 3], dtype='int64'), array([ 1,  2, -1], dtype=int64))
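
The second element of the result is an array of positions in the original index, with -1 marking target labels that were not found. A minimal sketch of how it might be read; the variable names are chosen for illustration:

```python
import pandas as pd

index = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}).index

new_index, indexer = index.reindex([1, 2, 3])

# indexer[i] is the position of new_index[i] in the original index,
# or -1 if that label does not exist there.
missing = new_index[indexer == -1]

print(indexer.tolist())
print(list(missing))
```

Here labels 1 and 2 sit at positions 1 and 2 of the original index, and label 3 is reported as missing.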

Index.set_names

Index.set_names(names, level=None, inplace=False)
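
Example

A short sketch of set_names on a flat index and on a MultiIndex; the names "row", "k1", "k2" and "letter" are made up for illustration:

```python
import pandas as pd

idx = pd.Index([0, 1, 2])

# set_names returns a new index carrying the name; idx itself is unchanged.
named = idx.set_names("row")
print(idx.name, named.name)

# With a MultiIndex, level selects which level name to set.
mi = pd.MultiIndex.from_tuples([("a", 1), ("b", 2)], names=["k1", "k2"])
renamed = mi.set_names("letter", level=0)
print(list(renamed.names))
```

To name the index of a DataFrame this way, the result can be assigned back, for example df.index = df.index.set_names("row").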
