pandas.cut(x, bins, right=True, labels=None, retbins=False, precision=3, include_lowest=False)
若labels为False则返回整数填充的Categorical或数组或Series
若retbins为True还返回用浮点数填充的N维数组
>>> pd.cut(np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1]), 3, retbins=True)
...
([(0.19, 3.367], (0.19, 3.367], (0.19, 3.367], (3.367, 6.533], ...
Categories (3, interval[float64]): [(0.19, 3.367] < (3.367, 6.533] ...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> pd.cut(np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1]),
... 3, labels=["good", "medium", "bad"])
...
[good, good, good, medium, bad, good]
Categories (3, object): [good < medium < bad]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> pd.cut(np.ones(5), 4, labels=False)
array([1, 1, 1, 1, 1])
pandas.qcut(x, q, labels=None, retbins=False, precision=3, duplicates=’raise’)
1.x
2.q,整数或分位数组成的数组。
3.labels,
4.retbins
5.precisoon
6.duplicates
结果中超过边界的值将会变成NA
>>> pd.qcut(range(5), 4)
...
[(-0.001, 1.0], (-0.001, 1.0], (1.0, 2.0], (2.0, 3.0], (3.0, 4.0]]
Categories (4, interval[float64]): [(-0.001, 1.0] < (1.0, 2.0] ...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> pd.qcut(range(5), 3, labels=["good", "medium", "bad"])
...
[good, good, medium, bad, bad]
Categories (3, object): [good < medium < bad]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pd.qcut(range(5), 4, labels=False)
array([0, 0, 1, 2, 3])