The minmax_scale() function comes from the sklearn.preprocessing package. Its signature is:
sklearn.preprocessing.minmax_scale(X, feature_range=(0, 1), axis=0, copy=True)
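It rescales each feature to a given range (0 to 1 by default).
However, some information is lost in the normalization process: the function returns only the scaled values, so the original minimum and maximum of each feature are not kept.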
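The scaling described above can be written out by hand. The following is only a minimal NumPy sketch and not the library's implementation. The name minmax_scale_sketch is hypothetical, and mapping constant columns to the lower bound is an assumption chosen to avoid dividing by zero.

```python
import numpy as np


def minmax_scale_sketch(x, feature_range=(0.0, 1.0)):
    """Column-wise min-max scaling of a 1-D or 2-D array (illustrative only)."""
    lo, hi = feature_range
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    # Assumption: avoid dividing by zero for constant columns; they map to `lo`.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    # Scale to [0, 1] first, then stretch and shift into [lo, hi].
    return (x - col_min) / safe_range * (hi - lo) + lo


# The second column is constant, so every value in it becomes the lower bound.
scaled = minmax_scale_sketch([[0, 10], [5, 10], [10, 10]])
print(scaled)
```

Each column is shifted by its minimum, divided by its range, and then mapped into the requested interval.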
Usage:
>>> from sklearn.preprocessing import minmax_scale
>>> x = [0,1,2,3,4,5]
>>> minmax_scale(x)
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
>>> y = [[0,0,0],[1,1,1],[2,2,2]]
>>> minmax_scale(y)
array([[0. , 0. , 0. ],
       [0.5, 0.5, 0.5],
       [1. , 1. , 1. ]])
>>> minmax_scale(y, axis=1)
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
>>> y = [[0,1,2],[1,2,3],[2,3,4]]
>>> minmax_scale(y)
array([[0. , 0. , 0. ],
       [0.5, 0.5, 0.5],
       [1. , 1. , 1. ]])
>>> minmax_scale(y, axis=1)
array([[0. , 0.5, 1. ],
       [0. , 0.5, 1. ],
       [0. , 0.5, 1. ]])
The function is used to normalize data.
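The examples above all use the default range. The signature also accepts feature_range and copy, so here is a small sketch of those two parameters. The input values are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# Rescale to [-1, 1] instead of the default [0, 1].
scaled = minmax_scale(x, feature_range=(-1, 1))
print(scaled)

# With the default copy=True, the input array is left unchanged.
print(x)
```

The smallest value is mapped to -1, the largest to 1, and the values in between are spaced evenly.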
The source of minmax_scale() is shown below (with comments added):
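def minmax_scale(X, feature_range=(0, 1), axis=0, copy=True):
    # sklearn.utils.check_array() validates the input and converts it to a NumPy array
    # FLOAT_DTYPES = (numpy.float64, numpy.float32, numpy.float16)
    X = check_array(X, copy=False, ensure_2d=False, warn_on_dtype=True, dtype=FLOAT_DTYPES)
    # ndim is the number of dimensions of the array
    original_ndim = X.ndim
    if original_ndim == 1:
        # A 1-D array of length n is reshaped into an n x 1 column
        X = X.reshape(X.shape[0], 1)
    # MinMaxScaler scales each feature to lie between the given minimum and maximum
    s = MinMaxScaler(feature_range=feature_range, copy=copy)
    # fit_transform first fits the scaler to the data, then scales the data
    if axis == 0:
        X = s.fit_transform(X)
    else:
        # For axis=1, transpose so rows become columns, scale, then transpose back
        X = s.fit_transform(X.T).T
    if original_ndim == 1:
        # Flatten back to 1-D; ravel() returns a view when possible,
        # so changes to it can affect the underlying array
        X = X.ravel()
    return X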
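As the source shows, the function builds a MinMaxScaler, calls fit_transform, and returns only the scaled array. When the fitted minimum and maximum are needed later, for example to scale new data the same way or to map values back to their original units, one option is to use MinMaxScaler directly. The sketch below uses made-up arrays named train and new.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: `train` is used for fitting, `new` arrives later.
train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
new = np.array([[2.5, 15.0]])

scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)

# The fitted per-column minimum and maximum stay on the scaler object.
print(scaler.data_min_, scaler.data_max_)

# Reuse the same parameters on data that was not part of the fit.
new_scaled = scaler.transform(new)
print(new_scaled)

# Map the scaled values back to the original units.
restored = scaler.inverse_transform(train_scaled)
print(restored)
```

Keeping the scaler object means the scaling can be undone and applied again, which a single minmax_scale() call does not allow.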
Reference:
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.minmax_scale.html#sklearn.preprocessing.minmax_scale
https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/preprocessing/data.py#L390
https://www.zhihu.com/question/20455227