Data Normalization Notes

   





Min-max normalization

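from sklearn import preprocessing
import numpy as np

# Training data: 3 samples (rows) x 3 features (columns)
X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])
# Pass feature_range as a tuple; (0, 1) is also the default
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
X_train_minmax = min_max_scaler.fit_transform(X_train)

# Per-feature scaling factor: (range max - range min) / (data max - data min)
print(min_max_scaler.scale_)
print()

print(X_train_minmax)
# New data is scaled with the min and max learned from the training set,
# so values outside the training range land outside [0, 1]
X_test = np.array([[-2., -1., 4.]])
X_test_minmax = min_max_scaler.transform(X_test)

print()
print(X_test_minmax)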

Output:

[ 0.5         0.5         0.33333333]

[[ 0.5         0.          1.        ]
 [ 1.          0.5         0.33333333]
 [ 0.          1.          0.        ]]

[[-1.          0.          1.66666667]]
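The numbers above can be reproduced by hand from the min-max formula x' = (x - min) / (max - min), applied per column. Below is a minimal NumPy sketch, assuming the same `X_train` and `X_test` as above and a target range of [0, 1]. The names `col_min`, `col_max`, `X_train_manual` and `X_test_manual` are only illustrative.

```python
import numpy as np

X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])
X_test = np.array([[-2., -1., 4.]])

# Per-feature (column) minimum and maximum, learned from the training data only
col_min = X_train.min(axis=0)
col_max = X_train.max(axis=0)

# x' = (x - min) / (max - min), broadcast across rows
X_train_manual = (X_train - col_min) / (col_max - col_min)
# The test row reuses the training min and max, so it may fall outside [0, 1]
X_test_manual = (X_test - col_min) / (col_max - col_min)

print(X_train_manual)
print(X_test_manual)
```

The test row comes out as roughly [-1, 0, 1.67], matching the scaler output: the first and third values lie outside the range seen during training.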

z-score normalization (zero-mean normalization)

from sklearn import preprocessing
import numpy as np

# 3 samples (rows) x 3 features (columns)
X = np.array([[ 1., -1.,  2.],
              [ 2.,  0.,  0.],
              [ 0.,  1., -1.]])
# Standardize each feature (column) to zero mean and unit variance
X_scaled = preprocessing.scale(X)
# X_scaled
# array([[ 0.  ..., -1.22...,  1.33...],
#        [ 1.22...,  0.  ..., -0.26...],
#        [-1.22...,  1.22..., -1.06...]])

# Scaled data has zero mean and unit variance:
mean = X_scaled.mean(axis=0)  # axis=0: per feature (column); axis=1: per sample (row)
print(mean)
print()
# array([ 0.,  0.,  0.])

std = X_scaled.std(axis=0)
print(std)
# array([ 1.,  1.,  1.])

# Fit a reusable scaler that stores the training mean and standard deviation
scaler = preprocessing.StandardScaler().fit(X)

print("scaler.mean_:", scaler.mean_)
# array([ 1. ...,  0. ...,  0.33...])

# scale_ holds the per-feature standard deviation (older releases exposed it as std_)
print("scaler.scale_:", scaler.scale_)
# array([ 0.81...,  0.81...,  1.24...])

train = scaler.transform(X)
print("train:", train)
# array([[ 0.  ..., -1.22...,  1.33...],
#        [ 1.22...,  0.  ..., -0.26...],
#        [-1.22...,  1.22..., -1.06...]])

# The scaler instance can then be used on new data to transform it the same way it did on the training set:
test = scaler.transform([[-1.,  1., 0.]])
print("test:", test)
# array([[-2.44...,  1.22..., -0.26...]])
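
The same result follows directly from the z-score formula z = (x - mean) / std, computed per column. Below is a minimal NumPy sketch, assuming the same `X` as above. It uses the population standard deviation (`ddof=0`, NumPy's default), which is what `StandardScaler` uses. The names `X_manual` and `x_new_manual` are only illustrative.

```python
import numpy as np

X = np.array([[1., -1., 2.],
              [2., 0., 0.],
              [0., 1., -1.]])

# Per-feature (column) mean and population standard deviation (ddof=0)
mean = X.mean(axis=0)
std = X.std(axis=0)

# z = (x - mean) / std, broadcast across rows
X_manual = (X - mean) / std

# New data is standardized with the statistics learned from the training set
x_new = np.array([[-1., 1., 0.]])
x_new_manual = (x_new - mean) / std

print(X_manual)
print(x_new_manual)
```

With `ddof=1` (the sample standard deviation) the values would differ slightly from scikit-learn's output, so the choice of `ddof` matters when comparing the two.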


Normalization by decimal scaling
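
Decimal scaling moves the decimal point of each feature: v' = v / 10^j, where j is the smallest integer such that max(|v'|) < 1. As far as I know scikit-learn has no dedicated transformer for this, so the sketch below uses plain NumPy. The helper name `decimal_scale` and the sample data are made up for illustration.

```python
import numpy as np

def decimal_scale(X):
    """Hypothetical helper: divide each column by 10**j so that max |value| < 1."""
    X = np.asarray(X, dtype=float)
    max_abs = np.abs(X).max(axis=0)
    # Smallest integer j with max_abs / 10**j < 1; all-zero columns keep j = 0
    j = np.zeros_like(max_abs)
    nonzero = max_abs > 0
    j[nonzero] = np.floor(np.log10(max_abs[nonzero])) + 1
    return X / 10.0 ** j, j

# Illustrative data: column maxima in absolute value are 986 and 45
X = np.array([[-986., 12.],
              [917., 45.],
              [200., -7.]])
X_decimal, j = decimal_scale(X)

print(j)
print(X_decimal)
```

Here the first column is divided by 10^3 and the second by 10^2, so every value ends up strictly inside (-1, 1). To scale new data consistently, reuse the `j` computed on the training set instead of recomputing it.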




