Fenghuo Programming -- Machine Learning: Ridge Regression

Ridge Regression

Description
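Ridge regression builds on polynomial regression. To curb overfitting, an L2 regularization term (alpha/2 times the sum of the squared coefficients) is added to the loss function. The penalty keeps the coefficients from growing too large, which improves the model's ability to generalize.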
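To make the loss concrete, here is a minimal NumPy sketch of the penalized objective described above. It assumes the data term is the mean squared error and that there is no intercept; the helper names `ridge_loss` and `ridge_minimizer` are made up for this illustration. Note that scikit-learn's `Ridge` writes its objective as the sum of squared residuals plus alpha times the squared L2 norm, which amounts to a rescaling of alpha relative to the form used here.

```python
import numpy as np

def ridge_loss(w, X, y, alpha):
    """MSE plus (alpha / 2) * sum of squared coefficients (no intercept)."""
    residual = X @ w - y
    return np.mean(residual ** 2) + 0.5 * alpha * np.sum(w ** 2)

def ridge_minimizer(X, y, alpha):
    """Closed-form minimizer of ridge_loss, found by setting its gradient to zero."""
    m, n = X.shape
    # Gradient: (2/m) X^T (Xw - y) + alpha * w = 0
    # =>        (X^T X + (m * alpha / 2) I) w = X^T y
    A = X.T @ X + 0.5 * m * alpha * np.eye(n)
    return np.linalg.solve(A, X.T @ y)

# Synthetic data, invented for this example
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([3.0, -2.0, 0.0, 1.0, 0.5]) + rng.normal(scale=0.5, size=50)

alpha = 1.0
w_ols = ridge_minimizer(X, y, alpha=0.0)      # alpha = 0 is plain least squares
w_ridge = ridge_minimizer(X, y, alpha=alpha)  # penalized solution

print("penalized loss at OLS solution:  ", ridge_loss(w_ols, X, y, alpha))
print("penalized loss at ridge solution:", ridge_loss(w_ridge, X, y, alpha))
print("coefficient norms (OLS, ridge):  ", np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

The ridge solution trades a slightly worse fit for smaller coefficients, so it scores lower on the penalized loss than the least-squares solution does.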
The features must be standardized first, because the L2 penalty depends on the scale of each feature.
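The following sketch illustrates why scaling matters for polynomial features. It assumes inputs drawn uniformly from [-3, 3] and a degree of 5, both chosen only for illustration. Before scaling, the higher powers have a much larger spread than the lower ones, so a single alpha would penalize them very unevenly.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Illustrative inputs (assumed range and sample size)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))

# Columns are x, x^2, ..., x^5
X_poly = PolynomialFeatures(degree=5, include_bias=False).fit_transform(X)
raw_std = X_poly.std(axis=0)

# After standardization every column has zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X_poly)
scaled_std = X_scaled.std(axis=0)

print("std before scaling:", np.round(raw_std, 2))
print("std after scaling: ", np.round(scaled_std, 2))
```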
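The hyperparameters are the highest polynomial degree and alpha. The method is computationally heavy and relatively inefficient, since the polynomial expansion enlarges the feature set and both hyperparameters need tuning.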
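One common way to tune both hyperparameters is a cross-validated grid search. The sketch below is an assumption about how that could look, not part of the original post: the candidate values in `param_grid` are arbitrary, and the step names `poly`, `ss`, and `ridge` are simply the pipeline labels used in this example.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Noisy quadratic data, invented for this example
rng = np.random.default_rng(42)
x = rng.uniform(-3, 3, size=100)
y = 0.5 * x ** 2 + 2 + rng.normal(0, 1, size=100)
X = x.reshape(-1, 1)

pipe = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("ss", StandardScaler()),
    ("ridge", Ridge()),
])

# Pipeline parameters are addressed as <step name>__<parameter name>
param_grid = {
    "poly__degree": [1, 2, 3, 5, 10],
    "ridge__alpha": [0.0001, 0.01, 1.0, 10.0],
}

# 5 degrees x 4 alphas x 5 folds = 100 model fits, which is where the cost comes from
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated MSE:", -search.best_score_)
```

The number of fits grows with the product of the grid sizes and the number of folds, so a coarse grid is usually a sensible starting point.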
Interface

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.preprocessing import StandardScaler

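def RidgeRegression(degree=2, alpha=0.0001):
    """
    Build a ridge regression model wrapped in a pipeline.
    :param degree: highest polynomial degree
    :param alpha: coefficient of the regularization term
    :return: the (unfitted) ridge regression model
    """

    ridge = Pipeline([
        # Expand the input into polynomial features up to `degree`
        ('poly', PolynomialFeatures(degree=degree, include_bias=True, interaction_only=False)),
        # Standardize the features so the L2 penalty treats them on the same scale
        ('ss', StandardScaler()),
        # Linear regression with an L2 penalty
        ('ridge', Ridge(alpha=alpha))
    ])
    return ridge

if __name__ == '__main__':
    # Synthetic data: a quadratic curve with Gaussian noise
    x = np.random.uniform(-3, 3, size=100)
    y = 0.5 * x ** 2 + 2 + np.random.normal(0, 1, 100)
    X = x.reshape(-1, 1)
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    # Fit a deliberately high-degree model and measure the test error
    ridge = RidgeRegression(degree=10, alpha=0.01)
    ridge.fit(X_train, y_train)
    y_test_predict = ridge.predict(X_test)
    err = mean_squared_error(y_test, y_test_predict)
    print(err)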
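To check that the penalty really keeps the coefficients small, the fitted `Ridge` step can be pulled out of a pipeline through `named_steps`. The sketch below is illustrative only: it rebuilds an equivalent pipeline so that it runs on its own, and the helper name `fit_coefficients` and the two alpha values are assumptions chosen for the comparison.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Same kind of noisy quadratic data as in the script above
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=100)
y = 0.5 * x ** 2 + 2 + rng.normal(0, 1, size=100)
X = x.reshape(-1, 1)

def fit_coefficients(alpha, degree=10):
    """Fit a poly -> scale -> ridge pipeline and return the ridge coefficients."""
    model = Pipeline([
        ("poly", PolynomialFeatures(degree=degree, include_bias=False)),
        ("ss", StandardScaler()),
        ("ridge", Ridge(alpha=alpha)),
    ])
    model.fit(X, y)
    # named_steps gives access to the fitted estimator inside the pipeline
    return model.named_steps["ridge"].coef_

coef_weak = fit_coefficients(alpha=0.01)     # weak regularization
coef_strong = fit_coefficients(alpha=100.0)  # strong regularization

print("coefficient norm, alpha=0.01: ", np.linalg.norm(coef_weak))
print("coefficient norm, alpha=100.0:", np.linalg.norm(coef_strong))
```

A larger alpha yields a smaller coefficient norm on the same data, which is the shrinkage effect the description refers to.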
