Non-Negative Least Squares in sklearn

In this example, we fit a linear model with a positivity constraint on the regression coefficients and compare the estimated coefficients to those of classic (unconstrained) linear regression.
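As a side note (not spelled out in the original example), the optimization problem that non-negative least squares (NNLS) solves can be written as:

```latex
\min_{w} \; \lVert X w - y \rVert_2^2 \quad \text{subject to} \quad w \ge 0
```

That is, it is ordinary least squares with every coefficient constrained to be non-negative.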

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score

# Fix the random seed so the "random" numbers are reproducible
np.random.seed(42)
# Generate a random 200 x 50 design matrix
n_samples, n_features = 200, 50
X = np.random.randn(n_samples, n_features)
true_coef = 3 * np.random.randn(n_features)
# Threshold the coefficients at zero: the scenario assumes a non-negative true model
true_coef[true_coef < 0] = 0
# Matrix-vector product: (200 x 50) matrix times a length-50 coefficient vector
y = np.dot(X, true_coef)
# Add Gaussian noise
y += 5 * np.random.normal(size=(n_samples,))
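The data-generation step above relies on shapes lining up and on the thresholding producing a sparse, non-negative true model. A standalone sanity check (reproducing the same seed and steps as the example) is sketched below:

```python
import numpy as np

# Reproduce the example's data generation with the same seed
np.random.seed(42)
n_samples, n_features = 200, 50
X = np.random.randn(n_samples, n_features)
true_coef = 3 * np.random.randn(n_features)
true_coef[true_coef < 0] = 0  # thresholding makes the true model non-negative and sparse
y = np.dot(X, true_coef) + 5 * np.random.normal(size=(n_samples,))

# Roughly half of the standard-normal draws are negative, so about half get zeroed
n_zero = int((true_coef == 0).sum())
```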

# Split the data into training and test sets
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)
# Fit non-negative least squares (NNLS)
from sklearn.linear_model import LinearRegression

reg_nnls = LinearRegression(positive=True)  # positive=True enforces non-negative coefficients
y_pred_nnls = reg_nnls.fit(X_train, y_train).predict(X_test)
r2_score_nnls = r2_score(y_test, y_pred_nnls)
print("NNLS R2 score", r2_score_nnls)

NNLS R2 score 0.8225220806196525
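For reference, `r2_score` computes the coefficient of determination, R² = 1 − SS_res / SS_tot. A minimal numpy equivalent is sketched below (the `r2` helper is my own illustration, not part of sklearn):

```python
import numpy as np

def r2(y_true, y_pred):
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Tiny worked example
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
score = r2(y_true, y_pred)  # about 0.9486
```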

# Fit ordinary least squares (OLS) for comparison
reg_ols = LinearRegression()
y_pred_ols = reg_ols.fit(X_train, y_train).predict(X_test)
r2_score_ols = r2_score(y_test, y_pred_ols)
print("OLS R2 score", r2_score_ols)

OLS R2 score 0.743692629170035

Comparing the regression coefficients of OLS and NNLS, we can observe that they are highly correlated (the dashed line is the identity line), but the non-negativity constraint shrinks some of them to 0. Non-negative least squares inherently yields sparse results: some coefficients come out exactly zero.
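The sparsity claim can be checked directly with scipy's NNLS solver, `scipy.optimize.nnls`, which solves the same constrained problem. The small synthetic dataset below is a fresh illustration (my own setup, not the data from the example above):

```python
import numpy as np
from scipy.optimize import nnls

# A small synthetic problem whose true coefficients are non-negative and sparse
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
true_coef = np.clip(3 * rng.standard_normal(20), 0, None)
y = X @ true_coef + rng.normal(size=100)

# nnls solves min ||Xw - y||_2 subject to w >= 0
coef, rnorm = nnls(X, y)
# Coefficients pinned at the constraint boundary land exactly on zero
n_zero = int(np.sum(coef == 0))
```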


fig, ax = plt.subplots()
# Scatter the two sets of regression coefficients w1, w2, ..., wp against each other
ax.plot(reg_ols.coef_, reg_nnls.coef_, linewidth=0, marker=".")
# Use the data range of both axes to bound the identity line
low_x, high_x = ax.get_xlim()
low_y, high_y = ax.get_ylim()
low = max(low_x, low_y)
high = min(high_x, high_y)
# Draw a dashed gray identity line from (low, low) to (high, high) on the same axes
ax.plot([low, high], [low, high], ls="--", c=".3", alpha=0.5)
# Label the axes and set the font weight
ax.set_xlabel("OLS regression coefficients", fontweight="bold")
ax.set_ylabel("NNLS regression coefficients", fontweight="bold")
plt.show()

[Figure 1: scatter plot of OLS vs. NNLS regression coefficients with the dashed identity line]
