实验要求:使用SVR模型实现对波士顿房价的预测 (load_boston),并使用r2-score 对回归结果评测。
同样地,这里 SVR 模型采用的是高斯核函数 kernel=‘rbf’,惩罚系数 C=1,epsilon=0.2。
我们采用以下四项指标来进行评价:
有关 SVR 模型的参数以及如何选择, 可以参考这篇博客 sklearn.svm.SVR的参数介绍
from sklearn.datasets import load_boston
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error,mean_squared_error,explained_variance_score,r2_score
from sklearn.model_selection import train_test_split
import numpy as np
#加载数据集
boston=load_boston()
x=boston.data
y=boston.target
# 拆分数据集
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=10)
# 预处理
y_train = np.array(y_train).reshape(-1, 1)
y_test = np.array(y_test).reshape(-1, 1)
x_train = StandardScaler().fit_transform(x_train)
x_test = StandardScaler().fit_transform(x_test)
y_train = StandardScaler().fit_transform(y_train).ravel()
y_test = StandardScaler().fit_transform(y_test).ravel()
#创建svR实例
svr=SVR(C=1, kernel='rbf', epsilon=0.2)
svr=svr.fit(x_train,y_train)
#预测
svr_predict=svr.predict(x_test)
#评价结果
mae = mean_absolute_error(y_test, svr_predict)
mse = mean_squared_error(y_test, svr_predict)
evs = explained_variance_score(y_test, svr_predict)
r2 = r2_score(y_test, svr_predict)
print("MAE:", mae)
print("MSE:", mse)
print("EVS:", evs)
print("R2:", r2)
从结果中可以看到,R2得分还是比较可以的,达到了 0.80。
同理,我们继续采用网格参数搜索的方式,核函数依然选择高斯核函数,我们针对 惩罚系数 C 和核函数系数 gamma ,以及 epsilon 进行调参。
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, mean_squared_error, explained_variance_score, r2_score
import numpy as np
# 加载数据集
boston_data = load_boston()
# print(boston_data)
# 拆分数据集
x = boston_data.data
y = boston_data.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=10)
# 预处理
y_train = np.array(y_train).reshape(-1, 1)
y_test = np.array(y_test).reshape(-1, 1)
x_train = StandardScaler().fit_transform(x_train)
x_test = StandardScaler().fit_transform(x_test)
y_train = StandardScaler().fit_transform(y_train).ravel()
y_test = StandardScaler().fit_transform(y_test).ravel()
# 设置超参数
C = [0.1, 0.2, 0.5, 0.8, 0.9, 1, 2, 5, 10]
kernel = 'rbf'
gamma = [0.001, 0.01, 0.1, 0.2, 0.5, 0.8]
epsilon = [0.01, 0.05, 0.1, 0.2, 0.5, 0.8]
# 参数字典
params_dict = {
'C': C,
'gamma': gamma,
'epsilon': epsilon
}
# 创建SVR实例
svr = SVR()
# 网格参数搜索
gsCV = GridSearchCV(
estimator=svr,
param_grid=params_dict,
n_jobs=2,
scoring='r2',
cv=6
)
gsCV.fit(x_train, y_train)
# 输出参数信息
print("最佳度量值:", gsCV.best_score_)
print("最佳参数:", gsCV.best_params_)
print("最佳模型:", gsCV.best_estimator_)
# 用最佳参数生成模型
svr = SVR(C=gsCV.best_params_['C'], kernel=kernel, gamma=gsCV.best_params_['gamma'],
epsilon=gsCV.best_params_['epsilon'])
# 获取在训练集的模型
svr.fit(x_train, y_train)
# 预测结果
svr_predict = svr.predict(x_test)
# 模型评测
mae = mean_absolute_error(y_test, svr_predict)
mse = mean_squared_error(y_test, svr_predict)
evs = explained_variance_score(y_test, svr_predict)
r2 = r2_score(y_test, svr_predict)
print("MAE:", mae)
print("MSE:", mse)
print("EVS:", evs)
print("R2:", r2)