Using Grid Search in scikit-learn to Find a Model's Best Parameters

1. Grid search finds a model's best parameters by trying every combination in a predefined grid

First, import the dependencies

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn import metrics
import numpy as np
import pandas as pd

2. Set the parameters to search

params = {
    'learning_rate': np.linspace(0.05, 0.25, 5),
    'max_depth': list(range(1, 8)),
    'min_samples_leaf': list(range(1, 5)),
    'n_estimators': list(range(50, 100, 10)),
}
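To get a feel for how large this grid is, the sketch below enumerates every combination with the standard library only. The learning rates are written out by hand (they match np.linspace(0.05, 0.25, 5)), and the 10-fold figure assumes the cv=10 setting used in the next step.

```python
from itertools import product

# Same grid as above; learning rates are written out by hand
# (equivalent to np.linspace(0.05, 0.25, 5)) to stay stdlib-only.
params = {
    "learning_rate": [0.05, 0.10, 0.15, 0.20, 0.25],
    "max_depth": list(range(1, 8)),
    "min_samples_leaf": list(range(1, 5)),
    "n_estimators": list(range(50, 100, 10)),
}

# Grid search evaluates the Cartesian product of all value lists.
names = list(params)
combinations = [dict(zip(names, values)) for values in product(*params.values())]

n_candidates = len(combinations)
n_cv_fits = n_candidates * 10  # assumption: 10-fold cross-validation (cv=10)

print(n_candidates)  # 700
print(n_cv_fits)     # 7000
```

With 700 candidates and 10 folds each, the search trains the model 7000 times during cross-validation, so it is worth keeping the grid small while experimenting.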

3. Set up the model and the evaluation metric, then train the model on each parameter combination

clf = GradientBoostingClassifier()
grid = GridSearchCV(clf, params, cv=10, scoring="f1")
grid.fit(X, y)
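The snippet above assumes X and y already exist. As a minimal runnable sketch, the version below generates synthetic binary data with make_classification (a stand-in, not the original dataset) and uses a deliberately tiny grid, here called small_params, so that it finishes quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data (assumption: stands in for X, y).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Tiny illustrative grid: 2 x 2 = 4 candidates.
small_params = {"max_depth": [1, 2], "n_estimators": [20, 40]}

clf = GradientBoostingClassifier(random_state=0)
grid = GridSearchCV(clf, small_params, cv=3, scoring="f1")
grid.fit(X, y)

print(grid.best_params_)
print(grid.best_score_)
```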

Possible values of scoring include:

  • Classification
    scoring                     function                          comment
    accuracy                    metrics.accuracy_score
    average_precision           metrics.average_precision_score
    f1                          metrics.f1_score                  for binary targets
    f1_micro                    metrics.f1_score                  micro-averaged
    f1_macro                    metrics.f1_score                  macro-averaged
    f1_weighted                 metrics.f1_score                  weighted average
    f1_samples                  metrics.f1_score                  by multilabel sample
    neg_log_loss                metrics.log_loss                  requires predict_proba support
    precision etc.              metrics.precision_score           suffixes apply as with “f1”
    recall etc.                 metrics.recall_score              suffixes apply as with “f1”
    roc_auc                     metrics.roc_auc_score
  • Clustering
    scoring                     function                          comment
    adjusted_rand_score         metrics.adjusted_rand_score
  • Regression
    scoring                     function                          comment
    neg_mean_absolute_error     metrics.mean_absolute_error
    neg_mean_squared_error      metrics.mean_squared_error
    neg_median_absolute_error   metrics.median_absolute_error
    r2                          metrics.r2_score
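The neg_ prefix in the list above means the scorer returns the negated loss, so that a higher score is always better during the search. A small sketch of this, using get_scorer on a toy regression problem (the data and the DummyRegressor baseline are illustrative assumptions):

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import get_scorer, mean_squared_error

# Toy data; the baseline model always predicts the mean of y.
X = np.arange(6).reshape(-1, 1)
y = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
model = DummyRegressor(strategy="mean").fit(X, y)

# A scorer is called as scorer(estimator, X, y).
scorer = get_scorer("neg_mean_squared_error")
score = scorer(model, X, y)

# The scorer value is the mean squared error with its sign flipped.
mse = mean_squared_error(y, model.predict(X))
print(score, mse)
```

This is why grid.best_score_ comes out negative when one of the neg_ scorers is used.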

4. View the best score and the best parameters

grid.best_score_    # best mean cross-validated score (here, f1)
grid.best_params_   # parameter combination that achieved it
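Beyond the single best result, the fitted search object also keeps the scores of every candidate in cv_results_. The sketch below loads them into a pandas DataFrame and ranks them; the synthetic data and the tiny grid are assumptions chosen only to keep the example self-contained.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data and a tiny grid (illustrative assumptions).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"max_depth": [1, 2], "n_estimators": [20, 40]},
    cv=3,
    scoring="f1",
)
grid.fit(X, y)

# One row per candidate, best-ranked first.
results = pd.DataFrame(grid.cv_results_).sort_values("rank_test_score")
columns = ["params", "mean_test_score", "std_test_score", "rank_test_score"]
print(results[columns])
```

Looking at std_test_score next to mean_test_score can help judge whether the top candidates really differ or are within noise of each other.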


5. Get the best model

grid.best_estimator_


6. Use the best model to make predictions

best_model = grid.best_estimator_
predict_y = best_model.predict(Test_X)
# Test_y (assumed name): the true labels for Test_X
metrics.f1_score(Test_y, predict_y)
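For the test score to be meaningful, the test set has to be kept out of the search. One possible way to set this up, sketched with synthetic data and train_test_split (the names Train_X, Train_y and Test_y are assumptions following the Test_X naming above):

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data, split into a training part and a held-out test part.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
Train_X, Test_X, Train_y, Test_y = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Search on the training part only (tiny illustrative grid).
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"max_depth": [1, 2], "n_estimators": [20, 40]},
    cv=3,
    scoring="f1",
)
grid.fit(Train_X, Train_y)

# With the default refit=True, best_estimator_ is retrained on all of Train_X.
best_model = grid.best_estimator_
predict_y = best_model.predict(Test_X)
test_f1 = metrics.f1_score(Test_y, predict_y)
print(test_f1)
```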
