Efficient Hyperparameter Optimization for XGBoost Model Using Optuna

Introduction:

Hyperparameter optimization is the science of tuning or choosing the best set of hyperparameters for a learning algorithm. An optimal set of hyperparameters has a big impact on the performance of any machine learning algorithm. It is one of the most time-consuming yet crucial steps in a machine learning training pipeline.

A machine learning model has two types of tunable parameters:

· Model parameters

· Model hyperparameters

Model parameters are learned during the training phase of a model or classifier. For example:

  • coefficients in logistic regression or linear regression

  • weights in an artificial neural network

Model hyperparameters are set by the user before the model training phase. For example:

  • ‘C’ (inverse regularization strength), ‘penalty’ and ‘solver’ in logistic regression

  • ‘learning rate’, ‘batch size’, ‘number of hidden layers’ etc. in an artificial neural network

The choice of machine learning model depends on the dataset and on the task at hand, such as regression or classification. Each model has its own set of hyperparameters, and the task of finding the best combination of these hyperparameters is known as hyperparameter optimization.

Various methods are available for solving the hyperparameter optimization problem. For example:

  • Grid Search

  • Random Search

  • Optuna

  • HyperOpt
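
To make the difference between the first two methods concrete, here is a minimal sketch using only the Python standard library. The function toy_score is a hypothetical stand-in for a validation score (it is not taken from this post), and the parameter names and ranges are chosen purely for illustration. Grid search evaluates every combination of a fixed list of values, while random search draws each trial independently from a range.

```python
import itertools
import random


def toy_score(learning_rate, max_depth):
    """Hypothetical stand-in for a validation score; peaks at (0.1, 6)."""
    return 1.0 - 100 * (learning_rate - 0.1) ** 2 - 0.01 * (max_depth - 6) ** 2


def grid_search(grid):
    """Evaluate every combination of the listed values and keep the best."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = toy_score(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


def random_search(n_trials, seed=0):
    """Evaluate n_trials independently sampled combinations and keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.5),
            "max_depth": rng.randint(2, 10),
        }
        score = toy_score(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


grid = {"learning_rate": [0.01, 0.1, 0.5], "max_depth": [2, 6, 10]}
grid_best = grid_search(grid)            # 3 x 3 = 9 evaluations
random_best = random_search(n_trials=9)  # same budget, sampled at random
print("grid search  :", grid_best)
print("random search:", random_best)
```

Both strategies ignore the results of earlier trials when choosing the next point. Libraries such as Optuna and HyperOpt instead use past results to decide where to sample next.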

In this post, we will focus on the Optuna library, which has one of the most accurate and successful hyperparameter optimization strategies.

Optuna:

Optuna is an open source hyperparameter optimization (HPO) framework that automates the search over a hyperparameter space. To find an optimal set of hyperparameters, Optuna uses a Bayesian method by default. It supports various types of samplers, listed below:

  • GridSampler (using grid search)

  • RandomSampler (using random sampling)

  • TPESampler (using Tree-structured Parzen Estimator algorithm)

  • CmaEsSampler (using CMA-ES algorithm)
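
The Tree-structured Parzen Estimator is the least self-explanatory entry in this list, so here is a toy one-dimensional sketch of the idea behind it, written with the standard library only. This is not Optuna's implementation: the fixed bandwidth, the gamma split, the candidate count and all function names are assumptions made for illustration. The idea is to split past trials into a "good" and a "bad" group, model each group with a Parzen (kernel density) estimate, and pick the candidate where the good density is largest relative to the bad one.

```python
import math
import random


def parzen_density(x, centers, bandwidth):
    """Parzen (kernel density) estimate: mean of Gaussian bumps on past points."""
    if not centers:
        return 0.0
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    return total / (len(centers) * norm)


def tpe_suggest(history, low, high, rng, gamma=0.25, n_candidates=24, bandwidth=0.5):
    """Propose the next x given past (x, loss) pairs; lower loss is better."""
    ranked = sorted(history, key=lambda pair: pair[1])
    n_good = max(1, int(gamma * len(ranked)))
    good = [x for x, _ in ranked[:n_good]]  # best gamma fraction, models l(x)
    bad = [x for x, _ in ranked[n_good:]]   # all other trials, models g(x)
    # Draw candidates from l(x): pick a good point, add kernel noise, clip to range.
    candidates = [
        min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
        for _ in range(n_candidates)
    ]
    # Keep the candidate with the largest ratio l(x) / g(x).
    return max(
        candidates,
        key=lambda x: parzen_density(x, good, bandwidth)
        / (parzen_density(x, bad, bandwidth) + 1e-12),
    )


def minimize(objective, low, high, n_startup=10, n_trials=60, seed=0):
    """Run random warm-up trials, then let the toy TPE rule choose each new x."""
    rng = random.Random(seed)
    history = []
    for i in range(n_trials):
        if i < n_startup:
            x = rng.uniform(low, high)
        else:
            x = tpe_suggest(history, low, high, rng)
        history.append((x, objective(x)))
    return min(history, key=lambda pair: pair[1]), history


best, history = minimize(lambda x: (x - 2.0) ** 2, low=-10.0, high=10.0)
print("best x = %.3f, loss = %.4f" % best)
```

Optuna's TPESampler handles several parameters, mixed parameter types and adaptive bandwidths, none of which this sketch attempts.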

The use of Optuna for hyperparameter optimization is illustrated with the Credit Card Fraud Detection dataset on Kaggle. The task is to classify a credit card transaction as fraudulent or genuine (binary classification). The data contains only numerical input variables, which are the result of a PCA transformation of the original features. Due to confidentiality issues, the original features and further background information about the data are not available.

In this case, we use only a subset of the dataset to speed up training and to ensure that the two classes are perfectly balanced. The sampler used here is TPESampler. A subset of the dataset is shown in the figure below:

Figure: A subset of the Credit Card Fraud Detection dataset
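
The post does not say how the balanced subset was built. One common way, shown here purely as an assumption, is random undersampling: keep all rows of the rare class and draw the same number of rows from the other class. The sketch below uses the standard library and hypothetical toy rows; the column names "Amount" and "Class" are placeholders for illustration.

```python
import random


def balanced_subset(rows, label_of, seed=0):
    """Undersample so every class has as many rows as the rarest class."""
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    n_per_class = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    subset = []
    for group in by_class.values():
        subset.extend(rng.sample(group, n_per_class))
    rng.shuffle(subset)  # avoid having all rows of one class first
    return subset


# Hypothetical toy data: 1000 genuine (Class 0) and 20 fraudulent (Class 1) rows.
rows = [{"Amount": float(i), "Class": 0} for i in range(1000)]
rows += [{"Amount": float(i), "Class": 1} for i in range(20)]

subset = balanced_subset(rows, label_of=lambda row: row["Class"])
counts = {c: sum(row["Class"] == c for row in subset) for c in (0, 1)}
print(counts)  # {0: 20, 1: 20}
```

Undersampling discards most of the majority class, which is acceptable for a quick tuning demonstration but may not be what you want for a production model.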

Importing the required packages:

import joblib
import optuna
from optuna import Trial, visualization
from optuna.samplers import TPESampler
from sklearn.model_selection import cross_val_score, train_test_split
from xgboost import XGBClassifier

The main steps involved in HPO using Optuna for an XGBoost model are as follows:

1. Define Objective Function: The first step is to define an objective function. The objective function should return a real value that is to be minimized or maximized. In our case, we train an XGBoost model and use the cross-validation score for evaluation. The objective function returns this cross-validation score, which is to be maximized.

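2. Define Hyperparameter Search Space: Optuna supports five kinds of hyperparameter distributions, given below. (Since Optuna 3.0, the uniform, discrete-uniform and loguniform methods shown here are deprecated in favor of trial.suggest_float with its step and log arguments.)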

  • Integer parameter : A uniform distribution on integers.

    n_estimators = trial.suggest_int('n_estimators', 100, 500)

  • Categorical parameter : A categorical distribution.

    criterion = trial.suggest_categorical('criterion', ['gini', 'entropy'])

  • Uniform parameter : A uniform distribution in the linear domain.

    subsample = trial.suggest_uniform('subsample', 0.2, 0.8)

  • Discrete-uniform parameter : A discretized uniform distribution in the linear domain.

    max_features = trial.suggest_discrete_uniform('max_features', 0.05, 1, 0.05)

  • Loguniform parameter : A uniform distribution in the log domain.

    learning_rate = trial.suggest_loguniform('learning_rate', 1e-6, 1e-3)
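
The practical difference between the three floating-point distributions is easiest to see by sampling from them. The sketch below imitates them with the standard library; the helper names are hypothetical and this is not how Optuna draws values internally. It counts how many samples from the range 1e-6 to 1e-3 fall below the geometric midpoint of that range.

```python
import math
import random

rng = random.Random(0)


def sample_uniform(low, high):
    """Uniform in the linear domain."""
    return rng.uniform(low, high)


def sample_loguniform(low, high):
    """Uniform in the log domain: every order of magnitude is equally likely."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_discrete_uniform(low, high, q):
    """Uniform over the grid low, low + q, low + 2q, ..., high."""
    n_steps = int(round((high - low) / q))
    return low + q * rng.randint(0, n_steps)


low, high, n = 1e-6, 1e-3, 10000
geo_mid = math.sqrt(low * high)  # about 3.16e-5

frac_uniform = sum(sample_uniform(low, high) < geo_mid for _ in range(n)) / n
frac_loguniform = sum(sample_loguniform(low, high) < geo_mid for _ in range(n)) / n
grid_values = [sample_discrete_uniform(0.05, 1.0, 0.05) for _ in range(n)]

print("uniform    below geometric midpoint:", frac_uniform)
print("loguniform below geometric midpoint:", frac_loguniform)
print("distinct discrete values:", len({round(v, 2) for v in grid_values}))
```

A linear uniform draw almost never lands among the small values, while a log-uniform draw spends about half of its samples there. That is why a log-uniform distribution is the usual choice for parameters such as the learning rate, whose useful values span several orders of magnitude.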

The code below shows the objective function and the hyperparameter search space for our example.

def objective(trial: Trial, X, y) -> float:
    # Checkpoint the module-level study object at the start of every trial.
    joblib.dump(study, 'study.pkl')

    # Hold out 30% of the data; the score below is computed on this split.
    train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.30, random_state=101)

    # Search space: one suggest_* call per hyperparameter.
    param = {
        'n_estimators': trial.suggest_int('n_estimators', 0, 1000),
        'max_depth': trial.suggest_int('max_depth', 2, 25),
        'reg_alpha': trial.suggest_int('reg_alpha', 0, 5),
        'reg_lambda': trial.suggest_int('reg_lambda', 0, 5),
        'min_child_weight': trial.suggest_int('min_child_weight', 0, 5),
        'gamma': trial.suggest_int('gamma', 0, 5),
        'learning_rate': trial.suggest_loguniform('learning_rate', 0.005, 0.5),
        'colsample_bytree': trial.suggest_discrete_uniform('colsample_bytree', 0.1, 1, 0.01),
        'nthread': -1,
    }
    model = XGBClassifier(**param)

    model.fit(train_X, train_y)

    # cross_val_score refits clones of the model on folds of the held-out split
    # and returns one score per fold; the mean is the value Optuna maximizes.
    return cross_val_score(model, test_X, test_y).mean()
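
To show what a mean cross-validation score is, here is a small standard-library sketch of k-fold cross-validation. It is a simplification and not scikit-learn's code: the folds are plain contiguous blocks (scikit-learn stratifies the folds for classifiers), and the "model" is a hypothetical majority-class predictor chosen so that the numbers can be checked by hand.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def cross_val_mean(fit, score, X, y, k=5):
    """Fit on k-1 folds, score on the remaining fold, and average the k scores."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score(model, [X[i] for i in val_idx], [y[i] for i in val_idx]))
    return sum(scores) / len(scores)


def fit_majority(X_train, y_train):
    """Hypothetical toy model: remember the most common training label."""
    return max(set(y_train), key=y_train.count)


def accuracy(model, X_val, y_val):
    """Fraction of validation labels equal to the remembered label."""
    return sum(label == model for label in y_val) / len(y_val)


X = list(range(10))
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
mean_score = cross_val_mean(fit_majority, accuracy, X, y, k=5)
print(mean_score)  # fold accuracies are 1, 1, 1, 0, 0.5, so the mean is 0.7
```

The objective function above returns exactly this kind of averaged number, which gives Optuna a less noisy signal than a single train/test split would.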

3. Study Objective: A few terms from the Optuna documentation will make our work easier. They are as follows:

  • Trial : A single call of the objective function

  • Study : An optimization session, which is a set of trials

  • Parameter : A variable whose value is to be optimized, such as the value of “n_estimators”

The Study object is used to manage the optimization process. The function create_study() returns a study object, which has useful properties for analyzing the optimization outcome. In create_study(), we define the direction of the objective function, i.e. “maximize” or “minimize”, and the sampler, for example TPESampler(). After creating the study, we call its optimize() method.

study = optuna.create_study(direction='maximize',sampler=TPESampler())
study.optimize(lambda trial : objective(trial,X,Y),n_trials= 50)

4. Best Trial and Result: Once the optimization process has completed, we can obtain the best parameter values and the corresponding value of the objective function.

print('Best trial: score {},\nparams {}'.format(study.best_trial.value,study.best_trial.params))
Best trial: score 0.9427118644067797,
params {'n_estimators': 396, 'max_depth': 6, 'reg_alpha': 3, 'reg_lambda': 3, 'min_child_weight': 2, 'gamma': 0, 'learning_rate': 0.09041583301198859, 'colsample_bytree': 0.45999999999999996}

5. Trial History: We can get the entire history of all trials as a data frame by calling study.trials_dataframe().

hist = study.trials_dataframe()
hist.head()

6. Visualizations:

Visualizing the hyperparameter search space can be very useful. From the visualization, we can gain some useful information on the interaction between parameters and we can see where to search next. The optuna.visualization module includes a set of useful visualizations.

i) plot_optimization_history(study): plots the optimization history of all trials as well as the best score at each point.

Figure: Optimization History Plot

ii) plot_slice(study): plots the relationship between each parameter and the score as a slice. It also shows which parts of the search space were explored more.

Figure: Slice Plot

iii) plot_parallel_coordinate(study): plots an interactive visualization of the high-dimensional relationship between the parameters in the study and the scores.

Figure: Parallel Coordinate Plot

iv) plot_contour(study): plots an interactive chart of parameter pairs, from which we can choose which region of the hyperparameter space to explore.

Figure: Contour Plot

Overall, the visualizations in Optuna are amazing!

Summary: Optuna is a good HPO framework and is easy to use. It has good documentation and visualization features. It has become my first choice for hyperparameter optimization.

References:

  1. Optuna: A hyperparameter optimization framework. https://optuna.readthedocs.io/en/stable/index.html

  2. Optuna GitHub project: https://github.com/optuna

Translated from: https://medium.com/swlh/efficient-hyperparameter-optimization-for-xgboost-model-using-optuna-3ee9a02566b1
