Data Analysis and Mining Examples (4): Parameter Tuning

Parameter Tuning

  • Logistic regression
  • SVM
  • Decision tree
  • Random forest
  • XGBoost

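Tune the five models with grid search, using five-fold cross-validation during tuning, then evaluate each model and show the output of the code.
The imports are as follows: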

import pandas as pd
from sklearn import linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score,accuracy_score,roc_auc_score,roc_curve
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn import tree
import graphviz
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
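
Before the per-model code, here is a minimal, self-contained sketch of what grid search with five-fold cross-validation does. The synthetic data from make_classification, the 'C' grid, and the 'f1_macro' metric are assumptions made for illustration; the X_train, X_test, y_train and y_test used in the rest of this post come from the earlier data preparation, which is not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (an assumption; not the dataset used in this post).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Every value of C is scored on 5 train/validation splits of the training set.
param_grid = {'C': [0.1, 1, 10]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid,
                      cv=5, scoring='f1_macro')
search.fit(X_train, y_train)

print(search.best_params_)            # best combination found
print(search.best_score_)             # its mean cross-validated f1_macro
print(search.score(X_test, y_test))   # f1_macro of the refitted model on the test set
```

With the default refit=True, GridSearchCV retrains the best combination on the whole training set, which is why predict and score can be called on the search object directly.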

Logistic Regression

The code and results are as follows:

# Penalty names must be lowercase ('l1'/'l2'); the liblinear solver supports both.
tuned_parameters = [{'penalty': ['l1', 'l2'], 'dual': [False], 'tol': [1e-4], 'C': [0.5, 1]}]
scores = ['precision', 'recall']
for score in scores:
    # Tune once per metric, with five-fold cross-validation.
    model = GridSearchCV(linear_model.LogisticRegression(solver='liblinear'),
                         tuned_parameters, cv=5, scoring='%s_macro' % score)
    model.fit(X_train, y_train)
    print('Logistic regression, best parameters (%s_macro):' % score)
    print(model.best_params_)
    # Accuracy of the refitted best model on the held-out test set.
    y_pred = model.predict(X_test)
    print(accuracy_score(y_test, y_pred))


SVM

The code and results are as follows:

# gamma only affects the rbf kernel, so the linear kernel gets its own grid.
tuned_parameters = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4], 'C': [1, 10, 100, 1000]},
                    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]}]
scores = ['precision', 'recall']
for score in scores:
    # Tune once per metric, with five-fold cross-validation.
    model = GridSearchCV(SVC(), tuned_parameters, cv=5,
                         scoring='%s_macro' % score)
    model.fit(X_train, y_train)
    print('SVM, best parameters (%s_macro):' % score)
    print(model.best_params_)
    # Accuracy of the refitted best model on the held-out test set.
    y_pred = model.predict(X_test)
    print(accuracy_score(y_test, y_pred))
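
SVMs are sensitive to feature scale. If the features have not already been standardized, one option is to put StandardScaler and SVC in a Pipeline, so the scaler is fitted only on the training part of each fold. The sketch below is an illustration rather than the code used in this post: the synthetic data, the step names 'scale' and 'svc', and the reduced grid are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; not the dataset used in this post).
X, y = make_classification(n_samples=200, n_features=6, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Scaling is part of the estimator, so each CV fold fits its own scaler.
pipe = Pipeline([('scale', StandardScaler()), ('svc', SVC())])

# Parameters of a pipeline step are addressed as '<step name>__<parameter>'.
param_grid = [
    {'svc__kernel': ['rbf'], 'svc__gamma': [1e-3, 1e-4], 'svc__C': [1, 10, 100]},
    {'svc__kernel': ['linear'], 'svc__C': [1, 10]},
]
search = GridSearchCV(pipe, param_grid, cv=5, scoring='recall_macro')
search.fit(X_train, y_train)

print(search.best_params_)
print(search.score(X_test, y_test))  # recall_macro on the test set
```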


Decision Tree

The code and results are as follows:

tuned_parameters = [{'criterion': ['gini'], 'max_depth': [10, 20]}]
scores = ['precision', 'recall']
for score in scores:
    # cv=5 is set explicitly so that tuning uses five-fold cross-validation.
    model = GridSearchCV(DecisionTreeClassifier(), tuned_parameters, cv=5,
                         scoring='%s_macro' % score)
    model.fit(X_train, y_train)
    print('Decision tree, best parameters (%s_macro):' % score)
    print(model.best_params_)
    # Accuracy of the refitted best model on the held-out test set.
    y_pred = model.predict(X_test)
    print(accuracy_score(y_test, y_pred))
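
best_params_ reports only the winning combination. To see how every candidate performed, the cv_results_ attribute can be loaded into a pandas DataFrame. The sketch below assumes synthetic data and a slightly larger grid than the one above, purely for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the dataset used in this post).
X, y = make_classification(n_samples=300, n_features=8, random_state=2)

search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      {'criterion': ['gini', 'entropy'], 'max_depth': [3, 5, 10]},
                      cv=5, scoring='precision_macro')
search.fit(X, y)

# One row per parameter combination, sorted from best to worst.
results = pd.DataFrame(search.cv_results_)
cols = ['param_criterion', 'param_max_depth',
        'mean_test_score', 'std_test_score', 'rank_test_score']
summary = results[cols].sort_values('rank_test_score')
print(summary.to_string(index=False))
```

The std_test_score column shows how much the score varies across the five folds, which helps judge whether the gap between two candidates is meaningful.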


Random Forest

tuned_parameters = [{'n_estimators': [5, 10, 15], 'criterion': ['gini'], 'max_features': [2, 4]}]
scores = ['precision', 'recall']
for score in scores:
    # Tune once per metric, with five-fold cross-validation.
    model = GridSearchCV(RandomForestClassifier(), tuned_parameters, cv=5,
                         scoring='%s_macro' % score)
    model.fit(X_train, y_train)
    print('Random forest, best parameters (%s_macro):' % score)
    print(model.best_params_)
    # Accuracy of the refitted best model on the held-out test set.
    y_pred = model.predict(X_test)
    print(accuracy_score(y_test, y_pred))


XGBoost

tuned_parameters = [{'base_score': [0.5, 0.6], 'booster': ['gbtree']}]
scores = ['precision', 'recall']
for score in scores:
    # Tune once per metric, with five-fold cross-validation.
    model = GridSearchCV(XGBClassifier(), tuned_parameters, cv=5,
                         scoring='%s_macro' % score)
    model.fit(X_train, y_train)
    print('XGBoost, best parameters (%s_macro):' % score)
    print(model.best_params_)
    # Accuracy of the refitted best model on the held-out test set.
    y_pred = model.predict(X_test)
    print(accuracy_score(y_test, y_pred))
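
The loops above print only accuracy, although precision_score, recall_score, f1_score and roc_auc_score are imported. The sketch below shows one way to report them together for any tuned model. The helper name evaluate is hypothetical, the data is synthetic, labels are assumed to be binary (0/1), and scikit-learn's GradientBoostingClassifier stands in for XGBClassifier so that the example runs without xgboost installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (accuracy_score, classification_report, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split


def evaluate(model, X_test, y_test):
    """Hypothetical helper: test-set metrics for a fitted binary classifier."""
    y_pred = model.predict(X_test)
    # AUC needs a score per sample, here the probability of the positive class.
    y_prob = model.predict_proba(X_test)[:, 1]
    return {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1': f1_score(y_test, y_pred),
        'auc': roc_auc_score(y_test, y_prob),
    }


# Synthetic stand-in data (an assumption; not the dataset used in this post).
X, y = make_classification(n_samples=300, n_features=8, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=3, stratify=y)

# Stand-in for XGBClassifier; the grid values are illustrative only.
search = GridSearchCV(GradientBoostingClassifier(random_state=0),
                      {'n_estimators': [20, 50], 'learning_rate': [0.1]},
                      cv=5, scoring='f1_macro')
search.fit(X_train, y_train)

metrics = evaluate(search, X_test, y_test)
for name, value in metrics.items():
    print('%s: %.3f' % (name, value))
print(classification_report(y_test, search.predict(X_test)))
```

The same helper can be called on any of the five tuned GridSearchCV objects, provided the underlying estimator exposes predict_proba; for SVC that requires probability=True.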
