Python XGBoost: outputting and plotting feature importance

Two points: to fit the model, use the training set (X_train, y_train), not the entire dataset (X, y).

You can also use the max_num_features parameter of plot_importance() to display only the top max_num_features features (for example, the top 10).
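To make the effect of max_num_features concrete, here is a minimal pure-Python sketch of "keep only the N highest-scoring features". It is not xgboost's own implementation; the helper name top_features and the scores below are hypothetical, chosen only to resemble the f0, f1, ... names xgboost assigns to unnamed columns.

```python
def top_features(scores, max_num_features=None):
    """Return (name, score) pairs sorted by score, highest first.

    If max_num_features is given, keep only that many pairs.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if max_num_features is None:
        return ranked
    return ranked[:max_num_features]


# Hypothetical importance scores, for illustration only.
scores = {"f0": 12.0, "f1": 3.0, "f2": 27.0, "f3": 8.0}

print(top_features(scores, max_num_features=2))
# prints [('f2', 27.0), ('f0', 12.0)]
```

Leaving max_num_features as None returns every feature, which mirrors plotting without the parameter.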

With both changes applied, and using some randomly generated data, the code and output are as follows:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier, plot_importance

# generate some random data for demonstration purposes; use your original dataset here
X = np.random.rand(1000, 100)  # 1000 x 100 data
y = np.random.rand(1000).round().astype(int)  # 0/1 labels

seed = 0
test_size = 0.30
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=seed)

# fit on the training split only, not on the full dataset
model = XGBClassifier()
model.fit(X_train, y_train)

plot_importance(model, max_num_features=10)  # top 10 most important features
plt.show()

[Figure 1: output of plot_importance(), a bar chart of the 10 most important features]
