sklearn cross-validation: the parameters of cross_val_score explained

from sklearn.model_selection import cross_val_score
cross_val_score(estimator,
                X,
                y=None,
                groups=None,
                scoring=None,
                cv=None,
                n_jobs=None,
                verbose=0,
                fit_params=None,
                pre_dispatch='2*n_jobs')
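To make the signature concrete, here is a minimal sketch of a call. The choice of the iris dataset and LogisticRegression is an assumption made purely for illustration; any estimator with a "fit" method should work the same way. The function returns one score per fold as a NumPy array.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Illustrative data and estimator (not part of the original post)
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)

# cv=5 requests five folds, so five scores come back, one per fold
scores = cross_val_score(clf, X, y, cv=5)
print(scores.shape)
print(scores.mean(), scores.std())
```

The mean and standard deviation of the returned array are the usual summary of a cross-validation run.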

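estimator: an instance of the learner to evaluate; it must implement a "fit" method.
X: the feature array.
y: the label array.
groups: group labels for the samples, for data that must be split by group; it only takes effect with a group-aware splitter such as GroupKFold.
scoring: the evaluation metric, given as a scorer name (for example "accuracy") or a scorer callable; if None, the estimator's own score method is used.
cv: the cross-validation splitting strategy, usually the number of folds k. When it is an integer or None, the estimator is a classifier, and y is binary or multiclass, StratifiedKFold is used to split the data; in all other cases KFold is used.
n_jobs: the number of jobs to run in parallel; -1 uses all processors.
fit_params: a dictionary of parameters passed through to the estimator's fit method. Recent scikit-learn releases deprecate it in favor of params.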
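The cv, scoring, and groups parameters described above can be sketched as follows. This is only an illustration: the value C=0.01, the "f1_macro" metric, and the grouping of every 10 consecutive samples are assumptions chosen for the example, not something taken from the original documentation.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GroupKFold, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
svc = SVC(kernel="linear", C=0.01)  # C=0.01 is an arbitrary illustrative value

# Integer cv + classifier + multiclass y: StratifiedKFold is used internally
acc = cross_val_score(svc, X, y, cv=5, scoring="accuracy")

# Passing a splitter object gives explicit control over shuffling and the seed
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
f1 = cross_val_score(svc, X, y, cv=skf, scoring="f1_macro")

# groups only matters for group-aware splitters such as GroupKFold.
# Hypothetical grouping: pretend every 10 consecutive samples share a source.
groups = np.arange(len(y)) // 10
grouped = cross_val_score(svc, X, y, groups=groups, cv=GroupKFold(n_splits=5))

print(acc.mean(), f1.mean(), grouped.mean())
```

With GroupKFold, samples that share a group label never appear in both the training and the test part of the same split.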

Example from the original documentation:

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn import datasets
from sklearn import svm

# Load the handwritten digits dataset
digits = datasets.load_digits()
X = digits.data
y = digits.target

# Linear-kernel SVM; C is varied in the loop below
svc = svm.SVC(kernel="linear")
C_s = np.logspace(-10, 0, 10)  # 10 values of C from 1e-10 to 1
scores = []
scores_std = []
for C in C_s:
    svc.C = C
    # Cross-validate with the default cv, using all CPU cores
    score_lyst = cross_val_score(svc, X, y, n_jobs=-1)
    scores.append(np.mean(score_lyst))
    scores_std.append(np.std(score_lyst))

import matplotlib.pyplot as plt

# Plot the mean CV score against C, with dashed lines one standard deviation above and below
plt.figure(1, figsize=(4, 3))
plt.clf()
plt.semilogx(C_s, scores)
plt.semilogx(C_s, np.array(scores) + np.array(scores_std), "r--")
plt.semilogx(C_s, np.array(scores) - np.array(scores_std), "k--")
# Format the y tick labels compactly
locs, labels = plt.yticks()
plt.yticks(locs, list(map(lambda x: "%g" % x, locs)))
plt.ylabel('CV score')
plt.xlabel('Parameter C')
plt.ylim(0, 1.1)
plt.show()
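Besides reading the best value off the plot, the C with the highest mean CV score can be picked programmatically. The following is a self-contained sketch under the same setup as the example; the names mean_scores, best_idx, and best_C are introduced here for illustration only.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

X, y = datasets.load_digits(return_X_y=True)
svc = svm.SVC(kernel="linear")
C_s = np.logspace(-10, 0, 10)

# Mean cross-validation score for each candidate C
mean_scores = []
for C in C_s:
    svc.C = C
    mean_scores.append(cross_val_score(svc, X, y).mean())

# Index of the highest mean score and the corresponding C
best_idx = int(np.argmax(mean_scores))
best_C = C_s[best_idx]
print("best C:", best_C, "mean CV score:", mean_scores[best_idx])
```

Note that np.argmax returns the first maximum, so ties are resolved in favor of the smaller C.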

API link
