Fixing TypeError: __init__() got an unexpected keyword argument 'n_folds' / got multiple values for argument 'n_splits'

This note covers the KFold() error that comes up in the "Titanic survival data analysis" example.

Error message:
TypeError: __init__() got an unexpected keyword argument 'n_folds'
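
Both messages from the title can be reproduced in isolation. This is a minimal sketch, assuming a scikit-learn version in which KFold lives in sklearn.model_selection; n_samples is a made-up stand-in for titanic.shape[0], and the exact wording of the messages may differ slightly between Python versions.

```python
from sklearn.model_selection import KFold

n_samples = 891  # hypothetical row count standing in for titanic.shape[0]

# Old keyword name: the current constructor has no n_folds parameter.
try:
    KFold(n_samples, n_folds=3)
except TypeError as exc:
    unexpected_kwarg_message = str(exc)

# Renaming only the keyword: the leading positional argument already
# binds to n_splits, so n_splits ends up being passed twice.
try:
    KFold(n_samples, n_splits=3)
except TypeError as exc:
    multiple_values_message = str(exc)

print(unexpected_kwarg_message)
print(multiple_values_message)
```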

The code that raises the error:

#Import the linear regression class
from sklearn.linear_model import LinearRegression
#Sklearn also has a helper that makes it easy to do cross validation
from sklearn.cross_validation import KFold

#The columns we'll use to predict the target
predictors = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked"]

#Initialize our algorithm class
alg = LinearRegression()
#Generate cross validation folds for the titanic dataset.  It returns the row indices corresponding to train and test.
#We set random_state to ensure we get the same splits every time we run this.
kf = KFold(titanic.shape[0], n_folds=3, random_state=1)

predictions = []
for train, test in kf:
    # The predictors we're using to train the algorithm.  Note how we only take the rows in the train folds.
    train_predictors = (titanic[predictors].iloc[train,:])
    # The target we're using to train the algorithm.
    train_target = titanic["Survived"].iloc[train]
    # Training the algorithm using the predictors and target.
    alg.fit(train_predictors, train_target)
    # We can now make predictions on the test fold
    test_predictions = alg.predict(titanic[predictors].iloc[test,:])
    predictions.append(test_predictions)

Causes of the error:

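  • The wrong KFold module is imported
    from sklearn.cross_validation import KFold is obsolete: the sklearn.cross_validation module was deprecated in scikit-learn 0.18 and removed in 0.20. Change it to from sklearn.model_selection import KFold; see the official sklearn documentation for details.
  • The wrong arguments are used
    kf = KFold(titanic.shape[0], n_folds=3, random_state=1) no longer works because the KFold parameters changed when sklearn was updated: n_folds was renamed to n_splits, and the number of samples is no longer passed to the constructor. Change the line to kf = KFold(n_splits=3). If you only rename the keyword and keep titanic.shape[0] as the first argument, you get TypeError: __init__() got multiple values for argument 'n_splits'. random_state only has an effect when shuffle=True, and recent scikit-learn versions raise a ValueError if it is set while shuffle=False, so either drop it or use KFold(n_splits=3, shuffle=True, random_state=1).
    In addition, change for train, test in kf: to for train, test in kf.split(titanic[predictors]): (note that the argument is predictors, the list of column names, not predictions, the list of results). split() only uses the number of rows of the data it is given to generate the train and test row indices.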
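
Putting both fixes together, the loop might look like the sketch below. The small random DataFrame and its reduced column list are hypothetical stand-ins for the preprocessed titanic data (which is not included here), and all_predictions is a name introduced only for this example; the KFold and split() calls are the part that matters.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Hypothetical stand-in for the preprocessed titanic DataFrame (all numeric).
rng = np.random.default_rng(0)
n_rows = 12
titanic = pd.DataFrame({
    "Pclass": rng.integers(1, 4, n_rows),
    "Sex": rng.integers(0, 2, n_rows),
    "Age": rng.uniform(1, 70, n_rows),
    "Fare": rng.uniform(5, 100, n_rows),
    "Survived": rng.integers(0, 2, n_rows),
})
predictors = ["Pclass", "Sex", "Age", "Fare"]

alg = LinearRegression()
# New API: only the number of folds goes to the constructor.
kf = KFold(n_splits=3)

predictions = []
# split() takes the data and yields (train_indices, test_indices) pairs.
for train, test in kf.split(titanic[predictors]):
    train_predictors = titanic[predictors].iloc[train, :]
    train_target = titanic["Survived"].iloc[train]
    alg.fit(train_predictors, train_target)
    test_predictions = alg.predict(titanic[predictors].iloc[test, :])
    predictions.append(test_predictions)

# One prediction per row, in the original row order (no shuffling).
all_predictions = np.concatenate(predictions, axis=0)
print(all_predictions.shape)
```

Because shuffle is left at its default of False, the folds are consecutive blocks of rows, which matches what the old KFold(titanic.shape[0], n_folds=3) call produced.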

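The above was pieced together from a number of web searches. Useful references:

  1. Cross-validation: splitting the dataset
  2. Random forest method raises an error in the titanic example [CSDN Q&A]
  3. Official sklearn documentation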
