2019-12-16

The StratifiedKFold class

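Purpose: splits a dataset into training and test (validation) sets for cross-validation. The folds are stratified, so each one keeps approximately the same proportion of samples per class as the full dataset.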
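
To make the stratification concrete, here is a minimal sketch on a made-up, imbalanced label vector (8 samples of class 0 and 4 of class 1; the data and the variable names are illustrative assumptions, not part of the original note). It counts the classes that land in each test fold.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy imbalanced labels (made up for illustration): 8 of class 0, 4 of class 1.
y = np.array([0] * 8 + [1] * 4)
X = np.zeros((len(y), 1))  # feature values do not influence how folds are chosen

skf = StratifiedKFold(n_splits=4)
fold_counts = []
for train_index, test_index in skf.split(X, y):
    # Count how many samples of each class ended up in this test fold.
    counts = np.bincount(y[test_index], minlength=2)
    fold_counts.append(counts.tolist())
    print("TEST class counts:", counts)
```

With these labels every test fold should hold two samples of class 0 and one of class 1, the same 2:1 ratio as the full label vector.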

Parameters: n_splits : int, default=3 Number of folds. Must be at least 2. (The default changed from 3 to 5 in scikit-learn 0.22.)

shuffle : boolean, optional Whether to shuffle each class's samples before splitting into batches.

random_state : int, RandomState instance or None, optional, default=None If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random. Used when shuffle == True.
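
The interaction between shuffle and random_state can be sketched as follows. The data and the helper name shuffled_test_folds are hypothetical, chosen only for illustration. The sketch builds the folds twice with the same integer seed and checks that the test indices match.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy data (made up for illustration): 10 samples, 5 per class.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 5 + [1] * 5)

def shuffled_test_folds(seed):
    """Return the test indices of each fold for a given integer seed."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return [test_index.tolist() for _, test_index in skf.split(X, y)]

first = shuffled_test_folds(seed=0)
second = shuffled_test_folds(seed=0)

# Same seed, same folds: the shuffled split is reproducible.
print("Reproducible:", first == second)
for fold in first:
    print("TEST:", fold, "labels:", y[fold])
```

Fixing random_state to an integer is a common way to make a shuffled split repeatable across runs. Shuffling happens within each class, so every test fold here still contains one sample of each class.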

Example:

>>> import numpy as np
>>> from sklearn.model_selection import StratifiedKFold
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([0, 0, 1, 1])
>>> skf = StratifiedKFold(n_splits=2)
>>> skf.get_n_splits(X, y)
2
>>> print(skf) # doctest: +NORMALIZE_WHITESPACE
StratifiedKFold(n_splits=2, random_state=None, shuffle=False)
>>> for train_index, test_index in skf.split(X, y):
...    print("TRAIN:", train_index, "TEST:", test_index)
...    X_train, X_test = X[train_index], X[test_index]
...    y_train, y_test = y[train_index], y[test_index]
TRAIN: [1 3] TEST: [0 2]
TRAIN: [0 2] TEST: [1 3]
