Machine Learning: 5.4 Stacking


  • Combine multiple base learners to reduce variance

    • Base learners can be different model types
    • Linearly combine the base learners' outputs with learned parameters (see the sketch after this list)
  • Widely used in competitions

  • Bagging vs. stacking

    • Bagging: bootstrap samples to get diversity

    • Stacking: different types of models extract different features
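
A minimal sketch of the basic idea using scikit-learn's StackingClassifier: base learners of different model types feed a final estimator that combines their outputs with learned weights. The models, synthetic data, and hyperparameters below are illustrative assumptions, not from these notes.

```python
# Hedged sketch: a one-level stack on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Base learners of different model types provide diverse outputs.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svc", SVC(probability=True, random_state=0)),
]

# The final estimator learns the parameters that combine
# the base learners' outputs.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression())
stack.fit(X_train, y_train)
print("stacking accuracy:", stack.score(X_test, y_test))
```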

Multi-layer Stacking

  • Stacking base learners in multiple levels to reduce bias
    • Can use a different set of base learners at each level
  • Upper levels (e.g. L2) are trained on the outputs of the level below (e.g. L1)
    • Concatenating the original inputs to those outputs helps (see the sketch below)
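
To make the layering concrete, here is a hand-rolled two-level sketch (structure only; the models and variable names are assumptions). Level 2 is trained on level-1 outputs with the original inputs concatenated on; it deliberately ignores the overfitting issue discussed in the next section.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)

# Level 1 (L1): fit diverse base learners on the raw inputs.
level1 = [RandomForestClassifier(random_state=0).fit(X, y),
          SVC(probability=True, random_state=0).fit(X, y)]

# Level 2 (L2) features: level-1 predicted probabilities,
# with the original inputs concatenated on.
level1_out = np.hstack([m.predict_proba(X)[:, 1:] for m in level1])
X_level2 = np.hstack([X, level1_out])

# Caveat: reusing the same data at both levels overfits;
# the fix is described in the next section.
level2 = LogisticRegression(max_iter=1000).fit(X_level2, y)
```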

Overfitting in Multi-layer Stacking

  • Train learners at different levels on different data to alleviate
    overfitting

  • Split the training data into A and B; train the current level's learners on A, then run inference on B to generate training data for the next level's learners

  • Repeated k-fold bagging:

    • Train k models as in k-fold cross validation
    • Combine predictions of each model on out-of-fold data
    • Repeat steps 1-2 n times; average the n predictions of each example as the input for training the next level (see the sketch below)
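
A sketch of the repeated k-fold recipe above for a single base learner (the binary task, the model, and the use of predicted probabilities are assumptions for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=1000, random_state=0)
n_repeats, k = 5, 5
oof = np.zeros((n_repeats, len(X)))

for r in range(n_repeats):
    # Steps 1-2: train k models as in k-fold CV; each model predicts
    # only on its out-of-fold data, so no example is scored by a model
    # that saw it during training.
    for train_idx, val_idx in KFold(k, shuffle=True, random_state=r).split(X):
        model = RandomForestClassifier(random_state=0)
        model.fit(X[train_idx], y[train_idx])
        oof[r, val_idx] = model.predict_proba(X[val_idx])[:, 1]

# Step 3: average the n repeats; the averaged prediction per example
# becomes training input for the next level.
next_level_input = oof.mean(axis=0)
```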
