10. Advice for applying machine learning

Deciding what to try next

Debugging a learning algorithm:

  • Get more training examples. (Sometimes this doesn't actually help.)
  • Try smaller sets of features.
  • Try getting additional features.
  • Try adding polynomial features.
  • Try decreasing or increasing λ.

Machine learning diagnostics: running a diagnostic first saves time, because it shows which of the options above is worth trying.

Evaluating a hypothesis

Test whether your hypothesis is overfitting:

  • Split your data into 2 parts: a training set (70%) and a test set (30%).
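The 70/30 split can be sketched in a few lines of plain Python. The helper name `train_test_split` and the toy data are assumptions made for illustration; the shuffle matters because data that is sorted in some way would otherwise give an unrepresentative test set.

```python
import random

def train_test_split(examples, train_fraction=0.7, seed=0):
    """Shuffle a copy of `examples` and split it into (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # shuffle so the split does not depend on the original order
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

data = [(x, 2 * x + 1) for x in range(10)]  # toy (x, y) pairs, made up for this example
train, test = train_test_split(data)
print(len(train), len(test))
```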

Training/testing procedure for logistic regression

  • Learn the parameters θ from the training data.
  • Compute the test set error J_test(θ) (linear/logistic regression).
  • Alternatively, compute the misclassification error (logistic regression).
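As a rough sketch, the three error measures could be written as follows. The function names are hypothetical; the formulas are the usual squared-error cost, the cross-entropy cost, and the 0/1 error with a 0.5 threshold, all evaluated on test examples only.

```python
import math

def linear_test_error(predictions, targets):
    """J_test for linear regression: sum of squared errors divided by 2m."""
    m = len(targets)
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / (2 * m)

def logistic_test_error(probabilities, labels):
    """J_test for logistic regression: average cross-entropy (log loss)."""
    m = len(labels)
    total = 0.0
    for p, y in zip(probabilities, labels):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / m

def misclassification_error(probabilities, labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction differs from the label."""
    wrong = sum((p >= threshold) != (y == 1) for p, y in zip(probabilities, labels))
    return wrong / len(labels)

# Made-up predicted probabilities and true labels for four test examples
probs = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 0, 1]
print(misclassification_error(probs, labels))
```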

Model selection and training/validation/test sets

When the test set error J_test(θ) is used to select which model to choose, the choice itself is fit to the test set, so the model that performs best there may not generalize.
In that case J_test(θ) is likely to be an optimistic estimate of the generalization error.

Split the dataset into 3 pieces.

  • training set (60%)
  • cross validation set (20%)
  • testing set (20%)

Use the cross validation set to select the model.
Use the test set to estimate the generalization error.
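A minimal sketch of this procedure, assuming the candidate models are polynomials of degree 1 to 8 and using NumPy's `polyfit`. The synthetic data, the degree range, and the helper `half_mse` are all assumptions made for illustration.

```python
import numpy as np

def half_mse(coeffs, x, y):
    """Squared-error cost J = 1/(2m) * sum((h(x) - y)^2) for a polynomial h."""
    residual = np.polyval(coeffs, x) - y
    return float(residual @ residual) / (2 * len(y))

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = 0.5 * x**2 - x + 2 + rng.normal(scale=0.5, size=x.size)  # synthetic quadratic data

# 60% / 20% / 20% split of shuffled indices
n = x.size
idx = rng.permutation(n)
train, cv, test = np.split(idx, [6 * n // 10, 8 * n // 10])

# Fit one model per candidate degree, using the training set only
degrees = range(1, 9)
models = {d: np.polyfit(x[train], y[train], d) for d in degrees}

# Select the degree with the lowest cross validation error
cv_errors = {d: half_mse(models[d], x[cv], y[cv]) for d in degrees}
best_degree = min(cv_errors, key=cv_errors.get)

# Estimate the generalization error on the untouched test set
test_error = half_mse(models[best_degree], x[test], y[test])
print(best_degree, round(test_error, 3))
```

The test set plays no part in fitting or in choosing the degree, which is why its error is a fair estimate for the selected model.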

Diagnosing bias vs. variance

Bias (underfit): J_train(θ) will be high, and J_cv(θ) will be close to J_train(θ).
Variance (overfit): J_train(θ) will be low, and J_cv(θ) will be much higher than J_train(θ).
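The rule of thumb can be turned into a small check on the training error and the cross validation error. This is only a sketch: `target_error` (the error level you would accept) and `gap_ratio` are hypothetical thresholds picked for illustration, not values from the course.

```python
def diagnose(j_train, j_cv, target_error, gap_ratio=2.0):
    """Rough bias/variance diagnosis from training and cross validation error.

    `target_error` and `gap_ratio` are illustrative thresholds (assumptions).
    """
    if j_train > target_error:
        # The model fits even the training set poorly
        return "high bias (underfit)"
    if j_cv > gap_ratio * j_train and j_cv > target_error:
        # The model fits the training set well but fails to generalize
        return "high variance (overfit)"
    return "looks fine"

print(diagnose(j_train=0.90, j_cv=1.00, target_error=0.2))
print(diagnose(j_train=0.05, j_cv=0.80, target_error=0.2))
```

A model can suffer from both problems at once; this sketch reports only the first one it finds.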

Regularization and bias/variance

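To find a good λ:
Try λ = 0, 0.01, 0.02, 0.04, 0.08, ..., 10.24 (doubling each time).
Fit the model with each λ to get the corresponding parameters θ.
Use the cross validation set to compute J_cv(θ) for each θ, and pick the λ with the minimum J_cv(θ).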
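The search over λ might look like the sketch below, assuming regularized linear regression solved with the normal equation on degree-8 polynomial features. The data, the degree, and the helper names are assumptions. The errors used for comparison are computed without the regularization term.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Regularized normal equation; the intercept (column 0) is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def cost(theta, X, y):
    """Unregularized squared-error cost, used for J_train, J_cv and J_test."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

def poly_features(x, degree):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=60)                       # already in random order
y = np.sin(3 * x) + rng.normal(scale=0.2, size=x.size)  # synthetic data
X = poly_features(x, degree=8)
X_train, y_train = X[:36], y[:36]    # 60%
X_cv, y_cv = X[36:48], y[36:48]      # 20%
X_test, y_test = X[48:], y[48:]      # 20%

# 0, then 0.01 doubled repeatedly up to 10.24
lambdas = [0.0] + [0.01 * 2**k for k in range(11)]
thetas = [fit_ridge(X_train, y_train, lam) for lam in lambdas]
cv_errors = [cost(theta, X_cv, y_cv) for theta in thetas]
best = int(np.argmin(cv_errors))
print("best lambda:", lambdas[best], "test error:", cost(thetas[best], X_test, y_test))
```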

The two preceding videos each include a figure that is very useful for understanding how the cross validation set helps to pick the best model and the best λ.

Learning curves

Plot the training error and the cross validation error against m (the training set size).

If a learning algorithm is suffering from high bias, getting more training data will not help much.

If a learning algorithm is suffering from high variance, getting more training data is likely to help.
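A learning curve can be computed by training on the first m examples for increasing m and recording both errors. The sketch below assumes a deliberately high-bias setup (a straight line fit to quadratic data) so that both errors level off at a high value. The data and helper names are made up for illustration.

```python
import numpy as np

def fit(X, y):
    """Least-squares fit (no regularization)."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def cost(theta, X, y):
    """Squared-error cost 1/(2m) * sum of squared residuals."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

def learning_curve(X_train, y_train, X_cv, y_cv, sizes):
    """Return (J_train, J_cv) lists, training on the first m examples for each m."""
    j_train, j_cv = [], []
    for m in sizes:
        theta = fit(X_train[:m], y_train[:m])
        j_train.append(cost(theta, X_train[:m], y_train[:m]))  # error on the m examples used
        j_cv.append(cost(theta, X_cv, y_cv))                   # error on the full CV set
    return j_train, j_cv

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=300)
y = x**2 + rng.normal(scale=0.5, size=x.size)   # quadratic target
X = np.column_stack([np.ones_like(x), x])        # straight-line model: high bias
sizes = [5, 10, 20, 50, 100, 200]
j_train, j_cv = learning_curve(X[:200], y[:200], X[200:], y[200:], sizes)
for m, jt, jc in zip(sizes, j_train, j_cv):
    print(f"m={m:3d}  J_train={jt:.2f}  J_cv={jc:.2f}")
```

Swapping in a flexible model, such as a high-degree polynomial, should instead show a gap between the two curves that narrows as m grows, which is the high-variance case.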

Deciding what to try next (revisited)

bias: underfit
variance: overfit

  • Get more training examples: fixes high variance
  • Try smaller sets of features: fixes high variance
  • Try getting additional features: fixes high bias
  • Try adding polynomial features: fixes high bias
  • Try decreasing λ: fixes high bias
  • Try increasing λ: fixes high variance
