Stanford ML - Lecture 8 - Support Vector Machines

1. Optimization Objective

  • logistic regression


let

  • support vector machine
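
For reference, a sketch of the two objectives in the course's usual notation. The symbols are assumptions taken from the standard presentation: $h_\theta$ is the sigmoid hypothesis, $\mathrm{cost}_1$ and $\mathrm{cost}_0$ are the piecewise-linear replacements for the two log-loss terms, and $C$ plays the role of $1/\lambda$.

```latex
% Regularized logistic regression
\min_\theta \; \frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \left( -\log h_\theta(x^{(i)}) \right)
  + (1 - y^{(i)}) \left( -\log\left(1 - h_\theta(x^{(i)})\right) \right) \right]
  + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2

% Support vector machine
\min_\theta \; C \sum_{i=1}^{m} \left[ y^{(i)} \, \mathrm{cost}_1(\theta^T x^{(i)})
  + (1 - y^{(i)}) \, \mathrm{cost}_0(\theta^T x^{(i)}) \right]
  + \frac{1}{2} \sum_{j=1}^{n} \theta_j^2
```

Dropping the constant factor $1/m$ and scaling by $C$ instead of $\lambda$ does not change the minimizing $\theta$ when $C = 1/\lambda$; it only changes which term the trade-off parameter multiplies.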



2. Large Margin Intuition

3. The mathematics behind large margin classification (optional)

  • vector inner product




  • SVM decision boundary
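
A minimal sketch of the projection view of the inner product, $u^T v = p \cdot \|u\|$, where $p$ is the signed length of the projection of $v$ onto $u$. With the constraint written as $p^{(i)} \|\theta\| \ge 1$ for positive examples (and $\le -1$ for negative ones), larger projections allow a smaller $\|\theta\|$. The helper names are hypothetical, and `min_theta_norm` assumes every example already lies on the correct side of the boundary.

```python
import math

def dot(u, v):
    """Inner product u^T v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length ||u||."""
    return math.sqrt(dot(u, u))

def signed_projection(v, u):
    """Signed length p of the projection of v onto u, so that u^T v = p * ||u||."""
    return dot(u, v) / norm(u)

def min_theta_norm(projections):
    """Smallest ||theta|| satisfying |p_i| * ||theta|| >= 1 for every projection p_i.

    Assumption: the sign of each p_i already matches its label.
    """
    return 1.0 / min(abs(p) for p in projections)
```

For a fixed direction of $\theta$, doubling the smallest projection halves the required $\|\theta\|$, which is why minimizing $\frac{1}{2}\|\theta\|^2$ favours a large-margin boundary.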


4. Kernels I

  • Given $x$, compute new features depending on proximity to landmarks $l^{(1)}, l^{(2)}, l^{(3)}$
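
A minimal sketch of turning proximity to landmarks into features, assuming the Gaussian similarity $\exp\left(-\|x - l\|^2 / (2\sigma^2)\right)$ is the similarity function in use. The function names and the landmark positions in the usage note are hypothetical.

```python
import math

def gaussian_kernel(x, landmark, sigma):
    """Similarity between x and a landmark: 1 when they coincide, near 0 when far apart."""
    sq_dist = sum((xi - li) ** 2 for xi, li in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_features(x, landmarks, sigma):
    """One feature per landmark, each measuring how close x is to that landmark."""
    return [gaussian_kernel(x, l, sigma) for l in landmarks]
```

For example, `kernel_features([0.0, 0.0], [[0.0, 0.0], [3.0, 4.0]], sigma=5.0)` gives a first feature of exactly 1 (the point sits on the landmark) and a second feature of `math.exp(-0.5)`.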




5. Kernels II
  • SVM with kernels
    • how to choose the landmarks $l^{(i)}$?
    • how to compute the features $f$?
    • Reference: http://blog.csdn.net/abcjennifer/article/details/7849812

  • SVM parameters

    • $C$ ($= 1/\lambda$)
      • Large $C$: lower bias, higher variance
      • Small $C$: higher bias, lower variance
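
A minimal sketch of the feature mapping for an SVM with kernels, assuming the course's usual recipe of placing one landmark at each training example and using a Gaussian similarity with bandwidth $\sigma^2$. The function names and the toy data are hypothetical.

```python
import math

def gaussian_similarity(x, landmark, sigma_sq):
    """exp(-||x - l||^2 / (2 * sigma^2)) for one landmark l."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma_sq))

def feature_vector(x, training_inputs, sigma_sq):
    """Map x to f = [1, f_1, ..., f_m], using every training input as a landmark.

    The leading 1.0 is the intercept feature f_0 (an assumed convention).
    """
    return [1.0] + [gaussian_similarity(x, l, sigma_sq) for l in training_inputs]

# Toy data (illustrative only): three training inputs double as landmarks.
training_inputs = [[0.0, 0.0], [1.0, 0.0], [4.0, 3.0]]
x = [1.0, 0.0]

narrow = feature_vector(x, training_inputs, sigma_sq=0.5)   # similarity falls off quickly
wide = feature_vector(x, training_inputs, sigma_sq=10.0)    # similarity falls off slowly
```

With the larger $\sigma^2$ every non-coincident landmark receives a higher similarity, so the features vary more smoothly with $x$. That pushes the model toward higher bias and lower variance, the opposite direction to increasing $C$.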



6. Using an SVM
  • Use an SVM software package (e.g. liblinear, libsvm, ...) to solve for the parameters $\theta$; you need to specify:
    • choice of parameter $C$
    • choice of kernel (similarity function)
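      • e.g. no kernel ("linear kernel"): predict $y = 1$ if $\theta^T x \ge 0$
      • Gaussian kernel: $f_i = \exp\left(-\frac{\|x - l^{(i)}\|^2}{2\sigma^2}\right)$; need to choose $\sigma^2$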
  • Many off-the-shelf kernels available
    • polynomial kernel
    • more esoteric: string kernel, chi-square kernel, histogram intersection kernel, ...
  • Multi-class classification
    • many SVM packages already have built-in multi-class classification functionality
    • otherwise, use one-vs.-all method
  • logistic regression vs. SVM
    • $n$ = number of features ($x \in \mathbb{R}^{n+1}$), $m$ = number of training examples
      • if $n$ is large (relative to $m$), use logistic regression, or SVM without a kernel ("linear kernel").
      • if $n$ is small and $m$ is intermediate, use SVM with Gaussian kernel.
      • if $n$ is small and $m$ is large, create/add more features, then use logistic regression or SVM without a kernel.
      • A neural network is likely to work well for most of these settings, but may be slower to train.
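
The rule of thumb above can be sketched as a small helper. The function name and the numeric thresholds (`small_n`, `large_m`) are illustrative assumptions, not values stated in these notes; "large", "small" and "intermediate" are judgment calls in practice.

```python
def suggest_model(n_features, m_examples, small_n=1000, large_m=50000):
    """Rule-of-thumb model choice from n (features) and m (training examples).

    Thresholds are hypothetical defaults chosen only for illustration.
    """
    if n_features >= m_examples:
        # n large relative to m: too little data to fit a complex boundary.
        return "logistic regression or linear-kernel SVM"
    if n_features <= small_n and m_examples < large_m:
        # n small, m intermediate.
        return "SVM with Gaussian kernel"
    if n_features <= small_n:
        # n small, m large: kernel SVM training may be slow.
        return "add features, then logistic regression or linear-kernel SVM"
    # Combination not covered by the rule of thumb.
    return "not covered by the rule of thumb"
```

For instance, `suggest_model(10000, 1000)` falls in the first case and `suggest_model(100, 5000)` in the second.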
