Machine Learning Notes, Week 04

Week 04 tasks:

  • Lectures: VC Dimensions and Bayesian Learning.
  • Reading: Mitchell Chapter 7 and Chapter 6.

SL8: VC Dimensions

  • Lesson 08 Notes
  • VC Dimensions Review
Quiz 1: Which Hypothesis Spaces Are Infinite
  • m ≥ (1/ε)(ln|H| + ln(1/δ)). Here the required sample size m depends on the size of the hypothesis space |H|, the error ε, and the failure probability δ. What happens if |H| is infinite?
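As a quick sanity check on this bound, the sketch below computes the smallest integer m satisfying m ≥ (1/ε)(ln|H| + ln(1/δ)); the particular |H|, ε, and δ values are made up for illustration:

```python
import math

def sample_complexity(h_size, epsilon, delta):
    """Smallest m with m >= (1/eps)(ln|H| + ln(1/delta))."""
    return math.ceil((1.0 / epsilon) * (math.log(h_size) + math.log(1.0 / delta)))

# Hypothetical numbers: |H| = 2^10 hypotheses, 5% error, 95% confidence.
m = sample_complexity(h_size=2**10, epsilon=0.05, delta=0.05)
print(m)  # → 199
```

Note that as |H| → ∞ the ln|H| term blows up, which is exactly why an infinite hypothesis space needs a different measure of complexity — the VC dimension.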
Maybe It Is Not So Bad
  • In the example above, although the hypothesis space is syntactically infinite, we can still explore it efficiently because many of the hypotheses are not meaningfully (semantically) different.
What Does VC Stand For
  • VC dimension: the size of the largest set of inputs that the hypothesis class can shatter.
  • VC stands for Vapnik–Chervonenkis.
Quiz 2: internal training
  • Not sure how to answer this question; need to rewatch.
Quiz 3: Linear Separators
  • Here the VC dimension is 3: three points in general position can be shattered by a line, but no set of four points can.
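The quiz's answer (VC = 3 for linear separators in the plane) can be checked empirically. The sketch below brute-forces a coarse grid of weights; the points, grid, and helper are illustrative, not from the lecture:

```python
from itertools import product

def separable(points, labels, grid):
    """True if some (w1, w2, b) in the grid realizes the labeling via sign(w.x + b)."""
    for w1, w2, b in grid:
        if all((w1 * x + w2 * y + b > 0) == (lab == 1)
               for (x, y), lab in zip(points, labels)):
            return True
    return False

vals = [x / 2.0 for x in range(-4, 5)]           # coarse weight grid: -2.0 .. 2.0
grid = list(product(vals, repeat=3))

triangle = [(0, 0), (1, 0), (0, 1)]              # three non-collinear points
shattered = all(separable(triangle, labs, grid)
                for labs in product([1, -1], repeat=3))
print(shattered)                                  # all 8 labelings separable → True

# XOR labeling of 4 points: no linear separator exists (consistent with VC = 3).
xor_pts = [(0, 0), (1, 1), (0, 1), (1, 0)]
print(separable(xor_pts, [1, 1, -1, -1], grid))   # → False
```

The grid search can only confirm separability, not refute it in general, but the XOR configuration is provably not linearly separable, so the failure here is genuine.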
The ring
  • The VC dimension ends up being d + 1, because the number of parameters needed to represent a d-dimensional hyperplane is d + 1.
Quiz 4: Polygons
  • If the hypothesis class is "points inside some convex polygon", then the VC dimension is infinite: a polygon can have arbitrarily many vertices, so arbitrarily many points on a circle can be shattered.
Sample Size with Infinite Hypothesis Space
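For an infinite H, Mitchell (eq. 7.7) replaces ln|H| with the VC dimension: m ≥ (1/ε)(4 log2(2/δ) + 8·VC(H)·log2(13/ε)). A sketch of that computation; the VC = 3 (linear separators in the plane), ε, and δ values are illustrative:

```python
import math

def vc_sample_complexity(vc_dim, epsilon, delta):
    """Mitchell eq. 7.7: m >= (1/eps)(4 log2(2/delta) + 8 * VC(H) * log2(13/eps))."""
    return math.ceil((1.0 / epsilon) *
                     (4 * math.log2(2.0 / delta) +
                      8 * vc_dim * math.log2(13.0 / epsilon)))

# Hypothetical: linear separators in the plane (VC = 3), 10% error, 95% confidence.
m = vc_sample_complexity(vc_dim=3, epsilon=0.1, delta=0.05)
print(m)  # → 1899
```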
VC of finite H
  • For a finite hypothesis class, VC(H) ≤ log2|H|, since shattering d points requires at least 2^d distinct hypotheses.
Recap: Lesson 8

Bayesian Learning

  • Lesson 09 Notes
  • Bayesian Learning Extension
Bayesian Learning
  • The best hypothesis is the most probable hypothesis given the data and domain knowledge: argmax_{h∈H} Pr(h|D).

Bayes Rule

Bayes Rule
  • Bayes Rule: Pr(h|D) = Pr(D|h)Pr(h)/Pr(D)
    • Pr(D) is the prior probability of the data.
    • Pr(h) is the prior of the hypothesis; it encodes our domain knowledge.
    • Pr(D|h) is the probability of the data given h; it is much easier to compute than Pr(h|D).
Quiz 1
  • Comparing the probabilities of a patient having vs. not having spleentitis.
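A diagnosis-style quiz like this is a direct application of Bayes rule. The sketch below uses hypothetical numbers (prior, sensitivity, false-positive rate), not the quiz's actual figures:

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Pr(disease | positive test) via Bayes rule."""
    # Pr(+) by total probability: true positives plus false positives.
    p_pos = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_pos

# Hypothetical numbers: 0.8% prior, 98% sensitive test, 3% false-positive rate.
p = posterior(prior=0.008, sensitivity=0.98, false_positive_rate=0.03)
print(round(p, 3))  # → 0.209
```

Even with a positive result from a seemingly accurate test, the posterior stays low because the prior is so small — the usual lesson of these Bayes-rule quizzes.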

Bayesian Learning

Bayesian Learning
  • To find the most probable h, we can drop Pr(D) from Bayes rule, since it does not depend on h. This gives h_MAP = argmax_{h∈H} Pr(D|h)Pr(h). MAP: maximum a posteriori.
  • If we don't have a strong prior, or we assume the prior is uniform over all h, we can also drop Pr(h), giving h_ML = argmax_{h∈H} Pr(D|h). ML: maximum likelihood.
  • The hard part is evaluating every h.
  • Since H is often very large, this learning algorithm is not practical.
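For a tiny hypothesis space the brute-force search is feasible, and it shows how a strong prior can make h_MAP differ from h_ML. The coin-bias hypotheses, prior, and data below are invented for illustration:

```python
# Each hypothesis is a coin bias h = Pr(heads); data is a sequence of flips.

def likelihood(h, data):
    """Pr(D|h) for i.i.d. coin flips."""
    p = 1.0
    for flip in data:
        p *= h if flip == 'H' else (1 - h)
    return p

hypotheses = [0.2, 0.5, 0.8]
prior = {0.2: 0.6, 0.5: 0.35, 0.8: 0.05}   # strong prior against a heads-heavy coin
data = ['H', 'H', 'H', 'T', 'H']

h_ml = max(hypotheses, key=lambda h: likelihood(h, data))
h_map = max(hypotheses, key=lambda h: likelihood(h, data) * prior[h])
print(h_ml, h_map)  # → 0.8 0.5
```

The data favor h = 0.8, but after weighting by the prior, h = 0.5 wins the MAP comparison — the prior and the evidence trade off against each other.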

Bayesian Learning in Action

Bayesian Learning when the data has no noise
  • Given a bunch of data, the probability of a particular hypothesis being correct (being the best or the right one) is simply uniform over all of the hypotheses in the version space, i.e., those consistent with the data we see.
Quiz 2:
  • Given pairs (x_i, d_i) with d_i = k · x_i, where each data point has probability 1/2^k: what is the probability of D given h?
Bayesian learning given Gaussian error
  • Given training data, figure out f(x) along with its error term. If the error can be modeled as Gaussian noise, then h_ML simplifies to minimizing a sum of squared errors.
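The reduction from maximum likelihood under Gaussian noise to least squares can be written out (standard derivation; σ is the noise standard deviation, and terms constant in h drop out of the argmax):

```latex
\begin{aligned}
h_{ML} &= \operatorname*{argmax}_{h \in H} \prod_i \Pr(d_i \mid h)
        = \operatorname*{argmax}_{h \in H} \prod_i
          \frac{1}{\sqrt{2\pi\sigma^2}}
          \exp\!\left(-\frac{(d_i - h(x_i))^2}{2\sigma^2}\right) \\
       &= \operatorname*{argmax}_{h \in H} \sum_i -\frac{(d_i - h(x_i))^2}{2\sigma^2}
        = \operatorname*{argmin}_{h \in H} \sum_i \bigl(d_i - h(x_i)\bigr)^2
\end{aligned}
```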
Quiz 3
  • Find the best hypothesis of the three:
    • calculate and compare the squared errors.
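A stand-in sketch for this kind of quiz: pick the maximum-likelihood hypothesis (under Gaussian noise) by comparing sums of squared errors. The data points and candidate hypotheses are made up for illustration:

```python
data = [(1, 2.1), (2, 3.9), (3, 6.2)]   # (x, observed d) pairs, hypothetical

hypotheses = {
    'h1: 2x':    lambda x: 2 * x,
    'h2: x + 1': lambda x: x + 1,
    'h3: x**2':  lambda x: x ** 2,
}

def sse(h):
    """Sum of squared errors of hypothesis h on the data."""
    return sum((d - h(x)) ** 2 for x, d in data)

best = min(hypotheses, key=lambda name: sse(hypotheses[name]))
print(best)  # → h1: 2x
```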
Quiz 4: small trees
  • h_MAP can be transformed into minimizing the description length of the hypothesis (the size of h) plus the description length of D given h (the misclassification error).
  • There is a tradeoff between the size of h and the error; this is called minimum description length (MDL).
  • There is a units problem: the units of error and of size need to be reconciled.
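The transformation behind MDL can be sketched as follows: taking −log2 of the MAP objective turns products into sums of description lengths (using the information-theoretic identification of −log2 of a probability with an optimal code length):

```latex
\begin{aligned}
h_{MAP} &= \operatorname*{argmax}_{h \in H} \Pr(D \mid h)\Pr(h) \\
        &= \operatorname*{argmin}_{h \in H} \bigl[-\log_2 \Pr(D \mid h) - \log_2 \Pr(h)\bigr] \\
        &= \operatorname*{argmin}_{h \in H} \bigl[\mathrm{length}(D \mid h) + \mathrm{length}(h)\bigr]
\end{aligned}
```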

Bayesian Classification

Bayesian Classification
  • When we do classification, we let each hypothesis vote.
Recap
  • Bayes optimal classifier = voting by the hypotheses, weighted by their posterior probabilities Pr(h|D).
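A minimal sketch of the weighted vote; the posteriors and votes mirror the classic three-hypothesis example from Mitchell chapter 6:

```python
# One hypothesis with posterior 0.4 predicts '+'; two with posterior 0.3 each
# predict '-'. Bayes optimal classification sums posterior mass per label.

posteriors = [0.4, 0.3, 0.3]
votes      = ['+', '-', '-']

weight = {}
for p, v in zip(posteriors, votes):
    weight[v] = weight.get(v, 0.0) + p

prediction = max(weight, key=weight.get)
print(prediction)  # → '-' wins 0.6 to 0.4, though the single MAP hypothesis says '+'
```

This is why the Bayes optimal classifier can outperform simply using h_MAP: it aggregates the opinions of the whole hypothesis space.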
2016-02-08: SL8 done.
2016-02-08, early morning: SL9 done; first draft published.
