CalTech machine learning, video 4 (Error & Noise) notes



6:44 2014-09-22 Monday

start CalTech machine learning, video 4

Error & Noise

6:56 2014-09-22
linear regression algorithm:

one-step learning
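
sketch of the one-step solution, w = pseudo-inverse(X) y. The function name and the toy data are my own assumptions, not from the lecture:

```python
import numpy as np

def linear_regression(X, y):
    """One-step learning: w = pinv(X) @ y, no iterations needed."""
    # prepend the constant coordinate x0 = 1 to every input
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.pinv(Xb) @ y

# toy data (made up for illustration): y = 1 + 2*x exactly
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 1.0 + 2.0 * X[:, 0]

w = linear_regression(X, y)
print(w)  # weights for (x0, x1)
```

on noiseless linear data like this, the weights should come back as the intercept and slope used to build y.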

6:58 2014-09-22
nonlinear transformation

7:06 2014-09-22
feature space
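
sketch of a nonlinear transformation into a feature space. I assume the transform z = (1, x1^2, x2^2) and hand-pick the weights; both are illustrative choices:

```python
import numpy as np

def transform(X):
    """Map each x = (x1, x2) to z = (1, x1^2, x2^2) in the feature space."""
    return np.column_stack([np.ones(len(X)), X[:, 0] ** 2, X[:, 1] ** 2])

# hand-picked weights (assumption): +1 inside the circle x1^2 + x2^2 < 0.6
w_tilde = np.array([0.6, -1.0, -1.0])

X = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [-1.0, 0.2]])

# the classifier is linear in z, but a circle in the original x space
labels = np.sign(transform(X) @ w_tilde)
print(labels)
```

the model stays linear in the weights, so the linear algorithms still apply after the transform.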

7:07 2014-09-22
probability distribution

7:18 2014-09-22
error measure

7:18 2014-09-22
Error measure: E(h, f)

h == hypothesis,

f == target function

7:22 2014-09-22
the learning algorithm is a search for the hypothesis

that minimizes an error function

7:23 2014-09-22
Error Measure pointwise definition:

e(h(x), f(x))
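
two common pointwise error measures, squared error and binary error, as a sketch (the function names are mine):

```python
def squared_error(hx, fx):
    """e(h(x), f(x)) = (h(x) - f(x))^2, typical for regression."""
    return (hx - fx) ** 2

def binary_error(hx, fx):
    """e(h(x), f(x)) = 1 if h(x) != f(x), else 0, typical for classification."""
    return int(hx != fx)

print(squared_error(3.0, 1.0))
print(binary_error(+1, -1))
```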

7:24 2014-09-22
overall error

7:28 2014-09-22
in-sample error, out-of-sample error

7:28 2014-09-22
in-sample error: Ein(h)

7:28 2014-09-22
From pointwise to overall

7:29 2014-09-22
use it (the probability distribution) to generate the training
examples, and use it to test the hypothesis
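
sketch of going from pointwise to overall error: Ein is the average pointwise error over the training sample, Eout is the expected pointwise error over P(x), estimated here with a large fresh sample. The target f, the hypothesis h, and the uniform P(x) are all assumptions made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def f(x):
    # hypothetical target: +1 for x > 0, else -1
    return np.where(x > 0, 1, -1)

def h(x):
    # hypothetical hypothesis: same shape, threshold slightly off
    return np.where(x > 0.1, 1, -1)

def sample_x(n):
    # assumed input distribution P(x): uniform on [-1, 1]
    return rng.uniform(-1.0, 1.0, size=n)

def average_binary_error(x):
    # mean of the pointwise binary error e(h(x), f(x)) over the points in x
    return np.mean(h(x) != f(x))

e_in = average_binary_error(sample_x(100))                # in-sample error
e_out_estimate = average_binary_error(sample_x(100_000))  # Monte Carlo estimate of Eout
print(e_in, e_out_estimate)
```

with these choices h and f disagree only for x between 0 and 0.1, so the true Eout is 0.05; the same P(x) generates both the training points and the test points.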

7:38 2014-09-22
false accept, false reject

7:41 2014-09-22
take-home lesson:

the error measure should be specified by the user
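
sketch of a user-specified error measure that prices false accepts and false rejects differently. The two cost settings are illustrative numbers I assume here, for one application where a false reject is the costly mistake and one where a false accept is:

```python
def weighted_error(h_out, f_out, cost_false_accept, cost_false_reject):
    """Pointwise error for a +1 (accept) / -1 (reject) decision."""
    if h_out == f_out:
        return 0                  # correct decision costs nothing
    if h_out == +1:
        return cost_false_accept  # accepted someone who should be rejected
    return cost_false_reject      # rejected someone who should be accepted

# hypothetical cost settings (assumed values, only for illustration)
lenient = {"cost_false_accept": 1, "cost_false_reject": 10}
strict = {"cost_false_accept": 1000, "cost_false_reject": 1}

print(weighted_error(+1, -1, **lenient), weighted_error(+1, -1, **strict))
print(weighted_error(-1, +1, **lenient), weighted_error(-1, +1, **strict))
```

same hypothesis, same data, different error measure: the best h can change with the costs.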

8:41 2014-09-22
error measure

8:47 2014-09-22
minimize the in-sample error

8:47 2014-09-22
target distribution

8:51 2014-09-22
target distribution: P(y|x)

(x, y) is now generated by the joint distribution


8:52 2014-09-22
noisy target = deterministic target f(x) = E(y|x) + noise (y - f(x))

8:53 2014-09-22
deterministic target, noisy target
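
sketch of a noisy target: y is drawn from P(y|x) as a deterministic part f(x) plus zero-mean noise, so averaging many y at the same x should approach f(x). The sine target and the Gaussian noise level are assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def f(x):
    # hypothetical deterministic target, f(x) = E(y|x)
    return np.sin(np.pi * x)

def sample_y(x, sigma=0.3):
    # noisy target: y = f(x) + zero-mean Gaussian noise (assumed noise model)
    return f(x) + rng.normal(0.0, sigma, size=np.shape(x))

x0 = 0.25
ys = sample_y(np.full(100_000, x0))  # many draws of y at the same input x0
print(f(x0), ys.mean())
```

a deterministic target is the special case sigma = 0, where P(y|x) puts all its mass on y = f(x).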

8:54 2014-09-22
unknown target function => unknown target distribution

8:57 2014-09-22
this is the final diagram for supervised learning

8:58 2014-09-22
unknown input distribution: P(x)

unknown target distribution: P(y|x)

9:00 2014-09-22
we're trying to learn the "target distribution"

9:03 2014-09-22
learning is feasible in a probabilistic sense

9:09 2014-09-22
Eout(g) ≈ 0      // this is what we want

Eout(g) ≈ Ein(g) // this is what we have

9:13 2014-09-22
Eout(g) ≈ 0 is achieved through:

Eout(g) ≈ Ein(g)  // this is "Hoeffding's inequality"

Ein(g) ≈ 0

9:17 2014-09-22
put the two together, and you have learning

9:18 2014-09-22
Hoeffding is all about Ein ≈ Eout

// in-sample error & out-of-sample error

9:19 2014-09-22
Learning is thus split into 2 questions:

1. Can we make sure that Eout(g) is close enough to Ein(g)?

2. Can we make Ein(g) small enough?

9:23 2014-09-22
Ein(g)  // in-sample error

Eout(g) // out-of-sample error

9:23 2014-09-22
g is the "final hypothesis", picked from the "hypothesis set"

f is the unknown "target function"

9:24 2014-09-22
out-of-sample performance

9:26 2014-09-22
financial forecasting

9:27 2014-09-22
least squares approximation => machine learning,

in-sample error, out-of-sample error


9:27 2014-09-22
dVC  // VC dimension

9:31 2014-09-22
model complexity

which is measured by dVC (the VC dimension)

9:31 2014-09-22
# of hypotheses  // M

9:33 2014-09-22
the bigger the M, the looser the bound
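
sketch of the Hoeffding bound with M hypotheses (union bound), P[|Ein - Eout| > eps] <= 2 M exp(-2 eps^2 N). The values of eps and N below are arbitrary choices for illustration:

```python
import math

def hoeffding_bound(M, eps, N):
    """Upper bound on P[|Ein(g) - Eout(g)| > eps] for M hypotheses, N samples."""
    return 2 * M * math.exp(-2 * eps ** 2 * N)

# same eps and N, growing M: the bound grows linearly with M
for M in (1, 100, 10_000):
    print(M, hoeffding_bound(M, eps=0.1, N=1000))
```

once the value exceeds 1 the bound says nothing, which is the sense in which a large M makes it loose.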

9:33 2014-09-22
as dVC grows, the discrepancy between 

Ein & Eout gets bigger & bigger
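
sketch of how the gap grows with dVC, assuming the VC generalization bound from later in the course, Eout <= Ein + sqrt((8/N) ln(4 m_H(2N) / delta)), with the growth function bounded by m_H(2N) <= (2N)^dVC + 1. The values of N and delta are my own choices:

```python
import math

def vc_penalty(d_vc, N, delta=0.05):
    """Bound on Eout - Ein that holds with probability >= 1 - delta."""
    growth = (2 * N) ** d_vc + 1  # polynomial bound on the growth function m_H(2N)
    # log(4 * growth / delta), split in two so the large integer never becomes a float
    log_term = math.log(4.0 / delta) + math.log(growth)
    return math.sqrt(8.0 / N * log_term)

# fixed N, growing dVC: the allowed discrepancy between Ein and Eout grows
for d_vc in (1, 3, 10):
    print(d_vc, vc_penalty(d_vc, N=10_000))
```

more data shrinks the penalty, a more complex model (larger dVC) inflates it.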

9:37 2014-09-22
input space, input distribution

10:09 2014-09-22
generalization error,

poor generalization, good generalization

10:11 2014-09-22
training data, target function

10:19 2014-09-22
learning is possible in a probabilistic sense,

and this holds for any input distribution P(x).

10:25 2014-09-22
the meaning of Hoeffding's inequality:

learning is possible in a probabilistic sense

10:27 2014-09-22
the CIA vs. supermarket problem (fingerprint verification)

10:28 2014-09-22
feature extraction

10:32 2014-09-22
there is a tradeoff between performance & complexity
