MIT 6.041 Introduction to Probability video notes

10:45 2013-8-30 Friday
start 6.041 introduction to probability, lec 01
11:01 2013-8-30
domain-specific problems
11:02 2013-8-30
dealing with uncertainty
11:03 2013-8-30
probability model:
sample space, probability law
11:06 2013-8-30
outcomes of the experiment
11:06 2013-8-30
sample space: mutually exclusive + collectively exhaustive
11:23 2013-8-30
What is "Events"?
Events is a subset of sample space.
11:24 2013-8-30
assign probability to events.
11:24 2013-8-30
subset
11:25 2013-8-30
event A occurred.
11:26 2013-8-30
Axioms of probability:
1. nonnegative:
P(A) >= 0
2. normalization:
P(Ω) == 1 // Ω is the sample space
3. additivity:
if A ∩ B == ∅ (A & B disjoint), then P(A ∪ B) == P(A) + P(B)
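a quick sanity check of the three axioms on a toy model (my own sketch, not from
the lecture): the discrete uniform law on a fair six-sided die

# hypothetical illustration: probability law for a fair six-sided die
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}                          # sample space
P = lambda event: Fraction(len(event), len(omega))  # discrete uniform law

A = {1, 2}       # event "roll is 1 or 2"
B = {4, 5, 6}    # event "roll is at least 4"; A and B are disjoint

assert P(A) >= 0                  # axiom 1: nonnegativity
assert P(omega) == 1              # axiom 2: normalization
assert P(A | B) == P(A) + P(B)    # axiom 3: additivity for disjoint events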
11:28 2013-8-30
intersection:
A occurred & B occurred.
11:29 2013-8-30
union
11:32 2013-8-30
empty set
11:34 2013-8-30
PDF == Probability Density Function
PMF == Probability Mass Function
11:36 2013-8-30
complement of A
11:48 2013-8-30
disjoint sets
11:56 2013-8-30
but to keep things simple, let's stick with...
11:58 2013-8-30
event of interest
12:03 2013-8-30
discrete uniform law
12:05 2013-8-30
fair dice, well-shuffled decks
12:06 2013-8-30
continuous uniform law
12:07 2013-8-30
Random Variables: use capital letters
12:08 2013-8-30
slopes, intercepts
12:11 2013-8-30
As far as probability is concerned
12:13 2013-8-30
countably infinite sample space
12:15 2013-8-30
union of disjoint sets
12:15 2013-8-30
countable additivity axiom
12:17 2013-8-30
total probability
-----------------------------------
12:19 2013-8-30
start 6.041 lec2
conditioning & Bayes' rule
12:21 2013-8-30
partial information
12:21 2013-8-30
conditional probability
12:21 2013-8-30
use "conditional probability" to do "inference"
12:22 2013-8-30
statistical inference
12:23 2013-8-30
assigning probability to subsets of the sample space
12:25 2013-8-30
disjoint events
12:28 2013-8-30
infinite sequence
12:29 2013-8-30
probability of the overall sets
12:29 2013-8-30
unit square
12:31 2013-8-30
one element set
12:31 2013-8-30
additivity axiom:
probability of the union == sum of the probabilities
12:33 2013-8-30
uncountable sets
12:36 2013-8-30
zero probability things do happen!
it just means extremely unlikely!
always expect the unexpected!
12:38 2013-8-30
discrete models vs continuous models
12:42 2013-8-30
conditional probability
12:44 2013-8-30
someone comes and tells you that event B occurred.
12:44 2013-8-30
P(A|B) == P(A ∩ B) / P(B)
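a minimal sketch of the formula (illustrative example of my own): condition a
fair die roll on the event "the roll is even"

from fractions import Fraction

omega = set(range(1, 7))                       # one fair die roll
prob = lambda e: Fraction(len(e), len(omega))  # uniform law

B = {2, 4, 6}   # told: the roll is even
A = {4, 5, 6}   # ask: is the roll at least 4?

p_A_given_B = prob(A & B) / prob(B)   # P(A|B) == P(A ∩ B) / P(B)
print(p_A_given_B)                    # 2/3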
12:45 2013-8-30
unit probability
12:46 2013-8-30
outcomes of the experiment
12:46 2013-8-30
what is the likelihood?
12:47 2013-8-30
that is an intuitively reasonable way of doing..
12:48 2013-8-30
what fraction is ...?
13:03 2013-8-30
we're placed in a new universe.
13:11 2013-8-30
uniform distribution
13:12 2013-8-30
conditional probability:
radar detection example
13:13 2013-8-30
self-contained probability model
13:14 2013-8-30
false alarm:
plane does not exist, radar registers something.
13:17 2013-8-30
statistical inference
13:18 2013-8-30
radar fires a false alarm!
13:25 2013-8-30
false alarms are pretty common
13:27 2013-8-30
doctors interpreting the results of the test
13:31 2013-8-30
how often does A occur?
13:33 2013-8-30
composite event
13:33 2013-8-30
branch of the tree
13:36 2013-8-30
total probability theorem
13:42 2013-8-30
partition into....
13:44 2013-8-30
inference problem
13:46 2013-8-30
a model of our measuring device
13:46 2013-8-30
What is statistical inference?
given that the radar registers something, is a plane
actually present, or is it a false alarm?
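a hedged numeric sketch of the radar inference (the numbers below are
placeholders I chose, not necessarily the lecture's): prior, device model,
total probability, then Bayes' rule

p_plane = 0.05            # prior: P(plane present) -- illustrative value
p_reg_plane = 0.99        # model of the device: P(register | plane)
p_reg_no_plane = 0.10     # false-alarm rate: P(register | no plane)

# total probability theorem: P(register)
p_reg = p_plane * p_reg_plane + (1 - p_plane) * p_reg_no_plane

# Bayes' rule: P(plane | register)
p_plane_given_reg = p_plane * p_reg_plane / p_reg
print(round(p_plane_given_reg, 3))   # ~0.343: most registrations are false alarms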
13:48 2013-8-30
the multiplication rule
13:49 2013-8-30
total probability of the event B.
13:50 2013-8-30
schematically what is happening here?
13:52 2013-8-30
causal model of our situation
13:52 2013-8-30
inference, statistical inference
13:53 2013-8-30
then we do inference:
given that the effect was observed,
how likely is it that the world was in
this particular situation or state or scenario?
14:00 2013-8-30
experimental data
14:00 2013-8-30
How can we infer new knowledge from previous knowledge?
--------------------------------------------------------
14:06 2013-8-30
start 6.041 lec 03
14:06 2013-8-30
independence of 2 events
15:20 2013-8-30
total probability
15:20 2013-8-30
conditional probability
15:23 2013-8-30
total probability theorem
15:24 2013-8-30
Bayes rule: inference
15:27 2013-8-30
die toss example
15:28 2013-8-30
H == Head, T == Tail
15:30 2013-8-30
total probability
15:31 2013-8-30
to make an inference
15:35 2013-8-30
independence:
Def1: P(B|A) == P(B)
Def2: P(A ∩ B) == P(A) * P(B)
Def2 seems to be better
15:43 2013-8-30
do not confuse "independence" with "disjointness"!
15:43 2013-8-30
extreme dependence
15:47 2013-8-30
conditional universe
15:47 2013-8-30
conditional independence
15:48 2013-8-30
in a conditional universe
15:50 2013-8-30
in fact, disjointness implies dependence!
if A ∩ B == ∅ // disjoint sets
15:52 2013-8-30
in fact, knowing that A occurred, B cannot occur!
15:54 2013-8-30
conditioning may affect independence!
15:57 2013-8-30
conditional probability model
16:00 2013-8-30
given this information, I'm making an inference..
16:02 2013-8-30
independence of 2 events
16:06 2013-8-30
pairwise independence
16:06 2013-8-30
independence & pairwise independence are
different things...
16:11 2013-8-30
pairwise independence does not imply independence!
-------------------------------------------------------
17:01 2013-8-30
review 6.041 lec 03
biased coin toss
17:04 2013-8-30
multiplication rule
17:05 2013-8-30
total probability
17:06 2013-8-30
to use the Bayes rule to make an inference
17:09 2013-8-30
independence between 2 things
17:10 2013-8-30
likelihood of event B
17:11 2013-8-30
conditional probability
unconditional probability
17:12 2013-8-30
definition of independence:
Def1: original definition:
P(B | A) == P(B)
Def2:
P(A ∩ B) == P(A) * P(B)
17:13 2013-8-30
2nd definition of independence is superior, because
1. it's symmetric
2. it applies even if P(A) == 0 or P(B) == 0
3. it implies both P(A|B) == P(A) & P(B|A) == P(B)
17:15 2013-8-30
DEFINITION OF independence:
A & B are independent events <->
P(A ∩ B) == P(A) * P(B)
17:17 2013-8-30
event with zero probability
17:19 2013-8-30
you can check independence of A & B by
1. the definition
2. intuition
17:20 2013-8-30
How to intuitively determine independence?
whether or not A occurs "has nothing to do" with
the occurrence of B.
17:23 2013-8-30
numerical accident
17:24 2013-8-30
independence != disjointness
17:24 2013-8-30
occurrence or non-occurrence
17:25 2013-8-30
extreme dependence
17:27 2013-8-30
B conveys information about A.
17:29 2013-8-30
conditional independence
17:29 2013-8-30
conditional probability
17:30 2013-8-30
events A & B are independent unconditionally.
17:30 2013-8-30
the typical picture of independent events A & B.
17:31 2013-8-30
conditioning may affect independence
17:32 2013-8-30
having independence in the original model does not
imply independence in the conditional model.
17:33 2013-8-30
bias in favor of heads
17:33 2013-8-30
unfair coins
17:34 2013-8-30
What is a "two unfair coins"?
P(H | coin A) == 0.9
P(H | coin B) == 0.1
17:35 2013-8-30
first choose one of the 2 coins at random,
then start flipping
17:36 2013-8-30
coin flips(coin tosses)
17:37 2013-8-30
step1: choose coin
step2: toss coin
17:38 2013-8-30
"independence in a conditional world" which does not
imply independence in a unconditional world.
17:40 2013-8-30
if you do not know which coin is used (A or B?),
will 10 consecutive heads change your belief
about the 11th flip?
17:43 2013-8-30
inference calculation
17:44 2013-8-30
makes me change my beliefs
17:44 2013-8-30
conditional probability <-> unconditional probability
17:45 2013-8-30
independence of multiple coin tosses
17:46 2013-8-30
independence of collection of events
17:50 2013-8-30
mathematical definition:
A, B, C are independent <->
all three pairwise conditions hold, AND
P(A ∩ B ∩ C) == P(A) * P(B) * P(C)
17:52 2013-8-30
pairwise independence does not imply independence
17:52 2013-8-30
independence & pairwise independence are different things
17:54 2013-8-30
you can check the above conclusion with this example:
2 fair coin tosses, HH, TT, HT, TH
event A: 1st toss is H
event B: 2nd toss is H
event C: both tosses are same
17:56 2013-8-30
it can be easily seen that
P(A) = P(B) = P(C) = 1/2
but P(A ∩ B ∩ C) = 1/4, clearly
P(A ∩ B ∩ C) != P(A) * P(B) * P(C) // A, B, C are not independent
                                   // but A, B, C are pairwise independent
18:03 2013-8-30
pairwise independence:
P(A ∩ B) == P(A) * P(B)
P(A ∩ C) == P(A) * P(C)
P(B ∩ C) == P(B) * P(C)
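a small enumeration check of the two-coin example above (my own sketch):
the pairwise conditions hold, mutual independence fails

from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))   # HH, HT, TH, TT, equally likely
P = lambda e: Fraction(sum(1 for w in omega if e(w)), len(omega))

A = lambda w: w[0] == "H"               # 1st toss is H
B = lambda w: w[1] == "H"               # 2nd toss is H
C = lambda w: w[0] == w[1]              # both tosses are the same

both = lambda e1, e2: (lambda w: e1(w) and e2(w))
assert P(both(A, B)) == P(A) * P(B)     # pairwise independent
assert P(both(A, C)) == P(A) * P(C)
assert P(both(B, C)) == P(B) * P(C)
triple = lambda w: A(w) and B(w) and C(w)
assert P(triple) != P(A) * P(B) * P(C)  # 1/4 != 1/8: not independent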
18:06 2013-8-30
the King's sibling problem
18:06 2013-8-30
B == Boy, G == Girl
BB, GG, BG, GB
18:08 2013-8-30
conditional sample space given that there is a King(boy).
18:09 2013-8-30
ans == 2 / 3
18:09 2013-8-30
to reverse engineer this answer
18:11 2013-8-30
there is a hidden assumption
--------------------------------------------------------------
///
18:52 2013-8-31 Saturday
6.041 lec 04, counting
18:53 2013-8-31
sample points are equally likely
18:56 2013-8-31
stages
18:57 2013-8-31
license plates example
19:00 2013-8-31
permutation
19:09 2013-8-31
how many subsets does {1, 2, ..., n} have?
n steps, each element either taken or NOT taken, so
number of subsets == 2^n
19:13 2013-8-31
all possible outcomes are equally likely
19:14 2013-8-31
Event, sample space
19:14 2013-8-31
independent & fair dice rolls
19:20 2013-8-31
combination: committee problem
19:20 2013-8-31
pick k out of n
19:21 2013-8-31
permutation: ordered
combination: non-ordered
19:24 2013-8-31
ordered list // permutation
19:26 2013-8-31
How to derive combination formula?
19:28 2013-8-31
just notice two ways to compute p(n,k):
step1: pick a k-subset: c(n,k) // combination (unordered list)
step2: permute these k people // k!
it must be true that
c(n,k) * k! == p(n,k)
so, c(n,k) == p(n,k) / k!
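a quick sanity check of c(n,k) == p(n,k) / k! with the standard library
(my own sketch):

import math

n, k = 10, 4
p_nk = math.perm(n, k)    # ordered lists: n! / (n-k)!
c_nk = math.comb(n, k)    # unordered subsets (binomial coefficient)
assert c_nk * math.factorial(k) == p_nk
print(c_nk, p_nk)         # 210 5040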
19:30 2013-8-31
both are valid ways of counting
19:31 2013-8-31
binomial coefficient: c(n,k)
19:31 2013-8-31
sanity check
19:32 2013-8-31
k element set
19:33 2013-8-31
n! // n factorial
19:34 2013-8-31
empty set
19:34 2013-8-31
identity
19:35 2013-8-31
how to find c(n,0) + c(n,1) + c(n,2) + .. + c(n,n)?
1. do not use the combination formula c(n,k)
2. the sum counts all subsets of an n-element set, so...
3. n steps, take or not take
ans == 2^n
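the same take-or-not-take argument, checked numerically (my own sketch):

import math

n = 12
assert sum(math.comb(n, k) for k in range(n + 1)) == 2 ** n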
19:38 2013-8-31
independent & fair coin tosses
19:39 2013-8-31
binomial coefficient
19:46 2013-8-31
binomial probability
19:48 2013-8-31
experiment
19:49 2013-8-31
flip a coin independently 10 times
19:52 2013-8-31
conditional probability <-> unconditional probability
19:52 2013-8-31
equally likely
19:57 2013-8-31
subset problem
19:58 2013-8-31
partition set into subsets
20:00 2013-8-31
13 * 4 == 52
--------------------------------------------------
21:02 2013-8-31
review 6.041 lec 04, counting
///
9:23 2013-9-2 Monday
6.041 lec 05
9:23 2013-9-2
random variable
9:23 2013-9-2
PMF == Probability Mass Function
9:24 2013-9-2
expectation
9:24 2013-9-2
variance
9:27 2013-9-2
experiment, outcome
9:28 2013-9-2
example: height of students
Random Variable: H
9:30 2013-9-2
probability experiment
9:33 2013-9-2
numerical value
9:33 2013-9-2
So what is a random variable?
sample space -> real number
9:37 2013-9-2
conceptual hurdles
9:37 2013-9-2
random variable: X // function
numerical value: x
9:41 2013-9-2
Random Variable is actually a function:
sample space -> numeric value
9:47 2013-9-2
PMF == Probability Mass Function
9:49 2013-9-2
PX(x)
9:49 2013-9-2
bar graph
9:54 2013-9-2
geometric PMF
10:03 2013-9-2
binomial PMF
10:03 2013-9-2
bell curve
10:08 2013-9-2
expected value
10:09 2013-9-2
expectation
10:10 2013-9-2
center of gravity
10:19 2013-9-2
requires justification
10:20 2013-9-2
in general, the expectation is some sort of average,
but the "function of average" is not the "average of function"!
10:24 2013-9-2
expected value
10:24 2013-9-2
expectation
10:24 2013-9-2
Random Variable
10:28 2013-9-2
properties of expectation
10:30 2013-9-2
linearity property of expectation
10:30 2013-9-2
variance
10:31 2013-9-2
second moment:
expectation of X * X
10:32 2013-9-2
what is variance?
the expectation of the squared distance from the mean:
E[(X - E[X])^2]
10:39 2013-9-2
deviation: X - E(X)
10:40 2013-9-2
variance:
expectation of the squared deviation
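a minimal sketch (hypothetical PMF of my own choosing) computing the mean, the
variance, and the equivalent second-moment form E[X^2] - (E[X])^2:

pmf = {1: 0.2, 2: 0.5, 4: 0.3}   # P(X = x), illustrative values

mean = sum(x * p for x, p in pmf.items())                 # E[X]
var = sum((x - mean) ** 2 * p for x, p in pmf.items())    # E[(X - E[X])^2]
second_moment = sum(x * x * p for x, p in pmf.items())    # E[X^2]
assert abs(var - (second_moment - mean ** 2)) < 1e-12     # equivalent formula
print(mean, var)                                          # 2.4 1.24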
10:42 2013-9-2
spread
10:43 2013-9-2
linear transformation
------------------------------------------------
10:45 2013-9-2
6.041 lec 6: Discrete Random Variable II
13:57 2013-9-2
PMF == Probability Mass Function
P(X = x)
13:59 2013-9-2
you do the experiment, you get an outcome
14:00 2013-9-2
expectation of the random variable
20:57 2013-9-2
PMF == Probability Mass Function
20:58 2013-9-2
Expectation: weighted average
20:59 2013-9-2
you do an experiment, you get the outcome
21:00 2013-9-2
variance
21:00 2013-9-2
standard deviation
21:01 2013-9-2
you cannot exchange function & expectation,
which means you cannot reason on averages in general // E[g(X)] != g(E[X])
21:03 2013-9-2
expectation: center of distribution
center of mass, center of gravity, etc...
///
20:10 2013-9-3 Tuesday
random variable
20:10 2013-9-3
PMF == Probability Mass Function
20:11 2013-9-3
expectation(expected values)
20:11 2013-9-3
variance
20:11 2013-9-3
standard deviation
20:11 2013-9-3
expectation: weighted average
20:20 2013-9-3
How far will you be from the average typically?
variance..
20:45 2013-9-3
the standard deviation tells
us how spread out the distribution is
20:53 2013-9-3
in general, we cannot reason on the average
20:56 2013-9-3
conditional probability
20:56 2013-9-3
conditional PMF
21:01 2013-9-3
in the conditional universe
21:02 2013-9-3
conditional expectation // use conditional PMF
21:09 2013-9-3
geometric PMF
21:09 2013-9-3
geometric progression
21:24 2013-9-3
conditioned on this information
21:25 2013-9-3
memoryless property
21:26 2013-9-3
conditioned on the event that X > 2,
21:34 2013-9-3
total expectation theorem
total probability theorem
21:38 2013-9-3
conditional expectation
21:39 2013-9-3
the 1st coin flip being a tail does not tell
me anything about the number of flips in the
future... // memoryless property
21:46 2013-9-3
how are weight & height interrelated?
we need the "joint PMF"
21:46 2013-9-3
"joint PMF" of 2 random variable
21:55 2013-9-3
joint PMF <-> marginal PMF
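a small sketch (hypothetical joint PMF of my own choosing) going from joint
to marginals by summing out the other variable:

from fractions import Fraction as F

# p_{X,Y}(x,y), illustrative values
joint = {(0, 0): F(1, 10), (0, 1): F(3, 10), (1, 0): F(1, 5), (1, 1): F(2, 5)}

marginal_x, marginal_y = {}, {}
for (x, y), p in joint.items():
    marginal_x[x] = marginal_x.get(x, 0) + p   # p_X(x) = sum over y
    marginal_y[y] = marginal_y.get(y, 0) + p   # p_Y(y) = sum over x

print(marginal_x)   # {0: Fraction(2, 5), 1: Fraction(3, 5)}
print(marginal_y)   # {0: Fraction(3, 10), 1: Fraction(7, 10)}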
22:01 2013-9-3
conditional probability
--------------------------------------------------
22:09 2013-9-3 6.041 lec7
Discrete Random Variables III
22:22 2013-9-3
joint PMF
22:22 2013-9-3
conditional PMF
22:27 2013-9-3
marginal, joint, conditional
22:30 2013-9-3
joint probability
marginal probability
conditional probability
22:33 2013-9-3
conditioning
22:33 2013-9-3
independence
22:52 2013-9-3
independent random variables
23:04 2013-9-3
which is conditioned on the event that...
23:17 2013-9-3
exception to rules
23:18 2013-9-3
expectation is just a variation of averages
23:19 2013-9-3
if you have independence, then the joint PMF factors out...
23:32 2013-9-3
X conveys no information about Y
23:36 2013-9-3
variance
23:41 2013-9-3
extremely dependent
23:46 2013-9-3
binomial expectation
0:14 2013-9-4
sleep,

8:46 2013-9-4 Wednesday
review 6.041 lec 7
9:18 2013-9-4
joint PMF
9:19 2013-9-4
3 random variables are independent...
9:31 2013-9-4
conditional independence
9:49 2013-9-4
independence is used in this step
9:51 2013-9-4
X tells me nothing about Y
// X & Y are independent random variables
9:53 2013-9-4
conveys information
9:54 2013-9-4
the variance captures the idea of how
wide, how spread out the certain distribution is.
10:00 2013-9-4
X & Y are independent
X & Y are extremely dependent
10:03 2013-9-4
X has no information about Y, so
X has no information about -3Y
10:13 2013-9-4
indicator variable
10:33 2013-9-4
cross term in the sum
----------------------------------------
17:34 2013-9-4
6.041 probability, lec 08, continuous random variables
17:45 2013-9-4
r.v == random variable
17:45 2013-9-4
PDF == Probability Density Function
17:46 2013-9-4
continuous random variable
17:48 2013-9-4
push sample space into background
17:54 2013-9-4
mass function -> density function
17:55 2013-9-4
any individual point has zero probability
17:58 2013-9-4
integrate density over that range
18:01 2013-9-4
densities are not probabilities...
they're rates at which probabilities accumulate
18:08 2013-9-4
What's the difference between "discrete r.v." & "continuous r.v"?
sum -> integral
PMF -> PDF
18:09 2013-9-4
sum gets replaced by integrals, and
PMFs gets replaced by PDFs
18:10 2013-9-4
variance
18:11 2013-9-4
uniform random variable
18:11 2013-9-4
continuous uniform random variable
18:12 2013-9-4
total probability gets equal to 1
18:15 2013-9-4
standard deviation == how spread out is our r.v?
18:22 2013-9-4
CDF == Cumulative Distribution Function
which unifies PDF & PMF...
18:24 2013-9-4
r.v. == random variable
18:24 2013-9-4
discrete random variable,
continuous random variable
18:28 2013-9-4
derivative of CDF == density
18:34 2013-9-4
mixed random variable:
a combination of PDF & PMF
18:35 2013-9-4
mass function, density function
18:42 2013-9-4
Gaussian PDF == normal PDF
18:43 2013-9-4
standard normal: N(0,1)
general normal
18:57 2013-9-4
normal random variable
19:00 2013-9-4
bell-shaped curve
19:01 2013-9-4
linear transformation
19:02 2013-9-4
closed-form formula
19:04 2013-9-4
mean, variance
19:07 2013-9-4
CDF == Cumulative Distribution Function
19:08 2013-9-4
standardizing the random variable
general normal -> standard normal
19:10 2013-9-4
if X is normal, then a * X + b is also normal
// linear transformation
19:17 2013-9-4
probability density function is the analog of
the probability mass function
19:18 2013-9-4
mass function, density function
19:19 2013-9-4
joint density, conditional density
------------------------------------------------
19:44 2013-9-4
mit 6.041 lec 9, multiple continuous r.v.
19:47 2013-9-4
CDF applies both to discrete & continuous random variable
19:47 2013-9-4
joint density function
conditional density function
19:52 2013-9-4
the area under the density curve
///
18:00 2013-9-5 Thursday
joint density function
19:16 2013-9-5
joint PDF
19:19 2013-9-5
from joint to marginal
joint -> marginal
19:22 2013-9-5
inner integral
19:25 2013-9-5
joint density, marginal density
19:29 2013-9-5
Buffon's needle
19:34 2013-9-5
acute angle
19:36 2013-9-5
uniform distribution
19:38 2013-9-5
to identify "the event of interest"
19:47 2013-9-5
infer, inference
19:50 2013-9-5
statistician
19:56 2013-9-5
conditional PDF == joint PDF / marginal PDF
20:01 2013-9-5
marginal density is just the "area of the slice"
20:04 2013-9-5
conditional PDF is the slice of the joint PDF, normalized
by the "area of the slice" (the marginal PDF)
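a numeric sketch of the "normalized slice" idea, for a joint PDF I chose for
illustration: f(x,y) = 2 on the triangle 0 < y < x < 1, so the conditional of
Y given X = x comes out uniform on [0, x]

f = lambda x, y: 2.0 if 0 < y < x < 1 else 0.0   # illustrative joint PDF

def marginal_x(x, n=10_000):
    # f_X(x) = integral of f(x, y) dy, approximated by a Riemann sum
    dy = 1.0 / n
    return sum(f(x, (j + 0.5) * dy) for j in range(n)) * dy

x = 0.5
print(marginal_x(x))                 # ~1.0, i.e. 2x: the area of the slice
print(f(x, 0.25) / marginal_x(x))    # ~2.0 = 1/x: conditional uniform on [0, x]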
20:06 2013-9-5
stick-breaking example
20:12 2013-9-5
joint PDF, marginal PDF, conditional PDF
--------------------------------------------------
20:20 2013-9-5
6.041 lec 10,
20:20 2013-9-5
inference
20:30 2013-9-5
derived distribution
20:35 2013-9-5
joint, marginal, conditional <-> mass function, density function
20:36 2013-9-5
measuring device
20:36 2013-9-5
observable random variables
/
18:20 2013-9-7 Saturday
conditional density
18:24 2013-9-7
measuring device
18:24 2013-9-7
observable random variable
18:28 2013-9-7
make inference
18:32 2013-9-7
joint density == marginal density * conditional density
18:34 2013-9-7
X: plane present/not present
Y: radar register/not register sth
18:34 2013-9-7
X -> measuring device -> Y
18:35 2013-9-7
Y = X + W // W is a kind of Gaussian noise
18:35 2013-9-7
prior probability
18:37 2013-9-7
a model of the measuring device
18:37 2013-9-7
conditional density
18:42 2013-9-7
if a random variable is discrete,
it's described by a PMF
18:43 2013-9-7
in this mixed world
18:46 2013-9-7
continuous X, discrete Y
18:47 2013-9-7
prior, posterior?
18:49 2013-9-7
derived distribution
19:14 2013-9-7
CDF == Cumulative Distribution Function
19:36 2013-9-7
the PDF of Y = a * X + b
/
10:06 2013-9-8 Sunday
conditional density
10:06 2013-9-8
derived distribution
10:07 2013-9-8
continuous random variables
discrete random variables
10:11 2013-9-8
where the order of conditioning is reversed
11:14 2013-9-8
make inference
11:36 2013-9-8
X is discrete, Y is continuous
11:36 2013-9-8
prior probability
11:37 2013-9-8
we need a model of our measuring device
11:39 2013-9-8
multiplication rule
11:42 2013-9-8
P: PMF, f: PDF
11:45 2013-9-8
prior, posterior
------------------------------------------
22:01 2013-9-8
back from home, review 6.041 lec 10
22:01 2013-9-8
conditional density
22:07 2013-9-8
noisy environment(measuring device)
22:11 2013-9-8
the order of conditioning has been reversed
22:28 2013-9-8
prior probability + model of our measuring device
22:42 2013-9-8
derived distribution
22:54 2013-9-8
CDF == Cumulative Distribution Function
23:20 2013-9-8
2-step cookbook procedure
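a sketch of the 2-step cookbook for Y = a*X + b with X ~ Uniform(0,1)
(my own example; a > 0 assumed): first the CDF of Y, then differentiate

a, b = 2.0, 1.0

F_X = lambda x: min(max(x, 0.0), 1.0)   # CDF of Uniform(0, 1)
F_Y = lambda y: F_X((y - b) / a)        # step 1: P(Y <= y) = P(X <= (y-b)/a)

def f_Y(y, h=1e-6):
    return (F_Y(y + h) - F_Y(y - h)) / (2 * h)   # step 2: numerical derivative

print(f_Y(2.0))   # ~0.5 = 1/a: the uniform density stretched by a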
/
20:07 2013-9-12 Thursday
covariance
20:07 2013-9-12
independent -> covariance(X,Y) == 0
20:10 2013-9-12
variance, covariance
20:11 2013-9-12
variance -> standard deviation
covariance -> correlation coefficient
20:15 2013-9-12
linearly related
20:16 2013-9-12
scatter plot
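a small sketch (synthetic data of my own) estimating covariance and the
correlation coefficient from paired samples:

import random

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(10_000)]
ys = [2 * x + random.gauss(0, 0.5) for x in xs]   # linearly related + noise

mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
var_x = sum((x - mx) ** 2 for x in xs) / len(xs)
var_y = sum((y - my) ** 2 for y in ys) / len(ys)
rho = cov / (var_x ** 0.5 * var_y ** 0.5)         # correlation coefficient
print(round(cov, 2), round(rho, 3))               # cov ~2, rho close to 1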
--------------------------------------
20:24 2013-9-12
conditional expectation,
conditional variance
20:25 2013-9-12
conditional PMF
20:26 2013-9-12
conditional density function
20:30 2013-9-12
r.v. == random variable
20:30 2013-9-12
conditional expectation:
example, break a stick two times.
20:39 2013-9-12
law of iterated expectations
20:41 2013-9-12
total expectation theorem
20:48 2013-9-12
in a conditional universe
20:48 2013-9-12
conditional variance
20:54 2013-9-12
Law of total variance
20:59 2013-9-12
the conditional expectation is random
21:08 2013-9-12
section means & variance
21:11 2013-9-12
w.p. == with probability
21:32 2013-9-12
overall variability
21:37 2013-9-12
uniform distribution
21:47 2013-9-12
random number of random variables
21:51 2013-9-12
divide & conquer
21:52 2013-9-12
conditional universe
21:53 2013-9-12
i.i.d. == independent, identically distributed
21:57 2013-9-12
Law of iterated expectations
/
20:15 2013-9-14 Saturday
conditional expectation
20:15 2013-9-14
r.v. == random variables
20:17 2013-9-14
numeric value
20:20 2013-9-14
conditional variance
20:21 2013-9-14
stick example // break a stick twice
20:30 2013-9-14
law of total variance
20:32 2013-9-14
in the conditional universe
20:38 2013-9-14
sections
20:38 2013-9-14
section means & variance
21:01 2013-9-14
PDF == Probability Density Function
21:03 2013-9-14
uniform distribution
21:19 2013-9-14
i.i.d. == independent, identically distributed
21:21 2013-9-14
law of iterated expectations
///
17:16 2013-9-15 Sunday
6.041 probability lec 13,
Bernoulli Process
17:17 2013-9-15
random processes(stochastic process)
17:18 2013-9-15
Bernoulli Process: which is just a sequence of coin flips
17:20 2013-9-15
so what is a process?
things evolve with time
17:24 2013-9-15
Bernoulli process is a good model for
"streams of arrivals"
17:25 2013-9-15
time slot
17:25 2013-9-15
during a time slot, something arrives or nothing arrives
17:27 2013-9-15
Bernoulli Process assumptions:
1. independent
2. p(probability) is constant with time
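a minimal simulation sketch of the two assumptions (my own illustration):
independent slots with a constant arrival probability p

import random

random.seed(1)
p, n = 0.3, 20
slots = [1 if random.random() < p else 0 for _ in range(n)]  # 1 = arrival
print(slots)
print("arrivals in", n, "slots:", sum(slots))   # a Binomial(n, p) count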
17:28 2013-9-15
random process(stochastic process)
17:28 2013-9-15
so what is a random process?
a random process is just a "sequence of random variables".
17:29 2013-9-15
a bunch of random variables
17:32 2013-9-15
joint distribution
17:33 2013-9-15
joint PMFs
17:33 2013-9-15
2nd view of random process:
see the whole sequence as a single experiment
17:34 2013-9-15
outcomes of experiment
17:34 2013-9-15
the sample space of Bernoulli Process
according to the 2nd view?
17:35 2013-9-15
sequence
17:36 2013-9-15
this event is contained in that event,
so this implies that...
17:37 2013-9-15
because the trials are independent
17:46 2013-9-15
memorylessness
17:46 2013-9-15
a Markov process does have memory.
17:47 2013-9-15
arrival of jobs to a facility
17:48 2013-9-15
1st question:
fix the time, ask how many arrivals?
2nd question:
fix the number of arrivals, ask how much time it takes?
17:49 2013-9-15
binomial random variable
17:51 2013-9-15
the process starts
17:58 2013-9-15
memoryless property
17:59 2013-9-15
the past history does not tell you
anything about the future...
17:59 2013-9-15
a sequence of independent Bernoulli trials
18:03 2013-9-15
foresight: look ahead into the future
18:06 2013-9-15
give an example of a Bernoulli Process?
you buy a lottery ticket every day, and either win (w.p. p)
or lose (w.p. 1-p).
18:10 2013-9-15
geometric random variable
18:19 2013-9-15
interarrival time
18:19 2013-9-15
you are called into the room
and you start watching
18:20 2013-9-15
geometric random variables
18:23 2013-9-15
convolution, convolve with...
18:42 2013-9-15
splitting a Bernoulli Process
2 servers
18:43 2013-9-15
Bernoulli stream of arrivals
18:47 2013-9-15
Merging of independent Bernoulli Processes
18:47 2013-9-15
arrival processes
//
21:10 2013-9-16 Monday
review 6.041 lec 13, Bernoulli Process
21:34 2013-9-16
Bernoulli Process:
arrival of jobs to a facility
21:40 2013-9-16
interarrival times
21:52 2013-9-16
you buy a lottery ticket every day
// Bernoulli Process
21:54 2013-9-16
string of losing days
21:56 2013-9-16
geometric random variable
22:03 2013-9-16
geometric with parameter p
22:07 2013-9-16
interarrival time
22:09 2013-9-16
T1, T2, T3 are independent random variables
22:11 2013-9-16
i.i.d. == independent, identically distributed
22:12 2013-9-16
convolution formula
22:12 2013-9-16
convolve with
22:13 2013-9-16
PMF == Probability Mass Function
22:16 2013-9-16
different time slots are independent
of each other
22:16 2013-9-16
binomial probabilities
22:19 2013-9-16
mean & variance
22:21 2013-9-16
splitting of a Bernoulli Process
22:25 2013-9-16
by taking a Bernoulli stream of arrivals
22:26 2013-9-16
statistically independent of each other
22:33 2013-9-16
merging of independent Bernoulli Processes
--------------------------------------------------
22:48 2013-9-16
start probability lec 14, Poisson Process I
22:49 2013-9-16
that has many stages
22:49 2013-9-16
Poisson Process:
a continuous version of a Bernoulli Process
22:50 2013-9-16
interarrival time
22:51 2013-9-16
time slot, sth arrived, nothing arrived
22:53 2013-9-16
geometric distribution
22:54 2013-9-16
starting from scratch
22:58 2013-9-16
time to k arrivals: Pascal PMF
22:59 2013-9-16
a sequence of Bernoulli trials
23:00 2013-9-16
come in and start watching
23:02 2013-9-16
start watching at that time
23:08 2013-9-16
random independent Bernoulli trials
23:11 2013-9-16
the Poisson Process is the continuous
version of Bernoulli Process.
23:12 2013-9-16
arrival of person at the bank
23:14 2013-9-16
time slot
23:27 2013-9-16
time homogeneity
23:28 2013-9-16
different time slots are independent
23:28 2013-9-16
number of arrivals
23:28 2013-9-16
time interval
23:29 2013-9-16
disjoint time intervals
23:31 2013-9-16
arrival rate
23:31 2013-9-16
small interval probabilities
23:32 2013-9-16
approximately equality
23:32 2013-9-16
1st order terms, 2nd order terms
23:33 2013-9-16
λ: arrival rate
expected number of arrivals per unit time
23:37 2013-9-16
the intensity of this process
23:38 2013-9-16
Bernoulli Process: Binomial PMF
23:39 2013-9-16
split big interval into many small intervals
23:40 2013-9-16
disjoint time intervals are independent
23:41 2013-9-16
during each small time interval, either success
or failure, multiple arrivals are negligible....
as delta -> 0
23:42 2013-9-16
use the Bernoulli Process to study the Poisson Process,
assuming delta goes to zero (infinitesimally small)
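a numeric sketch of the limit (my own illustration): discretize [0, t] into n
slots with p = λt/n and watch the Binomial PMF approach the Poisson PMF

import math

lam, t, k = 5.0, 2.0, 8                   # rate, horizon, number of arrivals

def binom_pmf(n):
    p = lam * t / n                       # per-slot arrival probability
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

poisson = math.exp(-lam * t) * (lam * t) ** k / math.factorial(k)
for n in (50, 500, 5000):
    print(n, round(binom_pmf(n), 5))      # converges to the Poisson value
print("Poisson:", round(poisson, 5))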
23:44 2013-9-16
binomial distribution
23:52 2013-9-16
Poisson PMF
23:52 2013-9-16
finely discretize [0,t]: approximately Bernoulli
23:55 2013-9-16
fix time, count:
number of arrivals // this is a random variable
23:55 2013-9-16
arrival rate
23:57 2013-9-16
email arrivals
23:58 2013-9-16
example of Poisson Process:
radioactive decay, email arrivals,
car accidents, a weak light source emitting photons
0:00 2013-9-17
arrival rate of e-mails
0:01 2013-9-17
When to use "Poisson Process model"?
whenever arrivals happening in a completely
random way without any additional structure,
the Poisson Process is a good model of arrivals
0:08 2013-9-17
the pmf formula for the Poisson distribution
0:09 2013-9-17
fix arrivals, find time?
the time it takes until the kth arrival
(use a PDF to analyze it) -> this is what is called the "Erlang distribution"
fix time, find arrivals?
0:19 2013-9-17
exponential distribution // continuous version
geometric distribution // discrete version
0:20 2013-9-17
the geometric is just a discrete version
of exponential......
0:21 2013-9-17
Poisson Process is just the limit of Bernoulli Process
thus Poisson share the "memorylessness property" of
Bernoulli
0:22 2013-9-17
whatever happens in the past has no bearing on the future
0:23 2013-9-17
the time until the kth arrival
0:23 2013-9-17
minislot
0:24 2013-9-17
it's as if we start Poisson Process from scratch
0:24 2013-9-17
exponential distribution
0:25 2013-9-17
sum of k independent exponentially distributed random variables
// Erlang distribution
0:26 2013-9-17
sum of k interarrival times
0:26 2013-9-17
they all have the same exponential distribution
0:26 2013-9-17
random number generator // rng
0:30 2013-9-17
warmup
0:30 2013-9-17
adding up Poisson Process
0:31 2013-9-17
Poisson random variable
0:33 2013-9-17
sum of independent Poisson random variables is still
a Poisson random variable
0:33 2013-9-17
number of arrivals during a fixed time interval
0:34 2013-9-17
merging of "Poisson Processes" is still Poisson
0:35 2013-9-17
color-blind:
not aware of which bulb flashes, red or green
0:39 2013-9-17
I'm multiplying probabilities here because
I'm making the assumption that the 2 processes
are independent.
0:44 2013-9-17
we keep the 1st order terms, while we throw
away the 2nd order terms.
0:44 2013-9-17
higher order terms
0:45 2013-9-17
so the probability of having simultaneously a red & green
flash during a little interval is negligible
0:50 2013-9-17
different intervals are independent of each other
0:50 2013-9-17
arrival rate == rate1 + rate2
0:52 2013-9-17
go to sleep...
///
21:58 2013-9-18 Wednesday
start probability lec15
21:59 2013-9-18
Poisson Process is a continuous version of
Bernoulli Process
22:03 2013-9-18
Defining characteristics of Poisson Process:
1. Time homogeneity
2. Independence
3. Small interval probabilities
22:15 2013-9-18
Poisson PMF
22:15 2013-9-18
once you have an arrival, it's as if
the experiment starts fresh
22:17 2013-9-18
exponential distribution
22:21 2013-9-18
the light bulb has not yet burned out
22:22 2013-9-18
used is exactly as good as new
22:25 2013-9-18
Time Yk to the kth arrival: Erlang(k)
22:30 2013-9-18
fish are caught according to a Poisson Process
///
11:27 2013-10-7
transient state, recurrent state
11:35 2013-10-7
end up at that trapping state
11:39 2013-10-7
it's a system of linear equations
11:40 2013-10-7
single absorbing state
11:44 2013-10-7
single recurrent class of states
11:46 2013-10-7
self-transition, self-arc
11:48 2013-10-7
mean recurrence time of s

20:43 2013-10-4 Friday
exponential distribution is the 1st act of Poisson process
///
9:04 2013-10-6 Sunday
review Poisson Process lec 15
9:08 2013-10-6
Poisson Process is a continuous version of Bernoulli Process
9:12 2013-10-6
exponential distribution
9:14 2013-10-6
random amount of time
9:20 2013-10-6
"used is exact as good as new"
9:22 2013-10-6
you can use the convolution formula to find Y2 = T1 + T2
9:26 2013-10-6
what is the defining characteristic of Poisson Process?
random arrivals
9:28 2013-10-6
P(0,2) // the probability of catching 0 fish in the next 2 hours
9:37 2013-10-6
Poisson == fish in French
9:59 2013-10-6
the exponential random variable is the 1st act of the Poisson movie
10:20 2013-10-6
splitting == thinning out
--------------------------------------------------------------
10:39 2013-10-6
Markov process
10:40 2013-10-6
we now have dependence between different times
rather than memorylessness
10:41 2013-10-6
new state == f(old state, noise)
10:42 2013-10-6
the evolution of some state
10:44 2013-10-6
checkout counter model
10:46 2013-10-6
customer arrivals & customer departures
11:03 2013-10-6
discrete finite state Markov chain
11:06 2013-10-6
finite state space
11:07 2013-10-6
transition probability
11:09 2013-10-6
Markov property/assumption:
given current state, the past does not matter
11:16 2013-10-6
state variable
11:16 2013-10-6
Which is the right state-variable?
11:18 2013-10-6
possible transition, transition probability
11:25 2013-10-6
n-step transition probabilities
11:25 2013-10-6
total probability theorem
11:31 2013-10-6
rij, r == recursion
pij, p == probability
12:09 2013-10-6
settles to some steady-state value
12:17 2013-10-6
limiting case
12:18 2013-10-6
convergence?
12:18 2013-10-6
Does the limit depend on the initial state?
12:32 2013-10-6
recurrent state, transient state
------------------------------------------------
12:43 2013-10-6
Markov chain II
14:12 2013-10-6
self transition
14:12 2013-10-6
your chain is not periodic
14:14 2013-10-6
steady-state probabilities
14:20 2013-10-6
trajectory
14:52 2013-10-6
birth-death process
14:53 2013-10-6
transition probability
14:53 2013-10-6
upward transition, downward transition
17:22 2013-10-6
state space
17:26 2013-10-6
steady-state behavior
17:28 2013-10-6
trajectory
17:37 2013-10-6
recurrent state, transient state
17:39 2013-10-6
recurrent class
17:40 2013-10-6
you're trapped
17:41 2013-10-6
periodic states
17:47 2013-10-6
suppose the chain has some self-transition somewhere
17:48 2013-10-6
certain group of states
17:50 2013-10-6
initial state
17:50 2013-10-6
steady state probabilities
17:54 2013-10-6
this requires that the chain is not periodic, otherwise
it will oscillate
17:55 2013-10-6
steady-state convergence theorem:
1. recurrent states are all in a single class
2. single recurrent class is not periodic
17:59 2013-10-6
initial states get forgotten
18:01 2013-10-6
steady-state probability of state k:
πk (the initial state does not matter)
18:06 2013-10-6
balance equations
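a small sketch for a 2-state chain (the transition matrix is a toy example of
my own): solve the balance equation plus normalization directly, then
cross-check by iterating the n-step transition recursion

P = [[0.9, 0.1],   # p_ij: probability of going from state i to state j
     [0.5, 0.5]]

# balance across the cut between the 2 states: pi0 * p01 == pi1 * p10,
# together with normalization pi0 + pi1 == 1
pi1 = P[0][1] / (P[0][1] + P[1][0])
pi0 = 1 - pi1
print(pi0, pi1)       # ~5/6 and ~1/6

# cross-check: iterate the recursion; the initial state gets forgotten
row = [1.0, 0.0]      # start in state 0
for _ in range(100):
    row = [sum(row[i] * P[i][j] for i in range(2)) for j in range(2)]
print(row)            # settles to the same steady state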
18:28 2013-10-6
the chain settles into steady-state
18:31 2013-10-6
self-arcs
18:33 2013-10-6
birth-death process
18:39 2013-10-6
it's a crude model for the supermarket counter
18:52 2013-10-6
p/q == load factor
18:59 2013-10-6
there is no bias in the other direction
---------------------------------------------------
19:09 2013-10-6
Markov chain III
19:10 2013-10-6
transient state
19:10 2013-10-6
recurrent class of recurrent states
19:20 2013-10-6
conditional probability,
unconditional probability
19:25 2013-10-6
recurrent class
19:29 2013-10-6
balance equations
normalization equation
19:38 2013-10-6
How long does it take to forget the initial state?
19:40 2013-10-6
this Markov chain has a much slower time scale
19:41 2013-10-6
steady-state approximation
19:41 2013-10-6
Erlang distribution
19:47 2013-10-6
let's assume phone calls originate according to a
Poisson Process
19:52 2013-10-6
exponential distribution
19:54 2013-10-6
exponential distribution of a phone call duration
20:08 2013-10-6
upward transition, downward transition
20:37 2013-10-6
absorption probability
20:45 2013-10-6
once we get there, we're stuck in there
-------------------------------------------------
21:10 2013-10-6
review Markov chains III
21:15 2013-10-6
recurrent class
21:19 2013-10-6
balance equations + normalization equation
21:47 2013-10-6
origination of phone calls: Poisson Process
duration of a phone call: exponential distribution
21:51 2013-10-6
continuous-time Markov chains,
discrete-time Markov chains
22:07 2013-10-6
back of the envelope calculation
22:34 2013-10-6
absorption probability
22:46 2013-10-6
probability of absorption
22:48 2013-10-6
getting stuck inside that lump
22:52 2013-10-6
expected time to absorption
23:26 2013-10-6
go to bed
//
9:55 2013-10-7 Monday
review Markov chain III
10:02 2013-10-7
so here we got a 2nd recurrent class
10:25 2013-10-7
the transition probability from state 1 to state 1 in 100 steps
10:26 2013-10-7
steady-state probability
11:01 2013-10-7
this is the transition probability of downward transition
11:09 2013-10-7
this is the probability that you want to be small in
a well-engineered system.
11:11 2013-10-7
mean duration of calls
11:17 2013-10-7
standard real-world application of Markov chains
11:23 2013-10-7
event of interest
----------------------------------------
13:13 2013-10-7
start cs50 section 1
13:17 2013-10-7
preprocessing
13:18 2013-10-7
clang: from C to assembly
13:19 2013-10-7
what does -lcs50 mean?
l == library
13:22 2013-10-7
cs50 Appliance
13:23 2013-10-7
function prototype + function implementation
13:40 2013-10-7
' ' // single quote
13:40 2013-10-7
" " // double quote
13:42 2013-10-7
# the hash symbol
14:00 2013-10-7
bitwise operator
14:09 2013-10-7
xor == exclusive OR
14:14 2013-10-7
the binary representation of 'A'
14:18 2013-10-7
the difference between uppercase & lowercase?
just flip a single bit
14:31 2013-10-7
leading zeros
14:35 2013-10-7
left-shift operator
14:36 2013-10-7
left shift // multiply by 2
right shift // divide by 2
14:40 2013-10-7
1 byte == 8 bits
--------------------------------------------
15:34 2013-10-7
start cs50 section 2
15:38 2013-10-7
pset == problem set
16:01 2013-10-7
How can you convert between capital 'A' & lowercase 'a'
just using bitwise operators?
'A' 01000001
'a' 01100001
just need to flip 1 single bit!
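a one-liner check of the bit-flip trick, sketched in Python here rather than C:

# 'A' (0b01000001) and 'a' (0b01100001) differ only in bit 5,
# so XOR with 0b00100000 toggles case for ASCII letters
flip = lambda c: chr(ord(c) ^ 0b00100000)
print(flip('A'), flip('a'))   # a A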
16:08 2013-10-7
loop over a string
16:38 2013-10-7
sanity check
16:46 2013-10-7
How to swap 2 numbers without using temp,
just using bitwise operators?
16:47 2013-10-7
^ // bitwise exclusive OR
// bitwise xor
16:50 2013-10-7
integer overflow
16:55 2013-10-7
what does "xor is commutative" mean?
a xor b == b xor a
16:58 2013-10-7
anything xor itself is always zero!
so a ^ b ^ b == a
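the swap itself, sketched in Python (the same trick works with C ints):

a, b = 12, 7
a = a ^ b     # a now holds old_a ^ old_b
b = a ^ b     # == (old_a ^ old_b) ^ old_b == old_a
a = a ^ b     # == (old_a ^ old_b) ^ old_a == old_b
print(a, b)   # 7 12, swapped with no temp variable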
17:03 2013-10-7
memory register
17:05 2013-10-7
spell checker
17:07 2013-10-7
fopen()
fclose()
fscanf()
17:08 2013-10-7
man == manual
17:09 2013-10-7
man fopen
17:12 2013-10-7
FILE * // pointer to FILE structure
17:15 2013-10-7
format string
17:17 2013-10-7
scanf() is the opposite of printf()
17:24 2013-10-7
sprintf()
17:25 2013-10-7
format specifier
17:51 2013-10-7
O() is the upper bound
Ω() is the lower bound
17:53 2013-10-7
the lower bound on insertion sort is Ω(n)
17:53 2013-10-7
worst-case scenario
17:58 2013-10-7
binary search
18:06 2013-10-7
main() has its stack frame
18:14 2013-10-7
recursive function
18:15 2013-10-7
simple case + recursive decomposition
19:23 2013-10-9 Wednesday
probability lec 19, WLLN
19:24 2013-10-9
WLLN == Weak Law of Large Numbers
19:24 2013-10-9
limit theorems
19:26 2013-10-9
i.i.d. == independent, identically distributed
19:27 2013-10-9
sample mean
19:28 2013-10-9
expected value is a number,
sample mean is a random variable
19:47 2013-10-9
...the sample mean converges to the true mean
19:47 2013-10-9
WLLN == Weak Law of Large Numbers
SLLN == Strong Law of Large Numbers
19:51 2013-10-9
Markov Inequality
20:08 2013-10-9
standard deviation
20:28 2013-10-9
convergence in probability
20:29 2013-10-9
the probability of falling outside this band converges to zero
21:16 2013-10-9
i.p. == in probability
21:30 2013-10-9
this tail probability is very small
21:30 2013-10-9
the 2nd moment of the random variable goes to infinity
21:31 2013-10-9
WLLN == Weak Law of Large Numbers
convergence of the sample mean
21:41 2013-10-9
expectations give you numbers, whereas sample means give you
random variables
21:45 2013-10-9
Chebyshev Inequality
21:45 2013-10-9
tail probability
21:47 2013-10-9
convergence in probability
21:47 2013-10-9
sample mean converges in probability to the true mean
WLLN == Weak Law of Large Numbers
21:48 2013-10-9
sample mean is a good estimate of the true mean
as your sample size increases
21:52 2013-10-9
the pollster's problem
21:56 2013-10-9
f hat is a random variable; it is
an estimate of f (the number)
22:01 2013-10-9
I cannot give you a hard guarantee
22:02 2013-10-9
this accuracy requirement will be satisfied
with high confidence
22:03 2013-10-9
2 terms:
accuracy, confidence
22:03 2013-10-9
Chebyshev's inequality
22:05 2013-10-9
variance, standard deviation
22:11 2013-10-9
sample size, accuracy, confidence
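a sketch of the resulting sample-size rule n >= 1/(4*eps^2*delta), using the
common eps = 0.01, delta = 0.05 as placeholder targets:

from fractions import Fraction

eps = Fraction(1, 100)    # accuracy: want |f_hat - f| < 0.01
delta = Fraction(1, 20)   # want P(miss) <= 0.05
# Chebyshev: P(|f_hat - f| >= eps) <= p(1-p)/(n*eps^2) <= 1/(4*n*eps^2)
n = 1 / (4 * eps**2 * delta)
print(n)                  # 50000 samples suffice under Chebyshev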
22:15 2013-10-9
i.i.d. random variables
i.i.d. == independent, identically distributed
22:16 2013-10-9
also my distribution becomes wider & wider,
my variance increases
22:19 2013-10-9
what the weak law (WLLN, Weak Law of Large Numbers) tells us is that
we're going to get a distribution that is very highly concentrated
around the true mean
22:20 2013-10-9
density == pdf(Probability Density Function)
22:23 2013-10-9
things do become interesting when we scale with
the square root of n instead of n itself
22:26 2013-10-9
the central limit theorem
22:30 2013-10-9
standardized random variable:
zero mean, unit variance
22:32 2013-10-9
let Z be a standardized normal r.v.
22:33 2013-10-9
r.v. == Random Variable
22:34 2013-10-9
CDF == Cumulative Distribution Function
22:35 2013-10-9
standard normal CDF
22:39 2013-10-9
the central limit theorem just tells us that
pretending Sn is a normal random variable
-------------------------------------------------
22:45 2013-10-9
introduction to probability, lec 20,
central limit theorem
22:51 2013-10-9
standard normal r.v. Z
22:57 2013-10-9
normal random variables
22:58 2013-10-9
What exactly does the Central Limit Theorem say?
CDF of Zn converges to normal CDF
23:03 2013-10-9
the CDFs sit essentially on top of each other, although the
2 PMFs look quite different
23:05 2013-10-9
linear function of normal is normal
23:06 2013-10-9
limit theorem
23:06 2013-10-9
we just pretend that Sn is normal, because we assume
Zn is normal
23:11 2013-10-9
normal distribution
23:16 2013-10-9
the polling problem
23:16 2013-10-9
CLT == Central Limit Theorem
23:20 2013-10-9
w.p. == with probability
23:22 2013-10-9
event of interest
23:37 2013-10-9
probability of error
23:39 2013-10-9
Z is a standard normal random variable
23:41 2013-10-9
normal table
23:45 2013-10-9
binomial distribution
23:46 2013-10-9
Bernoulli(p)
Binomial(n, p)
23:48 2013-10-9
standardized variable
23:49 2013-10-9
standard normal random variable
23:55 2013-10-9
using the Central Limit Theorem pretending that
Sn is normal
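A minimal Python sketch of this "pretend Sn is normal" recipe for Sn ~ Binomial(n, p); the numbers are hypothetical and a 1/2 continuity correction is applied:

import math

def normal_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binomial_cdf_clt(k, n, p):
    mean, std = n * p, math.sqrt(n * p * (1 - p))
    # standardize and pretend Sn is normal (with 1/2 continuity correction)
    return normal_cdf((k + 0.5 - mean) / std)

def binomial_cdf_exact(k, n, p):
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

print(binomial_cdf_clt(21, 36, 0.5))    # approx 0.878
print(binomial_cdf_exact(21, 36, 0.5))  # approx 0.878 as well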
0:05 2013-10-10
CLT == Central Limit Theorem
0:06 2013-10-10
De Moivre-Laplace CLT
0:07 2013-10-10
Zn is approximately normal
0:12 2013-10-10
Poisson Process, arrival rate
0:13 2013-10-10
What is a Poisson Process?
a Poisson process is just a continuous-time version of
the Bernoulli process
0:14 2013-10-10
during the ith little interval
0:15 2013-10-10
because in the Poisson Process, the disjoint
intervals are independent
0:18 2013-10-10
X // the total number of arrivals
0:18 2013-10-10
i.i.d. == independent identically distributed
0:20 2013-10-10
finite mean & finite variance
0:21 2013-10-10
What is the flaw in using
the central limit theorem here?
0:30 2013-10-10
Bernoulli Process:
if p is fixed and n -> infinity, this gives a normal distribution
if np is fixed while n -> infinity and p -> 0, this gives a Poisson
distribution
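A quick numerical check of the second limit (my own sketch; λ = np = 3 is hypothetical): the Binomial(n, λ/n) pmf approaches the Poisson(λ) pmf as n grows.

import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3.0
for n in (10, 100, 1000):
    print(n, binom_pmf(2, n, lam / n), poisson_pmf(2, lam))
# the binomial column approaches the Poisson value ~0.224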
/
21:55 2013-10-10 Thursday
to review introduction to probability, lec 19, 20
WLLN(Weak Law of Large Numbers), CLT(central limit theorem)
22:00 2013-10-10
the expected value is a number, whereas the sample mean
is a random variable
22:04 2013-10-10
Markov Inequality,
Chebyshev Inequality
22:05 2013-10-10
WLLN == Weak Law of Large Numbers
22:06 2013-10-10
weak law, strong law
22:09 2013-10-10
smallness of expected values, smallness of probabilities
22:26 2013-10-10
convergence in probability
22:40 2013-10-10
sample mean, true mean
22:41 2013-10-10
WLLN(convergence of the sample mean)
23:06 2013-10-10
Chebyshev's inequality is just an estimate,
it's not very accurate, not very tight
23:07 2013-10-10
limit theorem
23:07 2013-10-10
i.i.d. == independent identically distributed
23:08 2013-10-10
i.i.d. random variables
23:15 2013-10-10
CLT == Central Limit Theorem
23:18 2013-10-10
linear transformation
23:25 2013-10-10
normal random variable
--------------------------------------------
23:59 2013-10-10
sample mean is a r.v.
true mean is a number
0:02 2013-10-11
we're starting with a useful simple tool that
allows us to relate probabilities to expected values
0:02 2013-10-11
Markov Inequality, Chebyshev Inequality
0:03 2013-10-11
Weak Law of Large Numbers(WLLN):
the sample mean converges to the true mean
0:04 2013-10-11
our 1st tool: Markov Inequality
0:07 2013-10-11
What does Markov's Inequality tell us?
if the expected value is small,
then the probability that X is big is also small
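A small simulation sketch of Markov's inequality P(X >= a) <= E[X]/a for a nonnegative X (exponential with mean 1 here, chosen arbitrarily):

import random

samples = [random.expovariate(1.0) for _ in range(100_000)]  # E[X] = 1
for a in (1, 2, 5):
    tail = sum(x >= a for x in samples) / len(samples)
    print(a, tail, "<=", 1 / a)  # empirical tail vs the Markov bound E[X]/a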
0:18 2013-10-11
Chebyshev's Inequality
0:19 2013-10-11
limits of sequence
0:24 2013-10-11
deterministic sequence
0:27 2013-10-11
the probability of my random variables falling
outside the band becomes smaller & smaller as n
goes to infinity
0:29 2013-10-11
the limit of this probability becomes zero;
the probability meant is that of the random variable
falling outside the band
0:33 2013-10-11
i.p. == in probability
0:33 2013-10-11
converge in probability
0:35 2013-10-11
convergence in probability
0:36 2013-10-11
the tail probability is small
0:37 2013-10-11
finite mean & finite variance
0:37 2013-10-11
What does WLLN(Weak Law of Large Numbers) imply?
convergence of sample mean.
0:40 2013-10-11
expectation gives you numbers, whereas the sample
mean gives you random variables
0:42 2013-10-11
having a large sample size helps to remove
the randomness from your experiment
0:43 2013-10-11
the variance of the sample mean becomes smaller
and smaller
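Sketch of this shrinking variance (my illustration with uniform(0,1) samples, so var(X) = 1/12): the sample mean's variance behaves like var(X)/n.

import random
import statistics

def sample_mean(n):
    return sum(random.random() for _ in range(n)) / n

for n in (10, 100, 1000):
    means = [sample_mean(n) for _ in range(2000)]
    print(n, statistics.variance(means), 1 / (12 * n))  # empirical vs theory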
0:46 2013-10-11
the sample mean converges in probability to the true mean;
this is what the WLLN (Weak Law of Large Numbers) tells us.
0:48 2013-10-11
the sample mean approaches the true mean as the sample size
increases
0:51 2013-10-11
this is an estimate f hat based on the sample we have
0:52 2013-10-11
f hat is a random variable, f is a number
0:58 2013-10-11
accuracy, confidence
1:06 2013-10-11
How can we cut some corners?
1:09 2013-10-11
CLT == Central Limit Theorem
1:09 2013-10-11
i.i.d. == independent identically distributed
1:13 2013-10-11
the CLT(Central Limit Theorem)
1:16 2013-10-11
Zn is a linear transformation of Sn
1:16 2013-10-11
standard normal
1:17 2013-10-11
Central Limit Theorem is a statement about the CDF
1:19 2013-10-11
CDF == Cumulative Distribution Function
1:21 2013-10-11
when n is large, you can pretend Zn is a standard
normal r.v.
1:21 2013-10-11
pretending Zn is normal is the same as pretending
Sn is normal
1:22 2013-10-11
linear function of normal r.v. is normal
//
10:35 2013-10-11
review lec 20, central limit theorem
10:38 2013-10-11
Why is the central limit theorem universal?
it applies to any distribution, requiring only finite mean & finite variance
10:40 2013-10-11
noise usually can be described by a normal r.v.
10:45 2013-10-11
What is the central limit theorem really about?
it's about convergence of the CDF of Zn, not about
convergence of the PDF or PMF
10:52 2013-10-11
central limit theorem is a statement of CDFs,
not a statement about PDFs or PMFs
10:54 2013-10-11
you can just pretend that Sn is normal
11:12 2013-10-11
CLT == Central Limit Theorem
11:45 2013-10-11
Bernoulli(p) // Bernoulli variable
Binomial(n, p) // Binomial variable
11:57 2013-10-11
standard normal
12:08 2013-10-11
to approximate PMFs
12:10 2013-10-11
Poisson Process, arrival rate
12:14 2013-10-11
What is the flaw in this argument?
12:25 2013-10-11
Bernoulli distribution
Binomial process
19:52 2013-11-9
start introduction to probability, lec 21,
Bayesian Statistical Inference
20:00 2013-11-9
make sense of data
20:02 2013-11-9
signal processing in some sense is just
an inference problem!
20:12 2013-11-9
you have some data, you want to draw
some inference from them!
20:21 2013-11-9
system identification
20:24 2013-11-9
the unknown quantity that you are trying
to estimate..
20:27 2013-11-9
How do we know the quantity that is unknown?
20:29 2013-11-9
prior distribution
20:30 2013-11-9
perhaps I have a prior distribution on...
20:31 2013-11-9
classical statistics // treat as unknown numbers
Bayesian // treat as random numbers
20:33 2013-11-9
the hat ^ in estimation usually means an estimate
of something!
20:34 2013-11-9
hypothesis testing
estimation
20:37 2013-11-9
prior: initial beliefs about what theta might be
20:37 2013-11-9
model of experiment
20:40 2013-11-9
posterior distribution
20:43 2013-11-9
probability distribution
20:47 2013-11-9
classical statistics
20:49 2013-11-9
Bayesian: assume a prior distribution of theta
20:49 2013-11-9
prior distribution
20:50 2013-11-9
assume a prior on theta
20:51 2013-11-9
posterior distribution
20:54 2013-11-9
MAP == Maximum a posteriori probability
20:56 2013-11-9
What is the single answer you would give your boss?
20:58 2013-11-9
center of gravity
20:59 2013-11-9
conditional expectation
21:01 2013-11-9
you want to come up with a point estimate!
21:01 2013-11-9
LMS == Least Mean Squares
LMS estimation
21:04 2013-11-9
optimal estimate
21:05 2013-11-9
optimal mean squared error
21:07 2013-11-9
new conditional universe
21:11 2013-11-9
conditional expectation
21:15 2013-11-9
the conditional expectation estimator
is the optimal estimator!
21:17 2013-11-9
single random variables
21:21 2013-11-9
point estimate
-------------------------------------------
21:22 2013-11-9
review probability lec 21,
Bayesian statistical inference I
21:28 2013-11-9
polling: you sample, and based on these
samples you try to make some inference
21:32 2013-11-9
anywhere you have noise, inference comes in
21:32 2013-11-9
signal processing is just an inference problem!
21:33 2013-11-9
inference: make sense of data
21:41 2013-11-9
you are trying to build a model of the medium
through which your signal is propagating
21:47 2013-11-9
How do we model the quantity that is unknown?
21:48 2013-11-9
measuring apparatus
21:49 2013-11-9
I have a prior distribution on the possible
values of theta
21:50 2013-11-9
What are the two main kinds of inference methods?
What is the difference between them?
1. classical statistics // treat unknown as a number
2. Bayesian // treat unknown as a random variable
21:53 2013-11-9
estimator
21:53 2013-11-9
the ^ hat in estimation usually means an estimate
of something
21:55 2013-11-9
What is a prior?
priors are initial beliefs about what
theta might be
21:57 2013-11-9
model of the experimental apparatus
21:58 2013-11-9
this is the posterior distribution of theta
given the data you have seen
21:59 2013-11-9
What are the 2 main methods of inference?
1. classical statistics
2. Bayesian
22:00 2013-11-9
What are the 2 main problems?
1. hypothesis testing
2. estimation
22:02 2013-11-9
What is hypothesis testing?
deciding which of a small number of competing hypotheses is true, based on data
22:05 2013-11-9
multi-dimensional random variables
22:06 2013-11-9
your X and Theta are often vectors of random
variables instead of single random variables
22:09 2013-11-9
instead of assuming theta is a known constant,
they would say that theta is picked randomly...
22:13 2013-11-9
converge in probability to...
22:13 2013-11-9
WLLN == Weak Law of Large Numbers
22:14 2013-11-9
assume a distribution on theta
22:15 2013-11-9
you would choose an appropriate prior distribution
for theta, and then use Bayes' rule
to find the probability of different values of theta
based on the data that you have observed.
22:19 2013-11-9
posterior distribution
22:19 2013-11-9
What method can you use if you are interested in
a single answer?
1. MAP (Maximum a posteriori probability)
2. conditional expectation
22:22 2013-11-9
picking the point in the posterior pmf that
has the highest probability is the reasonable
thing to do; it is the optimal thing to do
if you want to minimize the probability of incorrect
inference
22:23 2013-11-9
that is the method people usually use if they
need to report a single answer, a single decision!
22:24 2013-11-9
MAP estimate == Maximum a posteriori probability estimate
22:26 2013-11-9
choose the one that is most likely
22:27 2013-11-9
let me report the center of gravity of this figure
22:27 2013-11-9
this is the conditional expectation of theta given
the data that you have!
22:28 2013-11-9
but a single answer, a point estimate does not tell
you the whole story
22:29 2013-11-9
there is a lot more information conveyed by this
posterior distribution plot than any single answer
you report!
22:32 2013-11-9
What is LMS estimation?
LMS == Least Mean Squares
22:35 2013-11-9
the optimal estimate, the thing you should report,
is the expected value of theta!
22:36 2013-11-9
this is a familiar quantity, it's just the
variance of the r.v.
22:37 2013-11-9
the original distribution of theta ==
the prior distribution of theta
22:38 2013-11-9
instead of working with the original distribution
of theta, now we work with the conditional
distribution of theta given the data we have
observed
22:40 2013-11-9
LMS estimation of theta based on X
22:41 2013-11-9
conditional expectation of the r.v. based
on the observations you have
22:41 2013-11-9
you have your apparatus that creates the
measurements
22:42 2013-11-9
the estimator spits out the conditional expectation
of theta given the particular data that you have
observed
22:44 2013-11-9
calculating machine
22:48 2013-11-9
the person that is using this estimator
22:48 2013-11-9
the conditional expectation estimator
is the optimal estimator, it's the ultimate
estimating machine
22:50 2013-11-9
one complication is that we will deal with
vectors instead of single random variables
22:52 2013-11-9
implementing this calculating machine is
not easy!

10:36 2013-11-10 Sunday
start probability lec 22,
Bayesian statistical inference II
10:38 2013-11-10
What are the two main kinds of estimators?
1. linear LMS estimator // LMS == Least Mean Squares
2. MAP // MAP == Maximum a posteriori probability
10:40 2013-11-10
prior density * conditional density == joint density
10:43 2013-11-10
either you give the posterior probability distribution, or
you use an estimator to report a single answer:
1. MAP estimator
2. LMS estimator // gives the conditional expectation
10:50 2013-11-10
prior distribution + observation model
10:55 2013-11-10
How to find the joint density?
joint density == prior density * conditional density
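A tiny discrete sketch of this recipe (all numbers hypothetical): multiply the prior by the conditional (the likelihood) to get the joint, then normalize to get the posterior.

# p(θ): two candidate coin biases, each believed equally likely a priori
prior = {0.3: 0.5, 0.7: 0.5}
x = [1, 1, 0, 1]  # hypothetical observed flips

def likelihood(theta, data):
    # p(x | θ) for i.i.d. Bernoulli(θ) flips
    out = 1.0
    for xi in data:
        out *= theta if xi == 1 else (1 - theta)
    return out

joint = {t: prior[t] * likelihood(t, x) for t in prior}  # prior * conditional
z = sum(joint.values())
posterior = {t: v / z for t, v in joint.items()}
print(posterior)  # posterior distribution of θ given the data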
10:58 2013-11-10
optimal estimator is the conditional expectation
11:03 2013-11-10
g(x) = E[θ|X = x]
// the conditional expectation of θ given X = x
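Given any discrete posterior, the two point estimates discussed earlier are then one-liners (the posterior values below are made up):

posterior = {0.3: 0.2, 0.7: 0.8}  # assumed posterior p(θ|x), for illustration

map_estimate = max(posterior, key=posterior.get)         # MAP: the argmax
lms_estimate = sum(t * p for t, p in posterior.items())  # E[θ|X=x]
print(map_estimate, lms_estimate)  # 0.7 and 0.62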
11:07 2013-11-10
squared error
11:07 2013-11-10
conditional mean squared error
11:08 2013-11-10
given the observed x, θ is still a random variable
11:09 2013-11-10
this is the posterior distribution of θ
11:11 2013-11-10
prior distribution, posterior distribution
11:11 2013-11-10
the posterior variance
11:18 2013-11-10
some observations may be rather informative,
other observations may be not so informative!
11:19 2013-11-10
so conditional expectations are really the
cornerstone of Bayesian estimation; they are
particularly popular, especially in engineering
contexts
11:22 2013-11-10
let's look at the expected value of the estimation
error
11:23 2013-11-10
estimation error
11:26 2013-11-10
θ hat is a function of x; once I tell you the value
of x, you know what θ hat is going to be!
11:28 2013-11-10
estimation error
11:32 2013-11-10
θ hat is unbiased
11:34 2013-11-10
in the conditional universe
11:37 2013-11-10
law of iterated expectations
11:38 2013-11-10
unconditional expectation
11:40 2013-11-10
COV == covariance
11:53 2013-11-10
law of total variance
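For reference (standard identities, my addition): the law of iterated expectations is E[E[θ|X]] = E[θ], and the law of total variance referenced here is var(θ) = E[var(θ|X)] + var(E[θ|X]) — the posterior variance averaged over the data, plus the variance of the estimator itself.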
12:00 2013-11-10
linear LMS
12:00 2013-11-10
mean squared error
12:01 2013-11-10
optimal linear estimator
12:11 2013-11-10
linear LMS estimator
12:14 2013-11-10
X & θ are perfectly correlated!
// ρ == 0
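For reference (standard result, my addition): the optimal linear estimator has the closed form θ hat = E[θ] + ρ·(σ_θ/σ_X)·(X − E[X]), with mean squared error (1 − ρ²)·σ_θ²; when |ρ| = 1 the error vanishes, and when ρ = 0 the best linear estimate ignores X and just reports E[θ].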
12:16 2013-11-10
mean squared error
12:19 2013-11-10
linear LMS estimator with multiple data
12:22 2013-11-10
a system of linear equations
12:31 2013-11-10
linear LMS
12:31 2013-11-10
without loss of generality
12:36 2013-11-10
prior mean is treated as an observation
12:44 2013-11-10
estimation methods:
1. MAP // Maximum a posteriori probability
2. MSE // Mean Squared Error
3. Linear MSE
----------------------------------------------
13:09 2013-11-10
review Bayesian Statistical Inference II
13:33 2013-11-10
MAP == Maximum a posteriori probability
13:34 2013-11-10
What is an LMS estimator?
how to design an estimator g(X) that minimizes
E[(θ - g(X))^2]?
------------------------------------------------
13:37 2013-11-10
start introduction to probability, lec 23,
classical statistical inference I
13:39 2013-11-10
ML == Maximum Likelihood
13:39 2013-11-10
CI == Confidence Interval
13:40 2013-11-10
What is classical statistics? What's the
difference from Bayesian statistics?
the big difference is that θ is treated
as a random variable in Bayesian statistics,
whereas θ is treated as an ordinary number
in classical statistics!
13:47 2013-11-10
in classical statistics we do not want to
assume a prior distribution on θ!
13:49 2013-11-10
What is hypothesis testing?
choosing between a default hypothesis H0 and an alternative hypothesis H1 based on data
13:50 2013-11-10
What are the three main problems in classical
statistics?
1. hypothesis testing
2. composite hypotheses
3. estimation
13:52 2013-11-10
estimation
13:52 2013-11-10
estimation error
13:53 2013-11-10
our task is to design an estimator to make
the estimation error Θ hat - θ
small
14:05 2013-11-10
ML == Maximum Likelihood
14:05 2013-11-10
What is Maximum Likelihood Estimation?
pick θ that "makes the data most likely"
14:07 2013-11-10
just choose a θ that makes the data x you
observed most likely
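A brute-force sketch of this idea for i.i.d. Bernoulli(θ) flips (the data are made up): scan candidate θ values and keep the one that maximizes the likelihood; for this model the answer coincides with the sample mean.

x = [1, 0, 1, 1, 0, 1, 1, 1]  # hypothetical observed flips

def likelihood(theta):
    out = 1.0
    for xi in x:
        out *= theta if xi == 1 else (1 - theta)
    return out

candidates = [i / 1000 for i in range(1, 1000)]
theta_ml = max(candidates, key=likelihood)
print(theta_ml, sum(x) / len(x))  # both ~0.75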
14:22 2013-11-10
i.i.d. == independent identically distributed
14:23 2013-11-10
What are the desirable properties of an estimator?
1. unbiased
2. consistent
3. small MSE // Mean Squared Error
14:24 2013-11-10
biased coin, unbiased coin
14:26 2013-11-10
What is θ hat?
θ hat is an estimate of the true θ
14:29 2013-11-10
converge in probability to...
14:42 2013-11-10
biased estimator, unbiased estimator
14:47 2013-11-10
What is bias?
bias == E[θ hat] - true value
14:54 2013-11-10
sample mean
14:57 2013-11-10
WLLN == Weak Law of Large Numbers
14:57 2013-11-10
MSE == Mean Squared Error
14:59 2013-11-10
ML == Maximum Likelihood
15:02 2013-11-10
CI == Confidence Interval
15:10 2013-11-10
it's not that θ is random, it's the
CI (Confidence Interval) that is random!
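A simulation sketch of exactly this point (μ, σ, n below are made-up values, and σ is assumed known): the interval endpoints are random, and about 95% of the intervals cover the fixed θ.

import math
import random

mu, sigma, n = 5.0, 2.0, 100  # true mean (fixed!), known std, sample size
trials, covered = 2000, 0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    m = sum(xs) / n
    half = 1.96 * sigma / math.sqrt(n)  # 95% CI half-width via the CLT
    covered += (m - half <= mu <= m + half)
print(covered / trials)  # ~0.95: the interval is random, mu is not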
15:16 2013-11-10
CLT == Central Limit Theorem
15:17 2013-11-10
normal table
15:17 2013-11-10
standard normal
15:18 2013-11-10
sample mean, true mean
15:36 2013-11-10
we got a nice & consistent way of estimating
variances!
-------------------------------------------------
15:57 2013-11-10
review introduction to probability, lec 23,
classical statistical inference I
15:58 2013-11-10
maximum likelihood estimation
15:58 2013-11-10
CIs using estimated variance
----------------------------------------------------
17:52 2013-11-10
start probability lec 24,
classical inference II
17:54 2013-11-10
maximum likelihood estimation(ML)
17:56 2013-11-10
What is Maximum Likelihood (ML) estimation?
pick a θ that makes the observed data x
most likely to occur!
17:58 2013-11-10
sample mean constant
17:59 2013-11-10
sample mean, true mean
17:59 2013-11-10
CI == Confidence Interval
17:59 2013-11-10
confidence interval for sample mean
18:00 2013-11-10
unknown values are not random variables
but constants
18:02 2013-11-10
the randomness is with respect to the confidence
interval, not with respect to θ
18:03 2013-11-10
How does one construct confidence intervals?
18:05 2013-11-10
CLT == Central Limit Theorem
18:07 2013-11-10
regression
18:16 2013-11-10
likelihood function
18:23 2013-11-10
linear regression
18:25 2013-11-10
W is a noise term that is independent of X
19:01 2013-11-10
linear regression is essentially an ML method
ML == Maximum Likelihood
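Sketch of the closed-form least-squares fit (data are hypothetical): under the model Y = a + b·X + W with W independent zero-mean Gaussian noise, minimizing the sum of squared residuals is the ML estimate.

xs = [0.0, 1.0, 2.0, 3.0, 4.0]  # hypothetical data, roughly y = 1 + 2x
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)  # slope estimate
a = mean_y - b * mean_x                   # intercept estimate
print(a, b)  # ~1.04, ~1.99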
19:17 2013-11-10
linear model, quadratic model
20:01 2013-11-10
hypothesis testing
20:02 2013-11-10
default hypothesis, alternative hypothesis
20:03 2013-11-10
so that will be a reasonable way to approach
that problem!
20:05 2013-11-10
rejection region
20:12 2013-11-10
where to put the threshold
20:12 2013-11-10
LRT == Likelihood Ratio Test
20:18 2013-11-10
the probability of a false rejection
20:18 2013-11-10
false rejection

14:53 2013-11-11 Monday
hypothesis testing
14:53 2013-11-11
default hypothesis(null hypothesis) H0
14:53 2013-11-11
alternative hypothesis H1
14:54 2013-11-11
rejection region
14:55 2013-11-11
acceptance region, rejection region
14:57 2013-11-11
the 1st issue is to design the shape of
your rejection region
14:58 2013-11-11
LRT == Likelihood Ratio Test
15:00 2013-11-11
false rejection probability
15:01 2013-11-11
critical value for making our decision
15:12 2013-11-11
you want to make a decision: which one
of the two is true?
15:21 2013-11-11
tail probability
15:22 2013-11-11
it's a derived distribution problem
15:24 2013-11-11
find the probability distribution under H0!
15:26 2013-11-11
we have the null hypothesis that the coin
is fair
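A sketch of that fair-coin test (threshold from the CLT; the 1000 flips / 532 heads below are made-up numbers): under H0 the heads count Sn is Binomial(n, 1/2), so reject when |Sn − n/2| exceeds 1.96·√n/2 for roughly a 5% false rejection probability.

import math

def reject_h0(heads, n, z=1.96):
    # reject the fair-coin hypothesis if the count strays too far from n/2
    return abs(heads - n / 2) > z * math.sqrt(n) / 2

print(reject_h0(532, 1000))  # True: 32 > ~31, a borderline rejection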
15:26 2013-11-11
null hypothesis <-> alternative hypothesis
null hypothesis == default hypothesis
15:28 2013-11-11
simple binary hypothesis testing
15:28 2013-11-11
composite hypotheses
15:30 2013-11-11
what does it mean to be an outlier?
15:33 2013-11-11
pick shape of rejection region
15:41 2013-11-11
significance level(confidence level)
15:45 2013-11-11
pick a critical value ξ
15:54 2013-11-11
What is a hypothesis?
15:54 2013-11-11
Is my die fair?
15:56 2013-11-11
choose the form of my rejection region:
chi-square test
16:00 2013-11-11
LRT == Likelihood Ratio Test
16:01 2013-11-11
we want to choose our threshold ξ so that the probability
of false rejection is 5 percent!
16:03 2013-11-11
CLT + derived distribution problem
16:03 2013-11-11
CLT == Central Limit Theorem
16:04 2013-11-11
you want to find the distribution under
the hypothesis H0
16:08 2013-11-11
binomial random variable
16:13 2013-11-11
Is my pmf correct? Is my pdf correct?
16:17 2013-11-11
histogram
16:21 2013-11-11
How do you choose the bin size?
16:22 2013-11-11
empirical CDF
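A small sketch of an empirical CDF (data made up): the fraction of samples <= t, which can then be compared against the assumed CDF under H0.

def empirical_cdf(data):
    srt = sorted(data)
    def F(t):
        # fraction of data points that are <= t
        return sum(x <= t for x in srt) / len(srt)
    return F

F = empirical_cdf([0.2, 0.5, 0.5, 0.9])
print(F(0.5))  # 0.75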
16:23 2013-11-11
CDF == Cumulative Distribution Function
16:30 2013-11-11
the assumed CDF(the CDF under the hypothesis H0)
16:32 2013-11-11
the probability distribution of this r.v.
16:37 2013-11-11
give me a model or pdf of these data!
16:38 2013-11-11
when you set up a model like a linear regression
model,
16:43 2013-11-11
Why are most published research findings false?
an obvious bias is that you only publish results
when you see something
