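For the last course of my academic career I picked the one rumored to be the hardest, and it really has been brutally hard; I still do not understand much of it...
I decided to organize the material from the slides and my lecture notes, hoping that writing it up doubles as revision.
(That is, some things are written down here even though I may not fully understand them; doubtful parts will be revised after I have asked about them.)
Problems on randomized algorithms lean on a large number of inequalities; to be able to recall them when needed, some of them simply have to be memorized.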
Randomized Algorithm - Chapter 3.2 (P45)
Union bound: the sum of the probabilities of $n$ random events is at least the probability that at least one of the $n$ events occurs.
Let $E_i$ be a random event, then we have
$$Pr[\cup_{i=1}^{n}E_i] \le \sum_{i=1}^{n}Pr(E_i)$$
Markov's inequality: let $Y$ be a random variable assuming only non-negative values. Then
$$\text{for all } t>0,~Pr[Y \ge t]\le \frac{E[Y]}{t}$$
Chebyshev's inequality: let $X$ be a random variable with expectation $\mu_X$ and standard deviation $\sigma_X$, then
$$\text{for any }t>0,~Pr[|X-\mu_X|\ge t\sigma_X] \le \frac{1}{t^2}$$
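To get a feel for how loose these two bounds can be, here is a minimal sketch that compares them with exact tail probabilities of a binomial variable. The distribution and the parameters `n`, `p`, `t` are illustrative assumptions, not taken from the course material.

```python
from math import comb

def binomial_pmf(n, p, k):
    """Exact probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def tail_at_least(n, p, t):
    """Exact Pr[X >= t] for X ~ Binomial(n, p)."""
    return sum(binomial_pmf(n, p, k) for k in range(n + 1) if k >= t)

def two_sided_tail(n, p, dev):
    """Exact Pr[|X - np| >= dev] for X ~ Binomial(n, p)."""
    mu = n * p
    return sum(binomial_pmf(n, p, k) for k in range(n + 1) if abs(k - mu) >= dev)

n, p = 100, 0.5                     # illustrative parameters (assumption)
mu = n * p                          # expectation
sigma = (n * p * (1 - p)) ** 0.5    # standard deviation

markov_bound = mu / 75              # Markov with threshold t = 75
markov_exact = tail_at_least(n, p, 75)

t = 2.0                             # Chebyshev with t = 2 standard deviations
chebyshev_bound = 1 / t**2
chebyshev_exact = two_sided_tail(n, p, t * sigma)

print(f"Markov:    exact {markov_exact:.3e} <= bound {markov_bound:.3f}")
print(f"Chebyshev: exact {chebyshev_exact:.3f} <= bound {chebyshev_bound:.3f}")
```

Both bounds hold but are far from tight here, which is the motivation for sharper tools such as the Chernoff bound.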
Randomized Algorithm - Chapter 4.1 (P67)
The Chernoff bound comes in three forms, all stated for a sum of independent Poisson trials.
Let $X_1, X_2, \cdots, X_n$ be independent Poisson trials such that,
for $1 \le i \le n,~Pr[X_i=1]=p_i$, where $0<p_i<1$. Then
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } \delta>0,$$
$$Pr[X>(1+\delta)\mu]<\left[ \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right]^{\mu}$$
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$Pr[X<(1-\delta)\mu]<\left[ \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right]^{\mu}$$
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$Pr[|X-\mu| >\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$$
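The three forms can be compared against exact tail probabilities. The sketch below does this for identical trials with $p_i = 1/2$; the values of `n`, `p` and `delta` are assumptions chosen only for illustration.

```python
from math import comb, exp

def binomial_pmf(n, p, k):
    """Exact probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def chernoff_upper(mu, delta):
    """Form 1: bound on Pr[X > (1 + delta) * mu], any delta > 0."""
    return (exp(delta) / (1 + delta) ** (1 + delta)) ** mu

def chernoff_lower(mu, delta):
    """Form 2: bound on Pr[X < (1 - delta) * mu], 0 < delta < 1."""
    return (exp(-delta) / (1 - delta) ** (1 - delta)) ** mu

def chernoff_two_sided(mu, delta):
    """Form 3: bound on Pr[|X - mu| > delta * mu], 0 < delta < 1."""
    return 2 * exp(-delta**2 * mu / 3)

n, p, delta = 200, 0.5, 0.2         # illustrative parameters (assumption)
mu = n * p
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]

exact_upper = sum(q for k, q in enumerate(pmf) if k > (1 + delta) * mu)
exact_lower = sum(q for k, q in enumerate(pmf) if k < (1 - delta) * mu)
exact_two_sided = sum(q for k, q in enumerate(pmf) if abs(k - mu) > delta * mu)

print(f"upper tail: exact {exact_upper:.4f} < bound {chernoff_upper(mu, delta):.4f}")
print(f"lower tail: exact {exact_lower:.4f} < bound {chernoff_lower(mu, delta):.4f}")
print(f"two-sided:  exact {exact_two_sided:.4f} < bound {chernoff_two_sided(mu, delta):.4f}")
```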
Let $X$ be a random variable with expectation $\mu_X$ and standard deviation $\sigma_X$, then
$$\text{for any }t>0,~Pr[|X-\mu_X|\ge t\sigma_X] \le \frac{1}{t^2}$$
$$
\begin{aligned}
& Pr \left( |X-\mu_X| \ge t\sigma_X \right) = Pr \left( (X-\mu_X)^2 \ge (t\sigma_X)^2 \right) \\
& \textbf{set } Y \triangleq (X-\mu_X)^2 \ge 0 \text{, then by Markov's inequality} \\
& Pr \left( Y \ge (t\sigma_X)^2 \right) \le \frac{E(Y)}{(t\sigma_X)^2} \\
& \because E(Y) = E\left( (X-\mu_X)^2 \right) = \sigma_X^2 \\
& \therefore Pr \left( Y \ge (t\sigma_X)^2 \right) \le \frac{\sigma_X^2}{(t\sigma_X)^2} = \frac{1}{t^2}
\end{aligned}
$$
Let $X_1, X_2, \cdots, X_n$ be independent Poisson trials such that,
for $1 \le i \le n,~Pr[X_i=1]=p_i$, where $0<p_i<1$. Then
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } \delta>0,$$
$$Pr[X>(1+\delta)\mu]<\left[ \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right]^{\mu}$$
For the random variables (R.V.) $X_1, X_2, \cdots, X_n$:
$$
\begin{aligned}
& Pr(X_i=1) = p_i,~ Pr(X_i=0) = 1-p_i \\
& \mu = \sum_{i=1}^{n}p_i,~ X = \sum_{i=1}^{n}X_i,~ E(X)=\mu
\end{aligned}
$$
Applying Markov's inequality to $X$ directly only gives the weak bound
$$Pr(X>(1+\delta)\mu) \le \frac{E(X)}{(1+\delta)\mu} = \frac{1}{1+\delta}$$
so it is applied to $e^{\lambda X}$ instead, for any $\lambda > 0$:
$$Pr(X>(1+\delta)\mu) = Pr(e^{\lambda X}>e^{\lambda(1+\delta)\mu}) \le \frac{E(e^{\lambda X})}{e^{\lambda(1+\delta)\mu}}\le \frac{e^{\mu(e^{\lambda}-1)}}{e^{\lambda(1+\delta)\mu}}$$
Setting $\lambda = \ln(1+\delta)$ turns the right-hand side into $\left( \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right)^{\mu}$, which proves the claim.
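The choice $\lambda = \ln(1+\delta)$ is not arbitrary: it minimizes the exponent of the bound. A short worked derivation, using only standard calculus (the name $g$ is introduced here for illustration):

$$
\begin{aligned}
g(\lambda) &= \mu(e^{\lambda}-1) - \lambda(1+\delta)\mu, \\
g'(\lambda) &= \mu e^{\lambda} - (1+\delta)\mu = 0 \iff \lambda = \ln(1+\delta), \\
g''(\lambda) &= \mu e^{\lambda} > 0 \quad \text{(so this is a minimum)}, \\
g(\ln(1+\delta)) &= \mu\left(\delta - (1+\delta)\ln(1+\delta)\right), \\
e^{g(\ln(1+\delta))} &= \left( \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right)^{\mu}.
\end{aligned}
$$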
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$Pr[X<(1-\delta)\mu]<\left[ \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right]^{\mu}$$
For $\lambda > 0$, Markov's inequality is applied to $e^{-\lambda X}$, where:
$$
\begin{aligned}
E(e^{-\lambda X}) &= E(e^{-\lambda(\sum_{i=1}^{n}X_i)}) \\
&= E(\prod_{i=1}^{n} e^{-\lambda X_i}) = \prod_{i=1}^{n}E(e^{-\lambda X_i}) \\
&= \prod_{i=1}^{n}(p_i \cdot e^{-\lambda} + (1-p_i)) \\
&= \prod_{i=1}^{n}( 1 + p_i (e^{-\lambda}-1)) \\
&\le \prod_{i=1}^{n} e^{p_i (e^{-\lambda}-1)} = e^{\mu(e^{-\lambda}-1)}
\end{aligned}
$$
(The product factorizes by independence, and the last step uses $1+y \le e^{y}$.) Substituting into the original expression gives:
$$
\begin{aligned}
Pr[X < (1-\delta)\mu] &= Pr[e^{-\lambda X} > e^{-\lambda (1-\delta) \mu}] \\
&\le \frac{E(e^{-\lambda X})}{e^{-\lambda (1-\delta) \mu}} \\
&\le \frac{e^{\mu(e^{-\lambda}-1)}}{e^{-\lambda (1-\delta) \mu}} \\
&= e^{\mu(e^{-\lambda}-1+\lambda-\lambda\delta)}
\end{aligned}
$$
Let $f(\lambda) = e^{-\lambda}-1+\lambda-\lambda\delta$.
Then $f'(\lambda) = -e^{-\lambda} + 1 - \delta = 0$ when $\lambda = -\ln (1-\delta)$, which is where $f$ is minimized.
Hence $Pr[X<(1-\delta)\mu] < e^{\mu f(-\ln(1-\delta))} = \left( \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right)^{\mu}$
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$Pr[|X-\mu| >\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$$
First remove the absolute value:
$$Pr[|X-\mu| > \delta\mu] = Pr[X-\mu > \delta\mu] + Pr[X-\mu < -\delta\mu]$$
For the first part:
$$
\begin{aligned}
Pr[X-\mu > \delta\mu] &= Pr[X > (\delta+1)\mu] \\
&< \left( \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right)^{\mu} \\
&= e^{\mu \cdot (\delta - (1+\delta) \ln (1+\delta))} \\
&< e^{-\frac{\delta^2}{3}\mu}
\end{aligned}
$$
Similarly, $Pr[X-\mu < -\delta\mu] < e^{-\frac{\delta^2}{3}\mu}$, so
$$
\begin{aligned}
Pr[|X-\mu| > \delta\mu] &= Pr[X-\mu > \delta\mu] + Pr[X-\mu < -\delta\mu] \\
&< e^{-\frac{\delta^2}{3}\mu} + e^{-\frac{\delta^2}{3}\mu} \\
&= 2e^{-\frac{\delta^2}{3}\mu}
\end{aligned}
$$
Hence $Pr[|X-\mu|>\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$, as claimed.
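The proof relies on the exponent inequality $\delta - (1+\delta)\ln(1+\delta) < -\delta^2/3$ for $0<\delta<1$, and on its lower-tail analogue, without deriving them. The sketch below is a numerical sanity check on a grid, not a proof; the function names and the grid resolution are assumptions made for illustration.

```python
from math import log

def upper_tail_exponent(delta):
    """Exponent of form 1 per unit of mu: delta - (1 + delta) * ln(1 + delta)."""
    return delta - (1 + delta) * log(1 + delta)

def lower_tail_exponent(delta):
    """Exponent of form 2 per unit of mu: -delta - (1 - delta) * ln(1 - delta)."""
    return -delta - (1 - delta) * log(1 - delta)

def simplified_exponent(delta):
    """Exponent of form 3 per unit of mu: -delta^2 / 3."""
    return -delta**2 / 3

# Grid over the open interval (0, 1); the resolution is an arbitrary choice.
grid = [i / 1000 for i in range(1, 1000)]
upper_ok = all(upper_tail_exponent(d) < simplified_exponent(d) for d in grid)
lower_ok = all(lower_tail_exponent(d) < simplified_exponent(d) for d in grid)
print("upper tail exponent below -delta^2/3 on the grid:", upper_ok)
print("lower tail exponent below -delta^2/3 on the grid:", lower_ok)
```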
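I used to think that throwing balls into bins was merely a matter of the pigeonhole principle or of permutations and combinations;
the advanced algorithms course studies it in considerably more depth...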
$m$ balls, $n$ bins. You randomly throw each ball to some bin.
$X_i$ : number of balls in the $i$-th bin.
Let $k \triangleq \max(X_1, X_2, \cdots, X_n)$.
Question: expectation and distribution of $k$?
- $m = o(\sqrt{n})$
+ prove $Pr(k>1)=o(1)$.
+ $k=1~w.h.p$
$$m=1,~ Pr(k=1) = 1-o(1)$$
$$m=2,~ \begin{cases} Pr(k=1)=1-1/n \\ Pr(k=2)=1/n \end{cases}$$
$$m= ? ~,~ Pr(k=1)=1-o(1)$$
The statement $Pr(k=1)=1-o(1)$ can equivalently be read as:
$$Pr(\max(X_1, X_2, \cdots, X_n)\ge 2) = o(1)$$
Then, by the Union Bound mentioned under Useful Inequalities (and since all bins are symmetric):
$$
\begin{aligned}
Pr(X_1 \ge 2~or~X_2 \ge 2~or~\cdots~or~X_n \ge 2) ~&\le \sum_{i=1}^{n}Pr(X_i \ge 2) \\
& = n \cdot Pr(X_1 \ge 2)
\end{aligned}
$$
where
$$
\begin{aligned}
Pr(X_1 \ge 2) ~&\le \binom{m}{2} \left(\frac{1}{n} \right)^2 = \Theta(\frac{m^2}{n^2}) \\
Pr(X_1 \ge 2) ~&= \sum_{k=2}^{m}Pr(X_1=k) \\
&= \sum_{k=2}^{m} \binom{m}{k}\cdot(\frac{1}{n})^k(1-\frac{1}{n})^{m-k} \\
&= 1- Pr(X_1=0) - Pr(X_1=1) \\
&= 1-(1-\frac{1}{n})^m - m\cdot \frac{1}{n} \cdot (1-\frac{1}{n})^{m-1} \\
& = \Theta(\frac{m^2}{n^2})
\end{aligned}
$$
Substituting back into the original expression:
$$n \cdot Pr(X_1 \ge 2) = \Theta(m^2/n) = o(1) \\ \therefore m = o(\sqrt{n})$$
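The closed form for $Pr(X_1 \ge 2)$ and the pair bound $\binom{m}{2}/n^2$ can be compared exactly with rational arithmetic. This is a minimal sketch; the function names and the concrete values of `n` and `m` are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def prob_bin_has_at_least_two(m, n):
    """Exact Pr(X_1 >= 2) when m balls are thrown uniformly into n bins."""
    p = Fraction(1, n)
    p_zero = (1 - p) ** m                 # Pr(X_1 = 0)
    p_one = m * p * (1 - p) ** (m - 1)    # Pr(X_1 = 1)
    return 1 - p_zero - p_one

def pair_bound(m, n):
    """Upper bound C(m, 2) / n^2: a union bound over all pairs of balls."""
    return Fraction(comb(m, 2), n * n)

n = 10_000
for m in (5, 20, 100):                    # 100 equals sqrt(n); 5 and 20 are well below it
    exact = prob_bin_has_at_least_two(m, n)
    bound = pair_bound(m, n)
    # The last column is n * bound, the union bound on Pr(k > 1).
    print(m, float(exact), float(bound), float(n * bound))
```

For $m$ well below $\sqrt{n}$ the last column is close to 0, matching $Pr(k>1)=o(1)$; at $m=\sqrt{n}$ it is a constant.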
- $m = \Theta(\sqrt{n})$; (Birthday Paradox)
+ compute $Pr(k>1)$ again.
+ $k=1~or~2~w.h.p$
$$
\begin{aligned}
m = \Theta(\sqrt{n})~&=c\sqrt{n} \\
Pr(X_1 \ge 2) ~&\le \binom{m}{2} \left(\frac{1}{n} \right)^2 \approx \frac{c^2}{2n} \\
Pr(k > 1) ~&\le n \cdot Pr(X_1 \ge 2) \le \frac{c^2}{2} \\
Pr(k = 1) ~& = \frac{n-1}{n} \cdot \frac{n-2}{n} \cdot \frac{n-3}{n} \cdots \frac{n-m+1}{n} \\
&= Pr(E_1 \cdots E_m) = Pr(E_1)Pr(E_2|E_1)Pr(E_3|E_1E_2)\cdots \\
&= (1-\frac{1}{n}) \cdot (1-\frac{2}{n}) \cdot (1-\frac{3}{n}) \cdots (1-\frac{m-1}{n})
\end{aligned}
$$
Here $E_i$ is presumably the event that the $i$-th ball lands in a bin not hit by any of the first $i-1$ balls (its definition is cut off in my notes; this reading matches the factors $\frac{n-i+1}{n}$).
Bounding every factor from below by the smallest one (the step labelled Union Bound in the notes):
$$
\begin{aligned}
Pr(k = 1) ~&= (1-\frac{1}{n}) \cdot (1-\frac{2}{n}) \cdot (1-\frac{3}{n}) \cdots (1-\frac{m-1}{n})\\
&\ge (1-\frac{m-1}{n})^{m-1} \\
&\sim (1-\frac{m-1}{n})^{\frac{n}{m-1}\cdot{\frac{(m-1)^2}{n}}} \sim (\frac{1}{e})^{\frac{m^2}{n}}
\end{aligned}
$$
And since $1-x \le e^{-x}$:
$$
\begin{aligned}
&(1-\frac{1}{n}) \cdot (1-\frac{2}{n}) \cdot (1-\frac{3}{n}) \cdots (1-\frac{m-1}{n}) \\
\le~ & e^{-1/n} \cdot e^{-2/n} \cdot e^{-3/n} \cdots e^{-(m-1)/n} = e^{-m(m-1)/2n} \\
\approx~ & e^{-m^2/2n} < 1 \\
\therefore ~ & Pr(k \ge 2) = 1 - Pr(k = 1) \ge 1- e^{-c^2/2}
\end{aligned}
$$
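The exact product for $Pr(k=1)$ and the two bounds around it are easy to evaluate numerically. The sketch below uses the classic birthday setting ($n=365$, $m=23$) as an assumed example; the function names are illustrative.

```python
from math import exp

def prob_all_distinct(m, n):
    """Exact Pr(k = 1): all m balls land in different bins out of n."""
    prob = 1.0
    for i in range(1, m):
        prob *= 1 - i / n
    return prob

def smallest_factor_lower_bound(m, n):
    """Lower bound (1 - (m-1)/n)^(m-1): every factor replaced by the smallest one."""
    return (1 - (m - 1) / n) ** (m - 1)

def exponential_upper_bound(m, n):
    """Upper bound exp(-m(m-1)/(2n)), obtained from 1 - x <= e^(-x)."""
    return exp(-m * (m - 1) / (2 * n))

n, m = 365, 23                      # classic birthday setting (assumption)
p = prob_all_distinct(m, n)
low = smallest_factor_lower_bound(m, n)
high = exponential_upper_bound(m, n)
print(f"{low:.4f} <= Pr(k=1) = {p:.4f} <= {high:.4f}")
print(f"Pr(k>=2) = {1 - p:.4f}")
```

The exponential upper bound is much closer to the exact value than the smallest-factor lower bound, which is why the $e^{-m^2/2n}$ estimate is the one usually quoted.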
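As for the case $k \ge 3$:
(The order of the blackboard notes for this part was rather chaotic; being slow-witted, I still could not make sense of it after a full half hour, so I am shelving it for now.)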
For the proof of case 3, we first need to prepare an approximate bound that comes from factorials:
$$(\frac{m}{x})^x \le \binom{m}{x} \le (\frac{em}{x})^x$$
First show that $\tbinom{m}{x} = \frac{m!}{x!(m-x)!} \sim \frac{m^x}{x!}$ (for fixed $x$ as $m \rightarrow \infty$):
$$
\begin{aligned}
\lim\limits_{m \rightarrow \infty}\frac{\tbinom{m}{x}}{\frac{m^x}{x!}} &= \lim\limits_{m \rightarrow \infty}\frac{m(m-1)(m-2)\cdots(m-x+1)}{m^x} \\
&= \lim\limits_{m \rightarrow \infty} 1\cdot(1-\frac{1}{m})(1-\frac{2}{m})\cdots(1-\frac{x-1}{m}) \\
&= 1
\end{aligned}
$$
Here we need the approximation formula for factorials, Stirling's formula:
$$n! \sim \sqrt{2 \pi n}(\frac{n}{e})^n$$
$$\frac{m^x}{x!} \sim \frac{m^x}{\sqrt{2\pi x}(\frac{x}{e})^x}=\frac{e^xm^x}{\sqrt{2\pi x}x^x}=\frac{e^x}{\sqrt{2\pi x}}(\frac{m}{x})^x \le (\frac{em}{x})^x$$
Moreover, for $x \ge 1$,
$$\frac{e^x}{\sqrt{2\pi x}} > 1$$
so
$$\frac{e^x}{\sqrt{2\pi x}}(\frac{m}{x})^x \ge (\frac{m}{x})^x$$
that is,
$$(\frac{m}{x})^x \le \binom{m}{x} \le (\frac{em}{x})^x$$
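Since the argument goes through asymptotic equivalences, it is reassuring to check the two-sided bound directly for concrete integers. The sketch below does so for all $1 \le x \le m \le 60$; the range and the helper name are assumptions made for illustration.

```python
from math import comb, e

def binomial_bounds_hold(m, x):
    """Check (m/x)^x <= C(m, x) <= (e*m/x)^x for integers 1 <= x <= m."""
    lower = (m / x) ** x
    upper = (e * m / x) ** x
    return lower <= comb(m, x) <= upper

m_max = 60                          # arbitrary range for the check
all_ok = all(
    binomial_bounds_hold(m, x)
    for m in range(1, m_max + 1)
    for x in range(1, m + 1)
)
print("bounds hold for all 1 <= x <= m <=", m_max, ":", all_ok)
```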
- $m=n$
+ find suitable $x$, such that $Pr(k \le x)=1-o(1)$
+ $k=\Theta(\frac{\ln n}{\ln \ln n})~w.h.p$
Let $x = \frac{\ln n}{\ln \ln n}$. First prove the upper bound on $k$:
$$Pr(k \le x) = 1-o(1)$$
for which it suffices to show:
$$Pr(k \ge x) = o(1)$$
By the Union Bound:
$$Pr(k \ge x) \le n \cdot Pr(X_1 \ge x) \le n \cdot \binom{m}{x}\left( \frac{1}{n} \right)^x = n \cdot \binom{n}{x}\left( \frac{1}{n} \right)^x$$
In the previous subsection we obtained, via Stirling's formula:
$$(\frac{m}{x})^x \le \binom{m}{x} \le (\frac{em}{x})^x$$
Substituting:
$$n \cdot \binom{n}{x}\left( \frac{1}{n} \right)^x \le n\cdot \left( \frac{en}{x} \right)^x \left( \frac{1}{n} \right)^x = n\cdot \left( \frac{e}{x} \right)^x = o(1)$$
(The final $o(1)$ needs a constant factor in $x$: with the standard choice $x = \frac{3\ln n}{\ln\ln n}$ one gets $n\left(\frac{e}{x}\right)^x \le \frac{1}{n}$ for large $n$, which is enough for $k = O(\frac{\ln n}{\ln\ln n})$.)
Next prove the lower bound on $k$: for a suitable constant $c$,
$$Pr(k \ge c \cdot x) = 1-o(1)$$
that is, we need $Pr(k < c \cdot x) = o(1)$, where
$$Pr(k < c \cdot x) = Pr(E_1 \land \cdots \land E_n)$$
Here $E_i$ denotes the event $X_i < c \cdot x$, with indicator
$$Y_i=\begin{cases} 1, ~E_i\text{ does not occur}\\ 0, ~E_i\text{ occurs} \end{cases}$$
Then:
$$Pr(k < c \cdot x)=Pr(\forall i, Y_i=0) = Pr(\sum_{i=1}^{n}Y_i=0)$$
Since $\sum_{i=1}^{n}Y_i=0$ implies a deviation of at least $E(\sum_{i=1}^{n}Y_i)$ from the mean, this is at most, by Chebyshev's inequality:
$$Pr \left( \left|\sum_{i=1}^{n}Y_i - E(\sum_{i=1}^{n}Y_i) \right| \ge E(\sum_{i=1}^{n}Y_i) \right) \le \frac{\sigma^2(\sum_{i=1}^{n}Y_i)}{(E(\sum_{i=1}^{n}Y_i))^2}$$
(The derivation of the expectation and variance is long; it is shelved for now and will be filled in when I have time.) Hence:
$$Pr(k<cx)=Pr(Y_1+Y_2+\cdots+Y_n=0) \le \frac{Var(\sum_{i=1}^{n}Y_i)}{E^2(\sum_{i=1}^{n}Y_i)} = O\left(\frac{n}{(n^{1-c})^2}\right) \sim \frac{1}{n^{1/3}},~~~\therefore c=1/3$$
$$\frac{\ln n}{3\ln\ln n}<k<\frac{\ln n}{\ln\ln n}$$
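A small simulation makes the $\Theta(\frac{\ln n}{\ln\ln n})$ scale tangible. This is only a sketch under assumed parameters (`n`, the seed and the number of repetitions are arbitrary); it illustrates the order of magnitude and says nothing about the constants.

```python
import random
from math import log

def max_load(n, rng):
    """Throw n balls into n bins uniformly at random; return the load of the fullest bin."""
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return max(loads)

n = 10_000                          # illustrative size (assumption)
rng = random.Random(0)              # fixed seed so the run is repeatable
scale = log(n) / log(log(n))        # ln n / ln ln n, about 4.15 for n = 10^4
samples = [max_load(n, rng) for _ in range(20)]
print(f"ln n / ln ln n = {scale:.2f}")
print("observed max loads:", sorted(samples))
```

The observed maximum loads should stay within a small constant factor of `scale`, in line with $k=\Theta(\frac{\ln n}{\ln\ln n})$.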
Consider the case with $n$ balls and $n$ bins,
let $X$ be the random variable of the number of empty bins. Compute $E(X)$, and the deviation between $X$ and $E(X)$.
The result should be in the form $Pr(|X-E(X)|>a)<b$
Let $Z_i$ indicate whether the $i$-th bin contains a ball: $Z_i=1$ if it contains at least one ball, and $Z_i=0$ if it is empty.
Then the number of non-empty bins is
$$Y=\sum_{i=1}^{n}Z_i$$
$$E(Y)=E(\sum_{i=1}^{n}Z_i)=\sum_{i=1}^{n}E(Z_i)=nE(Z_1)$$
where
$$E(Z_1)=p(Z_1=1)\cdot 1 + p(Z_1=0)\cdot 0 = 1 - (1-\frac{1}{n})^n \approx 1-e^{-1}$$
so
$$E(X) = E(n-Y) = n-E(Y) = n(1-\frac{1}{n})^n \approx e^{-1}n$$
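The value $E(X) = n(1-\frac{1}{n})^n$ can be confirmed by brute force for tiny $n$ and compared with $n/e$ for larger $n$. A minimal sketch, with function names and test sizes chosen here purely for illustration:

```python
from fractions import Fraction
from itertools import product
from math import e

def expected_empty_bins(n):
    """Closed form n * (1 - 1/n)^n for n balls thrown into n bins."""
    return n * Fraction(n - 1, n) ** n

def expected_empty_bins_brute_force(n):
    """Average the number of empty bins over all n^n equally likely outcomes."""
    total_empty = 0
    for outcome in product(range(n), repeat=n):   # outcome[j] is the bin of ball j
        total_empty += n - len(set(outcome))      # bins that received no ball
    return Fraction(total_empty, n ** n)

small = 4                           # small enough to enumerate all 4^4 outcomes
print(expected_empty_bins(small), expected_empty_bins_brute_force(small))

large = 1000
print(float(expected_empty_bins(large)), large / e)
```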
For $\lambda > 0$, writing $Z$ for the number of empty bins (the $X$ of the exercise):
$$\mu = E[Z] = n(1-\frac{1}{n})^n \sim ne^{-1}$$
$$Pr[|Z-\mu|\ge \lambda]\le 2\cdot exp(-\frac{\lambda^2}{2n})$$
In particular, with $m$ balls and $m \gg n$:
$$\mu = E[Z] = n(1-\frac{1}{n})^m \sim ne^{-m/n}$$
$$Pr[|Z-\mu|\ge \lambda]\le 2\cdot exp(-\frac{\lambda^2(n-1/2)}{n^2-\mu^2})$$
- $m \ge n\ln n$
+ $k=\Theta (\frac{m}{n})~w.h.p$
To prove:
$$Pr(k \ge c \cdot \frac{m}{n}) = o(1)$$
that is, to bound:
$$Pr(X_1 \ge c\frac{m}{n}~~or~~X_2 \ge c\frac{m}{n}~~or~\cdots~or~~X_n \ge c\frac{m}{n})$$
By the Union Bound,
$$Pr(k \ge c \cdot \frac{m}{n}) \le n \cdot Pr(X_1 \ge c \frac{m}{n})$$
First the upper bound:
$$Pr \left(X_1 \ge c\frac{m}{n} \right) \le \binom{m}{c\frac{m}{n}} \left( \frac{1}{n} \right)^{c\frac{m}{n}} \le \left( \frac{em}{c\frac{m}{n}} \right)^{c\frac{m}{n}} \left( \frac{1}{n} \right)^{c\frac{m}{n}} = \left( \frac{e}{c} \right)^{c\frac{m}{n}}$$
Since $m \ge n\ln n$, for a constant $c > e$,
$$Pr(X_1 \ge c\frac{m}{n}) \le \left( \frac{e}{c} \right)^{c\frac{m}{n}} \le \left( \frac{e}{c} \right)^{c\ln n} = o(1/n)$$
where the last step holds once $c\ln\frac{c}{e} > 1$, so that $n \cdot Pr(X_1 \ge c\frac{m}{n}) = o(1)$.
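The step $\left(\frac{e}{c}\right)^{c\ln n} = o(1/n)$ becomes transparent once the expression is rewritten as a power of $n$. A short worked computation, with $c = e^2$ picked here only as a convenient example value:

$$
\begin{aligned}
\left(\frac{e}{c}\right)^{c\ln n} &= \exp\left(-c\ln\frac{c}{e}\cdot\ln n\right) = n^{-c\ln\frac{c}{e}}, \\
c = e^2:\quad c\ln\frac{c}{e} &= e^2 \approx 7.39, \\
n \cdot \left(\frac{e}{c}\right)^{c\ln n} &= n^{1-e^2} = o(1).
\end{aligned}
$$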
Next the lower bound, by Chernoff's Bound:
$$Pr\left( \left| Y_1 + \cdots + Y_m - E(Y_1 + \cdots + Y_m) \right| > c_1\frac{m}{n} \right) \le~?$$
where $Y_i$ indicates that the $i$-th ball is thrown into the first bin, $X_1 = \sum_{i=1}^{m}Y_i$, and $Y_i=\begin{cases} 1,~~\text{with probability } 1/n \\ 0,~~\text{with probability } 1-1/n \end{cases}$, so that $E(X_1) = m/n$:
$$Pr( |X_1 - m/n| > c_1\frac{m}{n} ) \le 2 \cdot exp(-\frac{c_1^2}{3}\cdot\frac{m}{n}) \le 2\cdot exp(-\frac{c_1^2}{3}\ln n) = 2 \frac{1}{n^{\frac{c_1^2}{3}}} = o(\frac{1}{n})$$
(The last equality needs $c_1^2/3 > 1$, whereas the two-sided Chernoff form requires $0 < c_1 < 1$; with $c_1 < 1$ the same argument gives $o(1/n)$ when $m \ge C n\ln n$ for a constant $C > 3/c_1^2$.)