Chernoff Bounds

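The following material is taken from here.
In this article we first establish several preliminary results, then state the Chernoff bounds and prove them.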
Let $X$ be a random variable and $a\in\mathbb{R}$. For any $s>0$, the map $x\mapsto e^{sx}$ is increasing and positive, so Markov's inequality gives Formula 1:
$$\Pr(X\ge a) = \Pr(e^{sX}\ge e^{sa}) \le \frac{E(e^{sX})}{e^{sa}}$$
Similarly, for any $s>0$, Markov's inequality gives Formula 2:
$$\Pr(X\le a) = \Pr(e^{-sX} \ge e^{-sa}) \le \frac{E(e^{-sX})}{e^{-sa}}$$
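Formula 1 can be checked numerically. The sketch below is only an illustration under assumed values: it takes $X\sim\mathrm{Binomial}(20, 0.3)$ and $a = 10$ (neither appears in the original text), computes $\Pr(X\ge a)$ and $E(e^{sX})$ exactly from the probability mass function, and compares the two sides for a few values of $s>0$. The helper names are hypothetical.

```python
import math

def binomial_pmf(n: int, p: float, k: int) -> float:
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def exact_upper_tail(n: int, p: float, a: float) -> float:
    """Pr(X >= a), summed directly from the pmf."""
    return sum(binomial_pmf(n, p, k) for k in range(n + 1) if k >= a)

def formula1_bound(n: int, p: float, a: float, s: float) -> float:
    """E(e^{sX}) / e^{sa}, with the expectation computed exactly from the pmf."""
    mgf = sum(binomial_pmf(n, p, k) * math.exp(s * k) for k in range(n + 1))
    return mgf / math.exp(s * a)

# Assumed illustrative parameters, not taken from the article.
n, p, a = 20, 0.3, 10
tail = exact_upper_tail(n, p, a)
for s in (0.1, 0.5, 1.0, 2.0):
    print(f"s={s}: exact tail={tail:.5f}, bound={formula1_bound(n, p, a, s):.5f}")
```

The bound holds for every $s>0$ but its tightness depends on $s$, which is why the proof of the Chernoff bound later picks a specific $s$.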

Let $M_X(s) = E(e^{sX})$ denote the moment generating function of $X$. By Taylor expansion (assuming the expectation and the infinite sum can be exchanged),
$$M_X(s) = E\left(1 + sX + \frac{1}{2}s^2X^2 + \frac{1}{3!}s^3X^3 + \cdots\right) = \sum_{i = 0}^\infty\frac{1}{i!}s^iE(X^i)$$
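As a small worked instance of the series, consider a variable $Y$ with $\Pr(Y=1)=p$ and $\Pr(Y=0)=1-p$. Then $E(Y^0)=1$ and $E(Y^i)=p$ for every $i\ge 1$, so the series should sum to $1 + p(e^s-1)$. The sketch below compares a truncated series with that closed form; the function names and the truncation at 30 terms are illustrative assumptions.

```python
import math

def bernoulli_mgf_series(p: float, s: float, terms: int = 30) -> float:
    """Truncated moment series: sum_{i < terms} s^i / i! * E(Y^i).

    For a 0/1 variable, E(Y^0) = 1 and E(Y^i) = p for i >= 1.
    """
    total = 1.0  # i = 0 term
    for i in range(1, terms):
        total += s**i / math.factorial(i) * p
    return total

def bernoulli_mgf_closed_form(p: float, s: float) -> float:
    """E(e^{sY}) = p * e^s + (1 - p)."""
    return p * math.exp(s) + (1 - p)

p, s = 0.3, 1.5  # assumed example values
print(bernoulli_mgf_series(p, s), bernoulli_mgf_closed_form(p, s))
```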

Lemma 1. Let $X_1, \cdots, X_n$ be independent random variables and $X=\sum_{i=1}^nX_i$. Then
$$M_X(s) = \prod_{i=1}^nM_{X_i}(s).$$
Proof: Because the $X_i$ are independent, the expectation of the product factors into the product of expectations (the fourth equality below):
$$M_X(s) = E(e^{sX}) = E\left(e^{s\sum_{i=1}^nX_i}\right) = E\left(\prod_{i=1}^n e^{sX_i}\right) = \prod_{i=1}^nE(e^{sX_i}) = \prod_{i=1}^nM_{X_i}(s)$$
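Lemma 1 can be verified by brute force on a small example. The sketch below assumes independent 0/1 variables with hypothetical success probabilities, computes $E(e^{sX})$ by enumerating all $2^n$ joint outcomes, and compares it with the product of the individual moment generating functions.

```python
import itertools
import math

def mgf_of_sum_by_enumeration(ps: list[float], s: float) -> float:
    """E(e^{sX}) for X = sum of independent 0/1 variables, by enumerating all outcomes."""
    total = 0.0
    for outcome in itertools.product((0, 1), repeat=len(ps)):
        # Independence: the joint probability is the product of the marginals.
        prob = math.prod(p if x == 1 else 1 - p for p, x in zip(ps, outcome))
        total += prob * math.exp(s * sum(outcome))
    return total

def product_of_mgfs(ps: list[float], s: float) -> float:
    """Product of the individual MGFs, each equal to 1 - p + p * e^s."""
    return math.prod(1 - p + p * math.exp(s) for p in ps)

ps = [0.1, 0.4, 0.5, 0.9]  # assumed example probabilities
print(mgf_of_sum_by_enumeration(ps, 0.7), product_of_mgfs(ps, 0.7))
```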

Lemma 2. Let $Y$ be a random variable with $\Pr(Y=1)=p$ and $\Pr(Y=0) = 1-p$. Then for any $s\in\mathbb{R}$,
$$M_Y(s)= E(e^{sY})\le e^{p(e^s - 1)}$$
Proof:
$$M_Y(s) = E(e^{sY}) = p\cdot e^s + (1-p)\cdot 1 = 1 + p(e^s - 1)$$
Since $1+y\le e^y$ for all real $y$, setting $y = p(e^s -1)$ gives $M_Y(s)\le e^{p(e^s - 1)}$.
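The inequality in Lemma 2 is easy to spot-check on a grid. This is a minimal sketch; the grid of $p$ and $s$ values and the small floating-point tolerance are assumptions chosen for illustration.

```python
import math

def bernoulli_mgf(p: float, s: float) -> float:
    """Exact MGF of a 0/1 variable: 1 + p * (e^s - 1)."""
    return 1 + p * (math.exp(s) - 1)

def lemma2_bound(p: float, s: float) -> float:
    """Upper bound from Lemma 2: exp(p * (e^s - 1))."""
    return math.exp(p * (math.exp(s) - 1))

def lemma2_holds_on_grid(tol: float = 1e-12) -> bool:
    """Check MGF <= bound for p in {0, 0.1, ..., 1} and s in {-3, -2.5, ..., 3}."""
    for i in range(11):
        p = i / 10
        for j in range(-6, 7):
            s = j / 2
            if bernoulli_mgf(p, s) > lemma2_bound(p, s) + tol:
                return False
    return True

print(lemma2_holds_on_grid())
```

Note that the grid includes negative $s$, which matters because the lower-tail argument evaluates the moment generating function at a negative argument.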

Chernoff bound. Let $X=\sum_{i=1}^n X_i$, where $X_1, \cdots, X_n$ are mutually independent with $\Pr(X_i = 1) = p_i$ and $\Pr(X_i = 0) = 1-p_i$. Let $\mu = E(X) = \sum_{i=1}^np_i$. Then:

  1. Upper tail: $\forall\delta>0, \Pr(X\ge (1+\delta)\mu) \le e^{-\frac{\delta^2}{2 + \delta}\mu}$
  2. Lower tail: $\forall 0<\delta<1, \Pr(X\le(1 - \delta)\mu)\le e^{-\frac{\delta^2}{2}\mu}$
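To get a feel for how loose or tight these two bounds are, one can compare them with exact tail probabilities. The sketch below is an illustration under assumed inputs: the probabilities `ps` and the value of `delta` are made up, and the exact distribution of $X$ is computed with a simple dynamic program over the $X_i$.

```python
import math

def sum_of_bernoullis_pmf(ps: list[float]) -> list[float]:
    """pmf[k] = Pr(X = k) for X = sum of independent 0/1 variables with Pr(X_i = 1) = ps[i]."""
    pmf = [1.0]  # distribution of the empty sum
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            nxt[k] += q * (1 - p)   # X_i = 0
            nxt[k + 1] += q * p     # X_i = 1
        pmf = nxt
    return pmf

def chernoff_upper(mu: float, delta: float) -> float:
    """Upper-tail bound exp(-delta^2 / (2 + delta) * mu), for delta > 0."""
    return math.exp(-(delta**2) / (2 + delta) * mu)

def chernoff_lower(mu: float, delta: float) -> float:
    """Lower-tail bound exp(-delta^2 / 2 * mu), for 0 < delta < 1."""
    return math.exp(-(delta**2) / 2 * mu)

# Assumed example: 40 variables with heterogeneous probabilities.
ps = [0.1, 0.2, 0.3, 0.4, 0.5] * 8
mu = sum(ps)
delta = 0.5
pmf = sum_of_bernoullis_pmf(ps)
upper_exact = sum(q for k, q in enumerate(pmf) if k >= (1 + delta) * mu)
lower_exact = sum(q for k, q in enumerate(pmf) if k <= (1 - delta) * mu)
print(f"upper: exact={upper_exact:.5f}, bound={chernoff_upper(mu, delta):.5f}")
print(f"lower: exact={lower_exact:.5f}, bound={chernoff_lower(mu, delta):.5f}")
```

The bounds only need $\mu$ and $\delta$, not the individual $p_i$, which is what makes them convenient even though they are usually not tight.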

Proof: By Lemma 1 and Lemma 2, for any $s\in\mathbb{R}$,
$$M_X(s) = \prod_{i = 1}^nM_{X_i}(s)\le \prod_{i=1}^ne^{p_i(e^s-1)}=e^{(e^s -1)\sum_{i=1}^np_i} = e^{(e^s - 1)\mu}$$
We first prove the upper tail.
By Formula 1, $\Pr(X\ge a) \le \frac{E(e^{sX})}{e^{sa}}$ for any $s>0$. Let $a=(1+\delta)\mu$ and $s = \ln(1+\delta)$, which is positive because $\delta>0$. Then
$$\Pr(X\ge (1+\delta)\mu)\le \frac{E(e^{sX})}{e^{sa}} = \frac{M_X(s)}{e^{s(1+\delta)\mu}}$$
Since $M_X(s)\le e^{(e^s-1)\mu}$ by Lemma 1 and Lemma 2, and $e^s = 1+\delta$, we get
$$\Pr(X\ge (1+\delta)\mu) \le \frac{e^{(e^s - 1)\mu}}{e^{s(1+\delta)\mu}} = \left(\frac{e^{(e^s -1)}}{e^{s(1+\delta)}}\right)^\mu=\left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu$$
Taking logarithms,
$$\ln\left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu = \mu(\delta - (1+\delta)\ln(1+\delta))$$
Because $\forall x>0, \ln(1+x)\ge\frac{x}{1 + x/2}$, we have
$$\mu(\delta - (1+\delta)\ln(1+\delta))\le \mu\left(\delta - \frac{2\delta(1+\delta)}{2+\delta}\right) = -\frac{\delta^2}{2+\delta}\mu$$
Therefore
$$\Pr(X\ge(1+\delta)\mu)\le \left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu\le e^{-\frac{\delta^2}{2+\delta}\mu}$$
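The last simplification in the upper-tail proof replaces the exact exponent $\delta - (1+\delta)\ln(1+\delta)$ with the weaker but simpler $-\frac{\delta^2}{2+\delta}$. The following sketch checks that relation numerically for a few values of $\delta$; the sample values are arbitrary assumptions.

```python
import math

def exact_exponent(delta: float) -> float:
    """delta - (1 + delta) * ln(1 + delta): log of (e^delta / (1+delta)^(1+delta))."""
    return delta - (1 + delta) * math.log1p(delta)

def simplified_exponent(delta: float) -> float:
    """-delta^2 / (2 + delta): the exponent (per unit mu) in the stated upper-tail bound."""
    return -(delta**2) / (2 + delta)

for delta in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(
        f"delta={delta}: exact={exact_exponent(delta):.6f}, "
        f"simplified={simplified_exponent(delta):.6f}"
    )
```

Both exponents are negative, and the exact one is always the smaller of the two, so the simplified form gives a valid but weaker bound.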

The proof of the lower tail is similar, except that we apply Formula 2 with $s=-\ln(1 - \delta) > 0$ (equivalently, we evaluate $M_X$ at $\ln(1-\delta)$) and use the following inequality:
$$\forall 0 < \delta < 1, (1-\delta)\ln(1-\delta)\ge -\delta + \frac{\delta^2}{2}$$
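For completeness, the lower-tail argument can be written out step by step. This is a sketch that assumes Formula 2 is applied with $s = -\ln(1-\delta) > 0$, so that $e^{-s} = 1-\delta$, that the bound $M_X(t)\le e^{(e^t-1)\mu}$ is used at $t=-s$, and that $(1-\delta)\ln(1-\delta)\ge -\delta+\frac{\delta^2}{2}$ holds for $0<\delta<1$.

```latex
\begin{align*}
\Pr\bigl(X \le (1-\delta)\mu\bigr)
  &\le \frac{E(e^{-sX})}{e^{-s(1-\delta)\mu}}
   = \frac{M_X(-s)}{e^{-s(1-\delta)\mu}}
   \le \frac{e^{(e^{-s}-1)\mu}}{e^{-s(1-\delta)\mu}} \\
  &= \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^{\mu}
   = \exp\Bigl(\mu\bigl(-\delta - (1-\delta)\ln(1-\delta)\bigr)\Bigr) \\
  &\le \exp\Bigl(\mu\bigl(-\delta + \delta - \tfrac{\delta^2}{2}\bigr)\Bigr)
   = e^{-\frac{\delta^2}{2}\mu}.
\end{align*}
```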

Chernoff bound for non-Bernoulli variables: Let $X_1, \cdots, X_n$ be independent random variables with $a \le X_i \le b$ for $i = 1, \cdots, n$. Let $X=\sum_{i=1}^nX_i$ and $\mu=E(X)$. Then for any $\delta > 0$:

  1. Upper tail: $\Pr(X\ge (1 + \delta)\mu)\le e^{-\frac{2\delta^2\mu^2}{n(b-a)^2}}$
  2. Lower tail: $\Pr(X\le (1-\delta)\mu)\le e^{-\frac{\delta^2\mu^2}{n(b-a)^2}}$
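The bounded-variable version can also be compared with exact tail probabilities. The sketch below assumes a concrete example that is not in the original text: $n = 30$ independent fair six-sided dice, so $a = 1$, $b = 6$ and $\mu = 105$. The exact distribution of the sum is obtained by repeated convolution, and the helper names are hypothetical.

```python
import math

def pmf_of_iid_sum(values: list[int], probs: list[float], n: int) -> dict[int, float]:
    """Exact pmf (total -> probability) of the sum of n i.i.d. discrete variables."""
    pmf = {0: 1.0}
    for _ in range(n):
        nxt: dict[int, float] = {}
        for total, q in pmf.items():
            for v, p in zip(values, probs):
                nxt[total + v] = nxt.get(total + v, 0.0) + q * p
        pmf = nxt
    return pmf

def bounded_upper(mu: float, delta: float, n: int, a: float, b: float) -> float:
    """Upper-tail bound exp(-2 * delta^2 * mu^2 / (n * (b - a)^2))."""
    return math.exp(-2 * delta**2 * mu**2 / (n * (b - a) ** 2))

def bounded_lower(mu: float, delta: float, n: int, a: float, b: float) -> float:
    """Lower-tail bound exp(-delta^2 * mu^2 / (n * (b - a)^2)), as stated above."""
    return math.exp(-(delta**2) * mu**2 / (n * (b - a) ** 2))

# Assumed example: 30 fair dice.
n, a, b = 30, 1, 6
faces = list(range(a, b + 1))
dice_pmf = pmf_of_iid_sum(faces, [1 / 6] * 6, n)
mu = n * 3.5
delta = 0.2
upper_exact = sum(q for t, q in dice_pmf.items() if t >= (1 + delta) * mu)
lower_exact = sum(q for t, q in dice_pmf.items() if t <= (1 - delta) * mu)
print(f"upper: exact={upper_exact:.5f}, bound={bounded_upper(mu, delta, n, a, b):.5f}")
print(f"lower: exact={lower_exact:.5f}, bound={bounded_lower(mu, delta, n, a, b):.5f}")
```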
