I. Fill-in-the-Blank Questions
Solution: $\frac{k(k-1)}{n(n-1)}$.

$$
\begin{aligned}
P\left( X_1=1,X_2=1\left| \sum_{i=1}^n{X_i}=k \right. \right) &=\frac{P\left( X_1=1,X_2=1,\sum_{i=3}^n{X_i}=k-2 \right)}{C_{n}^{k}\left( \frac{1}{2} \right) ^n}\\
&=\frac{\frac{1}{2}\cdot \frac{1}{2}\cdot C_{n-2}^{k-2}\left( \frac{1}{2} \right) ^{n-2}}{C_{n}^{k}\left( \frac{1}{2} \right) ^n}=\frac{k\left( k-1 \right)}{n\left( n-1 \right)}.
\end{aligned}
$$
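The identity can be sanity-checked by brute force. The sketch below is only an illustration, assuming the $X_i$ are i.i.d. fair Bernoulli variables (as the $\left(\frac{1}{2}\right)^n$ factors suggest); the helper name `conditional_prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def conditional_prob(n, k):
    """P(X1=1, X2=1 | sum = k), by enumerating all equally likely 0/1 sequences."""
    hits = total = 0
    for seq in product((0, 1), repeat=n):
        if sum(seq) == k:
            total += 1
            if seq[0] == 1 and seq[1] == 1:
                hits += 1
    return Fraction(hits, total)

n, k = 6, 3
brute = conditional_prob(n, k)                 # enumeration
formula = Fraction(k * (k - 1), n * (n - 1))   # closed form k(k-1)/(n(n-1))
print(brute, formula)
```

Because every sequence has the same probability $\left(\frac{1}{2}\right)^n$, the conditional probability reduces to a ratio of counts, which is why exact fractions can be used.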
Solution: $\frac{1}{3}\left( 1-\left( -\frac{1}{2} \right) ^n \right)$.
Use a state-based approach. Let $a_n,b_n,c_n$ be the probabilities that it is at each of the three points after the $n$-th move, so $a_0=1,b_0=0,c_0=0$ and $a_1=0,b_1=\frac{1}{2},c_1=\frac{1}{2}$. The required probability is then $p_n = \frac{1-a_{n-1}}{2}$: the probability that it is not at $A$ after the previous move, split equally between the two points it can move to.
By the law of total probability,
$$
\begin{cases}
a_{n+1}=0\cdot a_n+\frac{1}{2}\cdot b_n+\frac{1}{2}\cdot c_n,\\
b_{n+1}=\frac{1}{2}\cdot a_n+0\cdot b_n+\frac{1}{2}\cdot c_n,\\
c_{n+1}=\frac{1}{2}\cdot a_n+\frac{1}{2}\cdot b_n+0\cdot c_n,\\
\end{cases}\quad \Rightarrow \quad \left( \begin{array}{c} a_{n+1}\\ b_{n+1}\\ c_{n+1}\\ \end{array} \right) =\left( \begin{matrix} 0& \frac{1}{2}& \frac{1}{2}\\ \frac{1}{2}& 0& \frac{1}{2}\\ \frac{1}{2}& \frac{1}{2}& 0\\ \end{matrix} \right) \left( \begin{array}{c} a_n\\ b_n\\ c_n\\ \end{array} \right) ,
$$
One could compute with the $n$-th power of this matrix; this is the Markov chain approach. By symmetry, however, $b_n=c_n$, so $b_n=c_n=\frac{1-a_n}{2}$ and only the sequence $a_n$ remains. Inverting the definition of $p_n$ gives $a_n=1-2p_{n+1}$ and $b_n=c_n=p_{n+1}$; substituting into the first total-probability equation yields
$$
1-2p_{n+2}=p_{n+1}\quad \Rightarrow \quad p_{n+2}-\frac{1}{3}=-\frac{1}{2}\left( p_{n+1}-\frac{1}{3} \right) ,
$$
Substituting $p_0=0$ and solving gives

$$
p_n=\frac{1}{3}+\left( -\frac{1}{2} \right) ^n\left( 0-\frac{1}{3} \right) =\frac{1}{3}\left( 1-\left( -\frac{1}{2} \right) ^n \right) .
$$
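The matrix route mentioned above can be checked numerically. The sketch below iterates the transition step from the start $(a_0,b_0,c_0)=(1,0,0)$ and compares with the closed form, under the assumption that the closed form is read as the probability of sitting at $B$ (equivalently $C$) after $n$ moves; by the recursion $a_{n+1}=b_n$ this is also the probability of sitting at $A$ after $n+1$ moves. The function names are illustrative only.

```python
def step(dist):
    """One move of the chain: from each point, go to either other point w.p. 1/2."""
    a, b, c = dist
    return ((b + c) / 2, (a + c) / 2, (a + b) / 2)

def closed_form(n):
    """(1/3) * (1 - (-1/2)^n)."""
    return (1 - (-0.5) ** n) / 3

dist = (1.0, 0.0, 0.0)   # start at A
max_gap = 0.0
for n in range(21):
    # dist[1] is b_n, the probability of being at B after n moves
    max_gap = max(max_gap, abs(dist[1] - closed_form(n)))
    dist = step(dist)
print(max_gap)
```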
Solution: $0.8$.

By the law of total probability,

$$
P\left( \mathrm{out}:ACBA \right) =\frac{1}{3}\cdot \left( 0.8^2\cdot 0.1^2 \right) +\frac{1}{3}\cdot \left( 0.1^3\cdot 0.8 \right) +\frac{1}{3}\cdot \left( 0.1^3\cdot 0.8 \right) =0.00266667,
$$

and by Bayes' formula,

$$
P\left( \mathrm{in}:AAAA \mid \mathrm{out}:ACBA \right) =\frac{\frac{1}{3}\cdot \left( 0.8^2\cdot 0.1^2 \right)}{P\left( \mathrm{out}:ACBA \right)}=0.8.
$$
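The same arithmetic as a short script. The channel model here is inferred from the formula and is an assumption: the inputs `AAAA`, `BBBB`, `CCCC` are equally likely, each letter is received correctly with probability 0.8, and it is received as each of the other two letters with probability 0.1, independently across positions.

```python
def likelihood(sent, received, p_ok=0.8, p_err=0.1):
    """P(received | sent) under independent per-letter errors (assumed model)."""
    prob = 1.0
    for s, r in zip(sent, received):
        prob *= p_ok if s == r else p_err
    return prob

priors = {"AAAA": 1 / 3, "BBBB": 1 / 3, "CCCC": 1 / 3}
received = "ACBA"

# Joint probability of each input together with the observed output.
joint = {word: prior * likelihood(word, received) for word, prior in priors.items()}
evidence = sum(joint.values())          # law of total probability
posterior = joint["AAAA"] / evidence    # Bayes' formula
print(evidence, posterior)
```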
Solution: Yes.
The $p$-value depends on the observed sample, so it is a statistic.
Solution: 2.
(1) and (3) are clearly false; (2) and (4) are correct.
Solution: A.
Let side $BC=a$ of the triangle, and take $B$ as the origin and $BC$ as the $x$-axis; then $Q\sim U(0,a)$ and $P\sim U(\Delta ABC)$. Fix $Q=(q,0)$ and join $AQ$. The condition in the problem holds only when $P$ falls inside $\Delta ABQ$, so

$$
\mathrm{Pr}\left( P\in \Delta ABQ \mid Q=q \right) =\frac{S_{\Delta ABQ}}{S_{\Delta ABC}}=\frac{q}{a},
$$

Now let $q$ vary:

$$
\mathrm{Pr}\left( PQ\cap AB \right) =\int_0^a{\frac{q}{a}\cdot \frac{1}{a}}dq=\frac{1}{2}.
$$
Solution: B.
Note that in C and D the numerator and denominator are not independent.
Solution: $\frac{\lambda n}{\lambda+\mu}$.

$$
P\left( X=k \mid X+Y=n \right) =\frac{P\left( X=k,Y=n-k \right)}{P\left( X+Y=n \right)}=\frac{\frac{\lambda ^k}{k!}e^{-\lambda}\frac{\mu ^{n-k}}{\left( n-k \right) !}e^{-\mu}}{\frac{\left( \lambda +\mu \right) ^n}{n!}e^{-\left( \lambda +\mu \right)}}=C_{n}^{k}\left( \frac{\lambda}{\lambda +\mu} \right) ^k\left( \frac{\mu}{\lambda +\mu} \right) ^{n-k},
$$

Hence, given $X+Y=n$, the conditional distribution of $X$ is $B(n,\frac{\lambda}{\lambda +\mu})$, so the expectation is $\frac{\lambda n}{\lambda+\mu}$.
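A quick numerical check of this Poisson-to-binomial identity, assuming independent $X\sim \mathrm{Poisson}(\lambda)$ and $Y\sim \mathrm{Poisson}(\mu)$ as in the formula above. The parameter values are arbitrary illustrations.

```python
import math

def pois(k, rate):
    """Poisson pmf."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

lam, mu, n = 2.0, 3.0, 5

# Conditional pmf of X given X + Y = n, built directly from Poisson pmfs.
cond = [pois(k, lam) * pois(n - k, mu) / pois(n, lam + mu) for k in range(n + 1)]

# Binomial(n, lam / (lam + mu)) pmf for comparison.
p = lam / (lam + mu)
binom = [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

max_gap = max(abs(c - b) for c, b in zip(cond, binom))
cond_mean = sum(k * c for k, c in enumerate(cond))
print(max_gap, cond_mean, lam * n / (lam + mu))
```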
A CLT question; I have forgotten the details, but it was fairly simple.
Forgotten.
II. Calculation and Analysis Questions
Solution: (1) By independence,

$$
f(x,y)=\frac{1}{2}e^{-\frac{1}{2}x},\quad x>0,\quad 0<y<1.
$$
(2) Apply the change of variables

$$
\begin{cases} Z=X+Y,\\ W=Y,\\ \end{cases}\Rightarrow \begin{cases} z=x+y,\\ w=y,\\ \end{cases}\Rightarrow \begin{cases} x=z-w,\\ y=w,\\ \end{cases}\Rightarrow J=\left| \begin{matrix} 1& -1\\ 0& 1\\ \end{matrix} \right|=1,
$$

so that

$$
f_{Z,W}\left( z,w \right) =f\left( z-w,w \right) =\frac{1}{2}e^{-\frac{z}{2}+\frac{w}{2}},\quad z>w,\quad 0<w<1.
$$

Integrating out $W$ gives

$$
f_Z\left( z \right) =\frac{1}{2}e^{-\frac{z}{2}}\int_0^{\min \left\{ z,1 \right\}}{e^{\frac{w}{2}}dw}=e^{-\frac{z}{2}}\left( e^{\frac{\min \left\{ z,1 \right\}}{2}}-1 \right) =\begin{cases} e^{-\frac{z}{2}}\left( e^{\frac{1}{2}}-1 \right) ,& z>1,\\ 1-e^{-\frac{z}{2}},& 0<z\le 1.\\ \end{cases}
$$
(3) The discriminant is $\Delta = 4X^2 -4Y$, so the required probability is $P(X^2\ge Y)$:

$$
P\left( X^2\ge Y \right) =\int_0^1{P\left( X\ge \sqrt{y} \right) f_Y\left( y \right) dy}=\int_0^1{e^{-\frac{\sqrt{y}}{2}}dy}=8\left( 1-\frac{3}{2}e^{-\frac{1}{2}} \right) =0.721632.
$$
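The closed-form value of the integral can be confirmed with a simple midpoint rule. This is a minimal numerical sketch; the grid size `N` is an arbitrary choice.

```python
import math

def integrand(y):
    """P(X >= sqrt(y)) for X ~ Exp(rate 1/2), times f_Y(y) = 1 on (0, 1)."""
    return math.exp(-math.sqrt(y) / 2)

N = 200_000
h = 1.0 / N
# Midpoint rule on (0, 1).
approx = h * sum(integrand((i + 0.5) * h) for i in range(N))
exact = 8 * (1 - 1.5 * math.exp(-0.5))
print(approx, exact)
```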
Solution: (1) First find the population mean $E(X)$. Using $\frac{X}{a}\sim Beta(2,1)$, or integrating directly, gives $E(X)=\frac{2}{3}a$, so by the substitution principle (method of moments) $\hat{a}_1=\frac{3}{2}\bar{x}$.
Next, write down the likelihood function:

$$
L\left( a \right) =\frac{2^n\prod_{i=1}^n{x_i}}{a^{2n}},\quad a>\max \left\{ x_{\left( n \right)},1 \right\} ,
$$

The likelihood is decreasing in $a$, so

$$
\hat{a}_2=\max \{x_{(n)},1\}=\begin{cases} 1,& x_{\left( n \right)}<1,\\ x_{\left( n \right)},& x_{\left( n \right)}\ge 1.\\ \end{cases}
$$
(2) Since $E(\bar{x})=\frac{2}{3}a$, the estimator $\hat{a}_1$ is clearly unbiased.
For $\hat{a}_2$, first find the distribution of $x_{(n)}$:

$$
P\left( x_{\left( n \right)}\le t \right) =P^n\left( X\le t \right) =\left( \frac{t}{a} \right) ^{2n},\quad f_n\left( t \right) =\frac{2nt^{2n-1}}{a^{2n}},\quad 0<t<a,
$$

which is just $\frac{x_{(n)}}{a}\sim Be(2n,1)$. Therefore

$$
E\left( \hat{a}_2 \right) =\int_0^1{f_n\left( t \right) dt}+\int_1^a{tf_n\left( t \right) dt}=\frac{1}{a^{2n}}+\frac{2n}{2n+1}\left( a-\frac{1}{a^{2n}} \right) =\frac{2n}{2n+1}\cdot a+\frac{1}{2n+1}\cdot \frac{1}{a^{2n}},
$$

Since $a>1$, we have $\frac{1}{a^{2n}}<1$, and hence

$$
\frac{2n}{2n+1}\cdot a+\frac{1}{2n+1}\cdot \frac{1}{a^{2n}}<\frac{2n}{2n+1}\cdot a+\frac{1}{2n+1}<a.
$$
So $\hat{a}_2$ is biased. Multiplying it by a constant that does not involve $a$ cannot make it unbiased, but if in the expectation computation we instead write

$$
\int_0^1{f_n\left( t \right) dt}+\frac{2n+1}{2n}\int_1^a{tf_n\left( t \right) dt}=\frac{1}{a^{2n}}+\left( a-\frac{1}{a^{2n}} \right) =a,
$$

we obtain exactly an unbiased estimator. The corresponding estimator is

$$
\tilde{a}_2=\begin{cases} 1,& x_{\left( n \right)}<1,\\ \frac{2n+1}{2n}x_{\left( n \right)},& x_{\left( n \right)}\ge 1.\\ \end{cases}
$$
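The two expectations can be checked by simulation. The sketch below assumes the density $f(x)=\frac{2x}{a^2}$ on $(0,a)$ implied by the likelihood, and draws the sample maximum directly from its CDF $\left(\frac{t}{a}\right)^{2n}$ by inversion; the values of `a`, `n`, the seed and the replication count are arbitrary illustrations.

```python
import random

def simulate(a, n, reps, seed=0):
    """Monte Carlo means of the MLE max{x_(n), 1} and of the corrected estimator."""
    rng = random.Random(seed)
    sum_hat = sum_tilde = 0.0
    for _ in range(reps):
        # x_(n) has CDF (t/a)^(2n), so x_(n) = a * U^(1/(2n)) with U uniform on [0, 1).
        x_max = a * rng.random() ** (1.0 / (2 * n))
        if x_max < 1.0:
            hat = tilde = 1.0
        else:
            hat = x_max
            tilde = (2 * n + 1) / (2 * n) * x_max
        sum_hat += hat
        sum_tilde += tilde
    return sum_hat / reps, sum_tilde / reps

a, n = 2.0, 5
mean_hat, mean_tilde = simulate(a, n, reps=200_000)
# Theoretical mean of the MLE from the formula above.
theory_hat = 2 * n / (2 * n + 1) * a + 1 / ((2 * n + 1) * a ** (2 * n))
print(mean_hat, theory_hat, mean_tilde, a)
```

The simulated mean of $\hat{a}_2$ should sit below $a$, while that of $\tilde{a}_2$ should be close to $a$, up to Monte Carlo error.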
(3) Let $T_n = n(a-\hat{a}_2)$. Then

$$
P\left( T_n\le t \right) =P\left( n\left( a-\hat{a}_2 \right) \le t \right) =P\left( a-\hat{a}_2\le \frac{t}{n} \right) =P\left( \hat{a}_2\ge a-\frac{t}{n} \right) ,
$$

For any $t>0$, $a-\frac{t}{n}>1$ once $n$ is large enough, and therefore

$$
P\left( \hat{a}_2\ge a-\frac{t}{n} \right) =P\left( x_{\left( n \right)}\ge a-\frac{t}{n} \right) =1-\left( 1-\frac{t}{an} \right) ^{2n}\rightarrow 1-e^{-\frac{2t}{a}},\quad t>0,
$$

This shows that $n(a-\hat{a}_2)\xrightarrow{d}Exp(\frac{2}{a})$.
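The convergence of the exact distribution function to the exponential limit can be seen numerically. A minimal sketch with arbitrary illustrative values of $a$ and $t$ (chosen so that $a-\frac{t}{n}>1$ for every $n$ used); `cdf_T` is a hypothetical helper name.

```python
import math

def cdf_T(t, a, n):
    """Exact P(T_n <= t) = 1 - (1 - t/(a n))^(2n), valid when a - t/n > 1."""
    return 1.0 - (1.0 - t / (a * n)) ** (2 * n)

a, t = 2.0, 1.0
limit = 1.0 - math.exp(-2.0 * t / a)   # CDF of Exp(rate 2/a) at t

# Absolute gap between the finite-n CDF and the limit.
gaps = {n: abs(cdf_T(t, a, n) - limit) for n in (10, 100, 1000, 10 ** 6)}
for n, gap in gaps.items():
    print(n, gap)
```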
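Solution: (1) In regression analysis, when two or more explanatory variables are correlated with one another, this correlation among the regressors is called multicollinearity. Multicollinearity violates a basic assumption of the linear regression model: exact linear dependence among the variables makes the matrix $X^TX$ rank-deficient, so the least-squares estimate is no longer unique.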
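(2) The variance inflation factor (VIF) can be used to detect collinearity. It is computed as

$$
VIF_j = \frac{1}{1-R_j^2},
$$

where $R_j^2$ is the coefficient of determination from regressing the $j$-th explanatory variable on the others. As a rule of thumb, $VIF_j > 10$ is taken to indicate multicollinearity, and the corresponding feature should be removed.
We can also examine the eigenvalues of $X^TX$: if the smallest eigenvalue is very close to 0, we again conclude that multicollinearity is present.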
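With only two explanatory variables, $R_j^2$ is the squared sample correlation between them, so the VIF reduces to $\frac{1}{1-r^2}$. The sketch below covers just that special case; the data are made up for illustration and `vif_two_regressors` is a hypothetical helper, not a library function.

```python
def vif_two_regressors(x1, x2):
    """VIF when there are exactly two regressors: 1 / (1 - r^2)."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    r2 = s12 * s12 / (s11 * s22)   # squared sample correlation
    return 1.0 / (1.0 - r2)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]    # nearly 2 * x1: strongly collinear
x3 = [1.0, -1.0, 0.0, -1.0, 1.0]   # uncorrelated with x1

vif_collinear = vif_two_regressors(x1, x2)
vif_orthogonal = vif_two_regressors(x1, x3)
print(vif_collinear, vif_orthogonal)
```

For more than two regressors each $R_j^2$ requires a full auxiliary regression, which this sketch deliberately leaves out.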
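(3) Stepwise regression can be used to screen out the variables that cause multicollinearity. The procedure is as follows: first regress the response on each candidate explanatory variable separately; then take the regression on the variable that contributes most to the response as the base model, and introduce the remaining explanatory variables one at a time. After stepwise regression, the explanatory variables kept in the model are both important and free of severe multicollinearity.
(4) When adding an explanatory variable to the model, aim to reduce the residual sum of squares or, equivalently, increase the coefficient of determination. If SSE drops substantially after a variable is introduced, the variable has a large effect on the response $y$ and can be included; otherwise its effect on $y$ is small and it should not be included. One can also decide using the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the log-likelihood (LLH), and similar criteria.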
Solution: (1) Each $A_i$ follows a two-point (Bernoulli) distribution with parameter

$$
p_i(\lambda) = P(a_i<x_i<b_i)=e^{-\lambda a_i}-e^{-\lambda b_i},
$$

so

$$
L_A\left( \lambda \right) =\prod_{i=1}^n{p_{i}^{A_i}\left( 1-p_i \right) ^{1-A_i}}=\prod_{i=1}^n{p_{i}^{A_i}}\cdot \prod_{i=1}^n{\left( 1-p_i \right) ^{1-A_i}},
$$

and hence

$$
\ell _A\left( \lambda \right) =\sum_{i=1}^n{A_i\ln p_i}+\sum_{i=1}^n{\left( 1-A_i \right) \ln \left( 1-p_i \right)}.
$$

The log-likelihood for the full sample is the logarithm of the joint exponential density, namely

$$
\ell _X\left( \lambda \right) =n\ln \lambda -\lambda \sum_{i=1}^n{x_i}.
$$
(2) Let $q_i(\lambda) = a_ie^{-\lambda a_i} - b_ie^{-\lambda b_i}$, which is just $q_i = -\frac{\partial p_i}{\partial \lambda}$. Differentiating gives

$$
\begin{aligned}
\frac{\partial \ell _A}{\partial \lambda}&=\sum_{i=1}^n{A_i\frac{-q_i}{p_i}}+\sum_{i=1}^n{\left( 1-A_i \right) \frac{q_i}{1-p_i}}\\
&=-\sum_{i=1}^n{q_i\left( \frac{A_i}{p_i}-\frac{1-A_i}{1-p_i} \right)}=-\sum_{i=1}^n{q_i\frac{A_i-p_i}{p_i\left( 1-p_i \right)}},
\end{aligned}
$$

so the MLE $\hat{\lambda}$ satisfies

$$
\sum_{i=1}^n{q_i\left( \hat{\lambda} \right) \frac{A_i-p_i\left( \hat{\lambda} \right)}{p_i\left( \hat{\lambda} \right) \left( 1-p_i\left( \hat{\lambda} \right) \right)}}=0.
$$
(3) First compute $E[x_i\mid A_i]$:

$$
E\left[ x_i\mid A_i=1 \right] =\frac{E\left[ x_iI_{\left\{ a_i<x_i<b_i \right\}} \right]}{P\left( a_i<x_i<b_i \right)}=\frac{1}{\lambda}+\frac{q_i}{p_i},
$$

where the numerator uses

$$
\int_{a_i}^{b_i}{\lambda xe^{-\lambda x}dx}=\frac{1}{\lambda}\int_{\lambda a_i}^{\lambda b_i}{ue^{-u}du}=\frac{1}{\lambda}\left[ \left( \lambda a_i+1 \right) e^{-\lambda a_i}-\left( \lambda b_i+1 \right) e^{-\lambda b_i} \right] =q_i+\frac{p_i}{\lambda}.
$$

Similarly, using $E[x_iI_{\{x_i\notin (a_i,b_i)\}}]=E[x_i]-E[x_iI_{\{x_i\in(a_i,b_i)\}}]$, we get

$$
E\left[ x_i\mid A_i=0 \right] =\frac{\frac{1-p_i}{\lambda}-q_i}{1-p_i}=\frac{1}{\lambda}-\frac{q_i}{1-p_i}.
$$
The E-step is therefore:

$$
\begin{aligned}
Q\left( \lambda |\lambda _k \right) &=E\left[ \ell _X\left( \lambda \right) |A,\lambda _k \right]\\
&=n\ln \lambda -\lambda \sum_{i=1}^n{E\left[ x_i\mid A_i,\lambda _k \right]}\\
&=n\ln \lambda -\lambda \sum_{i=1}^n{\left( \frac{1}{\lambda _k}+q_i\left( \lambda _k \right) \left( \frac{A_i}{p_i\left( \lambda _k \right)}-\frac{1-A_i}{1-p_i\left( \lambda _k \right)} \right) \right)}\\
&=n\ln \lambda -\frac{n\lambda}{\lambda _k}-\lambda \sum_{i=1}^n{q_i\left( \lambda _k \right)}\frac{A_i-p_i\left( \lambda _k \right)}{p_i\left( \lambda _k \right) \left( 1-p_i\left( \lambda _k \right) \right)}.
\end{aligned}
$$
Now the M-step: maximize $Q(\lambda|\lambda_k)$ over $\lambda$ (note that $\lambda_k$ is a constant and only $\lambda$ varies). Differentiating gives

$$
\frac{\partial Q}{\partial \lambda}=\frac{n}{\lambda}-\frac{n}{\lambda _k}-\sum_{i=1}^n{q_i\left( \lambda _k \right)}\frac{A_i-p_i\left( \lambda _k \right)}{p_i\left( \lambda _k \right) \left( 1-p_i\left( \lambda _k \right) \right)},
$$

so the maximizer satisfies

$$
\frac{1}{\lambda}=\frac{1}{\lambda _k}+\frac{1}{n}\sum_{i=1}^n{q_i\left( \lambda _k \right)}\frac{A_i-p_i\left( \lambda _k \right)}{p_i\left( \lambda _k \right) \left( 1-p_i\left( \lambda _k \right) \right)},
$$

and hence

$$
\lambda _{k+1}=\frac{1}{\frac{1}{\lambda _k}+\frac{1}{n}\sum_{i=1}^n{q_i\left( \lambda _k \right)}\frac{A_i-p_i\left( \lambda _k \right)}{p_i\left( \lambda _k \right) \left( 1-p_i\left( \lambda _k \right) \right)}}.
$$
(4) The sequence satisfies

$$
\frac{1}{\lambda _{k+1}}= \frac{1}{\lambda_k} + \frac{1}{n}\sum_{i=1}^n{q_i\left( \lambda _k \right)}\frac{A_i-p_i\left( \lambda _k \right)}{p_i\left( \lambda _k \right) \left( 1-p_i\left( \lambda _k \right) \right)},
$$

Let $D(\lambda_k) =- \sum_{i=1}^n{q_i\left( \lambda _k \right)}\frac{A_i-p_i\left( \lambda _k \right)}{p_i\left( \lambda _k \right) \left( 1-p_i\left( \lambda _k \right) \right)}$, which is exactly the derivative of $\ell_A$ at $\lambda_k$. Then

$$
\frac{1}{\lambda_{k+1}} = \frac{1}{\lambda_k} - \frac{1}{n} \cdot D(\lambda_k),
$$

When $D(\lambda_k)>0$ we get $\frac{1}{\lambda_{k+1}}<\frac{1}{\lambda_k}$, i.e. $\lambda_{k+1}>\lambda_k$, and the reverse holds when $D(\lambda_k)<0$. The iteration therefore moves $\lambda_k$ in the direction of the derivative, which guarantees that $\ell_A$ increases. Hence $\{\lambda_k\}$ converges to a stationary point of $\ell_A$, i.e. a point where the derivative is 0, namely $\hat{\lambda}$.
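The update $\frac{1}{\lambda_{k+1}}=\frac{1}{\lambda_k}-\frac{1}{n}D(\lambda_k)$ is easy to run. The following is a minimal sketch rather than a definitive implementation: the intervals, indicators, starting value and iteration count are made-up illustrations, and the function names are hypothetical. In the second data set every interval is $(0,1)$ and half of the indicators equal 1, so the MLE solves $1-e^{-\lambda}=\frac{1}{2}$, i.e. $\hat{\lambda}=\ln 2$, which gives a known target for the iteration.

```python
import math

def p_q(lam, a, b):
    """p = P(a < x < b) and q = -dp/dlam for x ~ Exp(rate lam)."""
    ea, eb = math.exp(-lam * a), math.exp(-lam * b)
    return ea - eb, a * ea - b * eb

def score(lam, intervals, indicators):
    """D(lam): derivative of the observed-data log-likelihood l_A."""
    total = 0.0
    for (a, b), A in zip(intervals, indicators):
        p, q = p_q(lam, a, b)
        total -= q * (A - p) / (p * (1.0 - p))
    return total

def em_fit(intervals, indicators, lam0=1.0, iters=200):
    """Iterate 1/lam_{k+1} = 1/lam_k - D(lam_k)/n a fixed number of times."""
    lam, n = lam0, len(intervals)
    for _ in range(iters):
        lam = 1.0 / (1.0 / lam - score(lam, intervals, indicators) / n)
    return lam

# Mixed intervals (illustrative data with an interior maximum).
intervals = [(0.0, 1.0), (0.0, 2.0), (0.5, 1.5), (1.0, 3.0)]
indicators = [1, 0, 1, 0]
lam_mixed = em_fit(intervals, indicators)

# Identical intervals (0, 1), half the indicators equal to 1: MLE is ln 2.
lam_simple = em_fit([(0.0, 1.0)] * 4, [1, 0, 1, 0])
print(lam_mixed, lam_simple, math.log(2))
```

At a fixed point of the update, $D(\lambda)=0$, so the limit solves the likelihood equation from part (2).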