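Fisher's maximum likelihood idea: a random experiment has several possible outcomes, but in any single trial exactly one of them occurs. If the outcome $\omega$ occurs in a given trial, we take that outcome (the event $\{\omega\}$) to be the one whose probability $P\{\omega\}$ is largest.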
Suppose the population $X$ is a discrete random variable with distribution law:
$$P\{X=a_k\}=p_k(\theta)\quad(k=1,2,\ldots)$$
where $\theta$ ($\theta\in\Theta$) is an unknown parameter.
Let $X_1, X_2, \ldots, X_n$ be a sample from the population $X$, and let $x_1, x_2, \ldots, x_n$ be the observed values of the sample. In other words, the event $\{X_1=x_1, X_2=x_2, \ldots, X_n=x_n\}$ has occurred.
By Fisher's maximum likelihood idea, the probability $P\{X_1=x_1, X_2=x_2, \ldots, X_n=x_n\}$ should be as large as possible. Since the sample components are independent and each has the same distribution as $X$, this probability factors into a function of $\theta$:
$$\begin{aligned} &P\{X_1=x_1, X_2=x_2, \ldots, X_n=x_n\}\\ &=P\{X_1=x_1\}P\{X_2=x_2\}\cdots P\{X_n=x_n\}\\ &=P\{X=x_1\}P\{X=x_2\}\cdots P\{X=x_n\}=L(\theta) \end{aligned}$$
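Definition 1:
Let $X_1, X_2, \ldots, X_n$ be a sample from the population $X$, and let $x_1, x_2, \ldots, x_n$ be the observed values of the sample. The function $L(\theta)=\prod_{i=1}^n P\{X=x_i\}$, $\theta\in\Theta$, regarded as a function of $\theta$, is called the likelihood function. When $X$ is continuous, the density takes the place of the probability: $L(\theta)=\prod_{i=1}^n f(x_i)$.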
Example 1: Let $X_1, X_2, \ldots, X_n$ be a sample from the population $X\sim B(1,p)$, and let $x_1, x_2, \ldots, x_n$ be the observed values of the sample, where $p$ is an unknown parameter. Write down the likelihood function.
Solution: $P\{X=x\}=p^x(1-p)^{1-x}$, where $x\in\{0,1\}$.
$$\begin{aligned} L(p)&=\prod_{i=1}^nP\{X_i=x_i\}\\ &=\prod_{i=1}^np^{x_i}(1-p)^{1-x_i}\\ &=p^{n\bar x}(1-p)^{n(1-\bar x)} \end{aligned}$$
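As a quick numerical check of Example 1, the Python sketch below evaluates the likelihood both as the raw product and in the closed form $p^{n\bar x}(1-p)^{n(1-\bar x)}$. The sample values and the function names are assumptions made for illustration; they are not part of the original example.

```python
import math

def bernoulli_likelihood(xs, p):
    """Product form: L(p) = prod_i p^x_i * (1 - p)^(1 - x_i)."""
    result = 1.0
    for x in xs:
        result *= p ** x * (1 - p) ** (1 - x)
    return result

def bernoulli_likelihood_closed(xs, p):
    """Closed form: L(p) = p^(n*xbar) * (1 - p)^(n*(1 - xbar))."""
    n = len(xs)
    successes = sum(xs)  # equals n * xbar
    return p ** successes * (1 - p) ** (n - successes)

# Hypothetical observations: three ones and two zeros.
sample = [1, 0, 1, 1, 0]

for p in (0.2, 0.6, 0.9):
    # The two columns should agree up to floating-point rounding.
    print(p, bernoulli_likelihood(sample, p), bernoulli_likelihood_closed(sample, p))
```

For this sample both functions reduce to $p^3(1-p)^2$, which makes the agreement easy to verify by hand.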
Example 2: Let $X_1, X_2, \ldots, X_n$ be a sample from the population $X\sim N(\mu,\sigma^2)$, and let $x_1, x_2, \ldots, x_n$ be the observed values of the sample, where $\mu$ and $\sigma^2$ are unknown parameters. Write down the likelihood function.
**Solution:** The density function of the normal distribution is $f(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.
The likelihood function can therefore be written as:
$$\begin{aligned} L(\mu,\sigma^2)&=\prod_{i=1}^nf(x_i)\\ &=\prod_{i=1}^n\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}\\ &=\left(\frac{1}{\sqrt{2\pi}}\right)^n(\sigma^2)^{-\frac{n}{2}}e^{-\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\mu)^2} \end{aligned}$$
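The simplification in Example 2 can be checked numerically in the same way. The sketch below compares the product of densities with the closed form; the data, the parameter values, and the function names are hypothetical and chosen only for illustration.

```python
import math

def normal_likelihood_product(xs, mu, sigma2):
    """L(mu, sigma^2) as the product of the normal densities f(x_i)."""
    sigma = math.sqrt(sigma2)
    result = 1.0
    for x in xs:
        density = math.exp(-(x - mu) ** 2 / (2 * sigma2)) / (math.sqrt(2 * math.pi) * sigma)
        result *= density
    return result

def normal_likelihood_closed(xs, mu, sigma2):
    """Closed form: (1/sqrt(2*pi))^n * (sigma^2)^(-n/2) * exp(-sum((x_i - mu)^2) / (2*sigma^2))."""
    n = len(xs)
    sum_sq = sum((x - mu) ** 2 for x in xs)
    return (1 / math.sqrt(2 * math.pi)) ** n * sigma2 ** (-n / 2) * math.exp(-sum_sq / (2 * sigma2))

# Hypothetical observations and parameter values.
sample = [2.1, 1.9, 2.4, 2.0, 1.6]
print(normal_likelihood_product(sample, 2.0, 0.25))
print(normal_likelihood_closed(sample, 2.0, 0.25))
```

For long samples the raw product underflows quickly, which is one practical reason to work with the logarithm of the likelihood instead.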
Definition 2:
Let $X_1, X_2, \ldots, X_n$ be a sample from the population $X$, let $x_1, x_2, \ldots, x_n$ be the observed values of the sample, and let $L(\theta)$ ($\theta\in\Theta$) be the likelihood function. If there exists a statistic $\hat\theta=\hat\theta(x_1,x_2,\cdots,x_n)$ such that:
$$L(\hat\theta)=\sup_{\theta\in\Theta}L(\theta)$$
then $\hat\theta=\hat\theta(X_1,X_2,\cdots,X_n)$ is called the maximum likelihood estimator of $\theta$, abbreviated MLE (Maximum Likelihood Estimate).
Remark:
Let $l=\ln L$ denote the log-likelihood function. If $l(\theta_1,\theta_2,\cdots,\theta_m)$ is differentiable with respect to each $\theta_i$ ($i=1,2,\cdots,m$), then the system
$$\left\{\begin{aligned} &\frac{\partial l(\theta_1,\theta_2,\cdots,\theta_m)}{\partial \theta_1}=0\\ &\frac{\partial l(\theta_1,\theta_2,\cdots,\theta_m)}{\partial \theta_2}=0\\ &\vdots\\ &\frac{\partial l(\theta_1,\theta_2,\cdots,\theta_m)}{\partial \theta_m}=0 \end{aligned} \right.$$
is called the system of log-likelihood equations.
Example 3: Let $X_1, X_2, \ldots, X_n$ be a sample from the population $X\sim B(1,p)$, and let $x_1, x_2, \ldots, x_n$ be the observed values of the sample, where $p$ is an unknown parameter. Find the maximum likelihood estimate of $p$.
Solution: $P\{X=x\}=p^x(1-p)^{1-x}$, where $x\in\{0,1\}$.
$$\begin{aligned} L(p)&=\prod_{i=1}^nP\{X_i=x_i\}\\ &=\prod_{i=1}^np^{x_i}(1-p)^{1-x_i}\\ &=p^{n\bar x}(1-p)^{n(1-\bar x)} \end{aligned}$$
The log-likelihood function is then:
$$l(p)=\ln L(p)=n\bar x\ln p+n(1-\bar x)\ln(1-p)$$
Differentiating $l(p)$ and setting the derivative to zero:
$$\begin{aligned} \frac{dl(p)}{dp}&=n\bar x\frac{1}{p}-n(1-\bar x)\frac{1}{1-p}=0\\ &\Rightarrow n\bar x(1-p)-n(1-\bar x)p=0\\ &\Rightarrow n\bar x-np=0\\ &\Rightarrow \hat p=\bar x \end{aligned}$$
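The result $\hat p=\bar x$ can be sanity-checked by brute force. The following Python sketch maximizes the log-likelihood over a fine grid of candidate values and compares the maximizer with the sample mean. The sample, the grid resolution, and the function name are assumptions for illustration, and the sample is chosen to contain both zeros and ones so that $0<\bar x<1$.

```python
import math

def bernoulli_log_likelihood(xs, p):
    """l(p) = n*xbar*ln(p) + n*(1 - xbar)*ln(1 - p), for 0 < p < 1."""
    n = len(xs)
    successes = sum(xs)  # equals n * xbar
    return successes * math.log(p) + (n - successes) * math.log(1 - p)

# Hypothetical observations: five ones out of eight trials.
sample = [1, 0, 1, 1, 0, 1, 1, 0]
x_bar = sum(sample) / len(sample)

# Candidate values 0.001, 0.002, ..., 0.999 (endpoints excluded because ln(0) is undefined).
grid = [k / 1000 for k in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(sample, p))

print("sample mean:", x_bar)
print("grid maximizer:", p_grid)
```

A grid search is used here only because it makes no calculus assumptions; it is a check on the derivation, not a replacement for it.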
Example 4: Let $X_1, X_2, \ldots, X_n$ be a sample from the population $X\sim N(\mu,\sigma^2)$, and let $x_1, x_2, \ldots, x_n$ be the observed values of the sample, where $\mu$ and $\sigma^2$ are unknown parameters. Find the maximum likelihood estimates of $\mu$ and $\sigma^2$.
**Solution:** The density function of the normal distribution is $f(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.
The likelihood function can therefore be written as:
$$\begin{aligned} L(\mu,\sigma^2)&=\prod_{i=1}^nf(x_i)\\ &=\prod_{i=1}^n\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}\\ &=\left(\frac{1}{\sqrt{2\pi}}\right)^n(\sigma^2)^{-\frac{n}{2}}e^{-\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\mu)^2} \end{aligned}$$
The log-likelihood function is then:
$$l(\mu,\sigma^2)=-\frac{n}{2}\ln{2\pi}-\frac{n}{2}\ln \sigma^2-\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\mu)^2$$
Taking partial derivatives with respect to $\mu$ and $\sigma^2$ and setting them to zero gives:
$$\begin{aligned} \frac{\partial l}{ \partial \mu}&=\frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu)=0\\ \frac{\partial l}{ \partial \sigma^2}&=-\frac{n}{2\sigma^2}+\frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i-\mu)^2=0\\ &\Rightarrow \hat \mu=\frac{1}{n}\sum_{i=1}^{n}x_i=\bar x\\ &\Rightarrow \hat \sigma^2=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar x)^2 \end{aligned}$$
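To make the closed-form solution of Example 4 concrete, the Python sketch below computes $\hat\mu$ and $\hat\sigma^2$ for a small data set and confirms that both partial derivatives of the log-likelihood vanish there. The data and the function names are hypothetical. Note that $\hat\sigma^2$ divides by $n$ rather than $n-1$, so it matches the population variance returned by `statistics.pvariance` in the standard library.

```python
import statistics

def normal_mle(xs):
    """Closed-form MLE for N(mu, sigma^2): the sample mean and the 1/n variance."""
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, sigma2_hat

def log_likelihood_gradient(xs, mu, sigma2):
    """Partial derivatives of l(mu, sigma^2) with respect to mu and sigma^2."""
    n = len(xs)
    d_mu = sum(x - mu for x in xs) / sigma2
    sum_sq = sum((x - mu) ** 2 for x in xs)
    d_sigma2 = -n / (2 * sigma2) + sum_sq / (2 * sigma2 ** 2)
    return d_mu, d_sigma2

# Hypothetical observations.
sample = [2.1, 1.9, 2.4, 2.0, 1.6]
mu_hat, sigma2_hat = normal_mle(sample)
d_mu, d_sigma2 = log_likelihood_gradient(sample, mu_hat, sigma2_hat)

print("mu_hat:", mu_hat, "sigma2_hat:", sigma2_hat)
print("gradient at the MLE:", d_mu, d_sigma2)  # both should be zero up to rounding
print("statistics.pvariance:", statistics.pvariance(sample))
```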
Theorem (invariance): Let $\hat\theta$ be the maximum likelihood estimate of $\theta$, and let $u=u(\theta)$ be a function of $\theta$ that has a single-valued inverse function:
$$\theta=\theta(u)$$
Then $u(\hat\theta)$ is the maximum likelihood estimate of $u$.
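One way to see the invariance property at work is to reparametrize a familiar model and maximize directly in the new parameter. The Python sketch below does this for the Bernoulli model with the odds $u=p/(1-p)$, whose inverse is $p=u/(1+u)$. This choice of $u$, the sample, and the grid are assumptions made for illustration; they do not come from the original text.

```python
import math

def log_likelihood_in_odds(xs, u):
    """Bernoulli log-likelihood written in terms of the odds u = p / (1 - p).

    Substituting p = u / (1 + u) into l(p) gives n*xbar*ln(u) - n*ln(1 + u).
    """
    n = len(xs)
    successes = sum(xs)  # equals n * xbar
    return successes * math.log(u) - n * math.log(1 + u)

# Hypothetical observations: five ones out of eight trials.
sample = [1, 0, 1, 1, 0, 1, 1, 0]
p_hat = sum(sample) / len(sample)  # MLE of p, the sample mean

# Plug-in estimate given by the theorem: u(p_hat).
u_plugin = p_hat / (1 - p_hat)

# Direct maximization over candidate odds 0.001, 0.002, ..., 5.000.
grid = [k / 1000 for k in range(1, 5001)]
u_grid = max(grid, key=lambda u: log_likelihood_in_odds(sample, u))

print("plug-in estimate:", u_plugin)
print("direct grid maximizer:", u_grid)
```

The two values agree up to the grid spacing, as the theorem predicts for a one-to-one transformation.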
Example 5: Suppose a bag contains black balls and white balls, and the proportion of white balls is $p$ ($0<p<1$).
Question: does the method-of-moments estimator also have the invariance property?