Maximum likelihood estimation for the normal distribution
For a univariate normal distribution, the parameters to be estimated are $\theta=[\theta_1,\theta_2]^T=[\mu,\sigma^2]^T$, and the maximum likelihood estimators are:
$$
\begin{aligned}
&\hat \mu=\frac{1}{N}\sum\limits_{k=1}^N x_k \\
&\hat \sigma^2=\frac{1}{N}\sum\limits_{k=1}^N(x_k-\hat \mu)^2
\end{aligned}
$$
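The two univariate estimators can be checked numerically. The following is a minimal sketch in plain Python; the function name `gaussian_mle` and the synthetic data (true mean 2.0, true standard deviation 1.5) are illustrative assumptions, not part of the original notes.

```python
import random

def gaussian_mle(samples):
    """Return (mu_hat, sigma2_hat), the ML estimates for a 1-D normal."""
    n = len(samples)
    mu_hat = sum(samples) / n
    # The factor is 1/N, not 1/(N-1): the ML variance estimate is biased.
    sigma2_hat = sum((x - mu_hat) ** 2 for x in samples) / n
    return mu_hat, sigma2_hat

# Hypothetical data drawn from N(2.0, 1.5^2) with a fixed seed.
rng = random.Random(0)
data = [rng.gauss(2.0, 1.5) for _ in range(10_000)]
mu_hat, sigma2_hat = gaussian_mle(data)
print(mu_hat, sigma2_hat)  # expected to be near 2.0 and 2.25
```

With many samples the estimates should approach the true mean and variance; for small $N$ the variance estimate is systematically a little too small because of the $1/N$ factor.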
For a multivariate normal distribution:
$$
\begin{aligned}
&\hat \mu=\frac{1}{N}\sum\limits_{i=1}^N x_i \\
&\hat \Sigma=\frac{1}{N}\sum\limits_{i=1}^N(x_i-\hat \mu)(x_i-\hat \mu)^T
\end{aligned}
$$
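For the multivariate case, the mean is a vector and the covariance estimate is an average of outer products. A minimal sketch using nested lists follows; the function name `multivariate_gaussian_mle` and the three sample points are assumptions made for illustration.

```python
def multivariate_gaussian_mle(samples):
    """samples: list of d-dimensional vectors (lists of floats).
    Returns (mu_hat, sigma_hat) with sigma_hat a d x d nested list."""
    n = len(samples)
    d = len(samples[0])
    mu_hat = [sum(x[j] for x in samples) / n for j in range(d)]
    sigma_hat = [[0.0] * d for _ in range(d)]
    for x in samples:
        diff = [x[j] - mu_hat[j] for j in range(d)]
        # Accumulate the outer product (x - mu)(x - mu)^T, scaled by 1/N.
        for a in range(d):
            for b in range(d):
                sigma_hat[a][b] += diff[a] * diff[b] / n
    return mu_hat, sigma_hat

# Hypothetical 2-D samples.
points = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
mu_hat, sigma_hat = multivariate_gaussian_mle(points)
print(mu_hat)
print(sigma_hat)
```

The three points lie on a line, so the estimated covariance matrix is singular here; real data would normally need more samples than dimensions for a usable estimate.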
For independent and identically distributed samples, the joint distribution of the sample set $\mathscr{H}$ is:
$$
p(\mathscr{H} \mid \theta)=\prod_{i=1}^{N} p\left(x_{i} \mid \theta\right)
$$
Bayes' formula gives the posterior distribution of $\theta$:
$$
p(\theta \mid \mathscr{H})=\frac{p(\mathscr{H} \mid \theta) p(\theta)}{\int_{\Theta} p(\mathscr{H} \mid \theta) p(\theta) \mathrm{d} \theta}
$$
The Bayesian estimate of $\theta$ then follows as:
$$
\theta^{*}=\int_{\Theta} \theta p(\theta \mid \mathscr{H}) \mathrm{d} \theta
$$
Alternatively, the probability density of a sample can be obtained directly from the posterior distribution:
$$
p(x \mid \mathscr{H})=\int_{\Theta} p(x \mid \theta) p(\theta \mid \mathscr{H}) \mathrm{d} \theta
$$
△△△△ Bayesian learning steps △△△△
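The Bayesian learning steps can be sketched numerically by replacing the integrals over $\Theta$ with sums over a grid. The example below assumes a normal likelihood with unknown mean and known standard deviation 1, a $N(0, 2^2)$ prior, and four made-up samples; the names `grid_bayes` and `normal_pdf` are hypothetical helpers, and the grid approximation is a stand-in for the exact integrals.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def grid_bayes(samples, thetas, prior, likelihood):
    """Approximate p(theta | H) on an evenly spaced grid.
    Returns (posterior weights summing to 1, Bayes estimate theta*)."""
    unnormalized = []
    for theta, p0 in zip(thetas, prior):
        lik = 1.0
        for x in samples:          # p(H | theta) = prod_i p(x_i | theta)
            lik *= likelihood(x, theta)
        unnormalized.append(lik * p0)
    total = sum(unnormalized)      # plays the role of the denominator integral
    posterior = [u / total for u in unnormalized]
    theta_star = sum(t * w for t, w in zip(thetas, posterior))
    return posterior, theta_star

# Assumed setup: unknown mean, known sigma = 1, prior N(0, 2^2).
samples = [1.8, 2.1, 2.4, 1.9]
thetas = [-5.0 + 0.01 * i for i in range(1001)]
prior = [normal_pdf(t, 0.0, 2.0) for t in thetas]
posterior, theta_star = grid_bayes(
    samples, thetas, prior, lambda x, mu: normal_pdf(x, mu, 1.0))

# Predictive density p(x | H) at x = 2.0, again as a sum over the grid.
predictive = sum(normal_pdf(2.0, t, 1.0) * w for t, w in zip(thetas, posterior))
print(theta_star, predictive)
```

Because the grid is evenly spaced, the grid spacing cancels in the normalization, so the posterior can be stored as weights that sum to 1.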
Bayesian estimation for the normal distribution
Assume that the mean $\mu$ of the sample model is the parameter to be estimated, that the variance $\sigma^2$ is known, and that the prior on $\mu$ is $\mu \sim N(\mu_0,\sigma_0^2)$. Bayesian estimation then gives a normal posterior $p(\mu \mid \mathscr{H}) \sim N(\mu_N,\sigma_N^2)$ with:
$$
\begin{aligned}
&\mu_N=\frac{N\sigma_0^2}{N\sigma_0^2+\sigma^2}m_N+\frac{\sigma^2}{N\sigma_0^2+\sigma^2}\mu_0 \\
&\sigma_N^2=\frac{\sigma_0^2\sigma^2}{N\sigma_0^2+\sigma^2}
\end{aligned}
$$
where $m_N=\frac{1}{N}\sum\limits_{i=1}^N x_i$ is the arithmetic mean of all observed samples.
The probability density of a sample can also be obtained directly: $p(x \mid \mathscr{H}) \sim N(\mu_N,\sigma^2+\sigma_N^2)$.
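The closed-form update is short enough to compute directly. This sketch assumes four made-up samples, a known variance of 1, and a $N(0, 4)$ prior; the function name `normal_mean_posterior` is hypothetical.

```python
def normal_mean_posterior(samples, sigma2, mu0, sigma0_2):
    """Posterior N(mu_n, sigma_n2) of the mean, given known variance sigma2
    and prior N(mu0, sigma0_2)."""
    n = len(samples)
    m_n = sum(samples) / n                   # arithmetic mean of the samples
    denom = n * sigma0_2 + sigma2
    mu_n = (n * sigma0_2 / denom) * m_n + (sigma2 / denom) * mu0
    sigma_n2 = sigma0_2 * sigma2 / denom
    return mu_n, sigma_n2

# Assumed example values.
mu_n, sigma_n2 = normal_mean_posterior(
    [1.8, 2.1, 2.4, 1.9], sigma2=1.0, mu0=0.0, sigma0_2=4.0)
predictive_var = 1.0 + sigma_n2              # variance of p(x | H)
print(mu_n, sigma_n2, predictive_var)
```

The posterior mean is a weighted average of the sample mean and the prior mean: as $N$ grows the weight on $m_N$ tends to 1 and $\sigma_N^2$ tends to 0, so the data dominate the prior.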
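Histogram method: the estimated probability density within a small region is
$$
\hat{p}(x)=\frac{k}{NV}
$$
where $N$ is the total number of samples, $V$ is the volume of the region containing $x$, and $k$ is the number of samples that fall inside it.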
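In one dimension the region is a bin and $V$ is the bin width. A minimal sketch follows, assuming equal-width bins that start at a chosen `origin`; the function name and the sample values are illustrative.

```python
import math

def histogram_density(x, samples, origin, width):
    """Estimate p(x) as k / (N * V) for the bin that contains x."""
    bin_index = math.floor((x - origin) / width)
    # k: number of samples that fall into the same bin as x.
    k = sum(1 for s in samples
            if math.floor((s - origin) / width) == bin_index)
    return k / (len(samples) * width)

# Hypothetical samples on [0, 1) with bins [0, 0.5) and [0.5, 1).
samples = [0.1, 0.2, 0.35, 0.6, 0.9]
p_left = histogram_density(0.3, samples, origin=0.0, width=0.5)
p_right = histogram_density(0.7, samples, origin=0.0, width=0.5)
print(p_left, p_right)
```

The estimate is constant within each bin, and the bin heights times the bin width sum to 1 over all bins.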
As the number of samples $n$ tends to infinity, $\hat{p}(x)$ converges to $p(x)$ provided that:
$$
(1)\lim \limits_{n\rightarrow\infty} V_n=0, \quad (2)\lim \limits_{n\rightarrow\infty} k_n=\infty, \quad (3)\lim \limits_{n\rightarrow\infty} \frac{k_n}{n}=0
$$
For example, $k_n=\sqrt{n}$ satisfies conditions (2) and (3).
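$k_N$-nearest-neighbor method: the estimated probability density within a small region is
$$
\hat{p}(x)=\frac{k_N}{NV}
$$
where $V$ is the volume of the smallest region around $x$ that contains $k_N$ samples.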
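In one dimension, the region can be taken as the interval centered at $x$ that just reaches the $k_N$-th nearest sample, so $V$ is twice that distance. The sketch below makes this assumption; the function name `knn_density` and the sample values are illustrative.

```python
def knn_density(x, samples, k):
    """1-D k-nearest-neighbor density estimate k / (N * V)."""
    distances = sorted(abs(x - s) for s in samples)
    radius = distances[k - 1]        # distance to the k-th nearest sample
    volume = 2.0 * radius            # length of [x - radius, x + radius]
    return k / (len(samples) * volume)

# Hypothetical samples.
samples = [0.0, 1.0, 2.0, 4.0, 7.0]
p = knn_density(1.5, samples, k=2)
print(p)
```

Unlike the histogram, the region size adapts to the local sample density here. The sketch does not guard against a zero radius, which occurs when $x$ coincides with at least $k$ samples.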
Square window function:
$$
\varphi\left(\left[u_{1}, u_{2}, \cdots, u_{d}\right]^{\mathrm{T}}\right)= \begin{cases}1 & \left|u_{j}\right| \leqslant \frac{1}{2}, \ j=1,2, \cdots, d \\ 0 & \text{otherwise}\end{cases}
$$
Parzen window method: the estimated probability density within a small region is
$$
\hat{p}(x)=\frac{1}{N} \sum_{i=1}^{N} K\left(x, x_{i}\right)
$$
The kernel function $K$ is related to the window function $\varphi$ by:
$$
K\left(x, x_{i}\right)=\frac{1}{V} \varphi\left(\frac{x-x_{i}}{h}\right)
$$
where $h$ is the window width and $V=h^d$ is the window volume.
Several common choices of the kernel function in $\hat{p}(x)=\frac{1}{N} \sum_{i=1}^{N} K\left(x, x_{i}\right)$ are listed below.
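Square window:
$$
K(x,x_i)=
\left\{
\begin{aligned}
&\frac{1}{h^d} \quad |x^j-x_i^j| \leq \frac{h}{2}, \quad j=1,2,\cdots,d \\
&0 \quad \text{otherwise}
\end{aligned}
\right.
$$
Here $x^j$ and $x_i^j$ denote the $j$-th components of $x$ and $x_i$.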
Its relation to the square window function is:
$$
K(x,x_i)=\frac{1}{h^d}\varphi\left(\frac{x-x_i}{h}\right)
$$
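The square-window Parzen estimate counts the samples inside a hypercube of side $h$ centered at $x$ and divides by $N h^d$. A minimal sketch follows; the function names and the 2-D sample points are assumptions for illustration.

```python
def square_window(u):
    """phi(u): 1 if every component satisfies |u_j| <= 1/2, else 0."""
    return 1.0 if all(abs(uj) <= 0.5 for uj in u) else 0.0

def parzen_square(x, samples, h):
    """Parzen estimate with kernel (1 / h^d) * phi((x - x_i) / h)."""
    d = len(x)
    volume = h ** d
    total = 0.0
    for xi in samples:
        u = [(xj - xij) / h for xj, xij in zip(x, xi)]
        total += square_window(u) / volume
    return total / len(samples)

# Hypothetical 2-D samples; two of the four lie in the unit square around x.
samples = [[0.0, 0.0], [0.4, 0.1], [1.0, 1.0], [2.0, 2.0]]
p = parzen_square([0.2, 0.0], samples, h=1.0)
print(p)
```

A smaller $h$ gives a more local but noisier estimate, and a larger $h$ gives a smoother one.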
Gaussian window (normal window):
$$
K(x,x_i)=\frac{1}{\sqrt{(2\pi)^d\rho^{2d}|Q|}}\exp\left[-\frac{1}{2}\frac{(x-x_i)^TQ^{-1}(x-x_i)}{\rho^2}\right]
$$
For a one-dimensional normal window:
$$
K(x,x_i)=\frac{1}{\sqrt{2\pi}\sigma}\exp\left\{-\frac{1}{2}\left(\frac{x-x_i}{\sigma}\right)^2\right\}
$$
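With the one-dimensional normal window, the Parzen estimate is an average of Gaussian bumps centered at the samples. A minimal sketch follows, with $\sigma$ acting as the window width; the function name and the sample values are illustrative assumptions.

```python
import math

def parzen_gaussian_1d(x, samples, sigma):
    """Parzen estimate with a 1-D normal window of width sigma."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    total = sum(norm * math.exp(-0.5 * ((x - xi) / sigma) ** 2)
                for xi in samples)
    return total / len(samples)

# Hypothetical samples.
samples = [-1.0, 0.0, 0.5, 2.0]
p = parzen_gaussian_1d(0.25, samples, sigma=0.5)

# Rough check that the estimate integrates to about 1 (Riemann sum on [-6, 8]).
step = 0.01
area = sum(parzen_gaussian_1d(-6.0 + step * i, samples, 0.5) * step
           for i in range(1401))
print(p, area)
```

Unlike the square window, the result is smooth in $x$ and every sample contributes, with weights that decay with distance.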
Hypersphere window:
$$
K(x,x_i)=
\left\{
\begin{aligned}
&V^{-1} \quad \|x-x_i\| \leq \rho \\
&0 \quad \text{otherwise}
\end{aligned}
\right.
$$
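The hypersphere window can be sketched the same way. The code assumes that $V$ denotes the volume of the $d$-dimensional ball of radius $\rho$, computed with the standard formula $V=\pi^{d/2}\rho^d/\Gamma(d/2+1)$; that reading of $V$, the function names, and the sample points are assumptions not spelled out in the notes.

```python
import math

def ball_volume(d, rho):
    """Volume of a d-dimensional ball of radius rho."""
    return math.pi ** (d / 2) * rho ** d / math.gamma(d / 2 + 1)

def parzen_hypersphere(x, samples, rho):
    """Parzen estimate with kernel 1/V inside radius rho, 0 outside."""
    volume = ball_volume(len(x), rho)
    # Count the samples whose Euclidean distance to x is at most rho.
    inside = sum(1 for xi in samples if math.dist(x, xi) <= rho)
    return inside / (len(samples) * volume)

# Hypothetical 2-D samples; two of the three lie within radius 1 of the origin.
samples = [[0.0, 0.0], [0.5, 0.0], [3.0, 3.0]]
p = parzen_hypersphere([0.0, 0.0], samples, rho=1.0)
print(p)
```

`math.dist` requires Python 3.8 or later.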