Why the Sample Variance Differs from the Sample Second Central Moment

Introduction:

Many students wonder why the sample second central moment $m_2=\frac{1}{n} \sum_{i = 1}^{n}(X_i - \overline{X})^2$ and the sample variance $S^2 = \frac{1}{n-1} \sum_{i=1}^{n}(X_i - \overline{X})^2$ differ by the constant factor $\frac{n-1}{n}$, that is,
$$m_2 = \frac{n-1}{n}S^2$$
The reason lies in the unbiasedness of estimators.
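As a quick numerical illustration of the relation $m_2 = \frac{n-1}{n}S^2$, here is a minimal sketch using Python's standard library. The sample values are hypothetical and chosen only for illustration; `statistics.pvariance` divides the sum of squared deviations by $n$, and `statistics.variance` divides it by $n-1$.

```python
from fractions import Fraction
from statistics import pvariance, variance

# Hypothetical sample, chosen only for illustration.
# Fractions keep the arithmetic exact.
sample = [Fraction(2), Fraction(4), Fraction(4), Fraction(5)]
n = len(sample)

m2 = pvariance(sample)  # sum of squared deviations divided by n
s2 = variance(sample)   # sum of squared deviations divided by n - 1

print("m2 =", m2)
print("S^2 =", s2)
print("(n-1)/n * S^2 =", Fraction(n - 1, n) * s2)
```

The first and last printed values coincide, which is exactly the constant-factor relation stated above.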


1. Finding the problem

Assume the population distribution has mean $\mu$ and variance $\sigma^2$.
Let us see where the problem lies.
Suppose for the moment that $S^2$ were defined the same way as $m_2$, namely as $\frac{1}{n}\sum_{i=1}^{n}(X_i-\overline{X})^2$. Then we can derive:
$$\begin{aligned}
S^2 &= \frac{1}{n}\sum_{i=1}^{n}(X_i - \overline{X})^2 = \frac{1}{n} \sum_{i=1}^{n}(X_i - \mu + \mu - \overline{X})^2 \\
&=\frac{1}{n} \sum_{i = 1}^{n}\left[(X_i - \mu)^2 + 2(X_i-\mu)(\mu-\overline{X}) + (\mu - \overline{X})^2\right] \\
&=\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 + \frac{2}{n}\sum_{i=1}^{n}(X_i-\mu)(\mu-\overline{X})+\frac{1}{n}\sum_{i=1}^{n}(\mu-\overline{X})^2\\
&=\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 + \frac{2}{n}(\mu-\overline{X})\sum_{i=1}^{n}(X_i-\mu)+\frac{1}{n}\sum_{i=1}^{n}(\mu-\overline{X})^2
\end{aligned}$$
Note that $\sum_{i=1}^{n}X_i=\sum_{i=1}^{n}\overline{X}=n\overline{X}$, so $\sum_{i=1}^{n}(X_i-\mu)=\sum_{i=1}^{n}(\overline{X}-\mu)$. Continuing the derivation:
$$\begin{aligned}
S^2 &= \frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 + \frac{2}{n}(\mu-\overline{X})\sum_{i=1}^{n}(\overline{X}-\mu)+\frac{1}{n}\sum_{i=1}^{n}(\mu-\overline{X})^2 \\
&=\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 - \frac{2}{n}\sum_{i=1}^{n}(\mu-\overline{X})^2+\frac{1}{n}\sum_{i=1}^{n}(\mu-\overline{X})^2\\
&= \frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 - \frac{1}{n}\sum_{i=1}^{n}(\mu-\overline{X})^2\\
&= \frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 - (\overline{X}-\mu)^2
\end{aligned}$$
So far this is an identity between random quantities; it does not yet involve $\sigma^2$. Taking expectations of both sides and using $E(X_i)=E(\overline{X})=\mu$ gives:
$$E(S^2)=\frac{1}{n}\sum_{i=1}^{n}E\left[(X_i-\mu)^2\right] - E\left[(\overline{X}-\mu)^2\right]=Var(X_i)-Var(\overline{X})=\sigma^2 - Var(\overline{X})$$
We pause again to evaluate $Var(\overline{X})$. Variance has the following properties:
1. If $c$ is a constant, then $Var(cX)=c^2Var(X)$ (the derivation is simple and omitted).
2. If $X_1, X_2, \dots, X_n$ are mutually independent (pairwise uncorrelated is enough), then $Var(X_1+X_2+\dots+X_n) = Var(X_1) + Var(X_2) +\dots+Var(X_n)$.
Each observation $X_i$ can be viewed as a random variable with the same distribution as the population, so its variance equals the population variance. In a simple random sample the $X_i$ are also independent of one another, so property 2 applies. Therefore:
$$\begin{aligned}
Var(\overline{X})&=Var\left(\frac{1}{n}\sum_{i=1}^{n}X_i\right)=\frac{1}{n^2}Var\left(\sum_{i=1}^{n}X_i\right)\\
&=\frac{1}{n^2}\sum_{i=1}^{n}Var(X_i)=\frac{1}{n^2}\cdot n\cdot Var(X_i) = \frac{1}{n}\sigma^2
\end{aligned}$$
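The result $Var(\overline{X})=\frac{1}{n}\sigma^2$ can be checked exactly on a small example. The sketch below assumes a hypothetical population, a fair six-sided die, and enumerates every possible sample of $n$ independent rolls; the function name `var_of_sample_mean` is introduced here only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: a fair six-sided die (values 1..6, equally likely).
values = [Fraction(v) for v in range(1, 7)]
mu = sum(values) / len(values)
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)

def var_of_sample_mean(n):
    """Exact Var(X-bar) for n independent rolls, by enumerating all 6**n samples."""
    means = [sum(s) / n for s in product(values, repeat=n)]
    e_mean = sum(means) / len(means)
    return sum((m - e_mean) ** 2 for m in means) / len(means)

for n in (1, 2, 3):
    # The two printed values should agree for every n.
    print(n, var_of_sample_mean(n), sigma2 / n)
```

Because all samples are equally likely, averaging over the enumeration gives the exact variance of $\overline{X}$ rather than a simulated approximation.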
Substituting $Var(\overline{X})=\frac{1}{n}\sigma^2$ into the expectation of the expression above gives:
$$E(S^2)=\sigma^2 - Var(\overline{X})=\sigma^2 - \frac{1}{n}\sigma^2=\frac{n-1}{n}\sigma^2$$
This shows that $m_2$ is a biased estimator of $\sigma^2$: its expectation is $\frac{n-1}{n}\sigma^2$, so on average it underestimates $\sigma^2$.
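The bias can be seen exactly on a tiny example. The following is a minimal sketch assuming a hypothetical population, a fair coin taking values 0 and 1 (so $\sigma^2=\frac{1}{4}$), and samples of size $n=3$. It averages the $\frac{1}{n}$-normalized statistic over all equally likely samples.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: a fair coin with values 0 and 1.
values = [Fraction(0), Fraction(1)]
n = 3

mu = sum(values) / len(values)
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)

def second_central_moment(sample):
    """Sum of squared deviations from the sample mean, divided by n."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample) / len(sample)

# All 2**n samples are equally likely, so the plain average is the expectation.
samples = list(product(values, repeat=n))
expected_m2 = sum(second_central_moment(s) for s in samples) / len(samples)

print("sigma^2          =", sigma2)
print("E(m_2)           =", expected_m2)
print("(n-1)/n * sigma^2 =", Fraction(n - 1, n) * sigma2)
```

The expectation comes out strictly below $\sigma^2$ and matches $\frac{n-1}{n}\sigma^2$, in line with the derivation.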


2. Proving the result

Next we need to prove that the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i-\overline{X})^2$ is an unbiased estimator of the population variance $\sigma^2$. We start by expanding the sum of squares:
$$\begin{aligned}
\sum_{i=1}^{n}(X_i-\overline{X})^2&=\sum_{i=1}^{n}\left[(X_i-\mu)-(\overline{X}-\mu)\right]^2\\
&=\sum_{i=1}^{n}(X_i-\mu)^2-2(\overline{X}-\mu)\sum_{i=1}^{n}(X_i - \mu)+\sum_{i=1}^{n}(\overline{X}-\mu)^2 \\
&=\sum_{i=1}^{n}(X_i-\mu)^2-2(\overline{X}-\mu)\sum_{i=1}^{n}(X_i - \mu)+n(\overline{X}-\mu)^2
\end{aligned}$$
Note that $\sum_{i=1}^{n}(X_i-\mu)=n(\overline{X}-\mu)$, so we continue:
$$\begin{aligned}
\sum_{i=1}^{n}(X_i-\overline{X})^2&=\sum_{i=1}^{n}(X_i-\mu)^2-2n(\overline{X}-\mu)^2+n(\overline{X}-\mu)^2\\
&=\sum_{i=1}^{n}(X_i-\mu)^2-n(\overline{X}-\mu)^2
\end{aligned}$$
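The identity $\sum_{i=1}^{n}(X_i-\overline{X})^2=\sum_{i=1}^{n}(X_i-\mu)^2-n(\overline{X}-\mu)^2$ is purely algebraic, so it holds for any fixed number in place of $\mu$. A small sketch can confirm this numerically; the sample and the candidate values of $\mu$ below are hypothetical, and the helper names `lhs` and `rhs` are introduced only for this check.

```python
from fractions import Fraction

def lhs(sample):
    """Sum of squared deviations from the sample mean."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample)

def rhs(sample, mu):
    """Sum of squared deviations from mu, minus n * (xbar - mu)^2."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - mu) ** 2 for x in sample) - n * (xbar - mu) ** 2

# Hypothetical sample and a few arbitrary choices of mu.
sample = [Fraction(2), Fraction(4), Fraction(4), Fraction(5)]
for mu in (Fraction(0), Fraction(3), Fraction(15, 4)):
    print("mu =", mu, "| lhs =", lhs(sample), "| rhs =", rhs(sample, mu))
```

Both sides agree for every choice of $\mu$, which is why the identity can be used before anything is assumed about the distribution.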
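Each $X_i$ has the population mean, and the sample mean is an unbiased estimator of the population mean, so
$$E(X_i)=E(\overline{X})=\mu$$

Therefore the expectations of squared deviations from $\mu$ are variances, and (using the independence of the $X_i$ for the variance of the sum) we have: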
$$\begin{aligned}
E\left[\sum_{i=1}^{n}(X_i-\mu)^2\right]&=\sum_{i=1}^{n}E(X_i-\mu)^2=\sum_{i=1}^{n}Var(X_i)=n\sigma^2\\
E\left[n(\overline{X}-\mu)^2\right]&=nE(\overline{X}-\mu)^2=nVar(\overline{X})=n\cdot \frac{1}{n^2}\sum_{i=1}^{n}Var(X_i)=n\cdot \frac{n}{n^2}\cdot \sigma^2\\
&=\sigma^2
\end{aligned}$$
So finally:
$$\begin{aligned}
E(S^2)&=\frac{1}{n-1}E\left(\sum_{i=1}^{n}(X_i-\overline{X})^2\right)=\frac{1}{n-1}E\left(\sum_{i=1}^{n}(X_i-\mu)^2-n(\overline{X}-\mu)^2\right)\\
&=\frac{1}{n-1}(n\sigma^2-\sigma^2)=\sigma^2
\end{aligned}$$
This shows that the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i-\overline{X})^2$ is an unbiased estimator of the population variance $\sigma^2$.
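The conclusion can also be checked empirically. The sketch below is a simple Monte Carlo simulation under assumptions not taken from the text above: a standard normal population ($\mu=0$, $\sigma^2=1$), sample size $n=5$, 20000 repetitions, and a fixed seed for reproducibility. The function name `simulate` is hypothetical.

```python
import random

def simulate(n=5, trials=20000, seed=0):
    """Average the 1/n and 1/(n-1) estimators over many simulated samples."""
    rng = random.Random(seed)  # fixed seed so reruns give the same numbers
    total_m2 = 0.0
    total_s2 = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed N(0, 1) population
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)
        total_m2 += ss / n        # second central moment m_2
        total_s2 += ss / (n - 1)  # sample variance S^2
    return total_m2 / trials, total_s2 / trials

avg_m2, avg_s2 = simulate()
print("average m_2:", avg_m2)  # should be near (n-1)/n * sigma^2 = 0.8
print("average S^2:", avg_s2)  # should be near sigma^2 = 1.0
```

With these settings the average of $S^2$ should land close to $\sigma^2$, while the average of $m_2$ should sit close to $\frac{n-1}{n}\sigma^2$; the exact digits depend on the seed and the number of trials.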
