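In 1809, in the course of his work on "Theory of the Motion of the Heavenly Bodies", Gauss found that the errors of observation follow a normal distribution.
A derivation closer to the original can be found at https://zhuanlan.zhihu.com/p/387653090. Gauss's mathematical intuition in the original is so strong that the argument is hard to follow, though, and I personally find the approach below easier to understand, so I am sharing it.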
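Let the error density function be f(x), let x1, x2, … , xn be n independent observations, and let X be the true value. f(x) is the probability density of an error of size x, where error = observation - true value. Assuming every observation is independent and random, Gauss held that the error density function f(x) should have the following properties:
The following points are not stated explicitly in Gauss's book, but given the 5 points above they can be regarded as his implicit assumptions: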
The likelihood function is $L(X) = f(x_1-X)f(x_2-X)\cdots f(x_n-X)$. We want $L(X)$ to be as large as possible, at which point its derivative is 0; the goal is to find the $f(x)$ for which this holds. To simplify the calculation, take logarithms:
$$\ln{L(X)} = \sum_{i=1}^{n}\ln{f(x_i-X)}$$
Then differentiate both sides with respect to $X$:
$$\frac{d\ln{L(X)}}{dX} = -\sum_{i=1}^{n} \frac{f^{\prime}(x_i-X)}{f(x_i-X)} = 0$$
We are looking for the maximum of $L(X)$, so its derivative is set to 0.
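As a quick numerical illustration of this condition, the sketch below assumes a Gaussian f (which is where the derivation ends up, not something known at this step) and checks by brute force that the log-likelihood peaks at the sample mean. The observations, `sigma`, the grid bounds, and the function names are all made up for the example.

```python
import math

def log_likelihood(X, xs, sigma=1.0):
    """ln L(X) = sum_i ln f(x_i - X), with f ASSUMED to be a zero-mean Gaussian density."""
    const = -math.log(math.sqrt(2 * math.pi) * sigma)
    return sum(const - (x - X) ** 2 / (2 * sigma ** 2) for x in xs)

def argmax_on_grid(xs, lo, hi, steps=10001):
    """Brute-force search for the X that maximizes ln L on an evenly spaced grid."""
    best_X, best_val = lo, log_likelihood(lo, xs)
    for k in range(1, steps):
        X = lo + (hi - lo) * k / (steps - 1)
        val = log_likelihood(X, xs)
        if val > best_val:
            best_X, best_val = X, val
    return best_X

xs = [9.8, 10.1, 10.4, 9.9, 10.3]      # hypothetical observations
mean = sum(xs) / len(xs)               # arithmetic mean of the observations
X_hat = argmax_on_grid(xs, 9.0, 11.0)  # numerical maximizer of ln L
print(mean, X_hat)
```

The two printed values should agree up to the grid spacing, which is the property the rest of the derivation runs in reverse: it asks which f makes the mean the maximizer.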
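Let $g(x) = \frac{f^{\prime}(x)}{f(x)}$, so that $\sum_{i=1}^{n} g(x_i-X) = 0$. From "Gauss's assumptions about the error function", $g(x)$ is an odd function on the real line $\mathbb{R}$. We can regard $\sum_{i=1}^{n} g(x_i-X)$ as a function of the several variables $x_1, \ldots, x_n$. Gauss now assumes that the estimate of the true value $X$ is the arithmetic mean $\bar{x}$. The sum must then vanish for every set of observations, so its partial derivative with respect to each $x_j$ must be 0, which gives the following system of equations: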
$$g^{\prime}(x_1-\bar{x})(1-\frac{1}{n}) + g^{\prime}(x_2-\bar{x})(-\frac{1}{n}) + \cdots + g^{\prime}(x_n-\bar{x})(-\frac{1}{n}) = 0$$
$$g^{\prime}(x_1-\bar{x})(-\frac{1}{n}) + g^{\prime}(x_2-\bar{x})(1-\frac{1}{n}) + \cdots + g^{\prime}(x_n-\bar{x})(-\frac{1}{n}) = 0$$
…
$$g^{\prime}(x_1-\bar{x})(-\frac{1}{n}) + g^{\prime}(x_2-\bar{x})(-\frac{1}{n}) + \cdots + g^{\prime}(x_n-\bar{x})(1-\frac{1}{n}) = 0$$
Solving this homogeneous linear system for the unknowns $g^{\prime}(x_i-\bar{x})$ gives the solution vector $C(1,1,\ldots,1)^{\top}$,
that is, $g^{\prime}(x_1-\bar{x}) = g^{\prime}(x_2-\bar{x}) = \cdots = g^{\prime}(x_n-\bar{x}) = C$
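One way to see why only constant vectors solve the system: writing $u_j$ for $g^{\prime}(x_j-\bar{x})$ (a shorthand introduced here, not in the original), row $j$ of the system reads $u_j - \frac{1}{n}\sum_i u_i = 0$, so every $u_j$ must equal the average of all of them. The sketch below evaluates the left-hand sides for a constant vector and for a non-constant one; the function name and test vectors are hypothetical.

```python
def apply_system(u):
    """Left-hand sides of the n equations.

    Row j is sum_i u_i * (delta_ij - 1/n), which simplifies to u_j - mean(u).
    """
    n = len(u)
    mean_u = sum(u) / n
    return [u_j - mean_u for u_j in u]

constant = [2.5, 2.5, 2.5, 2.5]   # u = C * (1, 1, 1, 1) with C = 2.5
varied = [1.0, 2.0, 3.0, 4.0]     # entries not all equal

print(apply_system(constant))     # every equation is satisfied
print(apply_system(varied))       # residuals are nonzero, so this is not a solution
```

Since the residual of row $j$ is just the deviation of $u_j$ from the mean, all residuals vanish exactly when all entries are equal.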
Since the derivative of $g(x)$ is the constant $C$, set $g(x) = Cx + b$.
Since $\sum_{i=1}^{n} g(x_i-\bar{x}) = \sum_{i=1}^{n}C(x_i-\bar{x}) + nb = 0$ and $\sum_{i=1}^{n}(x_i-\bar{x}) = 0$, we get $b = 0$.
It follows that $\frac{f^{\prime}(x)}{f(x)} = Cx$, and integrating gives $f(x) = k e^{\frac{1}{2}Cx^2}$
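The solution of this differential equation can be sanity-checked numerically. The sketch below takes the closed form $k e^{\frac{1}{2}Cx^2}$, estimates $f^{\prime}(x)/f(x)$ with a central difference, and compares it with $Cx$. The values of `C` and `k`, the step `h`, and the sample points are arbitrary choices for illustration.

```python
import math

def f(x, C=-0.5, k=1.0):
    """Candidate solution f(x) = k * exp(C * x^2 / 2)."""
    return k * math.exp(0.5 * C * x * x)

def log_derivative(func, x, h=1e-5):
    """Central-difference estimate of f'(x) / f(x)."""
    return (func(x + h) - func(x - h)) / (2 * h) / func(x)

C = -0.5
points = [-2.0, -0.5, 0.0, 1.0, 3.0]
# Absolute gap between the numerical f'/f and the claimed value C * x
errors = [abs(log_derivative(lambda t: f(t, C=C), x) - C * x) for x in points]
print(max(errors))
```

The largest gap should be tiny compared with the values of $Cx$ themselves, and it does not depend on `k`, which cancels in the ratio.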
For $\int_{-\infty}^{+\infty}f(x) dx = 1$ to hold, the integral must converge, which requires $C < 0$; write $C = -\frac{1}{\sigma^2}$
Using $\int_{-\infty}^{+\infty} e^{-x^2} dx = \sqrt{\pi}$, we obtain $k = \frac{1}{\sqrt{2\pi}\sigma}$
This finally gives the probability density function of the normal distribution: $f(x) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{x^2}{2\sigma^2}}$
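The normalization constant can be checked with a simple numerical integration. The sketch below uses a hand-written trapezoid rule over a wide finite interval, for an arbitrarily chosen `sigma`, and also computes the second moment to show that $\sigma^2$ is the variance. The interval width, step count, and function names are assumptions made for the example.

```python
import math

def normal_pdf(x, sigma):
    """f(x) = exp(-x^2 / (2 sigma^2)) / (sqrt(2 pi) * sigma)."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def trapezoid(func, lo, hi, steps=20000):
    """Composite trapezoid rule on [lo, hi] with equal-width steps."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

sigma = 1.7                           # arbitrary positive value
lo, hi = -10 * sigma, 10 * sigma      # tails beyond this range are negligible
area = trapezoid(lambda x: normal_pdf(x, sigma), lo, hi)
variance = trapezoid(lambda x: x * x * normal_pdf(x, sigma), lo, hi)
print(area, variance, sigma ** 2)
```

The first value should be very close to 1 and the second very close to `sigma ** 2`, matching the choice of $k$ and the reading of $\sigma$ as the standard deviation.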