I. Derivation from the Basic Equations
Population regression model
$$
y_i = \beta_1+\beta_2x_{2i}+\beta_3x_{3i}+\cdots+\beta_kx_{ki}+\varepsilon_i
$$
Sample regression model
$$
y_i = \hat\beta_1+\hat\beta_2x_{2i}+\hat\beta_3x_{3i}+\cdots+\hat\beta_kx_{ki}+e_i
$$
Stacking the $n$ observations gives the sample regression model in matrix form
$$
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}_{n\times 1}
=
\begin{pmatrix} 1 & x_{21} & x_{31} & \cdots & x_{k1} \\ 1 & x_{22} & x_{32} & \cdots & x_{k2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{2n} & x_{3n} & \cdots & x_{kn} \end{pmatrix}_{n\times k}
\begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \\ \vdots \\ \hat\beta_k \end{pmatrix}_{k \times 1}
+
\begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}_{n \times 1}
$$
$$
Y = X\pmb{\hat\beta} + \pmb{e}
$$
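As a concrete illustration of the matrix form, the sketch below builds a design matrix with a leading column of ones for the intercept. The data values and the names `x2`, `x3`, and `y` are made up for illustration, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical data: n = 5 observations, two regressors x2 and x3 (so k = 3).
x2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x3 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1])

# The first column of ones multiplies the intercept beta_1;
# the remaining columns hold the regressors, one row per observation.
X = np.column_stack([np.ones_like(x2), x2, x3])

n, k = X.shape
print(f"X is {n} x {k}, y has length {y.shape[0]}")
```

Each row of `X` corresponds to one observation $i$, matching the row layout of the $n \times k$ matrix in the stacked equation.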
Residual sum of squares
$$
\sum e_i^2 = \sum (y_i-\hat y_i)^2 = \sum \left[y_i-(\hat\beta_1+\hat\beta_2x_{2i}+\cdots+\hat\beta_kx_{ki})\right]^2
$$
Differentiate the residual sum of squares with respect to each estimate ($\hat\beta_1,\hat\beta_2,\cdots,\hat\beta_k$) and set each derivative equal to 0
$$
\left\{
\begin{aligned}
\sum 2e_i(-1) &= -2\sum \left[y_i-(\hat\beta_1+\hat\beta_2x_{2i}+\cdots+\hat\beta_kx_{ki})\right] = 0 \\
\sum 2e_i(-x_{2i}) &= -2\sum x_{2i}\left[y_i-(\hat\beta_1+\hat\beta_2x_{2i}+\cdots+\hat\beta_kx_{ki})\right] = 0 \\
&\;\;\vdots \\
\sum 2e_i(-x_{ki}) &= -2\sum x_{ki}\left[y_i-(\hat\beta_1+\hat\beta_2x_{2i}+\cdots+\hat\beta_kx_{ki})\right] = 0
\end{aligned}
\right.
$$
Dividing each equation by $-2$ gives the normal equations
$$
\left\{
\begin{aligned}
\sum e_i &= \sum \left[y_i-(\hat\beta_1+\hat\beta_2x_{2i}+\cdots+\hat\beta_kx_{ki})\right] = 0 \\
\sum e_ix_{2i} &= \sum x_{2i}\left[y_i-(\hat\beta_1+\hat\beta_2x_{2i}+\cdots+\hat\beta_kx_{ki})\right] = 0 \\
&\;\;\vdots \\
\sum e_ix_{ki} &= \sum x_{ki}\left[y_i-(\hat\beta_1+\hat\beta_2x_{2i}+\cdots+\hat\beta_kx_{ki})\right] = 0
\end{aligned}
\right.
$$
Written in matrix form, where $X^{\tau}$ denotes the transpose of $X$
$$
\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{k1} & x_{k2} & \cdots & x_{kn} \end{pmatrix}
\begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}
= X^{\tau}\pmb{e} = \pmb{0}
$$
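The orthogonality condition $X^{\tau}\pmb{e}=\pmb{0}$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are simulated, the coefficient values are arbitrary, and `np.linalg.lstsq` is used as an independent least-squares solver so that the check does not presuppose the closed-form estimator.

```python
import numpy as np

# Simulated data (assumption for illustration): n = 50, k = 3 with an intercept.
rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.3, size=n)

# Least-squares fit from NumPy's general-purpose solver.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residual vector e = Y - X beta_hat.
e = y - X @ beta_hat

# Left-hand side of the normal equations: one entry per column of X.
normal_eq = X.T @ e
print(normal_eq)  # every entry is zero up to floating-point rounding
```

The first entry of `normal_eq` is $\sum e_i$, so the residuals sum to zero whenever the model includes an intercept.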
Left-multiply both sides of the sample regression model $Y=X\pmb{\hat\beta}+\pmb{e}$ by $X^{\tau}$
$$
X^{\tau}Y = X^{\tau}X\pmb{\hat\beta} + X^{\tau}\pmb{e}
$$
Since $X^{\tau}\pmb{e}=\pmb{0}$, it follows that $X^{\tau}Y = X^{\tau}X\pmb{\hat\beta}$
Provided $X^{\tau}X$ is invertible (that is, $X$ has full column rank), this gives the parameter estimator
$$
\pmb{\hat\beta} = (X^{\tau}X)^{-1}X^{\tau}Y
$$
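A minimal sketch of the estimator in code follows. The helper name `ols` and the data are hypothetical. Rather than forming $(X^{\tau}X)^{-1}$ explicitly, the sketch solves the linear system $X^{\tau}X\pmb{\hat\beta}=X^{\tau}Y$ with `np.linalg.solve`, which yields the same estimate when $X^{\tau}X$ is invertible and is generally the numerically safer choice.

```python
import numpy as np

def ols(X, y):
    """Return beta_hat by solving the normal equations X'X b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Noise-free illustrative data generated from y = 1 + 2x,
# so the estimator should recover the coefficients (1, 2).
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones(4), x])
y = 1.0 + 2.0 * x

beta_hat = ols(X, y)
print(beta_hat)
```

For ill-conditioned design matrices, a QR- or SVD-based routine such as `np.linalg.lstsq` may be preferable to solving the normal equations directly.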
II. Derivation in Matrix Form
$$
\begin{aligned}
Q(\pmb{\hat\beta}) = \pmb{e}^{\tau}\pmb{e} &= (Y-X\pmb{\hat\beta})^{\tau}(Y-X\pmb{\hat\beta}) \\
&= (Y^{\tau}-\pmb{\hat\beta}^{\tau}X^{\tau})(Y-X\pmb{\hat\beta}) \\
&= Y^{\tau}Y-Y^{\tau}X\pmb{\hat\beta}-\pmb{\hat\beta}^{\tau}X^{\tau}Y+\pmb{\hat\beta}^{\tau}X^{\tau}X\pmb{\hat\beta} \\
&= Y^{\tau}Y-2\pmb{\hat\beta}^{\tau}X^{\tau}Y+\pmb{\hat\beta}^{\tau}X^{\tau}X\pmb{\hat\beta}
\end{aligned}
$$
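The last step of the expansion merges the two cross terms, and the reason is worth spelling out. The product $Y^{\tau}X\pmb{\hat\beta}$ has dimension $1\times 1$, so it is a scalar and equals its own transpose:

$$
Y^{\tau}X\pmb{\hat\beta} = \left(Y^{\tau}X\pmb{\hat\beta}\right)^{\tau} = \pmb{\hat\beta}^{\tau}X^{\tau}Y
$$

Hence $-Y^{\tau}X\pmb{\hat\beta}-\pmb{\hat\beta}^{\tau}X^{\tau}Y = -2\pmb{\hat\beta}^{\tau}X^{\tau}Y$.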
Differentiate the residual sum of squares $Q(\pmb{\hat\beta})$ with respect to the parameter vector $\pmb{\hat\beta}$ and set the derivative equal to $\pmb{0}$
$$
\frac{\partial Q(\pmb{\hat\beta})}{\partial \pmb{\hat\beta}} = -2X^{\tau}Y+2X^{\tau}X\pmb{\hat\beta} = \pmb{0}
$$
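This derivative follows from two standard matrix-calculus identities, restated here as a reminder. The symbols $\pmb{b}$, $\pmb{a}$, and $A$ are introduced only for illustration, with $A$ assumed symmetric:

$$
\frac{\partial\,(\pmb{b}^{\tau}\pmb{a})}{\partial \pmb{b}} = \pmb{a},
\qquad
\frac{\partial\,(\pmb{b}^{\tau}A\pmb{b})}{\partial \pmb{b}} = 2A\pmb{b}
$$

Taking $\pmb{b}=\pmb{\hat\beta}$, $\pmb{a}=X^{\tau}Y$, and $A=X^{\tau}X$ (which is symmetric), the term $-2\pmb{\hat\beta}^{\tau}X^{\tau}Y$ contributes $-2X^{\tau}Y$ and the term $\pmb{\hat\beta}^{\tau}X^{\tau}X\pmb{\hat\beta}$ contributes $2X^{\tau}X\pmb{\hat\beta}$. The second derivative is $2X^{\tau}X$, which is positive definite when $X$ has full column rank, so the stationary point is a minimum.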
Solving for $\pmb{\hat\beta}$ gives the parameter estimator
$$
\pmb{\hat\beta} = (X^{\tau}X)^{-1}X^{\tau}Y
$$
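The matrix derivation can be sanity-checked numerically. The sketch below, which uses simulated data and hypothetical helper names `Q` and `grad_Q`, compares the analytic gradient $-2X^{\tau}Y+2X^{\tau}X\pmb{\hat\beta}$ with a central finite-difference approximation at an arbitrary point, and then confirms that the gradient vanishes at the estimator.

```python
import numpy as np

# Simulated data (assumption for illustration): n = 20, k = 3.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])
y = rng.normal(size=20)

def Q(b):
    """Residual sum of squares (Y - Xb)'(Y - Xb)."""
    r = y - X @ b
    return r @ r

def grad_Q(b):
    """Analytic gradient -2X'Y + 2X'Xb from the derivation."""
    return -2 * X.T @ y + 2 * X.T @ X @ b

# Central finite differences at an arbitrary point b0.
b0 = rng.normal(size=3)
h = 1e-6
I = np.eye(3)
numeric = np.array([(Q(b0 + h * I[j]) - Q(b0 - h * I[j])) / (2 * h)
                    for j in range(3)])

# Estimator obtained by solving the normal equations.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(np.max(np.abs(numeric - grad_Q(b0))))  # small: the two gradients agree
print(np.max(np.abs(grad_Q(beta_hat))))      # close to zero at the minimizer
```

Because $Q$ is quadratic in $\pmb{\hat\beta}$, the central difference is exact up to rounding error, which makes this a particularly clean check.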