The Box-Cox transformation is defined as:
$$
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \ne 0 \\
\ln(y), & \lambda = 0
\end{cases}
$$
A shifted form adds a constant $a$, chosen so that $y + a > 0$, for data that are not strictly positive:

$$
y^{(\lambda)} =
\begin{cases}
\dfrac{(y + a)^{\lambda} - 1}{\lambda}, & \lambda \ne 0 \\
\ln(y + a), & \lambda = 0
\end{cases}
$$
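The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation; the function name `box_cox` and the `shift` argument (standing in for $a$) are names chosen here for the example.

```python
import math


def box_cox(y, lam, shift=0.0):
    """Box-Cox transform of one value; `shift` plays the role of a.

    Hypothetical helper for illustration: with shift=0 it is the plain
    transform, otherwise the shifted form.
    """
    v = y + shift
    if v <= 0:
        # Both ln(v) and v**lam (for general lam) need a positive argument.
        raise ValueError("y + shift must be strictly positive")
    if lam == 0:
        return math.log(v)
    return (v ** lam - 1.0) / lam
```

For small nonzero `lam` the power branch approaches the logarithm, which is why the two cases fit together continuously at $\lambda = 0$.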
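Depending on the value of $\lambda$, the Box-Cox transformation covers three families of functions: logarithmic ($\lambda = 0$), power (general $\lambda \ne 0$, for example the square root at $\lambda = 1/2$), and reciprocal ($\lambda = -1$).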
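The goal of the transformation is for the transformed response to satisfy the linear regression assumptions of equal variance, uncorrelated errors, and normality: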
$$
\mathbf{y}^{(\lambda)} =
\begin{bmatrix}
y_1^{(\lambda)} \\
y_2^{(\lambda)} \\
\vdots \\
y_n^{(\lambda)}
\end{bmatrix}
\sim \mathcal{N}(\mathbf{X}\beta, \sigma^2\mathbf{I})
$$
For a fixed $\lambda$, the likelihood of the original observations is:

$$
L(\beta, \sigma^2) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{n} \exp\left(-\frac{1}{2\sigma^2}\left(\mathbf{y}^{(\lambda)} - \mathbf{X}\beta\right)'\left(\mathbf{y}^{(\lambda)} - \mathbf{X}\beta\right)\right) J
$$

where $J$ is the Jacobian of the transformation from $y_i$ to $y_i^{(\lambda)}$:

$$
J = \prod_{i=1}^{n} \left|\frac{d y_i^{(\lambda)}}{d y_i}\right| = \prod_{i=1}^{n} y_i^{\lambda - 1}
$$
For fixed $\lambda$, $J$ is a constant that does not depend on $\beta$ or $\sigma^2$.
Maximizing the likelihood over $\beta$ and $\sigma^2$ therefore gives the maximum likelihood estimates:
$$
\hat{\beta} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}^{(\lambda)}
$$

$$
\hat{\sigma}^2 = \frac{1}{n}\,\mathbf{y}^{(\lambda)'}\left(\mathbf{I} - \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\right)\mathbf{y}^{(\lambda)} = \frac{1}{n}\,\mathrm{SSE}(\lambda, \mathbf{y}^{(\lambda)})
$$

where the residual sum of squares is

$$
\mathrm{SSE}(\lambda, \mathbf{y}^{(\lambda)}) = \mathbf{y}^{(\lambda)'}\left(\mathbf{I} - \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\right)\mathbf{y}^{(\lambda)}
$$
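As a rough illustration of these estimates, the sketch below handles only the special case where $\mathbf{X}$ has an intercept column and one predictor, so that $(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}^{(\lambda)}$ reduces to the familiar closed form. The function name `fit_simple_ols` and the argument `t` (standing for the already transformed response) are assumptions made for this example.

```python
def fit_simple_ols(x, t):
    """Estimates for t_i = b0 + b1 * x_i + e_i, assuming X = [1, x].

    Returns ((b0, b1), sigma2_hat, sse), where sigma2_hat = sse / n is the
    maximum likelihood estimate (divisor n, not n - 2).
    """
    n = len(x)
    x_bar = sum(x) / n
    t_bar = sum(t) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxt = sum((xi - x_bar) * (ti - t_bar) for xi, ti in zip(x, t))
    b1 = sxt / sxx
    b0 = t_bar - b1 * x_bar
    # Residual sum of squares, i.e. SSE for this design.
    sse = sum((ti - b0 - b1 * xi) ** 2 for xi, ti in zip(x, t))
    return (b0, b1), sse / n, sse
```

Note that the divisor is $n$ because this is the maximum likelihood estimate of $\sigma^2$, not the unbiased one.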
The corresponding maximized likelihood is:
$$
L(\hat{\beta}, \hat{\sigma}^2) = \left(2\pi e\,\frac{\mathrm{SSE}(\lambda, \mathbf{y}^{(\lambda)})}{n}\right)^{-\frac{n}{2}} J
$$
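The step from the general likelihood to this expression is short, and can be sketched as follows using only the estimates already stated. At $\hat{\beta}$ the quadratic form in the exponent equals the residual sum of squares, which is $n\hat{\sigma}^2$, so the exponential collapses to a constant:

$$
\begin{aligned}
\exp\left(-\frac{1}{2\hat{\sigma}^2}\,\mathrm{SSE}(\lambda, \mathbf{y}^{(\lambda)})\right)
&= \exp\left(-\frac{n\hat{\sigma}^2}{2\hat{\sigma}^2}\right) = e^{-\frac{n}{2}}, \\
L(\hat{\beta}, \hat{\sigma}^2)
&= \left(2\pi\hat{\sigma}^2\right)^{-\frac{n}{2}} e^{-\frac{n}{2}}\, J
= \left(2\pi e\,\frac{\mathrm{SSE}(\lambda, \mathbf{y}^{(\lambda)})}{n}\right)^{-\frac{n}{2}} J
\end{aligned}
$$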
Taking logarithms, and writing $C = -\frac{n}{2}\ln\frac{2\pi e}{n}$ for the constant that does not depend on $\lambda$:

$$
\ln L(\hat{\beta}, \hat{\sigma}^2) = -\frac{n}{2}\ln \mathrm{SSE}(\lambda, \mathbf{y}^{(\lambda)}) + \ln J + C = -\frac{n}{2}\ln \mathrm{SSE}(\lambda, \mathbf{z}^{(\lambda)}) + C
$$
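Here $\mathbf{z}^{(\lambda)}$ is the transformed response rescaled by the $n$-th root of the Jacobian:

$$
\mathbf{z}^{(\lambda)} = \frac{\mathbf{y}^{(\lambda)}}{J^{1/n}}
$$

The exponent $1/n$ is needed because SSE is quadratic in its second argument: $\mathrm{SSE}(\lambda, \mathbf{z}^{(\lambda)}) = \mathrm{SSE}(\lambda, \mathbf{y}^{(\lambda)}) / J^{2/n}$, so $-\frac{n}{2}\ln \mathrm{SSE}(\lambda, \mathbf{z}^{(\lambda)})$ picks up exactly the $\ln J$ term.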
To find the maximum likelihood estimate of $\lambda$, it therefore suffices to minimize $\mathrm{SSE}(\lambda, \mathbf{z}^{(\lambda)})$.
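The minimization can be sketched as a plain grid search over candidate values of $\lambda$. The code below is an illustration under stated assumptions, not a reference implementation: it assumes strictly positive $y$, a design matrix with an intercept and a single predictor `x`, and rescales by $J^{1/n} = \exp\left((\lambda - 1)\,\overline{\ln y}\right)$ so that the SSE values are comparable across $\lambda$. All function names are hypothetical.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of one strictly positive value."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def sse_simple_ols(x, t):
    """Residual sum of squares of t regressed on [1, x] (assumed design)."""
    n = len(x)
    x_bar, t_bar = sum(x) / n, sum(t) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxt = sum((xi - x_bar) * (ti - t_bar) for xi, ti in zip(x, t))
    b1 = sxt / sxx
    b0 = t_bar - b1 * x_bar
    return sum((ti - b0 - b1 * xi) ** 2 for xi, ti in zip(x, t))


def normalized_sse(x, y, lam):
    """SSE(lambda, z) with z = y^(lambda) / J^(1/n)."""
    n = len(y)
    # J^(1/n) = (prod y_i)^((lam - 1) / n), computed in log space for stability.
    scale = math.exp((lam - 1.0) * sum(math.log(v) for v in y) / n)
    z = [box_cox(v, lam) / scale for v in y]
    return sse_simple_ols(x, z)


def estimate_lambda(x, y, grid):
    """Grid value of lambda with the smallest normalized SSE."""
    return min(grid, key=lambda lam: normalized_sse(x, y, lam))


# Usage: candidate values -2.0, -1.5, ..., 2.0 on made-up data.
grid = [k / 2 for k in range(-4, 5)]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.3, 2.1, 3.9, 7.2, 12.5]
best = estimate_lambda(x, y, grid)
```

A grid search is used here only because it mirrors the text most directly; a one-dimensional numerical optimizer over $\lambda$ would serve the same purpose.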