LDA (Fisher) Linear Discriminant Analysis


For a binary classification problem, suppose there is a linear projection $y_i = W x_i$ that maps the samples $\pmb{X}$ onto a one-dimensional space.


To separate the two classes well, the projection should make the variance (scatter) of samples within each class as small as possible, while pushing the projected samples of different classes as far apart as possible.

Suppose the samples fall into two classes, $w_1$ and $w_2$.

We can then compute the following quantities.

Class mean vectors, in the original space and in the projected space:
$$
m_i=\frac{1}{N_i}\sum_{x \in w_i}x \qquad \bar{m_i}=\frac{1}{N_i}\sum_{y \in w_i}y
$$
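As a quick illustration, here is a minimal NumPy sketch that computes the class means $m_1$ and $m_2$; the two-dimensional toy dataset (class locations, spreads, and sample sizes) is entirely hypothetical, chosen just for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: two 2-D Gaussian classes, 50 samples each
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))  # class w1
X2 = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(50, 2))  # class w2

# Class mean vectors m_i = (1/N_i) * sum of x in w_i
m1 = X1.mean(axis=0)
m2 = X2.mean(axis=0)
print(m1, m2)
```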

Within-class scatter of each class (a matrix in the original space, a scalar after projection):
$$
S_i=\sum_{x \in w_i}(x-m_i)(x-m_i)^T \qquad \bar{S_i}=\sum_{y \in w_i}(y-\bar{m_i})^2
$$
Total within-class scatter matrix:
$$
S_w=S_1+S_2 \qquad \bar{S_w}=\bar{S_1}+\bar{S_2}
$$
Between-class scatter matrix:
$$
S_b=(m_1-m_2)(m_1-m_2)^T \qquad \bar{S_b}=(\bar{m_1}-\bar{m_2})(\bar{m_1}-\bar{m_2})^T
$$
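The scatter matrices follow directly from these definitions. A self-contained sketch (repeating the same hypothetical toy setup so the snippet runs on its own):

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))  # toy class w1
X2 = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(50, 2))  # toy class w2
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

def within_scatter(X, m):
    """S_i = sum over x in the class of (x - m)(x - m)^T."""
    d = X - m          # deviations from the class mean, one row per sample
    return d.T @ d     # equals the sum of the outer products d_k d_k^T

S1, S2 = within_scatter(X1, m1), within_scatter(X2, m2)
Sw = S1 + S2                         # total within-class scatter S_w
Sb = np.outer(m1 - m2, m1 - m2)      # between-class scatter S_b
```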
The Fisher criterion:
$$
J_F(W)=\frac{(\bar{m_1}-\bar{m_2})^2}{\bar{S_1}+\bar{S_2}}
$$
Our optimization objective is therefore to maximize $J_F(W)$. First express the numerator and denominator in terms of $W$. Since $y = Wx$, the projected means satisfy $\bar{m_i} = W m_i$, so
$$
\begin{aligned}
(\bar{m_1}-\bar{m_2})^2 &= W(m_1-m_2)(m_1-m_2)^T W^T \\
&= W S_b W^T
\end{aligned}
$$

$$
\begin{aligned}
\bar{S_i} &= \sum_{y \in w_i}(y-\bar{m_i})^2 \\
&= \sum_{x \in w_i}(Wx-Wm_i)^2 \\
&= \sum_{x \in w_i}W(x-m_i)(x-m_i)^T W^T \\
&= W S_i W^T
\end{aligned}
$$

$$
\begin{aligned}
J_F(W) &= \frac{(\bar{m_1}-\bar{m_2})^2}{\bar{S_1}+\bar{S_2}} \\
&= \frac{W S_b W^T}{W(S_1+S_2)W^T} \\
&= \frac{W S_b W^T}{W S_w W^T}
\end{aligned}
$$
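This identity is easy to verify numerically: for any direction $W$, the criterion computed from the projected samples equals the matrix ratio. A small self-contained check (same hypothetical toy data; the direction is arbitrary, chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))  # toy class w1
X2 = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(50, 2))  # toy class w2
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
Sb = np.outer(m1 - m2, m1 - m2)

w = np.array([1.0, 0.5])             # arbitrary direction, for illustration

# Projected-space form: scalar means and scatters of y = w . x
y1, y2 = X1 @ w, X2 @ w
J_projected = (y1.mean() - y2.mean()) ** 2 / (
    ((y1 - y1.mean()) ** 2).sum() + ((y2 - y2.mean()) ** 2).sum())

# Matrix form: J_F(W) = (W S_b W^T) / (W S_w W^T)
J_matrix = (w @ Sb @ w) / (w @ Sw @ w)
print(J_projected, J_matrix)         # the two values agree
```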

Since $J_F$ is invariant to the scale of $W$, we can fix the denominator, $W S_w W^T = c$, and maximize the numerator with a Lagrange multiplier:
$$
L(W,\lambda)=W S_b W^T-\lambda(W S_w W^T-c)
$$
W W W求偏导数
$$
\frac{\partial L(W,\lambda)}{\partial W}=2\left(S_b W-\lambda S_w W\right)
$$
Setting the derivative to zero (the constant factor of 2 drops out):
$$
S_b W-\lambda S_w W=0
$$
that is,
$$
S_b W=\lambda S_w W
$$
Since $S_w$ is nonsingular, we obtain
$$
S_w^{-1}S_b W=\lambda W
$$
so $W$ can be found as an eigenvector of the matrix $S_w^{-1}S_b$.
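This eigenvector view can be checked directly with NumPy (same toy setup; note that for the two-class case $S_b$ has rank one, so $S_w^{-1}S_b$ has a single nonzero eigenvalue):

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))  # toy class w1
X2 = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(50, 2))  # toy class w2
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
Sb = np.outer(m1 - m2, m1 - m2)

# The eigenvector of Sw^{-1} Sb with the largest eigenvalue maximizes J_F
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
w = eigvecs[:, np.argmax(eigvals.real)].real
print(w / np.linalg.norm(w))         # optimal direction, up to sign
```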
Substituting the definition $S_b=(m_1-m_2)(m_1-m_2)^T$:
$$
S_w^{-1}S_b W=S_w^{-1}(m_1-m_2)(m_1-m_2)^T W
$$
Here $(m_1-m_2)^T W$ is a scalar; call it $R$. Then
$$
\lambda W=S_w^{-1}(m_1-m_2)R
$$
and therefore
$$
W=\frac{R}{\lambda}S_w^{-1}(m_1-m_2)
$$
Since only the direction of $W$ matters (scaling $W$ changes neither $J_F$ nor the induced decision boundary), we can drop the scalar factor $\frac{R}{\lambda}$ and take
$$
W^*=S_w^{-1}(m_1-m_2)
$$
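So in practice no eigendecomposition is needed: one linear solve yields the direction. A minimal sketch on the same toy data, including a simple classification rule (thresholding the projection at the midpoint of the projected class means is one common choice among several):

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))  # toy class w1
X2 = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(50, 2))  # toy class w2
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

# Closed-form Fisher direction W* = Sw^{-1} (m1 - m2);
# np.linalg.solve avoids forming the inverse explicitly
w_star = np.linalg.solve(Sw, m1 - m2)

# Classify by projecting and thresholding at the midpoint of the
# projected class means (a simple choice; class priors could shift it)
threshold = 0.5 * (m1 + m2) @ w_star
x_new = np.array([0.3, 0.2])         # a hypothetical query point
label = "w1" if x_new @ w_star > threshold else "w2"
print(label)                         # expected: "w1" for this point
```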
In summary, the optimal projection direction $W^*$ exists in closed form, and projecting onto it lets LDA handle the binary classification problem well.
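For real work one would normally reach for an existing implementation; for example, scikit-learn's LinearDiscriminantAnalysis fits this kind of linear discriminant (its default solver arrives at an equivalent direction through a different decomposition). A brief usage sketch on the same hypothetical toy data:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))  # toy class w1
X2 = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(50, 2))  # toy class w2
X = np.vstack([X1, X2])
y = np.array([0] * 50 + [1] * 50)    # 0 = w1, 1 = w2

clf = LinearDiscriminantAnalysis().fit(X, y)
print(clf.predict([[0.3, 0.2]]))     # expected: [0], i.e. class w1
```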
