Suppose the dataset contains two positive samples $x^{(1)}=[1,1]^T$ and $x^{(2)}=[2,2]^T$, and two negative samples $x^{(3)}=[0,0]^T$ and $x^{(4)}=[-1,0]^T$. Please calculate the SVM decision hyperplane.
$$
\min_\lambda\ \mathcal{J}(\lambda) = \frac{1}{2}\sum_{i=1}^N\sum_{j=1}^N \lambda_i\lambda_j y^{(i)}y^{(j)}(x^{(i)})^Tx^{(j)} - \sum_{i=1}^N\lambda_i
$$
$$
\text{s.t.}\quad \lambda_i \geqslant 0,\quad \sum_{i=1}^N\lambda_i y^{(i)}=0
$$
From the dataset $D:\{x:\{[1,1],[2,2],[0,0],[-1,0]\},\ y:\{1,1,-1,-1\}\}$ we obtain:
$$
\begin{aligned}
\min_\lambda\ \mathcal{J}(\lambda) ={}& \frac{1}{2}\left(2\lambda_1^2+8\lambda_2^2+\lambda_4^2+8\lambda_1\lambda_2+2\lambda_1\lambda_4+4\lambda_2\lambda_4\right) \\
& - \lambda_1-\lambda_2-\lambda_3-\lambda_4\\
\text{s.t.}\quad & \lambda_1 \geqslant 0,\ \lambda_2\geqslant 0,\ \lambda_3\geqslant 0,\ \lambda_4\geqslant 0\\
& \lambda_1+\lambda_2-\lambda_3-\lambda_4 = 0
\end{aligned}
$$
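The coefficients in this expansion can be checked mechanically. The following is a minimal sketch in plain Python that builds the matrix $Q_{ij}=y^{(i)}y^{(j)}(x^{(i)})^Tx^{(j)}$ of the quadratic form; the names `X`, `y`, `dot`, `Q`, and `dual_objective` are illustrative and not part of the original solution.

```python
# Samples and labels from the problem statement.
X = [[1, 1], [2, 2], [0, 0], [-1, 0]]
y = [1, 1, -1, -1]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Q[i][j] = y_i * y_j * <x_i, x_j>: the matrix of the quadratic form in J.
Q = [[y[i] * y[j] * dot(X[i], X[j]) for j in range(4)] for i in range(4)]


def dual_objective(lam):
    """J(lambda) = 1/2 * lambda^T Q lambda - sum(lambda)."""
    quad = sum(lam[i] * Q[i][j] * lam[j] for i in range(4) for j in range(4))
    return 0.5 * quad - sum(lam)


for row in Q:
    print(row)
```

The diagonal entries 2, 8, 0, 1 are the coefficients of $\lambda_1^2,\lambda_2^2,\lambda_3^2,\lambda_4^2$, and each off-diagonal entry appears twice in the double sum, which is where $8\lambda_1\lambda_2$, $2\lambda_1\lambda_4$, and $4\lambda_2\lambda_4$ come from. The row for $x^{(3)}=[0,0]^T$ is all zeros, so $\lambda_3$ only enters through the linear term.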
Since $\lambda_1+\lambda_2 = \lambda_3+\lambda_4$, we can substitute $\lambda_3 = \lambda_1+\lambda_2 - \lambda_4$:
$$
\begin{aligned}
\min_\lambda\ \mathcal{J}(\lambda) ={}& \lambda_1^2+4\lambda_2^2+\frac{1}{2}\lambda_4^2+4\lambda_1\lambda_2+\lambda_1\lambda_4+2\lambda_2\lambda_4 - 2\lambda_1-2\lambda_2\\
\text{s.t.}\quad & \lambda_1 \geqslant 0,\ \lambda_2\geqslant 0,\ \lambda_4\geqslant 0,\ \lambda_1+\lambda_2-\lambda_4\geqslant 0
\end{aligned}
$$

Setting the partial derivatives to zero gives:

$$
\left\{\begin{matrix}
\frac{\partial \mathcal{J}}{\partial \lambda_1} = 2\lambda_1 +4\lambda_2+\lambda_4-2=0 \\
\frac{\partial \mathcal{J}}{\partial \lambda_2} = 4\lambda_1 +8\lambda_2+2\lambda_4-2=0 \\
\frac{\partial \mathcal{J}}{\partial \lambda_4} = \lambda_1 +2\lambda_2+\lambda_4=0
\end{matrix}\right.
$$
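The stationarity system above has no solution: the first two equations require $2\lambda_1+4\lambda_2+\lambda_4$ to equal both $2$ and $1$. The minimum therefore lies on the boundary of the feasible region: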
Similarly, we obtain:
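As a rough numerical cross-check of the boundary analysis, the sketch below runs a coarse grid search over the reduced objective in $(\lambda_1,\lambda_2,\lambda_4)$, keeping only points with $\lambda_3=\lambda_1+\lambda_2-\lambda_4\geqslant 0$. The grid range $[0,2]$ and spacing $0.25$ are arbitrary assumptions, and a grid search is only a sanity check, not a proper QP solver. The recovery of $w$ and $b$ uses the standard relations $w=\sum_i\lambda_i y^{(i)}x^{(i)}$ and $y^{(s)}(w^Tx^{(s)}+b)=1$ for a sample with $\lambda_s>0$.

```python
# Samples and labels from the problem statement.
X = [[1, 1], [2, 2], [0, 0], [-1, 0]]
y = [1, 1, -1, -1]


def reduced_objective(l1, l2, l4):
    """Dual objective after eliminating lambda_3 = l1 + l2 - l4."""
    return (l1**2 + 4 * l2**2 + 0.5 * l4**2
            + 4 * l1 * l2 + l1 * l4 + 2 * l2 * l4
            - 2 * l1 - 2 * l2)


STEP = 0.25                          # assumed grid spacing
GRID = [k * STEP for k in range(9)]  # 0.0, 0.25, ..., 2.0 (assumed range)

best = None
for l1 in GRID:
    for l2 in GRID:
        for l4 in GRID:
            l3 = l1 + l2 - l4
            if l3 < 0:               # infeasible: violates lambda_3 >= 0
                continue
            value = reduced_objective(l1, l2, l4)
            if best is None or value < best[0]:
                best = (value, (l1, l2, l3, l4))

value, lam = best

# w = sum_i lambda_i * y_i * x_i
w = [sum(lam[i] * y[i] * X[i][d] for i in range(4)) for d in range(2)]

# b from a sample with lambda_s > 0, using y_s * (w . x_s + b) = 1
s = max(range(4), key=lambda i: lam[i])
b = y[s] - sum(w[d] * X[s][d] for d in range(2))

print("J =", value, "lambda =", lam)
print("w =", w, "b =", b)
```

On this grid the search lands on $\lambda=(1,0,1,0)$ with $\mathcal{J}=-1$, a point with $\lambda_2=\lambda_4=0$, which is consistent with the minimum lying on the boundary. The corresponding parameters are $w=[1,1]^T$ and $b=-1$, i.e. the hyperplane $x_1+x_2-1=0$.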