$$
\begin{aligned}
\min \quad & \frac{1}{2} {\parallel w \parallel}^2 \\
\text{s.t.} \quad & y_i(w \cdot x_i + b) \ge 1, \quad i=1,2,\ldots,5
\end{aligned}
$$

Substituting $(x_1,y_1),\ldots,(x_5,y_5)$ gives

$$
\begin{aligned}
\min \quad & \frac{1}{2} (w_1^2 + w_2^2) \\
\text{s.t.} \quad & w_1 + 2 w_2 + b \ge 1 \quad ① \\
& 2 w_1 + 3 w_2 + b \ge 1 \quad ② \\
& 3 w_1 + 3 w_2 + b \ge 1 \quad ③ \\
& -2 w_1 - w_2 - b \ge 1 \quad ④ \\
& -3 w_1 - 2 w_2 - b \ge 1 \quad ⑤
\end{aligned}
$$
Adding the constraints pairwise eliminates $b$:

$$
\begin{aligned}
① + ④ &: \; -w_1 + w_2 \ge 2 \\
① + ⑤ &: \; -2 w_1 \ge 2 \\
② + ④ &: \; 2 w_2 \ge 2 \\
② + ⑤ &: \; -w_1 + w_2 \ge 2 \\
③ + ④ &: \; w_1 + 2 w_2 \ge 2 \\
③ + ⑤ &: \; w_2 \ge 2
\end{aligned}
$$

It follows that

$$
-w_1 + w_2 \ge 2, \qquad w_1 \le -1, \qquad w_2 \ge 2
$$

To minimize $w_1^2 + w_2^2$, set $w_1 = -1, w_2 = 2$; this choice satisfies every inequality above.

Substituting into ①–⑤ gives $b=-2$ (① and ③ require $b \ge -2$, while ⑤ requires $b \le -2$).
The constraints for $x_1$, $x_3$ and $x_5$ hold with equality:

$$
y_1(w \cdot x_1 + b) = y_3(w \cdot x_3 + b) = y_5(w \cdot x_5 + b) = 1
$$

The support vectors are $x_1=(1,2)^T$, $x_3=(3,3)^T$, $x_5=(3,2)^T$.

The maximum-margin separating hyperplane is $-x_1 + 2x_2 - 2 = 0$, where $x_1$ and $x_2$ here denote the two coordinates of a point $x$.

The classification decision function is $f(x)=\operatorname{sign}(-x_1+2x_2-2)$.
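The result can be checked numerically. The following is a minimal sketch in plain Python, assuming the five training points read off from constraints ①–⑤ (positive: $(1,2)$, $(2,3)$, $(3,3)$; negative: $(2,1)$, $(3,2)$). The helper names `functional_margin` and `decide` are illustrative only.

```python
# Training points implied by constraints 1-5: three positive, two negative.
X = [(1, 2), (2, 3), (3, 3), (2, 1), (3, 2)]
y = [1, 1, 1, -1, -1]

# Solution derived above.
w = (-1.0, 2.0)
b = -2.0


def functional_margin(x, label, w, b):
    """Return y * (w . x + b) for a single sample."""
    return label * (w[0] * x[0] + w[1] * x[1] + b)


def decide(x):
    """Decision function f(x) = sign(-x1 + 2*x2 - 2)."""
    s = w[0] * x[0] + w[1] * x[1] + b
    return 1 if s > 0 else (-1 if s < 0 else 0)


margins = [functional_margin(x, label, w, b) for x, label in zip(X, y)]

# 1-based indices of the samples whose constraint is active (margin == 1).
support = [i + 1 for i, m in enumerate(margins) if abs(m - 1.0) < 1e-9]

print(margins)  # every value must be >= 1 for feasibility
print(support)  # indices of the support vectors
```

Every functional margin is at least 1, and the margin equals 1 exactly for samples 1, 3 and 5, which agrees with the support vectors listed above.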
$$
\begin{aligned}
\min_{w,b,\xi} \quad & \frac{1}{2} {\parallel w \parallel}^2 + C \sum_{i=1}^N \xi_i^2 \\
\text{s.t.} \quad & y_i(w \cdot x_i + b) \ge 1 - \xi_i, \quad i=1,2,\ldots,N \\
& \xi_i \ge 0, \quad i=1,2,\ldots,N
\end{aligned}
$$

The corresponding Lagrangian is

$$
L(w,b,\xi,\alpha,\gamma) = \frac{1}{2} {\parallel w \parallel}^2 + C \sum_{i=1}^N \xi_i^2 + \sum_{i=1}^N \alpha_i \left( 1 - \xi_i - y_i(w \cdot x_i + b) \right) - \sum_{i=1}^N \gamma_i \xi_i
$$
Applying the KKT conditions gives

$$
\begin{aligned}
\frac{\partial L}{\partial w} &= w - \sum_{i=1}^N \alpha_i y_i x_i = 0 \\
\frac{\partial L}{\partial b} &= -\sum_{i=1}^N \alpha_i y_i = 0 \\
\frac{\partial L}{\partial \xi_i} &= 2C \xi_i - \alpha_i - \gamma_i = 0
\end{aligned}
$$

Therefore

$$
\begin{aligned}
w &= \sum_{i=1}^N \alpha_i y_i x_i \\
\sum_{i=1}^N \alpha_i y_i &= 0 \\
2C \xi_i &= \alpha_i + \gamma_i
\end{aligned}
$$
Substituting these into the Lagrangian gives

$$
\begin{aligned}
\min_{w,b,\xi} L(w,b,\xi,\alpha,\gamma)
&= \frac{1}{2} {\parallel w \parallel}^2 + C \sum_{i=1}^N \xi_i^2 + \sum_{i=1}^N \alpha_i - \sum_{i=1}^N (\alpha_i+\gamma_i) \xi_i - \sum_{i=1}^N \alpha_i y_i \, w \cdot x_i - \sum_{i=1}^N \alpha_i y_i b \\
&= -\frac{1}{2} \sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^N \alpha_i - \frac{1}{2} \sum_{i=1}^N (\alpha_i+\gamma_i) \xi_i \\
&= -\frac{1}{2} \sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^N \alpha_i - \frac{1}{2} \sum_{i=1}^N (\alpha_i+\gamma_i) \frac{\alpha_i+\gamma_i}{2C} \\
&= -\frac{1}{2} \sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^N \alpha_i - \frac{1}{4C} \sum_{i=1}^N (\alpha_i+\gamma_i)^2
\end{aligned}
$$
The dual problem is

$$
\begin{aligned}
\max_{\alpha,\gamma} \quad & W(\alpha,\gamma) = -\frac{1}{2} \sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^N \alpha_i - \frac{1}{4C} \sum_{i=1}^N (\alpha_i+\gamma_i)^2 \\
\text{s.t.} \quad & \sum_{i=1}^N \alpha_i y_i = 0 \\
& \alpha_i \ge 0, \; \gamma_i \ge 0, \quad i=1,2,\ldots,N
\end{aligned}
$$
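The substitution step can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it reuses the five points from the first exercise as sample data, picks an arbitrary $C$ and $b$, draws random multipliers satisfying $\sum_i \alpha_i y_i = 0$, and compares the Lagrangian evaluated at the stationary point with the closed-form dual objective. The function names `lagrangian` and `dual_objective` are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * c for a, c in zip(u, v))


def lagrangian(w, b, xi, alpha, gamma, X, y, C):
    """L(w, b, xi, alpha, gamma) for the squared-slack soft-margin problem."""
    value = 0.5 * dot(w, w) + C * sum(s * s for s in xi)
    for a, g, s, x, label in zip(alpha, gamma, xi, X, y):
        value += a * (1 - s - label * (dot(w, x) + b)) - g * s
    return value


def dual_objective(alpha, gamma, X, y, C):
    """Closed form obtained after eliminating w, b and xi."""
    n = len(X)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))
    slack = sum((a + g) ** 2 for a, g in zip(alpha, gamma)) / (4 * C)
    return -0.5 * quad + sum(alpha) - slack


random.seed(0)
X = [(1, 2), (2, 3), (3, 3), (2, 1), (3, 2)]  # sample data (assumption)
y = [1, 1, 1, -1, -1]
C = 2.0  # arbitrary penalty parameter (assumption)

# Random alpha >= 0 with sum(alpha_i * y_i) = 0: the two negative samples
# share the total weight of the three positive samples.
pos = [random.random() for _ in range(3)]
t = random.random()
alpha = pos + [t * sum(pos), (1 - t) * sum(pos)]
gamma = [random.random() for _ in range(5)]

# Stationarity conditions: w = sum alpha_i y_i x_i, xi_i = (alpha_i + gamma_i) / (2C).
w = [sum(a * label * x[k] for a, label, x in zip(alpha, y, X)) for k in range(2)]
xi = [(a + g) / (2 * C) for a, g in zip(alpha, gamma)]
b = 0.7  # any value works: b drops out because sum(alpha_i * y_i) = 0

primal_side = lagrangian(w, b, xi, alpha, gamma, X, y, C)
dual_side = dual_objective(alpha, gamma, X, y, C)
print(abs(primal_side - dual_side) < 1e-9)
```

Since $\alpha_i \ge 0$ and $\gamma_i \ge 0$, the term $-\frac{1}{4C}\sum_i(\alpha_i+\gamma_i)^2$ can only decrease as $\gamma_i$ grows, so for fixed $\alpha$ the objective is largest at $\gamma_i = 0$.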
Proceed by induction on $p$.

When $p=1$, $K(x,z) = x \cdot z$, so $\phi(x) = x$.

Suppose that for $p=k$, $K(x,z) = (x \cdot z)^k = \phi_k(x) \cdot \phi_k(z)$.

Then for $p=k+1$, $K(x,z) = (x \cdot z)^{k+1} = (x \cdot z)^{k} (x \cdot z) = \left(\phi_k(x) \cdot \phi_k(z)\right) (x \cdot z)$.
Without loss of generality, let $\phi_k(x) = (f_1(x), f_2(x), \ldots, f_m(x))^T$ and $x = (x_1, x_2, \ldots, x_n)^T$.

Then

$$
\begin{aligned}
K(x,z) &= \left(f_1(x)f_1(z) + f_2(x)f_2(z) + \ldots + f_m(x)f_m(z)\right)(x_1 z_1 + x_2 z_2 + \ldots + x_n z_n) \\
&= f_1(x)f_1(z)(x_1 z_1 + x_2 z_2 + \ldots + x_n z_n) + f_2(x)f_2(z)(x_1 z_1 + x_2 z_2 + \ldots + x_n z_n) + \ldots + f_m(x)f_m(z)(x_1 z_1 + x_2 z_2 + \ldots + x_n z_n) \\
&= (f_1(x)x_1)(f_1(z)z_1) + (f_1(x)x_2)(f_1(z)z_2) + \ldots + (f_1(x)x_n)(f_1(z)z_n) + (f_2(x)x_1)(f_2(z)z_1) + \ldots \\
&\quad + (f_2(x)x_n)(f_2(z)z_n) + \ldots + (f_m(x)x_1)(f_m(z)z_1) + \ldots + (f_m(x)x_n)(f_m(z)z_n) \\
&:= \phi_{k+1}(x) \cdot \phi_{k+1}(z)
\end{aligned}
$$

where $\phi_{k+1}(x) = (f_1(x)x_1, f_1(x)x_2, \ldots, f_1(x)x_n, f_2(x)x_1, \ldots, f_2(x)x_n, \ldots, f_m(x)x_1, \ldots, f_m(x)x_n)^T$.

Since $K(x,z) = (x \cdot z)^p$ can be written as an inner product of feature maps for every positive integer $p$, it is a positive definite kernel.
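The recursive construction of $\phi_{k+1}$ from $\phi_k$ translates directly into code. The following is a small sketch in plain Python; the function name `phi` and the test vectors are illustrative choices. It builds the feature map by repeatedly multiplying every component $f_j(x)$ by every coordinate $x_i$, in the same order as in the definition above, and checks that $\phi_p(x) \cdot \phi_p(z) = (x \cdot z)^p$.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * c for a, c in zip(u, v))


def phi(x, p):
    """Feature map for K(x, z) = (x . z)**p, built by the induction step.

    Starts from phi_1(x) = x and applies
    phi_{k+1}(x) = (f_1*x_1, ..., f_1*x_n, ..., f_m*x_1, ..., f_m*x_n).
    """
    features = list(x)
    for _ in range(p - 1):
        features = [f * xi for f in features for xi in x]
    return features


# Integer test vectors keep the comparison exact.
x = (1, 2, 3)
z = (4, -1, 2)

for p in range(1, 5):
    kernel_value = dot(x, z) ** p
    feature_value = dot(phi(x, p), phi(z, p))
    print(p, len(phi(x, p)), kernel_value == feature_value)
```

The dimension of $\phi_p$ grows as $n^p$, which is one reason the kernel is normally evaluated as $(x \cdot z)^p$ directly rather than through the explicit feature map.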