Perceptron: Algorithm (Dual Form)

Algorithm

Input: a linearly separable data set $T=\{(x_1,y_1),(x_2,y_2),\cdots,(x_N,y_N)\}$, where $x_i\in\mathcal{X}=\mathbf{R}^n$, $y_i\in\mathcal{Y}=\{-1,+1\}$, $i=1,2,\cdots,N$; learning rate $\eta$ $(0<\eta\le 1)$.
Output: $\alpha, b$; the perceptron model $f(x)=\operatorname{sign}\left(\sum_{j=1}^N\alpha_j y_j x_j\cdot x+b\right)$,
where $\alpha=(\alpha_1,\alpha_2,\cdots,\alpha_N)^T$.
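
For context, here is a short sketch of where this parameterization comes from. It assumes the primal-form updates $w\gets w+\eta y_i x_i$, $b\gets b+\eta y_i$ starting from $w=0$, $b=0$, and writes $n_i$ (a symbol introduced here for illustration) for the number of times point $i$ has triggered an update:

$$
w=\sum_{i=1}^N n_i\eta\, y_i x_i=\sum_{i=1}^N\alpha_i y_i x_i,\qquad
b=\sum_{i=1}^N n_i\eta\, y_i=\sum_{i=1}^N\alpha_i y_i,\qquad
\alpha_i=n_i\eta .
$$

Substituting this $w$ into $\operatorname{sign}(w\cdot x+b)$ gives the model above, so learning $\alpha$ and $b$ is equivalent to learning $w$ and $b$.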

  • Procedure:
    (1) $\alpha\gets 0,\quad b\gets 0$
    (2) Select a data point $(x_i,y_i)$ from the training set.
    (3) If $y_i(w\cdot x_i+b)=y_i\left(\sum_{j=1}^N\alpha_j y_j x_j\cdot x_i+b\right)\le 0$, update
    $\alpha_i\gets\alpha_i+\eta,\quad b\gets b+\eta y_i$
    (4) Go to (2) until no data point is misclassified.
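
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function names, the rule "always update on the first misclassified point in index order", and the `max_updates` safety cap are all assumptions made here, since the procedure does not say how the point in step (2) is chosen.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def train_dual_perceptron(xs, ys, eta=1.0, max_updates=1000):
    """Dual-form perceptron; returns (alpha, b).

    Assumption: scan the points in index order, update on the first
    misclassified one, then restart the scan. max_updates is a safety cap
    (not part of the algorithm) for data that is not linearly separable.
    """
    n = len(xs)
    alpha = [0.0] * n          # step (1): alpha <- 0
    b = 0.0                    # step (1): b <- 0
    for _ in range(max_updates):
        for i in range(n):     # step (2): pick a point
            # w . x_i written through alpha: sum_j alpha_j y_j (x_j . x_i)
            wx = sum(alpha[j] * ys[j] * dot(xs[j], xs[i]) for j in range(n))
            if ys[i] * (wx + b) <= 0:   # step (3): misclassified
                alpha[i] += eta
                b += eta * ys[i]
                break                    # step (4): go back to step (2)
        else:
            # a full pass found no misclassified point: done
            return alpha, b
    raise RuntimeError("no separating hyperplane found within max_updates")
```

Note that only `alpha[i]` and `b` change in an update; the weight vector is never stored explicitly.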


  • In the dual form, training instances appear only through inner products. For convenience, the inner products between all pairs of training instances can be computed in advance and stored as a matrix, the so-called Gram matrix:
    $$
    G=[x_i\cdot x_j]_{N\times N}=
    \begin{bmatrix}
    x_1\cdot x_1 & x_1\cdot x_2 & \cdots & x_1\cdot x_N \\
    x_2\cdot x_1 & x_2\cdot x_2 & \cdots & x_2\cdot x_N \\
    \vdots & \vdots & \ddots & \vdots \\
    x_N\cdot x_1 & x_N\cdot x_2 & \cdots & x_N\cdot x_N
    \end{bmatrix}
    $$
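
As a small sketch of this idea, the Gram matrix can be built once and the misclassification test then reduces to table lookups. The helper names `gram_matrix` and `decision_value` are hypothetical, chosen here for illustration.

```python
def gram_matrix(xs):
    """Return G with G[i][j] = x_i . x_j for a list of equal-length vectors."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in xs] for xi in xs]


def decision_value(G, alpha, ys, b, i):
    """y_i * (sum_j alpha_j y_j G[j][i] + b), using only Gram lookups."""
    s = sum(alpha[j] * ys[j] * G[j][i] for j in range(len(ys)))
    return ys[i] * (s + b)


G = gram_matrix([(1, 2), (3, 4)])
print(G)  # [[5, 11], [11, 25]]
```

Point $i$ counts as misclassified when `decision_value(G, alpha, ys, b, i) <= 0`. Because $G$ is symmetric, `G[j][i]` and `G[i][j]` are interchangeable.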

Worked example

Example 2.2: The data are the same as in Example 2.1: the positive instances are $x_1=(3,3)^T$ and $x_2=(4,3)^T$, and the negative instance is $x_3=(1,1)^T$. Use the dual form of the perceptron learning algorithm to find the perceptron model.


  • (1) Set $\alpha_i=0,\ i=1,2,3$, $b=0$, $\eta=1$.
    (2) Compute the Gram matrix:
    $$
    G=[x_i\cdot x_j]_{3\times 3}=
    \begin{bmatrix}
    x_1\cdot x_1 & x_1\cdot x_2 & x_1\cdot x_3 \\
    x_2\cdot x_1 & x_2\cdot x_2 & x_2\cdot x_3 \\
    x_3\cdot x_1 & x_3\cdot x_2 & x_3\cdot x_3
    \end{bmatrix}
    =
    \begin{bmatrix}
    3\times3+3\times3 & 3\times4+3\times3 & 3\times1+3\times1 \\
    4\times3+3\times3 & 4\times4+3\times3 & 4\times1+3\times1 \\
    1\times3+1\times3 & 1\times4+1\times3 & 1\times1+1\times1
    \end{bmatrix}
    =
    \begin{bmatrix}
    18 & 21 & 6 \\
    21 & 25 & 7 \\
    6 & 7 & 2
    \end{bmatrix}
    $$
    (3) Misclassification condition: $y_i(w\cdot x_i+b)=y_i\left(\sum_{j=1}^N\alpha_j y_j x_j\cdot x_i+b\right)\le 0$
    Parameter update: $\alpha_i\gets\alpha_i+1,\quad b\gets b+y_i$
    (4) Iterate. The intermediate calculations are omitted; the results are listed in the table.
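| $k$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| misclassified point | | $x_1$ | $x_3$ | $x_3$ | $x_3$ | $x_1$ | $x_3$ | $x_3$ |
| $\alpha_1$ | 0 | 1 | 1 | 1 | 1 | 2 | 2 | 2 |
| $\alpha_2$ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| $\alpha_3$ | 0 | 0 | 1 | 2 | 3 | 3 | 4 | 5 |
| $b$ | 0 | 1 | 0 | -1 | -2 | -1 | -2 | -3 |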
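
The iteration in the table can be replayed with a short script. This is a sketch under one assumption not stated in the example: at every step the first misclassified point in the order $x_1, x_2, x_3$ is the one updated. With that rule the script yields the same sequence of points and values as the table.

```python
xs = [(3, 3), (4, 3), (1, 1)]
ys = [1, 1, -1]
eta = 1

# Gram matrix of the three training points
G = [[sum(a * b for a, b in zip(xi, xj)) for xj in xs] for xi in xs]

alpha, b = [0, 0, 0], 0
trace = [(0, None, tuple(alpha), b)]  # rows of (k, updated point, alpha, b)

while True:
    # first misclassified point in index order, or None when all are correct
    miss = next(
        (i for i in range(3)
         if ys[i] * (sum(alpha[j] * ys[j] * G[j][i] for j in range(3)) + b) <= 0),
        None,
    )
    if miss is None:
        break
    alpha[miss] += eta
    b += eta * ys[miss]
    trace.append((len(trace), f"x{miss + 1}", tuple(alpha), b))

for row in trace:
    print(row)
# last row printed: (7, 'x3', (2, 0, 5), -3)
```

The loop has no iteration cap because this data set is linearly separable; for arbitrary data a cap would be needed.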

(5) $w=2x_1+0x_2-5x_3=(1,1)^T$
$b=-3$
Separating hyperplane:
$x^{(1)}+x^{(2)}-3=0$
Perceptron model:
$f(x)=\operatorname{sign}\left(x^{(1)}+x^{(2)}-3\right)$
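
As a quick check of step (5), the weight vector can be recovered from the final $\alpha$ and the resulting model applied to the three training points. This is a minimal sketch; the convention $\operatorname{sign}(0)=+1$ is an assumption made here and does not matter for these points.

```python
xs = [(3, 3), (4, 3), (1, 1)]
ys = [1, 1, -1]
alpha, b = [2, 0, 5], -3  # final values from the iteration table

# w = sum_i alpha_i * y_i * x_i, computed coordinate by coordinate
w = tuple(sum(a * y * x[d] for a, y, x in zip(alpha, ys, xs)) for d in range(2))


def f(x):
    """Perceptron model sign(w . x + b), with sign(0) taken as +1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


predictions = [f(x) for x in xs]
print(w, predictions)  # (1, 1) [1, 1, -1]
```

All three predictions agree with the labels, so the hyperplane separates the training set.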

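  • Note:
    As with the primal form, the iteration of the dual-form perceptron learning algorithm converges when the data are linearly separable, and multiple solutions exist.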
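
To illustrate the multiple solutions, the sketch below runs the same update rule on the data of Example 2.2 with two different scan orders. The `order` parameter, the function name `train_dual`, and the `max_updates` cap are assumptions introduced for this illustration.

```python
def train_dual(xs, ys, order, eta=1, max_updates=1000):
    """Dual perceptron that updates on the first misclassified index in `order`."""
    n = len(xs)
    G = [[sum(a * b for a, b in zip(xi, xj)) for xj in xs] for xi in xs]
    alpha, b = [0] * n, 0
    for _ in range(max_updates):
        for i in order:
            if ys[i] * (sum(alpha[j] * ys[j] * G[j][i] for j in range(n)) + b) <= 0:
                alpha[i] += eta
                b += eta * ys[i]
                break
        else:
            return alpha, b  # no misclassified point left
    raise RuntimeError("did not converge within max_updates")


xs = [(3, 3), (4, 3), (1, 1)]
ys = [1, 1, -1]

sol_a = train_dual(xs, ys, order=[0, 1, 2])
sol_b = train_dual(xs, ys, order=[2, 1, 0])
print(sol_a)  # ([2, 0, 5], -3)
print(sol_b)  # ([0, 1, 3], -2)
```

The first order gives $w=(1,1)^T$, $b=-3$ as in the example; the second gives $w=x_2-3x_3=(1,0)^T$, $b=-2$, that is, the hyperplane $x^{(1)}-2=0$. Both separate the training set.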
