Input: a linearly separable dataset $T=\{(x_1,y_1),(x_2,y_2),\cdots,(x_N,y_N)\}$, where $x_i\in\chi=\mathbf{R}^n$, $y_i\in Y=\{-1,+1\}$, $i=1,2,\cdots,N$; learning rate $\eta$ $(0<\eta\le 1)$.
Output: $\alpha, b$; the perceptron model $f(x)=\operatorname{sign}\left(\sum_{j=1}^N\alpha_j y_j x_j\cdot x+b\right)$,
where $\alpha=(\alpha_1,\alpha_2,\cdots,\alpha_N)^T$.
Procedure:
(1) $\alpha\gets 0$, $b\gets 0$;
(2) Select a data point $(x_i,y_i)$ from the training set;
(3) If $y_i(w\cdot x_i+b)=y_i\left(\sum_{j=1}^N\alpha_j y_j x_j\cdot x_i+b\right)\le 0$,
then update $\alpha_i\gets\alpha_i+\eta$, $b\gets b+\eta y_i$;
(4) Go to (2) and repeat until no data point is misclassified.
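The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the names `dot` and `train_dual_perceptron`, the `max_updates` safety cap, and the choice to always update on the first misclassified point in index order are assumptions of this sketch (the algorithm does not prescribe a selection order).

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_dual_perceptron(xs, ys, eta=1.0, max_updates=10_000):
    """Dual-form perceptron training (hypothetical helper).

    xs: list of feature vectors, ys: list of labels in {-1, +1}.
    Returns (alpha, b). Assumes the data are linearly separable;
    max_updates is a guard added here for non-separable input.
    """
    n = len(xs)
    alpha = [0.0] * n  # step (1): alpha <- 0
    b = 0.0            # step (1): b <- 0
    updates = 0
    while updates < max_updates:
        for i in range(n):  # step (2): pick a point (index order is an assumption)
            s = sum(alpha[j] * ys[j] * dot(xs[j], xs[i]) for j in range(n)) + b
            if ys[i] * s <= 0:        # step (3): misclassification condition
                alpha[i] += eta       # alpha_i <- alpha_i + eta
                b += eta * ys[i]      # b <- b + eta * y_i
                updates += 1
                break                 # step (4): go back to step (2)
        else:
            return alpha, b           # a full pass found no misclassified point
    raise RuntimeError("no separating hyperplane found within max_updates")
```

Each update touches only one coefficient $\alpha_i$ and the bias $b$, so $\alpha_i/\eta$ counts how many times point $i$ triggered an update.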
Note
In the dual form, training instances appear only through inner products. For convenience, the inner products between all pairs of training instances can be computed in advance and stored as a matrix, known as the Gram matrix:
$$
G=[x_i\cdot x_j]_{N\times N}=
\begin{bmatrix}
x_1\cdot x_1 & x_1\cdot x_2 & \cdots & x_1\cdot x_N \\
x_2\cdot x_1 & x_2\cdot x_2 & \cdots & x_2\cdot x_N \\
\vdots & \vdots & \ddots & \vdots \\
x_N\cdot x_1 & x_N\cdot x_2 & \cdots & x_N\cdot x_N
\end{bmatrix}
$$
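As a small illustration of the Gram matrix, the sketch below computes it in plain Python. The function name `gram_matrix` is an assumption of this sketch; the three sample points are the instances used in Example 2.2.

```python
def gram_matrix(points):
    """Return G with G[i][j] = x_i . x_j for a list of equal-length vectors."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in points]
            for xi in points]


# Sample points: x_1 = (3, 3), x_2 = (4, 3), x_3 = (1, 1)
G = gram_matrix([(3, 3), (4, 3), (1, 1)])
for row in G:
    print(row)
# prints:
# [18, 21, 6]
# [21, 25, 7]
# [6, 7, 2]
```

Because $x_i\cdot x_j=x_j\cdot x_i$, the matrix is symmetric, so an implementation could compute only the upper triangle if $N$ is large.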
Example 2.2: The data are the same as in Example 2.1: the positive instances are $x_1=(3,3)^T$ and $x_2=(4,3)^T$, and the negative instance is $x_3=(1,1)^T$. Use the dual form of the perceptron learning algorithm to find the perceptron model.
$k$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
misclassified point | | $x_1$ | $x_3$ | $x_3$ | $x_3$ | $x_1$ | $x_3$ | $x_3$ |
$\alpha_1$ | 0 | 1 | 1 | 1 | 1 | 2 | 2 | 2 |
$\alpha_2$ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
$\alpha_3$ | 0 | 0 | 1 | 2 | 3 | 3 | 4 | 5 |
$b$ | 0 | 1 | 0 | -1 | -2 | -1 | -2 | -3 |
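The iteration table can be reproduced with a short trace. The sketch below assumes a learning rate $\eta=1$ (the unit increments in the table are consistent with this) and always updates on the first misclassified point in index order; the function name `trace_dual_perceptron` and the `max_updates` cap are hypothetical.

```python
def trace_dual_perceptron(xs, ys, eta=1, max_updates=1000):
    """Run the dual-form updates and record (k, point index, alpha, b) per step."""
    n = len(xs)
    # Precompute the Gram matrix so each check is a table lookup.
    gram = [[sum(a * b for a, b in zip(xs[i], xs[j])) for j in range(n)]
            for i in range(n)]
    alpha, b = [0] * n, 0
    rows = [(0, None, tuple(alpha), b)]  # k = 0: initial state, no point chosen
    while len(rows) <= max_updates:
        for i in range(n):
            margin = ys[i] * (sum(alpha[j] * ys[j] * gram[j][i]
                                  for j in range(n)) + b)
            if margin <= 0:  # x_{i+1} is misclassified: update and rescan
                alpha[i] += eta
                b += eta * ys[i]
                rows.append((len(rows), i + 1, tuple(alpha), b))
                break
        else:
            return rows  # no misclassified point remains
    raise RuntimeError("did not converge within max_updates")


rows = trace_dual_perceptron([(3, 3), (4, 3), (1, 1)], [1, 1, -1])
for k, point, alpha, b in rows:
    label = "-" if point is None else f"x{point}"
    print(k, label, alpha, b)
```

Under these assumptions each printed row corresponds to one column of the table, ending with $\alpha=(2,0,5)$ and $b=-3$ at $k=7$.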
(5) $w=2x_1+0x_2-5x_3=(1,1)^T$,
$b=-3$.
Separating hyperplane:
$x^{(1)}+x^{(2)}-3=0$
Perceptron model:
$f(x)=\operatorname{sign}(x^{(1)}+x^{(2)}-3)$
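As a quick sanity check, the weight vector can be recovered from the dual coefficients via $w=\sum_i\alpha_i y_i x_i$ and the resulting model applied to the three training points. This is only a sketch: the variable names and the function `f` are illustrative, and treating $\operatorname{sign}(0)$ as $+1$ is an assumed convention.

```python
alpha = [2, 0, 5]                 # final dual coefficients from the example
b = -3                            # final bias
xs = [(3, 3), (4, 3), (1, 1)]     # x_1, x_2, x_3
ys = [1, 1, -1]                   # labels: two positive, one negative

# w = sum_i alpha_i * y_i * x_i, computed coordinate by coordinate
w = [sum(a * y * x[d] for a, y, x in zip(alpha, ys, xs)) for d in range(2)]


def f(x):
    """Perceptron decision function sign(w . x + b); sign(0) taken as +1 here."""
    s = sum(wd * xd for wd, xd in zip(w, x)) + b
    return 1 if s >= 0 else -1


print(w)                    # [1, 1]
print([f(x) for x in xs])   # [1, 1, -1]
```

All three training points fall on the correct side of $x^{(1)}+x^{(2)}-3=0$, which is why the iteration stops at this point.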