多分类支持向量机及其Python实现

1.多分类支持向量机Multiclass Support Vector Machine (SVM)

  • 其评分函数为: f(ω)=ωx+b f ( ω ) = ω x + b
  • 在此用到的都是支持向量机的原始形式,没有核方法也没有使用SMO算法。无约束原始优化问题。
  • 常用的多分类支持向量机有One-Vs-All(OVA),All-Vs-All,Structured SVM,其中OVA比较简单。
  • 多分类支持向量机对第i个类的损失函数是 Li=jyimax(0,sjsyiΔ) L i = ∑ j ≠ y i m a x ( 0 , s j − s y i − Δ ) , sj s j 不是比 syi s y i 还小 Δ Δ ,或者 sj s j syi s y i 还大,那就要对其评分进行惩罚。这样就是说,到最后,正确类别的分数要比错误分类的分数至少大 Δ Δ 阈值为0的max(0,-)函数也被称为合页损失函数(hingle loss)。
  • margin
  • L=1NiLidata loss+λR(W)regularization loss L = 1 N ∑ i L i ⏟ data loss + λ R ( W ) ⏟ regularization loss
  • data loss中的超参数 Δ Δ 和正则化前面的超参数 λ λ 看似无关,其实它们两个之间相互影响。使用小的间隔 Δ Δ 时,将会压缩 ω ω 的值在较小的区间上,正则化损失也就会变小,反之亦然。
  • 和二分类支持向量机之间的关系二分类支持向量机的损失函数 Li=Cmax(0,1yiwTxi)+R(W) L i = C max ( 0 , 1 − y i w T x i ) + R ( W ) ,C也是一个超参数, C1λ C ∝ 1 λ .

2.Python 实现

训练数据获取地址:https://raw.githubusercontent.com/lxrobot/General-source-code/master/SVM/data216x197.npy

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Created on Sun Jul 29 17:15:25 2018
@author: rd
"""
from __future__ import division
import numpy as np
"""
This dataset is part of MNIST dataset,but there is only 3 classes,
classes = {0:'0',1:'1',2:'2'},and images are compressed to 14*14 
pixels and stored in a matrix with the corresponding label, at the 
end the shape of the data matrix is 
num_of_images x 14*14(pixels)+1(lable)
"""
def load_data(split_ratio):
    tmp=np.load("data216x197.npy")
    data=tmp[:,:-1]
    label=tmp[:,-1]
    mean_data=np.mean(data,axis=0)
    train_data=data[int(split_ratio*data.shape[0]):]-mean_data
    train_label=label[int(split_ratio*data.shape[0]):]
    test_data=data[:int(split_ratio*data.shape[0])]-mean_data
    test_label=label[:int(split_ratio*data.shape[0])]
    return train_data,train_label,test_data,test_label

"""compute the hingle loss without using vector operation,
While dealing with a huge dataset,this will have low efficiency
X's shape [n,14*14+1],Y's shape [n,],W's shape [num_class,14*14+1]"""
def lossAndGradNaive(X,Y,W,reg):
    dW=np.zeros(W.shape)
    loss = 0.0
    num_class=W.shape[0]
    num_X=X.shape[0]
    for i in range(num_X):
        scores=np.dot(W,X[i])        
        cur_scores=scores[int(Y[i])]
        for j in range(num_class):
            if j==Y[i]:
                continue
            margin=scores[j]-cur_scores+1
            if margin>0:
                loss+=margin
                dW[j,:]+=X[i]
                dW[int(Y[i]),:]-=X[i]
    loss/=num_X
    dW/=num_X
    loss+=reg*np.sum(W*W)
    dW+=2*reg*W
    return loss,dW

def lossAndGradVector(X,Y,W,reg):
    dW=np.zeros(W.shape)
    N=X.shape[0]
    Y_=X.dot(W.T)
    margin=Y_-Y_[range(N),Y.astype(int)].reshape([-1,1])+1.0
    margin[range(N),Y.astype(int)]=0.0
    margin=(margin>0)*margin
    loss=0.0
    loss+=np.sum(margin)/N
    loss+=reg*np.sum(W*W)
    """For one data,the X[Y[i]] has to be substracted several times"""
    countsX=(margin>0).astype(int)
    countsX[range(N),Y.astype(int)]=-np.sum(countsX,axis=1)
    dW+=np.dot(countsX.T,X)/N+2*reg*W
    return loss,dW

def predict(X,W):
    X=np.hstack([X, np.ones((X.shape[0], 1))])
    Y_=np.dot(X,W.T)
    Y_pre=np.argmax(Y_,axis=1)
    return Y_pre

def accuracy(X,Y,W):
    Y_pre=predict(X,W)
    acc=(Y_pre==Y).mean()
    return acc

def model(X,Y,alpha,steps,reg):
    X=np.hstack([X, np.ones((X.shape[0], 1))])
    W = np.random.randn(3,X.shape[1]) * 0.0001
    for step in range(steps):
        loss,grad=lossAndGradNaive(X,Y,W,reg)
        W-=alpha*grad
        print"The {} step, loss={}, accuracy={}".format(step,loss,accuracy(X[:,:-1],Y,W))
    return W

train_data,train_label,test_data,test_label=load_data(0.2)
W=model(train_data,train_label,0.0001,25,0.5)
print"Test accuracy of the model is {}".format(accuracy(test_data,test_label,W))

>>>python linearSVM.py
The 0 step, loss=2.13019790746, accuracy=0.913294797688
The 1 step, loss=0.542978231611, accuracy=0.924855491329
The 2 step, loss=0.414223869196, accuracy=0.936416184971
The 3 step, loss=0.320119455456, accuracy=0.942196531792
The 4 step, loss=0.263666138421, accuracy=0.959537572254
The 5 step, loss=0.211882339467, accuracy=0.959537572254
The 6 step, loss=0.164835849395, accuracy=0.976878612717
The 7 step, loss=0.126102746496, accuracy=0.982658959538
The 8 step, loss=0.105088702516, accuracy=0.982658959538
The 9 step, loss=0.0885842087354, accuracy=0.988439306358
The 10 step, loss=0.075766608953, accuracy=0.988439306358
The 11 step, loss=0.0656638756563, accuracy=0.988439306358
The 12 step, loss=0.0601955544228, accuracy=0.988439306358
The 13 step, loss=0.0540709950467, accuracy=0.988439306358
The 14 step, loss=0.0472173965381, accuracy=0.988439306358
The 15 step, loss=0.0421049154242, accuracy=0.988439306358
The 16 step, loss=0.035277229437, accuracy=0.988439306358
The 17 step, loss=0.0292794459009, accuracy=0.988439306358
The 18 step, loss=0.0237297062768, accuracy=0.994219653179
The 19 step, loss=0.0173572330624, accuracy=0.994219653179
The 20 step, loss=0.0128973291068, accuracy=0.994219653179
The 21 step, loss=0.00907220930146, accuracy=1.0
The 22 step, loss=0.0072871437553, accuracy=1.0
The 23 step, loss=0.00389257560805, accuracy=1.0
The 24 step, loss=0.00347981408653, accuracy=1.0
Test accuracy of the model is 0.976744186047

你可能感兴趣的:(MachineLearning)