Machine Learning and Data Mining: Training an SVM with Linear and Gaussian Kernels

Machine Learning and Data Mining, Lab 6

(Training an SVM with a linear kernel and a Gaussian kernel)

Objective:

Understand the principles of support vector machines (SVMs) and how to apply them.
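To illustrate the principle before the main experiment: an SVM finds the maximum-margin separating boundary, and the training points lying on the margin become the support vectors. A minimal sketch on four hand-made 2-D points (toy data, not the watermelon set used below):

```python
import numpy as np
from sklearn import svm

# Four linearly separable 2-D points (toy data)
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 2.0], [2.0, 3.0]])
y = np.array([-1, -1, 1, 1])

clf = svm.SVC(kernel='linear', C=1000)  # large C approximates a hard margin
clf.fit(X, y)

print(clf.predict([[0.5, 0.5], [1.8, 2.5]]))  # one query point on each side
print(clf.support_vectors_)  # the margin-defining training points
```

Only the points closest to the boundary appear in `support_vectors_`; removing any other training point would not change the learned boundary.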

Environment:

Anaconda / Jupyter Notebook / PyCharm

Task:

Using sklearn, train an SVM on watermelon dataset 3.0a with a linear kernel and with a Gaussian (RBF) kernel, and compare the support vectors obtained by the two models.

Procedure:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import random

seed = 2020
np.random.seed(seed)  # NumPy RNG
random.seed(seed)     # Python random module

plt.rcParams['font.sans-serif'] = ['SimHei']  # render Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False    # render minus signs correctly
plt.close('all')

data_file_watermelon_3a = "./watermelon_3a.csv"
df = pd.read_csv(data_file_watermelon_3a, header=None)
df.columns = ['id', 'density', 'sugar_content', 'label']
df = df.set_index('id')  # assign back: set_index is not in-place by default
X = df[['density', 'sugar_content']].values
y = df['label'].values
# based on linear kernel as well as gaussian kernel
from sklearn import svm 
for fig_num, kernel in enumerate(('linear', 'rbf')): 
    # build and train the SVM model
    svc = svm.SVC(C=1000, kernel=kernel)
    svc.fit(X, y)
    # predict the label of a new sample X_test
    X_test = [[0.719, 0.103]]
    y_test_pred = svc.predict(X_test)
    # get support vectors
    sv = svc.support_vectors_
    ##### draw decision zone
    plt.figure(fig_num)
    plt.clf()
    
    # plot point and mark out support vectors
    plt.scatter(X[:, 0], X[:, 1], edgecolors='k', c=y, cmap=plt.cm.Paired, zorder=10)
    plt.scatter(sv[:, 0], sv[:, 1], edgecolors='k', facecolors='none', s=80, linewidths=2, zorder=10)
    # plot the decision boundary and decision zone into a color plot
    x_min, x_max = X[:, 0].min() - 0.2, X[:, 0].max() + 0.2
    y_min, y_max = X[:, 1].min() - 0.2, X[:, 1].max() + 0.2
    XX, YY = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))
    Z = svc.decision_function(np.c_[XX.ravel(), YY.ravel()]) 
    Z = Z.reshape(XX.shape)
    plt.pcolormesh(XX, YY, Z>0, cmap=plt.cm.Paired)
    plt.contour(XX, YY, Z, colors=['k', 'k', 'k'], linestyles=['--', '-', '--'], levels=[-.5, 0, .5])
    
    plt.title(kernel)
    plt.axis('tight')
plt.show()
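Beyond reading the plots, the difference in support vectors can be quantified by printing how many each kernel retains. A minimal sketch on synthetic two-cluster data (the watermelon CSV is not assumed to be available here):

```python
import numpy as np
from sklearn import svm

rng = np.random.RandomState(2020)
# Two overlapping Gaussian clusters as a stand-in for the watermelon data
X = np.vstack([rng.randn(20, 2), rng.randn(20, 2) + [2, 2]])
y = np.array([0] * 20 + [1] * 20)

for kernel in ('linear', 'rbf'):
    clf = svm.SVC(C=1000, kernel=kernel).fit(X, y)
    print(kernel, clf.n_support_)  # support-vector count per class
```

With overlapping classes the RBF kernel typically keeps a different (often larger) set of support vectors than the linear kernel, since its boundary bends to follow the data.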

[Figure: decision regions for the linear and RBF kernels, with support vectors circled]
