1.5 Learning the KNN Algorithm: Implementing a KNN Classifier and Evaluating Its Classification Accuracy

Splitting the dataset into a training set and a test set (wrapped as a function)
import numpy as np

def train_test_split(X, y, test_ratio=0.2, seed=None):
    """Split X and y into X_train, X_test, y_train, y_test according to test_ratio."""
    assert X.shape[0] == y.shape[0], "the size of X must be equal to the size of y"
    assert 0.0 <= test_ratio <= 1.0, "test_ratio must be valid"

    # Compare against None so that seed=0 is also honored.
    if seed is not None:
        np.random.seed(seed)

    # Shuffle the row indices, then cut them into a test part and a training part.
    shuffled_index = np.random.permutation(len(X))
    test_size = int(len(X) * test_ratio)
    test_index = shuffled_index[:test_size]
    train_index = shuffled_index[test_size:]

    X_train = X[train_index]
    y_train = y[train_index]

    X_test = X[test_index]
    y_test = y[test_index]

    return X_train, X_test, y_train, y_test
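To see what the split does on concrete data, here is a minimal walk-through of the same steps on a small toy array. The data, the ratio, and the seed value are made up for illustration and are not from the original post.

```python
import numpy as np

# Hypothetical toy data: 10 samples with 2 features each, labels 0/1.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

test_ratio = 0.2
rng_seed = 666  # arbitrary seed chosen for this illustration

np.random.seed(rng_seed)
shuffled_index = np.random.permutation(len(X))  # random ordering of 0..9
test_size = int(len(X) * test_ratio)            # 2 samples go to the test set
test_index = shuffled_index[:test_size]
train_index = shuffled_index[test_size:]

# Indexing X and y with the same index array keeps samples and labels paired.
X_train, y_train = X[train_index], y[train_index]
X_test, y_test = X[test_index], y[test_index]

print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```

Because X and y are indexed with the same shuffled indices, every row keeps its own label after the split.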
Training and prediction (wrapped as a class)
import numpy as np
from math import sqrt
from collections import Counter

class KNNClassifier:
    def __init__(self, k):
        """Initialize the KNN classifier."""
        assert k >= 1, "k must be valid"
        self.k = k

        self._X_train = None
        self._y_train = None

    def fit(self, X_train, y_train):
        """Store the training data; KNN has no separate training computation."""
        assert self.k <= X_train.shape[0], "k must be valid"
        assert X_train.shape[0] == y_train.shape[0], "the size of X_train must be equal to the size of y_train"
        self._X_train = X_train
        self._y_train = y_train
        return self

    def predict(self, X_predict):
        """Given the dataset X_predict, return the vector of predicted labels."""
        assert self._X_train is not None and self._y_train is not None, "must fit before predict"
        assert X_predict.shape[1] == self._X_train.shape[1], "the feature number of X_predict must be equal to X_train"
        y_predict = [self._predict(x) for x in X_predict]
        return np.array(y_predict)

    def _predict(self, x):
        """Given a single sample x, return its predicted label."""
        assert self._X_train.shape[1] == x.shape[0], "the feature number of x must be equal to X_train"
        # Euclidean distance from x to every training sample.
        distance = [sqrt(np.sum((x_train - x) ** 2)) for x_train in self._X_train]
        nearest = np.argsort(distance)
        # Labels of the k nearest samples, then a majority vote.
        topK_y = [self._y_train[i] for i in nearest[:self.k]]
        votes = Counter(topK_y)
        return votes.most_common(1)[0][0]
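The distance-then-vote logic for a single sample can be sketched on its own as follows. This is a vectorized variant that uses NumPy broadcasting instead of the list comprehension above; the toy training points and the query point are hypothetical.

```python
import numpy as np
from collections import Counter

# Hypothetical toy training set: two well-separated clusters in 2-D.
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
x = np.array([1.1, 1.0])  # the point to classify
k = 3

# Euclidean distance from x to every training sample (broadcast over rows).
distances = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
nearest = np.argsort(distances)           # indices sorted from closest to farthest
top_k_labels = y_train[nearest[:k]]       # labels of the k closest samples
votes = Counter(top_k_labels.tolist())    # count votes per label
prediction = votes.most_common(1)[0][0]   # label with the most votes

print(prediction)  # 0
```

The three closest points all belong to the first cluster, so the vote is unanimous for label 0.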
Run results:

[Figure 1: run output]

[Figure 2: run output]

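By splitting the data into a training set and a test set, we can test the model and judge how well it performs by comparing its predictions with the true labels.
y_predict holds the predicted values for the test set, and y_test holds the true values.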

Checking classification accuracy: the fraction of samples classified correctly
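As a small worked example, accuracy is the number of matching entries in y_predict and y_test divided by the number of test samples. The label arrays below are invented for illustration.

```python
import numpy as np

# Hypothetical true and predicted labels for a 10-sample test set.
y_test = np.array([0, 1, 1, 0, 2, 2, 1, 0, 2, 1])
y_predict = np.array([0, 1, 1, 0, 2, 1, 1, 0, 2, 2])

# The element-wise comparison yields a boolean array; summing it counts the matches.
correct = np.sum(y_predict == y_test)
accuracy = correct / len(y_test)

print(correct, accuracy)  # 8 0.8
```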


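Wrapped as methods of KNNClassifier
    # These two methods are added to the KNNClassifier class above.
    def accuracy_score(self, y_true, y_predict):
        """Compute the classification accuracy."""
        assert y_true.shape[0] == y_predict.shape[0], "the size of y_true must be equal to the size of y_predict"
        return np.sum(y_predict == y_true) / len(y_true)

    def score(self, X_test, y_true):
        """Predict on X_test and return the accuracy against y_true."""
        y_predict = self.predict(X_test)
        return self.accuracy_score(y_true, y_predict)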
Using the accuracy methods in sklearn

[Figure 3: sklearn accuracy example]
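A minimal sketch of the scikit-learn equivalent might look like the following. The iris dataset, k=3, and the random_state value are assumptions chosen for illustration; the original post's exact code is not preserved.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Assumption: iris stands in for whatever dataset the original post used.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=666)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
y_predict = knn.predict(X_test)

# Option 1: compare the predictions with the true labels explicitly.
acc = accuracy_score(y_test, y_predict)

# Option 2: let the classifier predict and score in a single call.
same = knn.score(X_test, y_test)

print(acc, same)
```

Both calls report the same number; score is convenient when the predictions themselves are not needed.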


