Python KNN MNIST Handwritten Digit Recognition

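This post uses the KNN algorithm to recognize handwritten digits, with MNIST as the dataset.
KNN (k-nearest neighbors) assigns a sample the label that occurs most often among the K training samples closest to it.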
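To make the voting rule concrete, here is a minimal sketch on made-up 2-D points. The points, the labels, and the helper name `majority_label` are hypothetical and are not part of the packaged code.

```python
from collections import Counter

import numpy as np


def majority_label(query, points, labels, k):
    """Return the most common label among the k points nearest to query."""
    # Euclidean distance from the query to every point
    distances = np.sqrt(((points - query) ** 2).sum(axis=1))
    # indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# toy data: two points near the origin labelled 0, three near (5, 5) labelled 1
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
labels = [0, 0, 1, 1, 1]

near_origin = majority_label(np.array([0.5, 0.5]), points, labels, k=3)
near_cluster = majority_label(np.array([5.5, 5.5]), points, labels, k=3)
print(near_origin, near_cluster)  # prints: 0 1
```

With k=3 the query near the origin still picks up one neighbor from the far cluster, but the two closer points outvote it.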
The core of the implementation:

import operator

from numpy import tile


def knn(k, train_images, train_labels, test_images, test_labels):
    errorCount = 0.0  # number of misclassified test samples
    m = test_images.shape[0]
    # m = 100  # uncomment to evaluate only the first 100 test images
    for i in range(m):
        # classify one test image with the k-nearest-neighbor classifier
        classifierResult = classify(test_images[i], train_images, train_labels, k)
        print("the classifier %d came back with: %d, the real answer is: %d" % (i + 1, classifierResult, test_labels[i]))
        if classifierResult != test_labels[i]:
            errorCount += 1.0
    print("\nthe total number of errors is: %d" % errorCount)
    print("\nthe total error rate is: %f" % (errorCount / float(m)))


def classify(testOne, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # difference between the test sample and every training sample
    diffMat = tile(testOne, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat ** 2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances ** 0.5  # Euclidean distance to each training sample
    sortedDistIndicies = distances.argsort()  # training indices ordered by increasing distance
    classCount = {}
    for i in range(k):  # majority vote among the k nearest training samples
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]
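
As a possible variation on the per-sample classifier, the sketch below computes the distances with NumPy broadcasting instead of `tile`. It is not the code from the download: the function name `classify_broadcast` and the toy arrays are hypothetical, and it assumes the labels are non-negative integers (digits 0-9 for MNIST). It casts to float before subtracting, because MNIST pixels are commonly loaded as `uint8`, and unsigned subtraction wraps around instead of going negative.

```python
import numpy as np


def classify_broadcast(test_one, data_set, labels, k):
    """k-NN vote for one sample, using broadcasting instead of tile."""
    # cast to float first: uint8 pixel arrays would wrap around on subtraction
    diff = data_set.astype(np.float64) - test_one.astype(np.float64)
    sq_distances = (diff ** 2).sum(axis=1)
    # the square root does not change the ordering, so it is skipped;
    # argpartition moves the k smallest distances to the front without a full sort
    nearest = np.argpartition(sq_distances, k - 1)[:k]
    # count votes per label (assumes non-negative integer labels)
    votes = np.bincount(np.asarray(labels)[nearest])
    return int(votes.argmax())


# made-up 4-pixel "images" stored as uint8
train = np.array(
    [[0, 0, 0, 0], [10, 10, 10, 10], [200, 200, 200, 200], [250, 250, 250, 250]],
    dtype=np.uint8,
)
train_labels = np.array([3, 3, 7, 7])

bright = np.array([240, 240, 240, 240], dtype=np.uint8)
result = classify_broadcast(bright, train, train_labels, k=1)
print(result)  # prints: 7
```

One behavioral difference to keep in mind: when two labels tie, `votes.argmax()` returns the smaller label, whereas the dictionary-based vote returns whichever tied label was counted first.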

The complete code, together with the dataset, is packaged at the link below. It earns me a few points; help yourself.
https://download.csdn.net/download/qq_35498696/12569928
