Deep Learning Notes: The KNN Algorithm

1. Background

[Figure 1: image not available]

[Figure 2: image not available]

For the sklearn library, see:

"A very detailed introduction to sklearn" (机器学习算法那些事's blog, CSDN)
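
As a quick illustration of the basic idea, here is a minimal sketch of KNN with a plain majority vote: measure the distance from the query point to every training point, keep the K nearest, and return the most common label among them. The function name `knn_predict`, the `(features, label)` data layout, and the toy points are all assumptions made for this sketch; they are not taken from the figures above.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; this layout is an
    assumption made for the sketch.
    """
    # Sort the training points by Euclidean distance to the query
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k nearest points and return the most common one
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two small clusters labelled "B" and "M"
toy_train = [
    ((1.0, 1.0), "B"), ((1.5, 2.0), "B"), ((2.0, 1.5), "B"),
    ((6.0, 6.0), "M"), ((6.5, 7.0), "M"), ((7.0, 6.5), "M"),
]

print(knn_predict(toy_train, (1.2, 1.4), k=3))  # B
print(knn_predict(toy_train, (6.4, 6.2), k=3))  # M
```

`math.dist` requires Python 3.8 or later.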

2. KNN in practice

Using the KNN algorithm to measure the accuracy of cancer detection for patients

import csv
import random

# Read the data (the CSV file is expected in the current working directory)
with open("Prostate_Cancer.csv", "r", newline="") as f:
    reader = csv.DictReader(f)
    datas = [row for row in reader]

# Shuffle the data, then split it: one third for testing, the rest for training
random.shuffle(datas)
n = len(datas)//3

test_data = datas[0:n]
train_data = datas[n:]
# print (train_data[0])
# print (train_data[0]["id"])


# Compute the Euclidean distance between two samples over the numeric features
def distance(x, y):
    res = 0
    for k in ("radius","texture","perimeter","area","smoothness","compactness","symmetry","fractal_dimension"):
        res += (float(x[k]) - float(y[k]))**2
    return res ** 0.5
    
# K=6
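def knn(data, K):
    # 1. Compute the distance to every training sample
    res = [
        {"result": train["diagnosis_result"], "distance": distance(data, train)}
        for train in train_data
    ]
    # 2. Sort by distance, nearest first (sorted() returns a new list, so keep it)
    res = sorted(res, key=lambda x: x["distance"])
    # print(res)
    # 3. Take the K nearest neighbours
    res2 = res[0:K]
    # 4. Weighted vote
    result = {"B": 0, "M": 0}
    # 4.1 Total distance of the K neighbours
    total = 0
    for r in res2:
        total += r["distance"]
    # 4.2 Accumulate the weights: a closer neighbour gets a larger weight
    for r in res2:
        if len(res2) == 1 or total == 0:
            # 1 - distance/total would be 0 (single neighbour) or undefined
            # (total of 0), so fall back to an unweighted vote
            weight = 1
        else:
            weight = 1 - r["distance"] / total
        result[r["result"]] += weight

    # 4.3 Return the class with the larger total weight
    if result["B"] > result["M"]:
        return "B"
    else:
        return "M"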

    
# print(distance(train_data[0],train_data[1]))
# Compare the predictions with the true labels and compute the accuracy for each K
for k in range(1, 11):
    correct = 0
    for test in test_data:
        result = test["diagnosis_result"]
        result2 = knn(test, k)
        if result == result2:
            correct += 1
    print("k={}: accuracy {:.2f}%".format(k, 100 * correct / len(test_data)))

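Output: [Figure 3: image not available]

In this run, accuracy was highest at K=6. Because the data is shuffled randomly before the split, the best K may differ from run to run.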
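
Since the script shuffles the data with `random.shuffle`, a single run gives only one sample of the accuracy for each K. One way to get a steadier comparison is to average the accuracy over several seeded shuffles. The following is a minimal sketch of that idea, not the method used above: `knn_predict`, `mean_accuracy`, and the toy data are hypothetical, and the vote is a plain majority vote rather than the weighted one in `knn`.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k):
    # Plain majority vote among the k nearest training points
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def mean_accuracy(samples, k, seeds):
    """Average test accuracy of KNN over one shuffled split per seed."""
    scores = []
    for seed in seeds:
        data = samples[:]                  # copy so the caller's list keeps its order
        random.Random(seed).shuffle(data)  # reproducible shuffle for this seed
        n = len(data) // 3                 # same one-third test split as the script
        test, train = data[:n], data[n:]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

# Hypothetical, well-separated toy data: 15 points per class
toy = [((i * 0.1, i * 0.1), "B") for i in range(15)] + \
      [((10 + i * 0.1, 10 + i * 0.1), "M") for i in range(15)]

for k in range(1, 6):
    acc = mean_accuracy(toy, k, seeds=range(5))
    print("k={}: mean accuracy {:.2f}%".format(k, 100 * acc))
```

On toy clusters this far apart every K scores the same, so the sketch only shows the procedure; the comparison becomes informative when the samples come from a real dataset such as the CSV file used above.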


The images above are from 梅科尔工作室 and are for study purposes only; please do not repost them without permission.
