CS231n Assignment Notes 1.2: Cross-Validation for KNN

About CS231n

See CS231n Course Notes 1: Introduction.
Note: text in italics marks the author's own thoughts, which have not been verified. Corrections are welcome.

Assignment notes

For the KNN implementation itself, see CS231n Assignment Notes 1.1: Vectorizing the KNN distance matrix (no loops).

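1. Splitting the training set

np.split is used to split the training set into folds. Note that the split must be even (the number of samples has to be divisible by the number of folds) and that it returns a Python list, which is why the result is wrapped in np.array: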
X_train_folds = np.array(np.split(X_train,5))
y_train_folds = np.array(np.split(y_train,5))
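
The two caveats about np.split can be checked on a toy array. This is only a small sketch: the array X and its shape are made up for illustration, and np.array_split is shown as one possible alternative when the sample count is not divisible by the number of folds (the original notes do not use it).

```python
import numpy as np

X = np.arange(20).reshape(10, 2)   # 10 toy samples with 2 features each

# 10 rows divide evenly into 5 folds, so np.split succeeds
folds = np.split(X, 5)
print(type(folds), len(folds), folds[0].shape)

# 10 rows do not divide evenly into 3 folds, so np.split raises ValueError
try:
    np.split(X, 3)
    split_failed = False
except ValueError:
    split_failed = True

# np.array_split accepts uneven divisions and returns folds of different sizes
uneven_sizes = [f.shape[0] for f in np.array_split(X, 3)]
print(split_failed, uneven_sizes)
```

With uneven folds the result can no longer be wrapped into a single regular np.array, so the fold selection would have to work on the list directly.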

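2. Cross-validation

In each round, one fold serves as the validation set and the remaining folds form the training set. The accuracy is measured on the held-out fold.
np.concatenate joins arrays along a given axis, for example [[1,2],[3,4]] -> [1,2,3,4]. It is used here to stitch the selected folds back together into one training set.
The training folds are selected with NumPy index-list indexing. Note that this kind of indexing only works on an np.array, not on a plain Python list.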
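
The two operations described above can be tried in isolation. The following is a minimal sketch on made-up toy data (three folds of two samples each); the variable names are illustrative only.

```python
import numpy as np

# Three folds of shape (2, 2), stacked into one array of shape (3, 2, 2)
fold_list = np.split(np.arange(12).reshape(6, 2), 3)
folds = np.array(fold_list)

i = 1                                        # index of the held-out fold
mask = [j for j in range(3) if j != i]       # [0, 2]

selected = folds[mask]                       # index-list indexing, shape (2, 2, 2)
train = np.concatenate(selected, axis=0)     # folds 0 and 2 joined, shape (4, 2)
print(train)

# The same index list applied to the plain Python list raises TypeError
try:
    fold_list[mask]
    list_indexing_failed = False
except TypeError:
    list_indexing_failed = True
```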
k_to_accuracies = {}
for k in k_choices:
    k_to_accuracies[k] = np.zeros(num_folds)
    for i in range(num_folds):
        # indices of every fold except the held-out fold i
        # (range(i) + range(i+1, num_folds) only works in Python 2)
        mask = [j for j in range(num_folds) if j != i]
        X_train_fold = np.concatenate(X_train_folds[mask], axis=0)
        y_train_fold = np.concatenate(y_train_folds[mask], axis=0)
        classifier.train(X_train_fold, y_train_fold)
        # predict on the held-out fold and record its accuracy
        dists = classifier.compute_distances_no_loops(X_train_folds[i])
        y_test_fold_pred = classifier.predict_labels(dists, k=k)
        num_correct = np.sum(y_test_fold_pred == y_train_folds[i])
        k_to_accuracies[k][i] = float(num_correct) / X_train_folds[i].shape[0]
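
To run the loop outside the assignment environment, it can be paired with a small stand-in classifier. The sketch below makes several assumptions: TinyKNN is a hypothetical replacement for the assignment's classifier that only mimics the three methods used above, the data are two synthetic Gaussian blobs rather than CIFAR-10, and picking the k with the highest mean accuracy is one common way to use the result, not something stated in the notes.

```python
import numpy as np

class TinyKNN:
    """Hypothetical stand-in for the assignment's KNN classifier."""

    def train(self, X, y):
        # KNN training only memorizes the data
        self.X_train, self.y_train = X, y

    def compute_distances_no_loops(self, X):
        # Euclidean distances via the expansion |a-b|^2 = |a|^2 - 2ab + |b|^2
        sq = (np.sum(X ** 2, axis=1)[:, None]
              - 2 * X @ self.X_train.T
              + np.sum(self.X_train ** 2, axis=1)[None, :])
        return np.sqrt(np.maximum(sq, 0))

    def predict_labels(self, dists, k=1):
        # majority vote among the k nearest training labels
        nearest = np.argsort(dists, axis=1)[:, :k]
        votes = self.y_train[nearest]
        return np.array([np.bincount(v).argmax() for v in votes])

# Synthetic data (assumption): two well-separated 2-D blobs, shuffled
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y_train = np.array([0] * 50 + [1] * 50)
perm = rng.permutation(100)
X_train, y_train = X_train[perm], y_train[perm]

num_folds = 5
k_choices = [1, 3, 5]
X_train_folds = np.array(np.split(X_train, num_folds))
y_train_folds = np.array(np.split(y_train, num_folds))

classifier = TinyKNN()
k_to_accuracies = {}
for k in k_choices:
    k_to_accuracies[k] = np.zeros(num_folds)
    for i in range(num_folds):
        mask = [j for j in range(num_folds) if j != i]
        classifier.train(np.concatenate(X_train_folds[mask], axis=0),
                         np.concatenate(y_train_folds[mask], axis=0))
        dists = classifier.compute_distances_no_loops(X_train_folds[i])
        pred = classifier.predict_labels(dists, k=k)
        k_to_accuracies[k][i] = np.mean(pred == y_train_folds[i])

# Average over folds and pick the k with the best mean accuracy
mean_acc = {k: acc.mean() for k, acc in k_to_accuracies.items()}
best_k = max(mean_acc, key=mean_acc.get)
print(mean_acc, best_k)
```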
