CS231n Spring 2019 Assignment 1—KNN

Related links

Over the summer I went through the CS231n course, somewhat roughly, and found it very rewarding; it is an especially effective introduction to computer vision within deep learning. Now I want to go over it again by writing blog posts, to find gaps and consolidate what I learned. Here are some official links for the course:

  • Course homepage: this page also links to the past offerings from 2015 to 2018
  • Course notes: links to the three assignments and the notes/tutorials for the three modules
  • Detailed syllabus: slides for the corresponding lecture videos, extended readings, and the notes as well

Naturally, a course this good also has Chinese translations and mirrors:

  • Translation of the 9 detailed course notes: note that these correspond to the Winter 2016 offering
  • Lecture videos: on Bilibili with Chinese and English subtitles; the NetEase Cloud Classroom version was still taken down as of the time this post was published
  • Assignment 1 code: my reference solutions, written for the Spring 2019 course

Assignment 1

Here is the assignment link and description: Assignment #1: Image Classification, kNN, SVM, Softmax, Neural Network. It asks you to complete five small problems in the form of IPython Notebooks (a tutorial on that tool is available here): knn.ipynb, svm.ipynb, softmax.ipynb, two_layer_net.ipynb and features.ipynb (if you lack the background and want a quick Python/NumPy tutorial, see here). The file names already suggest what each problem is about. To finish assignment 1 you mainly need to read these two sets of notes:

  • Image Classification: Data-driven Approach, k-Nearest Neighbor, train/val/test splits
  • Linear classification: Support Vector Machine, Softmax

KNN

Here is a bit of my own understanding of KNN. KNN is an extension of the Nearest Neighbor classifier: training only records the data, and at prediction time some metric is used to find the training image whose "distance" to the input image is smallest, and that image's label becomes the prediction. KNN instead finds the K images with the smallest "distance" and takes a majority vote over their labels as the result. The best value of K is chosen by cross-validation.
knn.ipynb asks you to implement three ways of computing the "distances", plus one prediction function:
The version with two loops is easy to follow and needs no further explanation:

def compute_distances_two_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a nested loop over both the training data and the
        test data.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data.

        Returns:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          is the Euclidean distance between the ith test point and the jth training
          point.
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                dists[i,j] = np.sqrt(np.sum((X[i,:] - self.X_train[j,:]) ** 2))
        return dists

The version with one loop is also easy to follow:

def compute_distances_one_loop(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a single loop over the test data.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            dists[i,:] = np.sqrt(np.sum((X[i,:] - self.X_train) ** 2, 1))
        return dists

The fully vectorized version is harder to come up with at first, but computing the L2 distance always boils down to the same formula; the only change is that it is now applied to whole matrices at once. Helper reference 1 and helper reference 2 both helped me understand it; judging from the hints in the code, helper reference 2 is closer to what the problem intends:
If it is still unclear, the figure below may help:

[Figure: derivation of the fully vectorized L2 distance formula]
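
Written out, the identity behind the figure (my own LaTeX rendering of the standard expansion, with $X$ the matrix of test points and $T$ = self.X_train the matrix of training points) is:

$$
\lVert x_i - t_j \rVert_2^2 = \sum_d (x_{id} - t_{jd})^2 = \lVert x_i \rVert_2^2 + \lVert t_j \rVert_2^2 - 2\, x_i \cdot t_j
$$

Collecting all pairs $(i, j)$ at once gives $\text{dists}^2 = r_1 + r_2 + r_3$ with $r_1[i, j] = \lVert x_i \rVert_2^2$, $r_2[i, j] = \lVert t_j \rVert_2^2$ and $r_3 = -2 X T^\top$, which are exactly the three arrays built in the code below.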

def compute_distances_no_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using no explicit loops.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        
        # ||x - t||^2 = ||x||^2 + ||t||^2 - 2 x.t, evaluated for every test/train pair at once
        r1 = (np.sum(np.square(X), axis=1) * (np.ones((num_train, 1)))).T  # r1[i, j] = ||X[i]||^2
        r2 = np.sum(np.square(self.X_train), axis=1) * (np.ones((num_test, 1)))  # r2[i, j] = ||X_train[j]||^2
        r3 = -2 * np.dot(X, self.X_train.T)  # cross terms, shape (num_test, num_train)
        dists = np.sqrt(r1 + r2 + r3)

        return dists
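
To convince yourself the vectorized formula is correct, you can compare it against a naive broadcast computation on small random matrices. This is a standalone sketch (the array names here are made up and not part of knn.ipynb):

import numpy as np

# Toy data: 5 "test" points and 7 "training" points in 12 dimensions
X_small = np.random.randn(5, 12)
T_small = np.random.randn(7, 12)

# Reference: explicit pairwise differences via broadcasting, shape (5, 7)
d_ref = np.sqrt(((X_small[:, None, :] - T_small[None, :, :]) ** 2).sum(axis=2))

# Vectorized version using ||x - t||^2 = ||x||^2 + ||t||^2 - 2 x.t
d_vec = np.sqrt((X_small ** 2).sum(axis=1)[:, None]
                + (T_small ** 2).sum(axis=1)[None, :]
                - 2 * X_small @ T_small.T)

print(np.allclose(d_ref, d_vec))  # expected: True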

The prediction function mainly relies on np.argsort and the key argument of the built-in max:

def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.

        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance betwen the ith test point and the jth training point.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in range(num_test):
            # A list of length k storing the labels of the k nearest neighbors to
            # the ith test point.
            closest_y = []
            #########################################################################
            # TODO:                                                                 #
            # Use the distance matrix to find the k nearest neighbors of the ith    #
            # testing point, and use self.y_train to find the labels of these       #
            # neighbors. Store these labels in closest_y.                           #
            # Hint: Look up the function numpy.argsort.                             #
            #########################################################################
            # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

            idx = np.argsort(dists[i,:])[0:k]
            closest_y = list(self.y_train[idx])

            # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
            #########################################################################
            # TODO:                                                                 #
            # Now that you have found the labels of the k nearest neighbors, you    #
            # need to find the most common label in the list closest_y of labels.   #
            # Store this label in y_pred[i]. Break ties by choosing the smaller     #
            # label.                                                                #
            #########################################################################
            # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

            # Sorting the candidate labels first makes max() break ties in favor of the smaller label
            y_pred[i] = max(sorted(set(closest_y)), key=closest_y.count)

            # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        return y_pred
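
As a tiny illustration of those two steps on made-up numbers (k = 3, five imaginary training points):

import numpy as np

dists_row = np.array([0.9, 0.1, 0.5, 0.3, 0.7])  # distances from one test point to 5 training points
y_train = np.array([2, 1, 1, 0, 2])              # labels of those training points

idx = np.argsort(dists_row)[:3]                  # indices of the 3 nearest neighbors -> [1, 3, 2]
closest_y = list(y_train[idx])                   # their labels -> [1, 0, 1]
print(max(sorted(set(closest_y)), key=closest_y.count))  # majority vote -> 1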

Here is the cross-validation code. It is the first version I wrote (I did not yet know some of the handier functions), so it is rather ugly, but it runs, so I will leave it here for now:

import copy
num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

X_train_folds = []
y_train_folds = []
################################################################################
# TODO:                                                                        #
# Split up the training data into folds. After splitting, X_train_folds and    #
# y_train_folds should each be lists of length num_folds, where                #
# y_train_folds[i] is the label vector for the points in X_train_folds[i].     #
# Hint: Look up the numpy array_split function.                                #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
print(X_train.shape)
print(y_train.shape)
X_train_folds = np.array_split(X_train, num_folds, axis = 0)
y_train_folds = np.array_split(y_train, num_folds, axis = 0)
print(len(X_train_folds))
print(len(y_train_folds))

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}


################################################################################
# TODO:                                                                        #
# Perform k-fold cross validation to find the best value of k. For each        #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times,   #
# where in each case you use all but one of the folds as training data and the #
# last fold as a validation set. Store the accuracies for all fold and all     #
# values of k in the k_to_accuracies dictionary.                               #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
for kk in k_choices:
    # X_train_folds and y_train_folds are now lists of arrays, one array per fold
    accuracy = []
    for fold in range(num_folds):
        X_temp = copy.deepcopy(X_train_folds)
        y_temp = copy.deepcopy(y_train_folds)
        
        X_val = X_temp.pop(fold)
        X_tra = np.concatenate(X_temp)
        y_val = y_temp.pop(fold)
        y_tra = np.concatenate(y_temp)
        
        classifier.train(X_tra, y_tra)
        y_val_pred = classifier.predict(X_val,k=kk,num_loops=0)
        
        num_correct = np.sum(y_val_pred == y_val)
        accuracy.append(float(num_correct) / X_val.shape[0])
    k_to_accuracies[kk] = accuracy

print(k_to_accuracies)

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))
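
For comparison, the deepcopy/pop dance can be avoided by concatenating every fold except the held-out one directly. Here is a shorter sketch of the same loop, reusing the variables defined above and intended to behave identically:

for kk in k_choices:
    accuracy = []
    for fold in range(num_folds):
        # All folds except `fold` form the training split; fold `fold` is the validation split
        X_tra = np.concatenate(X_train_folds[:fold] + X_train_folds[fold + 1:])
        y_tra = np.concatenate(y_train_folds[:fold] + y_train_folds[fold + 1:])
        classifier.train(X_tra, y_tra)
        y_val_pred = classifier.predict(X_train_folds[fold], k=kk, num_loops=0)
        accuracy.append(np.mean(y_val_pred == y_train_folds[fold]))
    k_to_accuracies[kk] = accuracy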

Results

Here is my cross-validation result plot; the hyperparameter k = 10 gave the best accuracy (judged by eye). The remaining outputs can be found directly in knn.ipynb.

[Figure: my cross-validation accuracy for each value of k]
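
Instead of judging by eye, the best k can also be read straight from the k_to_accuracies dictionary built above (a small sketch):

import numpy as np

mean_acc = {kk: np.mean(acc) for kk, acc in k_to_accuracies.items()}
best_k = max(mean_acc, key=mean_acc.get)
print(best_k, mean_acc[best_k])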

One last point: in my opinion, while doing the CS231n assignments you will sometimes run into material that is completely unfamiliar, things you may never have seen or studied before. In that case it is fine to look at other people's code and then come back and think it through yourself, but only after you have tried hard and still have no idea where to start; otherwise it will undercut your own sense of accomplishment.

Links

For the posts on the later assignments, see:

  • Next post: SVM/Softmax classifier
