  File "E:\assignment1\cs231n\data_utils.py", line 9, in load_CIFAR_batch
    datadict = pickle.load(f)

UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 6: ordinal not in range(128)

I found the fix in someone else's answer:

with open(filename, 'rb') as f:
    datadict = pickle.load(f,encoding='iso-8859-1')

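We need to tell pickle how to convert Python 2 bytestring data to Python 3 strings. The default is to try to decode all string data as ASCII, which fails on any byte above 127. 'iso-8859-1' (Latin-1) maps every byte value to a character, so decoding with it cannot fail.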
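The effect of the `encoding` argument can be reproduced without the CIFAR files. The sketch below is only an illustration: `PY2_PICKLE` is a hand-written protocol-2 pickle of a one-byte Python 2 `str`, and `try_load` is a hypothetical helper, neither of which comes from the assignment code.

```python
import pickle

# Hand-written protocol-2 pickle of the Python 2 str '\x8b' (illustrative data):
# \x80\x02 = protocol header, U\x01 = 1-byte short string, q\x00 = memo, . = stop
PY2_PICKLE = b'\x80\x02U\x01\x8bq\x00.'

def try_load(data, **kwargs):
    """Hypothetical helper: return the unpickled value, or the decode error."""
    try:
        return pickle.loads(data, **kwargs)
    except UnicodeDecodeError as exc:
        return exc

default_result = try_load(PY2_PICKLE)                    # default: decode as ASCII
latin1_result = try_load(PY2_PICKLE, encoding='latin1')  # same codec as 'iso-8859-1'
bytes_result = try_load(PY2_PICKLE, encoding='bytes')    # keep raw bytes instead

print(type(default_result).__name__)
print(repr(latin1_result), repr(bytes_result))
```

With `encoding='bytes'` the strings stay as `bytes` objects, so dictionary keys would have to be accessed as `b'data'` rather than `'data'`; Latin-1 avoids that at the cost of treating the bytes as text.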


  1. Choosing the best value of k by cross-validation

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

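# Split the training data into num_folds folds; np.array_split returns a list of arrays
X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)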
# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}

for k in k_choices:
    k_to_accuracies[k] = np.zeros(num_folds)
    for i in range(num_folds):
        Xtr = np.array(X_train_folds[:i] + X_train_folds[i+1:])  # hold out fold i of the training set as the validation set
        ytr = np.array(y_train_folds[:i] + y_train_folds[i+1:])
        Xte = np.array(X_train_folds[i])
        yte = np.array(y_train_folds[i])

        # Integer division (//): np.reshape rejects float dimensions in Python 3
        Xtr = np.reshape(Xtr, (X_train.shape[0] // 5 * 4, -1))
        ytr = np.reshape(ytr, (y_train.shape[0] // 5 * 4, -1))
        Xte = np.reshape(Xte, (X_train.shape[0] // 5, -1))
        yte = np.reshape(yte, (y_train.shape[0] // 5, -1))

        classifier.train(Xtr, ytr)
        yte_pred = classifier.predict(Xte, k)
        yte_pred = np.reshape(yte_pred, (yte_pred.shape[0], -1))
        num_correct = np.sum(yte_pred == yte)
        accuracy = float(num_correct) / len(yte)
        k_to_accuracies[k][i] = accuracy

# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print ('k = %d, accuracy = %f' % (k, accuracy))
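The same procedure can be sketched in a self-contained form that runs without the assignment's `KNearestNeighbor` class. Everything here is an assumption made for illustration: `knn_predict` and `cross_validate` are hypothetical helpers, and the data is synthetic (two well-separated clusters) rather than CIFAR-10. It uses `np.concatenate` to merge the training folds, which avoids the reshape step, and picks the k with the highest mean accuracy.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k):
    """Hypothetical minimal kNN: majority vote among the k nearest training points."""
    # Squared Euclidean distances, shape (num_test, num_train)
    d2 = (np.sum(X_test ** 2, axis=1, keepdims=True)
          - 2.0 * X_test @ X_train.T
          + np.sum(X_train ** 2, axis=1))
    nearest = np.argsort(d2, axis=1)[:, :k]   # indices of the k closest points
    votes = y_train[nearest]                  # labels of those points, (num_test, k)
    return np.array([np.bincount(row).argmax() for row in votes])

def cross_validate(X, y, k_choices, num_folds=5):
    """Return {k: [accuracy on each held-out fold]}."""
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    k_to_accuracies = {}
    for k in k_choices:
        accs = []
        for i in range(num_folds):
            # Train on every fold except fold i, validate on fold i
            X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
            y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
            pred = knn_predict(X_tr, y_tr, X_folds[i], k)
            accs.append(float(np.mean(pred == y_folds[i])))
        k_to_accuracies[k] = accs
    return k_to_accuracies

# Synthetic stand-in for the training set: two clusters, shuffled
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 5)),
               rng.normal(10.0, 1.0, size=(50, 5))])
y = np.array([0] * 50 + [1] * 50)
perm = rng.permutation(len(y))
X, y = X[perm], y[perm]

k_choices = [1, 3, 5]
k_to_accuracies = cross_validate(X, y, k_choices)
best_k = max(k_to_accuracies, key=lambda k: np.mean(k_to_accuracies[k]))
for k in sorted(k_to_accuracies):
    print('k = %d, mean accuracy = %f' % (k, np.mean(k_to_accuracies[k])))
```

Selecting `best_k` from the mean over folds, rather than reading it off the printed list by eye, makes the choice reproducible.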





# Based on the cross-validation results above, choose the best value for k,   
# retrain the classifier using all the training data, and test it on the test
# data. You should be able to get above 28% accuracy on the test data.
best_k = 10

classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
y_test_pred = classifier.predict(X_test, k=best_k)

# Compute and display the accuracy
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print ('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))


Got 144 / 500 correct => accuracy: 0.288000

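Without cross-validation, setting k=1 and k=5 by hand gives accuracies of 27% and 29% respectively. (I have not worked out why the k=10 chosen by cross-validation does worse on the test set than the hand-picked k=5; possibly because the training set used here is not the full training set. The gap is a single test image, 144 versus 145 out of 500, so it may also simply be noise.)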

Got 137 / 500 correct => accuracy: 0.274000 #k=1
Got 145 / 500 correct => accuracy: 0.290000 #k=5


  def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.

    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
      for j in range(num_train):
        # Euclidean distance between test point i and training point j
        dists[i, j] = np.sqrt(np.sum(np.square(X[i] - self.X_train[j])))
    return dists

  def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
      # Broadcast X[i] against every training row, then reduce over the columns
      dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
    return dists

  def compute_distances_no_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    # The vectorized computation goes here; it is worked out further below.
    return dists

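The first method, in principle the least time-efficient, uses two nested loops. It computes the difference between each test vector and each training vector, squares every element with np.square() (a dot product also works: np.dot(X[i] - self.X_train[j], X[i] - self.X_train[j])), sums the squares with np.sum(), and finally takes the square root to get the Euclidean distance between that test vector and that training vector. The full traversal needs 500*5000 distance computations, that is, the number of rows in the test set times the number of rows in the training set.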


Two loop version took 57.858908 seconds
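The two ways of computing a single distance mentioned above give the same value. A tiny check on made-up vectors (not CIFAR data), as a sketch:

```python
import numpy as np

# Illustrative vectors chosen so the distance is a round number
x = np.array([3.0, 0.0, 4.0])
y = np.array([0.0, 0.0, 0.0])
diff = x - y

# Variant 1: square each element, sum, take the square root
d_square_sum = np.sqrt(np.sum(np.square(diff)))

# Variant 2: dot product of the difference with itself
d_dot = np.sqrt(np.dot(diff, diff))

print(d_square_sum, d_dot)
```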

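The second method uses a single loop over the test vectors. For each test vector (1*3072) it directly computes the difference with all training vectors (5000*3072), which gives a 5000*3072 matrix. np.square() squares every element, and np.sum(np.square(self.X_train - X[i,:]), axis=1) sums the squares over the columns of each row; taking the square root then gives a vector of 5000 values, which is stored as one row of the result array and holds the Euclidean distances from that test vector to every training vector. The final result is a 500*5000 array, that is, test rows by training rows. In the run below this version was actually slower than the two-loop one, possibly because each iteration allocates a temporary 5000*3072 array.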


One loop version took 106.198080 seconds
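The broadcasting step in the one-loop version can be seen on small arrays. This is a sketch on random data with made-up sizes (6 training points, 3 test points, 4 features); it compares the one-loop result against a plain two-loop computation.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(6, 4))   # stand-in for the (5000, 3072) training matrix
X_test = rng.normal(size=(3, 4))    # stand-in for the (500, 3072) test matrix

# One loop: subtract a single test row from all training rows at once
one_loop = np.zeros((X_test.shape[0], X_train.shape[0]))
for i in range(X_test.shape[0]):
    diff = X_train - X_test[i, :]   # (6, 4) - (4,) broadcasts to (6, 4)
    one_loop[i, :] = np.sqrt(np.sum(np.square(diff), axis=1))

# Two loops: one distance at a time, used here as the reference
two_loop = np.zeros_like(one_loop)
for i in range(X_test.shape[0]):
    for j in range(X_train.shape[0]):
        two_loop[i, j] = np.sqrt(np.sum(np.square(X_test[i] - X_train[j])))

print(diff.shape, np.allclose(one_loop, two_loop))
```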



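For the third method (no loops), we first work out the distance between a single pair of points Pi and Cj, then expand it so that the whole distance matrix can be computed with matrix operations: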
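The expansion that the code below relies on can be written out as follows. This assumes that P_i is row i of the test matrix X, C_j is row j of self.X_train, and D is the number of features (3072 here); that reading of the symbols is inferred from the code.

```latex
\begin{aligned}
d(P_i, C_j)^2 &= \lVert P_i - C_j \rVert^2 = \sum_{k=1}^{D} (P_{ik} - C_{jk})^2 \\
              &= \sum_{k=1}^{D} P_{ik}^2 \;-\; 2 \sum_{k=1}^{D} P_{ik} C_{jk} \;+\; \sum_{k=1}^{D} C_{jk}^2 \\
              &= \lVert P_i \rVert^2 \;-\; 2\, P_i \cdot C_j \;+\; \lVert C_j \rVert^2
\end{aligned}
```

The three terms correspond to sq1 (squared norms of the test rows), the -2 * np.dot(X, self.X_train.T) product, and sq2 (squared norms of the training rows).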




    dists = np.multiply(np.dot(X, self.X_train.T), -2)    # shape (500, 5000): the -2 * Pi . Cj term
    sq1 = np.sum(np.square(X), axis=1, keepdims=True)     # shape (500, 1)
    sq2 = np.sum(np.square(self.X_train), axis=1)         # shape (5000,): dims are not kept, and must not be, so it broadcasts across rows
    dists = np.add(dists, sq1)                            # shape (500, 5000)
    dists = np.add(dists, sq2)                            # shape (500, 5000)
    dists = np.sqrt(dists)
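The same computation as a standalone function, checked against a brute-force reference on small random data. This is a sketch under stated assumptions: `pairwise_distances` is a hypothetical name, the array sizes are made up, and the `np.maximum(..., 0.0)` clip is an addition that is not in the code above. It guards against tiny negative values caused by floating-point rounding, which would otherwise turn into NaN under `np.sqrt`.

```python
import numpy as np

def pairwise_distances(X_test, X_train):
    """Euclidean distances between all rows of X_test and X_train, no loops."""
    cross = X_test @ X_train.T                                   # (num_test, num_train)
    sq_test = np.sum(np.square(X_test), axis=1, keepdims=True)   # (num_test, 1)
    sq_train = np.sum(np.square(X_train), axis=1)                # (num_train,)
    d2 = sq_test - 2.0 * cross + sq_train                        # broadcasts to (num_test, num_train)
    return np.sqrt(np.maximum(d2, 0.0))                          # clip rounding noise below zero

rng = np.random.default_rng(0)
X_test = rng.normal(size=(5, 8))
X_train = rng.normal(size=(7, 8))

dists = pairwise_distances(X_test, X_train)

# Brute-force reference: one norm per (test, train) pair
brute = np.array([[np.linalg.norm(p - c) for c in X_train] for p in X_test])
print(dists.shape, np.allclose(dists, brute))
```

`sq_test` keeps its second dimension so that it broadcasts down the columns, while `sq_train` stays one-dimensional so that it broadcasts across the rows.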

