# Run some setup code for this notebook.
import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt
#from __future__ import print_function
# This is a bit of magic to make matplotlib figures appear inline in the notebook
# rather than in a new window.
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
# Some more magic so that the notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2
First problem:
An error kept popping up and the cell would not run.
The fix was simply to wait for this part to finish loading …
Second problem:
I have no idea why this error appeared even though the files were in the correct directory. If anyone has run into the same problem, please share a fix!
The only workaround I found was to add the path myself:
import os
os.chdir("E:/CatchBeliF/study/homework/assignment1/")
print(os.getcwd())
# Load the raw CIFAR-10 data.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
# As a sanity check, we print out the size of the training and test data.
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)
Training data shape: (50000, 32, 32, 3)
Training labels shape: (50000,)
Test data shape: (10000, 32, 32, 3)
Test labels shape: (10000,)
# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7  # number of samples to show from each class
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)  # indices of all training examples with label y
    idxs = np.random.choice(idxs, samples_per_class, replace=False)  # pick 7 of them at random
    for i, idx in enumerate(idxs):  # loop over the chosen samples and their positions in the training set
        plt_idx = i * num_classes + y + 1  # position of this sample in the subplot grid
        plt.subplot(samples_per_class, num_classes, plt_idx)  # select the subplot to draw into
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()
A note on enumerate():
The enumerate() function wraps an iterable object (such as a list, tuple, or string) into an indexed sequence, yielding each element together with its index.
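A two-line illustration (toy values, just for demonstration):

for i, name in enumerate(['cat', 'dog']):
    print(i, name)  # prints "0 cat" then "1 dog"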
Continuing with the code:
# Subsample the data for more efficient code execution in this exercise
num_training = 5000
mask = list(range(num_training))
X_train = X_train[mask]
y_train = y_train[mask]
num_test = 500
mask = list(range(num_test))
X_test = X_test[mask]
y_test = y_test[mask]
# Reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
print(X_train.shape, X_test.shape)
(5000, 3072) (500, 3072)
A note on reshape():
When one of the dimensions passed to reshape() is -1, NumPy infers that dimension from the array's total size and the remaining dimensions.
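A quick sketch of the inferred dimension (arbitrary toy array):

import numpy as np
a = np.arange(12)
print(a.reshape(3, -1).shape)  # (3, 4): the -1 is inferred as 12 / 3
print(a.reshape(-1, 6).shape)  # (2, 6)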
Continuing with the code:
from cs231n.classifiers import KNearestNeighbor
# Create a kNN classifier instance.
# Remember that training a kNN classifier is a noop:
# the Classifier simply remembers the data and does no further processing
classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
补充
knn分类器的主要步骤:
1.训练。
读取训练数据并存储。
2.测试。
对于每一张测试图像,kNN把它与训练集中的每一张图像计算距离,找出距离最近的k张图像.这k张图像里,占多数的标签类别,就是测试图像的类别。.
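Here is a small NumPy sketch of the idea (toy arrays, not the assignment's API):

import numpy as np
X_tr = np.random.randn(10, 4)               # toy training set: 10 points in 4-D
y_tr = np.random.randint(0, 3, 10)          # toy labels in {0, 1, 2}
x = np.random.randn(4)                      # one toy test point
d = np.sqrt(((X_tr - x) ** 2).sum(axis=1))  # L2 distance to every training point
k = 3
nearest = y_tr[np.argsort(d)[:k]]           # labels of the k closest training points
pred = np.argmax(np.bincount(nearest))      # majority vote; ties break toward the smaller label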
Continuing with the code:
# Open cs231n/classifiers/k_nearest_neighbor.py and implement
# compute_distances_two_loops.
# Test your implementation:
dists = classifier.compute_distances_two_loops(X_test)
print(dists.shape)
(500, 5000)
# We can visualize the distance matrix: each row is a single test example and
# its distances to training examples
plt.imshow(dists, interpolation='none')
plt.show()
At first this rendered as an all-black image and then inexplicably started working. If anyone has seen the same behavior, please let me know!
Dark pixels indicate small distances; light pixels indicate large distances.
A row that is noticeably light means that test example is far from every training example; that test image is probably unusually bright, unusually dark, or color-shifted.
A column that is noticeably light means every test example is far from that one training example, which likewise is probably unusually bright, dark, or color-shifted.
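Those outlier rows and columns can also be found programmatically (a small aside, not part of the assignment):

row_means = dists.mean(axis=1)     # mean distance from each test example to the training set
col_means = dists.mean(axis=0)     # mean distance from each training example to the test set
print(np.argsort(row_means)[-5:])  # indices of the 5 lightest (most unusual) test rows
print(np.argsort(col_means)[-5:])  # indices of the 5 lightest training columns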
Continuing with the code:
# Now implement the function predict_labels and run the code below:
# We use k = 1 (which is Nearest Neighbor).
y_test_pred = classifier.predict_labels(dists, k=1)
# Compute and print the fraction of correctly predicted examples
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
Got 137 / 500 correct => accuracy: 0.274000
y_test_pred = classifier.predict_labels(dists, k=5)
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
Got 139 / 500 correct => accuracy: 0.278000
# Now lets speed up distance matrix computation by using partial vectorization
# with one loop. Implement the function compute_distances_one_loop and run the
# code below:
dists_one = classifier.compute_distances_one_loop(X_test)
# To ensure that our vectorized implementation is correct, we make sure that it
# agrees with the naive implementation. There are many ways to decide whether
# two matrices are similar; one of the simplest is the Frobenius norm. In case
# you haven't seen it before, the Frobenius norm of the difference of two
# matrices is the square root of the sum of squared differences of all elements;
# in other words, reshape the matrices into vectors and compute the Euclidean
# distance between them.
difference = np.linalg.norm(dists - dists_one, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')
Difference was: 0.000000
Good! The distance matrices are the same
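To see the "reshape into vectors" equivalence concretely (a small aside with toy matrices):

A = np.random.randn(3, 4)
B = np.random.randn(3, 4)
fro = np.linalg.norm(A - B, ord='fro')
euc = np.linalg.norm((A - B).ravel())  # Euclidean norm of the flattened difference
print(np.isclose(fro, euc))  # True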
# Now implement the fully vectorized version inside compute_distances_no_loops
# and run the code
dists_two = classifier.compute_distances_no_loops(X_test)
# check that the distance matrix agrees with the one we computed before:
difference = np.linalg.norm(dists - dists_two, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')
Difference was: 0.000000
Good! The distance matrices are the same
# Let's compare how fast the implementations are
def time_function(f, *args):
"""
Call a function f with args and return the time (in seconds) that it took to execute.
"""
import time
tic = time.time()
f(*args)
toc = time.time()
return toc - tic
two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)
print('Two loop version took %f seconds' % two_loop_time)
one_loop_time = time_function(classifier.compute_distances_one_loop, X_test)
print('One loop version took %f seconds' % one_loop_time)
no_loop_time = time_function(classifier.compute_distances_no_loops, X_test)
print('No loop version took %f seconds' % no_loop_time)
# you should see significantly faster performance with the fully vectorized implementation
Two loop version took 87.875775 seconds
One loop version took 137.346529 seconds
No loop version took 0.785548 seconds
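(A likely explanation for the one-loop version being slower than the two-loop version here: each iteration broadcasts one test row against the entire training matrix, allocating a large temporary array, so the loop becomes memory-bound rather than compute-bound.)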
num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]
X_train_folds = []
y_train_folds = []
################################################################################
# TODO: #
# Split up the training data into folds. After splitting, X_train_folds and #
# y_train_folds should each be lists of length num_folds, where #
# y_train_folds[i] is the label vector for the points in X_train_folds[i]. #
# Hint: Look up the numpy array_split function. #
################################################################################
X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)
################################################################################
# END OF YOUR CODE #
################################################################################
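np.array_split is used here (rather than np.split) because it also accepts sizes that don't divide evenly; a quick sketch with a toy array:

chunks = np.array_split(np.arange(10), 3)
print([c.shape for c in chunks])  # [(4,), (3,), (3,)]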
# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}
################################################################################
# TODO: #
# Perform k-fold cross validation to find the best value of k. For each #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times, #
# where in each case you use all but one of the folds as training data and the #
# last fold as a validation set. Store the accuracies for all fold and all #
# values of k in the k_to_accuracies dictionary. #
################################################################################
classifier = KNearestNeighbor()
for k in k_choices:
    accuracies = np.zeros(num_folds)
    for fold in range(num_folds):
        # Copy the fold lists so pop() does not destroy the originals.
        temp_X = X_train_folds[:]
        temp_y = y_train_folds[:]
        X_validate_fold = temp_X.pop(fold)
        y_validate_fold = temp_y.pop(fold)
        # Stack the remaining folds back into single training arrays.
        temp_X = np.concatenate(temp_X)
        temp_y = np.concatenate(temp_y)
        classifier.train(temp_X, temp_y)
        y_test_pred = classifier.predict(X_validate_fold, k=k)
        num_correct = np.sum(y_test_pred == y_validate_fold)
        # Accuracy is measured on the held-out fold, so divide by its size
        # (dividing by num_test here would inflate the reported numbers).
        accuracy = float(num_correct) / y_validate_fold.shape[0]
        accuracies[fold] = accuracy
    k_to_accuracies[k] = accuracies
################################################################################
# END OF YOUR CODE #
################################################################################
# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))
k = 1, accuracy = 0.526000
k = 1, accuracy = 0.514000
k = 1, accuracy = 0.528000
k = 1, accuracy = 0.556000
k = 1, accuracy = 0.532000
k = 3, accuracy = 0.478000
k = 3, accuracy = 0.498000
k = 3, accuracy = 0.480000
k = 3, accuracy = 0.532000
k = 3, accuracy = 0.508000
k = 5, accuracy = 0.496000
k = 5, accuracy = 0.532000
k = 5, accuracy = 0.560000
k = 5, accuracy = 0.584000
k = 5, accuracy = 0.560000
k = 8, accuracy = 0.524000
k = 8, accuracy = 0.564000
k = 8, accuracy = 0.546000
k = 8, accuracy = 0.580000
k = 8, accuracy = 0.546000
k = 10, accuracy = 0.530000
k = 10, accuracy = 0.592000
k = 10, accuracy = 0.552000
k = 10, accuracy = 0.568000
k = 10, accuracy = 0.560000
k = 12, accuracy = 0.520000
k = 12, accuracy = 0.590000
k = 12, accuracy = 0.558000
k = 12, accuracy = 0.566000
k = 12, accuracy = 0.560000
k = 15, accuracy = 0.504000
k = 15, accuracy = 0.578000
k = 15, accuracy = 0.556000
k = 15, accuracy = 0.564000
k = 15, accuracy = 0.548000
k = 20, accuracy = 0.540000
k = 20, accuracy = 0.558000
k = 20, accuracy = 0.558000
k = 20, accuracy = 0.564000
k = 20, accuracy = 0.570000
k = 50, accuracy = 0.542000
k = 50, accuracy = 0.576000
k = 50, accuracy = 0.556000
k = 50, accuracy = 0.538000
k = 50, accuracy = 0.532000
k = 100, accuracy = 0.512000
k = 100, accuracy = 0.540000
k = 100, accuracy = 0.526000
k = 100, accuracy = 0.512000
k = 100, accuracy = 0.526000
# plot the raw observations
for k in k_choices:
    accuracies = k_to_accuracies[k]
    plt.scatter([k] * len(accuracies), accuracies)
# plot the trend line with error bars that correspond to standard deviation
accuracies_mean = np.array([np.mean(v) for k,v in sorted(k_to_accuracies.items())])
accuracies_std = np.array([np.std(v) for k,v in sorted(k_to_accuracies.items())])
plt.errorbar(k_choices, accuracies_mean, yerr=accuracies_std)
plt.title('Cross-validation on k')
plt.xlabel('k')
plt.ylabel('Cross-validation accuracy')
plt.show()
# Based on the cross-validation results above, choose the best value for k,
# retrain the classifier using all the training data, and test it on the test
# data. You should be able to get above 28% accuracy on the test data.
best_k = 1
classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
y_test_pred = classifier.predict(X_test, k=best_k)
# Compute and display the accuracy
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
Got 137 / 500 correct => accuracy: 0.274000
Below is the code for the kNN classifier:
import numpy as np
from past.builtins import xrange
class KNearestNeighbor(object):
    """ a kNN classifier with L2 distance """

    def __init__(self):
        pass

    def train(self, X, y):
        """
        Train the classifier. For k-nearest neighbors this is just
        memorizing the training data.
        Inputs:
        - X: A numpy array of shape (num_train, D) containing the training data
          consisting of num_train samples each of dimension D.
        - y: A numpy array of shape (N,) containing the training labels, where
          y[i] is the label for X[i].
        """
        self.X_train = X
        self.y_train = y
    def predict(self, X, k=1, num_loops=0):
        """
        Predict labels for test data using this classifier.
        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data consisting
          of num_test samples each of dimension D.
        - k: The number of nearest neighbors that vote for the predicted labels.
        - num_loops: Determines which implementation to use to compute distances
          between training points and testing points.
        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        if num_loops == 0:
            dists = self.compute_distances_no_loops(X)
        elif num_loops == 1:
            dists = self.compute_distances_one_loop(X)
        elif num_loops == 2:
            dists = self.compute_distances_two_loops(X)
        else:
            raise ValueError('Invalid value %d for num_loops' % num_loops)
        return self.predict_labels(dists, k=k)
    def compute_distances_two_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a nested loop over both the training data and the
        test data.
        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data.
        Returns:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          is the Euclidean distance between the ith test point and the jth training
          point.
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in xrange(num_test):
            for j in xrange(num_train):
                #####################################################################
                # TODO:                                                             #
                # Compute the l2 distance between the ith test point and the jth   #
                # training point, and store the result in dists[i, j]. You should  #
                # not use a loop over dimension.                                    #
                #####################################################################
                dists[i, j] = np.sqrt(np.sum(np.square(X[i] - self.X_train[j])))
                #####################################################################
                #                       END OF YOUR CODE                            #
                #####################################################################
        return dists
    def compute_distances_one_loop(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a single loop over the test data.
        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in xrange(num_test):
            #######################################################################
            # TODO:                                                               #
            # Compute the l2 distance between the ith test point and all training #
            # points, and store the result in dists[i, :].                        #
            #######################################################################
            # Broadcast X[i, :] against the whole training matrix, then sum over
            # the feature dimension (axis=1).
            dists[i, :] = np.sqrt(np.sum(np.square(X[i, :] - self.X_train), axis=1))
            #######################################################################
            #                         END OF YOUR CODE                            #
            #######################################################################
        return dists
    def compute_distances_no_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using no explicit loops.
        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        #########################################################################
        # TODO:                                                                 #
        # Compute the l2 distance between all test points and all training     #
        # points without using any explicit loops, and store the result in     #
        # dists.                                                                #
        #                                                                       #
        # You should implement this function using only basic array operations; #
        # in particular you should not use functions from scipy.               #
        #                                                                       #
        # HINT: Try to formulate the l2 distance using matrix multiplication   #
        # and two broadcast sums.                                              #
        #########################################################################
        # Expand ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y and let broadcasting
        # do the rest: a (num_test, 1) column plus a (1, num_train) row.
        dists += np.sum(X ** 2, axis=1).reshape(num_test, 1)
        dists += np.sum(self.X_train ** 2, axis=1).reshape(1, num_train)
        dists -= 2 * np.dot(X, self.X_train.T)
        dists = np.sqrt(dists)
        #########################################################################
        #                         END OF YOUR CODE                              #
        #########################################################################
        return dists
    def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.
        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance between the ith test point and the jth training point.
        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in xrange(num_test):
            # A list of length k storing the labels of the k nearest neighbors to
            # the ith test point.
            closest_y = []
            #########################################################################
            # TODO:                                                                 #
            # Use the distance matrix to find the k nearest neighbors of the ith   #
            # testing point, and use self.y_train to find the labels of these      #
            # neighbors. Store these labels in closest_y.                          #
            # Hint: Look up the function numpy.argsort.                            #
            #########################################################################
            # argsort returns training indices ordered by distance; take the first k.
            closest_y = self.y_train[np.argsort(dists[i])[:k]]
            #########################################################################
            # TODO:                                                                 #
            # Now that you have found the labels of the k nearest neighbors, you   #
            # need to find the most common label in the list closest_y of labels.  #
            # Store this label in y_pred[i]. Break ties by choosing the smaller    #
            # label.                                                               #
            #########################################################################
            # bincount tallies votes per label; argmax returns the first (smallest)
            # label among ties, which is exactly the required tie-breaking rule.
            y_pred[i] = np.argmax(np.bincount(closest_y))
            #########################################################################
            #                         END OF YOUR CODE                              #
            #########################################################################
        return y_pred
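As a quick check of the tie-breaking behavior described in the last TODO (toy labels, just for illustration):

import numpy as np
votes = np.bincount(np.array([2, 1, 1, 2]))  # counts per label: [0, 2, 2]
print(np.argmax(votes))  # 1: labels 1 and 2 tie, and argmax picks the smaller one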
Finally, my understanding of the three distance-computation methods above:
1. The compute_distances_two_loops method computes each entry dists[i, j] separately with an explicit double loop over test and training points.
2. The compute_distances_one_loop method broadcasts one test row against the whole training matrix at a time. A note on the axis argument: axis=1 sums across each row, while axis=0 sums down each column.
3. The compute_distances_no_loops method expands the squared L2 distance into one matrix multiplication plus two broadcast sums (a numerical check of this identity follows below).
[Figure illustrating the no-loops derivation, quoted from another source; omitted.]
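As a sanity check on the expansion used in compute_distances_no_loops, here is a tiny numerical sketch of the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a.b (toy arrays, independent of the assignment code):

import numpy as np
a = np.random.randn(4, 3)  # 4 "test" points
b = np.random.randn(6, 3)  # 6 "train" points
direct = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2))
expanded = np.sqrt((a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2 * a @ b.T)
print(np.allclose(direct, expanded))  # True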