嫉妒的虚荣

CS231n（Spring 2019）Assignment 1 - SVM

Preprocessing

# Load the raw CIFAR-10 data.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'

# Cleaning up variables to prevent loading data multiple times (which may cause memory issue)
try:
   del X_train, y_train
   del X_test, y_test
   print('Clear previously loaded data.')
except:
   pass

X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# As a sanity check, we print out the size of the training and test data.
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

Clear previously loaded data.
Training data shape:  (50000, 32, 32, 3)
Training labels shape:  (50000,)
Test data shape:  (10000, 32, 32, 3)
Test labels shape:  (10000,)

# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()

# Split the data into train, val, and test sets. In addition we will
# create a small development set as a subset of the training data;
# we can use this for development so our code runs faster.
num_training = 49000
num_validation = 1000
num_test = 1000
num_dev = 500

# Our validation set will be num_validation points from the original
# training set.
mask = range(num_training, num_training + num_validation)
X_val = X_train[mask]
y_val = y_train[mask]

# Our training set will be the first num_train points from the original
# training set.
mask = range(num_training)
X_train = X_train[mask]
y_train = y_train[mask]

# We will also make a development set, which is a small subset of
# the training set.
mask = np.random.choice(num_training, num_dev, replace=False)
X_dev = X_train[mask]
y_dev = y_train[mask]

# We use the first num_test points of the original test set as our
# test set.
mask = range(num_test)
X_test = X_test[mask]
y_test = y_test[mask]

print('Train data shape: ', X_train.shape)
print('Train labels shape: ', y_train.shape)
print('Validation data shape: ', X_val.shape)
print('Validation labels shape: ', y_val.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

Train data shape:  (49000, 32, 32, 3)
Train labels shape:  (49000,)
Validation data shape:  (1000, 32, 32, 3)
Validation labels shape:  (1000,)
Test data shape:  (1000, 32, 32, 3)
Test labels shape:  (1000,)

# Preprocessing: reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_val = np.reshape(X_val, (X_val.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
X_dev = np.reshape(X_dev, (X_dev.shape[0], -1))

# As a sanity check, print out the shapes of the data
print('Training data shape: ', X_train.shape)
print('Validation data shape: ', X_val.shape)
print('Test data shape: ', X_test.shape)
print('dev data shape: ', X_dev.shape)

Training data shape:  (49000, 3072)
Validation data shape:  (1000, 3072)
Test data shape:  (1000, 3072)
dev data shape:  (500, 3072)

# Preprocessing: subtract the mean image
# first: compute the image mean based on the training data
mean_image = np.mean(X_train, axis=0)
print(mean_image[:10]) # print a few of the elements
plt.figure(figsize=(4,4))
plt.imshow(mean_image.reshape((32,32,3)).astype('uint8')) # visualize the mean image
plt.show()

# second: subtract the mean image from train and test data
X_train -= mean_image
X_val -= mean_image
X_test -= mean_image
X_dev -= mean_image

# third: append the bias dimension of ones (i.e. bias trick) so that our SVM
# only has to worry about optimizing a single weight matrix W.
X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
X_val = np.hstack([X_val, np.ones((X_val.shape[0], 1))])
X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])
X_dev = np.hstack([X_dev, np.ones((X_dev.shape[0], 1))])

print(X_train.shape, X_val.shape, X_test.shape, X_dev.shape)

[130.64189796 135.98173469 132.47391837 130.05569388 135.34804082
 131.75402041 130.96055102 136.14328571 132.47636735 131.48467347]

(49000, 3073) (1000, 3073) (1000, 3073) (500, 3073)

SVM Classifier

(in cs231n/classifiers/linear_svm.py)

def svm_loss_naive(W, X, y, reg):
    """
    Structured SVM loss function, naive implementation (with loops).

    Inputs have dimension D, there are C classes, and we operate on minibatches
    of N examples.

    Inputs:
    - W: A numpy array of shape (D, C) containing weights.
    - X: A numpy array of shape (N, D) containing a minibatch of data.
    - y: A numpy array of shape (N,) containing training labels; y[i] = c means
      that X[i] has label c, where 0 <= c < C.
    - reg: (float) regularization strength

    Returns a tuple of:
    - loss as single float
    - gradient with respect to weights W; an array of same shape as W
    """
    dW = np.zeros(W.shape) # initialize the gradient as zero

    # compute the loss and the gradient
    num_classes = W.shape[1]
    num_train = X.shape[0]
    loss = 0.0
    for i in range(num_train):
        scores = X[i].dot(W)
        correct_class_score = scores[y[i]]
        for j in range(num_classes):
            if j == y[i]:
                continue
            margin = scores[j] - correct_class_score + 1 # note delta = 1
            if margin > 0:
                loss += margin
                dW[:, y[i]] += -X[i,:].T
                dW[:, j] += X[i,:].T

    # Right now the loss is a sum over all training examples, but we want it
    # to be an average instead so we divide by num_train.
    loss /= num_train
    dW /= num_train

    # Add regularization to the loss.
    loss += 0.5 * reg * np.sum(W * W)

    #############################################################################
    # TODO:                                                                     #
    # Compute the gradient of the loss function and store it dW.                #
    # Rather that first computing the loss and then computing the derivative,   #
    # it may be simpler to compute the derivative at the same time that the     #
    # loss is being computed. As a result you may need to modify some of the    #
    # code above to compute the gradient.                                       #
    #############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    dW += reg * W

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    
    return loss, dW

# Evaluate the naive implementation of the loss we provided for you:
from cs231n.classifiers.linear_svm import svm_loss_naive
import time

# generate a random SVM weight matrix of small numbers
W = np.random.randn(3073, 10) * 0.0001 

loss, grad = svm_loss_naive(W, X_dev, y_dev, 0.000005)
print('loss: %f' % (loss, ))

loss: 9.053956

# Once you've implemented the gradient, recompute it with the code below
# and gradient check it with the function we provided for you

# Compute the loss and its gradient at W.
loss, grad = svm_loss_naive(W, X_dev, y_dev, 0)

# Numerically compute the gradient along several randomly chosen dimensions, and
# compare them with your analytically computed gradient. The numbers should match
# almost exactly along all dimensions.
from cs231n.gradient_check import grad_check_sparse
f = lambda w: svm_loss_naive(w, X_dev, y_dev, 0.0)[0]
grad_numerical = grad_check_sparse(f, W, grad)

# do the gradient check once again with regularization turned on
# you didn't forget the regularization gradient did you?
loss, grad = svm_loss_naive(W, X_dev, y_dev, 5e1)
f = lambda w: svm_loss_naive(w, X_dev, y_dev, 5e1)[0]
grad_numerical = grad_check_sparse(f, W, grad)

numerical: -34.753629 analytic: -34.753629, relative error: 2.742306e-12
numerical: -48.469713 analytic: -48.469713, relative error: 6.065376e-13
numerical: -1.603047 analytic: -1.603047, relative error: 9.472120e-11
numerical: 5.249776 analytic: 5.249776, relative error: 1.141679e-11
numerical: -4.809977 analytic: -4.809977, relative error: 2.608340e-11
numerical: -10.777707 analytic: -10.777707, relative error: 1.961092e-11
numerical: -3.271406 analytic: -3.237796, relative error: 5.163584e-03
numerical: -3.988728 analytic: -3.988728, relative error: 3.997376e-11
numerical: -22.090951 analytic: -22.090951, relative error: 8.991002e-12
numerical: -23.452841 analytic: -23.452841, relative error: 5.544743e-12
numerical: -1.728582 analytic: -1.716928, relative error: 3.382425e-03
numerical: -22.171377 analytic: -22.175782, relative error: 9.933796e-05
numerical: -8.501782 analytic: -8.509761, relative error: 4.690096e-04
numerical: -24.672617 analytic: -24.670656, relative error: 3.974417e-05
numerical: -0.118892 analytic: -0.120984, relative error: 8.721904e-03
numerical: 10.829417 analytic: 10.825013, relative error: 2.033897e-04
numerical: -12.508858 analytic: -12.504099, relative error: 1.902809e-04
numerical: -4.451844 analytic: -4.444428, relative error: 8.335108e-04
numerical: -5.860418 analytic: -5.864526, relative error: 3.503710e-04
numerical: 28.043494 analytic: 28.045330, relative error: 3.273877e-05

# Next implement the function svm_loss_vectorized; for now only compute the loss;
# we will implement the gradient in a moment.
tic = time.time()
loss_naive, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Naive loss: %e computed in %fs' % (loss_naive, toc - tic))

from cs231n.classifiers.linear_svm import svm_loss_vectorized
tic = time.time()
loss_vectorized, _ = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic))

# The losses should match but your vectorized implementation should be much faster.
print('difference: %f' % (loss_naive - loss_vectorized))

Naive loss: 9.053956e+00 computed in 0.117306s
Vectorized loss: 9.053956e+00 computed in 0.063829s
difference: 0.000000

# Complete the implementation of svm_loss_vectorized, and compute the gradient
# of the loss function in a vectorized way.

# The naive implementation and the vectorized implementation should match, but
# the vectorized version should still be much faster.
tic = time.time()
_, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Naive loss and gradient: computed in %fs' % (toc - tic))

tic = time.time()
_, grad_vectorized = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Vectorized loss and gradient: computed in %fs' % (toc - tic))

# The loss is a single number, so it is easy to compare the values computed
# by the two implementations. The gradient on the other hand is a matrix, so
# we use the Frobenius norm to compare them.
difference = np.linalg.norm(grad_naive - grad_vectorized, ord='fro')
print('difference: %f' % difference)

Naive loss and gradient: computed in 0.125662s
Vectorized loss and gradient: computed in 0.003994s
difference: 0.000000

Stochastic Gradient Descent(SGD)

(in cs231n/classifiers/linear_classifier.py)

class LinearClassifier(object):

    def __init__(self):
        self.W = None

    def train(self, X, y, learning_rate=1e-3, reg=1e-5, num_iters=100,
              batch_size=200, verbose=False):
        """
        Train this linear classifier using stochastic gradient descent.

        Inputs:
        - X: A numpy array of shape (N, D) containing training data; there are N
          training samples each of dimension D.
        - y: A numpy array of shape (N,) containing training labels; y[i] = c
          means that X[i] has label 0 <= c < C for C classes.
        - learning_rate: (float) learning rate for optimization.
        - reg: (float) regularization strength.
        - num_iters: (integer) number of steps to take when optimizing
        - batch_size: (integer) number of training examples to use at each step.
        - verbose: (boolean) If true, print progress during optimization.

        Outputs:
        A list containing the value of the loss function at each training iteration.
        """
        num_train, dim = X.shape
        num_classes = np.max(y) + 1 # assume y takes values 0...K-1 where K is number of classes
        if self.W is None:
            # lazily initialize W
            self.W = 0.001 * np.random.randn(dim, num_classes)

        # Run stochastic gradient descent to optimize W
        loss_history = []
        for it in range(num_iters):
            X_batch = None
            y_batch = None

            #########################################################################
            # TODO:                                                                 #
            # Sample batch_size elements from the training data and their           #
            # corresponding labels to use in this round of gradient descent.        #
            # Store the data in X_batch and their corresponding labels in           #
            # y_batch; after sampling X_batch should have shape (batch_size, dim)   #
            # and y_batch should have shape (batch_size,)                           #
            #                                                                       #
            # Hint: Use np.random.choice to generate indices. Sampling with         #
            # replacement is faster than sampling without replacement.              #
            #########################################################################
            # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

            batch_inx = np.random.choice(num_train,batch_size)
            X_batch = X[batch_inx,:]
            y_batch = y[batch_inx]

            # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

            # evaluate loss and gradient
            loss, grad = self.loss(X_batch, y_batch, reg)
            loss_history.append(loss)

            # perform parameter update
            #########################################################################
            # TODO:                                                                 #
            # Update the weights using the gradient and the learning rate.          #
            #########################################################################
            # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

            self.W -= learning_rate * grad

            # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

            if verbose and it % 100 == 0:
                print('iteration %d / %d: loss %f' % (it, num_iters, loss))

        return loss_history

# In the file linear_classifier.py, implement SGD in the function
# LinearClassifier.train() and then run it with the code below.
from cs231n.classifiers import LinearSVM
svm = LinearSVM()
tic = time.time()
loss_hist = svm.train(X_train, y_train, learning_rate=1e-7, reg=2.5e4,
                      num_iters=1500, verbose=True)
toc = time.time()
print('That took %fs' % (toc - tic))

iteration 0 / 1500: loss 411.256562
iteration 100 / 1500: loss 244.689758
iteration 200 / 1500: loss 148.272938
iteration 300 / 1500: loss 92.533291
iteration 400 / 1500: loss 57.345825
iteration 500 / 1500: loss 36.810807
iteration 600 / 1500: loss 24.459797
iteration 700 / 1500: loss 16.620232
iteration 800 / 1500: loss 11.988697
iteration 900 / 1500: loss 9.749128
iteration 1000 / 1500: loss 7.778197
iteration 1100 / 1500: loss 6.811365
iteration 1200 / 1500: loss 6.014506
iteration 1300 / 1500: loss 5.870699
iteration 1400 / 1500: loss 5.501124
That took 7.865505s

# A useful debugging strategy is to plot the loss as a function of
# iteration number:
plt.plot(loss_hist)
plt.xlabel('Iteration number')
plt.ylabel('Loss value')
plt.show()

# Write the LinearSVM.predict function and evaluate the performance on both the
# training and validation set
y_train_pred = svm.predict(X_train)
print('training accuracy: %f' % (np.mean(y_train == y_train_pred), ))
y_val_pred = svm.predict(X_val)
print('validation accuracy: %f' % (np.mean(y_val == y_val_pred), ))

training accuracy: 0.379837
validation accuracy: 0.388000

# Use the validation set to tune hyperparameters (regularization strength and
# learning rate). You should experiment with different ranges for the learning
# rates and regularization strengths; if you are careful you should be able to
# get a classification accuracy of about 0.39 on the validation set.

#Note: you may see runtime/overflow warnings during hyper-parameter search. 
# This may be caused by extreme values, and is not a bug.

learning_rates = [1e-7, 5e-5,1e-5]
regularization_strengths = [2.5e4, 5e4, 0.01]

# results is dictionary mapping tuples of the form
# (learning_rate, regularization_strength) to tuples of the form
# (training_accuracy, validation_accuracy). The accuracy is simply the fraction
# of data points that are correctly classified.
results = {}
best_val = -1   # The highest validation accuracy that we have seen so far.
best_svm = None # The LinearSVM object that achieved the highest validation rate.

################################################################################
# TODO:                                                                        #
# Write code that chooses the best hyperparameters by tuning on the validation #
# set. For each combination of hyperparameters, train a linear SVM on the      #
# training set, compute its accuracy on the training and validation sets, and  #
# store these numbers in the results dictionary. In addition, store the best   #
# validation accuracy in best_val and the LinearSVM object that achieves this  #
# accuracy in best_svm.                                                        #
#                                                                              #
# Hint: You should use a small value for num_iters as you develop your         #
# validation code so that the SVMs don't take much time to train; once you are #
# confident that your validation code works, you should rerun the validation   #
# code with a larger value for num_iters.                                      #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

for learning_rate in learning_rates:
    for regularization_strength in regularization_strengths:
        svm = LinearSVM()
        loss_hist = svm.train(X_train, y_train, learning_rate = learning_rate, reg = regularization_strength,
                      num_iters = 1500, verbose=False)
        y_train_pred = svm.predict(X_train)
        train_accuracy = np.mean(y_train_pred == y_train)
        y_val_pred = svm.predict(X_val)
        val_accuracy = np.mean(y_val == y_val_pred)
        results[(learning_rate, regularization_strength)] = [train_accuracy, val_accuracy]
        if val_accuracy > best_val:
            best_val = val_accuracy
            best_svm = svm

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    
# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
                lr, reg, train_accuracy, val_accuracy))
    
print('best validation accuracy achieved during cross-validation: %f' % best_val)

lr 1.000000e-07 reg 1.000000e-02 train accuracy: 0.300816 val accuracy: 0.312000
lr 1.000000e-07 reg 2.500000e+04 train accuracy: 0.383796 val accuracy: 0.393000
lr 1.000000e-07 reg 5.000000e+04 train accuracy: 0.370816 val accuracy: 0.375000
lr 1.000000e-05 reg 1.000000e-02 train accuracy: 0.341224 val accuracy: 0.336000
lr 1.000000e-05 reg 2.500000e+04 train accuracy: 0.193041 val accuracy: 0.188000
lr 1.000000e-05 reg 5.000000e+04 train accuracy: 0.195224 val accuracy: 0.187000
lr 5.000000e-05 reg 1.000000e-02 train accuracy: 0.284980 val accuracy: 0.286000
lr 5.000000e-05 reg 2.500000e+04 train accuracy: 0.149224 val accuracy: 0.148000
lr 5.000000e-05 reg 5.000000e+04 train accuracy: 0.095102 val accuracy: 0.090000
best validation accuracy achieved during cross-validation: 0.393000

# Visualize the cross-validation results
import math
x_scatter = [math.log10(x[0]) for x in results]
y_scatter = [math.log10(x[1]) for x in results]

# plot training accuracy
marker_size = 100
colors = [results[x][0] for x in results]
plt.subplot(2, 1, 1)
plt.scatter(x_scatter, y_scatter, marker_size, c=colors, cmap=plt.cm.coolwarm)
plt.colorbar()
plt.xlabel('log learning rate')
plt.ylabel('log regularization strength')
plt.title('CIFAR-10 training accuracy')

# plot validation accuracy
colors = [results[x][1] for x in results] # default size of markers is 20
plt.subplot(2, 1, 2)
plt.scatter(x_scatter, y_scatter, marker_size, c=colors, cmap=plt.cm.coolwarm)
plt.colorbar()
plt.xlabel('log learning rate')
plt.ylabel('log regularization strength')
plt.title('CIFAR-10 validation accuracy')
plt.show()

# Evaluate the best svm on test set
y_test_pred = best_svm.predict(X_test)
test_accuracy = np.mean(y_test == y_test_pred)
print('linear SVM on raw pixels final test set accuracy: %f' % test_accuracy)

linear SVM on raw pixels final test set accuracy: 0.370000

# Visualize the learned weights for each class.
# Depending on your choice of learning rate and regularization strength, these may
# or may not be nice to look at.
w = best_svm.W[:-1,:] # strip out the bias
w = w.reshape(32, 32, 3, 10)
w_min, w_max = np.min(w), np.max(w)
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for i in range(10):
    plt.subplot(2, 5, i + 1)
      
    # Rescale the weights to be between 0 and 255
    wimg = 255.0 * (w[:, :, :, i].squeeze() - w_min) / (w_max - w_min)
    plt.imshow(wimg.astype('uint8'))
    plt.axis('off')
    plt.title(classes[i])

你可能感兴趣的:(CS231n)

cs231n_深度之眼第二次作业 Jie_Cheney
图像分类数据和label分别是什么？图像分类存在的问题与挑战？图像分类数据包括训练集测试集的数据，在有监督的问题中对于训练集数据来说是有label的，而测试集是等待我们去识别它的类别，不具有label。label就是分类标签，比如cifar10这个数据集，待分类的这10类数据我们可以写成1-10，或者0-9这就叫做label。图像分类存在的问题与挑战：光照，角度，形变，遮挡。使用python加载一
向量，矩阵和张量的导数 | 简单的数学橘子学AI
前段时间看过一些矩阵求导的教程，在看过的资料中，尤其喜欢斯坦福大学CS231n卷积神经网络课程中提到的Erik这篇文章。循着他的思路，可以逐步将复杂的求导过程简化、再简化，直到发现其中有规律的部分。话不多说，一起来看看吧。作者：ErikLearned-Miller翻译：橘子来源：橘子AI笔记（datawitch）本文旨在帮助您学习向量、矩阵和高阶张量（三维或三维以上的数组）的求导方法，以及如何求对
cs231n assignment1——SVM 柠檬山楂荷叶茶 cs231n 支持向量机 python 机器学习
整体思路加载CIFAR-10数据集并展示部分数据数据图像归一化，减去均值（也可以再除以方差）svm_loss_naive和svm_loss_vectorized计算hinge损失，用拉格朗日法列hinge损失函数利用随机梯度下降法优化SVM在训练集和验证集计算准确率，保存最好的模型在测试集进行预测计算准确率加载展示划分数据集加载CIFAR-10数据集#LoadtherawCIFAR-10data.
（2023版）斯坦福CS231n学习笔记：DL与CV教程 (12) | 视觉模型可视化与可解释性（Visualizing and Understanding）女王の专属领地计算机视觉 #计算机视觉 #学习笔记
前言笔记专栏：斯坦福CS231N：面向视觉识别的卷积神经网络（23）课程链接：https://www.bilibili.com/video/BV1xV411R7i5CS231n:深度学习计算机视觉（2017）中文笔记：https://zhuxiaoxia.blog.csdn.net/article/details/801551662023最新课程PPT：https://download.csdn.
2019-02-25~~2019-03-03 第十周周末复盘仰望星空的小狗
一、任务清单1、刷leetcode题目（7道）2、听tensorflow，cs231n和cv课程3、技术文档输出4、恢复早起的作息二、反思1、自从年前工作非常忙，加上遇上一些郁闷的事情，导致年前到现在时间记录中断了很长一段时间。本周开始恢复时间记录，日打卡，周复盘。2、生活中不论谁，肯定会时不时遇上一些令人郁闷的事情，这些郁闷的事情很可能会打乱原本的生活节奏。但是，生活还有很长的路要走，不应该因为
训练神经网络(上)激活函数笔写落去深度学习神经网络人工智能深度学习
本文介绍几种激活函数,只作为个人笔记.观看视频为cs231n文章目录前言一、Sigmoid函数二、tanh函数三、ReLU函数四、LeakyReLU函数五、ELU函数六.在实际应用中寻找激活函数的做法总结前言激活函数是用来加入非线性因素的，提高神经网络对模型的表达能力，解决线性模型所不能解决的问题。一、Sigmoid函数这个函数大家应该熟悉在逻辑回归中曾用到这个sigmoid函数这个函数可以将负无
卷积神经网络 weixin_34283445 人工智能
https://zhuanlan.zhihu.com/p/27642620关于卷积神经网络的讲解，网上有很多精彩文章，且恐怕难以找到比斯坦福的CS231n还要全面的教程。所以这里对卷积神经网络的讲解主要是以不同的思考侧重展开，通过对卷积神经网络的分析，进一步理解神经网络变体中“因素共享”这一概念。注意：该文会跟其他的现有文章有很大的不同。读该文需要有本书前些章节作为预备知识，不然会有理解障碍。没看
CS231n 作业答案 tech0ne
CS231n三次大作业：#第一次作业##原始包下载：作业一完成包地址：作业一JupyterNotebook结果：KNNSVMSoftmaxTwolayernetFeatures第二次作业原始包下载：作业二完成包地址：作业二JupyterNotebook结果：FullyConnectedNetsBatchNormalizationDropoutConvolutionalNetworksTensorf
cs231n作业-assignment1 momentum_ AI python 机器学习 numpy
assignment1(cs231n)文章目录assignment1(cs231n)KNN基础计算distances方法一：双层循环计算distances方法二：单层循环计算distances方法三：无循环根据dists找到每个测试样本的种类KNN模型汇总交叉验证KNN基础计算distances方法一：双层循环dists矩阵是（num_test,num_train）500*5000defcompu
【深度学习理论】(1) 损失函数立Sir 深度学习理论机器学习人工智能神经网络深度学习损失函数
各位同学好，最近学习了CS231N斯坦福计算机视觉公开课，讲的太精彩了，和大家分享一下。已知一张图像属于各个类别的分数，我们希望图像属于正确分类的分数是最大的，那如何定量的去衡量呢，那就是损失函数的作用了。通过比较分数与真实标签的差距，构造损失函数，就可以定量的衡量模型的分类效果，进而进行后续的模型优化和评估。构造损失函数之后，我们的目标就是将损失函数的值最小化，使用梯度下降的方法求得损失函数对于
线性分类器--数据处理骆驼穿针眼计算机视觉与深度学习深度学习
数据集划分通常按照70%，20%，10%来分数据集数据处理斯坦福的线性分类器体验http://vision.stanford.edu/teaching/cs231n-demos/linear-classify/
【CS231n】－学习笔记-1-Intro to Computer Vision, historical context. Alice熹爱学习计算机视觉计算机视觉 CS231n DeepLearning PYTHON
Class:http://cs231n.stanford.eduSchedule:http://cs231n.stanford.edu/syllabus.htmlSlides:http://vision.stanford.edu/teaching/cs231n/slides/winter1516_lecture1.pdfVideo:https://www.youtube.com/watch?v=N
笔记00-杜克大学公开课,图像和视频处理:从火星到好莱坞木木爱吃糖醋鱼
笔记内容介绍》ImageandVideoProcessing:FromMarstoHollywoodwithaStopattheHospital算起来是2017年中的时候，因为要搞深度学习的东西，就自学了斯坦福cs231n的神经网络的课。Youtube上有至少两期的公开课视频。好像从李飞飞离职之后，截止到2017年春季，就没再继续了。现在想想哪门课的内容挺多挺繁杂的。虽然是本科的课，最后好像每个学
向量对向量求导，链式法则构建的乐趣向量对向量求导
这还算不得向量微积分里多么主干的内容，只是一个小技术，但是数学推导很多时候就会用到。http://cs231n.stanford.edu/vecDerivs.pdf这个文献是一个好文献。另优秀翻译：https://zhuanlan.zhihu.com/p/142668996链式法则注意：这里的乘法变成了innerproduct推导过程中比较关键的点：除了利用这文献所讲的分量慢慢推，还有一个要点，首
Win10上关于cs231n（2017）课后作业的环境配置 Diane小山
开始首先，这篇文章是针对那些想完成cs231n作业，但是觉得装linux双系统很麻烦的童鞋。cs231n作业的SetUp官方教程只针对了那些使用Unix(Ubuntu,Macos等)的人，对使用Windows的人十分不友好。安装anaconda百度一篇anaconda的安装教程，照着安装即可。这里需要提醒的有两点：国内的anaconda镜像能用的基本都挂了，所以还是老老实实去官方网站下载吧：）一定
CS231N assignment2 SVM weixin_30363509 数据结构与算法人工智能 python
CS231NAssignment2SupportVectorMachineBegin本文主要介绍CS231N系列课程的第一项作业，写一个SVM无监督学习训练模型。课程主页：网易云课堂CS231N系列课程语言：Python3.61线形分类器以图像为例，一幅图像像素为32*32*3代表长32宽32有3通道的衣服图像，将其变为1*3072的一个向量，即该图像的特征向量。我们如果需要训练1000幅图像，那
【AI】斯坦福CS231n课程练习（1）—— KNN和SVM分类李清焰 CS231n KNN SVM
文章目录一、前言1、CS231n是啥？2、本篇博客任务3、使用的数据集二、知识准备1、KNN是什么？2、SVM是什么？SVM的组成：三、实验——KNN和SVM分类1、KNN图片分类（重要步骤将在目录上体现）（1）在colab上切换目录，加载dataset（2）加载包、设置和外部模块（3）加载、初步处理数据（4）可视化打印一些图片看看我们的数据集长什么样（5）对测试、训练数据进行分组（6）创建KNN
深度学习系列之cs231n assignment1 KNN（二）明曦君深度学习 python 机器学习
写在前面：久经周折，终于能够将KNN系列给大家继续分享了，这次的内容来源于李飞飞教授团队的cs231n深度学习课程的作业1中的KNN研究，我会在全文我遇到困难的地方进行分享，以及一些想法。内容安排深度学习系列依托与cs231n的课程作业，因为只想练习编程，所以不对课程内容进行分享，仅针对编程内容进行分享。那么这一次的分享就是assignment1中K近邻分类器的使用，以及完成其中的四个问题，这四个
cs231n assignment2(3) 没天赋的学琴
assignment2的第三部分，是熟悉深度学习框架pytorch或者tensorflow，这里选择的是使用pytorch框架。该部分主要通过三个层次：Barebones、ModuleAPI、SequentialAPI，来了解pytorch。Barebones在该层次中，需要利用pytorch所提供的一些函数，不仅需要定义神经网络的结构，同时还需编写网络的前向传播以及模型的训练部分；而参数的梯度可
第三十三周学习笔记 luputo 学习笔记
第三十三周学习笔记CS231nDeepLearningSoftwareCPUvsGPUCPU:Fewercores,buteachcoreismuchfasterandmuchmorecapable;greatatsequentialtasksGPU:Morecores,buteachcoreismuchslowerand“dumber”;greatforparalleltasks（matrixm
CNN(卷积神经网络)、RNN(循环神经网络)、DNN，LSTM weixin_34174132 人工智能
http://cs231n.github.io/neural-networks-1https://arxiv.org/pdf/1603.07285.pdfhttps://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/Appli
CNN笔记：通俗理解卷积神经网络 I_O_fly 神经网络 cnn 神经网络深度学习
通俗理解卷积神经网络（cs231n与5月dl班课程笔记）1前言2012年我在北京组织过8期machinelearning读书会，那时“机器学习”非常火，很多人都对其抱有巨大的热情。当我2013年再次来到北京时，有一个词似乎比“机器学习”更火，那就是“深度学习”。本博客内写过一些机器学习相关的文章，但上一篇技术文章“LDA主题模型”还是写于2014年11月份，毕竟自2015年开始创业做在线教育后，太
Knn算法与 Svm算法对比一个不知名的码农支持向量机算法机器学习
Knn算法与Svm算法对比这里首先借用一个博主所做的图表，讲的很有理有据(7条消息)[cs231n]KNN与SVM区别_Rookie’Program的博客-CSDN博客_knn和svm的区别这里我们来讲一下我对这两个算法的理解knn看起来就是比较简单的一个数学模型，就是划范围论，精细程度实际上可能没有svm好，并且测试量也不能大，数据一大，处理起来又很麻烦，预测效率也比较低。相反的svm和knn对
斯坦福大学CS520知识图谱系列课程学习笔记：第一讲什么是知识图谱 ngl567
随着知识图谱在人工智能各个领域的广泛使用，知识图谱受到越来越多AI研究人员的关注和学习，已经成为人工智能迈向认知系统的关键技术之一。之前，斯坦福大学的面向计算机视觉的CS231n和面向自然语言处理的CS224n成为了全球非常多AI研究人员的入门经典学习课程。因此，斯坦福大学于今年3月开设了一门专门面向知识图谱的系列课程CS520，官网课程页：https://web.stanford.edu/cla
北京邮电大学计算机视觉与深度学习鲁鹏计算机视觉概述课程手迹 qinyaoze 机器学习 CV手记计算机视觉人工智能深度学习
课程笔记计算机视觉=输入(认知神经科学-理论,运用方法&算法,硬件)+输出(机器人)课程：图像处理-CS131，图像结构-CS231a，图像理论-CS230/CS231nQ-象棋与人工智能的关系？IBM-深蓝，Google-AlphaGo>>机器赢得象棋胜利=强大的搜索算法目标：语义鸿沟，即建立图像像素核语义间的关系发展过程：系统出现-物种大繁荣>>理论研究-猫视觉神经>>积木世界>>MIT图像处
国外AI大牛推荐的10大最有帮助免费在线机器学习课程机器学习与系统
woman_ml.jpg本文编译自twitter用户chipro斯坦福在线自学课程《概率与统计》：该课程涉及概率统计的基本概念，涵盖机器学习4个基本方面：探索性数据分析，产生数据，概率和推理。MIT的《线性代数》：这是我见过的最好的线性代数课程，由传奇教授GilbertStrang（吉尔伯特斯特朗）教授。斯坦福的CS231N：用于视觉识别的卷积神经网络：平衡理论与实践。课堂笔记写得很好，解释了不同
CS231n学习笔记--计算机视觉历史回顾与介绍1 听城
CS231n简介首先我们来看看官方对这门课的介绍：计算机视觉在社会中已经逐渐普及，并广泛运用于搜索检索、图像理解、手机应用、地图导航、医疗制药、无人机和无人驾驶汽车等领域。而这些应用的核心技术就是图像分类、图像定位和图像探测等视觉识别任务。近期神经网络（也就是“深度学习”）方法上的进展极大地提升了这些代表当前发展水平的视觉识别系统的性能。本课程将深入讲解深度学习框架的细节问题，聚焦面向视觉识别任务
计算机视觉实战项目（图像分类+目标检测+目标跟踪+姿态识别+车道线识别+车牌识别）阿利同学计算机视觉分类目标检测
图像分类教程博客_传送门链接:链接在本教程中，您将学习如何使用迁移学习训练卷积神经网络以进行图像分类。您可以在cs231n上阅读有关迁移学习的更多信息。本文主要目的是教会你如何自己搭建分类模型，耐心看完，相信会有很大收获。废话不多说，直切主题…首先们要知道深度学习大都包含了下面几个方面：1.加载（处理）数据2.网络搭建3.损失函数（模型优化）4模型训练和保存把握好这些主要内容和流程，基本上对分类模
cs231n assignment2(2) 没天赋的学琴
assignment2的第二部分的内容，实现一个卷积神经网络。这一部分主要是实现卷积神经网络中的一些所需用到的layer类型：卷积层(convolution)和池化层(这里是实现max-pooling)。这部分的实现是不考虑其运行效率，而在真正的实现应用上，卷积神经网络的运行效率是一个很重要的问题。卷积层卷积层是由一个个过滤器(filter)，每个过滤器的尺寸为:，这里的的大小与输入的图像或act
cs231n作业：Assignment1-Softmax Diane小山
softmax.pydefsoftmax_loss_naive(W,X,y,reg):"""Softmaxlossfunction,naiveimplementation(withloops)InputshavedimensionD,thereareCclasses,andweoperateonminibatchesofNexamples.Inputs:-W:Anumpyarrayofshape(
sql统计相同项个数并按名次显示朱辉辉33 java oracle
现在有如下这样一个表： A表 ID Name time ------------------------------ 0001 aaa 2006-11-18 0002 ccc 2006-11-18 0003 eee 2006-11-18 0004 aaa 2006-11-18 0005 eee 2006-11-18 0004 aaa 2006-11-18 0002 ccc 20
Android+Jquery Mobile学习系列-目录白糖_ JQuery Mobile
最近在研究学习基于Android的移动应用开发，准备给家里人做一个应用程序用用。向公司手机移动团队咨询了下，觉得使用Android的WebView上手最快，因为WebView等于是一个内置浏览器，可以基于html页面开发，不用去学习Android自带的七七八八的控件。然后加上Jquery mobile的样式渲染和事件等，就能非常方便的做动态应用了。从现在起，往后一段时间，我打算
如何给线程池命名 daysinsun 线程池
在系统运行后，在线程快照里总是看到线程池的名字为pool-xx，这样导致很不好定位，怎么给线程池一个有意义的名字呢。参照ThreadPoolExecutor类的ThreadFactory，自己实现ThreadFactory接口，重写newThread方法即可。参考代码如下： public class Named
IE 中"HTML Parsing Error:Unable to modify the parent container element before the 周凡杨 html 解析 error readyState
错误： IE 中"HTML Parsing Error:Unable to modify the parent container element before the child element is closed" 现象：同事之间几个IE 测试情况下，有的报这个错，有的不报。经查询资料后，可归纳以下原因。
java上传 g21121 java
我们在做web项目中通常会遇到上传文件的情况，用struts等框架的会直接用的自带的标签和组件，今天说的是利用servlet来完成上传。我们这里利用到commons-fileupload组件，相关jar包可以取apache官网下载：http://commons.apache.org/ 下面是servlet的代码： //定义一个磁盘文件工厂 DiskFileItemFactory fact
SpringMVC配置学习 510888780 spring mvc
spring MVC配置详解现在主流的Web MVC框架除了Struts这个主力外，其次就是Spring MVC了，因此这也是作为一名程序员需要掌握的主流框架，框架选择多了，应对多变的需求和业务时，可实行的方案自然就多了。不过要想灵活运用Spring MVC来应对大多数的Web开发，就必须要掌握它的配置及原理。　　一、Spring MVC环境搭建：（Spring 2.5.6 + Hi
spring mvc-jfreeChart 柱图(1) 布衣凌宇 jfreechart
第一步：下载jfreeChart包，注意是jfreeChart文件lib目录下的，jcommon-1.0.23.jar和jfreechart-1.0.19.jar两个包即可；第二步：配置web.xml; web.xml代码如下 <servlet> <servlet-name>jfreechart</servlet-nam
我的spring学习笔记13-容器扩展点之PropertyPlaceholderConfigurer aijuans Spring3
PropertyPlaceholderConfigurer是个bean工厂后置处理器的实现，也就是BeanFactoryPostProcessor接口的一个实现。关于BeanFactoryPostProcessor和BeanPostProcessor类似。我会在其他地方介绍。PropertyPlaceholderConfigurer可以将上下文（配置文件）中的属性值放在另一个单独的标准java P
java 线程池使用 Runnable&Callable&Future antlove java thread Runnable callable future
1. 创建线程池 ExecutorService executorService = Executors.newCachedThreadPool(); 2. 执行一次线程，调用Runnable接口实现 Future<?> future = executorService.submit(new DefaultRunnable()); System.out.prin
XML语法元素结构的总结百合不是茶 xml 树结构
1.XML介绍1969年 gml (主要目的是要在不同的机器进行通信的数据规范)1985年 sgml standard generralized markup language1993年 html(www网)1998年 xml extensible markup language
改变eclipse编码格式 bijian1013 eclipse 编码格式
1.改变整个工作空间的编码格式改变整个工作空间的编码格式，这样以后新建的文件也是新设置的编码格式。 Eclipse->window->preferences->General->workspace-
javascript中return的设计缺陷 bijian1013 JavaScript AngularJS
代码1： <script> var gisService = (function(window) { return { name:function () { alert(1); } }; })(this); gisService.name(); &l
【持久化框架MyBatis3八】Spring集成MyBatis3 bit1129 Mybatis3
pom.xml配置 Maven的pom中主要包括： MyBatis MyBatis-Spring Spring MySQL-Connector-Java Druid applicationContext.xml配置 <?xml version="1.0" encoding="UTF-8"?> &
java web项目启动时自动加载自定义properties文件 bitray java Web 监听器相对路径
创建一个类 public class ContextInitListener implements ServletContextListener 使得该类成为一个监听器。用于监听整个容器生命周期的，主要是初始化和销毁的。类创建后要在web.xml配置文件中增加一个简单的监听器配置，即刚才我们定义的类。 <listener> <des
用nginx区分文件大小做出不同响应 ronin47
昨晚和前21v的同事聊天，说到我离职后一些技术上的更新。其中有个给某大客户(游戏下载类)的特殊需求设计，因为文件大小差距很大——估计是大版本和补丁的区别——又走的是同一个域名，而squid在响应比较大的文件时，尤其是初次下载的时候，性能比较差，所以拆成两组服务器，squid服务于较小的文件，通过pull方式从peer层获取，nginx服务于较大的文件，通过push方式由peer层分发同步。外部发布
java-67-扑克牌的顺子.从扑克牌中随机抽5张牌，判断是不是一个顺子，即这5张牌是不是连续的.2-10为数字本身，A为1，J为11，Q为12，K为13，而大 bylijinnan java
package com.ljn.base; import java.util.Arrays; import java.util.Random; public class ContinuousPoker { /** * Q67 扑克牌的顺子从扑克牌中随机抽5张牌，判断是不是一个顺子，即这5张牌是不是连续的。 * 2-10为数字本身，A为1，J为1
翟鸿燊老师语录 ccii 翟鸿燊
一、国学应用智慧TAT之亮剑精神A 1. 角色就是人格就像你一回家的时候，你一进屋里面，你已经是儿子，是姑娘啦，给老爸老妈倒怀水吧，你还觉得你是老总呢？还拿派呢？就像今天一样，你们往这儿一坐，你们之间是什么，同学，是朋友。还有下属最忌讳的就是领导向他询问情况的时候，什么我不知道，我不清楚，该你知道的你凭什么不知道
[光速与宇宙]进行光速飞行的一些问题 comsci 问题
在人类整体进入宇宙时代，即将开展深空宇宙探索之前，我有几个猜想想告诉大家仅仅是猜想。。。未经官方证实 1：要在宇宙中进行光速飞行，必须首先获得宇宙中的航行通行证，而这个航行通行证并不是我们平常认为的那种带钢印的证书，是什么呢？下面我来告诉
oracle undo解析 cwqcwqmax9 oracle
oracle undo解析2012-09-24 09:02:01 我来说两句作者：虫师收藏我要投稿 Undo是干嘛用的？ &nb
java中各种集合的详细介绍 dashuaifu java 集合
一，java中各种集合的关系图 Collection 接口的接口对象的集合 ├ List 子接口 &n
卸载windows服务的方法 dcj3sjt126com windows service
卸载Windows服务的方法在Windows中，有一类程序称为服务，在操作系统内核加载完成后就开始加载。这里程序往往运行在操作系统的底层，因此资源占用比较大、执行效率比较高，比较有代表性的就是杀毒软件。但是一旦因为特殊原因不能正确卸载这些程序了，其加载在Windows内的服务就不容易删除了。即便是删除注册表中的相应项目，虽然不启动了，但是系统中仍然存在此项服务，只是没有加载而已。如果安装其他
Warning: The Copy Bundle Resources build phase contains this target's Info.plist dcj3sjt126com ios xcode
http://developer.apple.com/iphone/library/qa/qa2009/qa1649.html Excerpt: You are getting this warning because you probably added your Info.plist file to your Copy Bundle
2014之C++学习笔记（一） Etwo C++Etwo Etwo iterator 迭代器
已经有很长一段时间没有写博客了，可能大家已经淡忘了Etwo这个人的存在，这一年多以来，本人从事了AS的相关开发工作，但最近一段时间，AS在天朝的没落，相信有很多码农也都清楚，现在的页游基本上达到饱和，手机上的游戏基本被unity3D与cocos占据，AS基本没有容身之处。so。。。最近我并不打算直接转型
js跨越获取数据问题记录 haifengwuch jsonp json Ajax
js的跨越问题，普通的ajax无法获取服务器返回的值。第一种解决方案，通过getson，后台配合方式，实现。 Java后台代码： protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException { String ca
蓝色jQuery导航条 ini JavaScript html jquery Web html5
效果体验：http://keleyi.com/keleyi/phtml/jqtexiao/39.htmHTML文件代码： <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>jQuery鼠标悬停上下滑动导航条 - 柯乐义<
linux部署jdk,tomcat,mysql kerryg jdk tomcat linux mysql
1、安装java环境jdk: 一般系统都会默认自带的JDK,但是不太好用，都会卸载了，然后重新安装。 1.1）、卸载：（rpm -qa :查询已经安装哪些软件包； rmp -q 软件包：查询指定包是否已
DOMContentLoaded VS onload VS onreadystatechange mutongwu jquery js
1. DOMContentLoaded 在页面html、script、style加载完毕即可触发，无需等待所有资源（image/iframe）加载完毕。（IE9+） 2. onload是最早支持的事件，要求所有资源加载完毕触发。 3. onreadystatechange 开始在IE引入，后来其它浏览器也有一定的实现。涉及以下 document , applet, embed, fra
sql批量插入数据 qifeifei 批量插入
hi，自己在做工程的时候，遇到批量插入数据的数据修复场景。我的思路是在插入前准备一个临时表，临时表的整理就看当时的选择条件了，临时表就是要插入的数据集，最后再批量插入到数据库中。 WITH tempT AS ( SELECT item_id AS combo_id, item_id, now() AS create_date FROM a
log4j打印日志文件如何实现相对路径到项目工程下 thinkfreer Web log4j 应用服务器日志
最近为了实现统计一个网站的访问量，记录用户的登录信息，以方便站长实时了解自己网站的访问情况，选择了Apache 的log4j,但是在选择相对路径那块卡主了，X度了好多方法(其实大多都是一样的内用，还一个字都不差的)，都没有能解决问题，无奈搞了2天终于解决了，与大家分享一下需求：用户登录该网站时，把用户的登录名,ip,时间。统计到一个txt文档里，以方便其他系统调用此txt。项目名
linux下mysql-5.6.23.tar.gz安装与配置笑我痴狂 mysql linux unix
1.卸载系统默认的mysql [root@localhost ~]# rpm -qa | grep mysql mysql-libs-5.1.66-2.el6_3.x86_64 mysql-devel-5.1.66-2.el6_3.x86_64 mysql-5.1.66-2.el6_3.x86_64 [root@localhost ~]# rpm -e mysql-libs-5.1