[CS231n Assignment 2 #03] Dropout

Table of Contents

    • Assignment Introduction
    • 1. Forward Pass
    • 2. Backward Pass
    • 3. Fully Connected Network with Dropout
    • 4. Comparison Experiment

Assignment Introduction

  • Lecture: [Lecture 7] Training Neural Networks II
  • Assignment home page: Assignment 2
  • Goal: Dropout [1] is a technique for regularizing neural networks by randomly setting some output activations to zero during the forward pass. In this exercise you will implement a dropout layer and modify your fully connected network to optionally use dropout.
  • Official starter code: Assignment 2 code
  • Assignment notebook: Dropout.ipynb

1. Forward Pass

Code: cs231n/layers.py

def dropout_forward(x, dropout_param):
    """Performs the forward pass for (inverted) dropout.

    Inputs:
    - x: Input data, of any shape
    - dropout_param: A dictionary with the following keys:
      - p: Dropout parameter. We keep each neuron output with probability p.
      - mode: 'test' or 'train'. If the mode is train, then perform dropout;
        if the mode is test, then just return the input.
      - seed: Seed for the random number generator. Passing seed makes this
        function deterministic, which is needed for gradient checking but not
        in real networks.

    Outputs:
    - out: Array of the same shape as x.
    - cache: tuple (dropout_param, mask). In training mode, mask is the dropout
      mask that was used to multiply the input; in test mode, mask is None.

    NOTE: Please implement **inverted** dropout, not the vanilla version of dropout.
    See http://cs231n.github.io/neural-networks-2/#reg for more details.

    NOTE 2: Keep in mind that p is the probability of **keeping** a neuron
    output; this might be contrary to some sources, where it refers to the
    probability of dropping a neuron output.
    """
    p, mode = dropout_param['p'], dropout_param['mode']
    if 'seed' in dropout_param:
        np.random.seed(dropout_param['seed'])

    mask = None
    out = None

    if mode == 'train':
        # Inverted dropout: keep each unit with probability p and scale the
        # survivors by 1/p so the expected activation matches the no-dropout case.
        mask = (np.random.rand(*x.shape) < p) / p
        out = x * mask
    elif mode == 'test':
        out = x

    cache = (dropout_param, mask)
    out = out.astype(x.dtype, copy=False)

    return out, cache

Test:

np.random.seed(231)
x = np.random.randn(500, 500) + 10

for p in [0.25, 0.4, 0.7]:
  out, _ = dropout_forward(x, {'mode': 'train', 'p': p})
  out_test, _ = dropout_forward(x, {'mode': 'test', 'p': p})

  print('Running tests with p = ', p)
  print('Mean of input: ', x.mean())
  print('Mean of train-time output: ', out.mean())
  print('Mean of test-time output: ', out_test.mean())
  print('Fraction of train-time output set to zero: ', (out == 0).mean())
  print('Fraction of test-time output set to zero: ', (out_test == 0).mean())
  print()
OUT:
Running tests with p =  0.25
Mean of input:  10.000207878477502
Mean of train-time output:  10.014059116977283
Mean of test-time output:  10.000207878477502
Fraction of train-time output set to zero:  0.749784
Fraction of test-time output set to zero:  0.0

Running tests with p =  0.4
Mean of input:  10.000207878477502
Mean of train-time output:  9.977917658761159
Mean of test-time output:  10.000207878477502
Fraction of train-time output set to zero:  0.600796
Fraction of test-time output set to zero:  0.0

Running tests with p =  0.7
Mean of input:  10.000207878477502
Mean of train-time output:  9.987811912159426
Mean of test-time output:  10.000207878477502
Fraction of train-time output set to zero:  0.30074
Fraction of test-time output set to zero:  0.0

From the output we can see that roughly a fraction 1 - p of the neurons are zeroed out during training, while the test-time output is unchanged.
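
The train-time mean stays close to the input mean precisely because of the 1/p factor in the mask. A one-line expected-value check (my own sketch, not from the notebook):

\mathbb{E}[\mathrm{out}_i] = p \cdot \frac{x_i}{p} + (1 - p) \cdot 0 = x_i

so inverted dropout preserves the expected activation, and nothing needs to be rescaled at test time.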

2. Backward Pass

def dropout_backward(dout, cache):
    """
    Perform the backward pass for (inverted) dropout.

    Inputs:
    - dout: Upstream derivatives, of any shape
    - cache: (dropout_param, mask) from dropout_forward.
    """
    dropout_param, mask = cache
    mode = dropout_param['mode']

    dx = None
    if mode == 'train':
        # Gradient flows only through the kept units, scaled by the same 1/p
        # factor that was applied in the forward pass.
        dx = dout * mask
    elif mode == 'test':
        dx = dout
    return dx
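
For completeness, the notebook also checks this backward pass against a numeric gradient. A condensed sketch of that check, assuming eval_numerical_gradient_array from cs231n/gradient_check.py is importable and defining the notebook-style rel_error helper locally:

import numpy as np
from cs231n.gradient_check import eval_numerical_gradient_array

def rel_error(x, y):
    # Relative error metric used throughout the assignment notebooks
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

np.random.seed(231)
x = np.random.randn(10, 10) + 10
dout = np.random.randn(*x.shape)

# Fixing the seed makes dropout_forward draw the same mask on every call,
# which is required for the numeric gradient to be meaningful.
dropout_param = {'mode': 'train', 'p': 0.8, 'seed': 123}
out, cache = dropout_forward(x, dropout_param)
dx = dropout_backward(dout, cache)
dx_num = eval_numerical_gradient_array(
    lambda xx: dropout_forward(xx, dropout_param)[0], x, dout)

print('dx relative error: ', rel_error(dx, dx_num))

The relative error should come out very small (on the order of 1e-10).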

Question 1

  • Q: What happens if we do not divide the values being passed through inverse dropout by p in the dropout layer? Why does that happen?
  • A: Without the inverted dropout trick, the output distribution at test time would not match the output distribution at training time. During training only roughly a fraction p of the neurons pass their activations through, but at test time we cannot drop neurons at random (that would make the test outputs stochastic). So we divide the activations by p during training and leave them unchanged at test time, which keeps the expected scale of the activations consistent between the two modes; see the short sketch below.
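
As a quick illustration of the scale mismatch (a toy example of my own, not part of the assignment), dropping units without rescaling shrinks the train-time activations by roughly a factor of p, while test-time activations keep their original scale:

import numpy as np

np.random.seed(0)
x = np.random.randn(500, 500) + 10
p = 0.25  # keep probability

# "Vanilla" dropout without rescaling: units are zeroed, survivors are left untouched
mask = np.random.rand(*x.shape) < p
out_train = x * mask

print('input mean:             %.3f' % x.mean())          # roughly 10
print('train-time output mean: %.3f' % out_train.mean())  # roughly p * 10 = 2.5
# At test time every unit is kept, so the next layer would see inputs roughly
# 1/p times larger than during training unless we multiplied by p at test time.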

3. Fully Connected Network with Dropout

class FullyConnectedNet(object):
    """
    A fully-connected neural network with an arbitrary number of hidden layers,
    ReLU nonlinearities, and a softmax loss function. This will also implement
    dropout and batch/layer normalization as options. For a network with L layers,
    the architecture will be

    {affine - [batch/layer norm] - relu - [dropout]} x (L - 1) - affine - softmax

    where batch/layer normalization and dropout are optional, and the {...} block is
    repeated L - 1 times.

    Similar to the TwoLayerNet above, learnable parameters are stored in the
    self.params dictionary and will be learned using the Solver class.
    """

    def __init__(self, hidden_dims, input_dim=3*32*32, num_classes=10,
                 dropout=1, normalization=None, reg=0.0,
                 weight_scale=1e-2, dtype=np.float32, seed=None):
        """
        Initialize a new FullyConnectedNet.

        Inputs:
        - hidden_dims: A list of integers giving the size of each hidden layer.
        - input_dim: An integer giving the size of the input.
        - num_classes: An integer giving the number of classes to classify.
        - dropout: Scalar between 0 and 1 giving dropout strength. If dropout=1 then
          the network should not use dropout at all.
        - normalization: What type of normalization the network should use. Valid values
          are "batchnorm", "layernorm", or None for no normalization (the default).
        - reg: Scalar giving L2 regularization strength.
        - weight_scale: Scalar giving the standard deviation for random
          initialization of the weights.
        - dtype: A numpy datatype object; all computations will be performed using
          this datatype. float32 is faster but less accurate, so you should use
          float64 for numeric gradient checking.
        - seed: If not None, then pass this random seed to the dropout layers. This
          will make the dropout layers deterministic so we can gradient check the
          model.
        """
        self.normalization = normalization
        self.use_dropout = dropout != 1
        self.reg = reg
        self.num_layers = 1 + len(hidden_dims)
        self.dtype = dtype
        self.params = {}

        ############################################################################
        # TODO: Initialize the parameters of the network, storing all values in    #
        # the self.params dictionary. Store weights and biases for the first layer #
        # in W1 and b1; for the second layer use W2 and b2, etc.                   #
        # When using batch normalization, store scale and shift parameters for the #
        # first layer in gamma1 and beta1; for the second layer use gamma2 and     #
        # beta2, etc. Scale parameters should be initialized to ones and shift     #
        # parameters should be initialized to zeros.                               #
        ############################################################################
        input_size = input_dim
        for i in range(len(hidden_dims)):
            output_size = hidden_dims[i]
            self.params['W' + str(i+1)] = np.random.randn(input_size,output_size) * weight_scale
            self.params['b' + str(i+1)] = np.zeros(output_size)
            if self.normalization:
                self.params['gamma' + str(i+1)] = np.ones(output_size)
                self.params['beta' + str(i+1)] = np.zeros(output_size)
            input_size = output_size  # input size of the next layer
        # Output layer: no batch/layer norm parameters
        self.params['W' + str(self.num_layers)] = np.random.randn(input_size,num_classes) * weight_scale
        self.params['b' + str(self.num_layers)] = np.zeros(num_classes)
        # When using dropout we need to pass a dropout_param dictionary to each
        # dropout layer so that the layer knows the dropout probability and the mode
        # (train / test). You can pass the same dropout_param to each dropout layer.
        self.dropout_param = {}
        if self.use_dropout:
            self.dropout_param = {'mode': 'train', 'p': dropout}
            if seed is not None:
                self.dropout_param['seed'] = seed

        # With batch normalization we need to keep track of running means and
        # variances, so we need to pass a special bn_param object to each batch
        # normalization layer. You should pass self.bn_params[0] to the forward pass
        # of the first batch normalization layer, self.bn_params[1] to the forward
        # pass of the second batch normalization layer, etc.
        self.bn_params = []
        if self.normalization=='batchnorm':
            self.bn_params = [{'mode': 'train'} for i in range(self.num_layers - 1)]
        if self.normalization=='layernorm':
            self.bn_params = [{} for i in range(self.num_layers - 1)]

        # Cast all parameters to the correct datatype
        for k, v in self.params.items():
            self.params[k] = v.astype(dtype)


    def loss(self, X, y=None):
        """
        Compute loss and gradient for the fully-connected net.

        Input / output: Same as TwoLayerNet above.
        """
        X = X.astype(self.dtype)
        mode = 'test' if y is None else 'train'

        # Set train/test mode for batchnorm params and dropout param since they
        # behave differently during training and testing.
        if self.use_dropout:
            self.dropout_param['mode'] = mode
        if self.normalization=='batchnorm':
            for bn_param in self.bn_params:
                bn_param['mode'] = mode
        ############################################################################
        # TODO: Implement the forward pass for the fully-connected net, computing  #
        # the class scores for X and storing them in the scores variable.          #
        #                                                                          #
        # When using dropout, you'll need to pass self.dropout_param to each       #
        # dropout forward pass.                                                    #
        #                                                                          #
        # When using batch normalization, you'll need to pass self.bn_params[0] to #
        # the forward pass for the first batch normalization layer, pass           #
        # self.bn_params[1] to the forward pass for the second batch normalization #
        # layer, etc.                                                              #
        ############################################################################
        cache = {}  # caches needed for the backward pass
        cache_dropout = {}
        hidden = X
        for i in range(self.num_layers - 1):
            if self.normalization == 'batchnorm':
                hidden,cache[i+1] = affine_bn_relu_forward(hidden,
                                    self.params['W' + str(i+1)],
                                    self.params['b' + str(i+1)],
                                    self.params['gamma' + str(i+1)],
                                    self.params['beta' + str(i+1)],
                                    self.bn_params[i])
            elif self.normalization == 'layernorm':
                hidden, cache[i + 1] = affine_ln_relu_forward(hidden,
                                        self.params['W' + str(i + 1)],
                                        self.params['b' + str(i + 1)],
                                        self.params['gamma' + str(i + 1)],
                                        self.params['beta' + str(i + 1)],
                                        self.bn_params[i])
            else:
                hidden , cache[i+1] = affine_relu_forward(hidden,self.params['W' + str(i+1)],
                                                          self.params['b' + str(i+1)])
            if self.use_dropout:
                hidden , cache_dropout[i+1] = dropout_forward(hidden,self.dropout_param)
        # The last layer is a plain affine layer (no activation, no dropout)
        scores, cache[self.num_layers] = affine_forward(hidden , self.params['W' + str(self.num_layers)],
                                                       self.params['b' + str(self.num_layers)])

        # If test mode return early
        if mode == 'test':
            return scores

        ############################################################################
        # TODO: Implement the backward pass for the fully-connected net. Store the #
        # loss in the loss variable and gradients in the grads dictionary. Compute #
        # data loss using softmax, and make sure that grads[k] holds the gradients #
        # for self.params[k]. Don't forget to add L2 regularization!               #
        #                                                                          #
        # When using batch/layer normalization, you don't need to regularize the scale   #
        # and shift parameters.                                                    #
        #                                                                          #
        # NOTE: To ensure that your implementation matches ours and you pass the   #
        # automated tests, make sure that your L2 regularization includes a factor #
        # of 0.5 to simplify the expression for the gradient.                      #
        ############################################################################
        loss, grads = 0.0, {}
        loss, dS = softmax_loss(scores , y)
        # No ReLU after the last affine layer
        dhidden, grads['W' + str(self.num_layers)], grads['b' + str(self.num_layers)] \
            = affine_backward(dS,cache[self.num_layers])
        loss += 0.5 * self.reg * np.sum(self.params['W' + str(self.num_layers)] * self.params['W' + str(self.num_layers)])
        grads['W' + str(self.num_layers)] += self.reg * self.params['W' + str(self.num_layers)]

        for i in range(self.num_layers - 1, 0, -1):
            loss += 0.5 * self.reg * np.sum(self.params["W" + str(i)] * self.params["W" + str(i)])
            # Backpropagate through the hidden layers in reverse order
            if self.use_dropout:
                dhidden = dropout_backward(dhidden,cache_dropout[i])
            if self.normalization == 'batchnorm':
                dhidden, dw, db, dgamma, dbeta = affine_bn_relu_backward(dhidden, cache[i])
                grads['gamma' + str(i)] = dgamma
                grads['beta' + str(i)] = dbeta
            elif self.normalization == 'layernorm':
                dhidden, dw, db, dgamma, dbeta = affine_ln_relu_backward(dhidden, cache[i])
                grads['gamma' + str(i)] = dgamma
                grads['beta' + str(i)] = dbeta
            else:
                dhidden, dw, db = affine_relu_backward(dhidden, cache[i])
            grads['W' + str(i)] = dw + self.reg * self.params['W' + str(i)]
            grads['b' + str(i)] = db
        return loss, grads
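
Before the comparison experiment, the notebook runs an initial-loss and gradient-check sanity test on this network for several dropout settings. A condensed sketch of that cell, assuming eval_numerical_gradient from cs231n/gradient_check.py and the rel_error helper shown earlier:

np.random.seed(231)
N, D, H1, H2, C = 2, 15, 20, 30, 10
X = np.random.randn(N, D)
y = np.random.randint(C, size=(N,))

for dropout in [1, 0.75, 0.5]:
    print('Running check with dropout = ', dropout)
    # Passing seed fixes the dropout masks so numeric and analytic gradients agree
    model = FullyConnectedNet([H1, H2], input_dim=D, num_classes=C,
                              weight_scale=5e-2, dtype=np.float64,
                              dropout=dropout, seed=123)

    loss, grads = model.loss(X, y)
    print('Initial loss: ', loss)

    for name in sorted(grads):
        f = lambda _: model.loss(X, y)[0]
        grad_num = eval_numerical_gradient(f, model.params[name],
                                           verbose=False, h=1e-5)
        print('%s relative error: %.2e' % (name, rel_error(grad_num, grads[name])))
    print()

With reg left at its default of 0.0, the initial loss should be close to ln(10) ≈ 2.3 and the relative errors should all be small.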

4. Comparison Experiment

As an experiment, we will train networks with several dropout settings, along with a network without dropout, on 500 training examples, and then visualize the training and validation accuracy of these networks over time.

# Train several identical nets with different dropout settings
np.random.seed(231)
num_train = 500
small_data = {
  'X_train': data['X_train'][:num_train],
  'y_train': data['y_train'][:num_train],
  'X_val': data['X_val'],
  'y_val': data['y_val'],
}

solvers = {}
# The original notebook setting is dropout_choices = [1, 0.25]; here we try more values
dropout_choices = [1, 0.75, 0.5, 0.25, 0.1]
for dropout in dropout_choices:
  model = FullyConnectedNet([500], dropout=dropout)
  print(dropout)

  solver = Solver(model, small_data,
                  num_epochs=25, batch_size=100,
                  update_rule='adam',
                  optim_config={
                    'learning_rate': 5e-4,
                  },
                  verbose=False, print_every=100)
  solver.train()
  solvers[dropout] = solver

Training results:

# Plot train and validation accuracies of the trained models

train_accs = []
val_accs = []
for dropout in dropout_choices:
  solver = solvers[dropout]
  print(dropout,"train_acc",max(solver.train_acc_history))
  print(dropout,"val_acc",max(solver.val_acc_history))
  train_accs.append(solver.train_acc_history[-1])
  val_accs.append(solver.val_acc_history[-1])

plt.subplot(3, 1, 1)
for dropout in dropout_choices:
  plt.plot(solvers[dropout].train_acc_history, 'o', label='%.2f dropout' % dropout)
plt.title('Train accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(ncol=2, loc='lower right')
  
plt.subplot(3, 1, 2)
for dropout in dropout_choices:
  plt.plot(solvers[dropout].val_acc_history, 'o', label='%.2f dropout' % dropout)
plt.title('Val accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(ncol=2, loc='lower right')

plt.gcf().set_size_inches(15, 15)
plt.show()

(Figure: training and validation accuracy per epoch for each dropout setting.)
Questions

  • Q: Compare the validation and training accuracies with and without dropout – what do your results suggest about dropout as a regularizer?
  • A: Dropout somewhat reduces the representational capacity of the network during training (fewer neurons are active on each forward pass), but it improves generalization; a network trained with dropout behaves like an ensemble of many weight-sharing sub-networks. However, if the keep probability p is too small, the model has trouble fitting the training set.
  • Q: Suppose we are training a deep fully-connected network for image classification, with dropout after hidden layers (parameterized by keep probability p). How should we modify p, if at all, if we decide to decrease the size of the hidden layers (that is, the number of nodes in each layer)?
  • A: If we reduce the number of neurons in the hidden layers, a reasonable adjustment is to increase p, i.e. keep more neuron outputs, so that the smaller model retains enough capacity to fit the data.

For example, here are the best training and validation accuracies from the code above for the different values of p:

1 train_acc 0.994
1 val_acc 0.317
0.75 train_acc 0.988
0.75 val_acc 0.317
0.5 train_acc 0.99
0.5 val_acc 0.329
0.25 train_acc 0.944
0.25 val_acc 0.337
0.1 train_acc 0.74
0.1 val_acc 0.342
