【CS231n assignment 2022】Assignment 3 - Part 1, RNN for Image Captioning

Preface

  • Blog homepage: 睡晚不猿序程
  • ⌚ First published: 2022.8.7
  • ⏰ Last updated: 2023.2.8
  • This article is an original work by 睡晚不猿序程, first published on CSDN
  • The author is a humble beginner; if anything in this article is wrong or unclear, please ping me. Many thanks! orz

Related Articles

  • 【CS231n assignment 2022】Assignment 2 - Part 1, initialization and forward/backward propagation for fully connected networks
  • 【CS231n assignment 2022】Assignment 2 - Part 2, optimizers, batch normalization, and layer normalization

Table of Contents

  • Preface
  • 1. Overview
  • 2. RNN_Captioning
    • 2.1 COCO Dataset
    • 2.2 Inspect the Data
    • 2.3 Vanilla RNN: Step Forward
    • 2.4 Vanilla RNN: Step Backward
    • 2.5 Vanilla RNN: Forward
    • 2.6 Vanilla RNN: Backward
    • 2.7 Word Embedding: Forward
    • 2.8 Word Embedding: Backward
    • 2.9 Temporal Softmax Loss
    • 2.10 RNN for Image Captioning
    • 2.11 Overfit RNN Captioning Model on Small Data
    • 2.12 RNN Sampling at Test Time
  • 3. Summary and Preview


1. Overview

The Assignment 2 write-up is only half finished; I successfully procrastinated on it, oops.

This Assignment 3 article, on the other hand, got written as I worked through it and is actually complete, so I'm publishing it first!

Next time I really must post Assignment 2. Sorry, everyone!

This post covers:

  • Implementing an RNN and using it for the image captioning task

Alright, let's get started!


2. RNN_Captioning

2.1 COCO Dataset

The COCO dataset is a widely used image captioning dataset, with 80,000 training images and 40,000 validation images.

The image features have already been extracted for us, to keep the hardware requirements down.

The feature dimension was reduced from 4096 to 512 using PCA.

The raw images are not included, but their URLs are stored in train2014_urls.txt and val2014_urls.txt, so you can download them yourself to have a look.

Captions: handling raw strings is cumbersome, so that has been taken care of as well; we only need to focus on building the network. Every word has a unique ID, so a caption can be represented as a sequence of integers. The mapping is stored in coco2014_vocab.json, and the decode_captions function converts a NumPy array of integer IDs back into strings.

Tokens: several special tokens are added to the vocabulary:

  • <START>, <END>: mark the beginning and end of a caption
  • <UNK>: rare or unusual words are replaced by this token (unknown)
  • <NULL>: used to pad captions to a common length, after the <END> token

We can use the load_coco_data function to load the COCO dataset.

It returns (captions, features, URLs, vocabulary).
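As a quick orientation, here is a minimal sketch of loading the data and decoding a caption. It assumes the load_coco_data and decode_captions helpers from cs231n.coco_utils that the notebook imports; the printed shapes are only indicative.

from cs231n.coco_utils import load_coco_data, decode_captions

data = load_coco_data(pca_features=True)  # dict holding captions, features, urls, vocab
print(data['train_captions'].shape)       # (num_captions, T) array of integer word IDs
print(data['train_features'].shape)       # (num_train_images, 512) PCA-reduced features

# Decode the first training caption back into a string.
print(decode_captions(data['train_captions'][:1], data['idx_to_word']))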

2.2 Inspect the Data

Let's first take a look at the dataset.

Note that the code in this cell needs the following modification:

# Sample a minibatch and show the images and captions.
# If you get an error, the URL just no longer exists, so don't worry!
# You can re-sample as many times as you want.
batch_size = 3

captions, features, urls = sample_coco_minibatch(data, batch_size=batch_size)
for i, (caption, url) in enumerate(zip(captions, urls)):
    # plt.imshow(image_from_url(url))   # this call blocks on the image download, so we just print the URL instead
    print(url)
    plt.figure(figsize=(5, 0.5))
    plt.axis('off')
    caption_str = decode_captions(caption, data['idx_to_word'])
    plt.title(caption_str)
    plt.show()

(figure: sampled COCO images with their decoded captions)

We can see the decoded captions in the output.

Next we implement the RNN itself. For the image captioning task, cs231n/rnn_layers.py contains the individual layers needed to build the RNN, and cs231n/classifiers/rnn.py uses those layers to build an image captioning model.

We will first implement the layers in cs231n/rnn_layers.py.

The LSTM is a variant of the RNN; we set LSTMs aside for now.

2.3 Vanilla RNN: Step Forward

We start with the forward pass for a single timestep.

The RNN forward step multiplies the input x by the weight Wx, multiplies the previous hidden state h by the weight Wh, adds the bias b, and passes the sum through a tanh activation to produce this timestep's hidden state:

$$h_t = \tanh(x_t W_x + h_{t-1} W_h + b)$$

def rnn_step_forward(x, prev_h, Wx, Wh, b):
    """Run the forward pass for a single timestep of a vanilla RNN using a tanh activation function.

    The input data has dimension D, the hidden state has dimension H,
    and the minibatch is of size N.

    Inputs:
    - x: Input data for this timestep, of shape (N, D)
    - prev_h: Hidden state from previous timestep, of shape (N, H)
    - Wx: Weight matrix for input-to-hidden connections, of shape (D, H)
    - Wh: Weight matrix for hidden-to-hidden connections, of shape (H, H)
    - b: Biases of shape (H,)

    Returns a tuple of:
    - next_h: Next hidden state, of shape (N, H)
    - cache: Tuple of values needed for the backward pass.
    """
    next_h, cache = None, None
    ##############################################################################
    # TODO: Implement a single forward step for the vanilla RNN. Store the next  #
    # hidden state and any values you need for the backward pass in the next_h   #
    # and cache variables respectively.                                          #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    hidden_x = x.dot(Wx)  # (N,H)
    hidden_prev_h = prev_h.dot(Wh)    # (N,H)
    next_h = np.tanh(hidden_x+hidden_prev_h+b.reshape(1, -1))
    cache = (x, prev_h, Wx, Wh, next_h)

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return next_h, cache

Code walkthrough

The forward pass is straightforward: just follow the steps above.

Next, let's verify our result.

(screenshot: forward-pass check output)

The result is correct.
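For intuition, here is a small self-contained shape check, a sketch with random data rather than the notebook's exact test cell:

import numpy as np

N, D, H = 3, 10, 4
x = np.random.randn(N, D)
prev_h = np.random.randn(N, H)
Wx = np.random.randn(D, H)
Wh = np.random.randn(H, H)
b = np.random.randn(H)

next_h, _ = rnn_step_forward(x, prev_h, Wx, Wh, b)
print(next_h.shape)          # (3, 4): one hidden state per sequence
print(np.abs(next_h).max())  # < 1, since tanh saturates at +/-1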

2.4 Vanilla RNN: Step Backward

Next we implement the backward pass for a single timestep. Backpropagation through an RNN step is fairly simple: carefully differentiating term by term gives everything we need.

(figure: derivation of the single-step backward formulas)

The figure above derives the backward-pass formulas; following those steps, we can compute all the gradients flowing back.

def rnn_step_backward(dnext_h, cache):
    """Backward pass for a single timestep of a vanilla RNN.

    Inputs:
    - dnext_h: Gradient of loss with respect to next hidden state, of shape (N, H)
    - cache: Cache object from the forward pass

    Returns a tuple of:
    - dx: Gradients of input data, of shape (N, D)
    - dprev_h: Gradients of previous hidden state, of shape (N, H)
    - dWx: Gradients of input-to-hidden weights, of shape (D, H)
    - dWh: Gradients of hidden-to-hidden weights, of shape (H, H)
    - db: Gradients of bias vector, of shape (H,)
    """
    dx, dprev_h, dWx, dWh, db = None, None, None, None, None
    ##############################################################################
    # TODO: Implement the backward pass for a single step of a vanilla RNN.      #
    #                                                                            #
    # HINT: For the tanh function, you can compute the local derivative in terms #
    # of the output value from tanh.                                             #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    x, prev_h, Wx, Wh, next_h = cache
    dout = (1-next_h**2)*dnext_h  # (N,H)  note: tanh'(z) = 1 - tanh(z)^2
    db = np.sum(dout, axis=0)  # (N,H)->(H,)
    dx = dout.dot(Wx.T)  # (N,H)*(H,D)->(N,D)
    dWx = x.T.dot(dout)  # (D,N)*(N,H)->(D,H)
    dprev_h = dout.dot(Wh.T)
    dWh = prev_h.T.dot(dout)

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return dx, dprev_h, dWx, dWh, db

Code walkthrough

Work through it step by step and the task is done!

Next we check whether our implementation is correct.

(screenshot: backward-pass gradient check output)

The check passes; the answer is correct.
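If you want to reproduce the check yourself, a sketch along the following lines works, assuming the eval_numerical_gradient_array helper that ships with the assignment:

import numpy as np
from cs231n.gradient_check import eval_numerical_gradient_array

N, D, H = 4, 5, 6
x, prev_h = np.random.randn(N, D), np.random.randn(N, H)
Wx, Wh, b = np.random.randn(D, H), np.random.randn(H, H), np.random.randn(H)

next_h, cache = rnn_step_forward(x, prev_h, Wx, Wh, b)
dnext_h = np.random.randn(*next_h.shape)

# Numeric gradient w.r.t. x, compared against our analytic gradient.
fx = lambda x: rnn_step_forward(x, prev_h, Wx, Wh, b)[0]
dx_num = eval_numerical_gradient_array(fx, x, dnext_h)
dx, dprev_h, dWx, dWh, db = rnn_step_backward(dnext_h, cache)
print(np.max(np.abs(dx - dx_num)))  # should be tiny, around 1e-9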

2.5 Vanilla RNN: Forward

Now that the single-timestep forward and backward passes are done, we assemble these pieces into a full RNN that can process a whole sequence.

Open cs231n/rnn_layers.py and implement the rnn_forward function, reusing the rnn_step_forward function we just wrote.

First, recall the structure of the RNN forward pass:

(figure: unrolled RNN computation graph)

Then read the function's API to see what we need to do:

def rnn_forward(x, h0, Wx, Wh, b):
    """Run a vanilla RNN forward on an entire sequence of data.

    We assume an input sequence composed of T vectors, each of dimension D. The RNN uses a hidden
    size of H, and we work over a minibatch containing N sequences. After running the RNN forward,
    we return the hidden states for all timesteps.

    Inputs:
    - x: Input data for the entire timeseries, of shape (N, T, D)
    - h0: Initial hidden state, of shape (N, H)
    - Wx: Weight matrix for input-to-hidden connections, of shape (D, H)
    - Wh: Weight matrix for hidden-to-hidden connections, of shape (H, H)
    - b: Biases of shape (H,)

    Returns a tuple of:
    - h: Hidden states for the entire timeseries, of shape (N, T, H)
    - cache: Values needed in the backward pass
    """

Inputs:

  • x: input data, of shape (N, T, D)
  • h0: initial hidden state, of shape (N, H)
  • Wx: weights for x, of shape (D, H)
  • Wh: weights for h, of shape (H, H)
  • b: bias, of shape (H,)

Returns:

  • h: hidden states for all timesteps, of shape (N, T, H)
  • cache: intermediate values for the backward pass

Approach

  • The input x is laid out as (N, T, D), so we convert it to (T, N, D); likewise for h
  • Over T timesteps we produce T hidden states, so the initial hidden state is not stored in h
  • A single for loop over time computes the hidden states

Code walkthrough

def rnn_forward(x, h0, Wx, Wh, b):
    """Run a vanilla RNN forward on an entire sequence of data.

    We assume an input sequence composed of T vectors, each of dimension D. The RNN uses a hidden
    size of H, and we work over a minibatch containing N sequences. After running the RNN forward,
    we return the hidden states for all timesteps.

    Inputs:
    - x: Input data for the entire timeseries, of shape (N, T, D)
    - h0: Initial hidden state, of shape (N, H)
    - Wx: Weight matrix for input-to-hidden connections, of shape (D, H)
    - Wh: Weight matrix for hidden-to-hidden connections, of shape (H, H)
    - b: Biases of shape (H,)

    Returns a tuple of:
    - h: Hidden states for the entire timeseries, of shape (N, T, H)
    - cache: Values needed in the backward pass
    """
    h, cache = None, None
    ##############################################################################
    # TODO: Implement forward pass for a vanilla RNN running on a sequence of    #
    # input data. You should use the rnn_step_forward function that you defined  #
    # above. You can use a for loop to help compute the forward pass.            #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    cache = []
    N, T, D = x.shape
    H = h0.shape[1]
    h = np.zeros((T, N, H))   # h(T,N,H)
    x_trans = np.swapaxes(x, 0, 1)   # x_trans(T,N,D)
    for i in range(T):
        x_now = x_trans[i]    # (N,D)
        if i == 0:
            h_now = h0
        else:
            h_now = h[i-1, :, :]
        next_h, cache_now = rnn_step_forward(x_now, h_now, Wx, Wh, b)
        h[i, :, :] = next_h
        cache.append(cache_now)
    h = np.swapaxes(h, 0, 1)

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return h, cache
  • np.swapaxes() swaps the time and batch dimensions

  • cache starts as an empty list, and each step's cache tuple is appended to it

  • Remember to swap h back to (N, T, H) at the end (an alternative without swapaxes is sketched below)
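On the design choice: the swapaxes transposition is convenient but not required. An equivalent sketch indexes the time dimension directly and writes each step into h[:, t, :]:

# Equivalent loop without swapaxes (behavior matches the version above).
h = np.zeros((N, T, H))
prev_h, cache = h0, []
for t in range(T):
    prev_h, cache_t = rnn_step_forward(x[:, t, :], prev_h, Wx, Wh, b)
    h[:, t, :] = prev_h
    cache.append(cache_t)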

2.6 Vanilla RNN: Backward

Next comes the backward pass; as always, read the API first:

def rnn_backward(dh, cache):
    """Compute the backward pass for a vanilla RNN over an entire sequence of data.

    Inputs:
    - dh: Upstream gradients of all hidden states, of shape (N, T, H)

    NOTE: 'dh' contains the upstream gradients produced by the 
    individual loss functions at each timestep, *not* the gradients
    being passed between timesteps (which you'll have to compute yourself
    by calling rnn_step_backward in a loop).

    Returns a tuple of:
    - dx: Gradient of inputs, of shape (N, T, D)
    - dh0: Gradient of initial hidden state, of shape (N, H)
    - dWx: Gradient of input-to-hidden weights, of shape (D, H)
    - dWh: Gradient of hidden-to-hidden weights, of shape (H, H)
    - db: Gradient of biases, of shape (H,)
    """
  • dh: the upstream gradient of every hidden state
    • Note: dh holds the gradients produced by an independent loss function at each timestep, so we still have to propagate the between-timestep gradients ourselves with a for loop
  • dx: gradient of the inputs
  • dh0: gradient of the initial hidden state
  • dWx: gradient of the weight Wx
  • dWh: gradient of the weight Wh
  • db: gradient of the bias b

Approach

  • As with backprop in a DNN: unrolled over time, the RNN looks like a deep network with shared weights
  • Look carefully at the gradient flow: each hidden state receives gradient from two directions, one from its own timestep's loss and one flowing back from the next timestep
def rnn_backward(dh, cache):
    """Compute the backward pass for a vanilla RNN over an entire sequence of data.

    Inputs:
    - dh: Upstream gradients of all hidden states, of shape (N, T, H)

    NOTE: 'dh' contains the upstream gradients produced by the 
    individual loss functions at each timestep, *not* the gradients
    being passed between timesteps (which you'll have to compute yourself
    by calling rnn_step_backward in a loop).

    Returns a tuple of:
    - dx: Gradient of inputs, of shape (N, T, D)
    - dh0: Gradient of initial hidden state, of shape (N, H)
    - dWx: Gradient of input-to-hidden weights, of shape (D, H)
    - dWh: Gradient of hidden-to-hidden weights, of shape (H, H)
    - db: Gradient of biases, of shape (H,)
    """
    dx, dh0, dWx, dWh, db = None, None, None, None, None
    ##############################################################################
    # TODO: Implement the backward pass for a vanilla RNN running an entire      #
    # sequence of data. You should use the rnn_step_backward function that you   #
    # defined above. You can use a for loop to help compute the backward pass.   #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    N, T, H = dh.shape  # (N,T,H)
    x, prev_h, Wx, Wh, next_h = cache[0]  # unpack one step's cache (we only need x's shape here)
    D = x.shape[1]

    dx = np.zeros((N, T, D))
    dWx = np.zeros((D, H))
    dWh = np.zeros((H, H))
    db = np.zeros(H)
    dh_i = np.zeros((N, H))

    for i in range(T-1, -1, -1):
        dh_i += dh[:, i, :]     # note the +=: there are two gradient flows in the backward pass. Think about why!
        dx_i, dh_i, dWx_i, dWh_i, db_i = rnn_step_backward(dh_i, cache[i])
        dx[:, i, :] = dx_i
        dWx += dWx_i
        dWh += dWh_i
        db += db_i
        if i == 0:
            dh0 = dh_i

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return dx, dh0, dWx, dWh, db

Code walkthrough

  • Watch the dh_i variable: at the start of each iteration it holds the gradient flowing back from the following timestep, and we then add the gradient of this timestep's own (independent) loss. The hidden state is used in two places in the forward pass, so the two gradient contributions simply add (see the hand-unrolled sketch below)
  • After the final iteration (i == 0), dh_i is exactly the dh0 we want
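To make the two gradient paths concrete, here is the backward pass hand-unrolled for T = 2 (an illustrative sketch only):

# h1's gradient comes only from its own loss; h0's combines its own loss
# gradient with the dprev_h flowing back from step 1.
dx1, dprev_h1, dWx1, dWh1, db1 = rnn_step_backward(dh[:, 1, :], cache[1])
dx0, dh0, dWx0, dWh0, db0 = rnn_step_backward(dh[:, 0, :] + dprev_h1, cache[0])

# The weights are shared across time, so their gradients accumulate.
dWx, dWh, db = dWx0 + dWx1, dWh0 + dWh1, db0 + db1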

2.7 Word Embedding: Forward

In deep learning systems, we usually represent a word with a vector: every word in the vocabulary gets its own vector, and all the vectors are learned jointly.

Next, implement word_embedding_forward in cs231n/rnn_layers.py to convert words into vectors.

def word_embedding_forward(x, W):
    """Forward pass for word embeddings.

    We operate on minibatches of size N where
    each sequence has length T. We assume a vocabulary of V words, assigning each
    word to a vector of dimension D.

    Inputs:
    - x: Integer array of shape (N, T) giving indices of words. Each element idx
      of x must be in the range 0 <= idx < V.
    - W: Weight matrix of shape (V, D) giving word vectors for all words.

    Returns a tuple of:
    - out: Array of shape (N, T, D) giving word vectors for all input words.
    - cache: Values needed for the backward pass
    """
    out, cache = None, None
    ##############################################################################
    # TODO: Implement the forward pass for word embeddings.                      #
    #                                                                            #
    # HINT: This can be done in one line using NumPy's array indexing.           #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    out, cache = W[x], (x, W)

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return out, cache

Code walkthrough

Word embeddings

A word embedding turns a word into a vector representation; this is often loosely called **"word2vec"**.

A one-hot encoding also represents each word as a vector, but with a vocabulary of 10,000 words the vector length becomes (10000,). Worse, it ignores any relationship between words: every word is treated identically.

"word2vec"

The idea is to learn a mapping from a word to a vector (vec = f(word)); from then on, the vector stands in for the word.

That is exactly the mapping we learn here: the minibatch of size N contains word indices, and each row of the W array is the vector representation of one word.

  • The hint says this can be done in one line with NumPy array indexing; let's unpack that
    • With an index array you can put an array inside the brackets []; the result generally has the same shape as the index array (treat each indexed row as one "object", as in this problem; more on that below)
    • W (V, D): the vector representations of V words; each row is one word vector
    • x (N, T): one minibatch; each training example has T words, stored as indices idx into W
    • W[x]: index with the array x. Suppose x is array([[0,1,2,3]]), i.e. shape (1, 4)
      • The result has the same shape as x, namely array([[W[0], W[1], W[2], W[3]]])
      • Each W[idx] is a vector of size D; treating it as one "object" is why we say the result has the same shape as x
      • Concretely, W[x] has shape (N, T, D)
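A tiny concrete example of this fancy indexing (toy numbers: V=4, D=3, N=2, T=3):

import numpy as np

W = np.arange(12.0).reshape(4, 3)  # 4 word vectors of dimension 3
x = np.array([[0, 1, 2],
              [3, 0, 1]])          # 2 captions, 3 word indices each

out = W[x]                         # out[n, t] == W[x[n, t]]
print(out.shape)                   # (2, 3, 3), i.e. (N, T, D)
print(out[1, 0])                   # W[3] -> [ 9. 10. 11.]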

2.8 Word Embedding: Backward

Having implemented the forward pass above, we now implement the backward pass.

As usual, start by reading the docstring:

def word_embedding_backward(dout, cache):
    """Backward pass for word embeddings.

    We cannot back-propagate into the words
    since they are integers, so we only return gradient for the word embedding
    matrix.

    HINT: Look up the function np.add.at

    Inputs:
    - dout: Upstream gradients of shape (N, T, D)
    - cache: Values from the forward pass

    Returns:
    - dW: Gradient of word embedding matrix, of shape (V, D)
    """
  • We cannot backpropagate into the words, since they are integer indices, so we only return the gradient of the embedding matrix
  • Hint: look up the function np.add.at
  • Inputs
    • dout: upstream gradient, of shape (N, T, D)
    • cache: values from the forward pass
  • Output
    • dW: gradient of the word embedding matrix, of shape (V, D)

np.add.at

This blogger explains it very well; here is the link for reference: https://blog.csdn.net/qq120633269/article/details/110039585

First, indexing with a tuple versus a list gives different results:

  • Indexing with a tuple: each tuple is treated as a single coordinate
  • Indexing with multiple lists (as many lists as the array has dimensions): the first list gives the coordinates along the first dimension, the second list along the second dimension, and so on
  • Indexing with multiple lists (fewer lists than the array has dimensions): the remaining dimensions are taken in full

The whole process of np.add.at(a, b, c) works like this (a tiny demonstration follows the list):

  • Create an all-zero array of shape a[b].shape and broadcast c into it
  • Add it to a[b], then scatter the sums back into the corresponding index positions of the original a
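The reason we need np.add.at instead of plain += is that fancy-indexed += is buffered: repeated indices are only counted once. A tiny demonstration:

import numpy as np

a = np.zeros(4)
idx = np.array([0, 0, 2])  # index 0 appears twice

a[idx] += 1.0              # buffered: a == [1., 0., 1., 0.]
np.add.at(a, idx, 1.0)     # unbuffered: adds once per occurrence
print(a)                   # [3., 0., 2., 0.]

This matters here precisely because, as the TODO note says, words can appear more than once in a sequence.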
def word_embedding_backward(dout, cache):
    """Backward pass for word embeddings.

    We cannot back-propagate into the words
    since they are integers, so we only return gradient for the word embedding
    matrix.

    HINT: Look up the function np.add.at

    Inputs:
    - dout: Upstream gradients of shape (N, T, D)
    - cache: Values from the forward pass

    Returns:
    - dW: Gradient of word embedding matrix, of shape (V, D)
    """
    dW = None
    ##############################################################################
    # TODO: Implement the backward pass for word embeddings.                     #
    #                                                                            #
    # Note that words can appear more than once in a sequence.                   #
    # HINT: Look up the function np.add.at                                       #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    x, W = cache    # x(N,T) W(V,D)
    dW = np.zeros_like(W)   #(V,D)
    np.add.at(dW, x, dout)  # dW[x](N,T,D)  dout(N,T,D)

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return dW

Code walkthrough

Following the explanation above: np.add.at forms dW[x], a tensor of shape (N, T, D) matching dout, adds dout into it, and scatters the sums back into dW, which is exactly the answer we want. Because the scatter is unbuffered, repeated word indices accumulate correctly.

2.9 Temporal Softmax Loss

In an RNN language model, every timestep produces a score for each word in the vocabulary. We know the ground-truth word at every timestep, so we can use a softmax loss to compute the loss and gradient at each timestep; we sum the loss over timesteps and average it over the minibatch.

One wrinkle: different captions can have different lengths, so we pad them with <NULL> tokens to a common length. These <NULL> positions must not contribute to the loss or gradient, so we need a mask selecting the entries that count.

Since this works much like the softmax loss from Assignment 1, it has already been implemented for us; we just need to read the code.

First, its docstring:

def temporal_softmax_loss(x, y, mask, verbose=False):
    """A temporal version of softmax loss for use in RNNs.

    We assume that we are making predictions over a vocabulary of size V for each timestep of a
    timeseries of length T, over a minibatch of size N. The input x gives scores for all vocabulary
    elements at all timesteps, and y gives the indices of the ground-truth element at each timestep.
    We use a cross-entropy loss at each timestep, summing the loss over all timesteps and averaging
    across the minibatch.

    As an additional complication, we may want to ignore the model output at some timesteps, since
    sequences of different length may have been combined into a minibatch and padded with NULL
    tokens. The optional mask argument tells us which elements should contribute to the loss.

    Inputs:
    - x: Input scores, of shape (N, T, V)
    - y: Ground-truth indices, of shape (N, T) where each element is in the range
         0 <= y[i, t] < V
    - mask: Boolean array of shape (N, T) where mask[i, t] tells whether or not
      the scores at x[i, t] should contribute to the loss.

    Returns a tuple of:
    - loss: Scalar giving loss
    - dx: Gradient of loss with respect to scores x.
    """
  • Assume a vocabulary of V words, sequence length T (T words per sentence), and minibatch size N
  • x (shape (N, T, V)) gives the score of every vocabulary word at every timestep
  • y (shape (N, T)) gives the index of the ground-truth word at each timestep
  • The cross-entropy loss is computed at every timestep, summed over time, and averaged over the minibatch
  • mask (shape (N, T)) tells the algorithm which entries contribute to the loss and which do not

Softmax loss

Let's first recall the softmax loss:

$$L_i = -f_{y_i} + \log\Big(\sum_j e^{f_j}\Big) \qquad\text{or equivalently}\qquad L_i = -\log\Big(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\Big)$$

where $f$ denotes the scores; backpropagating through the first form is much simpler.

Differentiating the $\log$ term yields each class's probability; for the correct class we additionally subtract 1.
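Written out, the gradient of the per-example loss with respect to the scores is

$$p_j = \frac{e^{f_j}}{\sum_k e^{f_k}}, \qquad \frac{\partial L_i}{\partial f_j} = p_j - \mathbf{1}[j = y_i]$$

which is exactly what the code below implements with dx_flat[np.arange(N*T), y_flat] -= 1.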

Code walkthrough

def temporal_softmax_loss(x, y, mask, verbose=False):
    """A temporal version of softmax loss for use in RNNs.

    We assume that we are making predictions over a vocabulary of size V for each timestep of a
    timeseries of length T, over a minibatch of size N. The input x gives scores for all vocabulary
    elements at all timesteps, and y gives the indices of the ground-truth element at each timestep.
    We use a cross-entropy loss at each timestep, summing the loss over all timesteps and averaging
    across the minibatch.

    As an additional complication, we may want to ignore the model output at some timesteps, since
    sequences of different length may have been combined into a minibatch and padded with NULL
    tokens. The optional mask argument tells us which elements should contribute to the loss.

    Inputs:
    - x: Input scores, of shape (N, T, V)
    - y: Ground-truth indices, of shape (N, T) where each element is in the range
         0 <= y[i, t] < V
    - mask: Boolean array of shape (N, T) where mask[i, t] tells whether or not
      the scores at x[i, t] should contribute to the loss.

    Returns a tuple of:
    - loss: Scalar giving loss
    - dx: Gradient of loss with respect to scores x.
    """

    N, T, V = x.shape

    x_flat = x.reshape(N * T, V)    # (N,T,V)->(N*T,V)
    y_flat = y.reshape(N * T)       # (N,T)->(N*T)
    mask_flat = mask.reshape(N * T) # (N,T)->(N*T)

    probs = np.exp(x_flat - np.max(x_flat, axis=1, keepdims=True))  # shift scores for numerical stability
    probs /= np.sum(probs, axis=1, keepdims=True)   # softmax: probability of each word, probs (N*T, V)
    loss = -np.sum(mask_flat * np.log(probs[np.arange(N * T), y_flat])) / N # negative log-probability of the correct word, masked, averaged over N
    dx_flat = probs.copy()
    dx_flat[np.arange(N * T), y_flat] -= 1  # subtract 1 at the correct class
    dx_flat /= N    # the loss was averaged over N, so the gradient is too
    dx_flat *= mask_flat[:, None]   # [:, None] makes the mask a column vector so it masks each row

    if verbose:
        print("dx_flat: ", dx_flat.shape)

    dx = dx_flat.reshape(N, T, V)

    return loss, dx

2.10 RNN for Image Captioning

We have now implemented every layer needed to build an image captioning RNN; time to put them together. Open cs231n/classifiers/rnn.py and look at the CaptioningRNN class.

We need to implement the forward and backward passes in the loss function; only the cell_type='rnn' case is required.

Code walkthrough

Reading the provided code carefully first will let us finish the required parts much faster.

  1. Class description
class CaptioningRNN:
    """
    A CaptioningRNN produces captions from image features using a recurrent
    neural network.

    The RNN receives input vectors of size D, has a vocab size of V, works on
    sequences of length T, has an RNN hidden dimension of H, uses word vectors
    of dimension W, and operates on minibatches of size N.

    Note that we don't use any regularization for the CaptioningRNN.
    """
  2. Initialization
def __init__(
        self,
        word_to_idx,
        input_dim=512,
        wordvec_dim=128,
        hidden_dim=128,
        cell_type="rnn",
        dtype=np.float32,
    ):
        """
        Construct a new CaptioningRNN instance.

        Inputs:
        - word_to_idx: A dictionary giving the vocabulary. It contains V entries,
          and maps each string to a unique integer in the range [0, V).
        - input_dim: Dimension D of input image feature vectors.
        - wordvec_dim: Dimension W of word vectors.
        - hidden_dim: Dimension H for the hidden state of the RNN.
        - cell_type: What type of RNN to use; either 'rnn' or 'lstm'.
        - dtype: numpy datatype to use; use float32 for training and float64 for
          numeric gradient checking.
        """
        if cell_type not in {"rnn", "lstm"}:
            raise ValueError('Invalid cell_type "%s"' % cell_type)

        self.cell_type = cell_type
        self.dtype = dtype
        self.word_to_idx = word_to_idx
        self.idx_to_word = {i: w for w, i in word_to_idx.items()}
        self.params = {}

        vocab_size = len(word_to_idx)

        self._null = word_to_idx["<NULL>"]
        self._start = word_to_idx.get("<START>", None)
        self._end = word_to_idx.get("<END>", None)

        # Initialize word vectors
        self.params["W_embed"] = np.random.randn(vocab_size, wordvec_dim)
        self.params["W_embed"] /= 100

        # Initialize CNN -> hidden state projection parameters
        self.params["W_proj"] = np.random.randn(input_dim, hidden_dim)
        self.params["W_proj"] /= np.sqrt(input_dim)
        self.params["b_proj"] = np.zeros(hidden_dim)

        # Initialize parameters for the RNN
        dim_mul = {"lstm": 4, "rnn": 1}[cell_type]
        self.params["Wx"] = np.random.randn(wordvec_dim, dim_mul * hidden_dim)
        self.params["Wx"] /= np.sqrt(wordvec_dim)
        self.params["Wh"] = np.random.randn(hidden_dim, dim_mul * hidden_dim)
        self.params["Wh"] /= np.sqrt(hidden_dim)
        self.params["b"] = np.zeros(dim_mul * hidden_dim)

        # Initialize output to vocab weights (hidden state -> word)
        self.params["W_vocab"] = np.random.randn(hidden_dim, vocab_size)
        self.params["W_vocab"] /= np.sqrt(hidden_dim)
        self.params["b_vocab"] = np.zeros(vocab_size)

        # Cast parameters to correct dtype
        for k, v in self.params.items():
            self.params[k] = v.astype(self.dtype)

Now we get to the part we implement ourselves. Read the API:

def loss(self, features, captions):
        """
        Compute training-time loss for the RNN. We input image features and
        ground-truth captions for those images, and use an RNN (or LSTM) to compute
        loss and gradients on all parameters.

        Inputs:
        - features: Input image features, of shape (N, D)
        - captions: Ground-truth captions; an integer array of shape (N, T + 1) where
          each element is in the range 0 <= y[i, t] < V

        Returns a tuple of:
        - loss: Scalar loss
        - grads: Dictionary of gradients parallel to self.params
        """
        # Cut captions into two pieces: captions_in has everything but the last word
        # and will be input to the RNN; captions_out has everything but the first
        # word and this is what we will expect the RNN to generate. These are offset
        # by one relative to each other because the RNN should produce word (t+1)
        # after receiving word t. The first element of captions_in will be the START
        # token, and the first element of captions_out will be the first word.

Initialize the variables we'll need:

        captions_in = captions[:, :-1]  # words fed into the RNN
        captions_out = captions[:, 1:]  # target words the RNN should generate

        # You'll need this (the mask)
        mask = captions_out != self._null

        # Weight and bias for the affine transform from image features to initial
        # hidden state
        W_proj, b_proj = self.params["W_proj"], self.params["b_proj"]

        # Word embedding matrix
        W_embed = self.params["W_embed"]

        # Input-to-hidden, hidden-to-hidden, and biases for the RNN
        Wx, Wh, b = self.params["Wx"], self.params["Wh"], self.params["b"]

        # Weight and bias for the hidden-to-vocab transformation.
        W_vocab, b_vocab = self.params["W_vocab"], self.params["b_vocab"]

        loss, grads = 0.0, {}

Now we implement the forward pass. The assignment thoughtfully provides step-by-step hints; we just follow them.

        ############################################################################
        # TODO: Implement the forward and backward passes for the CaptioningRNN.   #
        # In the forward pass you will need to do the following:                   #
        # (1) Use an affine transformation to compute the initial hidden state     #
        #     from the image features. This should produce an array of shape (N, H)#
        # (2) Use a word embedding layer to transform the words in captions_in     #
        #     from indices to vectors, giving an array of shape (N, T, W).         #
        # (3) Use either a vanilla RNN or LSTM (depending on self.cell_type) to    #
        #     process the sequence of input word vectors and produce hidden state  #
        #     vectors for all timesteps, producing an array of shape (N, T, H).    #
        # (4) Use a (temporal) affine transformation to compute scores over the    #
        #     vocabulary at every timestep using the hidden states, giving an      #
        #     array of shape (N, T, V).                                            #
        # (5) Use (temporal) softmax to compute loss using captions_out, ignoring  #
        #     the points where the output word is <NULL> using the mask above.    #
        #                                                                          #
        #                                                                          #
        # Do not worry about regularizing the weights or their gradients!          #
        #                                                                          #
        # In the backward pass you will need to compute the gradient of the loss   #
        # with respect to all model parameters. Use the loss and grads variables   #
        # defined above to store loss and gradients; grads[k] should give the      #
        # gradients for self.params[k].                                            #
        #                                                                          #
        # Note also that you are allowed to make use of functions from layers.py   #
        # in your implementation, if needed.                                       #
        ############################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
		
        # Build the initial hidden state from the input image features.
        ini_hidden_state, ini_cache = affine_forward(features, W_proj, b_proj)
        
        # Word embedding: turn the input word indices into word vectors.
        word_emb_in, word_emb_cache = word_embedding_forward(captions_in, W_embed)
        
        # RNN forward pass (through time, i.e. "rightward" in the unrolled graph).
        hidden_state, hidden_cache = rnn_forward(word_emb_in, ini_hidden_state, Wx, Wh, b)
        
        # Compute the vocabulary scores at every timestep (i.e. "upward").
        temp_out, temp_cache = temporal_affine_forward(hidden_state, W_vocab, b_vocab)
        
        # Compute the per-timestep loss (note: the second return value is the gradient dx, not grads).
        loss, dtemp_out = temporal_softmax_loss(temp_out, captions_out, mask)

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        ############################################################################
        #                             END OF YOUR CODE                             #
        ############################################################################

        return loss, grads

Backward pass

Next we implement the backward pass. Here is the diagram once more, to help make sense of the code below.

(figure: unrolled RNN computation graph, shown again for reference)

Code walkthrough

# Compute the loss at every timestep.
loss, dtemp_out = temporal_softmax_loss(temp_out, captions_out, mask)

# Backward pass.
dout, grads["W_vocab"], grads["b_vocab"] = temporal_affine_backward(dtemp_out, temp_cache)

demb, dini_hidden, grads["Wx"], grads["Wh"], grads["b"] = rnn_backward(dout, hidden_cache)

grads["W_embed"] = word_embedding_backward(demb, word_emb_cache)

dfeatures, grads["W_proj"], grads["b_proj"] = affine_backward(dini_hidden, ini_cache)
  • The first line's gradient flow is "top to bottom": the loss at the word outputs is computed and propagated back
  • The second line's gradient flows both "backward through time" and "top to bottom" (two gradient directions)
  • The third line's gradient flow is again "top to bottom", passing the gradient computed at the hidden layer into the word embedding
  • The fourth line's gradient flow is again "backward through time", passing the hidden-layer gradient into the initial projection weights

(figure: gradient flow through the unrolled graph)

A winding procedure indeed, which is why it's called a recurrent neural network!

Verification
Now run the code in the Jupyter notebook to check that our implementation is valid.

(screenshot: loss and gradient check output)

Congratulations, task complete!

2.11 Overfit RNN Captioning Model on Small Data

Just like the Solver class we previously used to train image classification models, this assignment uses a CaptioningSolver class to train image captioning models. Open cs231n/captioning_solver.py and read through the CaptioningSolver class; it will look very familiar.

Once you are familiar with the API, run the cell below to make sure your model can overfit a small sample of 100 training examples, ending with a final loss below 0.1 (a sketch of the driving cell follows).
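For reference, the driving cell looks roughly like the sketch below. The hyperparameters are indicative only, and the CaptioningSolver constructor is assumed to mirror the Solver API from Assignment 2:

small_data = load_coco_data(max_train=100)  # tiny subset we want to overfit

small_rnn_model = CaptioningRNN(
    cell_type='rnn',
    word_to_idx=data['word_to_idx'],
    input_dim=data['train_features'].shape[1],
    hidden_dim=512,
    wordvec_dim=256,
)

small_rnn_solver = CaptioningSolver(
    small_rnn_model, small_data,
    update_rule='adam',
    num_epochs=50,
    batch_size=25,
    optim_config={'learning_rate': 5e-3},
    lr_decay=0.95,
    verbose=True, print_every=10,
)
small_rnn_solver.train()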

Verification

(screenshot: training log of the small-data run)

(figure: training loss curve)

The loss is indeed below 0.1.

Finally, print the training loss:

(screenshot: final training loss output)

The model overfits successfully, which shows it is working.

2.12 RNN Sampling at Test Time

Victory is in sight: just one last small task!

Unlike image classification, an image captioning model behaves very differently at training time and at test time.

At training time we have the ground-truth captions, so we feed the ground-truth words into the RNN as inputs (feeding the model's own outputs here would cause problems).

At test time we sample from the vocabulary distribution at each step and feed the sample back into the RNN.

In cs231n/classifiers/rnn.py, implement the sample function for test-time sampling. Once it's done, run the code below to sample from your overfit model on both the training and validation sets.

Approach

What is different here? During training, the whole sequence is fed in at once, so the input x has shape (N, T, D); now the words are fed in one at a time, so each step's input is an index vector of shape (N,). We therefore have to call rnn_step_forward in a loop and produce the output step by step. With that in mind, let's look at the code.

Code walkthrough

        hidden_state, _ = affine_forward(features, W_proj, b_proj)
        word = self._start*np.ones(N, dtype=np.int32)  # (N,)
        for i in range(max_length):
            word_embed, _ = word_embedding_forward(word, W_embed)
            hidden_state, _ = rnn_step_forward(
                word_embed, hidden_state, Wx, Wh, b)
            scores, _ = affine_forward(hidden_state, W_vocab, b_vocab)
            word = np.argmax(scores, axis=1)
            captions[:, i] = word
  • First, compute the initial hidden state from the input image features (all N images at once)
  • Put the start-token index into the word vector; word holds the current output word for all N captions at once
  • With the initial state ready, enter the loop and run the RNN (a usage sketch follows this list):
    • embed the current word
    • compute the next hidden state from the previous hidden state and the current input
    • compute every word's score from the current hidden state
    • take the index of the highest-scoring word
    • record it in the captions tensor
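Once sample works, decoding its output is straightforward. A hedged usage sketch, mirroring the notebook's sampling cell:

# Sample a couple of validation images, caption them, and decode to strings.
captions, features, urls = sample_coco_minibatch(small_data, split='val', batch_size=2)
sample_captions = small_rnn_model.sample(features)
for url, caption in zip(urls, decode_captions(sample_captions, data['idx_to_word'])):
    print(url, '->', caption)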

Finally, here is the complete code:

class CaptioningRNN:
    """
    A CaptioningRNN produces captions from image features using a recurrent
    neural network.

    The RNN receives input vectors of size D, has a vocab size of V, works on
    sequences of length T, has an RNN hidden dimension of H, uses word vectors
    of dimension W, and operates on minibatches of size N.

    Note that we don't use any regularization for the CaptioningRNN.
    """

    def __init__(
        self,
        word_to_idx,
        input_dim=512,
        wordvec_dim=128,
        hidden_dim=128,
        cell_type="rnn",
        dtype=np.float32,
    ):
        """
        Construct a new CaptioningRNN instance.

        Inputs:
        - word_to_idx: A dictionary giving the vocabulary. It contains V entries,
          and maps each string to a unique integer in the range [0, V).
        - input_dim: Dimension D of input image feature vectors.
        - wordvec_dim: Dimension W of word vectors.
        - hidden_dim: Dimension H for the hidden state of the RNN.
        - cell_type: What type of RNN to use; either 'rnn' or 'lstm'.
        - dtype: numpy datatype to use; use float32 for training and float64 for
          numeric gradient checking.
        """
        if cell_type not in {"rnn", "lstm"}:
            raise ValueError('Invalid cell_type "%s"' % cell_type)

        self.cell_type = cell_type
        self.dtype = dtype
        self.word_to_idx = word_to_idx
        self.idx_to_word = {i: w for w, i in word_to_idx.items()}
        self.params = {}

        vocab_size = len(word_to_idx)

        self._null = word_to_idx["<NULL>"]
        self._start = word_to_idx.get("<START>", None)
        self._end = word_to_idx.get("<END>", None)

        # Initialize word vectors
        self.params["W_embed"] = np.random.randn(vocab_size, wordvec_dim)
        self.params["W_embed"] /= 100

        # Initialize CNN -> hidden state projection parameters
        self.params["W_proj"] = np.random.randn(input_dim, hidden_dim)
        self.params["W_proj"] /= np.sqrt(input_dim)
        self.params["b_proj"] = np.zeros(hidden_dim)

        # Initialize parameters for the RNN
        dim_mul = {"lstm": 4, "rnn": 1}[cell_type]
        self.params["Wx"] = np.random.randn(wordvec_dim, dim_mul * hidden_dim)
        self.params["Wx"] /= np.sqrt(wordvec_dim)
        self.params["Wh"] = np.random.randn(hidden_dim, dim_mul * hidden_dim)
        self.params["Wh"] /= np.sqrt(hidden_dim)
        self.params["b"] = np.zeros(dim_mul * hidden_dim)

        # Initialize output to vocab weights (hidden state -> word)
        self.params["W_vocab"] = np.random.randn(hidden_dim, vocab_size)
        self.params["W_vocab"] /= np.sqrt(hidden_dim)
        self.params["b_vocab"] = np.zeros(vocab_size)

        # Cast parameters to correct dtype
        for k, v in self.params.items():
            self.params[k] = v.astype(self.dtype)

    def loss(self, features, captions):
        """
        Compute training-time loss for the RNN. We input image features and
        ground-truth captions for those images, and use an RNN (or LSTM) to compute
        loss and gradients on all parameters.

        Inputs:
        - features: Input image features, of shape (N, D)
        - captions: Ground-truth captions; an integer array of shape (N, T + 1) where
          each element is in the range 0 <= y[i, t] < V

        Returns a tuple of:
        - loss: Scalar loss
        - grads: Dictionary of gradients parallel to self.params
        """
        # Cut captions into two pieces: captions_in has everything but the last word
        # and will be input to the RNN; captions_out has everything but the first
        # word and this is what we will expect the RNN to generate. These are offset
        # by one relative to each other because the RNN should produce word (t+1)
        # after receiving word t. The first element of captions_in will be the START
        # token, and the first element of captions_out will be the first word.

        captions_in = captions[:, :-1]
        captions_out = captions[:, 1:]

        # You'll need this
        mask = captions_out != self._null

        # Weight and bias for the affine transform from image features to initial
        # hidden state
        # 把图片特征转化成隐藏状态的全连接网路
        W_proj, b_proj = self.params["W_proj"], self.params["b_proj"]

        # Word embedding matrix
        # 词嵌入矩阵
        W_embed = self.params["W_embed"]

        # Input-to-hidden, hidden-to-hidden, and biases for the RNN
        Wx, Wh, b = self.params["Wx"], self.params["Wh"], self.params["b"]

        # Weight and bias for the hidden-to-vocab transformation.
        W_vocab, b_vocab = self.params["W_vocab"], self.params["b_vocab"]

        loss, grads = 0.0, {}
        ############################################################################
        # TODO: Implement the forward and backward passes for the CaptioningRNN.   #
        # In the forward pass you will need to do the following:                   #
        # (1) Use an affine transformation to compute the initial hidden state     #
        #     from the image features. This should produce an array of shape (N, H)#
        # (2) Use a word embedding layer to transform the words in captions_in     #
        #     from indices to vectors, giving an array of shape (N, T, W).         #
        # (3) Use either a vanilla RNN or LSTM (depending on self.cell_type) to    #
        #     process the sequence of input word vectors and produce hidden state  #
        #     vectors for all timesteps, producing an array of shape (N, T, H).    #
        # (4) Use a (temporal) affine transformation to compute scores over the    #
        #     vocabulary at every timestep using the hidden states, giving an      #
        #     array of shape (N, T, V).                                            #
        # (5) Use (temporal) softmax to compute loss using captions_out, ignoring  #
        #     the points where the output word is <NULL> using the mask above.    #
        #                                                                          #
        #                                                                          #
        # Do not worry about regularizing the weights or their gradients!          #
        #                                                                          #
        # In the backward pass you will need to compute the gradient of the loss   #
        # with respect to all model parameters. Use the loss and grads variables   #
        # defined above to store loss and gradients; grads[k] should give the      #
        # gradients for self.params[k].                                            #
        #                                                                          #
        # Note also that you are allowed to make use of functions from layers.py   #
        # in your implementation, if needed.                                       #
        ############################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        # Forward pass
        # Build the initial hidden state from the input image features.
        ini_hidden_state, ini_cache = affine_forward(features, W_proj, b_proj)

        # Word embedding: turn the input word indices into word vectors.
        word_emb_in, word_emb_cache = word_embedding_forward(
            captions_in, W_embed)

        # RNN forward pass (through time, i.e. "rightward" in the unrolled graph).
        hidden_state, hidden_cache = rnn_forward(
            word_emb_in, ini_hidden_state, Wx, Wh, b)

        # Compute the vocabulary scores at every timestep (i.e. "upward").
        temp_out, temp_cache = temporal_affine_forward(
            hidden_state, W_vocab, b_vocab)

        # Compute the per-timestep loss.
        loss, dtemp_out = temporal_softmax_loss(temp_out, captions_out, mask)

        # Backward pass
        dout, grads["W_vocab"], grads["b_vocab"] = temporal_affine_backward(
            dtemp_out, temp_cache)

        demb, dini_hidden, grads["Wx"], grads["Wh"], grads["b"] = rnn_backward(
            dout, hidden_cache)

        grads["W_embed"] = word_embedding_backward(demb, word_emb_cache)

        dfeatures, grads["W_proj"], grads["b_proj"] = affine_backward(
            dini_hidden, ini_cache)

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        ############################################################################
        #                             END OF YOUR CODE                             #
        ############################################################################

        return loss, grads

    def sample(self, features, max_length=30):
        """
        Run a test-time forward pass for the model, sampling captions for input
        feature vectors.

        At each timestep, we embed the current word, pass it and the previous hidden
        state to the RNN to get the next hidden state, use the hidden state to get
        scores for all vocab words, and choose the word with the highest score as
        the next word. The initial hidden state is computed by applying an affine
        transform to the input image features, and the initial word is the <START>
        token.


        For LSTMs you will also have to keep track of the cell state; in that case
        the initial cell state should be zero.

        Inputs:
        - features: Array of input image features of shape (N, D).
        - max_length: Maximum length T of generated captions.

        Returns:
        - captions: Array of shape (N, max_length) giving sampled captions,
          where each element is an integer in the range [0, V). The first element
          of captions should be the first sampled word, not the <START> token.
        """
        N = features.shape[0]
        captions = self._null * np.ones((N, max_length), dtype=np.int32)

        # Unpack parameters
        W_proj, b_proj = self.params["W_proj"], self.params["b_proj"]
        W_embed = self.params["W_embed"]
        Wx, Wh, b = self.params["Wx"], self.params["Wh"], self.params["b"]
        W_vocab, b_vocab = self.params["W_vocab"], self.params["b_vocab"]

        ###########################################################################
        # TODO: Implement test-time sampling for the model. You will need to      #
        # initialize the hidden state of the RNN by applying the learned affine   #
        # transform to the input image features. The first word that you feed to  #
        # the RNN should be the <START> token; its value is stored in the        #
        # variable self._start. At each timestep you will need to do to:          #
        # (1) Embed the previous word using the learned word embeddings           #
        # (2) Make an RNN step using the previous hidden state and the embedded   #
        #     current word to get the next hidden state.                          #
        # (3) Apply the learned affine transformation to the next hidden state to #
        #     get scores for all words in the vocabulary                          #
        # (4) Select the word with the highest score as the next word, writing it #
        #     (the word index) to the appropriate slot in the captions variable   #
        #                                                                         #
        # For simplicity, you do not need to stop generating after an <END> token #
        # is sampled, but you can if you want to.                                 #
        #                                                                         #
        # HINT: You will not be able to use the rnn_forward or lstm_forward       #
        # functions; you'll need to call rnn_step_forward or lstm_step_forward in #
        # a loop.                                                                 #
        #                                                                         #
        # NOTE: we are still working over minibatches in this function. Also if   #
        # you are using an LSTM, initialize the first cell state to zeros.        #
        ###########################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        hidden_state, _ = affine_forward(features, W_proj, b_proj)
        word = self._start*np.ones(N, dtype=np.int32)  # (N,)
        for i in range(max_length):
            word_embed, _ = word_embedding_forward(word, W_embed)
            hidden_state, _ = rnn_step_forward(
                word_embed, hidden_state, Wx, Wh, b)
            scores, _ = affine_forward(hidden_state, W_vocab, b_vocab)
            word = np.argmax(scores, axis=1)
            captions[:, i] = word
        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        ############################################################################
        #                             END OF YOUR CODE                             #
        ############################################################################
        return captions

3. Summary and Preview

At this point we have hand-built an RNN in numpy. An RNN can be unrolled along the time axis, which makes it look a lot like the deep networks we used before; think it through carefully and the key ideas are very much within reach.

Next up I'll post the Assignment 2 update. My publishing order got messy: the Assignment 2 write-up was only half done, while this Assignment 3 one got finished as I worked through it, so it went out first!
