CS231n Convolutional Neural Networks for Visual Recognition 自翻译

Convolutional Neural Networks (CNNs / ConvNets)
Convolutional Neural Networks are very similar to ordinary Neural Networks from the previous chapter: they are made up of neurons that have learnable weights and biases. Each neuron receives some inputs, performs a dot product and optionally follows it with a non-linearity. The whole network still expresses a single differentiable score function: from the raw image pixels on one end to class scores at the other. And they still have a loss function (e.g. SVM/Softmax) on the last (fully-connected) layer and all the tips/tricks we developed for learning regular Neural Networks still apply.
卷积神经网络与前一章中的普通神经网络(注:参见吴恩达机器学习)非常相似:它们由具有可学习的权重和偏差的神经元组成。每一个神经元接收一些输入,执行一次点积,然后可选地接一个非线性函数。整个网络仍然表示一个单一的可微分的评分函数:从一端的原始图像像素到另一端的类别分数。并且,它们在最后一层(全连接层)仍然有一个损失函数(SVM:支持向量机;Softmax:归一化指数函数,与Sigmoid类似,对数组操作),我们为训练常规神经网络而总结的所有技巧仍然适用。

So what changes? ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture. These then make the forward function more efficient to implement and vastly reduce the amount of parameters in the network.
那么,什么改变了?卷积神经网络的体系结构明确假定输入是图像,这使我们能够把某些属性编码到体系结构中。这样可以让前向函数的实现更加高效,并大大减少网络中的参数量。

Architecture Overview
Recall: Regular Neural Nets. As we saw in the previous chapter, Neural Networks receive an input (a single vector), and transform it through a series of hidden layers. Each hidden layer is made up of a set of neurons, where each neuron is fully connected to all neurons in the previous layer, and where neurons in a single layer function completely independently and do not share any connections. The last fully-connected layer is called the “output layer” and in classification settings it represents the class scores.
回忆:常规神经网络。正如我们在前一章看到的,神经网络接收一个输入(单个向量),然后通过一系列隐藏层对其进行变换。每一个隐藏层由一组神经元组成,其中每一个神经元都与前一层的所有神经元全连接;而同一层内的神经元完全独立工作,不共享任何连接。最后一个全连接层被称为"输出层",在分类任务中它表示各类别的分数。

Regular Neural Nets don’t scale well to full images. In CIFAR-10, images are only of size 32x32x3 (32 wide, 32 high, 3 color channels), so a single fully-connected neuron in a first hidden layer of a regular Neural Network would have 32*32*3 = 3072 weights. This amount still seems manageable, but clearly this fully-connected structure does not scale to larger images. For example, an image of more respectable size, e.g. 200x200x3, would lead to neurons that have 200*200*3 = 120,000 weights. Moreover, we would almost certainly want to have several such neurons, so the parameters would add up quickly! Clearly, this full connectivity is wasteful and the huge number of parameters would quickly lead to overfitting.
常规神经网络并不能很好地扩展到完整的图像。在CIFAR-10(图像数据集)中,图像的大小仅为32x32x3(宽32,高32,颜色通道3),所以常规神经网络第一个隐藏层中的单个全连接神经元将会拥有32*32*3 = 3072个权重。这个数量看起来还可控,但是很明显这种全连接结构并不能扩展到更大的图像。例如,一张尺寸更大的图像(如200x200x3)将会导致每个神经元有200*200*3 = 120,000个权重。此外,我们几乎肯定希望有好几个这样的神经元,所以参数会迅速增加。显然,这种全连接是浪费的,而且大量的参数很快就会导致过拟合。
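As a quick sanity check of the weight counts above, here is a tiny Python snippet (added for illustration, not part of the original notes) that reproduces the arithmetic:

def fc_weights_per_neuron(width, height, channels=3):
    # Number of weights of one fully-connected neuron looking at the whole image (bias not counted)
    return width * height * channels

print(fc_weights_per_neuron(32, 32))    # 3072, the CIFAR-10 case above
print(fc_weights_per_neuron(200, 200))  # 120000, the 200x200x3 case above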

3D volumes of neurons. Convolutional Neural Networks take advantage of the fact that the input consists of images and they constrain the architecture in a more sensible way. In particular, unlike a regular Neural Network, the layers of a ConvNet have neurons arranged in 3 dimensions: width, height, depth. (Note that the word depth here refers to the third dimension of an activation volume, not to the depth of a full Neural Network, which can refer to the total number of layers in a network.) For example, the input images in CIFAR-10 are an input volume of activations, and the volume has dimensions 32x32x3 (width, height, depth respectively). As we will soon see, the neurons in a layer will only be connected to a small region of the layer before it, instead of all of the neurons in a fully-connected manner. Moreover, the final output layer would for CIFAR-10 have dimensions 1x1x10, because by the end of the ConvNet architecture we will reduce the full image into a single vector of class scores, arranged along the depth dimension. Here is a visualization:
神经元的3D体积。卷积神经网络利用输入由图像组成这一事实,以更合理的方式约束体系结构。特别地,与常规神经网络不同,卷积神经网络的各层把神经元排列在三个维度上:宽度、高度、深度(请注意,此处的"深度"一词指的是激活卷的第三个维度,而不是整个神经网络的深度,后者可以指网络中层的总数)。例如,CIFAR-10中的输入图像是一个激活的输入卷,该卷的尺寸是32x32x3(分别为宽度、高度、深度)。正如我们很快会看到的,一层中的神经元仅连接到前一层的一个小区域,而不是以全连接的方式连接到所有神经元。此外,对于CIFAR-10,最终输出层的尺寸为1x1x10,因为在卷积神经网络的末端,我们会把完整的图像缩减为一个沿深度维度排列的类别分数向量。下面是一个可视化示意:

Left: A regular 3-layer Neural Network. Right: A ConvNet arranges its neurons in three dimensions (width, height, depth), as visualized in one of the layers. Every layer of a ConvNet transforms the 3D input volume to a 3D output volume of neuron activations. In this example, the red input layer holds the image, so its width and height would be the dimensions of the image, and the depth would be 3 (Red, Green, Blue channels).
A ConvNet is made up of Layers. Every Layer has a simple API: It transforms an input 3D volume to an output 3D volume with some differentiable function that may or may not have parameters.
一个卷积神经网络由多个层组成。每一层都有一个简单的API:它通过某个可微分的函数(可能有参数,也可能没有参数)把一个3D输入卷转化为一个3D输出卷。

Layers used to build ConvNets
构建卷积神经网络的图层

As we described above, a simple ConvNet is a sequence of layers, and every layer of a ConvNet transforms one volume of activations to another through a differentiable function. We use three main types of layers to build ConvNet architectures: Convolutional Layer, Pooling Layer, and Fully-Connected Layer (exactly as seen in regular Neural Networks). We will stack these layers to form a full ConvNet architecture.
正如之前描述的,一个简单的卷积神经网络是一组层的序列,网络的每一层都通过一个可微分的函数把一个激活卷转化为另一个激活卷。我们使用三种主要类型的层来构建卷积神经网络体系结构:卷积层、池化层和全连接层(与常规神经网络中完全相同)。我们把这些层堆叠起来,构建一个完整的卷积神经网络结构。

Example Architecture: Overview. We will go into more details below, but a simple ConvNet for CIFAR-10 classification could have the architecture [INPUT - CONV - RELU - POOL - FC]. In more detail:
结构示例,概述:下面我们将仔细介绍,但是一个应用于CIFAR-10分类的简单卷积神经网络可能有这样的结构[输入-卷积-激活函数-池化-全连接]

• INPUT [32x32x3] will hold the raw pixel values of the image, in this case an image of width 32, height 32, and with three color channels R,G,B.
输入[32x32x3]将会保持图片的原始像素值,本例中图像宽32,高32,有三个颜色通道RGB

• CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and a small region they are connected to in the input volume. This may result in volume such as [32x32x12] if we decided to use 12 filters.
卷积层会计算神经元(连接了输入层的局部区域)的输出,每一个神经元都会计算权值和输入卷一部分区域的点积。如果我们使用12个卷积核,可能会生成[32x32x12]这样体积的卷

• RELU layer will apply an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged ([32x32x12]).
RELU(线性整流函数)层会应用一个逐元素的激活函数,例如在0处取阈值的max(0,x)。这使得卷的尺寸保持不变([32x32x12])。

• POOL layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in volume such as [16x16x12].
池化层将沿空间维度进行下采样操作,将卷变成[16x16x12].

• FC (i.e. fully-connected) layer will compute the class scores, resulting in volume of size [1x1x10], where each of the 10 numbers correspond to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer will be connected to all the numbers in the previous volume.
全连接层将会计算类别分数,最终卷的大小为[1x1x10],其中十个数字中的每一个都对应一个类别分数,例如CIFAR-10的十个类别。和普通神经网络一样,正如它的名字所暗示的那样,这一层的每一个神经元都会连接到前一个卷中的所有数字。

In this way, ConvNets transform the original image layer by layer from the original pixel values to the final class scores. Note that some layers contain parameters and other don’t. In particular, the CONV/FC layers perform transformations that are a function of not only the activations in the input volume, but also of the parameters (the weights and biases of the neurons). On the other hand, the RELU/POOL layers will implement a fixed function. The parameters in the CONV/FC layers will be trained with gradient descent so that the class scores that the ConvNet computes are consistent with the labels in the training set for each image.
这样,卷积神经网络就逐层地把原始图像从原始像素值转化为最终的类别分数。注意,有些层包含参数,有些层不包含。特别地,卷积层和全连接层(CONV/FC)所执行的变换不仅是输入卷中激活值的函数,也是参数(神经元的权重和偏差)的函数。另一方面,线性整流层和池化层实现的是一个固定的函数。CONV/FC层中的参数通过梯度下降来训练,以便卷积神经网络计算出的类别分数与训练集中每张图片的标签相一致。

In summary:总结
• A ConvNet architecture is in the simplest case a list of Layers that transform the image volume into an output volume (e.g. holding the class scores)
• There are a few distinct types of Layers (e.g. CONV/FC/RELU/POOL are by far the most popular)
• Each Layer accepts an input 3D volume and transforms it to an output 3D volume through a differentiable function
• Each Layer may or may not have parameters (e.g. CONV/FC do, RELU/POOL don’t)
• Each Layer may or may not have additional hyperparameters (e.g. CONV/FC/POOL do, RELU doesn’t)
• 一个卷积神经网络的结构最简单的情况下是一个将图像卷转化为输出卷(例,保存的类分数)的层序列
• 有几种不同类型的层(例如 CONV/FC/RELU/POOL 是目前最常用的)
• 每一层都通过一个可微分的函数把输入的3D卷转化为输出的3D卷
• 每一层可能有可能没有参数(CONV/FC 有, RELU/POOL 没有)
• 每一层可能有可能没有额外的超参数(CONV/FC/POOL 有, RELU 没有)
超参数hyperparameters ----根据经验进行设定,影响到权重和偏置的大小,比如迭代次数、隐藏层的层数、每层神经元的个数、学习速率等

The activations of an example ConvNet architecture. The initial volume stores the raw image pixels (left) and the last volume stores the class scores (right). Each volume of activations along the processing path is shown as a column. Since it’s difficult to visualize 3D volumes, we lay out each volume’s slices in rows. The last layer volume holds the scores for each class, but here we only visualize the sorted top 5 scores, and print the labels of each one. The full web-based demo is shown in the header of our website. The architecture shown here is a tiny VGG Net, which we will discuss later.
We now describe the individual layers and the details of their hyperparameters and their connectivities.
接下来将描述各层及其超参数和连接关系的细节。

Convolutional Layer
The Conv layer is the core building block of a Convolutional Network that does most of the computational heavy lifting.
卷积层是卷积神经网络的核心构件,它承担的大部分的计算量

Overview and intuition without brain stuff. Let’s first discuss what the CONV layer computes without brain/neuron analogies. The CONV layer’s parameters consist of a set of learnable filters. Every filter is small spatially (along width and height), but extends through the full depth of the input volume. For example, a typical filter on a first layer of a ConvNet might have size 5x5x3 (i.e. 5 pixels width and height, and 3 because images have depth 3, the color channels). During the forward pass, we slide (more precisely, convolve) each filter across the width and height of the input volume and compute dot products between the entries of the filter and the input at any position. As we slide the filter over the width and height of the input volume we will produce a 2-dimensional activation map that gives the responses of that filter at every spatial position. Intuitively, the network will learn filters that activate when they see some type of visual feature such as an edge of some orientation or a blotch of some color on the first layer, or eventually entire honeycomb or wheel-like patterns on higher layers of the network. Now, we will have an entire set of filters in each CONV layer (e.g. 12 filters), and each of them will produce a separate 2-dimensional activation map. We will stack these activation maps along the depth dimension and produce the output volume.
抛开大脑的概述和直觉。让我们首先在不借助大脑/神经元类比的情况下,讨论卷积层在计算什么。卷积层的参数由一组可学习的卷积核构成。每一个卷积核在空间上都比较小(沿着宽度和高度),但是会贯穿输入卷的整个深度。例如,卷积神经网络第一层上一个典型的卷积核可能拥有5x5x3的大小(即宽和高各为5个像素,3是因为图像的深度为3,即颜色通道)。在前向传递的过程中,我们在输入卷的宽度和高度上滑动(更准确地说,卷积)每一个卷积核,并计算卷积核各项与输入在每个位置上的点积。当我们沿着宽和高在输入卷上滑动卷积核时,会生成一个2维激活映射,它给出了该卷积核在每个空间位置上的响应。直观地说,网络会学习这样的卷积核:当它们看到某种视觉特征时就会被激活,例如第一层上某个方向的边缘或某种颜色的斑点,或者最终在网络的较高层上看到整个蜂窝状或轮状的图案。现在,我们在每一个卷积层上都有一整组卷积核(例如12个卷积核),并且它们中的每一个都会生成一个单独的2维激活映射。我们会沿着深度维度堆叠这些激活映射,生成输出卷。

The brain view. If you’re a fan of the brain/neuron analogies, every entry in the 3D output volume can also be interpreted as an output of a neuron that looks at only a small region in the input and shares parameters with all neurons to the left and right spatially (since these numbers all result from applying the same filter). We now discuss the details of the neuron connectivities, their arrangement in space, and their parameter sharing scheme.
大脑视图。如果你是大脑/神经元类比的粉丝,3D输出卷中的每一项都可以理解成一个神经元的输出,这个神经元只关注输入的一小块区域,并且在空间上与左右两侧的所有神经元共享参数(因为这些数值都来自应用同一个卷积核)。我们接下来会讨论神经元连接的细节、它们在空间上的排布以及参数共享的方案。

Local Connectivity. When dealing with high-dimensional inputs such as images, as we saw above it is impractical to connect neurons to all neurons in the previous volume. Instead, we will connect each neuron to only a local region of the input volume. The spatial extent of this connectivity is a hyperparameter called the receptive field of the neuron (equivalently this is the filter size). The extent of the connectivity along the depth axis is always equal to the depth of the input volume. It is important to emphasize again this asymmetry in how we treat the spatial dimensions (width and height) and the depth dimension: The connections are local in space (along width and height), but always full along the entire depth of the input volume.
局部连接。当我们处理像图像这样的高维输入时,正如之前所看到的,将一个神经元连接到前一个卷中的所有神经元是不切实际的。相反,我们只将每个神经元连接到输入卷的一个局部区域。这个连接的空间范围是一个超参数,叫做神经元的感受野(相当于卷积核的大小)。在深度方向上,连接的范围总是等于输入卷的深度。需要再次强调我们在处理空间维度(宽和高)和深度维度时的这种不对称性:连接在空间(宽、高)上是局部的,但在深度上总是贯穿整个输入卷。

Example 1. For example, suppose that the input volume has size [32x32x3], (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter). Notice that the extent of the connectivity along the depth axis must be 3, since this is the depth of the input volume.
例1,假设输入卷的大小为[32x32x3](例如一张RGB的CIFAR-10图像)。如果感受野(或者卷积核)的大小是5x5,那么卷积层中的每一个神经元都会对输入卷中一个[5x5x3]的区域拥有权重,总共5*5*3 = 75个权重(外加1个偏置参数)。注意,沿深度方向上连接的范围必须是3,因为这是输入卷的深度。

Example 2. Suppose an input volume had size [16x16x20]. Then using an example receptive field size of 3x3, every neuron in the Conv Layer would now have a total of 3*3*20 = 180 connections to the input volume. Notice that, again, the connectivity is local in space (e.g. 3x3), but full along the input depth (20).
例2,假设一个输入卷的大小为[16x16x20],然后使用3x3的感受野,那么卷积层中的每一个神经元与输入卷之间总共会有3*3*20 = 180个连接。再次注意,这些连接在空间上是局部的(例如3x3),但在输入深度方向上是完整的(20)。

Left: An example input volume in red (e.g. a 32x32x3 CIFAR-10 image), and an example volume of neurons in the first Convolutional layer. Each neuron in the convolutional layer is connected only to a local region in the input volume spatially, but to the full depth (i.e. all color channels). Note, there are multiple neurons (5 in this example) along the depth, all looking at the same region in the input - see discussion of depth columns in text below.
左图:红色的输入卷(例如一张32x32x3的CIFAR-10图像)和第一个卷积层中的一组示例神经元。卷积层的每一个神经元在空间上仅与输入卷的一个局部区域相连,但在深度上是完全连接的(即所有颜色通道)。注意,这里沿深度方向有多个神经元(本例中为5个),它们都查看输入中的同一块区域,参见下文中关于深度列的讨论。

Right: The neurons from the Neural Network chapter remain unchanged: They still compute a dot product of their weights with the input followed by a non-linearity, but their connectivity is now restricted to be local spatially.
右图:神经网络章节中的神经元保持不变:它们仍然计算权重和输入之间的点积,然后经过一个非线性激活函数,但它们的连接现在被限制在空间上的局部区域。

Spatial arrangement. We have explained the connectivity of each neuron in the Conv Layer to the input volume, but we haven’t yet discussed how many neurons there are in the output volume or how they are arranged. Three hyperparameters control the size of the output volume: the depth, stride and zero-padding. We discuss these next:
空间排布:我们已经解释了卷积层中每一个神经元与输入卷的连接方式,但是还没有讨论输出卷中有多少个神经元,以及它们是如何排布的。三个超参数控制输出卷的大小:深度、步长和零填充。我们接下来逐一讨论:

  1. First, the depth of the output volume is a hyperparameter: it corresponds to the number of filters we would like to use, each learning to look for something different in the input. For example, if the first Convolutional Layer takes as input the raw image, then different neurons along the depth dimension may activate in presence of various oriented edges, or blobs of color. We will refer to a set of neurons that are all looking at the same region of the input as a depth column (some people also prefer the term fibre).
    首先,输出卷的深度是一个超参数:它对应于我们想使用的卷积核的数量,每一个过滤器都在输入中学习查找不同的内容。例如,如果第一个卷积层把原始图像作为输入,那么沿着深度上不同的神经元可能被不同特定方向上的边缘或色块激活。我们倾向于把一组查找相同输入区域的神经元称为深度列(也有人叫“纤维”)

  2. Second, we must specify the stride with which we slide the filter. When the stride is 1 then we move the filters one pixel at a time. When the stride is 2 (or uncommonly 3 or more, though this is rare in practice) then the filters jump 2 pixels at a time as we slide them around. This will produce smaller output volumes spatially.
    其次,我们必须指定滑动卷积核的步长。当步长为1时,卷积核每次滑动一个像素。当步长为2(或者不常见的3或更多,尽管并不常见)时,卷积核滑动一次移动两个像素(跳过一个)。这会导致在空间上产生更小的输出卷。

  3. As we will soon see, sometimes it will be convenient to pad the input volume with zeros around the border. The size of this zero-padding is a hyperparameter. The nice feature of zero padding is that it will allow us to control the spatial size of the output volumes (most commonly as we’ll see soon we will use it to exactly preserve the spatial size of the input volume so the input and output width and height are the same).
    我们很快会看到,有时输入卷进行0填充会非常方便。补0的大小是一个超参数。补0的好处就是可以让我们控制输出卷平面上的大小(我们很快会看到,最常见的我们用补0来保持输入卷在平面上的大小,以使宽度和高度保持相同)

We can compute the spatial size of the output volume as a function of the input volume size (W), the receptive field size of the Conv Layer neurons (F), the stride with which they are applied (S), and the amount of zero padding used (P) on the border. You can convince yourself that the correct formula for calculating how many neurons “fit” is given by (W−F+2P)/S+1. For example for a 7x7 input and a 3x3 filter with stride 1 and pad 0 we would get a 5x5 output. With stride 2 we would get a 3x3 output. Let’s also see one more graphical example:
根据输入卷的大小(W)、卷积层神经元的感受野大小(F)、所使用的步长(S)以及边界上零填充的数量(P),我们可以计算出输出卷的空间大小。你可以自己验证,计算有多少神经元"合适"的正确公式是 (W−F+2P)/S+1。例如,一个7x7的输入和3x3的卷积核,步长为1且不补0,会得到一个5x5的输出;步长为2时,输出为3x3。再看一个图形化的例子:
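A small helper function (a sketch added here for illustration, in Python) makes it easy to check whether a given setting of W, F, S, P "fits" and what output size it produces:

def conv_output_size(W, F, S, P):
    # Spatial output size of a conv layer: (W - F + 2P)/S + 1.
    # Raises an error if the filters would not fit neatly across the input.
    numerator = W - F + 2 * P
    if numerator % S != 0:
        raise ValueError("hyperparameters do not fit: (W - F + 2P) is not divisible by S")
    return numerator // S + 1

print(conv_output_size(7, 3, 1, 0))  # 5
print(conv_output_size(7, 3, 2, 0))  # 3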

Illustration of spatial arrangement. In this example there is only one spatial dimension (x-axis), one neuron with a receptive field size of F = 3, the input size is W = 5, and there is zero padding of P = 1. Left: The neuron strided across the input in stride of S = 1, giving output of size (5 - 3 + 2)/1+1 = 5. Right: The neuron uses stride of S = 2, giving output of size (5 - 3 + 2)/2+1 = 3. Notice that stride S = 3 could not be used since it wouldn’t fit neatly across the volume. In terms of the equation, this can be determined since (5 - 3 + 2) = 4 is not divisible by 3.
空间布局示意图。在本例中,空间上只有一个维度(x轴),一个神经元的感受野F=3,输入大小W=5,零填充P=1。左图:神经元以S=1的步长滑过输入,输出大小为 (5-3+2)/1+1 = 5;右图:神经元使用S=2的步长,输出大小为 (5-3+2)/2+1 = 3。注意,不能使用步长S=3,因为卷积核无法整齐地覆盖整个卷;用公式来说,(5-3+2)=4 不能被3整除。

The neuron weights are in this example [1,0,-1] (shown on very right), and its bias is zero. These weights are shared across all yellow neurons (see parameter sharing below).
本例中神经元权值为[1,0,-1],其偏差为0.这些权重在黄色神经元中共享。

Use of zero-padding. In the example above on left, note that the input dimension was 5 and the output dimension was equal: also 5. This worked out so because our receptive fields were 3 and we used zero padding of 1. If there was no zero-padding used, then the output volume would have had spatial dimension of only 3, because that it is how many neurons would have “fit” across the original input. In general, setting zero padding to be P=(F−1)/2 when the stride is S=1 ensures that the input volume and output volume will have the same size spatially. It is very common to use zero-padding in this way and we will discuss the full reasons when we talk more about ConvNet architectures.
使用0填充。在之前左侧的示例中,注意输入维度是5,输出维度也相同:同样是5。之所以如此,是因为感受野是3,并且使用了大小为1的0填充。如果不使用0填充,那么输出卷的空间维度将只有3,因为只有这么多神经元能"适配"原始输入。一般来说,当步长S=1时,把0填充设置为P=(F−1)/2可以保证输入卷和输出卷在空间上有相同的大小。以这种方式使用0填充非常常见,我们会在讨论卷积神经网络结构时详细说明其全部原因。

Constraints on strides. Note again that the spatial arrangement hyperparameters have mutual constraints. For example, when the input has size W=10, no zero-padding is used P=0, and the filter size is F=3, then it would be impossible to use stride S=2, since (W−F+2P)/S+1=(10−3+0)/2+1=4.5, i.e. not an integer, indicating that the neurons don’t “fit” neatly and symmetrically across the input. Therefore, this setting of the hyperparameters is considered to be invalid, and a ConvNet library could throw an exception or zero pad the rest to make it fit, or crop the input to make it fit, or something. As we will see in the ConvNet architectures section, sizing the ConvNets appropriately so that all the dimensions “work out” can be a real headache, which the use of zero-padding and some design guidelines will significantly alleviate.
步长的限制。再次注意,空间排布的这些超参数之间是相互约束的。例如,当输入大小为W=10、不使用0填充(P=0)、卷积核大小F=3时,就不可能使用步长S=2,因为(W−F+2P)/S+1 = (10−3+0)/2+1 = 4.5,不是整数,表明神经元无法整齐、对称地覆盖整个输入。因此,这样的超参数设置被认为是无效的,卷积神经网络库可能会抛出异常,或者用0填充其余部分使其适配,或者裁剪输入使其适配,等等。我们将在卷积神经网络架构一节中看到,设置合适的卷积神经网络尺寸,使所有维度都能"匹配"是一件很让人头痛的事情,而使用零填充和一些设计准则可以显著缓解这个问题。

Real-world example. The Krizhevsky et al. architecture that won the ImageNet challenge in 2012 accepted images of size [227x227x3]. On the first Convolutional Layer, it used neurons with receptive field size F=11, stride S=4 and no zero padding P=0. Since (227 - 11)/4 + 1 = 55, and since the Conv layer had a depth of K=96, the Conv layer output volume had size [55x55x96]. Each of the 55*55*96 neurons in this volume was connected to a region of size [11x11x3] in the input volume. Moreover, all 96 neurons in each depth column are connected to the same [11x11x3] region of the input, but of course with different weights. As a fun aside, if you read the actual paper it claims that the input images were 224x224, which is surely incorrect because (224 - 11)/4 + 1 is quite clearly not an integer. This has confused many people in the history of ConvNets and little is known about what happened. My own best guess is that Alex used zero-padding of 3 extra pixels that he does not mention in the paper.
现实的例子。2012年赢得ImageNet挑战赛的Krizhevsky等人的网络结构接受了尺寸为[227x227x3]的图像。在第一个卷积层上,使用了感受野大小为F=11、步长S=4、无补零P=0的神经元。由于(227-11)/4+1=55,且卷积层深度K=96,所以卷积层输出卷的大小为[55x55x96]。这个卷中的55*55*96个神经元,每一个都连接到输入卷中一个大小为[11x11x3]的区域。此外,每个深度列中的96个神经元都连接到输入卷中相同的[11x11x3]区域,当然权重不同。有趣的是,如果你阅读实际的论文,它声称输入图像是224x224,这肯定是不正确的,因为(224-11)/4+1显然不是一个整数。这让很多人在卷积神经网络的历史上感到困惑,也没人知道究竟发生了什么。我自己最好的猜测是Alex使用了额外3个像素的零填充,只是他在论文中没有提到。

Parameter Sharing. Parameter sharing scheme is used in Convolutional Layers to control the number of parameters. Using the real-world example above, we see that there are 55*55*96 = 290,400 neurons in the first Conv Layer, and each has 11*11*3 = 363 weights and 1 bias. Together, this adds up to 290400 * 364 = 105,705,600 parameters on the first layer of the ConvNet alone. Clearly, this number is very high.
参数共享。在卷积层中,参数共享机制被用来控制参数的数量。沿用上面实际的例子,我们看到第一个卷积层有55*55*96 = 290,400个神经元,每一个神经元有11*11*3 = 363个权重外加1个偏置。两者相乘,仅第一个卷积层就有290400 * 364 = 105,705,600个参数,这个数量显然非常大。

It turns out that we can dramatically reduce the number of parameters by making one reasonable assumption: That if one feature is useful to compute at some spatial position (x,y), then it should also be useful to compute at a different position (x2,y2). In other words, denoting a single 2-dimensional slice of depth as a depth slice (e.g. a volume of size [55x55x96] has 96 depth slices, each of size [55x55]), we are going to constrain the neurons in each depth slice to use the same weights and bias. With this parameter sharing scheme, the first Conv Layer in our example would now have only 96 unique set of weights (one for each depth slice), for a total of 96*11*11*3 = 34,848 unique weights, or 34,944 parameters (+96 biases). Alternatively, all 55*55 neurons in each depth slice will now be using the same parameters. In practice during back-propagation, every neuron in the volume will compute the gradient for its weights, but these gradients will be added up across each depth slice and only update a single set of weights per slice.
事实证明,我们可以通过一个合理的假设来显著减少参数的数量:如果某个特征在某个空间位置(x,y)上的计算是有用的,那么它在另一个位置(x2,y2)上的计算应该也是有用的。换句话说,把深度方向上的单个2维切片称为一个深度切片(例如,一个大小为[55x55x96]的卷有96个深度切片,每一个大小为[55x55]),我们将约束同一个深度切片中的神经元使用相同的权重和偏置。通过这个参数共享方案,示例中的第一个卷积层现在只会拥有96组不同的权重(每个深度切片一组),总共96*11*11*3 = 34,848个权重,即34,944个参数(加上96个偏置)。也就是说,每一个深度切片上的55*55个神经元现在都使用同一组参数。在实际的反向传播中,卷中的每一个神经元都会计算其权重的梯度,但是这些梯度会在每个深度切片内相加,每个切片只更新一组权重。

Notice that if all neurons in a single depth slice are using the same weight vector, then the forward pass of the CONV layer can in each depth slice be computed as a convolution of the neuron’s weights with the input volume (Hence the name: Convolutional Layer). This is why it is common to refer to the sets of weights as a filter (or a kernel), that is convolved with the input.
注意,如果每个切片中的所有神经元都使用相同的权重向量,那么,卷积层的前向传播在每一个深度切片中都可以被计算成神经元权重和输入卷的卷积(因此得名:卷积层)。这就是为什么通常将一组权重称为卷积核,它是和输入进行卷积。

Example filters learned by Krizhevsky et al. Each of the 96 filters shown here is of size [11x11x3], and each one is shared by the 55*55 neurons in one depth slice. Notice that the parameter sharing assumption is relatively reasonable: If detecting a horizontal edge is important at some location in the image, it should intuitively be useful at some other location as well due to the translationally-invariant structure of images. There is therefore no need to relearn to detect a horizontal edge at every one of the 55*55 distinct locations in the Conv layer output volume.

Note that sometimes the parameter sharing assumption may not make sense. This is especially the case when the input images to a ConvNet have some specific centered structure, where we should expect, for example, that completely different features should be learned on one side of the image than another. One practical example is when the input are faces that have been centered in the image. You might expect that different eye-specific or hair-specific features could (and should) be learned in different spatial locations. In that case it is common to relax the parameter sharing scheme, and instead simply call the layer a Locally-Connected Layer.
注意,有时参数共享的假设可能并没有意义。当输入到卷积神经网络的图像具有某种特定的中心化结构时尤其如此,例如我们期望在图像的一侧学习到与另一侧完全不同的特征。一个实际的例子是,当输入是已经居中于图像中的人脸时,你可能期望在不同的空间位置学习到不同的眼部或头发特征。在这种情况下,通常会放松参数共享方案,转而把这一层称为局部连接层(Locally-Connected Layer)。

Numpy examples. To make the discussion above more concrete, lets express the same ideas but in code and with a specific example. Suppose that the input volume is a numpy array X. Then:
以Numpy为例。为了使上面的讨论更具体,让我们用代码和一个具体的例子来表达相同的想法。假设输入卷是一个Numpy数组X。

• A depth column (or a fibre) at position (x,y) would be the activations X[x,y,:].
在(x,y)位置的一个深度列(或纤维),X[x,y,:]
• A depth slice, or equivalently an activation map at depth d would be the activations X[:,:,d].
一个深度切片,或者相当于在d深度的激活映射,X[:,:,d]
Conv Layer Example. Suppose that the input volume X has shape X.shape: (11,11,4). Suppose further that we use no zero padding (P=0), that the filter size is F=5, and that the stride is S=2. The output volume would therefore have spatial size (11-5)/2+1 = 4, giving a volume with width and height of 4. The activation map in the output volume (call it V), would then look as follows (only some of the elements are computed in this example):
卷积层样例。假设输入卷X的形状为X.shape: (11,11,4)。进一步假设我们不使用0填充(P=0),卷积核大小为F=5,步长为S=2。因此,输出卷的空间大小为(11-5)/2+1 = 4,即得到一个宽和高都为4的卷。输出卷中的激活映射(称之为V)如下所示(本例中只计算了其中一部分元素):
• V[0,0,0] = np.sum(X[:5,:5,:] * W0) + b0
• V[1,0,0] = np.sum(X[2:7,:5,:] * W0) + b0
• V[2,0,0] = np.sum(X[4:9,:5,:] * W0) + b0
• V[3,0,0] = np.sum(X[6:11,:5,:] * W0) + b0
np.sum(数组)代表将数组所有元素相加。
Remember that in numpy, the operation * above denotes elementwise multiplication between the arrays. Notice also that the weight vector W0 is the weight vector of that neuron and b0 is the bias. Here, W0 is assumed to be of shape W0.shape: (5,5,4), since the filter size is 5 and the depth of the input volume is 4. Notice that at each point, we are computing the dot product as seen before in ordinary neural networks. Also, we see that we are using the same weight and bias (due to parameter sharing), and where the dimensions along the width are increasing in steps of 2 (i.e. the stride). To construct a second activation map in the output volume, we would have:
记住,在numpy中,上面的*运算代表数组之间按元素相乘。同时注意,权重向量W0是该神经元的权重向量,b0是偏置。这里假设W0的形状为W0.shape: (5,5,4),因为卷积核的大小为5、输入卷的深度为4。注意,在每一个位置上,我们都像之前在普通神经网络中看到的那样计算点积。同样可以看到,我们使用了相同的权重和偏置(由于参数共享),并且沿宽度方向的索引以2为步长递增(即步长)。为了构建输出卷中的第二个激活映射,我们有:
• V[0,0,1] = np.sum(X[:5,:5,:] * W1) + b1
• V[1,0,1] = np.sum(X[2:7,:5,:] * W1) + b1
• V[2,0,1] = np.sum(X[4:9,:5,:] * W1) + b1
• V[3,0,1] = np.sum(X[6:11,:5,:] * W1) + b1
• V[0,1,1] = np.sum(X[:5,2:7,:] * W1) + b1 (example of going along y)
• V[2,3,1] = np.sum(X[4:9,6:11,:] * W1) + b1 (or along both)

where we see that we are indexing into the second depth dimension in V (at index 1) because we are computing the second activation map, and that a different set of parameters (W1) is now used. In the example above, we are for brevity leaving out some of the other operations the Conv Layer would perform to fill the other parts of the output array V. Additionally, recall that these activation maps are often followed elementwise through an activation function such as ReLU, but this is not shown here.
这里我们看到,我们正在索引V的第二个深度维度(索引为1),因为我们正在计算第二个激活映射,并且现在使用的是另一组参数(W1)。在上面的例子中,为了简洁,我们省略了卷积层为填充输出数组V其余部分而执行的其他一些操作。此外,回想一下,这些激活映射之后通常还会逐元素地经过一个激活函数(例如ReLU),但这里没有展示。
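Putting the indexing above into explicit loops, a naive forward pass for this example could be sketched as follows (illustration only; the shapes and the stride follow the example, while the random initialization is just a stand-in for real learned weights):

import numpy as np

X = np.random.randn(11, 11, 4)      # input volume from the example
W = np.random.randn(5, 5, 4, 2)     # two filters: W0 = W[:,:,:,0], W1 = W[:,:,:,1]
b = np.random.randn(2)              # one bias per filter
S = 2                               # stride
V = np.zeros((4, 4, 2))             # output volume, since (11 - 5)/2 + 1 = 4

for d in range(2):                  # over filters (output depth)
    for i in range(4):              # over output positions along the first spatial axis
        for j in range(4):          # and along the second spatial axis
            patch = X[i*S:i*S+5, j*S:j*S+5, :]                 # local region of the input
            V[i, j, d] = np.sum(patch * W[:, :, :, d]) + b[d]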

Summary. To summarize, the Conv Layer:
• Accepts a volume of size W1×H1×D1
• Requires four hyperparameters:
o Number of filters K,
o their spatial extent F,
o the stride S,
o the amount of zero padding P.
• Produces a volume of size W2×H2×D2 where:
o W2=(W1−F+2P)/S+1
o H2=(H1−F+2P)/S+1 (i.e. width and height are computed equally by symmetry)
o D2=K
• With parameter sharing, it introduces F⋅F⋅D1 weights per filter, for a total of (F⋅F⋅D1)⋅K weights and K biases.
• In the output volume, the d-th depth slice (of size W2×H2) is the result of performing a valid convolution of the d-th filter over the input volume with a stride of S, and then offset by d-th bias.
卷积层总结
• 输入卷体积为W1×H1×D1
• 需要四个超参数
o 卷积核的数量K
o 卷积核空间范围(感受野)F
o 卷积核步长S
o 0填充的数量P
• 生成一个输出卷的体积W2×H2×D2
o W2=(W1−F+2P)/S+1
o H2=(H1−F+2P)/S+1 (即由于对称性,宽和高的计算方式相同)
o D2=K
• 通过参数共享,每一个卷积核有F⋅F⋅D1个参数,所以总共有(F⋅F⋅D1)⋅K个权值和K个偏移量。
• 在输出卷中,第d个深度切片(大小为W2×H2)是用第d个卷积核以步长S在输入卷上做有效卷积,再加上第d个偏置的结果。

A common setting of the hyperparameters is F=3,S=1,P=1. However, there are common conventions and rules of thumb that motivate these hyperparameters. See the ConvNet architectures section below.
超参数通常设置为F=3,S=1,P=1。不过,确实存在一些通用的惯例和经验法则来指导这些超参数的选择,参见下面的卷积神经网络架构一节。

Convolution Demo. Below is a running demo of a CONV layer. Since 3D volumes are hard to visualize, all the volumes (the input volume (in blue), the weight volumes (in red), the output volume (in green)) are visualized with each depth slice stacked in rows. The input volume is of size W1=5,H1=5,D1=3, and the CONV layer parameters are K=2,F=3,S=2,P=1. That is, we have two filters of size 3×3, and they are applied with a stride of 2. Therefore, the output volume size has spatial size (5 - 3 + 2)/2 + 1 = 3. Moreover, notice that a padding of P=1 is applied to the input volume, making the outer border of the input volume zero. The visualization below iterates over the output activations (green), and shows that each element is computed by elementwise multiplying the highlighted input (blue) with the filter (red), summing it up, and then offsetting the result by the bias.
卷积层示例。下面是一个卷积层运行的演示。由于3D卷很难可视化,所有的卷(蓝色的输入卷、红色的权重卷、绿色的输出卷)都按深度切片逐行展开显示。输入卷的大小为W1=5,H1=5,D1=3,卷积层的参数为K=2,F=3,S=2,P=1。也就是说,我们有两个3×3的卷积核,以步长2进行滑动。因此,输出卷的空间大小为(5-3+2)/2+1 = 3。另外注意,输入卷使用了P=1的填充,使输入卷的外边界为零。演示会遍历输出激活(绿色),并展示每个元素都是把高亮的输入(蓝色)与卷积核(红色)按元素相乘、求和,再加上偏置得到的。

Implementation as Matrix Multiplication. Note that the convolution operation essentially performs dot products between the filters and local regions of the input. A common implementation pattern of the CONV layer is to take advantage of this fact and formulate the forward pass of a convolutional layer as one big matrix multiply as follows:
矩阵乘法的实现。注意,卷积操作本质上是在卷积核和输入的局部区域之间做点积。卷积层一个常见的实现模式就是利用这一点,把卷积层的前向传播表述为一个大的矩阵乘法,如下:

  1. The local regions in the input image are stretched out into columns in an operation commonly called im2col. For example, if the input is [227x227x3] and it is to be convolved with 11x11x3 filters at stride 4, then we would take [11x11x3] blocks of pixels in the input and stretch each block into a column vector of size 11*11*3 = 363. Iterating this process in the input at stride of 4 gives (227-11)/4+1 = 55 locations along both width and height, leading to an output matrix X_col of im2col of size [363 x 3025], where every column is a stretched out receptive field and there are 55*55 = 3025 of them in total. Note that since the receptive fields overlap, every number in the input volume may be duplicated in multiple distinct columns.
  2. The weights of the CONV layer are similarly stretched out into rows. For example, if there are 96 filters of size [11x11x3] this would give a matrix W_row of size [96 x 363].
  3. The result of a convolution is now equivalent to performing one large matrix multiply np.dot(W_row, X_col), which evaluates the dot product between every filter and every receptive field location. In our example, the output of this operation would be [96 x 3025], giving the output of the dot product of each filter at each location.
  4. The result must finally be reshaped back to its proper output dimension [55x55x96].

This approach has the downside that it can use a lot of memory, since some values in the input volume are replicated multiple times in X_col. However, the benefit is that there are many very efficient implementations of Matrix Multiplication that we can take advantage of (for example, in the commonly used BLAS API). Moreover, the same im2col idea can be reused to perform the pooling operation, which we discuss next.
这种方法的缺点是会使用大量内存,因为输入卷中的某些值在X_col中被多次复制。但优点是,有许多非常高效的矩阵乘法实现可以利用(例如常用的BLAS API)。此外,同样的im2col思想也可以被重用来执行池化操作,我们接下来会讨论。
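A rough numpy sketch of the im2col idea described in the steps above (the helper name and the random inputs are only for illustration; biases are omitted):

import numpy as np

def im2col(X, F, S):
    # Stretch every FxFxD receptive field of X (shape H x W x D) into one column.
    H, W, D = X.shape
    out_h = (H - F) // S + 1
    out_w = (W - F) // S + 1
    cols = np.zeros((F * F * D, out_h * out_w))
    idx = 0
    for i in range(out_h):
        for j in range(out_w):
            cols[:, idx] = X[i*S:i*S+F, j*S:j*S+F, :].reshape(-1)
            idx += 1
    return cols

X = np.random.randn(227, 227, 3)
W_row = np.random.randn(96, 11 * 11 * 3)           # 96 filters stretched into rows
X_col = im2col(X, F=11, S=4)                       # shape [363 x 3025]
out = np.dot(W_row, X_col)                         # shape [96 x 3025]
out = out.reshape(96, 55, 55).transpose(1, 2, 0)   # reshaped back to [55 x 55 x 96]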
Backpropagation. The backward pass for a convolution operation (for both the data and the weights) is also a convolution (but with spatially-flipped filters). This is easy to derive in the 1-dimensional case with a toy example (not expanded on for now).
反向传播。卷积运算的反向传播(对于数据和权重都是如此)同样也是一个卷积运算(只不过使用了在空间上翻转的卷积核)。用一个一维的简单例子很容易推导出来(这里暂不展开)。
1x1 convolution. As an aside, several papers use 1x1 convolutions, as first investigated by Network in Network. Some people are at first confused to see 1x1 convolutions especially when they come from signal processing background. Normally signals are 2-dimensional so 1x1 convolutions do not make sense (it’s just pointwise scaling). However, in ConvNets this is not the case because one must remember that we operate over 3-dimensional volumes, and that the filters always extend through the full depth of the input volume. For example, if the input is [32x32x3] then doing 1x1 convolutions would effectively be doing 3-dimensional dot products (since the input depth is 3 channels).
1x1卷积。顺便一提,有几篇论文使用了1x1卷积,最早是由Network in Network研究的。有些人第一次看到1x1卷积时会感到困惑,尤其是有信号处理背景的人。通常信号是二维的,所以1x1卷积没有意义(它只是逐点缩放)。但是在卷积神经网络中并非如此,因为必须记住我们操作的是三维的卷,并且卷积核始终贯穿输入卷的整个深度。例如,如果输入是[32x32x3],那么做1x1卷积实际上就是在做三维点积(因为输入深度是3个通道)。
Dilated convolutions. A recent development (e.g. see paper by Fisher Yu and Vladlen Koltun) is to introduce one more hyperparameter to the CONV layer called the dilation. So far we’ve only discussed CONV filters that are contiguous. However, it’s possible to have filters that have spaces between each cell, called dilation. As an example, in one dimension a filter w of size 3 would compute over input x the following: w[0]*x[0] + w[1]*x[1] + w[2]*x[2]. This is dilation of 0. For dilation 1 the filter would instead compute w[0]*x[0] + w[1]*x[2] + w[2]*x[4]; In other words there is a gap of 1 between the applications. This can be very useful in some settings to use in conjunction with 0-dilated filters because it allows you to merge spatial information across the inputs much more agressively with fewer layers. For example, if you stack two 3x3 CONV layers on top of each other then you can convince yourself that the neurons on the 2nd layer are a function of a 5x5 patch of the input (we would say that the effective receptive field of these neurons is 5x5). If we use dilated convolutions then this effective receptive field would grow much quicker.
扩张卷积。最近的一项发展(参见Fisher Yu和Vladlen Koltun的论文)是在卷积层中引入另一个称为扩张(dilation)的超参数。到目前为止,我们只讨论了各单元连续紧挨着的卷积核。但是,卷积核的各个单元之间也可以有间隔,这就称为扩张。例如,在一维情况下,大小为3的卷积核w会在输入x上计算:w[0]*x[0]+w[1]*x[1]+w[2]*x[2],这是扩张为0的情况。对于扩张为1,卷积核则会计算w[0]*x[0]+w[1]*x[2]+w[2]*x[4];换句话说,相邻采样点之间存在1的间隔。在某些场合下,把扩张卷积与普通(0扩张)卷积结合使用非常有用,因为它允许你用更少的层更激进地融合输入的空间信息。例如,如果你把两个3x3的卷积层堆叠在一起,那么你可以确信第二层的神经元是输入中一个5x5区域的函数(我们说这些神经元的有效感受野是5x5)。如果我们使用扩张卷积,那么这个有效感受野会增长得更快(2-dilated两层的感受野就有7)。
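A tiny 1-D sketch of the dilation arithmetic described above (the helper is hypothetical and written only to illustrate the tap spacing):

import numpy as np

def conv1d_dilated(x, w, dilation=0):
    # 'Valid' 1-D correlation with a gap of `dilation` cells between filter taps.
    gap = dilation + 1                     # dilation 0 -> adjacent taps, dilation 1 -> skip one cell
    span = (len(w) - 1) * gap + 1          # input span covered by the filter
    out = []
    for start in range(len(x) - span + 1):
        out.append(sum(w[k] * x[start + k * gap] for k in range(len(w))))
    return np.array(out)

x = np.arange(10.0)
w = np.array([1.0, 2.0, 3.0])
print(conv1d_dilated(x, w, dilation=0))    # w[0]*x[i] + w[1]*x[i+1] + w[2]*x[i+2]
print(conv1d_dilated(x, w, dilation=1))    # w[0]*x[i] + w[1]*x[i+2] + w[2]*x[i+4]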
Pooling Layer
It is common to periodically insert a Pooling layer in-between successive Conv layers in a ConvNet architecture. Its function is to progressively reduce the spatial size of the representation to reduce the amount of parameters and computation in the network, and hence to also control overfitting. The Pooling Layer operates independently on every depth slice of the input and resizes it spatially, using the MAX operation. The most common form is a pooling layer with filters of size 2x2 applied with a stride of 2 downsamples every depth slice in the input by 2 along both width and height, discarding 75% of the activations. Every MAX operation would in this case be taking a max over 4 numbers (little 2x2 region in some depth slice). The depth dimension remains unchanged. More generally, the pooling layer:
通常会在卷积神经网络架构中的连续卷积层之间周期性地插入池化层。其功能是逐步缩小表示的空间尺寸,以减少网络中的参数量和计算量,从而也起到控制过拟合的作用。池化层在输入的每个深度切片上独立操作,使用MAX运算在空间上缩小其尺寸。最常见的形式是使用2x2大小、步长为2的池化层,它沿宽度和高度把输入中的每个深度切片下采样2倍,丢弃75%的激活值。在这种情况下,每个MAX操作都是在4个数字上取最大值(即某个深度切片中2x2的小区域)。深度维度保持不变。更一般地说,池化层:
• Accepts a volume of size W1×H1×D1
• 输入卷大小为W1×H1×D1
• Requires two hyperparameters:
o their spatial extent F,
o the stride S,
• 需要两个超参数
o 空间范围F
o 步长S
• Produces a volume of size W2×H2×D2 where:
o W2=(W1−F)/S+1
o H2=(H1−F)/S+1
o D2=D1
• 输出卷大小W2×H2×D2
o W2=(W1−F)/S+1
o H2=(H1−F)/S+1
o D2=D1
• Introduces zero parameters since it computes a fixed function of the input
• 因为输入中使用固定函数,所以没有引入参数
• For Pooling layers, it is not common to pad the input using zero-padding.
It is worth noting that there are only two commonly seen variations of the max pooling layer found in practice: A pooling layer with F=3,S=2 (also called overlapping pooling), and more commonly F=2,S=2. Pooling sizes with larger receptive fields are too destructive.
对于池化层,使用零填充来补充输入并不常见。值得注意的是,实践中常见的最大池化层只有两种变体:F=3,S=2的池化层(也称为重叠池化),以及更常见的F=2,S=2。感受野更大的池化尺寸破坏性太强。
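A minimal numpy sketch of the common F=2, S=2 max pooling forward pass (illustration only; it assumes the spatial size is divisible by the stride):

import numpy as np

def maxpool_forward(X, F=2, S=2):
    H, W, D = X.shape
    out = np.zeros((H // S, W // S, D))
    for d in range(D):                        # each depth slice is pooled independently
        for i in range(0, H - F + 1, S):
            for j in range(0, W - F + 1, S):
                out[i // S, j // S, d] = np.max(X[i:i+F, j:j+F, d])
    return out

X = np.random.randn(224, 224, 64)
print(maxpool_forward(X).shape)               # (112, 112, 64): spatial size halved, depth preserved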
General pooling. In addition to max pooling, the pooling units can also perform other functions, such as average pooling or even L2-norm pooling. Average pooling was often used historically but has recently fallen out of favor compared to the max pooling operation, which has been shown to work better in practice.
一般池化。除了最大池化,池化单元还可以执行其他功能,例如平均池化甚至L2标准池化。平均池化在过去经常使用,但最近与最大池化操作相比已落于下风,最大池化证明在实际中效果更好。

Pooling layer downsamples the volume spatially, independently in each depth slice of the input volume. Left: In this example, the input volume of size [224x224x64] is pooled with filter size 2, stride 2 into output volume of size [112x112x64]. Notice that the volume depth is preserved. Right: The most common downsampling operation is max, giving rise to max pooling, here shown with a stride of 2. That is, each max is taken over 4 numbers (little 2x2 square).
Backpropagation. Recall from the backpropagation chapter that the backward pass for a max(x, y) operation has a simple interpretation as only routing the gradient to the input that had the highest value in the forward pass. Hence, during the forward pass of a pooling layer it is common to keep track of the index of the max activation (sometimes also called the switches) so that gradient routing is efficient during backpropagation.
反向传播。回想一下反向传播章节,max(x,y)操作的反向传递可以简单地理解为:只把梯度传递给在前向传播中取得最大值的那个输入。因此,在池化层的前向传播过程中,通常会记录最大激活值的位置索引(有时也称为开关),这样在反向传播时梯度的路由就会很高效。
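Continuing the pooling sketch above, the backward pass could route gradients through the max positions roughly like this (again only an illustrative sketch; dout is assumed to be the upstream gradient with the pooled shape):

import numpy as np

def maxpool_backward(dout, X, F=2, S=2):
    # Route each upstream gradient to the input position that held the max
    # in the forward pass (the "switch"); every other position gets zero gradient.
    H, W, D = X.shape
    dX = np.zeros_like(X)
    for d in range(D):
        for i in range(0, H - F + 1, S):
            for j in range(0, W - F + 1, S):
                window = X[i:i+F, j:j+F, d]
                k, l = np.unravel_index(np.argmax(window), window.shape)
                dX[i + k, j + l, d] += dout[i // S, j // S, d]
    return dX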
Getting rid of pooling. Many people dislike the pooling operation and think that we can get away without it. For example, Striving for Simplicity: The All Convolutional Net proposes to discard the pooling layer in favor of architecture that only consists of repeated CONV layers. To reduce the size of the representation they suggest using larger stride in CONV layer once in a while. Discarding pooling layers has also been found to be important in training good generative models, such as variational autoencoders (VAEs) or generative adversarial networks (GANs). It seems likely that future architectures will feature very few to no pooling layers.
摆脱池化层。许多人不喜欢池化操作,并认为可以摆脱池化层。例如,Striving for Simplicity: The All Convolutional Net 建议删去池化层,转而使用仅包含重复卷积层的体系结构。为了减小表示的大小,他们建议偶尔在CONV层中使用更大的步长。还发现在训练良好的生成模型中去掉池化层很重要,例如变分自动编码器(VAE)或生成性对抗网络(GAN)。未来的架构很可能只有很少甚至没有池化层。
Normalization Layer
Many types of normalization layers have been proposed for use in ConvNet architectures, sometimes with the intentions of implementing inhibition schemes observed in the biological brain. However, these layers have since fallen out of favor because in practice their contribution has been shown to be minimal, if any. For various types of normalizations, see the discussion in Alex Krizhevsky’s cuda-convnet library API.
人们为ConvNet架构提出过许多类型的归一化层,有些是想实现在生物大脑中观察到的抑制机制。然而,这些层后来逐渐不再受欢迎,因为在实践中它们的贡献即便有也被证明是微乎其微的。有关各种类型的归一化,请参阅Alex Krizhevsky的cuda-convnet library API中的讨论。
Fully-connected layer
Neurons in a fully connected layer have full connections to all activations in the previous layer, as seen in regular Neural Networks. Their activations can hence be computed with a matrix multiplication followed by a bias offset. See the Neural Network section of the notes for more information.
全连接层中的神经元与前一层中的所有激活值都有完全连接,正如在常规神经网络中看到的那样。因此,它们的激活可以通过一次矩阵乘法再加上偏置来计算。更多信息请参阅笔记的"神经网络"部分。
Converting FC layers to CONV layers
It is worth noting that the only difference between FC and CONV layers is that the neurons in the CONV layer are connected only to a local region in the input, and that many of the neurons in a CONV volume share parameters. However, the neurons in both layers still compute dot products, so their functional form is identical. Therefore, it turns out that it’s possible to convert between FC and CONV layers:
值得注意的是,全连接和卷积层之间的唯一区别是卷积层中的神经元仅连接到输入中的局部区域,并且CONV卷中的许多神经元共享参数。然而,两层中的神经元仍然计算点积,因此它们的功能形式是相同的。因此,事实证明,FC和CONV层之间进行转换是可行的:
• For any CONV layer there is an FC layer that implements the same forward function. The weight matrix would be a large matrix that is mostly zero except for at certain blocks (due to local connectivity) where the weights in many of the blocks are equal (due to parameter sharing).
• 对于任何CONV层,都有一个实现相同前向功能的FC层。权重矩阵将是一个除了在某些块(由于本地连接)之外,其他大部分为零的大矩阵,其中许多块中的权重相等(由于参数共享)。
• Conversely, any FC layer can be converted to a CONV layer. For example, an FC layer with K=4096 that is looking at some input volume of size 7×7×512 can be equivalently expressed as a CONV layer with F=7, P=0, S=1, K=4096. In other words, we are setting the filter size to be exactly the size of the input volume, and hence the output will simply be 1×1×4096 since only a single depth column “fits” across the input volume, giving identical result as the initial FC layer.
• 相反,任何FC层都可以转换为CONV层。例如,一个作用于7×7×512输入卷、K=4096的FC层,可以等价地表示为一个感受野F=7、零填充P=0、步长S=1、卷积核数量K=4096的CONV层。换句话说,我们把卷积核的大小设置为恰好等于输入卷的大小,因此输出就是1×1×4096,因为只有一个深度列能"适配"整个输入卷,从而得到与最初的FC层完全相同的结果(下面给出一个示意代码)。
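A rough numpy sketch of this FC-to-CONV equivalence for the 7×7×512 example (the variable names are made up for illustration; the final check simply verifies that the reshaped weights produce the same numbers):

import numpy as np

X = np.random.randn(7, 7, 512)                  # input volume seen by the FC layer
W_fc = np.random.randn(7 * 7 * 512, 4096)       # FC weight matrix
b_fc = np.random.randn(4096)                    # FC biases

fc_out = X.reshape(-1).dot(W_fc) + b_fc         # ordinary FC forward pass, shape (4096,)

# The same weights viewed as 4096 filters of size 7x7x512 (F=7, P=0, S=1, K=4096):
W_conv = W_fc.T.reshape(4096, 7, 7, 512)
conv_out = np.array([np.sum(X * W_conv[k]) + b_fc[k] for k in range(4096)])   # the [1x1x4096] output

print(np.allclose(fc_out, conv_out))            # True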
FC->CONV conversion. Of these two conversions, the ability to convert an FC layer to a CONV layer is particularly useful in practice. Consider a ConvNet architecture that takes a 224x224x3 image, and then uses a series of CONV layers and POOL layers to reduce the image to an activations volume of size 7x7x512 (in an AlexNet architecture that we’ll see later, this is done by use of 5 pooling layers that downsample the input spatially by a factor of two each time, making the final spatial size 224/2/2/2/2/2 = 7). From there, an AlexNet uses two FC layers of size 4096 and finally the last FC layers with 1000 neurons that compute the class scores. We can convert each of these three FC layers to CONV layers as described above:
FC->CONV转换。在这两种转换中,把FC层转换为CONV层的能力在实践中特别有用。考虑一个接收224x224x3图像的ConvNet架构,它用一系列CONV层和POOL层把图像缩减为一个7x7x512的激活卷(在我们稍后会看到的AlexNet架构中,这是通过5个池化层实现的,每个池化层都把输入在空间上下采样2倍,使最终空间尺寸为224/2/2/2/2/2 = 7)。在此之后,AlexNet使用两个大小为4096的FC层(7x7x512的特征图全连接到4096个神经元,4096再连接到4096),最后一个FC层用1000个神经元来计算类别分数。我们可以按上面描述的方法把这三个FC层逐一转换为CONV层:
• Replace the first FC layer that looks at [7x7x512] volume with a CONV layer that uses filter size F=7, giving output volume [1x1x4096].
• 用一个卷积核大小F=7的CONV层替换第一个作用于[7x7x512]卷的FC层,输出卷为[1x1x4096]。
• Replace the second FC layer with a CONV layer that uses filter size F=1, giving output volume [1x1x4096]
• 使用卷积核F = 1的CONV层替换第二个FC层,给出输出特征图[1x1x4096]
• Replace the last FC layer similarly, with F=1, giving final output [1x1x1000]
• 类似地替换最后一个FC层,F = 1,给出最终输出[1x1x1000]
Each of these conversions could in practice involve manipulating (e.g. reshaping) the weight matrix W in each FC layer into CONV layer filters. It turns out that this conversion allows us to “slide” the original ConvNet very efficiently across many spatial positions in a larger image, in a single forward pass.
实际上,这些转换中的每一个都可以涉及将每个FC层中的权重矩阵W操纵(例如,重新整形)成CONV层卷积核。事实证明,这种转换允许我们在单个前向传递中非常有效地把原始ConvNet在更大图像中的许多空间位置上“滑动”。
For example, if 224x224 image gives a volume of size [7x7x512] - i.e. a reduction by 32, then forwarding an image of size 384x384 through the converted architecture would give the equivalent volume in size [12x12x512], since 384/32 = 12. Following through with the next 3 CONV layers that we just converted from FC layers would now give the final volume of size [6x6x1000], since (12 - 7)/1 + 1 = 6. Note that instead of a single vector of class scores of size [1x1x1000], we’re now getting an entire 6x6 array of class scores across the 384x384 image.
例如,如果224x224的图像得到大小为[7x7x512]的卷(即缩小了32倍),那么把尺寸为384x384的图像输入转换后的架构,将得到大小为[12x12x512]的卷,因为384/32 = 12。接着经过我们刚从FC层转换来的3个CONV层,最终得到大小为[6x6x1000]的卷,因为(12-7)/1+1 = 6。注意,我们得到的不再是单个大小为[1x1x1000]的类别分数向量,而是在整张384x384图像上得到一个完整的6x6的类别分数阵列。
Evaluating the original ConvNet (with FC layers) independently across 224x224 crops of the 384x384 image in strides of 32 pixels gives an identical result to forwarding the converted ConvNet one time.
在384x384的图像上,以32像素的步长对每个224x224的裁剪区域独立地运行原始ConvNet(带FC层),其结果与把转换后的ConvNet前向运行一次完全相同。
Naturally, forwarding the converted ConvNet a single time is much more efficient than iterating the original ConvNet over all those 36 locations, since the 36 evaluations share computation. This trick is often used in practice to get better performance, where for example, it is common to resize an image to make it bigger, use a converted ConvNet to evaluate the class scores at many spatial positions and then average the class scores.
自然地,一次性输入转换后的ConvNet比在所有这36个位置上迭代原始ConvNet要有效得多,因为36次评估共享计算资源。在实践中这个技巧通常用于获得更好的性能。例如,通常调整图像大小以使其更大,使用转换后的ConvNet评估许多空间位置的类别分数,然后计算平均类别分数。
Lastly, what if we wanted to efficiently apply the original ConvNet over the image but at a stride smaller than 32 pixels? We could achieve this with multiple forward passes. For example, note that if we wanted to use a stride of 16 pixels we could do so by combining the volumes received by forwarding the converted ConvNet twice: First over the original image and second over the image but with the image shifted spatially by 16 pixels along both width and height.
最后,如果我们想在图像上高效地应用原始ConvNet,但步长小于32像素,该怎么办?我们可以通过多次前向传播来实现。例如,如果我们想使用16像素的步长,可以把转换后的ConvNet前向运行两次,再把得到的卷组合起来:第一次作用于原始图像,第二次作用于沿宽度和高度都在空间上平移了16个像素的图像。
• An IPython Notebook on Net Surgery shows how to perform the conversion in practice, in code (using Caffe)
ConvNet Architectures
We have seen that Convolutional Networks are commonly made up of only three layer types: CONV, POOL (we assume Max pool unless stated otherwise) and FC (short for fully-connected). We will also explicitly write the RELU activation function as a layer, which applies elementwise non-linearity. In this section we discuss how these are commonly stacked together to form entire ConvNets.
我们已经看到,卷积网络通常只由三种层类型组成:CONV、POOL(除非另有说明,我们默认指最大池化)和FC(全连接的简称)。我们还会把RELU激活函数明确地写成一个层,它逐元素地应用非线性。在本节中,我们将讨论这些层通常如何堆叠在一起,构成完整的ConvNet。
Layer Patterns
The most common form of a ConvNet architecture stacks a few CONV-RELU layers, follows them with POOL layers, and repeats this pattern until the image has been merged spatially to a small size. At some point, it is common to transition to fully-connected layers. The last fully-connected layer holds the output, such as the class scores. In other words, the most common ConvNet architecture follows the pattern:
ConvNet架构最常见的形式是堆叠一些CONV-RELU层,随后跟随POOL层,并重复此模式,直到图像在空间上合并为小尺寸。 在某些时候,过渡到完全连接的层是很常见的。 最后一个完全连接的层保存输出,例如类分数。 换句话说,最常见的ConvNet架构遵循以下模式:
INPUT -> [[CONV -> RELU]*N -> POOL?]*M -> [FC -> RELU]*K -> FC
即:输入 -> [[卷积 -> RELU]*N -> 可选的池化层]*M -> [全连接 -> RELU]*K -> 全连接层

where the * indicates repetition, and the POOL? indicates an optional pooling layer. Moreover, N >= 0 (and usually N <= 3), M >= 0, K >= 0 (and usually K < 3). For example, here are some common ConvNet architectures you may see that follow this pattern:

其中 * 代表重复,POOL? 表示可选的池化层。此外,N>=0(通常N<=3),M>=0,K>=0(通常K<3)。例如,下面是一些你可能会见到的遵循这种模式的常见ConvNet结构:
    • INPUT -> FC, implements a linear classifier. Here N = M = K = 0.
    • 输入->全连接,实现了一个线性分类器,这里N=M=K=0.
    • INPUT -> CONV -> RELU -> FC
    • 输入->卷积层->线性整流函数->全连接
    • INPUT -> [CONV -> RELU -> POOL]*2 -> FC -> RELU -> FC. Here we see that there is a single CONV layer between every POOL layer.
    • 输入->[卷积层->线性整流函数->池化层]*2->全连接->线性整流函数->全连接。在这里,我们看到每两个池化层之间都只有一个卷积层。
    • INPUT -> [CONV -> RELU -> CONV -> RELU -> POOL]*3 -> [FC -> RELU]*2 -> FC. Here we see two CONV layers stacked before every POOL layer. This is generally a good idea for larger and deeper networks, because multiple stacked CONV layers can develop more complex features of the input volume before the destructive pooling operation.
    • 输入->[卷积层->线性整流函数->卷积层->线性整流函数->池化层]*3->[全连接->线性整流函数]*2->全连接。这里我们看到每个池化层之前都堆叠了两个卷积层。对于更大更深的网络来说,这通常是个好主意,因为多个堆叠的卷积层可以在破坏性的池化操作之前提取出输入卷更复杂的特征。
    Prefer a stack of small filter CONV to one large receptive field CONV layer. Suppose that you stack three 3x3 CONV layers on top of each other (with non-linearities in between, of course). In this arrangement, each neuron on the first CONV layer has a 3x3 view of the input volume. A neuron on the second CONV layer has a 3x3 view of the first CONV layer, and hence by extension a 5x5 view of the input volume. Similarly, a neuron on the third CONV layer has a 3x3 view of the 2nd CONV layer, and hence a 7x7 view of the input volume. Suppose that instead of these three layers of 3x3 CONV, we only wanted to use a single CONV layer with 7x7 receptive fields. These neurons would have a receptive field size of the input volume that is identical in spatial extent (7x7), but with several disadvantages.
    优先选择一堆小卷积核的CONV而不是一个有大感受野的CONV层。假设您将三个3x3 CONV层堆叠在一起(当然,其间具有非线性)。在这种布置中,第一CONV层上的每个神经元具有输入卷的3×3视图。第二CONV层上的神经元具有第一CONV层的3×3视图,因此通过扩展获得输入体积的5×5视图。类似地,第三CONV层上的神经元具有第二CONV层的3×3视图,因此具有输入体积的7×7视图。假设我们只想使用单个具有7x7感受野的CONV层而不是叠加的三层3x3 CONV。这些神经元的输入体积的感受野大小在空间范围内是相同的(7x7),但有几个缺点。
    First, the neurons would be computing a linear function over the input, while the three stacks of CONV layers contain non-linearities that make their features more expressive. Second, if we suppose that all the volumes have C channels, then it can be seen that the single 7x7 CONV layer would contain C×(7×7×C) = 49C^2 parameters, while the three 3x3 CONV layers would only contain 3×(C×(3×3×C)) = 27C^2 parameters. Intuitively, stacking CONV layers with tiny filters as opposed to having one CONV layer with big filters allows us to express more powerful features of the input, and with fewer parameters. As a practical disadvantage, we might need more memory to hold all the intermediate CONV layer results if we plan to do backpropagation.
    首先,神经元将在输入上计算线性函数,而三层CONV层包含使其特征更具表现力的非线性。其次,如果我们假设所有卷都有C个通道,则可以看出单个7x7 CONV层将包含C×(7×7×C) = 49C^2个参数,而三个3x3 CONV层仅包含3×(C×(3×3×C)) = 27C^2个参数。直观地说,使用堆叠较小卷积核的CONV层而不是使用一个具有大卷积核的CONV层,允许我们用更少的参数表达更强大的输入特征。作为一个实际的缺点,如果我们计划进行反向传播,我们可能需要更多的内存来保存所有中间CONV层的结果。
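A quick check of the parameter arithmetic above (biases ignored; C = 64 is just an assumed example channel count):

C = 64
one_7x7 = C * (7 * 7 * C)             # 49*C^2 weights for the single 7x7 layer
three_3x3 = 3 * (C * (3 * 3 * C))     # 27*C^2 weights for three stacked 3x3 layers
print(one_7x7, three_3x3)             # 200704 110592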
    Recent departures. It should be noted that the conventional paradigm of a linear list of layers has recently been challenged, in Google’s Inception architectures and also in current (state of the art) Residual Networks from Microsoft Research Asia. Both of these (see details below in case studies section) feature more intricate and different connectivity structures.
    最新进展。值得注意的是,最近的线性图层列表的传统范例最近在Google的Inception体系结构以及当前(最先进的)Microsoft Research Asia的残差网络(Residual Networks)中受到挑战。这两个(参见案例研究部分中的详细信息)都具有更复杂和不同的连接结构。
    In practice: use whatever works best on ImageNet. If you’re feeling a bit of a fatigue in thinking about the architectural decisions, you’ll be pleased to know that in 90% or more of applications you should not have to worry about these. I like to summarize this point as “don’t be a hero”: Instead of rolling your own architecture for a problem, you should look at whatever architecture currently works best on ImageNet, download a pretrained model and finetune it on your data. You should rarely ever have to train a ConvNet from scratch or design one from scratch. I also made this point at the Deep Learning school.
    在实践中:使用在ImageNet上效果最好的架构。如果您在考虑架构决策时感到有些疲惫,那么您会很高兴地知道,在90%甚至更多的应用中,您不必担心这些问题。我喜欢把这一点概括为"不要逞英雄":与其针对自己的问题从头设计架构,您更应该看看目前在ImageNet上表现最好的架构是什么,下载一个预训练模型,然后在自己的数据上进行微调。您几乎不需要从头训练或从头设计一个ConvNet。我在Deep Learning school的演讲中也提到了这一点。
    Layer Sizing Patterns 图层大小模式
    Until now we’ve omitted mentions of common hyperparameters used in each of the layers in a ConvNet. We will first state the common rules of thumb for sizing the architectures and then follow the rules with a discussion of the notation:
    到目前为止,我们一直没有提到ConvNet中各层常用超参数的取值。我们将首先给出确定架构尺寸的常用经验法则,然后再对其中的记号进行讨论:
    The input layer (that contains the image) should be divisible by 2 many times. Common numbers include 32 (e.g. CIFAR-10), 64, 96 (e.g. STL-10), or 224 (e.g. common ImageNet ConvNets), 384, and 512.
    输入层(包含图像)的尺寸应该能被2整除很多次。常用的数字包括32(例如CIFAR-10)、64、96(例如STL-10)、224(例如常见的ImageNet ConvNets)、384和512。
    The conv layers should be using small filters (e.g. 3x3 or at most 5x5), using a stride of S=1, and crucially, padding the input volume with zeros in such way that the conv layer does not alter the spatial dimensions of the input. That is, when F=3, then using P=1 will retain the original size of the input. When F=5, P=2. For a general F, it can be seen that P=(F−1)/2 preserves the input size. If you must use bigger filter sizes (such as 7x7 or so), it is only common to see this on the very first conv layer that is looking at the input image.
    卷积层应该使用小卷积核(例如3x3或最多5x5),使用步长S=1,并且至关重要的是,用零填充输入卷,使得卷积层不会改变输入的空间维度。也就是说,当F = 3时,则使用P = 1将保留输入的原始大小。当F = 5时,P = 2。对于一般F,可以看出P =(F-1)/ 2保留输入大小。 如果必须使用更大的滤波器尺寸(例如7x7左右),则通常会在查看输入图像的第一个卷积层上看到此情况。
    The pool layers are in charge of downsampling the spatial dimensions of the input. The most common setting is to use max-pooling with 2x2 receptive fields (i.e. F=2), and with a stride of 2 (i.e. S=2). Note that this discards exactly 75% of the activations in an input volume (due to downsampling by 2 in both width and height). Another slightly less common setting is to use 3x3 receptive fields with a stride of 2, but this makes. It is very uncommon to see receptive field sizes for max pooling that are larger than 3 because the pooling is then too lossy and aggressive. This usually leads to worse performance.
    池化层负责对输入的空间维度进行下采样。最常见的设置是使用2x2感受野(即F=2)、步长为2(即S=2)的最大池化。注意,这会恰好丢弃输入卷中75%的激活值(因为在宽度和高度上都下采样2倍)。另一个稍微不太常见的设置是使用3x3感受野、步长为2。最大池化的感受野大于3的情况非常罕见,因为那样的池化损失太大、过于激进,通常会导致性能变差。
    Reducing sizing headaches. The scheme presented above is pleasing because all the CONV layers preserve the spatial size of their input, while the POOL layers alone are in charge of down-sampling the volumes spatially. In an alternative scheme where we use strides greater than 1 or don’t zero-pad the input in CONV layers, we would have to very carefully keep track of the input volumes throughout the CNN architecture and make sure that all strides and filters “work out”, and that the ConvNet architecture is nicely and symmetrically wired.
    减少尺寸设置的烦恼。上面提出的方案令人愉快,因为所有CONV层都保留了输入的空间大小,而仅由POOL层负责在空间上对卷进行下采样。在另一种方案中,如果我们使用大于1的步长,或者不对CONV层的输入进行零填充,我们就必须非常仔细地跟踪整个CNN架构中每一层的输入卷大小,并确保所有的步长和卷积核都能"匹配",同时ConvNet架构还要良好且对称地连接起来。
    Why use stride of 1 in CONV? Smaller strides work better in practice. Additionally, as already mentioned stride 1 allows us to leave all spatial down-sampling to the POOL layers, with the CONV layers only transforming the input volume depth-wise.
    为什么在CONV中使用步长S=1?较小的步长在实践中更好地发挥作用。另外,之前提到,步幅1允许我们将所有空间下采样留给POOL层,其中CONV层仅在深度上改变输入体积。
    Why use padding? In addition to the aforementioned benefit of keeping the spatial sizes constant after CONV, doing this actually improves performance. If the CONV layers were to not zero-pad the inputs and only perform valid convolutions, then the size of the volumes would reduce by a small amount after each CONV, and the information at the borders would be “washed away” too quickly.
    为什么要使用填充?除了上面提到的在CONV之后保持空间大小恒定的好处之外,这样做实际上还能提高性能。如果CONV层不对输入进行零填充而只执行有效卷积,那么每次卷积之后卷的尺寸都会缩小一点,边界处的信息也会被过快地"冲刷掉"。
    Compromising based on memory constraints. In some cases (especially early in the ConvNet architectures), the amount of memory can build up very quickly with the rules of thumb presented above. For example, filtering a 224x224x3 image with three 3x3 CONV layers with 64 filters each and padding 1 would create three activation volumes of size [224x224x64]. This amounts to a total of about 10 million activations, or 72MB of memory (per image, for both activations and gradients). Since GPUs are often bottlenecked by memory, it may be necessary to compromise. In practice, people prefer to make the compromise at only the first CONV layer of the network. For example, one compromise might be to use a first CONV layer with filter sizes of 7x7 and stride of 2 (as seen in a ZF net). As another example, an AlexNet uses filter sizes of 11x11 and stride of 4.
基于内存限制的折衷。在某些情况下(特别是在ConvNet架构的早期层),按照上面的经验法则,内存占用会增长得非常快。例如,用三个各含64个卷积核、零填充为1的3x3 CONV层对一张224x224x3的图像做卷积,将产生三个大小为[224x224x64]的激活数据体。这相当于总共约1000万个激活值,即72MB内存(每张图像,包括激活值和梯度)。由于GPU经常受内存瓶颈限制,因此可能需要做出折衷。在实践中,人们更倾向于只在网络的第一个CONV层做折衷。例如,一种折衷方案是在第一个CONV层使用7x7的卷积核和步长2(如ZF net所示)。另一个例子是AlexNet,它使用了11x11的卷积核和步长4。
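The roughly 10 million activations / 72MB figure can be reproduced with a few lines of arithmetic (a sketch assuming float32 activations and gradients):

acts_per_layer = 224 * 224 * 64            # one [224x224x64] activation volume
total_acts = 3 * acts_per_layer            # three such CONV layers, ~9.6 million values
total_bytes = total_acts * 4 * 2           # 4 bytes per float32, x2 for activations + gradients
print(total_acts, total_bytes / (1024.0 ** 2))   # -> 9633792 values, ~73.5 MB per image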
    Case studies 范例研究
    There are several architectures in the field of Convolutional Networks that have a name. The most common are:
卷积网络领域有几个拥有专门名称的著名架构。最常见的是:
    • LeNet. The first successful applications of Convolutional Networks were developed by Yann LeCun in 1990’s. Of these, the best known is the LeNet architecture that was used to read zip codes, digits, etc.
• LeNet. 卷积网络的首次成功应用是由Yann LeCun在1990年代开发的。其中最著名的是用于读取邮政编码、数字等的LeNet架构。
    • AlexNet. The first work that popularized Convolutional Networks in Computer Vision was the AlexNet, developed by Alex Krizhevsky, Ilya Sutskever and Geoff Hinton. The AlexNet was submitted to the ImageNet ILSVRC challenge in 2012 and significantly outperformed the second runner-up (top 5 error of 16% compared to runner-up with 26% error). The Network had a very similar architecture to LeNet, but was deeper, bigger, and featured Convolutional Layers stacked on top of each other (previously it was common to only have a single CONV layer always immediately followed by a POOL layer).
• AlexNet. 在计算机视觉领域推广卷积网络的第一项工作是由Alex Krizhevsky、Ilya Sutskever和Geoff Hinton开发的AlexNet。AlexNet参加了2012年的ImageNet ILSVRC挑战赛,并且显著优于第二名(top-5错误率为16%,而第二名为26%)。该网络的架构与LeNet非常相似,但是更深、更大,并且多个卷积层直接堆叠在一起(以前通常是一个CONV层后面紧跟一个POOL层)。
    • ZF Net. The ILSVRC 2013 winner was a Convolutional Network from Matthew Zeiler and Rob Fergus. It became known as the ZFNet (short for Zeiler & Fergus Net). It was an improvement on AlexNet by tweaking the architecture hyperparameters, in particular by expanding the size of the middle convolutional layers and making the stride and filter size on the first layer smaller.
• ZF Net. 2013年的ILSVRC冠军是来自Matthew Zeiler和Rob Fergus的卷积网络,被称为ZFNet(Zeiler & Fergus Net的简称)。它通过调整架构超参数对AlexNet进行了改进,特别是扩大了中间卷积层的尺寸,并减小了第一层的步长和卷积核尺寸。
    • GoogLeNet. The ILSVRC 2014 winner was a Convolutional Network from Szegedy et al. from Google. Its main contribution was the development of an Inception Module that dramatically reduced the number of parameters in the network (4M, compared to AlexNet with 60M). Additionally, this paper uses Average Pooling instead of Fully Connected layers at the top of the ConvNet, eliminating a large amount of parameters that do not seem to matter much. There are also several followup versions to the GoogLeNet, most recently Inception-v4.
• GoogLeNet. 2014年ILSVRC的获胜者是来自谷歌的Szegedy等人的卷积网络。它的主要贡献是开发了Inception模块,大大减少了网络中的参数数量(4M,相比之下AlexNet有60M)。此外,该工作在ConvNet顶部使用平均池化代替全连接层,消除了大量似乎并不重要的参数。GoogLeNet还有几个后续版本,最近的是Inception-v4。
    • VGGNet. The runner-up in ILSVRC 2014 was the network from Karen Simonyan and Andrew Zisserman that became known as the VGGNet. Its main contribution was in showing that the depth of the network is a critical component for good performance. Their final best network contains 16 CONV/FC layers and, appealingly, features an extremely homogeneous architecture that only performs 3x3 convolutions and 2x2 pooling from the beginning to the end. Their pretrained model is available for plug and play use in Caffe. A downside of the VGGNet is that it is more expensive to evaluate and uses a lot more memory and parameters (140M). Most of these parameters are in the first fully connected layer, and it was since found that these FC layers can be removed with no performance downgrade, significantly reducing the number of necessary parameters.
• VGGNet. ILSVRC 2014的亚军是Karen Simonyan和Andrew Zisserman的网络,后来被称为VGGNet。它的主要贡献在于表明网络的深度是取得良好性能的关键因素。他们最终的最佳网络包含16个CONV/FC层,并且吸引人的是,它具有极其同质的架构,从头到尾只执行3x3卷积和2x2池化。他们的预训练模型可以在Caffe中即插即用。VGGNet的缺点是评估开销更大,并且使用了更多的内存和参数(140M)。其中大部分参数都在第一个全连接层中,后来发现这些FC层可以在不降低性能的情况下被移除,从而显著减少必要参数的数量。
    • ResNet. Residual Network developed by Kaiming He et al. was the winner of ILSVRC 2015. It features special skip connections and a heavy use of batch normalization. The architecture is also missing fully connected layers at the end of the network. The reader is also referred to Kaiming’s presentation (video, slides), and some recent experiments that reproduce these networks in Torch. ResNets are currently by far state of the art Convolutional Neural Network models and are the default choice for using ConvNets in practice (as of May 10, 2016). In particular, also see more recent developments that tweak the original architecture from Kaiming He et al. Identity Mappings in Deep Residual Networks (published March 2016).
• ResNet. 由Kaiming He等人开发的残差网络(Residual Network)是ILSVRC 2015的获胜者。它的特点是使用了特殊的跳跃连接(skip connection)和大量的批量归一化,并且在网络末端没有使用全连接层。读者还可以参考Kaiming的演讲(视频、幻灯片),以及最近在Torch中复现这些网络的一些实验。ResNet是目前最先进的卷积神经网络模型,是在实践中使用ConvNet的默认选择(截至2016年5月10日)。特别地,还可以参考Kaiming He等人对原始架构进行调整的后续工作:Identity Mappings in Deep Residual Networks(2016年3月发表)。
    VGGNet in detail. Lets break down the VGGNet in more detail as a case study. The whole VGGNet is composed of CONV layers that perform 3x3 convolutions with stride 1 and pad 1, and of POOL layers that perform 2x2 max pooling with stride 2 (and no padding). We can write out the size of the representation at each step of the processing and keep track of both the representation size and the total number of weights:
    VGGNet的细节。作为案例研究,让我们更详细地分析VGGNet。整个VGGNet由CONV层组成,它们用步长1和零填充1执行3x3卷积,POOL层用步长2执行2x2最大池化(并且没有填充)。我们可以在处理的每一步写出表示的大小,并跟踪表示大小和权重总数:
INPUT: [224x224x3]        memory:  224*224*3=150K     weights: 0
CONV3-64: [224x224x64]    memory:  224*224*64=3.2M    weights: (3*3*3)*64 = 1,728
CONV3-64: [224x224x64]    memory:  224*224*64=3.2M    weights: (3*3*64)*64 = 36,864
POOL2: [112x112x64]       memory:  112*112*64=800K    weights: 0
CONV3-128: [112x112x128]  memory:  112*112*128=1.6M   weights: (3*3*64)*128 = 73,728
CONV3-128: [112x112x128]  memory:  112*112*128=1.6M   weights: (3*3*128)*128 = 147,456
POOL2: [56x56x128]        memory:  56*56*128=400K     weights: 0
CONV3-256: [56x56x256]    memory:  56*56*256=800K     weights: (3*3*128)*256 = 294,912
CONV3-256: [56x56x256]    memory:  56*56*256=800K     weights: (3*3*256)*256 = 589,824
CONV3-256: [56x56x256]    memory:  56*56*256=800K     weights: (3*3*256)*256 = 589,824
POOL2: [28x28x256]        memory:  28*28*256=200K     weights: 0
CONV3-512: [28x28x512]    memory:  28*28*512=400K     weights: (3*3*256)*512 = 1,179,648
CONV3-512: [28x28x512]    memory:  28*28*512=400K     weights: (3*3*512)*512 = 2,359,296
CONV3-512: [28x28x512]    memory:  28*28*512=400K     weights: (3*3*512)*512 = 2,359,296
POOL2: [14x14x512]        memory:  14*14*512=100K     weights: 0
CONV3-512: [14x14x512]    memory:  14*14*512=100K     weights: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512]    memory:  14*14*512=100K     weights: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512]    memory:  14*14*512=100K     weights: (3*3*512)*512 = 2,359,296
POOL2: [7x7x512]          memory:  7*7*512=25K        weights: 0
FC: [1x1x4096]            memory:  4096               weights: 7*7*512*4096 = 102,760,448
FC: [1x1x4096]            memory:  4096               weights: 4096*4096 = 16,777,216
FC: [1x1x1000]            memory:  1000               weights: 4096*1000 = 4,096,000

TOTAL memory: 24M * 4 bytes ~= 93MB / image (only forward! ~*2 for bwd)
TOTAL params: 138M parameters
As is common with Convolutional Networks, notice that most of the memory (and also compute time) is used in the early CONV layers, and that most of the parameters are in the last FC layers. In this particular case, the first FC layer contains 100M weights, out of a total of 140M.
正如卷积网络中常见的那样,请注意大部分内存(以及计算时间)都消耗在前面的CONV层中,而大部分参数都集中在最后的FC层中。在这个例子中,第一个FC层就包含了总共约140M参数中的100M权重。
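The parameter counts in the table can be sanity-checked programmatically; a minimal sketch (the cfg list below simply hard-codes the VGG-16 layer sequence from the table above, and biases are ignored):

# integers = number of 3x3 conv filters, 'M' = 2x2 max pool
cfg = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
       512, 512, 512, 'M', 512, 512, 512, 'M']

depth = 3                  # the input image has 3 color channels
conv_params = 0
for layer in cfg:
    if layer != 'M':       # pooling layers have no weights
        conv_params += (3 * 3 * depth) * layer   # each filter spans 3x3xdepth
        depth = layer

fc_params = (7 * 7 * 512) * 4096 + 4096 * 4096 + 4096 * 1000
print('first FC layer: %.1fM weights' % ((7 * 7 * 512) * 4096 / 1e6))   # ~102.8M
print('total: %.0fM parameters' % ((conv_params + fc_params) / 1e6))    # ~138M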
Computational Considerations
The largest bottleneck to be aware of when constructing ConvNet architectures is the memory bottleneck. Many modern GPUs have a limit of 3/4/6GB memory, with the best GPUs having about 12GB of memory. There are three major sources of memory to keep track of:
构建ConvNet架构时要注意的最大瓶颈是内存瓶颈。许多现代GPU的内存限制为3/4/6GB,最好的GPU具有大约12GB的内存。有三种主要的内存来源可以跟踪:
• From the intermediate volume sizes: These are the raw number of activations at every layer of the ConvNet, and also their gradients (of equal size). Usually, most of the activations are on the earlier layers of a ConvNet (i.e. first Conv Layers). These are kept around because they are needed for backpropagation, but a clever implementation that runs a ConvNet only at test time could in principle reduce this by a huge amount, by only storing the current activations at any layer and discarding the previous activations on layers below.
• 中间数据体的大小:这是ConvNet每一层激活值的数量,以及它们的梯度(大小相同)。通常,大部分激活值都位于ConvNet靠前的层(即前几个CONV层)。这些激活值需要保留,因为反向传播时会用到它们;不过,一个只在测试时运行ConvNet的聪明实现原则上可以大幅减少这部分内存,只需存储当前层的激活值,并丢弃其下层的先前激活值。
• From the parameter sizes: These are the numbers that hold the net-work parameters, their gradients during backpropagation, and commonly also a step cache if the optimization is using momentum, Adagrad, or RMSProp. Therefore, the memory to store the parameter vector alone must usually be multiplied by a factor of at least 3 or so.
• 参数的大小:这部分数值保存着网络的参数、反向传播期间的梯度,如果优化器使用momentum、Adagrad或RMSProp,通常还包括一份步长缓存。因此,仅存储参数向量所需的内存通常至少要乘以3左右的系数。
• Every ConvNet implementation has to maintain miscellaneous memory, such as the image data batches, perhaps their augmented versions, etc.
• 每个ConvNet实现都需要维护各种杂项内存,例如图像数据批次,也可能包括它们的数据增强版本等。
Once you have a rough estimate of the total number of values (for activations, gradients, and misc), the number should be converted to size in GB. Take the number of values, multiply by 4 to get the raw number of bytes (since every floating point is 4 bytes, or maybe by 8 for double precision), and then divide by 1024 multiple times to get the amount of memory in KB, MB, and finally GB. If your network doesn’t fit, a common heuristic to “make it fit” is to decrease the batch size, since most of the memory is usually consumed by the activations.
一旦粗略估计出数值的总数(包括激活值、梯度和杂项),就应把这个数字转换为以GB为单位的大小。将数值总数乘以4得到原始字节数(因为每个浮点数占4个字节,双精度则为8个字节),然后连续除以1024,依次得到以KB、MB、最后以GB为单位的内存量。如果您的网络装不进内存,常见的"让它装下"的启发式方法是减小批量大小,因为大部分内存通常被激活值消耗。
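As a rough worked example (a sketch only: the ~93MB-per-image activation figure and 138M parameters are taken from the VGGNet case study above, and the factor of 3 on parameters assumes an optimizer that also stores a gradient and a step cache per parameter):

batch_size = 64
activation_bytes = batch_size * 93 * 1024 ** 2 * 2   # ~93MB/image forward, roughly x2 with gradients
param_bytes = 138e6 * 4 * 3                          # parameters + gradients + optimizer cache, float32
total_gb = (activation_bytes + param_bytes) / 1024.0 ** 3
print('%.1f GB' % total_gb)   # ~13.2 GB: does not fit a 12GB GPU, so decrease the batch size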

Additional Resources

Additional resources related to implementation:
• Soumith benchmarks for CONV performance
• ConvNetJS CIFAR-10 demo allows you to play with ConvNet architectures and see the results and computations in real time, in the browser.
• Caffe, one of the popular ConvNet libraries.
• State of the art ResNets in Torch7
