bigcindy

图像分类经典卷积神经网络—GoogLeNet论文翻译（中英文对照版）—Going Deeper with Convolutions（走向更深的卷积神经网络）

图像分类经典论文翻译汇总：[翻译汇总]

翻译pdf文件下载：[下载地址]

此版为中英文对照版，纯中文版请稳步：[GoogLeNet纯中文版]

Going Deeper with Convolutions

走向更深的卷积神经网络

Christian Szegedy

Google Inc.

（谷歌）

Wei Liu

University of North Carolina, Chapel Hill

（北卡罗来纳大学教堂山分校）

Yangqing Jia（贾扬清）

Google Inc.

（谷歌）

Pierre Sermanet

Google Inc.

（谷歌）

Scott Reed

University of Michigan

（密歇根大学）

Dragomir Anguelov

Google Inc.

（谷歌）

Dumitru Erhan

Google Inc.

（谷歌）

Vincent Vanhoucke

Google Inc.

（谷歌）

Andrew Rabinovich

Google Inc.

（谷歌）

Abstract

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

摘要

我们在ImageNet大规模视觉识别挑战赛2014（ILSVRC14）上提出了一种代号为Inception的深度卷积神经网络结构，并在分类和检测上取得了新的最好结果。这个架构的主要特点是提高了网络内部计算资源的利用率。通过精心的手工设计，我们在增加了网络深度和广度的同时保持了计算预算不变。为了优化质量，架构的设计以赫布理论和多尺度处理直觉为基础。我们在ILSVRC14提交中应用的一个特例被称为GoogLeNet，一个22层的深度网络，其质量在分类和检测的背景下进行了评估。

1. Introduction

In the last three years, our object classification and detection capabilities have dramatically improved due to advances in deep learning and convolutional networks [10]. One encouraging news is that most of this progress is not just the result of more powerful hardware, larger datasets and bigger models, but mainly a consequence of new ideas, algorithms and improved network architectures. No new data sources were used, for example, by the top entries in the ILSVRC 2014 competition besides the classification dataset of the same competition for detection purposes. Our GoogLeNet submission to ILSVRC 2014 actually uses 12 times fewer parameters than the winning architecture of Krizhevsky et al [9] from two years ago, while being significantly more accurate. On the object detection front, the biggest gains have not come from naive application of bigger and bigger deep networks, but from the synergy of deep architectures and classical computer vision, like the R-CNN algorithm by Girshick et al [6].

1. 引言

过去三年中，由于深度学习和卷积网络的发展[10]，我们的目标分类和检测能力得到了显著提高。一个令人鼓舞的消息是，大部分的进步不仅仅是更强大硬件、更大数据集、更大模型的结果，而主要是新的想法、算法和网络结构改进的结果。例如，ILSVRC 2014竞赛中最靠前的团队将分类的数据集用于相同竞赛的检测任务，没有使用新的数据资源。我们在ILSVRC 2014中的GoogLeNet提交实际使用的参数只有两年前Krizhevsky等人[9]获胜结构参数的1/12，而结果明显更准确。在目标检测前沿，最大的收获不是来自于越来越大的深度网络的简单应用，而是来自于深度架构和经典计算机视觉的协同，像Girshick等人[6]的R-CNN算法那样。

Another notable factor is that with the ongoing traction of mobile and embedded computing, the efficiency of our algorithms —— especially their power and memory use —— gains importance. It is noteworthy that the considerations leading to the design of the deep architecture presented in this paper included this factor rather than having a sheer fixation on accuracy numbers. For most of the experiments, the models were designed to keep a computational budget of 1.5 billion multiply-adds at inference time, so that the they do not end up to be a purely academic curiosity, but could be put to real world use, even on large datasets, at a reasonable cost.

另一个显著因素是随着移动和嵌入式设备的推动，我们的算法的效率很重要——尤其是在电力和内存使用上。值得注意的是，正是包含了这个因素的考虑才得出了本文中呈现的深度架构设计，而不是单纯的为了提高准确率。对于大多数实验来说，模型被设计为在一次推断中保持15亿乘加的计算预算，所以最终它们不是单纯的学术好奇心，而是能在现实世界中应用，甚至是以合理的代价在大型数据集上使用。

In this paper, we will focus on an efficient deep neural network architecture for computer vision, codenamed Inception, which derives its name from the Network in network paper by Lin et al [12] in conjunction with the famous “we need to go deeper” internet meme [1]. In our case, the word “deep” is used in two different meanings: first of all, in the sense that we introduce a new level of organization in the form of the “Inception module” and also in the more direct sense of increased network depth. In general, one can view the Inception model as a logical culmination of [12] while taking inspiration and guidance from the theoretical work by Arora et al [2]. The benefits of the architecture are experimentally verified on the ILSVRC 2014 classification and detection challenges, where it significantly outperforms the current state of the art.

在本文中，我们将关注一个高效的计算机视觉深度神经网络架构，代号为Inception，它的名字来自于Lin等人[12]网络论文中的Network与著名的“we need to go deeper”网络迷因[1]的结合。在我们的案例中，单词“deep”用在两个不同的含义中：首先，在某种意义上，我们以“Inception module”的形式引入了一种新层次的组织方式，在更直接的意义上增加了网络的深度。一般来说，可以把Inception模型看作论文[12]的逻辑顶点同时从Arora等人[2]的理论工作中受到了鼓舞和引导。这种架构的好处在ILSVRC 2014分类和检测挑战赛中通过实验得到了验证，它明显优于目前的最好水平。

2. Related Work

Starting with LeNet-5 [10], convolutional neural networks (CNN) have typically had a standard structure —— stacked convolutional layers (optionally followed by contrast normalization and max-pooling) are followed by one or more fully-connected layers. Variants of this basic design are prevalent in the image classification literature and have yielded the best results to-date on MNIST, CIFAR and most notably on the ImageNet classification challenge [9, 21]. For larger datasets such as Imagenet, the recent trend has been to increase the number of layers [12] and layer size [21, 14], while using dropout [7] to address the problem of overfitting.

2. 近期工作

从LeNet-5 [10]开始，卷积神经网络（CNN）通常有一个标准结构——堆叠的卷积层（后面可以选择有对比归一化和最大池化）后面是一个或更多的全连接层。这个基本设计的变种在图像分类领域流行，并且目前为止在MNIST，CIFAR和更著名的ImageNet分类挑战赛中[9, 21]的已经取得了最佳结果。对于更大的数据集例如ImageNet来说，最近的趋势是增加层的数目[12]和层的大小[21, 14]，同时使用dropout[7]来解决过拟合问题。

Despite concerns that max-pooling layers result in loss of accurate spatial information, the same convolutional network architecture as [9] has also been successfully employed for localization [9, 14], object detection [6, 14, 18, 5] and human pose estimation [19].

尽管担心最大池化层会引起准确空间信息的损失，但与[9]相同的卷积网络结构也已经成功的应用于定位[9, 14]，目标检测[6, 14, 18, 5]和行人姿态估计[19]。

Inspired by a neuroscience model of the primate visual cortex, Serre et al. [15] used a series of fixed Gabor filters of different sizes to handle multiple scales. We use a similar strategy here. However, contrary to the fixed 2-layer deep model of [15], all filters in the Inception architecture are learned. Furthermore, Inception layers are repeated many times, leading to a 22-layer deep model in the case of the GoogLeNet model.

受灵长类视觉皮层神经科学模型的启发，Serre等人[15]使用了一系列固定的不同大小的Gabor滤波器来处理多尺度。我们使用一个了类似的策略。然而，与[15]的固定的2层深度模型不同，Inception结构中所有的滤波器是学习到的。此外，Inception层重复了很多次，在GoogLeNet模型中得到了一个22层的深度模型。

Network-in-Network is an approach proposed by Lin et al. [12] in order to increase the representational power of neural networks. In their model, additional 1 × 1 convolutional layers are added to the network, increasing its depth. We use this approach heavily in our architecture. However, in our setting, 1 × 1 convolutions have dual purpose: most critically, they are used mainly as dimension reduction modules to remove computational bottlenecks, that would otherwise limit the size of our networks. This allows for not just increasing the depth, but also the width of our networks without a significant performance penalty.

Network-in-Network是Lin等人[12]为了增加神经网络表现能力而提出的一种方法。在他们的模型中，网络中添加了额外的1 × 1卷积层，增加了网络的深度。我们的架构中大量的使用了这个方法。但是，在我们的设置中，1 × 1卷积有两个目的：最关键的是，它们主要是用来作为降维模块来移除卷积瓶颈，否则将会限制我们网络的大小。这不仅允许了深度的增加，而且允许我们网络的宽度增加但没有明显的性能损失。

The current leading approach for object detection is the Regions with Convolutional Neural Networks (R-CNN) method by Girshick et al. [6]. R-CNN decomposes the overall detection problem into two subproblems: utilizing low-level cues such as color and texture in order to generate object location proposals in a category-agnostic fashion and using CNN classifiers to identify object categories at those locations. Such a two stage approach leverages the accuracy of bounding box segmentation with low-level cues, as well as the highly powerful classification power of state-of-the-art CNNs. We adopted a similar pipeline in our detection submissions, but have explored enhancements in both stages, such as multi-box [5] prediction for higher object bounding box recall, and ensemble approaches for better categorization of bounding box proposals.

目前最好的目标检测是Girshick等人[6]的基于区域的卷积神经网络（R-CNN）方法。R-CNN将整个检测问题分解为两个子问题：利用低层次的信号例如颜色，纹理以跨类别的方式来产生目标位置候选区域，然后用CNN分类器来识别那些位置上的对象类别。这样一种两个阶段的方法利用了低层特征分割边界框的准确性，也利用了目前的CNN非常强大的分类能力。我们在我们的检测提交中采用了类似的方式，但探索增强这两个阶段，例如对于更高的目标边界框召回使用多盒[5]预测，并融合了更好的边界框候选区域分类方法。

3. Motivation and High Level Considerations

The most straightforward way of improving the performance of deep neural networks is by increasing their size. This includes both increasing the depth — the number of network levels — as well as its width: the number of units at each level. This is an easy and safe way of training higher quality models, especially given the availability of a large amount of labeled training data. However, this simple solution comes with two major drawbacks.

3. 动机和高层思考

提高深度神经网络性能最直接的方式是增加它们的尺寸。这不仅包括增加深度——网络层次的数目——也包括它的宽度：每一层的单元数目。这是一种训练更高质量模型容易且安全的方法，尤其是在可获得大量标注的训练数据的情况下。但是这个简单方案有两个主要的缺点。

Bigger size typically means a larger number of parameters, which makes the enlarged network more prone to overfitting, especially if the number of labeled examples in the training set is limited. This is a major bottleneck as strongly labeled datasets are laborious and expensive to obtain, often requiring expert human raters to distinguish between various fine-grained visual categories such as those in ImageNet (even in the 1000-class ILSVRC subset) as shown in Figure 1.

(a) Siberian husky (b) Eskimo dog

Figure 1: Two distinct classes from the 1000 classes of the ILSVRC 2014 classification challenge. Domain knowledge is required to distinguish between these classes.

更大的尺寸通常意味着更多的参数，这会使增大的网络更容易过拟合，尤其是在训练集的标注样本有限的情况下。这是一个主要的瓶颈，因为要获得强标注数据集费时费力且代价昂贵，经常需要专家评委在各种细粒度的视觉类别进行区分，例如图1中显示的ImageNet中的类别（甚至是1000类ILSVRC的子集）。

(a) 西伯利亚哈士奇犬 (b) 爱斯基摩犬

图1: ILSVRC 2014分类挑战赛的1000类中两个不同的类别。区分这些类别需要领域知识。

Another drawback of uniformly increased network size is the dramatically increased use of computational resources. For example, in a deep vision network, if two convolutional layers are chained, any uniform increase in the number of their filters results in a quadratic increase of computation. If the added capacity is used inefficiently (for example, if most weights end up to be close to zero), then much of the computation is wasted. As the computational budget is always finite, an efficient distribution of computing resources is preferred to an indiscriminate increase of size, even when the main objective is to increase the quality of performance.

均匀增加网络尺寸的另一个缺点是计算资源使用的显著增加。例如，在一个深度视觉网络中，如果两个卷积层相连，它们的滤波器数目的任何均匀增加都会引起计算量平方式的增加。如果增加的容量使用效率低下（例如，如果大多数权重最终接近于0），那么会浪费大量的计算能力。由于计算预算总是有限的，计算资源的有效分布更偏向于尺寸无差别的增加，即使主要目标是增加性能的质量。

A fundamental way of solving both of these issues would be to introduce sparsity and replace the fully connected layers by the sparse ones, even inside the convolutions. Besides mimicking biological systems, this would also have the advantage of firmer theoretical underpinnings due to the groundbreaking work of Arora et al. [2]. Their main result states that if the probability distribution of the dataset is representable by a large, very sparse deep neural network, then the optimal network topology can be constructed layer after layer by analyzing the correlation statistics of the preceding layer activations and clustering neurons with highly correlated outputs. Although the strict mathematical proof requires very strong conditions, the fact that this statement resonates with the well known Hebbian principle —— neurons that fire together, wire together —— suggests that the underlying idea is applicable even under less strict conditions, in practice.

解决这两个问题的一个基本的方式就是引入稀疏性并将全连接层替换为稀疏的全连接层，甚至是在卷积层内使用稀疏性。除了模仿生物系统之外，由于Arora等人[2]的开创性工作，这也具有更坚固的理论基础优势。他们的主要成果说明如果数据集的概率分布可以通过一个大型稀疏的深度神经网络表示，则最优的网络拓扑结构可以通过分析前一层激活的相关性统计和聚类高度相关的神经元来一层层的构建。虽然严格的数学证明需要在很强的条件下，但事实上这个声明与著名的赫布理论产生共鸣——神经元一起激发，一起连接——实践表明，基础概念甚至适用于不严格的条件下。

Unfortunately, today’s computing infrastructures are very inefficient when it comes to numerical calculation on non-uniform sparse data structures. Even if the number of arithmetic operations is reduced by 100×, the overhead of lookups and cache misses would dominate: switching to sparse matrices might not pay off. The gap is widened yet further by the use of steadily improving and highly tuned numerical libraries that allow for extremely fast dense matrix multiplication, exploiting the minute details of the underlying CPU or GPU hardware [16, 9]. Also, non-uniform sparse models require more sophisticated engineering and computing infrastructure. Most current vision oriented machine learning systems utilize sparsity in the spatial domain just by the virtue of employing convolutions. However, convolutions are implemented as collections of dense connections to the patches in the earlier layer. ConvNets have traditionally used random and sparse connection tables in the feature dimensions since [11] in order to break the symmetry and improve learning, yet the trend changed back to full connections with [9] in order to further optimize parallel computation. Current state-of-the-art architectures for computer vision have uniform structure. The large number of filters and greater batch size allows for the efficient use of dense computation.

遗憾的是，当碰到在非均匀的稀疏数据结构上进行数值计算时，现在的计算架构效率非常低下。即使算法运算的数量减少100倍，查询和缓存丢失上的开销仍占主导地位：切换到稀疏矩阵可能是不可行的。随着稳定提升和高度调整的数值库的应用，差距仍在进一步扩大，数值库要求极度快速密集的矩阵乘法，利用底层的CPU或GPU硬件[16, 9]的微小细节。非均匀的稀疏模型也要求更多的复杂工程和计算基础结构。目前大多数面向视觉的机器学习系统通过采用卷积的优点来利用空域的稀疏性。然而，卷积被实现为对上一层块的密集连接的集合。为了打破对称性，提高学习水平，从论文[11]开始，ConvNets习惯上在特征维度使用随机的稀疏连接表，然而为了进一步优化并行计算，论文[9]中趋向于变回全连接。目前最新的计算机视觉架构有统一的结构。更多的滤波器和更大的批大小要求密集计算的有效使用。

This raises the question of whether there is any hope for a next, intermediate step: an architecture that makes use of filter-level sparsity, as suggested by the theory, but exploits our current hardware by utilizing computations on dense matrices. The vast literature on sparse matrix computations (e.g. [3]) suggests that clustering sparse matrices into relatively dense submatrices tends to give competitive performance for sparse matrix multiplication. It does not seem far-fetched to think that similar methods would be utilized for the automated construction of non-uniform deep-learning architectures in the near future.

这提出了下一个中间步骤是否有希望的问题：一个架构能利用滤波器水平的稀疏性，正如理论所认为的那样，但能通过利用密集矩阵计算来利用我们目前的硬件。稀疏矩阵乘法的大量文献（例如[3]）认为对于稀疏矩阵乘法，将稀疏矩阵聚类为相对密集的子矩阵会有更佳的性能。在不久的将来会利用类似的方法来进行非均匀深度学习架构的自动构建，这样的想法似乎并不牵强。

The Inception architecture started out as a case study for assessing the hypothetical output of a sophisticated network topology construction algorithm that tries to approximate a sparse structure implied by [2] for vision networks and covering the hypothesized outcome by dense, readily available components. Despite being a highly speculative undertaking, modest gains were observed early on when compared with reference networks based on [12]. With a bit of tuning the gap widened and Inception proved to be especially useful in the context of localization and object detection as the base network for [6] and [5]. Interestingly, while most of the original architectural choices have been questioned and tested thoroughly in separation, they turned out to be close to optimal locally. One must be cautious though: although the Inception architecture has become a success for computer vision, it is still questionable whether this can be attributed to the guiding principles that have lead to its construction. Making sure of this would require a much more thorough analysis and verification.

Inception架构开始是作为案例研究，用于评估一个复杂网络拓扑构建算法的假设输出，该算法试图近似[2]中所示的视觉网络的稀疏结构，并通过密集的、容易获得的组件来覆盖假设结果。尽管是一个非常投机的事情，但与基于[12]的参考网络相比，早期可以观测到适度的收益。随着一点点调整加宽差距，作为[6]和[5]的基础网络，Inception被证明在定位上下文和目标检测中尤其有用。有趣的是，虽然大多数最初的架构选择已被质疑并分离开进行全面测试，但结果证明它们是局部最优的。然而必须谨慎：尽管Inception架构在计算机上领域取得成功，但这是否可以归因于构建其架构的指导原则仍是有疑问的。确保这一点将需要更彻底的分析和验证。

4. Architectural Details

The main idea of the Inception architecture is to consider how an optimal local sparse structure of a convolutional vision network can be approximated and covered by readily available dense components. Note that assuming translation invariance means that our network will be built from convolutional building blocks. All we need is to find the optimal local construction and to repeat it spatially. Arora et al. [2] suggests a layer-by-layer construction where one should analyze the correlation statistics of the last layer and cluster them into groups of units with high correlation. These clusters form the units of the next layer and are connected to the units in the previous layer. We assume that each unit from an earlier layer corresponds to some region of the input image and these units are grouped into filter banks. In the lower layers (the ones close to the input) correlated units would concentrate in local regions. Thus, we would end up with a lot of clusters concentrated in a single region and they can be covered by a layer of 1×1 convolutions in the next layer, as suggested in [12]. However, one can also expect that there will be a smaller number of more spatially spread out clusters that can be covered by convolutions over larger patches, and there will be a decreasing number of patches over larger and larger regions. In order to avoid patch-alignment issues, current incarnations of the Inception architecture are restricted to filter sizes 1×1, 3×3 and 5×5; this decision was based more on convenience rather than necessity. It also means that the suggested architecture is a combination of all those layers with their output filter banks concatenated into a single output vector forming the input of the next stage. Additionally, since pooling operations have been essential for the success of current convolutional networks, it suggests that adding an alternative parallel pooling path in each such stage should have additional beneficial effect, too (see Figure 2(a)).

(a) Inception module, naive version

(b) Inception module with dimension reductions

Figure 2: Inception module

4. 架构细节

Inception架构的主要想法是考虑怎样近似卷积视觉网络的最优稀疏结构并用容易获得的密集组件进行覆盖。注意假设转换不变性，这意味着我们的网络将以卷积构建块为基础。我们所需要做的是找到最优的局部构造并在空间上重复它。Arora等人[2]提出了一个层次结构，其中应该分析最后一层的相关统计并将它们聚集成具有高相关性的单元组。这些聚类形成了下一层的单元并与前一层的单元连接。我们假设较早层的每个单元都对应输入层的某些区域，并且这些单元被分成滤波器组。在较低的层（接近输入的层）相关单元集中在局部区域。因此，如[12]所示，我们最终会有许多聚类集中在单个区域，它们可以通过下一层的1×1卷积层覆盖。然而也可以预期，将存在更小数目的在更大空间上扩展的聚类，其可以被更大块上的卷积覆盖，这将在越来越大的区域上块的数量将会下降。为了避免块校正的问题，目前Inception架构形式的滤波器的尺寸仅限于1×1、3×3、5×5，这样做更多的是基于其便易性而非必需。这也意味着提出的架构是所有这些层的组合，其输出滤波器组连接成单个输出向量形成了下一阶段的输入。另外，由于池化操作对于目前卷积网络的成功至关重要，因此建议在每个这样的阶段添加一个替代的并行池化路径应该具有额外的有益效果（看图2(a)）。

(a) Inception模块，naïve版

(b) 降维的Inception模块

图2：Inception模块

As these “Inception modules” are stacked on top of each other, their output correlation statistics are bound to vary: as features of higher abstraction are captured by higher layers, their spatial concentration is expected to decrease. This suggests that the ratio of 3×3 and 5×5 convolutions should increase as we move to higher layers.

由于这些“Inception模块”在每个模型的顶部堆叠，其输出相关性统计必然有变化：由于较高层会捕获较高的抽象特征，其空间集中度预计会减少。这表明随着转移到更高层，3×3和5×5卷积的比例应该会增加。

One big problem with the above modules, at least in this naive form, is that even a modest number of 5×5 convolutions can be prohibitively expensive on top of a convolutional layer with a large number of filters. This problem becomes even more pronounced once pooling units are added to the mix: the number of output filters equals to the number of filters in the previous stage. The merging of output of the pooling layer with outputs of the convolutional layers would lead to an inevitable increase in the number of outputs from stage to stage. While this architecture might cover the optimal sparse structure, it would do it very inefficiently, leading to a computational blow up within a few stages.

上述模块的一个大问题是在具有大量滤波器的卷积层之上，即使适量的5×5卷积也可能是非常昂贵的，至少在这种naïve版本中有这个问题。一旦池化单元添加到混合中，这个问题甚至会变得更明显：输出滤波器的数量等于前一阶段滤波器的数量。池化层输出和卷积层输出的合并会导致这一阶段到下一阶段输出数量不可避免的增加。虽然这种架构可能会覆盖最优稀疏结构，但它会非常低效，导致在几个阶段内计算量爆炸。

This leads to the second idea of the Inception architecture: judiciously reducing dimension wherever the computational requirements would increase too much otherwise. This is based on the success of embeddings: even low dimensional embeddings might contain a lot of information about a relatively large image patch. However, embeddings represent information in a dense, compressed form and compressed information is harder to process. The representation should be kept sparse at most places (as required by the conditions of [2]) and compress the signals only whenever they have to be aggregated en masse. That is, 1×1 convolutions are used to compute reductions before the expensive 3×3 and 5×5 convolutions. Besides being used as reductions, they also include the use of rectified linear activation making them dual-purpose. The final result is depicted in Figure 2(b).

这导致了Inception架构的第二个想法：在计算要求会增加太多的地方，明智地减少维度。这是基于嵌入的成功：甚至低维嵌入可能包含大量关于较大图像块的信息。嵌入以密集、压缩的形式表示信息，然而压缩信息更难处理。这种表示应该在大多数地方保持稀疏（根据[2]中条件的要求）并且仅在它们必须汇总时才压缩信号。也就是说，在昂贵的3×3和5×5卷积之前，1×1卷积用来计算降维。除了用来降维之外，它们的第二个作用是使用线性修正单元。最终的结果如图2(b)所示。

In general, an Inception network is a network consisting of modules of the above type stacked upon each other, with occasional max-pooling layers with stride 2 to halve the resolution of the grid. For technical reasons (memory efficiency during training), it seemed beneficial to start using Inception modules only at higher layers while keeping the lower layers in traditional convolutional fashion. This is not strictly necessary, simply reflecting some infrastructural inefficiencies in our current implementation.

通常，Inception网络是一个由上述类型的模块互相堆叠组成的网络，偶尔会有步长为2的最大池化层将网络分辨率减半。出于技术原因（训练过程中内存效率），只在更高层开始使用Inception模块而在更低层仍保持传统的卷积形式似乎是有益的。这不是绝对必要的，只是反映了我们目前实现中的一些基础结构效率低下。

A useful aspect of this architecture is that it allows for increasing the number of units at each stage significantly without an uncontrolled blow-up in computational complexity at later stages. This is achieved by the ubiquitous use of dimensionality reduction prior to expensive convolutions with larger patch sizes. Furthermore, the design follows the practical intuition that visual information should be processed at various scales and then aggregated so that the next stage can abstract features from the different scales simultaneously.

该架构的一个有用的方面是它允许显著增加每个阶段的单元数量，而不会在后面的阶段出现计算复杂度不受控制的爆炸。这是在尺寸较大的块进行昂贵的卷积之前通过普遍使用降维实现的。此外，设计遵循了实践直觉，即视觉信息应该在不同的尺度上处理然后聚合，为的是下一阶段可以从不同尺度同时抽象特征。

The improved use of computational resources allows for increasing both the width of each stage as well as the number of stages without getting into computational difficulties. One can utilize the Inception architecture to create slightly inferior, but computationally cheaper versions of it. We have found that all the available knobs and levers allow for a controlled balancing of computational resources resulting in networks that are 3—10× faster than similarly performing networks with non-Inception architecture, however this requires careful manual design at this point.

计算资源的改善使用允许增加每个阶段的宽度和阶段的数量，而不会陷入计算困境。可以利用Inception架构创建略差一些但计算成本更低的版本。我们发现所有可用的控制允许计算资源的受控平衡，导致网络比没有Inception结构的类似执行网络快3—10倍，但是在这一点上需要仔细的手动设计。

5. GoogLeNet

By the “GoogLeNet” name we refer to the particular incarnation of the Inception architecture used in our submission for the ILSVRC 2014 competition. We also used one deeper and wider Inception network with slightly superior quality, but adding it to the ensemble seemed to improve the results only marginally. We omit the details of that network, as empirical evidence suggests that the influence of the exact architectural parameters is relatively minor. Table 1 illustrates the most common instance of Inception used in the competition. This network (trained with different image patch sampling methods) was used for 6 out of the 7 models in our ensemble.

Table 1: GoogLeNet incarnation of the Inception architecture

5. GoogLeNet

通过“GoogLeNet”这个名字，我们提到了在ILSVRC 2014竞赛的提交中使用的Inception架构的特例。我们也使用了一个稍微优质的更深更宽的Inception网络，但将其加入到组合中似乎只稍微提高了结果。我们忽略了该网络的细节，因为经验证据表明确切架构的参数影响相对较小。表1说明了竞赛中使用的最常见的Inception实例。这个网络（用不同的图像块采样方法训练的）使用了我们组合中7个模型中的6个。

表1：Inception架构的GoogLeNet

All the convolutions, including those inside the Inception modules, use rectified linear activation. The size of the receptive field in our network is 224×224 in the RGB color space with zero mean. “#3×3 reduce” and “#5×5 reduce” stands for the number of 1×1 filters in the reduction layer used before the 3×3 and 5×5 convolutions. One can see the number of 1×1 filters in the projection layer after the built-in max-pooling in the pool proj column. All these reduction/projection layers use rectified linear activation as well.

所有的卷积都使用了修正线性激活，包括Inception模块内部的卷积。在我们的网络中感受野是在均值为0的RGB颜色空间中，大小是224×224。“#3×3 reduce”和“#5×5 reduce”表示在3×3和5×5卷积之前，降维层使用的1×1滤波器的数量。在pool proj列可以看到内置的最大池化之后，投影层中1×1滤波器的数量。所有的这些降维/投影层也都使用了线性修正激活。

The network was designed with computational efficiency and practicality in mind, so that inference can be run on individual devices including even those with limited computational resources, especially with low-memory footprint. The network is 22 layers deep when counting only layers with parameters (or 27 layers if we also count pooling). The overall number of layers (independent building blocks) used for the construction of the network is about 100. The exact number depends on how layers are counted by the machine learning infrastructure. The use of average pooling before the classifier is based on [12], although our implementation has an additional linear layer. The linear layer enables us to easily adapt our networks to other label sets, however it is used mostly for convenience and we do not expect it to have a major effect. We found that a move from fully connected layers to average pooling improved the top-1 accuracy by about 0.6%, however the use of dropout remained essential even after removing the fully connected layers.

网络的设计考虑了计算效率和实用性，以便可以在单独的设备上运行，甚至包括那些计算资源有限的设备，尤其是低内存占用的设备。当只计算有参数的层时，网络有22层（如果我们也计算池化层是27层）。构建网络的全部层（独立构建块）的数目大约是100。确切的数量取决于机器学习基础架构对层的计算方式。分类器之前的平均池化是基于[12]的，尽管我们的实现有一个额外的线性层。这个线性层使我们的网络能很容易地适应其它的标签集，但它主要是为了方便使用，我们不期望它有重大的影响。我们发现从全连接层变为平均池化，提高了大约top-1 %0.6的准确率，然而即使在移除了全连接层之后，dropout的使用还是必不可少的。

Given relatively large depth of the network, the ability to propagate gradients back through all the layers in an effective manner was a concern. The strong performance of shallower networks on this task suggests that the features produced by the layers in the middle of the network should be very discriminative. By adding auxiliary classifiers connected to these intermediate layers, discrimination in the lower stages in the classifier was expected. This was thought to combat the vanishing gradient problem while providing regularization. These classifiers take the form of smaller convolutional networks put on top of the output of the Inception (4a) and (4d) modules. During training, their loss gets added to the total loss of the network with a discount weight (the losses of the auxiliary classifiers were weighted by 0.3). At inference time, these auxiliary networks are discarded. Later control experiments have shown that the effect of the auxiliary networks is relatively minor (around 0.5%) and that it required only one of them to achieve the same effect.

对于一个相对较深的网络，如何将梯度有效地反向传播到所有层是个问题。在这个任务上，更浅网络的强大性能表明网络中部层产生的特征应该是非常有识别力的。通过将辅助分类器添加到这些中间层，可以期望较低阶段分类器具有更好的判别力。这被认为是在提供正则化的同时克服梯度消失问题。这些分类器采用较小卷积网络的形式，放置在Inception (4a)和Inception (4b)模块的输出之上。在训练期间，它们的损失以折扣权重（辅助分类器损失的权重是0.3）加到网络的整个损失上。在推断时，这些辅助网络被丢弃。后面的对照实验表明辅助网络的影响相对较小（约0.5），只需要其中一个就能取得同样的效果。

The exact structure of the extra network on the side, including the auxiliary classifier, is as follows:

An average pooling layer with 5×5 filter size and stride 3, resulting in an 4×4×512 output for the (4a), and 4×4×528 for the (4d) stage.
A 1×1 convolution with 128 filters for dimension reduction and rectified linear activation.
A fully connected layer with 1024 units and rectified linear activation.
A dropout layer with 70% ratio of dropped outputs.
A linear layer with softmax loss as the classifier (predicting the same 1000 classes as the main classifier, but removed at inference time).

A schematic view of the resulting network is depicted in Figure 3.

Figure 3: GoogLeNet network with all the bells and whistles

包括辅助分类器在内的附加网络的具体结构如下：

一个滤波器大小5×5，步长为3的平均池化层，(4a)阶段结果输出为4×4×512，(4d)阶段结果输出为4×4×528。
具有128个滤波器的1×1卷积，用于降维和修正线性激活。
一个具有1024个单元和修正线性激活的全连接层，。
丢弃率为70%的dropout层。
使用带有softmax损失的线性层作为分类器（作为主分类器预测同样的1000类，但在推断时移除）。

最终的网络模型图如图3所示。

图3：含有的所有结构的GoogLeNet网络

6. Training Methodology

GoogLeNet networks were trained using the DistBelief [4] distributed machine learning system using modest amount of model and data-parallelism. Although we used a CPU based implementation only, a rough estimate suggests that the GoogLeNet network could be trained to convergence using few high-end GPUs within a week, the main limitation being the memory usage. Our training used asynchronous stochastic gradient descent with 0.9 momentum [17], fixed learning rate schedule (decreasing the learning rate by 4% every 8 epochs). Polyak averaging [13] was used to create the final model used at inference time.

6. 训练方法

GoogLeNet网络使用DistBelief[4]分布式机器学习系统进行训练，该系统使用适量的模型和数据并行化。尽管我们仅使用一个基于CPU的实现，但粗略的估计表明GoogLeNet网络可以用更少的高端GPU在一周之内训练到收敛，主要的限制是内存使用。我们的训练使用异步随机梯度下降，动量参数为0.9[17]，固定的学习率计划（每8次遍历下降学习率4%）。Polyak平均[13]在推断时用来创建最终的模型。

Image sampling methods have changed substantially over the months leading to the competition, and already converged models were trained on with other options, sometimes in conjunction with changed hyperparameters, such as dropout and the learning rate. Therefore, it is hard to give a definitive guidance to the most effective single way to train these networks. To complicate matters further, some of the models were mainly trained on smaller relative crops, others on larger ones, inspired by [8]. Still, one prescription that was verified to work very well after the competition, includes sampling of various sized patches of the image whose size is distributed evenly between 8% and 100% of the image area with aspect ratio constrained to the interval [34,43][34,43]. Also, we found that the photometric distortions of Andrew Howard [8] were useful to combat overfitting to the imaging conditions of training data.

图像采样方法在过去几个月的竞赛中发生了重大变化，并且已收敛的模型在其他选项上进行了训练，有时还结合着超参数的改变，例如dropout和学习率。因此，很难对训练这些网络的最有效的单一方式给出一个明确指导。让事情更复杂的是，受[8]的启发，一些模型主要是在相对较小的裁剪图像进行训练，其它模型主要是在相对较大的裁剪图像上进行训练。然而，一个经过验证的方案在竞赛后工作地很好，包括各种尺寸的图像块的采样，它的尺寸均匀分布在图像区域的8%-100%之间，方向角限制为[34,43][34,43]之间。另外，我们发现Andrew Howard[8]的光度扭曲对于克服训练数据成像条件的过拟合是有用的。

利用分布式机器学习系统和数据并行来训练网络，虽然实现时只用了CPU，但是是可以用个别高端的GPU一周达到收敛的。采用异步随机梯度下降，动量为0.9，学习率每8个epoch下降4%。用Polyak平均来创建最后的模型。图像采样的patch大小从图像的8%到100%，选取的长宽比在3/4到4/3之间，光度扭曲也有利于减少过拟合，还使用随机插值方法结合其他超参数的改变来调整图像大小。

7. ILSVRC 2014 Classification Challenge Setup and Results

The ILSVRC 2014 classification challenge involves the task of classifying the image into one of 1000 leaf-node categories in the Imagenet hierarchy. There are about 1.2 million images for training, 50,000 for validation and 100,000 images for testing. Each image is associated with one ground truth category, and performance is measured based on the highest scoring classifier predictions. Two numbers are usually reported: the top-1 accuracy rate, which compares the ground truth against the first predicted class, and the top-5 error rate, which compares the ground truth against the first 5 predicted classes: an image is deemed correctly classified if the ground truth is among the top-5, regardless of its rank in them. The challenge uses the top-5 error rate for ranking purposes.

7. ILSVRC 2014分类挑战赛设置和结果

ILSVRC 2014分类挑战赛包括将图像分类到ImageNet层级中1000个叶子结点类别的任务。训练图像大约有120万张，验证图像有5万张，测试图像有10万张。每一张图像与一个实际类别相关联，性能度量基于分类器预测的最高分。通常报告两个数字：top-1准确率，比较实际类别和第一个预测类别，top-5错误率，比较实际类别与前5个预测类别：如果图像实际类别在top-5中，则认为图像分类正确，不管它在top-5中的排名。挑战赛使用top-5错误率来进行排名。

We participated in the challenge with no external data used for training. In addition to the training techniques aforementioned in this paper, we adopted a set of techniques during testing to obtain a higher performance, which we describe next.

We independently trained 7 versions of the same GoogLeNet model (including one wider version), and performed ensemble prediction with them. These models were trained with the same initialization (even with the same initial weights, due to an oversight) and learning rate policies. They differed only in sampling methodologies and the randomized input image order.
During testing, we adopted a more aggressive cropping approach than that of Krizhevsky et al. [9]. Specifically, we resized the image to 4 scales where the shorter dimension (height or width) is 256, 288, 320 and 352 respectively, take the left, center and right square of these resized images (in the case of portrait images, we take the top, center and bottom squares). For each square, we then take the 4 corners and the center 224×224 crop as well as the square resized to 224×224, and their mirrored versions. This leads to 4×3×6×2 = 144 crops per image. A similar approach was used by Andrew Howard [8] in the previous year’s entry, which we empirically verified to perform slightly worse than the proposed scheme. We note that such aggressive cropping may not be necessary in real applications, as the benefit of more crops becomes marginal after a reasonable number of crops are present (as we will show later on).
The softmax probabilities are averaged over multiple crops and over all the individual classifiers to obtain the final prediction. In our experiments we analyzed alternative approaches on the validation data, such as max pooling over crops and averaging over classifiers, but they lead to inferior performance than the simple averaging.

我们参加竞赛时没有使用额外的数据来训练。除了本文中前面提到的训练技术之外，我们在测试时为了获得更高性能采用了一系列技巧，描述如下。

我们独立训练了7个版本的类似的GoogLeNet模型（包括一个更广泛的版本），并用它们进行了整体预测。这些模型的训练具有相同的初始化（甚至具有相同的初始权重，由于监督）和学习率策略。它们仅在采样方法和随机输入图像顺序方面不同。
在测试中，我们采用比Krizhevsky等人[9]更激进的裁剪方法。具体来说，我们将图像归一化为四个尺度，其中较短维度（高度或宽度）分别为256，288，320和352，取这些归一化的图像的左，中，右方块（在肖像图片中，我们采用顶部，中心和底部方块）。对于每个方块，我们将采用4个角以及中心224×224裁剪图像以及方块尺寸归一化为224×224，以及它们的镜像版本。这导致每张图像会得到4×3×6×2 = 144的裁剪图像。前一年的参赛的Andrew Howard[8]采用了类似的方法，经过我们实证验证，其方法略差于我们提出的方案。我们注意到，在实际应用中，这种激进裁剪可能是不必要的，因为存在合理数量的裁剪图像后，更多裁剪图像的好处会变得很微小（正如我们后面展示的那样）。
softmax概率在多个裁剪图像上和所有单个分类器上进行平均，然后获得最终预测。在我们的实验中，我们分析了验证数据的替代方法，例如裁剪图像上的最大池化和分类器的平均，但是它们比简单平均的性能略逊。

In the remainder of this paper, we analyze the multiple factors that contribute to the overall performance of the final submission.

在本文的其余部分，我们分析了有助于最终提交整体性能的多个因素。

Our final submission to the challenge obtains a top-5 error of 6.67% on both the validation and testing data, ranking the first among other participants. This is a 56.5% relative reduction compared to the SuperVision approach in 2012, and about 40% relative reduction compared to the previous year’s best approach (Clarifai), both of which used external data for training the classifiers. Table 2 shows the statistics of some of the top-performing approaches over the past 3 years.

Table 2: Classification performance

竞赛中我们的最终提交在验证集和测试集上得到了top-5 6.67%的错误率，在其它的参与者中排名第一。与2012年的SuperVision方法相比相对减少了56.5%，与前一年的最佳方法（Clarifai）相比相对减少了约40%，这两种方法都使用了外部数据训练分类器。表2显示了过去三年中一些表现最好的方法的统计。

表2：分类性能

We also analyze and report the performance of multiple testing choices, by varying the number of models and the number of crops used when predicting an image in Table 3. When we use one model, we chose the one with the lowest top-1 error rate on the validation data. All numbers are reported on the validation dataset in order to not overfit to the testing data statistics.

Table 3: GoogLeNet classification performance break down

我们也分析报告了多种测试选择的性能，当预测图像时通过改变表3中使用的模型数目和裁剪图像数目。当我们使用一个模型时，我们选择了在验证数据上top-1错误率最低的模型。报告验证数据集上的所有数据，以便不会在测试数据上过度拟合。

表3：GoogLeNet分类性能分解

8. ILSVRC 2014 Detection Challenge Setup and Results

The ILSVRC detection task is to produce bounding boxes around objects in images among 200 possible classes. Detected objects count as correct if they match the class of the groundtruth and their bounding boxes overlap by at least 50% (using the Jaccard index). Extraneous detections count as false positives and are penalized. Contrary to the classification task, each image may contain many objects or none, and their scale may vary. Results are reported using the mean average precision (mAP).

The approach taken by GoogLeNet for detection is similar to the R-CNN by [6], but is augmented with the Inception model as the region classifier. Additionally, the region proposal step is improved by combining the selective search [20] approach with multi-box [5] predictions for higher object bounding box recall. In order to reduce the number of false positives, the superpixel size was increased by 2×. This halves the proposals coming from the selective search algorithm. We added back 200 region proposals coming from multi-box [5] resulting, in total, in about 60% of the proposals used by [6], while increasing the coverage from 92% to 93%. The overall effect of cutting the number of proposals with increased coverage is a 1% improvement of the mean average precision for the single model case. Finally, we use an ensemble of 6 GoogLeNets when classifying each region. This leads to an increase in accuracy from 40% to 43.9%. Note that contrary to R-CNN, we did not use bounding box regression due to lack of time.

8. ILSVRC 2014检测挑战赛设置和结果

ILSVRC检测任务是为了在200个可能的类别中生成图像中目标的边界框。如果检测到的对象匹配的它们实际类别并且它们的边界框重叠至少50%（使用Jaccard索引），则将检测到的对象记为正确。无关的检测记为假阳性且被惩罚。与分类任务不同，每张图像可能包含多个对象或没有对象，并且它们的尺度可能是多变的。报告的结果使用平均精度均值（mAP）。

GoogLeNet检测采用的方法类似于R-CNN[6]，但用Inception模块作为区域分类器进行了增强。此外，为了更高的目标边界框召回率，通过选择搜索[20]方法和多箱[5]预测相结合改进了区域生成步骤。为了减少假阳性的数量，超分辨率的尺寸增加了2倍。这将选择搜索算法的区域生成减少了一半。我们总共补充了200个来自多盒结果的区域生成，大约60%的区域生成用于[6]，同时将覆盖率从92%提高到93%。减少区域生成的数量，增加覆盖率的整体影响是对于单个模型的情况平均精度均值增加了1%。最后，等分类单个区域时，我们使用了6个GoogLeNets的组合。这使得准确率从40%提高到43.9%。注意，与R-CNN相反，由于缺少时间我们没有使用边界框回归。

We first report the top detection results and show the progress since the first edition of the detection task. Compared to the 2013 result, the accuracy has almost doubled. The top performing teams all use convolutional networks. We report the official scores in Table 4 and common strategies for each team: the use of external data, ensemble models or contextual models. The external data is typically the ILSVRC12 classification data for pre-training a model that is later refined on the detection data. Some teams also mention the use of the localization data. Since a good portion of the localization task bounding boxes are not included in the detection dataset, one can pre-train a general bounding box regressor with this data the same way classification is used for pre-training. The GoogLeNet entry did not use the localization data for pretraining.

Table 4: Detection performance

我们首先报告了最好检测结果，并显示了从第一版检测任务以来的进展。与2013年的结果相比，准确率几乎翻了一倍。所有表现最好的团队都使用了卷积网络。我们在表4中报告了官方的分数和每个队伍的常见策略：使用外部数据、集成模型或上下文模型。外部数据通常是ILSVRC12的分类数据，用来预训练模型，后面在检测数据集上进行改善。一些团队也提到使用了定位数据。由于定位任务的边界框很大一部分不在检测数据集中，所以可以用该数据预训练一般的边界框回归器，这与分类预训练的方式相同。GoogLeNet团队没有使用定位数据进行预训练。

表4：检测性能

In Table 5, we compare results using a single model only. The top performing model is by Deep Insight and surprisingly only improves by 0.3 points with an ensemble of 3 models while the GoogLeNet obtains significantly stronger results with the ensemble.

Table 5: Single model performance for detection

在表5中，我们仅比较了单个模型的结果。最好性能模型是Deep Insight的，令人惊讶的是3个模型的集合仅提高了0.3个点，而GoogLeNet在模型集成时明显获得了更好的结果。

表5：单个模型的检测性能

9. Conclusions

Our results yield a solid evidence that approximating the expected optimal sparse structure by readily available dense building blocks is a viable method for improving neural networks for computer vision. The main advantage of this method is a significant quality gain at a modest increase of computational requirements compared to shallower and narrower architectures.

9. 结论

我们的结果取得了可靠的证据，即通过易获得的密集构造块来近似期望的最优稀疏结果是改善计算机视觉神经网络的一种可行方法。相比于较浅且较窄的架构，这个方法的主要优势是在计算需求适度增加的情况下有显著的质量收益。

Also note that our object detection work was competitive despite not utilizing context nor performing bounding box regression, suggesting yet further evidence of the strengths of the Inception architecture.

同时，需要注意的是我们的目标检测工作虽然没有利用上下文，也没有执行边界框回归，但仍然具有竞争力，这进一步显示了Inception架构优势的证据。

For both classification and detection, it is expected that similar quality of result can be achieved by much more expensive non-Inception-type networks of similar depth and width. Still, our approach yields solid evidence that moving to sparser architectures is feasible and useful idea in general. This suggest future work towards creating sparser and more refined structures in automated ways on the basis of [2], as well as on applying the insights of the Inception architecture to other domains.

对于分类和检测，预期通过更昂贵的类似深度和宽度的非Inception类型网络可以实现类似质量的结果。然而，我们的方法取得了可靠的证据，即转向更稀疏的结构一般来说是可行有用的想法。这表明未来的工作将在[2]的基础上以自动化方式创建更稀疏更精细的结构，以及将Inception架构的思考应用到其他领域。

10. Acknowledgements

We would like to thank Sanjeev Arora and Aditya Bhaskara for fruitful discussions on [2]. Also we are indebted to the DistBelief [4] team for their support especially to Rajat Monga, Jon Shlens, Alex Krizhevsky, Jeff Dean, Ilya Sutskever and Andrea Frome. We would also like to thank to Tom Duerig and Ning Ye for their help on photometric distortions. Also our work would not have been possible without the support of Chuck Rosenberg and Hartwig Adam.

10. 致谢

我们要感谢Sanjeev Arora和Aditya Bhaskara就[2]进行了富有成果的讨论。此外，我们感谢DistBelief [4]团队的支持，尤其是Rajat Monga，Jon Shlens，Alex Krizhevsky，Jeff Dean，Ilya Sutskever和Andrea Frome。我们还要感谢Tom Duerig和Ning Ye对光度失真方面的帮助。如果没有Chuck Rosenberg和Hartwig Adam的支持，我们的工作也是不可能完成的。

参考文献

References

[1] Know your meme: We need to go deeper. http://knowyourmeme.com/memes/we-need-to-go-deeper. Accessed: 2014-09-15.

[2] S. Arora, A. Bhaskara, R. Ge, and T. Ma. Provable bounds for learning some deep representations. CoRR, abs/1310.6343, 2013.

[3] U. V. C ̧atalyu ̈rek, C. Aykanat, and B. Uc ̧ar. On two-dimensional sparse matrix partitioning: Models, methods, and a recipe. SIAM J. Sci. Comput., 32(2):656–683, Feb. 2010.

[4] J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, M. Ranzato, A. Senior, P. Tucker, K. Yang, Q. V. Le, and A. Y. Ng. Large scale distributed deep networks. In P. Bartlett, F. Pereira, C. Burges, L. Bottou, and K. Weinberger, editors, NIPS, pages 1232–1240. 2012.

[5] D. Erhan, C. Szegedy, A. Toshev, and D. Anguelov. Scalable object detection using deep neural networks. In CVPR, 2014.

[6] R. B. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Computer Vision and Pattern Recognition, 2014. CVPR 2014. IEEE Conference on, 2014.

[7] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580, 2012.

[8] A. G. Howard. Some improvements on deep convolutional neural network based image classification. CoRR, abs/1312.5402, 2013.

[9] A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25, pages 1106–1114, 2012.

[10] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Comput., 1(4):541–551, Dec. 1989.

[11] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.

[12] M. Lin, Q. Chen, and S. Yan. Network in network. CoRR, abs/1312.4400, 2013.

[13] B. T. Polyak and A. B. Juditsky. Acceleration of stochastic approximation by averaging. SIAM J. Control Optim., 30(4):838–855, July 1992.

[14] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. Overfeat: Integrated recognition, localization and detection using convolutional networks. CoRR, abs/1312.6229, 2013.

[15] T. Serre, L. Wolf, S. M. Bileschi, M. Riesenhuber, and T. Poggio. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell., 29(3):411–426, 2007.

[16] F. Song and J. Dongarra. Scaling up matrix computations on shared-memory manycore systems with 1000 cpu cores. In Proceedings of the 28th ACM Interna- tional Conference on Supercomputing, ICS ’14, pages 333–342, New York, NY, USA, 2014. ACM.

[17] I. Sutskever, J. Martens, G. E. Dahl, and G. E. Hinton. On the importance of initialization and momentum in deep learning. In ICML, volume 28 of JMLR Proceed- ings, pages 1139–1147. JMLR.org, 2013.

[18] C.Szegedy,A.Toshev,andD.Erhan.Deep neural networks for object detection. In C. J. C. Burges, L. Bottou, Z. Ghahramani, and K. Q. Weinberger, editors, NIPS, pages 2553–2561, 2013.

[19] A. Toshev and C. Szegedy. Deeppose: Human pose estimation via deep neural networks. CoRR, abs/1312.4659, 2013.

[20] K. E. A. van de Sande, J. R. R. Uijlings, T. Gevers, and A. W. M. Smeulders. Segmentation as selective search for object recognition. In Proceedings of the 2011 International Conference on Computer Vision, ICCV ’11, pages 1879–1886, Washington, DC, USA, 2011. IEEE Computer Society.

[21] M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In D. J. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, editors, ECCV, volume 8689 of Lecture Notes in Computer Science, pages 818–833. Springer, 2014.

你可能感兴趣的:(深度学习经典论文翻译,GoogLeNet,ILSVRC,ImageNet,卷积神经网络,图像分类)

卢京潮自动控制原理ppt_自动控制原理总结 weixin_39609166 卢京潮自动控制原理ppt
自控好难TAT参考资料：河海大学自动控制原理PPT使用软件：Word+幕布自动控制原理第一章绪论1.1自动控制的发展历史分类经典控制理论、现代控制理论、智能控制（后者包含前者）经典控制理论方法:以传递函数为基础的频域法（线性）对象:单输入-单输出一类定常控制系统的分析与设计问题手段:图解特点:定性分析,这些理论由于其发展较早，现已日臻成熟（主要特点：用图形的方法定性分析闭环系统的稳定性及性能）目标
华为OD机试 - 贪吃蛇 - 队列（Python/JS/C/C++ 2024 E卷 200分）哪吒华为od python javascript
华为OD机试2024E卷题库疯狂收录中，刷题点这里专栏导读本专栏收录于《华为OD机试真题（Python/JS/C/C++）》。刷的越多，抽中的概率越大，私信哪吒，备注华为OD，加入华为OD刷题交流群，每一题都有详细的答题思路、详细的代码注释、3个测试用例、为什么这道题采用XX算法、XX算法的适用场景，发现新题目，随时更新，全天CSDN在线答疑。一、题目描述贪吃蛇是一个经典游戏，蛇的身体由若干方格连
【嵌入式面试】2024年嵌入式经典面试题汇总（Linux 文件IO）_嵌入式linux面试题 2401_83704192 程序员嵌入式
Linux主要通过shell命令进行安装。可以使用apt方式安装（软件包管理系统）、rpm包安装、deb包安装、tar.gz源代码包安装、tar.bz2源代码包安装、yum方式安装(安装rpm包)、bin文件安装。1.8占用系统资源linux是字符界面，占用的系统资源较于windows下的图形界面所占的资源小。Windows是图形界面，较于Linux的字符界面所占的资源大。参考链接2Linux的根
利用 PyTorch 动态计算图和自动求导机制实现自适应神经网络 drebander AI 编程 pytorch 神经网络人工智能
在深度学习任务中，不同任务的复杂度千差万别。为了解决复杂任务对模型容量的需求，同时避免简单任务因过度拟合导致的性能下降，我们可以构建一个能够根据任务自动调整网络结构的神经网络。在PyTorch中，动态计算图和自动求导机制为实现这一目标提供了强大的工具。动态网络结构设计PyTorch的动态计算图允许我们根据运行时的输入数据或任务复杂度，动态创建和修改网络结构。动态添加/移除层：可以在训练过程中根据需
豆包MarsCode 蛇年编程大作战-2025蛇年花样贪吃蛇 BLACK594 豆包MarsCode 人工智能 vue.js
贪吃蛇游戏是一款经典的休闲游戏，它的核心机制简单且充满趣味。在这个即将到来蛇年，我决定用豆包MarsCode实现他，唤起大家对经典游戏的回忆，为这个特别的年份增添一份别样的乐趣。完整代码及在线体验在线体验-豆包MarsCode花样贪吃蛇游戏素材来自于4399游戏"贪吃蛇竞技场"代码采用Vue实现，豆包MarsCode辅助开发
可解释性：走向透明与可信的人工智能一位小说男主人工智能入门深度学习机器学习人工智能神经网络
随着深度学习和机器学习技术的迅速发展，越来越多的行业和领域开始应用这些技术。然而，这些技术的“黑盒”特性也带来了不容忽视的挑战。在许多任务中，尽管这些模型表现出色，取得了相当高的精度，但其决策过程不透明，这对于依赖于机器决策的应用（如金融、医疗、法律等）来说，可能是无法接受的。因此，如何提高模型的可解释性、实现透明和可信的人工智能，成为了当下人工智能领域的重要课题。❤️本文将深入探讨机器学习中的可
Python 实现2048 yingjiejk python python pygame 开发语言
2048游戏是一个经典的数字益智游戏，使用Python语言可以很容易地实现它。以下是一个简单的代码示例：importpygameimportrandompygame.init()#设置颜色WHITE=(255,255,255)BLACK=(0,0,0)GRAY=(128,128,128)RED=(255,0,0)GREEN=(0,255,0)BLUE=(0,0,255)#设置屏幕大小size=(4
【算法】经典博弈论问题——斐波那契博弈 + Zeckendorf 定理 python 查理零世算法 python 数据结构
目录斐波那契博弈（FibonacciNim）齐肯多夫（Zeckendorf）定理示例分析实战演练斐波那契博弈（FibonacciNim）先说结论：当初始石子数目n是斐波那契数时，先手必败；否则，先手有策略获胜。证明概要:当n=2时，先手只能取1颗石子，后手直接取剩下的1颗石子获胜，因此先手必败。假设对于所有小于等于某个斐波那契数f[k]的情况，结论都成立。归纳：对于f[k+1]=f[k]+f[k-
单目测距（yolo-目标检测+标定+深度学习目标检测_测距）计算机C9硕士_算法工程师 YOLO 目标检测深度学习
YOLOv5模型介绍YOLOv5是目前最先进的目标检测算法之一，在多个数据集上取得了优秀的表现。相较于YOLOv4，YOLOv5采用了更深的Backbone网络和更高的分辨率输入图像，以提高检测精度和速度。单目测距实现方法在目标检测的基础上，我们可以通过计算物体在图像中的像素大小来估计其距离。具体方法是，首先确定某个物体的实际尺寸，然后根据该物体在图像中的像素大小计算其距离。这个方法可以应用于各种
用 Python 实现经典的 2048 游戏：一步步带你打造属于你的小游戏！一位小说男主 python python 游戏
用Python实现经典的2048游戏：一步步带你打造属于你的小游戏！（结尾附完整代码）简介2048是一个简单而又令人上瘾的数字拼图游戏。玩家通过滑动方块使相同数字的方块合并，目标是创造出数字2048！在这篇博客中，我们将用Python的Tkinter库从零开始实现这款游戏，涵盖从界面设计到逻辑实现的每一个细节，帮助你全面了解背后的开发思路。游戏特点经典玩法：滑动合并相同数字，尽可能达到2048。随
Depth Anything V2：单目深度估计的更强基线武朵欢Nerissa
DepthAnythingV2：单目深度估计的更强基线项目地址:https://gitcode.com/gh_mirrors/de/Depth-Anything-V2项目介绍DepthAnythingV2是由HKU与TikTok团队合作开发的单目深度估计算法的升级版本。这个框架显著提升了细节处理能力和鲁棒性，相比于基于深度学习的方法，它提供了更快的推理速度、更少的参数量以及更高的深度预测精度。本项
OpenAI的编程语言和框架，给程序员带来了帮助有哪些 API技术大佬Anzexi58 OpenAI 人工智能人工智能深度学习
OpenAI是一个人工智能开发公司，成立于2015年，总部位于美国旧金山。这家公司致力于研究和开发先进的人工智能技术，旨在将这些技术应用到解决全球一些最棘手的问题上。OpenAI以其卓越的技术和实验室出品的groundbreakingAIpapers而闻名。OpenAI的研究涉及深度学习、自然语言处理、视觉感知、强化学习等多个领域，并已在各种应用中取得了令人瞩目的成果。例如，在机器人领域，Open
【算法应用】基于麻雀搜索算法SSA求解车间布局优化问题小O的算法实验室智能算法智能算法应用车间布局优化智能算法应用车间布局优化智能算法
目录1.问题背景2.车间布局数学模型3.麻雀搜索算法SSA原理4.结果展示5.参考文献6.代码获取1.问题背景工厂设施布置的规划一直是工业工程领域不断研究和探索的内容，其中最具代表性之一的是系统布置设计(systemlayoutplanning，SLP)方法。作为一种经典且有效的方法，其为设施布置提供了很好的改善思路，但在长期的发展中也存在一些不可避免的缺点，如计算结果不够精确，很难确保计算结果较
DeepSeek：突破传统的AI算法与下载排行分析 smart_ljh 行业搜索人工智能 AI
DeepSeek的AI算法突破DeepSeek相较于OpenAI以及其它平台的性能对比DeepSeek的下载排行分析（截止2025/1/28AI人工智能相关DeepSeek甚至一度被推上了搜索）未来发展趋势总结在人工智能技术飞速发展的当下，搜索引擎市场也迎来了新的变革。DeepSeek，作为一款基于深度学习技术和大数据算法的搜索引擎，以其独特的优势在国内外市场上引起了广泛关注。下面介绍一下针对De
【车牌识别】卷积神经网络CNN车牌识别【含 GUI Matlab源码 2638期】 Matlab仿真科研站 matlab
欢迎来到Matlab仿真科研站博客之家✅博主简介：热爱科研的Matlab仿真开发者，修心和技术同步精进，Matlab项目合作可私信。个人主页：Matlab仿真科研站博客之家代码获取方式：扫描文章底部QQ二维码⛳️座右铭：行百里者，半于九十；路漫漫其修远兮，吾将上下而求索。⛄更多Matlab图像处理（仿真科研站版）仿真内容点击Matlab图像处理（仿真科研站版）⛄一、CNN车牌识别简介1车牌定位1.
Python 函数魔法书：基础、范例、避坑、测验与项目实战李智 - 重庆 Python 精讲精练 -从入门到实战 python 经验分享编程技巧编程实战水平考试
Python函数魔法书：基础、范例、避坑、测验与项目实战内容简介本系列文章是为Python3学习者精心设计的一套全面、实用的学习指南，旨在帮助读者从基础入门到项目实战，全面提升编程能力。文章结构由5个版块组成，内容层层递进，逻辑清晰。基础速通：n个浓缩提炼的核心知识点，夯实编程基础；经典范例：10个贴近实际的应用场景，深入理解Python3的编程技巧和应用方法；避坑宝典：10个典型错误解析，提供解
使用 Python 和 scikit-learn 实现 KNN 分类：以鸢尾花数据集为例弥树子 python scikit-learn 分类
在机器学习的世界里，K-NearestNeighbors（KNN）算法是一种简单而强大的分类方法。它基于一个直观的想法：相似的数据点往往属于同一类别。本文将通过Python的scikit-learn库实现KNN分类，以经典的鸢尾花数据集为例，展示从数据加载到模型评估的完整流程。1.KNN算法简介KNN是一种监督学习算法，主要用于分类和回归任务。它的工作原理非常简单：对于一个新的数据点，算法会查找训
DeepSeek--通向通用人工智能的深度探索者油泼辣子多加专业名词解释人工智能
一、词源与全称“DeepSeek"由"Deep”（深度）与"Seek"（探索）组合而成，中文译名为"深度求索"。其全称为"深度求索人工智能基础技术研究有限公司"，英文对应"DeepSeekArtificialIntelligenceResearchInstitute"。这一命名体现了企业对深度学习技术与未知领域持续探索的双重追求。二、发展历程初创期（2023）公司成立于中国杭州，创始团队汇聚了来自
python精彩编程200例-编程语言入门经典100例【Python版】 weixin_37988176
无论学习哪门计算机语言，只要把100例中绝大部分题目都做一遍，就基本掌握该语言的语法了。【程序1】题目：有1、2、3、4个数字，能组成多少个互不相同且无重复数字的三位数？都是多少？#Filename:001.pycnt=0#countthesumofresultforiinrange(1,5):forjinrange(1,5):forkinrange(1,5):ifi!=jandi!=kandj!
【设计模式-行为型】备忘录模式博一波设计模式备忘录模式
一、什么是备忘录模式来到备忘录模式了，这个模式我感觉相对简单一些，就是备份，或者快照。跟前面一样为了加深理解，我们引入一个电影情结来说明啥是备忘录模式，以来加深大家对备忘录模式的认识。那么，在电影中谁是此模式应用的王者呢。我想起一位，不知道大家有没有看过一个极其经典的电影，星爷的《大话西游》。在电影《大话西游》中，至尊宝利用月光宝盒不断穿越到紫霞仙子自杀前的时间段，试图改变结局。这种时间穿越和状态
linux git clone出现fatal: unable to access Failed to connect to github.com port 443: Timed out解决方案 herosunly C/C++/Linux解决方案 linux git github timeout port 443
大家好，我是herosunly。985院校硕士毕业，现担任算法研究员一职，热衷于机器学习算法研究与应用。曾获得阿里云天池比赛第一名，CCF比赛第二名，科大讯飞比赛第三名。拥有多项发明专利。对机器学习和深度学习拥有自己独到的见解。曾经辅导过若干个非计算机专业的学生进入到算法行业就业。希望和大家一起成长进步。本文主要介绍了linuxgitclone出现fatal:unabletoaccessF
人脸识别的经典深度学习方法明初啥都能学会深度学习人工智能
人脸识别的经典深度学习方法引言1.卷积神经网络（CNN）1.1LeNet1.2AlexNet1.3VGGNet1.4ResNet2.人脸检测2.1Viola-Jones算法2.2基于深度学习的人脸检测3.人脸特征提取3.1主成分分析（PCA）3.2人脸对齐3.2.1基于特征点的对齐3.2.2基于深度学习的对齐4.人脸识别模型4.1传统机器学习方法4.2基于深度学习的方法5.公式解读5.1卷积运算5
基于深度学习的遥感目标检测系统：UI界面、R-CNN模型与数据集准备 2025年数学建模美赛 R-CNN检测系统人工智能深度学习 r语言 cnn python ui 目标检测
一、引言遥感图像中的目标检测在很多领域，如环境监测、土地利用、城市规划、农业资源监测等方面有着广泛应用。遥感图像具有高分辨率和丰富的空间信息，但同时也带来了目标检测中的许多挑战，特别是在目标尺度变化、遮挡和复杂背景的情况下。因此，采用深度学习技术，尤其是卷积神经网络（CNN）和区域卷积神经网络（R-CNN），在遥感图像目标检测中取得了显著的成果。本文将详细介绍基于深度学习的遥感目标检测系统，使用R
ultralytics 是什么？博刻 AI 学习笔记 python
ultralytics是一个用于计算机视觉任务的Python库，专注于提供高效、易用的目标检测、实例分割和图像分类工具。它最著名的功能是实现YOLO（YouOnlyLookOnce）系列模型，特别是最新的YOLOv8。1.YOLO是什么？YOLO是一种流行的目标检测算法，以其速度快和精度高而闻名。YOLO的核心思想是将目标检测问题转化为一个回归问题，直接预测目标的边界框和类别。YOLOv8是YOL
《Python 动画：实现多种不同速度的炫酷烟花效果》后端工匠之道 python 开发语言新手入门表白表白代码爱心烟花
《Python动画：实现多种不同速度的炫酷烟花效果》前言烟花绽放是一个经典的视觉效果，通过Python和Matplotlib，我们可以轻松实现动态的烟花动画效果。本篇文章将教你如何实现多个不同速度、位置的烟花动画，让它们在屏幕上绚丽绽放，占满整个画布。效果预览本代码的最终效果如下，完整代码底部获取：多个烟花随机从屏幕不同位置升空。烟花绽放时，粒子以随机颜色和方向扩散。不同烟花有快有慢，呈现出真实的
Kaggle房价预测一名小菜鸟的学习之路深度学习pytorch 深度学习机器学习 python 人工智能神经网络
Kaggle房价预测作为深度学习基础篇章的总结，我们将对本章内容学以致用。下面，让我们动手实战一个Kaggle比赛：房价预测。本节将提供未经调优的数据的预处理、模型的设计和超参数的选择。我们希望读者通过动手操作、仔细观察实验现象、认真分析实验结果并不断调整方法，得到令自己满意的结果。%matplotlibinlineimporttorchimporttorch.nnasnnimportnumpya
C++ 与机器学习：构建高效推理引擎的秘诀 salsm C++编程魔法师 c++机器学习开发语言
随着深度学习模型逐渐从研究走向生产环境，推理能力成为部署中的关键环节。模型的推理引擎需要以极低的延迟快速处理输入数据，同时最大化地利用硬件资源。虽然Python被广泛用于模型的训练和开发，但C++却在推理领域独占鳌头，其性能优势和硬件控制能力无可替代。在这篇文章中，我们将从为什么选择C++、构建高效推理引擎的细节，以及相似的开源项目三个方面深入探讨如何利用C++打造高效的机器学习推理引擎。目录为什
《动手学深度学习》(PyTorch版) chaser&upper 深度学习 pytorch 深度学习 python
《动手学深度学习》PyTorch版前言简介面向人群食用方法方法一方法二方法三目录原书地址引用阅读指南前言读书啦！！！本项目将《动手学深度学习》原书中MXNet代码实现改为PyTorch实现。原书作者：阿斯顿·张、李沐、扎卡里C.立顿、亚历山大J.斯莫拉以及其他社区贡献者，GitHub地址：https://github.com/d2l-ai/d2l-zh此书的中英版本存在一些不同，针对此书英文版的P
从简单到深刻的认知发展 AI架构设计之禅计算机软件编程原理与应用实践 java python javascript kotlin golang 架构人工智能
认知发展，人工智能，深度学习，神经网络，机器学习，自然语言处理，计算机视觉1.背景介绍认知发展是人类从简单到复杂的思维方式演进的过程，它涉及感知、记忆、语言、推理和决策等多个方面。随着人工智能技术的飞速发展，我们开始尝试用计算机模拟人类的认知能力，构建能够学习、理解和解决复杂问题的智能系统。从早期的符号逻辑到如今的深度学习，人工智能的发展经历了多个阶段。早期的人工智能研究主要集中在规则和逻辑推理上
使用onnxruntime-web 运行yolov8-nano推理 CHEN_RUI_2200 机器学习 YOLO
ONNX（OpenNeuralNetworkExchange）模型具有以下两个特点促成了我们可以使用onnxruntime-web直接在web端上运行推理模型，为了让这个推理更直观，我选择了试验下yolov8识别预览图片：1.跨平台兼容性ONNX是一种开放的格式，可以在不同的深度学习框架之间共享模型，如PyTorch、TensorFlow、MXNet和Caffe2。这使得用户可以在一个框架中训练模
PHP，安卓，UI，java，linux视频教程合集 cocos2d-x小菜 java UI PHP android linux
╔-----------------------------------╗┆
各表中的列名必须唯一。在表 'dbo.XXX' 中多次指定了列名 'XXX'。 bozch .net .net mvc
在.net mvc5中，在执行某一操作的时候，出现了如下错误：各表中的列名必须唯一。在表 'dbo.XXX' 中多次指定了列名 'XXX'。经查询当前的操作与错误内容无关，经过对错误信息的排查发现，事故出现在数据库迁移上。回想过去：在迁移之前已经对数据库进行了添加字段操作，再次进行迁移插入XXX字段的时候，就会提示如上错误。 &
Java 对象大小的计算 e200702084 java
Java对象的大小如何计算一个对象的大小呢？
Mybatis Spring 171815164 mybatis
ApplicationContext ac = new ClassPathXmlApplicationContext("applicationContext.xml"); CustomerService userService = (CustomerService) ac.getBean("customerService"); Customer cust
JVM 不稳定参数 g21121 jvm
-XX 参数被称为不稳定参数，之所以这么叫是因为此类参数的设置很容易引起JVM 性能上的差异，使JVM 存在极大的不稳定性。当然这是在非合理设置的前提下，如果此类参数设置合理讲大大提高JVM 的性能及稳定性。可以说“不稳定参数”
用户自动登录网站永夜-极光用户
1.目标:实现用户登录后,再次登录就自动登录,无需用户名和密码 2.思路:将用户的信息保存为cookie 每次用户访问网站,通过filter拦截所有请求,在filter中读取所有的cookie,如果找到了保存登录信息的cookie,那么在cookie中读取登录信息,然后直接
centos7 安装后失去win7的引导记录程序员是怎么炼成的操作系统
1.使用root身份(必须)打开 /boot/grub2/grub.cfg 2.找到 ### BEGIN /etc/grub.d/30_os-prober ### 在后面添加 menuentry "Windows 7 (loader) (on /dev/sda1)" {
Oracle 10g 官方中文安装帮助文档以及Oracle官方中文教程文档下载 aijuans oracle
Oracle 10g 官方中文安装帮助文档下载：http://download.csdn.net/tag/Oracle%E4%B8%AD%E6%96%87API%EF%BC%8COracle%E4%B8%AD%E6%96%87%E6%96%87%E6%A1%A3%EF%BC%8Coracle%E5%AD%A6%E4%B9%A0%E6%96%87%E6%A1%A3 Oracle 10g 官方中文教程
JavaEE开源快速开发平台G4Studio_V3.2发布了無為子 AOP oracle mysql javaee G4Studio
我非常高兴地宣布,今天我们最新的JavaEE开源快速开发平台G4Studio_V3.2版本已经正式发布。大家可以通过如下地址下载。访问G4Studio网站 http://www.g4it.org G4Studio_V3.2版本变更日志功能新增 (1).新增了系统右下角滑出提示窗口功能。 (2).新增了文件资源的Zip压缩和解压缩
Oracle常用的单行函数应用技巧总结百合不是茶日期函数转换函数(核心)数字函数通用函数(核心)字符函数
单行函数; 字符函数,数字函数,日期函数,转换函数(核心),通用函数(核心) 一:字符函数: .UPPER(字符串) 将字符串转为大写 .LOWER (字符串) 将字符串转为小写 .INITCAP(字符串) 将首字母大写 .LENGTH (字符串) 字符串的长度 .REPLACE(字符串,'A','_') 将字符串字符A转换成_
Mockito异常测试实例 bijian1013 java 单元测试 mockito
Mockito异常测试实例： package com.bijian.study; import static org.mockito.Mockito.mock; import static org.mockito.Mockito.when; import org.junit.Assert; import org.junit.Test; import org.mockito.
GA与量子恒道统计 Bill_chen JavaScript 浏览器百度 Google 防火墙
前一阵子，统计**网址时，Google Analytics（GA）和量子恒道统计（也称量子统计），数据有较大的偏差，仔细找相关资料研究了下，总结如下：为何GA和量子网站统计（量子统计前身为雅虎统计）结果不同？首先：没有一种网站统计工具能保证百分之百的准确出现该问题可能有以下几个原因：（1）不同的统计分析系统的算法机制不同；（2）统计代码放置的位置和前后
【Linux命令三】Top命令 bit1129 linux命令
Linux的Top命令类似于Windows的任务管理器，可以查看当前系统的运行情况，包括CPU、内存的使用情况等。如下是一个Top命令的执行结果： top - 21:22:04 up 1 day, 23:49, 1 user, load average: 1.10, 1.66, 1.99 Tasks: 202 total, 4 running, 198 sl
spring四种依赖注入方式白糖_ spring
平常的java开发中，程序员在某个类中需要依赖其它类的方法，则通常是new一个依赖类再调用类实例的方法，这种开发存在的问题是new的类实例不好统一管理，spring提出了依赖注入的思想，即依赖类不由程序员实例化，而是通过spring容器帮我们new指定实例并且将实例注入到需要该对象的类中。依赖注入的另一种说法是“控制反转”，通俗的理解是：平常我们new一个实例，这个实例的控制权是我
angular.injector boyitech AngularJS AngularJS API
angular.injector 描述: 创建一个injector对象, 调用injector对象的方法可以获得angular的service, 或者用来做依赖注入. 使用方法: angular.injector(modules, [strictDi]) 参数详解: Param Type Details mod
java-同步访问一个数组Integer[10]，生产者不断地往数组放入整数1000，数组满时等待；消费者不断地将数组里面的数置零，数组空时等待 bylijinnan Integer
public class PC { /** * 题目：生产者-消费者。 * 同步访问一个数组Integer[10]，生产者不断地往数组放入整数1000，数组满时等待；消费者不断地将数组里面的数置零，数组空时等待。 */ private static final Integer[] val=new Integer[10]; private static
使用Struts2.2.1配置 Chen.H apache spring Web xml struts
Struts2.2.1 需要如下 jar包: commons-fileupload-1.2.1.jar commons-io-1.3.2.jar commons-logging-1.0.4.jar freemarker-2.3.16.jar javassist-3.7.ga.jar ognl-3.0.jar spring.jar struts2-core-2.2.1.jar struts2-sp
[职业与教育]青春之歌 comsci 教育
每个人都有自己的青春之歌............但是我要说的却不是青春... 大家如果在自己的职业生涯没有给自己以后创业留一点点机会,仅仅凭学历和人脉关系,是难以在竞争激烈的市场中生存下去的.... &nbs
oracle连接(join)中使用using关键字 daizj JOIN oracle sql using
在oracle连接(join)中使用using关键字 34. View the Exhibit and examine the structure of the ORDERS and ORDER_ITEMS tables. Evaluate the following SQL statement: SELECT oi.order_id, product_id, order_date FRO
NIO示例 daysinsun nio
NIO服务端代码： public class NIOServer { private Selector selector; public void startServer(int port) throws IOException { ServerSocketChannel serverChannel = ServerSocketChannel.open(
C语言学习homework1 dcj3sjt126com c homework
0、课堂练习做完 1、使用sizeof计算出你所知道的所有的类型占用的空间。 int x; sizeof(x); sizeof(int); # include <stdio.h> int main(void) { int x1; char x2; double x3; float x4; printf(&quo
select in order by , mysql排序 dcj3sjt126com mysql
If i select like this: SELECT id FROM users WHERE id IN(3,4,8,1); This by default will select users in this order 1,3,4,8, I would like to select them in the same order that i put IN() values so:
页面校验-新建项目 fanxiaolong 页面校验
$(document).ready( function() { var flag = true; $('#changeform').submit(function() { var projectScValNull = true; var s =""; var parent_id = $("#parent_id").v
Ehcache（02）——ehcache.xml简介 234390216 ehcache ehcache.xml 简介
ehcache.xml简介 ehcache.xml文件是用来定义Ehcache的配置信息的，更准确的来说它是定义CacheManager的配置信息的。根据之前我们在《Ehcache简介》一文中对CacheManager的介绍我们知道一切Ehcache的应用都是从CacheManager开始的。在不指定配置信
junit 4.11中三个新功能 jackyrong java
junit 4.11中两个新增的功能，首先是注解中可以参数化，比如 import static org.junit.Assert.assertEquals; import java.util.Arrays; import org.junit.Test; import org.junit.runner.RunWith; import org.junit.runn
国外程序员爱用苹果Mac电脑的10大理由 php教程分享 windows PHP unix Microsoft perl
Mac 在国外很受欢迎，尤其是在设计/web开发/IT 人员圈子里。普通用户喜欢 Mac 可以理解，毕竟 Mac 设计美观，简单好用，没有病毒。那么为什么专业人士也对 Mac 情有独钟呢？从个人使用经验来看我想有下面几个原因： 1、Mac OS X 是基于 Unix 的这一点太重要了，尤其是对开发人员，至少对于我来说很重要，这意味着Unix 下一堆好用的工具都可以随手捡到。如果你是个 wi
位运算、异或的实际应用 wenjinglian 位运算
一．位操作基础，用一张表描述位操作符的应用规则并详细解释。二．常用位操作小技巧，有判断奇偶、交换两数、变换符号、求绝对值。三．位操作与空间压缩，针对筛素数进行空间压缩。 &n
weblogic部署项目出现的一些问题（持续补充中……） Everyday都不同 weblogic部署失败
好吧，weblogic的问题确实…… 问题一： org.springframework.beans.factory.BeanDefinitionStoreException: Failed to read candidate component class: URL [zip:E:/weblogic/user_projects/domains/base_domain/serve
tomcat7性能调优（01） toknowme tomcat7
Tomcat优化： 1、最大连接数最大线程等设置 <Connector port="8082" protocol="HTTP/1.1" useBodyEncodingForURI="t
PO VO DAO DTO BO TO概念与区别 xp9802 java DAO 设计模式 bean 领域模型
O/R Mapping 是 Object Relational Mapping（对象关系映射）的缩写。通俗点讲，就是将对象与关系数据库绑定，用对象来表示关系数据。在O/R Mapping的世界里，有两个基本的也是重要的东东需要了解，即VO，PO。它们的关系应该是相互独立的，一个VO可以只是PO的部分，也可以是多个PO构成，同样也可以等同于一个PO（指的是他们的属性）。这样，PO独立出来，数据持