Big Data Machine Learning Lab

[Paper Translation] Deep Learning

Paper source: Deep learning

Deep Learning

Yann LeCun, Yoshua Bengio & Geoffrey Hinton

Abstract

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state of the art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

Introduction

Machine-learning technology powers many aspects of modern society: from web searches to content filtering on social networks to recommendations on e-commerce websites, and it is increasingly present in consumer products such as cameras and smartphones. Machine-learning systems are used to identify objects in images, transcribe speech into text, match news items, posts or products with users’ interests, and select relevant results of search. Increasingly, these applications make use of a class of techniques called deep learning.
Conventional machine-learning techniques were limited in their ability to process natural data in their raw form. For decades, constructing a pattern-recognition or machine-learning system required careful engineering and considerable domain expertise to design a feature extractor that transformed the raw data (such as the pixel values of an image) into a suitable internal representation or feature vector from which the learning subsystem, often a classifier, could detect or classify patterns in the input.
Representation learning is a set of methods that allows a machine to be fed with raw data and to automatically discover the representations needed for detection or classification. Deep-learning methods are representation-learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex functions can be learned. For classification tasks, higher layers of representation amplify aspects of the input that are important for discrimination and suppress irrelevant variations. An image, for example, comes in the form of an array of pixel values, and the learned features in the first layer of representation typically represent the presence or absence of edges at particular orientations and locations in the image. The second layer typically detects motifs by spotting particular arrangements of edges, regardless of small variations in the edge positions. The third layer may assemble motifs into larger combinations that correspond to parts of familiar objects, and subsequent layers would detect objects as combinations of these parts. The key aspect of deep learning is that these layers of features are not designed by human engineers: they are learned from data using a general-purpose learning procedure.
Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years. It has turned out to be very good at discovering intricate structures in high-dimensional data and is therefore applicable to many domains of science, business and government. In addition to beating records in image recognition [1–4] and speech recognition [5–7], it has beaten other machine-learning techniques at predicting the activity of potential drug molecules [8], analysing particle accelerator data [9,10], reconstructing brain circuits [11], and predicting the effects of mutations in non-coding DNA on gene expression and disease [12,13]. Perhaps more surprisingly, deep learning has produced extremely promising results for various tasks in natural language understanding [14], particularly topic classification, sentiment analysis, question answering [15] and language translation [16,17].
We think that deep learning will have many more successes in the near future because it requires very little engineering by hand, so it can easily take advantage of increases in the amount of available computation and data. New learning algorithms and architectures that are currently being developed for deep neural networks will only accelerate this progress.

Supervised learning

The most common form of machine learning, deep or not, is supervised learning. Imagine that we want to build a system that can classify images as containing, say, a house, a car, a person or a pet. We first collect a large data set of images of houses, cars, people and pets, each labelled with its category. During training, the machine is shown an image and produces an output in the form of a vector of scores, one for each category. We want the desired category to have the highest score of all categories, but this is unlikely to happen before training. We compute an objective function that measures the error (or distance) between the output scores and the desired pattern of scores. The machine then modifies its internal adjustable parameters to reduce this error. These adjustable parameters, often called weights, are real numbers that can be seen as ‘knobs’ that define the input–output function of the machine. In a typical deep-learning system, there may be hundreds of millions of these adjustable weights, and hundreds of millions of labelled examples with which to train the machine.
To properly adjust the weight vector, the learning algorithm computes a gradient vector that, for each weight, indicates by what amount the error would increase or decrease if the weight were increased by a tiny amount. The weight vector is then adjusted in the opposite direction to the gradient vector.
The objective function, averaged over all the training examples, can be seen as a kind of hilly landscape in the high-dimensional space of weight values. The negative gradient vector indicates the direction of steepest descent in this landscape, taking it closer to a minimum, where the output error is low on average.
In practice, most practitioners use a procedure called stochastic gradient descent (SGD). This consists of showing the input vector for a few examples, computing the outputs and the errors, computing the average gradient for those examples, and adjusting the weights accordingly. The process is repeated for many small sets of examples from the training set until the average of the objective function stops decreasing. It is called stochastic because each small set of examples gives a noisy estimate of the average gradient over all examples. This simple procedure usually finds a good set of weights surprisingly quickly when compared with far more elaborate optimization techniques [18]. After training, the performance of the system is measured on a different set of examples called a test set. This serves to test the generalization ability of the machine, that is, its ability to produce sensible answers on new inputs that it has never seen during training.
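The SGD loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: a one-dimensional linear model with squared error stands in for a deep network, and the function name `sgd_linear_fit`, the learning rate, the batch size and the toy data are all hypothetical choices rather than anything specified in the paper.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.1, batch_size=4, epochs=300, seed=0):
    """Fit y ~ w*x + b by mini-batch stochastic gradient descent (toy sketch)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the training set in a new random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average gradient of the squared error over this small set of
            # examples: a noisy estimate of the gradient over all examples.
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                err = (w * xs[i] + b) - ys[i]
                grad_w += 2.0 * err * xs[i] / len(batch)
                grad_b += 2.0 * err / len(batch)
            # Adjust the weights in the direction opposite to the gradient.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy data generated from y = 3x - 1 (illustrative only).
xs = [i / 10 for i in range(20)]
ys = [3.0 * x - 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(w, b)
```

In a real system the fitted weights would then be evaluated on a held-out test set, as the paragraph above notes; the toy data here has no noise, so the fit should recover the generating line closely.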
Many of the current practical applications of machine learning use linear classifiers on top of hand-engineered features. A two-class linear classifier computes a weighted sum of the feature vector components. If the weighted sum is above a threshold, the input is classified as belonging to a particular category.
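As a minimal sketch of the two-class linear classifier just described, the snippet below thresholds a weighted sum of feature components. The function name `linear_classify` and the weight, feature and threshold values are hypothetical and chosen only for illustration.

```python
def linear_classify(features, weights, threshold):
    """Return True if the weighted sum of the features exceeds the threshold."""
    score = sum(w * x for w, x in zip(weights, features))
    return score > threshold


# Hypothetical weights for a three-component feature vector.
weights = [0.8, -0.5, 0.3]
in_category = linear_classify([1.0, 0.2, 0.5], weights, threshold=0.5)
not_in_category = linear_classify([0.0, 1.0, 0.0], weights, threshold=0.5)
print(in_category, not_in_category)
```

The decision boundary `score == threshold` is a hyperplane in feature space, which is why such a classifier can only split its input space into two half-spaces.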
Since the 1960s we have known that linear classifiers can only carve their input space into very simple regions, namely half-spaces separated by a hyperplane [19]. But problems such as image and speech recognition require the input–output function to be insensitive to irrelevant variations of the input, such as variations in position, orientation or illumination of an object, or variations in the pitch or accent of speech, while being very sensitive to particular minute variations (for example, the difference between a white wolf and a breed of wolf-like white dog called a Samoyed). At the pixel level, images of two Samoyeds in different poses and in different environments may be very different from each other, whereas two images of a Samoyed and a wolf in the same position and on similar backgrounds may be very similar to each other. A linear classifier, or any other ‘shallow’ classifier operating on raw pixels, could not possibly distinguish the latter two while putting the former two in the same category. This is why shallow classifiers require a good feature extractor that solves the selectivity–invariance dilemma: one that produces representations that are selective to the aspects of the image that are important for discrimination, but that are invariant to irrelevant aspects such as the pose of the animal. To make classifiers more powerful, one can use generic non-linear features, as with kernel methods [20], but generic features such as those arising with the Gaussian kernel do not allow the learner to generalize well far from the training examples [21]. The conventional option is to hand design good feature extractors, which requires a considerable amount of engineering skill and domain expertise. But this can all be avoided if good features can be learned automatically using a general-purpose learning procedure. This is the key advantage of deep learning.

A deep-learning architecture is a multilayer stack of simple modules, all (or most) of which are subject to learning, and many of which compute non-linear input–output mappings. Each module in the stack transforms its input to increase both the selectivity and the invariance of the representation. With multiple non-linear layers, say a depth of 5 to 20, a system can implement extremely intricate functions of its inputs that are simultaneously sensitive to minute details (distinguishing Samoyeds from white wolves) and insensitive to large irrelevant variations such as the background, pose, lighting and surrounding objects.

Backpropagation to train multilayer architectures

From the earliest days of pattern recognition [22,23], the aim of researchers has been to replace hand-engineered features with trainable multilayer networks, but despite its simplicity, the solution was not widely understood until the mid-1980s. As it turns out, multilayer architectures can be trained by simple stochastic gradient descent. As long as the modules are relatively smooth functions of their inputs and of their internal weights, one can compute gradients using the backpropagation procedure. The idea that this could be done, and that it worked, was discovered independently by several different groups during the 1970s and 1980s [24–27].
The backpropagation procedure to compute the gradient of an objective function with respect to the weights of a multilayer stack of modules is nothing more than a practical application of the chain rule for derivatives. The key insight is that the derivative (or gradient) of the objective with respect to the input of a module can be computed by working backwards from the gradient with respect to the output of that module (or the input of the subsequent module) (Fig. 1). The backpropagation equation can be applied repeatedly to propagate gradients through all modules, starting from the output at the top (where the network produces its prediction) all the way to the bottom (where the external input is fed). Once these gradients have been computed, it is straightforward to compute the gradients with respect to the weights of each module.
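The chain-rule bookkeeping described above can be made concrete on a very small example. The sketch below assumes a hypothetical two-module scalar network, `y = w2 * tanh(w1 * x)`, with a squared-error objective; the function names and numeric values are illustrative and not from the paper. The backward pass is compared against a finite-difference estimate as a sanity check.

```python
import math


def forward(x, w1, w2):
    """Two stacked modules: z = w1*x, h = tanh(z), y = w2*h."""
    z = w1 * x
    h = math.tanh(z)
    y = w2 * h
    return z, h, y


def loss(x, t, w1, w2):
    """Squared-error objective for target t."""
    _, _, y = forward(x, w1, w2)
    return 0.5 * (y - t) ** 2


def backward(x, t, w1, w2):
    """Propagate gradients from the output back to the input module."""
    _, h, y = forward(x, w1, w2)
    dL_dy = y - t                    # gradient at the top (output)
    dL_dw2 = dL_dy * h               # weight gradient of the top module
    dL_dh = dL_dy * w2               # gradient w.r.t. the top module's input
    dL_dz = dL_dh * (1.0 - h * h)    # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    dL_dw1 = dL_dz * x               # weight gradient of the bottom module
    return dL_dw1, dL_dw2


x, t, w1, w2 = 0.7, 0.2, 0.5, -1.3
g1, g2 = backward(x, t, w1, w2)

# Central finite differences as an independent check of the gradients.
eps = 1e-6
num_g1 = (loss(x, t, w1 + eps, w2) - loss(x, t, w1 - eps, w2)) / (2 * eps)
num_g2 = (loss(x, t, w1, w2 + eps) - loss(x, t, w1, w2 - eps)) / (2 * eps)
print(g1, num_g1, g2, num_g2)
```

Each line of `backward` reuses the gradient computed for the module above it, which is the repeated application of the backpropagation equation that the text describes.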

Many applications of deep learning use feedforward neural network architectures (Fig. 1), which learn to map a fixed-size input (for example, an image) to a fixed-size output (for example, a probability for each of several categories). To go from one layer to the next, a set of units compute a weighted sum of their inputs from the previous layer and pass the result through a non-linear function. At present, the most popular non-linear function is the rectified linear unit (ReLU), which is simply the half-wave rectifier f(z) = max(z, 0). In past decades, neural nets used smoother non-linearities, such as tanh(z) or 1/(1 + exp(−z)), but the ReLU typically learns much faster in networks with many layers, allowing training of a deep supervised network without unsupervised pre-training [28]. Units that are not in the input or output layer are conventionally called hidden units. The hidden layers can be seen as distorting the input in a non-linear way so that categories become linearly separable by the last layer (Fig. 1).
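A minimal sketch of one layer-to-layer step as described above: each unit takes a weighted sum of the previous layer's outputs and passes it through a non-linearity. The helper name `layer_forward` and the weight values are hypothetical; only the formulas for ReLU and the logistic function come from the text.

```python
import math


def relu(z):
    """Half-wave rectifier f(z) = max(z, 0)."""
    return max(z, 0.0)


def sigmoid(z):
    """Logistic non-linearity 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, activation):
    """One row of weights per unit: weighted sum of inputs, then non-linearity."""
    return [activation(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]


# Two inputs feeding two hidden units (illustrative weights).
hidden = layer_forward([1.0, -2.0], [[0.5, 0.25], [-1.0, -1.0]], relu)
print(hidden, sigmoid(0.0), math.tanh(0.0))
```

The first unit's weighted sum is exactly zero and the second's is positive, so the ReLU passes only the second through unchanged.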
In the late 1990s, neural nets and backpropagation were largely forsaken by the machine-learning community and ignored by the computer-vision and speech-recognition communities. It was widely thought that learning useful, multistage, feature extractors with little prior knowledge was infeasible. In particular, it was commonly thought that simple gradient descent would get trapped in poor local minima, that is, weight configurations for which no small change would reduce the average error.
In practice, poor local minima are rarely a problem with large networks. Regardless of the initial conditions, the system nearly always reaches solutions of very similar quality. Recent theoretical and empirical results strongly suggest that local minima are not a serious issue in general. Instead, the landscape is packed with a combinatorially large number of saddle points where the gradient is zero, and the surface curves up in most dimensions and curves down in the remainder [29,30]. The analysis seems to show that saddle points with only a few downward curving directions are present in very large numbers, but almost all of them have very similar values of the objective function. Hence, it does not much matter which of these saddle points the algorithm gets stuck at.
Interest in deep feedforward networks was revived around 2006 (refs 31–34) by a group of researchers brought together by the Canadian Institute for Advanced Research (CIFAR). The researchers introduced unsupervised learning procedures that could create layers of feature detectors without requiring labelled data. The objective in learning each layer of feature detectors was to be able to reconstruct or model the activities of feature detectors (or raw inputs) in the layer below. By ‘pre-training’ several layers of progressively more complex feature detectors using this reconstruction objective, the weights of a deep network could be initialized to sensible values. A final layer of output units could then be added to the top of the network and the whole deep system could be fine-tuned using standard backpropagation [33–35]. This worked remarkably well for recognizing handwritten digits or for detecting pedestrians, especially when the amount of labelled data was very limited [36].
The first major application of this pre-training approach was in speech recognition, and it was made possible by the advent of fast graphics processing units (GPUs) that were convenient to program [37] and allowed researchers to train networks 10 or 20 times faster. In 2009, the approach was used to map short temporal windows of coefficients extracted from a sound wave to a set of probabilities for the various fragments of speech that might be represented by the frame in the centre of the window. It achieved record-breaking results on a standard speech recognition benchmark that used a small vocabulary [38] and was quickly developed to give record-breaking results on a large vocabulary task [39]. By 2012, versions of the deep net from 2009 were being developed by many of the major speech groups [6] and were already being deployed in Android phones. For smaller data sets, unsupervised pre-training helps to prevent overfitting [40], leading to significantly better generalization when the number of labelled examples is small, or in a transfer setting where we have lots of examples for some ‘source’ tasks but very few for some ‘target’ tasks. Once deep learning had been rehabilitated, it turned out that the pre-training stage was only needed for small data sets.
There was, however, one particular type of deep, feedforward network that was much easier to train and generalized much better than networks with full connectivity between adjacent layers. This was the convolutional neural network (ConvNet) [41,42]. It achieved many practical successes during the period when neural networks were out of favour and it has recently been widely adopted by the computer-vision community.

Convolutional neural networks

ConvNets are designed to process data that come in the form of multiple arrays, for example a colour image composed of three 2D arrays containing pixel intensities in the three colour channels. Many data modalities are in the form of multiple arrays: 1D for signals and sequences, including language; 2D for images or audio spectrograms; and 3D for video or volumetric images. There are four key ideas behind ConvNets that take advantage of the properties of natural signals: local connections, shared weights, pooling and the use of many layers.
The architecture of a typical ConvNet (Fig. 2) is structured as a series of stages. The first few stages are composed of two types of layers: convolutional layers and pooling layers. Units in a convolutional layer are organized in feature maps, within which each unit is connected to local patches in the feature maps of the previous layer through a set of weights called a filter bank. The result of this local weighted sum is then passed through a non-linearity such as a ReLU. All units in a feature map share the same filter bank. Different feature maps in a layer use different filter banks. The reason for this architecture is twofold. First, in array data such as images, local groups of values are often highly correlated, forming distinctive local motifs that are easily detected. Second, the local statistics of images and other signals are invariant to location. In other words, if a motif can appear in one part of the image, it could appear anywhere, hence the idea of units at different locations sharing the same weights and detecting the same pattern in different parts of the array. Mathematically, the filtering operation performed by a feature map is a discrete convolution, hence the name.
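To illustrate local connections and shared weights, here is a minimal sketch of a single feature map computed over one 2D input array. The function name `feature_map`, the 2×2 filter and the toy image are hypothetical. Note one assumption: like most ConvNet code, the sketch slides the filter without flipping it (a cross-correlation), which equals a discrete convolution with the flipped filter.

```python
def feature_map(image, kernel):
    """Slide one shared filter over every local patch, then apply a ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Local weighted sum: the same weights are used at every (r, c).
            s = sum(kernel[i][j] * image[r + i][c + j]
                    for i in range(kh) for j in range(kw))
            row.append(max(s, 0.0))  # ReLU non-linearity
        out.append(row)
    return out


# Toy image with a vertical dark-to-bright edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
# Hypothetical filter that responds to such an edge.
edge_filter = [[-1.0, 1.0],
               [-1.0, 1.0]]
fmap = feature_map(image, edge_filter)
print(fmap)
```

Because the same filter is applied at every position, the unit at the edge responds wherever the edge appears in the array, which is the location-invariance argument made in the paragraph above.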
Although the role of the convolutional layer is to detect local conjunctions of features from the previous layer, the role of the pooling layer is to merge semantically similar features into one. Because the relative positions of the features forming a motif can vary somewhat, reliably detecting the motif can be done by coarse-graining the position of each feature. A typical pooling unit computes the maximum of a local patch of units in one feature map (or in a few feature maps). Neighbouring pooling units take input from patches that are shifted by more than one row or column, thereby reducing the dimension of the representation and creating an invariance to small shifts and distortions. Two or three stages of convolution, non-linearity and pooling are stacked, followed by more convolutional and fully-connected layers. Backpropagating gradients through a ConvNet is as simple as through a regular deep network, allowing all the weights in all the filter banks to be trained.
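The pooling step can be sketched as follows, assuming non-overlapping 2×2 max pooling with a stride of 2; the function name `max_pool` and the toy feature maps are hypothetical. The example shows two inputs whose single active feature sits at slightly different positions producing the same pooled output.

```python
def max_pool(fmap, size=2, stride=2):
    """Take the maximum over local patches shifted by `stride` rows/columns."""
    out = []
    for r in range(0, len(fmap) - size + 1, stride):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, stride):
            row.append(max(fmap[r + i][c + j]
                           for i in range(size) for j in range(size)))
        out.append(row)
    return out


# The same feature, detected at (1, 1) in one map and at (0, 0) in the other.
a = [[0, 0, 0, 0],
     [0, 9, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
b = [[9, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
pooled_a = max_pool(a)
pooled_b = max_pool(b)
print(pooled_a, pooled_b)
```

The 4×4 maps shrink to 2×2, and the small shift between `a` and `b` disappears after pooling; a shift that crossed a patch boundary would not be absorbed, so the invariance holds only for small displacements.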
Deep neural networks exploit the property that many natural signals are compositional hierarchies, in which higher-level features are obtained by composing lower-level ones. In images, local combinations of edges form motifs, motifs assemble into parts, and parts form objects. Similar hierarchies exist in speech and text from sounds to phones, phonemes, syllables, words and sentences. The pooling allows representations to vary very little when elements in the previous layer vary in position and appearance.

The convolutional and pooling layers in ConvNets are directly inspired by the classic notions of simple cells and complex cells in visual neuroscience [43], and the overall architecture is reminiscent of the LGN–V1–V2–V4–IT hierarchy in the visual cortex ventral pathway [44]. When ConvNet models and monkeys are shown the same picture, the activations of high-level units in the ConvNet explain half of the variance of random sets of 160 neurons in the monkey’s inferotemporal cortex [45]. ConvNets have their roots in the neocognitron [46], the architecture of which was somewhat similar, but did not have an end-to-end supervised-learning algorithm such as backpropagation. A primitive 1D ConvNet called a time-delay neural net was used for the recognition of phonemes and simple words [47,48].
There have been numerous applications of convolutional networks going back to the early 1990s, starting with time-delay neural networks for speech recognition [47] and document reading [42]. The document reading system used a ConvNet trained jointly with a probabilistic model that implemented language constraints. By the late 1990s this system was reading over 10% of all the cheques in the United States. A number of ConvNet-based optical character recognition and handwriting recognition systems were later deployed by Microsoft [49]. ConvNets were also experimented with in the early 1990s for object detection in natural images, including faces and hands [50,51], and for face recognition [52].

Image understanding with deep convolutional networks

Since the early 2000s, ConvNets have been applied with great success to the detection, segmentation and recognition of objects and regions in images. These were all tasks in which labelled data was relatively abundant, such as traffic sign recognition [53], the segmentation of biological images [54] particularly for connectomics [55], and the detection of faces, text, pedestrians and human bodies in natural images [36,50,51,56–58]. A major recent practical success of ConvNets is face recognition [59].
Importantly, images can be labelled at the pixel level, which will have applications in technology, including autonomous mobile robots and self-driving cars [60,61]. Companies such as Mobileye and NVIDIA are using such ConvNet-based methods in their upcoming vision systems for cars. Other applications gaining importance involve natural language understanding [14] and speech recognition [7].
Despite these successes, ConvNets were largely forsaken by the mainstream computer-vision and machine-learning communities until the ImageNet competition in 2012. When deep convolutional networks were applied to a data set of about a million images from the web that contained 1,000 different classes, they achieved spectacular results, almost halving the error rates of the best competing approaches [1]. This success came from the efficient use of GPUs, ReLUs, a new regularization technique called dropout [62], and techniques to generate more training examples by deforming the existing ones. This success has brought about a revolution in computer vision; ConvNets are now the dominant approach for almost all recognition and detection tasks [4,58,59,63–65] and approach human performance on some tasks. A recent stunning demonstration combines ConvNets and recurrent net modules for the generation of image captions (Fig. 3).
Recent ConvNet architectures have 10 to 20 layers of ReLUs, hundreds of millions of weights, and billions of connections between units. Whereas training such large networks could have taken weeks only two years ago, progress in hardware, software and algorithm parallelization has reduced training times to a few hours.
The performance of ConvNet-based vision systems has caused most major technology companies, including Google, Facebook, Microsoft, IBM, Yahoo!, Twitter and Adobe, as well as a quickly growing number of start-ups, to initiate research and development projects and to deploy ConvNet-based image understanding products and services.
ConvNets are easily amenable to efficient hardware implementations in chips or field-programmable gate arrays [66,67]. A number of companies such as NVIDIA, Mobileye, Intel, Qualcomm and Samsung are developing ConvNet chips to enable real-time vision applications in smartphones, cameras, robots and self-driving cars.

Distributed representations and language processing

Deep-learning theory shows that deep nets have two different exponential advantages over classic learning algorithms that do not use distributed representations [21]. Both of these advantages arise from the power of composition and depend on the underlying data-generating distribution having an appropriate componential structure [40]. First, learning distributed representations enables generalization to new combinations of the values of learned features beyond those seen during training (for example, 2^n combinations are possible with n binary features) [68,69]. Second, composing layers of representation in a deep net brings the potential for another exponential advantage [70] (exponential in the depth).
The hidden layers of a multilayer neural network learn to represent the network’s inputs in a way that makes it easy to predict the target outputs. This is nicely demonstrated by training a multilayer neural network to predict the next word in a sequence from a local context of earlier words 71. Each word in the context is presented to the network as a one-of-N vector, that is, one component has a value of 1 and the rest are 0. In the first layer, each word creates a different pattern of activations, or word vectors (Fig. 4). In a language model, the other layers of the network learn to convert the input word vectors into an output word vector for the predicted next word, which can be used to predict the probability for any word in the vocabulary to appear as the next word. The network learns word vectors that contain many active components each of which can be interpreted as a separate feature of the word, as was first demonstrated 27 in the context of learning distributed representations for symbols. These semantic features were not explicitly present in the input. They were discovered by the learning procedure as a good way of factorizing the structured relationships between the input and output symbols into multiple ‘micro-rules’. Learning word vectors turned out to also work very well when the word sequences come from a large corpus of real text and the individual micro-rules are unreliable 71. When trained to predict the next word in a news story, for example, the learned word vectors for Tuesday and Wednesday are very similar, as are the word vectors for Sweden and Norway. Such representations are called distributed representations because their elements (the features) are not mutually exclusive and their many configurations correspond to the variations seen in the observed data. These word vectors are composed of learned features that were not determined ahead of time by experts, but automatically discovered by the neural network. Vector representations of words learned from text are now very widely used in natural language applications 14,17,72–76.

多层神经网络的隐藏层学习以一种便于预测目标输出的方式来表示网络的输入。训练一个多层神经网络，让它根据前面若干单词构成的局部上下文来预测序列中的下一个单词，就可以很好地证明这一点。上下文中的每个单词都以N选一（one-of-N）向量的形式呈现给网络，也就是说，其中一个分量的值为1，其余分量为0。在第一层中，每个单词产生不同的激活模式，即词向量（图4）。在语言模型中，网络的其他层学习将输入词向量转换为预测的下一个单词的输出词向量，该向量可用于预测词汇表中任一单词作为下一个单词出现的概率。网络学到的词向量包含许多活跃分量，其中每一个都可以解释为单词的一个独立特征，这一点最早是在学习符号的分布式表示时得到展示的。这些语义特征并未显式地出现在输入中。学习过程发现，它们是将输入和输出符号之间的结构化关系分解为多个“微规则”的好方法。当单词序列来自大型真实文本语料库、且单个微规则并不可靠时，学习词向量同样非常有效。例如，当训练网络预测新闻报道中的下一个单词时，学到的“星期二”和“星期三”的词向量非常相似，“瑞典”和“挪威”的词向量也是如此。这种表示被称为分布式表示，因为它们的元素（特征）不是互斥的，并且它们的多种配置对应于观测数据中出现的各种变化。这些词向量由学到的特征组成，这些特征不是由专家预先确定的，而是由神经网络自动发现的。从文本中学到的单词向量表示如今在自然语言应用中得到了非常广泛的使用。
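
As a rough illustration of the first layer described above, the sketch below multiplies a one-of-N vector by a weight matrix, which simply selects one row as the word vector. The four-word vocabulary, the three-dimensional weights and all function names are made up for illustration; in a real model the weights would be learned, not written by hand.

```python
def one_hot(index, size):
    """Return a one-of-N vector: 1.0 at `index`, 0.0 everywhere else."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def first_layer(one_hot_vec, weights):
    """Multiply a one-of-N row vector by a weight matrix (N rows, D columns)."""
    dims = len(weights[0])
    return [sum(x * row[d] for x, row in zip(one_hot_vec, weights))
            for d in range(dims)]


# Hypothetical vocabulary and hand-picked weights (assumptions, not learned values).
vocab = ["tuesday", "wednesday", "sweden", "norway"]
weights = [
    [0.9, 0.1, 0.0],  # tuesday
    [0.8, 0.2, 0.0],  # wednesday
    [0.0, 0.1, 0.9],  # sweden
    [0.1, 0.0, 0.8],  # norway
]


def word_vector(word):
    """Look up a word vector by pushing its one-of-N code through the first layer."""
    return first_layer(one_hot(vocab.index(word), len(vocab)), weights)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)
```

With these hand-picked weights, `cosine(word_vector("tuesday"), word_vector("wednesday"))` is much larger than `cosine(word_vector("tuesday"), word_vector("sweden"))`, which mirrors the kind of similarity the paragraph describes for learned vectors.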

The issue of representation lies at the heart of the debate between the logic-inspired and the neural-network-inspired paradigms for cognition. In the logic-inspired paradigm, an instance of a symbol is something for which the only property is that it is either identical or non-identical to other symbol instances. It has no internal structure that is relevant to its use; and to reason with symbols, they must be bound to the variables in judiciously chosen rules of inference. By contrast, neural networks just use big activity vectors, big weight matrices and scalar non-linearities to perform the type of fast ‘intuitive’ inference that underpins effortless commonsense reasoning.
表示问题是逻辑启发范式与神经网络启发范式这两种认知范式之争的核心。在逻辑启发的范式中，符号实例的唯一属性是它与其他符号实例相同或不相同。它没有与其用途相关的内部结构；要用符号进行推理，就必须把它们绑定到经过审慎选择的推理规则中的变量上。相比之下，神经网络只使用大型活动向量、大型权重矩阵和标量非线性来执行快速的“直觉”推理，这种推理是毫不费力的常识推理的基础。
Before the introduction of neural language models 71, the standard approach to statistical modelling of language did not exploit distributed representations: it was based on counting frequencies of occurrences of short symbol sequences of length up to N (called N-grams). The number of possible N-grams is on the order of V^N, where V is the vocabulary size, so taking into account a context of more than a handful of words would require very large training corpora. N-grams treat each word as an atomic unit, so they cannot generalize across semantically related sequences of words, whereas neural language models can because they associate each word with a vector of real valued features, and semantically related words end up close to each other in that vector space (Fig. 4).
在引入神经语言模型71之前，语言统计建模的标准方法没有利用分布式表示：它基于统计长度不超过N的短符号序列（称为N-gram）的出现频率。可能的N-gram的数量在V^N的量级，其中V是词汇表的大小，因此若要考虑超过寥寥数个单词的上下文，就需要非常大的训练语料库。N-gram将每个单词视为一个原子单元，因此无法在语义相关的单词序列之间进行泛化；而神经语言模型可以，因为它们将每个单词与一个实值特征向量相关联，语义相关的单词在该向量空间中彼此接近（图4）。
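
To make the counting argument concrete, here is a minimal sketch of an N-gram counter together with the V^N growth in the number of possible N-grams. The tiny corpus and the vocabulary size of 100,000 are assumptions chosen only for illustration.

```python
from collections import Counter


def possible_ngrams(vocab_size, n):
    """Upper bound on the number of distinct N-grams: V to the power N."""
    return vocab_size ** n


def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens in the sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical one-sentence corpus.
corpus = "the meeting is on tuesday".split()
bigrams = ngram_counts(corpus, 2)

# The seen bigram has a count; the semantically related one was never observed,
# and a pure count-based model has no way to relate "wednesday" to "tuesday".
seen = bigrams[("on", "tuesday")]
unseen = bigrams[("on", "wednesday")]

# Growth of the table size for an assumed vocabulary of 100,000 words.
table_sizes = {n: possible_ngrams(100_000, n) for n in (1, 2, 3)}
```

Here `seen` is 1 and `unseen` is 0, while `table_sizes[3]` is already 10^15, which is why longer contexts demand very large corpora.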

Recurrent neural networks

When backpropagation was first introduced, its most exciting use was for training recurrent neural networks (RNNs). For tasks that involve sequential inputs, such as speech and language, it is often better to use RNNs (Fig. 5). RNNs process an input sequence one element at a time, maintaining in their hidden units a ‘state vector’ that implicitly contains information about the history of all the past elements of the sequence. When we consider the outputs of the hidden units at different discrete time steps as if they were the outputs of different neurons in a deep multilayer network (Fig. 5, right), it becomes clear how we can apply backpropagation to train RNNs.

反向传播刚被提出时，它最令人兴奋的用途是训练递归神经网络（RNN）。对于涉及序列输入的任务，例如语音和语言，使用RNN通常更好（图5）。RNN每次处理输入序列中的一个元素，并在其隐藏单元中维护一个“状态向量”，该向量隐含地包含了序列中所有过去元素的历史信息。如果把隐藏单元在不同离散时间步的输出看作深层多层网络中不同神经元的输出（图5右），如何应用反向传播来训练RNN就很清楚了。
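
The state update can be sketched as a loop that reuses the same weights at every time step. This is only an illustration: the tanh non-linearity, the absence of bias terms, and the small hand-written matrices `W_STATE` and `W_INPUT` are assumptions, not details given in the text.

```python
import math


def rnn_step(state, x, w_state, w_input):
    """One time step: new_state[i] = tanh(w_state[i] . state + w_input[i] . x)."""
    return [
        math.tanh(
            sum(w * s for w, s in zip(w_state[i], state))
            + sum(w * v for w, v in zip(w_input[i], x))
        )
        for i in range(len(state))
    ]


def run_rnn(inputs, w_state, w_input, hidden_size):
    """Process a sequence one element at a time, returning the state after each step."""
    state = [0.0] * hidden_size
    states = []
    for x in inputs:  # the same weights are shared across all time steps
        state = rnn_step(state, x, w_state, w_input)
        states.append(state)
    return states


# Hypothetical weights for 2 hidden units and 1-dimensional inputs.
W_STATE = [[0.5, -0.3], [0.2, 0.4]]
W_INPUT = [[1.0], [-1.0]]

states = run_rnn([[1.0], [0.0]], W_STATE, W_INPUT, hidden_size=2)
```

The second input here is zero, yet the final state is non-zero because it still carries information about the first input. Reading the list `states` from left to right corresponds to the unfolded view of the network, with one "layer" per time step.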
RNNs are very powerful dynamic systems, but training them has proved to be problematic because the backpropagated gradients either grow or shrink at each time step, so over many time steps they typically explode or vanish 77,78.
RNN是非常强大的动态系统，但事实证明训练它们存在困难，因为反向传播的梯度在每个时间步都会增大或缩小，因此经过许多时间步后，它们通常会爆炸或消失。
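
A one-dimensional caricature shows why repeated growth or shrinkage is a problem: if the gradient is scaled by roughly the same factor at each time step, the overall scale is that factor raised to the number of steps. The constant per-step factor is a simplifying assumption; in a real RNN the factor varies from step to step and depends on the weights and activations.

```python
def backpropagated_scale(per_step_factor, steps):
    """Scale applied to a gradient after passing back through `steps` time steps."""
    scale = 1.0
    for _ in range(steps):
        scale *= per_step_factor
    return scale


shrinking = backpropagated_scale(0.9, 100)  # vanishes: well below 1e-4
growing = backpropagated_scale(1.1, 100)    # explodes: well above 1e4
```

Even factors this close to 1 differ by roughly nine orders of magnitude after 100 steps.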
Thanks to advances in their architecture 79,80 and ways of training them 81,82, RNNs have been found to be very good at predicting the next character in the text 83 or the next word in a sequence 75, but they can also be used for more complex tasks. For example, after reading an English sentence one word at a time, an English ‘encoder’ network can be trained so that the final state vector of its hidden units is a good representation of the thought expressed by the sentence. This thought vector can then be used as the initial hidden state of (or as extra input to) a jointly trained French ‘decoder’ network, which outputs a probability distribution for the first word of the French translation. If a particular first word is chosen from this distribution and provided as input to the decoder network it will then output a probability distribution for the second word of the translation and so on until a full stop is chosen 17,72,76. Overall, this process generates sequences of French words according to a probability distribution that depends on the English sentence. This rather naive way of performing machine translation has quickly become competitive with the state-of-the-art, and this raises serious doubts about whether understanding a sentence requires anything like the internal symbolic expressions that are manipulated by using inference rules. It is more compatible with the view that everyday reasoning involves many simultaneous analogies that each contribute plausibility to a conclusion 84,85.
得益于其体系结构和训练方法的进步，人们发现RNN非常善于预测文本中的下一个字符或序列中的下一个单词，但它们也可以用于更复杂的任务。例如，在逐词读完一个英语句子之后，可以训练一个英语“编码器”网络，使其隐藏单元的最终状态向量能够很好地表示句子所表达的思想。然后，这个思想向量可以用作一个联合训练的法语“解码器”网络的初始隐藏状态（或作为其额外输入），该网络输出法语译文第一个单词的概率分布。如果从这个分布中选出某个特定的首词并将其作为输入提供给解码器网络，它就会输出译文第二个单词的概率分布，依此类推，直到选中句号为止。总的来说，这个过程根据一个依赖于英语句子的概率分布来生成法语单词序列。这种相当朴素的机器翻译方式很快就达到了可与最先进技术相媲美的水平，这使人们严重怀疑：理解一个句子是否真的需要类似于用推理规则来操纵的内部符号表达式那样的东西。它更符合这样一种观点：日常推理涉及许多同时进行的类比，每个类比都为结论增添一份可信度。
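
The word-by-word generation loop can be sketched as follows. This is a toy under stated assumptions: `toy_decoder_step` is a hypothetical stand-in for a trained decoder (it ignores the state and reads a fixed table), and the word is chosen greedily as the most probable one, whereas the text only says that a word is chosen from the distribution.

```python
def greedy_decode(initial_state, decoder_step, start_token="<s>",
                  stop_token=".", max_len=20):
    """Feed each chosen word back into the decoder until a full stop is chosen."""
    state, word, output = initial_state, start_token, []
    for _ in range(max_len):
        probs, state = decoder_step(state, word)  # distribution over next words
        word = max(probs, key=probs.get)          # assumption: pick the most probable word
        output.append(word)
        if word == stop_token:
            break
    return output


# Hypothetical next-word distributions standing in for a trained French decoder.
TOY_TABLE = {
    "<s>": {"le": 0.7, "un": 0.3},
    "le": {"chat": 0.6, "chien": 0.4},
    "chat": {"dort": 0.8, ".": 0.2},
    "dort": {".": 0.9, "le": 0.1},
}


def toy_decoder_step(state, previous_word):
    """Return (distribution, new_state); a real decoder would also update the state."""
    return TOY_TABLE[previous_word], state


translation = greedy_decode(None, toy_decoder_step)
```

In a real system the `initial_state` argument would be the thought vector produced by the encoder, so the distributions would depend on the English sentence.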

Instead of translating the meaning of a French sentence into an English sentence, one can learn to ‘translate’ the meaning of an image into an English sentence (Fig. 3). The encoder here is a deep ConvNet that converts the pixels into an activity vector in its last hidden layer. The decoder is an RNN similar to the ones used for machine translation and neural language modelling. There has been a surge of interest in such systems recently (see examples mentioned in ref. 86).
人们也可以不去把法语句子的意思翻译成英语句子，而是学习把图像的意思“翻译”成英语句子（图3）。这里的编码器是一个深度ConvNet，它将像素转换为其最后一个隐藏层中的活动向量。解码器是一个RNN，与用于机器翻译和神经语言建模的RNN类似。最近人们对这类系统的兴趣激增（参见参考文献86中提到的例子）。
RNNs, once unfolded in time (Fig. 5), can be seen as very deep feedforward networks in which all the layers share the same weights. Although their main purpose is to learn long-term dependencies, theoretical and empirical evidence shows that it is difficult to learn to store information for very long 78.
RNN一旦在时间上展开（图5），就可以看作是非常深的前馈网络，其中所有层共享相同的权重。虽然它们的主要目的是学习长期依赖关系，但理论和经验证据表明，学会长时间存储信息是很困难的。
To correct for that, one idea is to augment the network with an explicit memory. The first proposal of this kind is the long short-term memory (LSTM) networks that use special hidden units, the natural behaviour of which is to remember inputs for a long time 79. A special unit called the memory cell acts like an accumulator or a gated leaky neuron: it has a connection to itself at the next time step that has a weight of one, so it copies its own real-valued state and accumulates the external signal, but this self-connection is multiplicatively gated by another unit that learns to decide when to clear the content of the memory.
为了解决这个问题，一种思路是用显式记忆来扩充网络。这类方案中最早的是长短期记忆（LSTM）网络，它使用特殊的隐藏单元，其自然行为是长时间记住输入。一个称为记忆单元（memory cell）的特殊单元的作用类似于累加器或带门控的泄漏神经元：它在下一个时间步有一个权重为1的到自身的连接，因此它会复制自身的实值状态并累积外部信号；但这个自连接受到另一个单元的乘性门控，该单元学习决定何时清除记忆的内容。
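
The accumulator behaviour of the memory cell can be sketched in a few lines. This is a deliberate simplification: the gate values are passed in by hand, whereas in an LSTM they are computed by learned units, and the input and output gates of a full LSTM are left out.

```python
def memory_cell_step(cell, external_signal, keep_gate):
    """Self-connection of weight one, multiplicatively gated, plus the new input."""
    return keep_gate * cell + external_signal


def run_cell(signals, keep_gates):
    """Apply the cell update over a sequence and record the state after each step."""
    cell, history = 0.0, []
    for signal, gate in zip(signals, keep_gates):
        cell = memory_cell_step(cell, signal, gate)
        history.append(cell)
    return history


# Gate of 1.0 keeps the stored value; gate of 0.0 clears it before the new input is added.
history = run_cell([1.0, 2.0, 0.0, 5.0], [1.0, 1.0, 1.0, 0.0])
# history == [1.0, 3.0, 3.0, 5.0]
```

While the gate stays at 1.0 the cell copies its own state forward unchanged, so the stored value of 3.0 survives a step with no input; the final gate of 0.0 wipes it.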
LSTM networks have subsequently proved to be more effective than conventional RNNs, especially when they have several layers for each time step 87, enabling an entire speech recognition system that goes all the way from acoustics to the sequence of characters in the transcription. LSTM networks or related forms of gated units are also currently used for the encoder and decoder networks that perform so well at machine translation 17,72,76.
LSTM网络后来被证明比传统的RNN更有效，尤其是当它们在每个时间步有多个层时，从而实现了从声学信号一直到转录文本字符序列的完整语音识别系统。LSTM网络或相关形式的门控单元目前也用于在机器翻译中表现出色的编码器和解码器网络。
Over the past year, several authors have made different proposals to augment RNNs with a memory module. Proposals include the Neural Turing Machine in which the network is augmented by a ‘tape-like’ memory that the RNN can choose to read from or write to 88, and memory networks, in which a regular network is augmented by a kind of associative memory 89. Memory networks have yielded excellent performance on standard question-answering benchmarks. The memory is used to remember the story about which the network is later asked to answer questions.
在过去的一年里，几位作者提出了用记忆模块扩充RNN的不同方案。这些方案包括神经图灵机，其中网络由一个“磁带式”存储器扩充，RNN可以选择对其进行读取或写入；以及记忆网络，其中常规网络由一种联想记忆扩充。记忆网络在标准的问答基准测试中取得了优异的性能。记忆用于记住故事，之后网络需要回答关于这个故事的问题。
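
One common way to read from an associative memory is a soft, content-based lookup: compare a query with every stored key and return a weighted average of the stored values. The sketch below is a generic illustration of that idea and is not claimed to be the exact mechanism of the cited models; `KEYS`, `VALUES` and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(query, keys, values):
    """Return the attention-weighted average of `values` and the weights used."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dims = len(values[0])
    blended = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dims)]
    return blended, weights


# Hypothetical memory with three slots.
KEYS = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
VALUES = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

blended, weights = read_memory([0.0, 4.0], KEYS, VALUES)
```

The query matches the second key most closely, so most of the weight falls on the second slot and `blended` is close to its stored value.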
Beyond simple memorization, neural Turing machines and memory networks are being used for tasks that would normally require reasoning and symbol manipulation. Neural Turing machines can be taught ‘algorithms’. Among other things, they can learn to output a sorted list of symbols when their input consists of an unsorted sequence in which each symbol is accompanied by a real value that indicates its priority in the list 88. Memory networks can be trained to keep track of the state of the world in a setting similar to a text adventure game and after reading a story, they can answer questions that require complex inference 90. In one test example, the network is shown a 15-sentence version of The Lord of the Rings and correctly answers questions such as “where is Frodo now?” 89.
除了简单的记忆之外，神经图灵机和记忆网络还被用于通常需要推理和符号操作的任务。神经图灵机可以被教会“算法”。例如，当输入是一个未排序的序列、且其中每个符号都附带一个表示其在列表中优先级的实数值时，它们可以学会输出排好序的符号列表。记忆网络可以被训练成在类似文字冒险游戏的环境中跟踪世界的状态，并且在读完一个故事之后，能够回答需要复杂推理的问题。在一个测试例子中，向网络展示了一个15句话版本的《指环王》，它正确回答了诸如“佛罗多现在在哪里？”之类的问题。

The future of deep learning

Unsupervised learning 91–98 had a catalytic effect in reviving interest in deep learning, but has since been overshadowed by the successes of purely supervised learning. Although we have not focused on it in this Review, we expect unsupervised learning to become far more important in the longer term. Human and animal learning is largely unsupervised: we discover the structure of the world by observing it, not by being told the name of every object.
无监督学习在恢复对深度学习的兴趣方面起到了催化作用，但后来被纯粹监督学习的成功所掩盖。虽然我们在这篇评论中没有关注它，但我们期望无监督学习在长期内会变得更加重要。人类和动物的学习在很大程度上是无监督的：我们通过观察发现世界的结构，而不是被告知每一个物体的名称。
Human vision is an active process that sequentially samples the optic array in an intelligent, task-specific way using a small, high-resolution fovea with a large, low-resolution surround. We expect much of the future progress in vision to come from systems that are trained end-to-end and combine ConvNets with RNNs that use reinforcement learning to decide where to look. Systems combining deep learning and reinforcement learning are in their infancy, but they already outperform passive vision systems 99 at classification tasks and produce impressive results in learning to play many different video games 100.
人类视觉是一个主动的过程，它使用一个小的高分辨率中央凹和大的低分辨率周边区域，以智能的、针对具体任务的方式对光学阵列进行顺序采样。我们预计，视觉领域未来的大部分进展将来自端到端训练的系统，这些系统将ConvNet与RNN相结合，并使用强化学习来决定看向哪里。将深度学习和强化学习相结合的系统还处于起步阶段，但它们在分类任务上的表现已经超过了被动视觉系统，并在学习玩多种不同的视频游戏方面取得了令人印象深刻的成果。

Natural language understanding is another area in which deep learning is poised to make a large impact over the next few years. We expect systems that use RNNs to understand sentences or whole documents will become much better when they learn strategies for selectively attending to one part at a time 76,86.
自然语言理解是深度学习有望在未来几年产生重大影响的另一个领域。我们预计，使用RNN来理解句子或整个文档的系统，在学会每次有选择地关注其中一部分的策略之后，会变得好得多。
Ultimately, major progress in artificial intelligence will come about through systems that combine representation learning with complex reasoning. Although deep learning and simple reasoning have been used for speech and handwriting recognition for a long time, new paradigms are needed to replace rule-based manipulation of symbolic expressions by operations on large vectors 101.
最终，人工智能的重大进展将通过把表示学习与复杂推理相结合的系统来实现。尽管深度学习和简单推理用于语音和手写识别已有很长时间，但仍需要新的范式，用对大型向量的运算来取代基于规则的符号表达式操作。

参考文献

References
