Big Data Machine Learning Lab

[Paper translation] A Global Geometric Framework for Nonlinear Dimensionality Reduction

Paper title: A Global Geometric Framework for Nonlinear Dimensionality Reduction
Source: Science 290, 2319 (2000)

A Global Geometric Framework for Nonlinear Dimensionality Reduction

Joshua B. Tenenbaum,¹* Vin de Silva,² John C. Langford³

Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs (30,000 auditory nerve fibers or $10^6$ optic nerve fibers) a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.

A canonical problem in dimensionality reduction from the domain of visual perception is illustrated in Fig. 1A. The input consists of many images of a person’s face observed under different pose and lighting conditions, in no particular order. These images can be thought of as points in a high-dimensional vector space, with each input dimension corresponding to the brightness of one pixel in the image or the firing rate of one retinal ganglion cell. Although the input dimensionality may be quite high (e.g., 4096 for these 64 pixel by 64 pixel images), the perceptually meaningful structure of these images has many fewer independent degrees of freedom. Within the 4096-dimensional input space, all of the images lie on an intrinsically three-dimensional manifold, or constraint surface, that can be parameterized by two pose variables plus an azimuthal lighting angle. Our goal is to discover, given only the unordered high-dimensional inputs, low-dimensional representations such as Fig. 1A with coordinates that capture the intrinsic degrees of freedom of a data set. This problem is of central importance not only in studies of vision (1–5), but also in speech (6, 7), motor control (8, 9), and a range of other physical and biological sciences (10–12).

======================================================================================================

Fig. 1. (A) A canonical dimensionality reduction problem from visual perception. The input consists of a sequence of 4096-dimensional vectors, representing the brightness values of 64 pixel by 64 pixel images of a face rendered with different poses and lighting directions. Applied to N = 698 raw images, Isomap (K = 6) learns a three-dimensional embedding of the data’s intrinsic geometric structure. A two-dimensional projection is shown, with a sample of the original input images (red circles) superimposed on all the data points (blue) and horizontal sliders (under the images) representing the third dimension. Each coordinate axis of the embedding correlates highly with one degree of freedom underlying the original data: left-right pose (x axis, R = 0.99), up-down pose (y axis, R = 0.90), and lighting direction (slider position, R = 0.92). The input-space distances d_X(i,j) given to Isomap were Euclidean distances between the 4096-dimensional image vectors. (B) Isomap applied to N = 1000 handwritten “2”s from the MNIST database (40). The two most significant dimensions in the Isomap embedding, shown here, articulate the major features of the “2”: bottom loop (x axis) and top arch (y axis). Input-space distances d_X(i,j) were measured by tangent distance, a metric designed to capture the invariances relevant in handwriting recognition (41). Here we used ε-Isomap (with ε = 4.2) because we did not expect a constant dimensionality to hold over the whole data set; consistent with this, Isomap finds several tendrils projecting from the higher dimensional mass of data and representing successive exaggerations of an extra stroke or ornament in the digit.

=======================================================================================================
The classical techniques for dimensionality reduction, PCA and MDS, are simple to implement, efficiently computable, and guaranteed to discover the true structure of data lying on or near a linear subspace of the high-dimensional input space (13). PCA finds a low-dimensional embedding of the data points that best preserves their variance as measured in the high-dimensional input space. Classical MDS finds an embedding that preserves the interpoint distances, equivalent to PCA when those distances are Euclidean. However, many data sets contain essential nonlinear structures that are invisible to PCA and MDS (4, 5, 11, 14). For example, both methods fail to detect the true degrees of freedom of the face data set (Fig. 1A), or even its intrinsic three-dimensionality (Fig. 2A).

Here we describe an approach that combines the major algorithmic features of PCA and MDS (computational efficiency, global optimality, and asymptotic convergence guarantees) with the flexibility to learn a broad class of nonlinear manifolds. Figure 3A illustrates the challenge of nonlinearity with data lying on a two-dimensional “Swiss roll”: points far apart on the underlying manifold, as measured by their geodesic, or shortest path, distances, may appear deceptively close in the high-dimensional input space, as measured by their straight-line Euclidean distance. Only the geodesic distances reflect the true low-dimensional geometry of the manifold, but PCA and MDS effectively see just the Euclidean structure; thus, they fail to detect the intrinsic two-dimensionality (Fig. 2B).

Our approach builds on classical MDS but seeks to preserve the intrinsic geometry of the data, as captured in the geodesic manifold distances between all pairs of data points. The crux is estimating the geodesic distance between faraway points, given only input-space distances. For neighboring points, input-space distance provides a good approximation to geodesic distance. For faraway points, geodesic distance can be approximated by adding up a sequence of “short hops” between neighboring points. These approximations are computed efficiently by finding shortest paths in a graph with edges connecting neighboring data points.

The complete isometric feature mapping, or Isomap, algorithm has three steps, which are detailed in Table 1. The first step determines which points are neighbors on the manifold M, based on the distances d_X(i,j) between pairs of points i, j in the input space X. Two simple methods are to connect each point to all points within some fixed radius ε, or to all of its K nearest neighbors (15). These neighborhood relations are represented as a weighted graph G over the data points, with edges of weight d_X(i,j) between neighboring points (Fig. 3B).
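
As a rough illustration of this first step, the sketch below builds the weighted neighborhood graph with NumPy. The function name `neighborhood_graph`, its `k` and `eps` arguments, and the use of `np.inf` to mark a missing edge are choices made here for illustration and do not come from the paper. The K-nearest-neighbor relation is symmetrized, so an edge exists when either point counts the other among its K nearest neighbors.

```python
import numpy as np

def neighborhood_graph(X, k=None, eps=None):
    """Return an N x N matrix of edge weights d_X(i, j); np.inf means 'no edge'.

    Hypothetical helper: pass `eps` for a fixed-radius rule or `k` for a
    K-nearest-neighbor rule (exactly one of the two).
    """
    if (k is None) == (eps is None):
        raise ValueError("pass exactly one of k or eps")
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean input-space distances d_X(i, j).
    diff = X[:, None, :] - X[None, :, :]
    d_x = np.sqrt((diff ** 2).sum(axis=-1))
    if eps is not None:
        connected = d_x < eps
    else:
        connected = np.zeros((n, n), dtype=bool)
        # Column 0 of the argsort is the point itself (distance 0), so skip it.
        nearest = np.argsort(d_x, axis=1)[:, 1:k + 1]
        rows = np.repeat(np.arange(n), nearest.shape[1])
        connected[rows, nearest.ravel()] = True
        connected |= connected.T  # keep the graph undirected
    graph = np.where(connected, d_x, np.inf)
    np.fill_diagonal(graph, 0.0)
    return graph
```

For four points on a line at 0, 1, 2.5, and 10, `neighborhood_graph(X, k=1)` links only consecutive points, while `eps=2.0` leaves the point at 10 with no edges at all.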

In its second step, Isomap estimates the geodesic distances d_M(i,j) between all pairs of points on the manifold M by computing their shortest path distances d_G(i,j) in the graph G. One simple algorithm (16) for finding shortest paths is given in Table 1.
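
The shortest-path update of Table 1 (Floyd's algorithm, note 16) can be sketched in a few lines of NumPy. The function name `floyd_shortest_paths` and the convention that `np.inf` marks a missing edge are assumptions of this sketch. Each pass over `k` applies the replacement d_G(i,j) ← min{d_G(i,j), d_G(i,k) + d_G(k,j)} to all pairs at once.

```python
import numpy as np

def floyd_shortest_paths(graph):
    """All-pairs shortest path distances d_G(i, j) for a weighted graph.

    `graph` is an N x N matrix of edge lengths with np.inf where two points
    are not connected and 0 on the diagonal. The input is left unmodified.
    """
    d_g = np.asarray(graph, dtype=float).copy()
    n = d_g.shape[0]
    for k in range(n):
        # Route every pair (i, j) through point k if that is shorter.
        via_k = d_g[:, k, None] + d_g[None, k, :]
        d_g = np.minimum(d_g, via_k)
    return d_g
```

Pairs that lie in different connected components keep an infinite distance, which is how the disconnected outliers mentioned in note 19 can be spotted.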

The final step applies classical MDS to the matrix of graph distances $D_G = \{d_G(i,j)\}$, constructing an embedding of the data in a d-dimensional Euclidean space Y that best preserves the manifold’s estimated intrinsic geometry (Fig. 3C). The coordinate vectors $y_i$ for points in Y are chosen to minimize the cost function $E = \|\tau(D_G) - \tau(D_Y)\|_{L^2}$ (1) where $D_Y$ denotes the matrix of Euclidean distances $\{d_Y(i,j) = \|y_i - y_j\|\}$ and $\|A\|_{L^2}$ the L² matrix norm $\sqrt{\sum_{i,j} A_{ij}^2}$. The τ operator converts distances to inner products (17), which uniquely characterize the geometry of the data in a form that supports efficient optimization. The global minimum of Eq. 1 is achieved by setting the coordinates $y_i$ to the top d eigenvectors of the matrix $\tau(D_G)$ (13).
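
A minimal sketch of this embedding step follows, using the definition of the τ operator from note 17 and the eigenvector scaling given in Table 1 (component p of y_i is the square root of the p-th eigenvalue times the i-th component of the p-th eigenvector). The function names `tau` and `classical_mds` are hypothetical, and clipping small negative eigenvalues to zero is a numerical safeguard added here, not something the paper specifies.

```python
import numpy as np

def tau(D):
    """Convert a distance matrix to inner products: tau(D) = -H S H / 2."""
    n = D.shape[0]
    S = D ** 2                              # squared distances S_ij = D_ij^2
    H = np.eye(n) - np.ones((n, n)) / n     # centering matrix H_ij = delta_ij - 1/N
    return -H @ S @ H / 2

def classical_mds(D, d):
    """Return an N x d array whose rows are the coordinate vectors y_i."""
    eigvals, eigvecs = np.linalg.eigh(tau(D))      # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:d]            # indices of the d largest
    lam = np.clip(eigvals[top], 0.0, None)         # guard against tiny negatives
    return eigvecs[:, top] * np.sqrt(lam)
```

When `D` holds exact Euclidean distances between points in d dimensions, the embedding reproduces those distances, with the points recovered up to a rotation, reflection, and translation.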

As with PCA or MDS, the true dimensionality of the data can be estimated from the decrease in error as the dimensionality of Y is increased. For the Swiss roll, where classical methods fail, the residual variance of Isomap correctly bottoms out at d =2 (Fig. 2B).
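
The error measure behind this kind of curve is the residual variance defined in note 42: one minus the squared linear correlation between the estimated manifold distances and the distances in the embedding. The sketch below computes it on a toy data set. It is not Isomap itself: the data are assumed to lie on a flat two-dimensional plane inside a three-dimensional space, so Euclidean distances stand in for manifold distances, and the "embedding" is simply the first d coordinates. All names are hypothetical.

```python
import numpy as np

def pairwise_distances(Y):
    """Euclidean distance matrix between the rows of Y."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def residual_variance(D_M, D_Y):
    """1 - R^2, with R the linear correlation over all matrix entries."""
    r = np.corrcoef(D_M.ravel(), D_Y.ravel())[0, 1]
    return 1.0 - r ** 2

# Toy data: 50 points whose third coordinate is always zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) * np.array([3.0, 1.0, 0.0])
D_M = pairwise_distances(X)

# Residual variance when keeping the first d = 1, 2, 3 coordinates.
residuals = [residual_variance(D_M, pairwise_distances(X[:, :d])) for d in (1, 2, 3)]
```

Here `residuals` falls to essentially zero at d = 2 and does not improve at d = 3, which is the bottoming-out behavior used to read off the dimensionality.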

=======================================================================================================
Fig. 2. The residual variance of PCA (open triangles), MDS [open triangles in (A) through (C); open circles in (D)], and Isomap (filled circles) on four data sets (42). (A) Face images varying in pose and illumination (Fig. 1A). (B) Swiss roll data (Fig. 3). (C) Hand images varying in finger extension and wrist rotation (20). (D) Handwritten “2”s (Fig. 1B). In all cases, residual variance decreases as the dimensionality d is increased. The intrinsic dimensionality of the data can be estimated by looking for the “elbow” at which this curve ceases to decrease significantly with added dimensions. Arrows mark the true or approximate dimensionality, when known. Note the tendency of PCA and MDS to overestimate the dimensionality, in contrast to Isomap.

=======================================================================================================

Just as PCA and MDS are guaranteed, given sufficient data, to recover the true structure of linear manifolds, Isomap is guaranteed asymptotically to recover the true dimensionality and geometric structure of a strictly larger class of nonlinear manifolds. Like the Swiss roll, these are manifolds whose intrinsic geometry is that of a convex region of Euclidean space, but whose ambient geometry in the high-dimensional input space may be highly folded, twisted, or curved. For non-Euclidean manifolds, such as a hemisphere or the surface of a doughnut, Isomap still produces a globally optimal low-dimensional Euclidean representation, as measured by Eq. 1.

These guarantees of asymptotic convergence rest on a proof that as the number of data points increases, the graph distances d_G(i,j) provide increasingly better approximations to the intrinsic geodesic distances d_M(i,j), becoming arbitrarily accurate in the limit of infinite data (18, 19). How quickly d_G(i,j) converges to d_M(i,j) depends on certain parameters of the manifold as it lies within the high-dimensional space (radius of curvature and branch separation) and on the density of points. To the extent that a data set presents extreme values of these parameters or deviates from a uniform density, asymptotic convergence still holds in general, but the sample size required to estimate geodesic distance accurately may be impractically large.
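
The convergence of graph distance to geodesic distance can be seen in a deliberately simple case, sketched below under strong assumptions that the paper does not make: the manifold is a unit half-circle, the points are evenly spaced rather than randomly sampled, and each point is linked only to its immediate neighbors along the arc. The neighborhood graph is then a chain, so the graph distance between the two endpoints is the sum of the short hops, and the true geodesic distance is the arc length π. The function name `chain_graph_distance` is hypothetical.

```python
import numpy as np

def chain_graph_distance(n):
    """Sum of hops between n evenly spaced points on a unit half-circle."""
    theta = np.linspace(0.0, np.pi, n)
    points = np.column_stack([np.cos(theta), np.sin(theta)])
    hops = np.linalg.norm(np.diff(points, axis=0), axis=1)  # consecutive chord lengths
    return hops.sum()

geodesic = np.pi  # arc length between the endpoints (1, 0) and (-1, 0)
# Gap between the geodesic and graph distances for increasing sample sizes.
errors = [geodesic - chain_graph_distance(n) for n in (5, 50, 500)]
```

Each chord is slightly shorter than the arc it spans, so the graph distance underestimates π, and the gap in `errors` shrinks as the number of points grows. The straight-line Euclidean distance between the endpoints stays at 2 regardless of sample size.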

=======================================================================================================

Fig. 3. The “Swiss roll” data set, illustrating how Isomap exploits geodesic paths for nonlinear dimensionality reduction. (A) For two arbitrary points (circled) on a nonlinear manifold, their Euclidean distance in the high-dimensional input space (length of dashed line) may not accurately reflect their intrinsic similarity, as measured by geodesic distance along the low-dimensional manifold (length of solid curve). (B) The neighborhood graph G constructed in step one of Isomap (with K = 7 and N = 1000 data points) allows an approximation (red segments) to the true geodesic path to be computed efficiently in step two, as the shortest path in G. (C) The two-dimensional embedding recovered by Isomap in step three, which best preserves the shortest path distances in the neighborhood graph (overlaid). Straight lines in the embedding (blue) now represent simpler and cleaner approximations to the true geodesic paths than do the corresponding graph paths (red).

=======================================================================================================

Isomap’s global coordinates provide a simple way to analyze and manipulate high-dimensional observations in terms of their intrinsic nonlinear degrees of freedom. For a set of synthetic face images, known to have three degrees of freedom, Isomap correctly detects the dimensionality (Fig. 2A) and separates out the true underlying factors (Fig. 1A). The algorithm also recovers the known low-dimensional structure of a set of noisy real images, generated by a human hand varying in finger extension and wrist rotation (Fig. 2C) (20). Given a more complex data set of handwritten digits, which does not have a clear manifold geometry, Isomap still finds globally meaningful coordinates (Fig. 1B) and nonlinear structure that PCA or MDS do not detect (Fig. 2D). For all three data sets, the natural appearance of linear interpolations between distant points in the low-dimensional coordinate space confirms that Isomap has captured the data’s perceptually relevant structure (Fig. 4).

=======================================================================================================
Table 1. The Isomap algorithm takes as input the distances d_X(i,j) between all pairs i, j from N data points in the high-dimensional input space X, measured either in the standard Euclidean metric (as in Fig. 1A) or in some domain-specific metric (as in Fig. 1B). The algorithm outputs coordinate vectors y_i in a d-dimensional Euclidean space Y that (according to Eq. 1) best represent the intrinsic geometry of the data. The only free parameter (ε or K) appears in Step 1.

=======================================================================================================

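Steps:
1. Construct neighborhood graph. Define the graph G over all data points by connecting points i and j if [as measured by d_X(i,j)] they are closer than ε (ε-Isomap), or if i is one of the K nearest neighbors of j (K-Isomap). Set edge lengths equal to d_X(i,j).
2. Compute shortest paths. Initialize d_G(i,j) = d_X(i,j) if i, j are linked by an edge; otherwise initialize d_G(i,j) = ∞. Then, for each value of k = 1, 2, ..., N in turn, replace all entries d_G(i,j) by min{d_G(i,j), d_G(i,k) + d_G(k,j)}. The matrix of final values D_G = {d_G(i,j)} will contain the shortest path distances between all pairs of points in G (16, 19).
3. Construct d-dimensional embedding. Let λ_p be the p-th eigenvalue (in decreasing order) of the matrix τ(D_G) (17), and $v^i_p$ be the i-th component of the p-th eigenvector. Then set the p-th component of the d-dimensional coordinate vector y_i equal to $\sqrt{λ_p}v^i_p$.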

Fig. 4. Interpolations along straight lines in the Isomap coordinate space (analogous to the blue line in Fig. 3C) implement perceptually natural but highly nonlinear “morphs” of the corresponding high-dimensional observations (43) by transforming them approximately along geodesic paths (analogous to the solid curve in Fig. 3A). (A) Interpolations in a three-dimensional embedding of face images (Fig. 1A). (B) Interpolations in a four-dimensional embedding of hand images (20) appear as natural hand movements when viewed in quick succession, even though no such motions occurred in the observed data. (C) Interpolations in a six-dimensional embedding of handwritten “2”s (Fig. 1B) preserve continuity not only in the visual features of loop and arch articulation, but also in the implied pen trajectories, which are the true degrees of freedom underlying those appearances.

=======================================================================================================

Previous attempts to extend PCA and MDS to nonlinear data sets fall into two broad classes, each of which suffers from limitations overcome by our approach. Local linear techniques (21–23) are not designed to represent the global structure of a data set within a single coordinate system, as we do in Fig. 1. Nonlinear techniques based on greedy optimization procedures (24–30) attempt to discover global structure, but lack the crucial algorithmic features that Isomap inherits from PCA and MDS: a noniterative, polynomial time procedure with a guarantee of global optimality; for intrinsically Euclidean manifolds, a guarantee of asymptotic convergence to the true structure; and the ability to discover manifolds of arbitrary dimensionality, rather than requiring a fixed d initialized from the beginning or computational resources that increase exponentially in d.

Here we have demonstrated Isomap’s performance on data sets chosen for their visually compelling structures, but the technique may be applied wherever nonlinear geometry complicates the use of PCA or MDS. Isomap complements, and may be combined with, linear extensions of PCA based on higher order statistics, such as independent component analysis (31, 32). It may also lead to a better understanding of how the brain comes to represent the dynamic appearance of objects, where psychophysical studies of apparent motion (33, 34) suggest a central role for geodesic transformations on nonlinear manifolds (35) much like those studied here.

=======================================================================================================
References and Notes

1. M. P. Young, S. Yamane, Science 256, 1327 (1992).
2. R. N. Shepard, Science 210, 390 (1980).
3. M. Turk, A. Pentland, J. Cogn. Neurosci. 3, 71 (1991).
4. H. Murase, S. K. Nayar, Int. J. Comp. Vision 14, 5 (1995).
5. J. W. McClurkin, L. M. Optican, B. J. Richmond, T. J. Gawne, Science 253, 675 (1991).
6. J. L. Elman, D. Zipser, J. Acoust. Soc. Am. 83, 1615 (1988).
7. W. Klein, R. Plomp, L. C. W. Pols, J. Acoust. Soc. Am. 48, 999 (1970).
8. E. Bizzi, F. A. Mussa-Ivaldi, S. Giszter, Science 253, 287 (1991).
9. T. D. Sanger, Adv. Neural Info. Proc. Syst. 7, 1023 (1995).
10. J. W. Hurrell, Science 269, 676 (1995).
11. C. A. L. Bailer-Jones, M. Irwin, T. von Hippel, Mon. Not. R. Astron. Soc. 298, 361 (1997).
12. P. Menozzi, A. Piazza, L. Cavalli-Sforza, Science 201, 786 (1978).
13. K. V. Mardia, J. T. Kent, J. M. Bibby, Multivariate Analysis (Academic Press, London, 1979).
14. A. H. Monahan, J. Clim., in press.
15. The scale-invariant K parameter is typically easier to set than ε, but may yield misleading results when the local dimensionality varies across the data set. When available, additional constraints such as the temporal ordering of observations may also help to determine neighbors. In earlier work (36) we explored a more complex method (37), which required an order of magnitude more data and did not support the theoretical performance guarantees we provide here for ε- and K-Isomap.
16. This procedure, known as Floyd’s algorithm, requires O(N³) operations. More efficient algorithms exploiting the sparse structure of the neighborhood graph can be found in (38).
17. The operator τ is defined by $\tau(D) = -HSH/2$, where S is the matrix of squared distances $\{S_{ij} = D_{ij}^2\}$, and H is the “centering matrix” $\{H_{ij} = \delta_{ij} - 1/N\}$ (13).
18. Our proof works by showing that for a sufficiently high density (α) of data points, we can always choose a neighborhood size (ε or K) large enough that the graph will (with high probability) have a path not much longer than the true geodesic, but small enough to prevent edges that “short circuit” the true geometry of the manifold. More precisely, given arbitrarily small values of λ₁, λ₂, and μ, we can guarantee that with probability at least 1 − μ, estimates of the form
$(1-\lambda_1)\,d_M(i,j) \leqslant d_G(i,j) \leqslant (1+\lambda_2)\,d_M(i,j)$
will hold uniformly over all pairs of data points i, j. For ε-Isomap, we require
$\varepsilon \leqslant (2/\pi)\,r_0\sqrt{24\lambda_1}$, $\varepsilon < s_0$,
$\alpha > [\log(V/\mu\eta_d(\lambda_2\varepsilon/16)^d)]/\eta_d(\lambda_2\varepsilon/8)^d$
where r₀ is the minimal radius of curvature of the manifold M as embedded in the input space X, s₀ is the minimal branch separation of M in X, V is the (d-dimensional) volume of M, and (ignoring boundary effects) η_d is the volume of the unit ball in Euclidean d-space. For K-Isomap, we let ε be as above and fix the ratio $(K+1)/\alpha = \eta_d(\varepsilon/2)^d/2$. We then require
$e^{-(K+1)/4} \leqslant \mu\eta_d(\varepsilon/4)^d/4V$
$(e/4)^{(K+1)/2} \leqslant \mu\eta_d(\varepsilon/8)^d/16V$
$\alpha > [4\log(8V/\mu\eta_d(\lambda_2\varepsilon/32\pi)^d)]/\eta_d(\lambda_2\varepsilon/16\pi)^d$
The exact content of these conditions, but not their general form, depends on the particular technical assumptions we adopt. For details and extensions to nonuniform densities, intrinsic curvature, and boundary effects, see http://isomap.stanford.edu.
19. In practice, for finite data sets, d_G(i,j) may fail to approximate d_M(i,j) for a small fraction of points that are disconnected from the giant component of the neighborhood graph G. These outliers are easily detected as having infinite graph distances from the majority of other points and can be deleted from further analysis.
20. The Isomap embedding of the hand images is available at Science Online at www.sciencemag.org/cgi/content/full/290/5500/2319/DC1. For additional material and computer code, see http://isomap.stanford.edu.
21. R. Basri, D. Roth, D. Jacobs, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (1998), pp. 414–420.
22. C. Bregler, S. M. Omohundro, Adv. Neural Info. Proc. Syst. 7, 973 (1995).
23. G. E. Hinton, M. Revow, P. Dayan, Adv. Neural Info. Proc. Syst. 7, 1015 (1995).
24. R. Durbin, D. Willshaw, Nature 326, 689 (1987).
25. T. Kohonen, Self-Organisation and Associative Memory (Springer-Verlag, Berlin, ed. 2, 1988), pp. 119–157.
26. T. Hastie, W. Stuetzle, J. Am. Stat. Assoc. 84, 502 (1989).
27. M. A. Kramer, AIChE J. 37, 233 (1991).
28. D. DeMers, G. Cottrell, Adv. Neural Info. Proc. Syst. 5, 580 (1993).
29. R. Hecht-Nielsen, Science 269, 1860 (1995).
30. C. M. Bishop, M. Svensén, C. K. I. Williams, Neural Comp. 10, 215 (1998).
31. P. Comon, Signal Proc. 36, 287 (1994).
32. A. J. Bell, T. J. Sejnowski, Neural Comp. 7, 1129 (1995).
33. R. N. Shepard, S. A. Judd, Science 191, 952 (1976).
34. M. Shiffrar, J. J. Freyd, Psychol. Science 1, 257 (1990).
35. R. N. Shepard, Psychon. Bull. Rev. 1, 2 (1994).
36. J. B. Tenenbaum, Adv. Neural Info. Proc. Syst. 10, 682 (1998).
37. T. Martinetz, K. Schulten, Neural Netw. 7, 507 (1994).
38. V. Kumar, A. Grama, A. Gupta, G. Karypis, Introduction to Parallel Computing: Design and Analysis of Algorithms (Benjamin/Cummings, Redwood City, CA, 1994), pp. 257–297.
39. D. Beymer, T. Poggio, Science 272, 1905 (1996).
40. Available at www.research.att.com/~yann/ocr/mnist.
41. P. Y. Simard, Y. LeCun, J. Denker, Adv. Neural Info. Proc. Syst. 5, 50 (1993).
42. In order to evaluate the fits of PCA, MDS, and Isomap on comparable grounds, we use the residual variance $1 - R^2(\hat{D}_M, D_Y)$. $D_Y$ is the matrix of Euclidean distances in the low-dimensional embedding recovered by each algorithm. $\hat{D}_M$ is each algorithm’s best estimate of the intrinsic manifold distances: for Isomap, this is the graph distance matrix $D_G$; for PCA and MDS, it is the Euclidean input-space distance matrix $D_X$ (except with the handwritten “2”s, where MDS uses the tangent distance). R is the standard linear correlation coefficient, taken over all entries of $\hat{D}_M$ and $D_Y$.
43. In each sequence shown, the three intermediate images are those closest to the points 1/4, 1/2, and 3/4 of the way between the given endpoints. We can also synthesize an explicit mapping from input space X to the low-dimensional embedding Y, or vice versa, using the coordinates of corresponding points {x_i, y_i} in both spaces provided by Isomap together with standard supervised learning techniques (39).
44. Supported by the Mitsubishi Electric Research Laboratories, the Schlumberger Foundation, the NSF (DBS-9021648), and the DARPA Human ID program. We thank Y. LeCun for making available the MNIST database and S. Roweis and L. Saul for sharing related unpublished work. For many helpful discussions, we thank G. Carlsson, H. Farid, W. Freeman, T. Griffiths, R. Lehrer, S. Mahajan, D. Reich, W. Richards, J. M. Tenenbaum, Y. Weiss, and especially M. Bernstein.

10 August 2000; accepted 21 November 2000

点云语义分割：PointNet++在S3DIS数据集上的训练完美代码 3d neo4j 点云
点云语义分割：PointNet++在S3DIS数据集上的训练点云语义分割是计算机视觉领域的一个重要任务，旨在将点云数据中的每个点分配给其对应的语义类别。PointNet++是一种流行的深度学习方法，可用于处理点云数据，并在各种任务中取得了良好的性能。在本文中，我们将探讨如何使用PointNet++模型在S3DIS数据集上进行训练，并提供相应的源代码。数据集介绍S3DIS数据集是一个常用的用于室内场
机器学习数学基础：29.t检验 @心都机器学习人工智能
一、t检验的定义与核心思想（一）定义t检验（Student’st-test）是一种在统计学领域中广泛应用的基于t分布的统计推断方法。其主要用途在于判断样本均值与总体均值之间，或者两个独立样本的均值之间、配对样本的均值之间是否存在显著差异。例如，在教育研究中，可以通过t检验判断某个班级学生的平均成绩与全校学生的平均成绩是否有显著差异；在医学实验里，可用于比较实验组和对照组的患者某项生理指标的均值是否
基于YOLOv5的烟雾检测系统：从数据集准备到UI界面实现深度学习&目标检测实战项目 YOLO ui 分类数据挖掘目标跟踪
1.引言烟雾是火灾发生的一个重要早期信号。烟雾检测能够在火灾初期及时识别并报警，为火灾的扑灭争取宝贵的时间。因此，烟雾检测的研究一直是计算机视觉领域中的一个热点问题。近年来，随着深度学习技术的发展，目标检测算法被广泛应用于烟雾检测，尤其是基于YOLOv5的目标检测模型，由于其较高的精度和较低的计算开销，已经成为许多实时检测系统的首选模型。在这篇博客中，我们将介绍如何使用YOLOv5模型进行烟雾检测
计算机视觉｜3D 点云处理黑科技：PointNet++ 原理剖析与实战指南紫雾凌寒 AI 炼金厂 #深度学习 #计算机视觉深度学习计算机视觉 3d cnn PointNet++3d云 3d云数据
一、引言在当今数字化与智能化快速发展的时代，3D点云处理技术在多个前沿领域中发挥着重要作用。特别是在自动驾驶和机器人视觉等领域，这项技术已成为实现智能化的关键支撑。以自动驾驶为例，车辆需要实时感知周围复杂的环境信息，包括行人、车辆、交通标志和路况等。3D点云数据能够提供高精度的三维空间信息，使自动驾驶车辆更准确地识别和定位周围物体，从而做出安全、合理的行驶决策。在城市街道上，自动驾驶车辆通过3D点
机器学习算法（2）—— 线性回归算法疯狂的石头。算法机器学习线性回归
‘’‘构造数据集’‘’x=[[80,86],[82,80],[85,78],[90,90],[86,82],[82,90],[78,80],[92,94]]y=[84.2,80.6,80.1,90,83.2,87.6,79.4,93.4]‘’‘模型训练’‘’实例化一个估计器estimator=LinearRegression()使用fit方法进行训练estimator.fit(x,y)查看回归系数
【基于手势识别的音量控制系统】合肥玉安人工智能工作室 Python OpenCV python mediapipe 手势手势控制音量
基于手势识别的音量控制系统github项目效果这是一个结合了计算机视觉和系统控制的实用项目，通过识别手势来实现音量的无接触控制，同时考虑到了用户隐私，加入了实时人脸遮罩功能。核心功能实现1.手势识别与音量映射系统使用MediaPipe框架进行手部关键点检测，通过计算大拇指和食指之间的距离来控制音量：def_process_landmarks(self,hand_landmarks):#获取手指关键
putty运行python代码_当我关闭putty时如何保持python脚本运行 weixin_39943000 putty运行python代码
我准备在VPS上运行Ubuntu上的python脚本.这是机器学习培训过程,因此需要花费大量时间进行培训.如何在不停止该过程的情况下关闭腻子.解决方法:您有两个主要选择：>使用nohup运行命令.这会将它与您的会话取消关联,并在断开连接后让它继续运行：nohuppythonScript.py请注意,该命令的stdout将附加到名为nohup.out的文件中,除非您重定向它(nohuppythonS
同一个问题看看Grok3怎么回答-什么是智能体？释迦呼呼 AI一千问架构深度学习人工智能机器学习自然语言处理
关键要点研究表明，智能体（可能是“智能代理”的意思）在人工智能中是一个能够感知环境、自主行动以实现目标的系统。证据倾向于认为，智能体可以是简单的（如恒温器），也可以是复杂的（如自动驾驶汽车），并可能通过机器学习改进性能。关于“智能体”这一术语，存在争议，可能指的是人工智能中的智能代理，或在某些上下文中指具有物理身体的AI系统（如机器人）。什么是智能体？定义智能体在人工智能中似乎是一个能够感知其环境