Paper: "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields" (Translation and Commentary)
Contents
Abstract
1. Introduction
2. Method
2.1. Simultaneous Detection and Association
2.2. Confidence Maps for Part Detection
2.3. Part Affinity Fields for Part Association
2.4. Multi-Person Parsing using PAFs
3. Results
3.1. Results on the MPII Multi-Person Dataset
3.2. Results on the COCO Keypoints Challenge
3.3. Runtime Analysis
4. Discussion
Acknowledgements
References
Paper: "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields"
Abstract

We present an approach to efficiently detect the 2D pose of multiple people in an image. The approach uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image. The architecture encodes global context, allowing a greedy bottom-up parsing step that maintains high accuracy while achieving realtime performance, irrespective of the number of people in the image. The architecture is designed to jointly learn part locations and their association via two branches of the same sequential prediction process. Our method placed first in the inaugural COCO 2016 keypoints challenge, and significantly exceeds the previous state-of-the-art result on the MPII Multi-Person benchmark, both in performance and efficiency.
2. Method

Fig. 2 illustrates the overall pipeline of our method. The system takes, as input, a color image of size $w \times h$ (Fig. 2a) and produces, as output, the 2D locations of anatomical keypoints for each person in the image (Fig. 2e). First, a feedforward network simultaneously predicts a set of 2D confidence maps $\mathbf{S}$ of body part locations (Fig. 2b) and a set of 2D vector fields $\mathbf{L}$ of part affinities, which encode the degree of association between parts (Fig. 2c). The set $\mathbf{S} = (\mathbf{S}_1, \mathbf{S}_2, \ldots, \mathbf{S}_J)$ has $J$ confidence maps, one per part, where $\mathbf{S}_j \in \mathbb{R}^{w \times h}$, $j \in \{1 \ldots J\}$. The set $\mathbf{L} = (\mathbf{L}_1, \mathbf{L}_2, \ldots, \mathbf{L}_C)$ has $C$ vector fields, one per limb, where $\mathbf{L}_c \in \mathbb{R}^{w \times h \times 2}$, $c \in \{1 \ldots C\}$; each image location in $\mathbf{L}_c$ encodes a 2D vector (as shown in Fig. 1). Finally, the confidence maps and the affinity fields are parsed by greedy inference (Fig. 2d) to output the 2D keypoints for all people in the image.
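To make these shapes concrete, here is a minimal NumPy sketch (the part and limb counts are illustrative choices of ours, not fixed by the method):

```python
import numpy as np

# Illustrative sizes; the network runs fully convolutionally on any w x h input.
w, h = 368, 368
J, C = 18, 19  # e.g., a COCO-style set of body parts and limbs

# S: one confidence map per body part, each S_j in R^{w x h}.
S = np.zeros((J, h, w), dtype=np.float32)

# L: one 2D vector field per limb, each L_c in R^{w x h x 2};
# L[c, y, x] holds a unit 2D vector along limb c wherever the limb is present.
L = np.zeros((C, h, w, 2), dtype=np.float32)
```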
2.1. Simultaneous Detection and Association

Our architecture, shown in Fig. 3, simultaneously predicts detection confidence maps and affinity fields that encode part-to-part association. The network is split into two branches: the top branch, shown in beige, predicts the confidence maps, and the bottom branch, shown in blue, predicts the affinity fields. Each branch is an iterative prediction architecture, following Wei et al. [31], which refines the predictions over successive stages, $t \in \{1, \ldots, T\}$, with intermediate supervision at each stage.
The image is first analyzed by a convolutional network (initialized by the first 10 layers of VGG-19 [26] and finetuned), generating a set of feature maps $\mathbf{F}$ that is input to the first stage of each branch. At the first stage, the network produces a set of detection confidence maps $\mathbf{S}^1 = \rho^1(\mathbf{F})$ and a set of part affinity fields $\mathbf{L}^1 = \phi^1(\mathbf{F})$, where $\rho^1$ and $\phi^1$ are the CNNs for inference at Stage 1. In each subsequent stage, the predictions from both branches in the previous stage, along with the original image features $\mathbf{F}$, are concatenated and used to produce refined predictions,

$$\mathbf{S}^t = \rho^t(\mathbf{F}, \mathbf{S}^{t-1}, \mathbf{L}^{t-1}), \quad \forall t \geq 2, \quad (1)$$

$$\mathbf{L}^t = \phi^t(\mathbf{F}, \mathbf{S}^{t-1}, \mathbf{L}^{t-1}), \quad \forall t \geq 2, \quad (2)$$

where $\rho^t$ and $\phi^t$ are the CNNs for inference at Stage $t$.
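In code, this staged two-branch computation looks roughly as follows (a simplified PyTorch sketch under our own module names; the real per-stage branches in the paper are deeper stacks of 7×7 and 1×1 convolutions, and the channel counts here are illustrative):

```python
import torch
import torch.nn as nn

def branch(in_ch, out_ch):
    # Stand-in for a per-stage CNN (rho or phi); the real branches are deeper.
    return nn.Sequential(nn.Conv2d(in_ch, 128, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(128, out_ch, 1))

class TwoBranchStages(nn.Module):
    def __init__(self, feat_ch=128, J=18, C=19, T=6):
        super().__init__()
        self.rho = nn.ModuleList()  # confidence-map branches, one per stage
        self.phi = nn.ModuleList()  # PAF branches, one per stage
        for t in range(T):
            # Stage 1 sees only F; later stages see F concatenated with S, L.
            in_ch = feat_ch if t == 0 else feat_ch + J + 2 * C
            self.rho.append(branch(in_ch, J))
            self.phi.append(branch(in_ch, 2 * C))

    def forward(self, F):
        # F: image features from the finetuned VGG-19 front end.
        outputs = []
        x = F
        for rho_t, phi_t in zip(self.rho, self.phi):
            S_t, L_t = rho_t(x), phi_t(x)
            outputs.append((S_t, L_t))           # supervised at every stage
            x = torch.cat([F, S_t, L_t], dim=1)  # input to the next stage
        return outputs
```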
Figure 4. Confidence maps of the right wrist (first row) and PAFs (second row) of right forearm across stages. Although there is confusion between left and right body parts and limbs in early stages, the estimates are increasingly refined through global inference in later stages, as shown in the highlighted areas. |
Fig. 4 shows the refinement of the confidence maps and affinity fields across stages. To guide the network to iteratively predict confidence maps of body parts in the first branch and PAFs in the second branch, we apply two loss functions at the end of each stage, one at each branch respectively. We use an L2 loss between the estimated predictions and the groundtruth maps and fields. Here, we weight the loss functions spatially to address a practical issue that some datasets do not completely label all people. Specifically, the loss functions at both branches at stage $t$ are

$$f_{\mathbf{S}}^t = \sum_{j=1}^{J} \sum_{\mathbf{p}} \mathbf{W}(\mathbf{p}) \cdot \|\mathbf{S}_j^t(\mathbf{p}) - \mathbf{S}_j^*(\mathbf{p})\|_2^2, \quad (3)$$

$$f_{\mathbf{L}}^t = \sum_{c=1}^{C} \sum_{\mathbf{p}} \mathbf{W}(\mathbf{p}) \cdot \|\mathbf{L}_c^t(\mathbf{p}) - \mathbf{L}_c^*(\mathbf{p})\|_2^2, \quad (4)$$

where $\mathbf{S}_j^*$ is the groundtruth part confidence map, $\mathbf{L}_c^*$ is the groundtruth part affinity vector field, and $\mathbf{W}$ is a binary mask with $\mathbf{W}(\mathbf{p}) = 0$ when the annotation is missing at an image location $\mathbf{p}$. The mask is used to avoid penalizing the true positive predictions during training. The intermediate supervision at each stage addresses the vanishing gradient problem by replenishing the gradient periodically [31]. The overall objective is

$$f = \sum_{t=1}^{T} \left( f_{\mathbf{S}}^t + f_{\mathbf{L}}^t \right). \quad (5)$$
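In a framework like PyTorch, the masked per-stage loss of Eqs. 3 and 4 can be sketched as follows (the tensor layouts are our assumption; the binary mask $\mathbf{W}$ broadcasts over the channel dimension):

```python
import torch

def stage_loss(S_t, S_star, L_t, L_star, W):
    """Spatially weighted L2 loss at one stage t (Eqs. 3 and 4).

    S_t, S_star: (B, J, h, w) predicted / groundtruth confidence maps.
    L_t, L_star: (B, 2*C, h, w) predicted / groundtruth PAFs (x and y channels).
    W: (B, 1, h, w) binary mask, 0 wherever annotations are missing.
    """
    f_S = (W * (S_t - S_star) ** 2).sum()  # Eq. 3
    f_L = (W * (L_t - L_star) ** 2).sum()  # Eq. 4
    # Overall objective (Eq. 5): sum these stage losses over t = 1..T,
    # which provides intermediate supervision at every stage.
    return f_S + f_L
```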
2.4. Multi-Person Parsing using PAFs

We perform non-maximum suppression on the detection confidence maps to obtain a discrete set of part candidate locations. For each part, we may have several candidates, due to multiple people in the image or false positives (shown in Fig. 6b). These part candidates define a large set of possible limbs. We score each candidate limb using the line integral computation on the PAF, defined in Eq. 10. The problem of finding the optimal parse corresponds to a K-dimensional matching problem that is known to be NP-Hard [32] (shown in Fig. 6c). In this paper, we present a greedy relaxation that consistently produces high-quality matches. We speculate the reason is that the pair-wise association scores implicitly encode global context, due to the large receptive field of the PAF network.
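The paper approximates this line integral (Eq. 10) by sampling and summing uniformly spaced values of $u$ along the candidate segment; a minimal NumPy sketch of that scoring step (the function name and sample count are our own choices):

```python
import numpy as np

def limb_score(paf_c, d1, d2, n_samples=10):
    """Approximate the PAF line integral of Eq. 10 by uniform sampling.

    paf_c: (h, w, 2) part affinity field for one limb type.
    d1, d2: (x, y) locations of two part detection candidates
            (assumed to lie inside the image).
    """
    d1, d2 = np.asarray(d1, float), np.asarray(d2, float)
    v = d2 - d1
    norm = np.linalg.norm(v)
    if norm < 1e-8:
        return 0.0
    v /= norm  # unit vector along the candidate limb
    score = 0.0
    for u in np.linspace(0.0, 1.0, n_samples):
        x, y = (1 - u) * d1 + u * d2  # point p(u) interpolated along the segment
        score += paf_c[int(round(y)), int(round(x))].dot(v)
    return score / n_samples
```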
Formally, we first obtain a set of body part detection candidates $\mathcal{D}_J$ for multiple people, where $\mathcal{D}_J = \{d_j^m : \text{for } j \in \{1 \ldots J\}, m \in \{1 \ldots N_j\}\}$, with $N_j$ the number of candidates of part $j$, and $d_j^m \in \mathbb{R}^2$ the location of the $m$-th detection candidate of body part $j$. These part detection candidates still need to be associated with other parts from the same person; in other words, we need to find the pairs of part detections that are in fact connected limbs. We define a variable $z_{j_1 j_2}^{mn} \in \{0, 1\}$ to indicate whether two detection candidates $d_{j_1}^m$ and $d_{j_2}^n$ are connected, and the goal is to find the optimal assignment for the set of all possible connections, $\mathcal{Z} = \{z_{j_1 j_2}^{mn} : \text{for } j_1, j_2 \in \{1 \ldots J\}, m \in \{1 \ldots N_{j_1}\}, n \in \{1 \ldots N_{j_2}\}\}$.
If we consider a single pair of parts $j_1$ and $j_2$ (e.g., neck and right hip) for the $c$-th limb, finding the optimal association reduces to a maximum weight bipartite graph matching problem [32]. This case is shown in Fig. 5b. In this graph matching problem, nodes of the graph are the body part detection candidates $\mathcal{D}_{j_1}$ and $\mathcal{D}_{j_2}$, and the edges are all possible connections between pairs of detection candidates. Additionally, each edge is weighted by Eq. 10, the part affinity aggregate. A matching in a bipartite graph is a subset of the edges chosen in such a way that no two edges share a node. Our goal is to find a matching with maximum weight for the chosen edges,

$$\max_{\mathcal{Z}_c} E_c = \max_{\mathcal{Z}_c} \sum_{m \in \mathcal{D}_{j_1}} \sum_{n \in \mathcal{D}_{j_2}} E_{mn} \cdot z_{j_1 j_2}^{mn}, \quad (12)$$
subject to

$$\forall m \in \mathcal{D}_{j_1}, \quad \sum_{n \in \mathcal{D}_{j_2}} z_{j_1 j_2}^{mn} \leq 1, \quad (13)$$

$$\forall n \in \mathcal{D}_{j_2}, \quad \sum_{m \in \mathcal{D}_{j_1}} z_{j_1 j_2}^{mn} \leq 1, \quad (14)$$

where $E_c$ is the overall weight of the matching for limb type $c$, $\mathcal{Z}_c$ is the subset of $\mathcal{Z}$ for limb type $c$, and $E_{mn}$ is the part affinity between candidates $d_{j_1}^m$ and $d_{j_2}^n$ defined in Eq. 10. Eqs. 13 and 14 enforce that no two edges share a node, i.e., that no two limbs of the same type share a part. We can use the Hungarian algorithm [14] to obtain the optimal matching.
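Each such subproblem can be solved with an off-the-shelf Hungarian solver; a sketch using SciPy (the score matrix and the `min_score` threshold are hypothetical inputs, with `scores[m, n]` standing in for the Eq. 10 aggregate $E_{mn}$):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_limb(scores, min_score=0.05):
    """Maximum-weight bipartite matching for one limb type (Eqs. 12-14).

    scores: (N_j1, N_j2) matrix of PAF line-integral scores E_mn.
    Returns (m, n) index pairs; each candidate is used at most once,
    satisfying the constraints of Eqs. 13 and 14.
    """
    rows, cols = linear_sum_assignment(-scores)  # Hungarian solver minimizes cost
    return [(m, n) for m, n in zip(rows, cols) if scores[m, n] > min_score]

scores = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
print(match_limb(scores))  # [(0, 0), (1, 1)]
```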
When it comes to finding the full body pose of multiple people, determining $\mathcal{Z}$ is a K-dimensional matching problem. This problem is NP-Hard [32] and many relaxations exist. In this work, we add two relaxations to the optimization, specialized to our domain. First, we choose a minimal number of edges to obtain a spanning tree skeleton of human pose rather than using the complete graph, as shown in Fig. 6c. Second, we further decompose the matching problem into a set of bipartite matching subproblems and determine the matching in adjacent tree nodes independently, as shown in Fig. 6d. We show detailed comparison results in Section 3.1, which demonstrate that minimal greedy inference well-approximates the global solution at a fraction of the computational cost. The reason is that the relationship between adjacent tree nodes is modeled explicitly by PAFs, while the relationship between nonadjacent tree nodes is implicitly modeled by the CNN. This property emerges because the CNN is trained with a large receptive field, and PAFs from non-adjacent tree nodes also influence the predicted PAF.
With these two relaxations, the optimization is decomposed simply as

$$\max_{\mathcal{Z}} E = \sum_{c=1}^{C} \max_{\mathcal{Z}_c} E_c. \quad (15)$$

We therefore obtain the limb connection candidates for each limb type independently using Eqns. 12-14. With all limb connection candidates, we can assemble the connections that share the same part detection candidates into full-body poses of multiple people. Our optimization scheme over the tree structure is orders of magnitude faster than the optimization over the fully connected graph [22, 11].
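A simplified sketch of that assembly step (our own version, not the released implementation; it assumes limbs are visited in spanning-tree order, so the first endpoint of each limb has already been placed when present):

```python
def assemble_people(connections_by_limb, limb_pairs):
    """Greedily merge per-limb connections into person instances.

    connections_by_limb[c]: list of (m, n) matches for limb type c.
    limb_pairs[c] = (j1, j2): the two part types that limb c connects.
    Returns a list of people, each a dict mapping part type -> candidate index.
    """
    people = []
    for c, matches in enumerate(connections_by_limb):
        j1, j2 = limb_pairs[c]
        for m, n in matches:
            for person in people:
                # This connection shares candidate m of part j1 with the person.
                if person.get(j1) == m:
                    person[j2] = n
                    break
            else:
                # No existing person owns this candidate: start a new person.
                people.append({j1: m, j2: n})
    return people

# E.g., two limb types (part 0 -> part 1, part 1 -> part 2) and two people:
print(assemble_people([[(0, 1), (1, 0)], [(1, 0), (0, 1)]],
                      [(0, 1), (1, 2)]))
```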
3. Results

We evaluate our method on two benchmarks for multi-person pose estimation: (1) the MPII human multi-person dataset [2] and (2) the COCO 2016 keypoints challenge dataset [15]. These two datasets collect images in diverse scenarios that contain many real-world challenges such as crowding, scale variation, occlusion, and contact. Our approach set the state of the art on the inaugural COCO 2016 keypoints challenge [1] and significantly exceeds the previous state-of-the-art result on the MPII multi-person benchmark. We also provide a runtime analysis to quantify the efficiency of the system. Fig. 10 shows some qualitative results from our algorithm.
3.3. Runtime Analysis

To analyze the runtime performance of our method, we collect videos with a varying number of people. The original frame size is 1080×1920, which we resize to 368×654 during testing to fit in GPU memory. The runtime analysis is performed on a laptop with one NVIDIA GeForce GTX-1080 GPU. In Fig. 8d, we use person detection and single-person CPM as a top-down comparison, where the runtime is roughly proportional to the number of people in the image. In contrast, the runtime of our bottom-up approach increases relatively slowly with the number of people. The runtime consists of two major parts: (1) CNN processing time, whose runtime complexity is $O(1)$, constant with a varying number of people; (2) multi-person parsing time, whose runtime complexity is $O(n^2)$, where $n$ is the number of people. However, the parsing time does not significantly influence the overall runtime because it is two orders of magnitude smaller than the CNN processing time; e.g., for 9 people, parsing takes 0.58 ms while the CNN takes 99.6 ms. Our method achieves a speed of 8.8 fps for a video with 19 people.
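This decomposition can be captured in a back-of-envelope timing model (a sketch; the quadratic coefficient is our own fit to the single 9-person measurement above, not a constant reported by the paper):

```python
def total_runtime_ms(n_people, cnn_ms=99.6, parse_coef_ms=0.58 / 81):
    """Toy model: O(1) CNN time plus O(n^2) parsing time.

    cnn_ms and the 9-person parsing time (0.58 ms) come from the paper;
    parse_coef_ms = 0.58 / 9**2 is our assumed quadratic coefficient.
    """
    return cnn_ms + parse_coef_ms * n_people ** 2

print(total_runtime_ms(9))   # ~100.2 ms: parsing is negligible next to the CNN
print(total_runtime_ms(19))  # ~102.2 ms: still dominated by the constant CNN cost
```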
4. Discussion

Moments of social significance, more than anything else, compel people to produce photographs and videos. Our photo collections tend to capture moments of personal significance: birthdays, weddings, vacations, pilgrimages, sports events, graduations, family portraits, and so on. To enable machines to interpret the significance of such photographs, they need to have an understanding of people in images. Machines, endowed with such perception in real time, would be able to react to and even participate in the individual and social behavior of people. In this paper, we consider a critical component of such perception: realtime algorithms to detect the 2D pose of multiple people in images. First, we present an explicit nonparametric representation of the keypoint association that encodes both the position and orientation of human limbs. Second, we design an architecture for jointly learning part detection and part association. Third, we demonstrate that a greedy parsing algorithm is sufficient to produce high-quality parses of body poses, and that it maintains efficiency even as the number of people in the image increases. We show representative failure cases in Fig. 9. We have publicly released our code (including the trained models) to ensure full reproducibility and to encourage future research in the area.
Acknowledgements

We acknowledge the effort of the authors of the MPII and COCO human pose datasets. These datasets make 2D human pose estimation in the wild possible. This research was supported in part by ONR Grants N00014-15-1-2358 and N00014-14-1-0595.
References

[1] MSCOCO keypoint evaluation metric. http://mscoco.org/dataset/#keypoints-eval.
[2] M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele. 2D human pose estimation: new benchmark and state of the art analysis. In CVPR, 2014.
[3] M. Andriluka, S. Roth, and B. Schiele. Pictorial structures revisited: people detection and articulated pose estimation. In CVPR, 2009.
[4] M. Andriluka, S. Roth, and B. Schiele. Monocular 3D pose estimation and tracking by detection. In CVPR, 2010.
[5] V. Belagiannis and A. Zisserman. Recurrent human pose estimation. In 12th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2017.
[6] A. Bulat and G. Tzimiropoulos. Human pose estimation via convolutional part heatmap regression. In ECCV, 2016.
[7] X. Chen and A. Yuille. Articulated pose estimation by a graphical model with image dependent pairwise relations. In NIPS, 2014.
[8] P. F. Felzenszwalb and D. P. Huttenlocher. Pictorial structures for object recognition. In IJCV, 2005.
[9] G. Gkioxari, B. Hariharan, R. Girshick, and J. Malik. Using k-poselets for detecting people and localizing their keypoints. In CVPR, 2014.
[10] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[11] E. Insafutdinov, L. Pishchulin, B. Andres, M. Andriluka, and B. Schiele. DeeperCut: a deeper, stronger, and faster multi-person pose estimation model. In ECCV, 2016.
[12] U. Iqbal and J. Gall. Multi-person pose estimation with local joint-to-person associations. In ECCV Workshops, Crowd Understanding, 2016.
[13] S. Johnson and M. Everingham. Clustered pose and nonlinear appearance models for human pose estimation. In BMVC, 2010.
[14] H. W. Kuhn. The Hungarian method for the assignment problem. In Naval Research Logistics Quarterly. Wiley Online Library, 1955.
[15] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: common objects in context. In ECCV, 2014.
[16] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, and S. Reed. SSD: single shot multibox detector. In ECCV, 2016.
[17] A. Newell, K. Yang, and J. Deng. Stacked hourglass networks for human pose estimation. In ECCV, 2016.
[18] W. Ouyang, X. Chu, and X. Wang. Multi-source deep learning for human pose estimation. In CVPR, 2014.
[19] G. Papandreou, T. Zhu, N. Kanazawa, A. Toshev, J. Tompson, C. Bregler, and K. Murphy. Towards accurate multi-person pose estimation in the wild. arXiv preprint arXiv:1701.01779, 2017.
[20] T. Pfister, J. Charles, and A. Zisserman. Flowing convnets for human pose estimation in videos. In ICCV, 2015.
[21] L. Pishchulin, M. Andriluka, P. Gehler, and B. Schiele. Poselet conditioned pictorial structures. In CVPR, 2013.
[22] L. Pishchulin, E. Insafutdinov, S. Tang, B. Andres, M. Andriluka, P. Gehler, and B. Schiele. DeepCut: joint subset partition and labeling for multi person pose estimation. In CVPR, 2016.
[23] L. Pishchulin, A. Jain, M. Andriluka, T. Thormählen, and B. Schiele. Articulated people detection and pose estimation: reshaping the future. In CVPR, 2012.
[24] V. Ramakrishna, D. Munoz, M. Hebert, J. A. Bagnell, and Y. Sheikh. Pose machines: articulated pose estimation via inference machines. In ECCV, 2014.
[25] D. Ramanan, D. A. Forsyth, and A. Zisserman. Strike a pose: tracking people by finding stylized poses. In CVPR, 2005.
[26] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[27] M. Sun and S. Savarese. Articulated part-based model for joint object detection and pose estimation. In ICCV, 2011.
[28] J. Tompson, R. Goroshin, A. Jain, Y. LeCun, and C. Bregler. Efficient object localization using convolutional networks. In CVPR, 2015.
[29] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In NIPS, 2014.
[30] A. Toshev and C. Szegedy. DeepPose: human pose estimation via deep neural networks. In CVPR, 2014.
[31] S.-E. Wei, V. Ramakrishna, T. Kanade, and Y. Sheikh. Convolutional pose machines. In CVPR, 2016.
[32] D. B. West et al. Introduction to Graph Theory, volume 2. Prentice Hall, Upper Saddle River, 2001.
[33] Y. Yang and D. Ramanan. Articulated human detection with flexible mixtures of parts. In TPAMI, 2013.