Mask R-CNN Paper Reading Notes

Abstract
We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.
The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition.
Introduction
The vision community has rapidly improved object detection and semantic segmentation results over a short period of time. In large part, these advances have been driven by powerful baseline systems, such as the Fast/Faster R-CNN [1], [2] and Fully Convolutional Network (FCN) [3] frameworks for object detection and semantic segmentation, respectively. These methods are conceptually intuitive and offer flexibility and robustness, together with fast training and inference time. Our goal in this work is to develop a comparably enabling framework for instance segmentation.

Instance segmentation is challenging because it requires the correct detection of all objects in an image while also precisely segmenting each instance. It therefore combines elements from the classical computer vision tasks of object detection, where the goal is to classify individual objects and localize each using a bounding box, and semantic segmentation, where the goal is to classify each pixel into a fixed set of categories without differentiating object instances. Given this, one might expect a complex method is required to achieve good results. However, we show that a surprisingly simple, flexible, and fast system can surpass prior state-of-the-art instance segmentation results.
Our method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting segmentation masks on each Region of Interest (RoI), in parallel with the existing branch for classification and bounding box regression (Figure 1). The mask branch is a small FCN applied to each RoI, predicting a segmentation mask in a pixel-to-pixel manner. Mask R-CNN is simple to implement and train given the Faster R-CNN framework, which facilitates a wide range of flexible architecture designs. Additionally, the mask branch only adds a small computational overhead, enabling a fast system and rapid experimentation.
In principle Mask R-CNN is an intuitive extension of Faster R-CNN, yet constructing the mask branch properly is critical for good results. Most importantly, Faster R-CNN was not designed for pixel-to-pixel alignment between network inputs and outputs. This is most evident in how RoIPool [5], [1], the de facto core operation for attending to instances, performs coarse spatial quantization for feature extraction. To fix the misalignment, we propose a simple, quantization-free layer, called RoIAlign, that faithfully preserves exact spatial locations. Despite being a seemingly minor change, RoIAlign has a large impact: it improves mask accuracy by relative 10% to 50%, showing bigger gains under stricter localization metrics. Second, we found it essential to decouple mask and class prediction: we predict a binary mask for each class independently, without competition among classes, and rely on the network's RoI classification branch to predict the category. In contrast, FCNs usually perform per-pixel multi-class categorization, which couples segmentation and classification, and based on our experiments works poorly for instance segmentation.
Without bells and whistles, Mask R-CNN surpasses all previous state-of-the-art single-model results on the COCO instance segmentation task [6], including the heavily-engineered entries from the 2016 competition winner. As a by-product, our method also excels on the COCO object detection task. In ablation experiments, we evaluate multiple basic instantiations, which allows us to demonstrate its robustness and analyze the effects of core factors.

Our models can run at about 200ms per frame on a GPU, and training on COCO takes one to two days on a single 8-GPU machine. We believe the fast train and test speeds, together with the framework's flexibility and accuracy, will benefit and ease future research on instance segmentation.
Finally, we showcase the generality of our framework via the task of human pose estimation on the COCO keypoint dataset [6]. By viewing each keypoint as a one-hot binary mask, with minimal modification Mask R-CNN can be applied to detect instance-specific poses. Mask R-CNN surpasses the winner of the 2016 COCO keypoint competition, and at the same time runs at 5 fps. Mask R-CNN, therefore, can be seen more broadly as a flexible framework for instance-level recognition and can be readily extended to more complex tasks.
A preliminary version of this manuscript was published previously [7]. As a generic framework, Mask R-CNN is compatible with complementary techniques developed for detection/segmentation, as have been widely witnessed in Fast/Faster R-CNN and FCN in the past years. This manuscript also describes some techniques that improve over our original results published in [7]. Thanks to its generality and flexibility, Mask R-CNN was used as the framework by the three winning teams in the COCO 2017 instance segmentation competition, which all significantly outperformed the previous state of the art. We have released code to facilitate future research.
Related Work
R-CNN: The Region-based CNN (R-CNN) approach [8] to bounding-box object detection is to attend to a manageable number of candidate object regions [9], [10] and evaluate convolutional networks [11], [12] independently on each RoI. R-CNN was extended [5], [1] to allow attending to RoIs on feature maps using RoIPool, leading to fast speed and better accuracy. Faster R-CNN [2] advanced this stream by learning the attention mechanism with a Region Proposal Network (RPN). Faster R-CNN is flexible and robust to many follow-up improvements (e.g., [13], [14], [15]), and is the current leading framework in several benchmarks.

Instance Segmentation: Driven by the effectiveness of R-CNN, many approaches to instance segmentation are based on segment proposals. Earlier methods [8], [16], [17], [18] resorted to bottom-up segments [9], [19]. DeepMask [20] and following works [21], [22] learn to propose segment candidates, which are then classified by Fast R-CNN. In these methods, segmentation precedes recognition, which is slow and less accurate. Likewise, Dai et al. [23] proposed a complex multiple-stage cascade that predicts segment proposals from bounding-box proposals, followed by classification. Instead, our method is based on parallel prediction of masks and class labels, which is simpler and more flexible.
Most recently, Li et al. [24] combined the segment proposal system in [22] and object detection system in [25] for "fully convolutional instance segmentation" (FCIS). The common idea in [22], [25], [24] is to predict a set of position-sensitive output channels fully convolutionally. These channels simultaneously address object classes, boxes, and masks, making the system fast. But FCIS exhibits systematic errors on overlapping instances and creates spurious edges (Figure 6), showing that it is challenged by the fundamental difficulties of segmenting instances.

Another family of solutions [26], [27], [28], [29] to instance segmentation is driven by the success of semantic segmentation. Starting from per-pixel classification results (e.g., FCN outputs), these methods attempt to cut the pixels of the same category into different instances. In contrast to the segmentation-first strategy of these methods, Mask R-CNN is based on an instance-first strategy. We expect a deeper incorporation of both strategies will be studied in the future.
3. Mask R-CNN
Mask R-CNN is conceptually simple: Faster R-CNN has two outputs for each candidate object, a class label and a bounding-box offset; to this we add a third branch that outputs the object mask. Mask R-CNN is thus a natural and intuitive idea. But the additional mask output is distinct from the class and box outputs, requiring extraction of much finer spatial layout of an object. Next, we introduce the key elements of Mask R-CNN, including pixel-to-pixel alignment, which is the main missing piece of Fast/Faster R-CNN.
Faster R-CNN: We begin by briefly reviewing the Faster R-CNN detector [2]. Faster R-CNN consists of two stages. The first stage, called a Region Proposal Network (RPN), proposes candidate object bounding boxes. The second stage, which is in essence Fast R-CNN [1], extracts features using RoIPool from each candidate box and performs classification and bounding-box regression. The features used by both stages can be shared for faster inference. We refer readers to [15] for latest, comprehensive comparisons between Faster R-CNN and other frameworks.
Mask R-CNN: Mask R-CNN adopts the same two-stage procedure, with an identical first stage (which is RPN). In the second stage, in parallel to predicting the class and box offset, Mask R-CNN also outputs a binary mask for each RoI. This is in contrast to most recent systems, where classification depends on mask predictions (e.g. [20], [23], [24]). Our approach follows the spirit of Fast R-CNN [1] that applies bounding-box classification and regression in parallel (which turned out to largely simplify the multi-stage pipeline of original R-CNN [8]).
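To make the two-stage procedure concrete, here is a minimal PyTorch-style sketch of the overall flow with the mask branch run in parallel to the box branch. All module names and the constructor interface are hypothetical placeholders for illustration, not the paper's actual code or a real library API:

```python
import torch.nn as nn

class TwoStageDetector(nn.Module):
    """Stage 1: RPN proposes boxes. Stage 2: per-RoI heads predict class, box, mask."""
    def __init__(self, backbone, rpn, roi_extract, box_head, mask_head):
        super().__init__()
        self.backbone, self.rpn = backbone, rpn
        self.roi_extract = roi_extract          # RoIPool in Faster R-CNN, RoIAlign here
        self.box_head, self.mask_head = box_head, mask_head

    def forward(self, images):
        features = self.backbone(images)        # features can be shared by both stages
        proposals = self.rpn(features)          # stage 1: candidate object boxes
        rois = self.roi_extract(features, proposals)
        cls_scores, box_deltas = self.box_head(rois)   # existing Faster R-CNN branch
        masks = self.mask_head(rois)                   # new branch, run in parallel
        return cls_scores, box_deltas, masks
```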
Formally, during training, we define a multi-task loss on each sampled RoI as $L = L_{cls} + L_{box} + L_{mask}$. The classification loss $L_{cls}$ and bounding-box loss $L_{box}$ are identical to those defined in [1]. The mask branch has a $Km^2$-dimensional output for each RoI, which encodes $K$ binary masks of resolution $m \times m$, one for each of the $K$ classes. To this we apply a per-pixel sigmoid, and define $L_{mask}$ as the average binary cross-entropy loss. For an RoI associated with ground-truth class $k$, $L_{mask}$ is only defined on the $k$-th mask (other mask outputs do not contribute to the loss).
Our definition of $L_{mask}$ allows the network to generate masks for every class without competition among classes; we rely on the dedicated classification branch to predict the class label used to select the output mask. This decouples mask and class prediction. This is different from common practice when applying FCNs [3] to semantic segmentation, which typically uses a per-pixel softmax and a multinomial cross-entropy loss. In that case, masks across classes compete; in our case, with a per-pixel sigmoid and a binary loss, they do not. We show by experiments that this formulation is key for good instance segmentation results.
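As a sanity check on the loss definition, here is a short PyTorch-style sketch of $L_{mask}$: a per-pixel sigmoid with binary cross-entropy, evaluated only on the mask channel of the ground-truth class. The tensor shapes and names are assumptions made for illustration:

```python
import torch
import torch.nn.functional as F

def mask_loss(mask_logits, gt_masks, gt_classes):
    """L_mask for a batch of positive RoIs.

    mask_logits: (N, K, m, m) raw outputs, one m x m mask per class.
    gt_masks:    (N, m, m) binary targets, already cropped/resized to each RoI.
    gt_classes:  (N,) ground-truth class index k for each RoI.
    """
    n = mask_logits.shape[0]
    # Select only the k-th mask per RoI; the other K-1 masks get no gradient.
    per_class = mask_logits[torch.arange(n), gt_classes]        # (N, m, m)
    # Per-pixel sigmoid + average binary cross-entropy.
    return F.binary_cross_entropy_with_logits(per_class, gt_masks.float())
```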
Mask Representation: A mask encodes an input object's spatial layout. Thus, unlike class labels or box offsets that are inevitably collapsed into short output vectors by fully-connected (fc) layers, extracting the spatial structure of masks can be addressed naturally by the pixel-to-pixel correspondence provided by convolutions.

Specifically, we predict an m×m mask from each RoI using an FCN [3]. This allows each layer in the mask branch to maintain the explicit m×m object spatial layout without collapsing it into a vector representation that lacks spatial dimensions. Unlike previous methods that resort to fc layers for mask prediction [20], [21], [23], our fully convolutional representation requires fewer parameters, and is more accurate as demonstrated by experiments.

This pixel-to-pixel behavior requires our RoI features, which themselves are small feature maps, to be well aligned to faithfully preserve the explicit per-pixel spatial correspondence. This motivated us to develop the following RoIAlign layer that plays a key role in mask prediction.
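A sketch of what "fully convolutional" means for the mask branch: every layer keeps the m×m spatial grid, in contrast to an fc head that would flatten it. The layer count and channel widths below are illustrative assumptions, not the paper's exact head:

```python
import torch
import torch.nn as nn

# Fully convolutional mask head: input (N, C, m, m) -> output (N, K, m, m).
# No layer flattens the feature map, so the pixel-to-pixel layout survives.
def make_mask_head(in_channels=256, num_classes=80, width=256):
    return nn.Sequential(
        nn.Conv2d(in_channels, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, num_classes, 1),   # K masks, one per class
    )

# An fc head with the same output size would need width*m*m -> K*m*m weights
# (millions of parameters for m = 28) and would discard the spatial structure.
roi_features = torch.randn(4, 256, 14, 14)
masks = make_mask_head()(roi_features)      # shape (4, 80, 14, 14)
```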
RoIAlign: RoIPool [1] is a standard operation for extracting a small feature map (e.g., 7×7) from each RoI. RoIPool first quantizes a floating-number RoI to the discrete granularity of the feature map; this quantized RoI is then subdivided into spatial bins which are themselves quantized, and finally feature values covered by each bin are aggregated (usually by max pooling). Quantization is performed, e.g., on a continuous coordinate x by computing [x/16], where 16 is a feature map stride and [·] is rounding; likewise, quantization is performed when dividing into bins (e.g., 7×7). These quantizations introduce misalignments between the RoI and the extracted features. While this may not impact classification, which is robust to small translations, it has a large negative effect on predicting pixel-accurate masks.
To address this, we propose an RoIAlign layer that removes the harsh quantization of RoIPool, properly aligning the extracted features with the input. Our proposed change is simple: we avoid any quantization of the RoI boundaries or bins (i.e., we use x/16 instead of [x/16]). We use bilinear interpolation [31] to compute the exact values of the input features at four regularly sampled locations in each RoI bin, and aggregate the result (using max or average). See Figure 3 for our implementation details. We note that the results are not sensitive to where the four sampling points are located in the bin, or how many points are sampled, as long as no quantization is performed on any coordinates involved.

RoIAlign leads to large improvements as we show in §4.2. We also compare to the RoIWarp operation proposed in [23]. Unlike RoIAlign, RoIWarp overlooked the alignment issue and was implemented in [23] as quantizing RoI just like RoIPool. So even though RoIWarp also adopts bilinear resampling motivated by [31], it performs on par with RoIPool as shown by experiments (more details in Table 2c), demonstrating the crucial role of alignment.

Figure 3: RoIAlign implementation. The dashed grid is the feature map on which RoIAlign is performed, the solid lines an RoI (with 2×2 bins in this example), and the dots the 4 sampling points in each bin. The value at each sampling point is computed by bilinear interpolation from the nearby grid points on the feature map. No quantization is performed on any coordinates involved in the RoI, its bins, or the sampling points.

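Reflecting the figure above, here is a simplified sketch of RoIAlign for a single RoI: continuous coordinates are kept (x/16, never rounded), each bin is sampled at four regular points by bilinear interpolation, and the samples are averaged. It is written with plain loops for clarity; real implementations vectorize this, and the 0.25/0.75 sample placement is one reasonable choice among several:

```python
import torch

def bilinear(feature, x, y):
    """Bilinearly interpolate feature (C, H, W) at a continuous point (x, y)."""
    H, W = feature.shape[1:]
    x0, y0 = min(int(x), W - 1), min(int(y), H - 1)
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * feature[:, y0, x0] + wx * (1 - wy) * feature[:, y0, x1]
            + (1 - wx) * wy * feature[:, y1, x0] + wx * wy * feature[:, y1, x1])

def roi_align_single(feature, roi, out_size=7, stride=16):
    """feature: (C, H, W); roi: (x1, y1, x2, y2) in image coordinates (floats)."""
    x1, y1, x2, y2 = (v / stride for v in roi)        # continuous x/16, never [x/16]
    bin_w, bin_h = (x2 - x1) / out_size, (y2 - y1) / out_size
    out = torch.zeros(feature.shape[0], out_size, out_size)
    for i in range(out_size):
        for j in range(out_size):
            # 4 regularly spaced sample points inside bin (i, j), then average
            samples = [bilinear(feature, x1 + (j + dx) * bin_w, y1 + (i + dy) * bin_h)
                       for dy in (0.25, 0.75) for dx in (0.25, 0.75)]
            out[:, i, j] = torch.stack(samples).mean(dim=0)
    return out
```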
Network Architecture: To demonstrate the generality of our approach, we instantiate Mask R-CNN with multiple architectures. For clarity, we differentiate between: (i) the convolutional backbone architecture used for feature extraction over an entire image, and (ii) the network head for bounding-box recognition (classification and regression) and mask prediction that is applied separately to each RoI. We denote the backbone architecture using the nomenclature network-depth-features. We evaluate ResNet [4] and ResNeXt [32] networks of depth 50 or 101 layers. The original implementation of Faster R-CNN with ResNets [4] extracted features from the final convolutional layer of the 4-th stage, which we call C4. This backbone with ResNet-50, for example, is denoted by ResNet-50-C4. This is a common choice used in [4], [23], [15], [33].
We also explore another more effective backbone recently proposed by Lin et al. [14], called a Feature Pyramid Network (FPN). FPN uses a top-down architecture with lateral connections to build an in-network feature pyramid from a single-scale input. Faster R-CNN with an FPN backbone extracts RoI features from different levels of the feature pyramid according to their scale, but otherwise the rest of the approach is similar to vanilla ResNet. Using a ResNet-FPN backbone for feature extraction with Mask R-CNN gives excellent gains in both accuracy and speed. For further details on FPN, we refer readers to [14].

For the network head we closely follow architectures presented in previous works, to which we add a fully convolutional mask prediction branch. Specifically, we extend the Faster R-CNN box heads from the ResNet [4] and FPN [14] papers. Details are shown in Figure 4. The head on the ResNet-C4 backbone includes the 5-th stage of ResNet (namely, the 9-layer 'res5' [4]), which is compute-intensive. For FPN, the backbone already includes res5 and hence allows for a more efficient head that uses fewer filters. We note that our mask branches have a straightforward structure; more complex designs have the potential to improve performance but are not the focus of this work.
3.1 Implementation Details
We set hyper-parameters following existing Fast/Faster R-CNN work [1], [2], [14]. Although these decisions were made for object detection in original papers [1], [2], [14], we found our instance segmentation system is robust to them.
Training: As in Fast R-CNN, an RoI is considered positive if it has IoU with a ground-truth box of at least 0.5 and negative otherwise. The mask loss $L_{mask}$ is defined only on positive RoIs. The mask target is the intersection between an RoI and its associated ground-truth mask.
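A sketch of how a mask target might be built under that definition: the ground-truth mask is restricted to the RoI (their intersection) and resampled to the m×m prediction resolution. The function name, the use of bilinear `interpolate`, and the 0.5 re-binarization are assumptions of this note, not the paper's code:

```python
import torch
import torch.nn.functional as F

def mask_target(gt_mask, roi, m=28):
    """gt_mask: (H, W) binary tensor; roi: (x1, y1, x2, y2) integer box.
    Returns the m x m training target for this RoI."""
    x1, y1, x2, y2 = roi
    crop = gt_mask[y1:y2, x1:x2].float()     # intersection of RoI and gt mask
    crop = crop[None, None]                  # (1, 1, h, w) for interpolate
    target = F.interpolate(crop, size=(m, m), mode="bilinear", align_corners=False)
    return (target[0, 0] > 0.5).float()      # back to a binary m x m mask
```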
We adopt image-centric training [1]. Images are resized such that their scale (shorter edge) is 800 pixels [14]. Each mini-batch has 2 images per GPU and each image has N sampled RoIs, with a ratio of 1:3 of positive to negatives [1]. N is 64 for the C4 backbone (as in [1], [2]) and 512 for FPN (as in [14]). We train on 8 GPUs (so the effective mini-batch size is 16) for 160k iterations, with a learning rate of 0.02 which is decreased by 10 at the 120k iteration. We use a weight decay of 0.0001 and a momentum of 0.9. When using ResNeXt [32], we use a mini-batch size of 1 image per GPU with the same number of iterations, and a starting learning rate of 0.01.

The RPN anchors span 5 scales and 3 aspect ratios, following [14]. For convenient ablation, RPN is trained separately and does not share features with Mask R-CNN, unless specified. For every entry in this paper, RPN and Mask R-CNN have the same backbones and so they are shareable.
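For reference, the hyper-parameters above gathered into one place as a hypothetical config dict; the values are taken directly from the text, but the key names are ours:

```python
# Training schedule from the text, as an illustrative config (key names are ours).
TRAIN_CFG = {
    "image_shorter_side": 800,        # resize so the shorter edge is 800 px
    "images_per_gpu": 2,              # 1 for ResNeXt backbones
    "rois_per_image": {"C4": 64, "FPN": 512},
    "positive_fraction": 0.25,        # 1:3 positive-to-negative RoI sampling
    "fg_iou_threshold": 0.5,          # RoI is positive if IoU >= 0.5 with a gt box
    "num_gpus": 8,                    # effective mini-batch of 16 images
    "iterations": 160_000,
    "base_lr": 0.02,                  # 0.01 for ResNeXt; divided by 10 at 120k iters
    "lr_decay_at": 120_000,
    "weight_decay": 1e-4,
    "momentum": 0.9,
    "rpn_anchor_scales": 5,
    "rpn_anchor_aspect_ratios": 3,
}
```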
Inference: At test time, the proposal number is 300 for the C4 backbone (as in [2]) and 1000 for FPN (as in [14]). We run the box prediction branch on these proposals, followed by non-maximum suppression [34]. The mask branch is then applied to the highest scoring 100 detection boxes. Although this differs from the parallel computation used in training, it speeds up inference and improves accuracy (due to the use of fewer, more accurate RoIs). The mask branch can predict K masks per RoI, but we only use the k-th mask, where k is the predicted class by the classification branch. The m×m floating-number mask output is then resized to the RoI size, and binarized at a threshold of 0.5. Note that since we only compute masks on the top 100 detection boxes, Mask R-CNN adds a small overhead to its Faster R-CNN counterpart (e.g., ∼20% on typical models).
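Putting the inference steps together as a sketch: box branch first, NMS, keep the top 100, run the mask branch only on those, select the k-th mask, resize to the box, and binarize at 0.5. The helper functions (`box_head`, `mask_head`, `roi_align`, `nms`) are placeholders for whatever detector implementation is in use, and the assumed shapes are noted in comments:

```python
import torch
import torch.nn.functional as F

def mask_rcnn_inference(features, proposals, box_head, mask_head, roi_align, nms):
    """Hypothetical glue code for the test-time procedure described above."""
    # 1. Box branch on all proposals (300 for C4, 1000 for FPN), then NMS.
    scores, boxes = box_head(roi_align(features, proposals))  # scores: (N, K)
    keep = nms(boxes, scores)                  # indices sorted by score
    # 2. Mask branch only on the 100 highest-scoring detections.
    top = keep[:100]
    mask_logits = mask_head(roi_align(features, boxes[top]))  # (n, K, m, m)
    k = scores[top].argmax(dim=1)                             # predicted class per box
    masks = mask_logits[torch.arange(len(top)), k].sigmoid()  # use only the k-th mask
    # 3. Resize each m x m mask to its box size and binarize at 0.5.
    results = []
    for mask, box in zip(masks, boxes[top]):
        x1, y1, x2, y2 = box.round().int().tolist()
        h, w = max(y2 - y1, 1), max(x2 - x1, 1)
        full = F.interpolate(mask[None, None], size=(h, w), mode="bilinear",
                             align_corners=False)[0, 0]
        results.append((box, full > 0.5))
    return results
```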
