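Recommended Reading for Understanding Faster R-CNN

First, a quick complaint: many of the AI articles currently on CSDN are sloppily thrown together just to chase traffic. They lure you into clicking and then offer nothing of value, wasting everyone's time. Complaint over.

1. Zhihu, 《一文读懂Faster RCNN》 ("Understand Faster RCNN in One Article"): this article gives a comprehensive overview of Faster RCNN. The later part, on training, is rather rough.

2. "Object Detection and Classification using R-CNNs": strongly recommended. It also works well as a reference while reading code. (Speaking of code, there are many Faster R-CNN implementations on GitHub. I recommend the facebookresearch/maskrcnn-benchmark version: it supports recent versions of PyTorch, is rich in content, and has a very clearly written code architecture. Reading it is also good for your Python programming. A model example!)

Below is a copy of Object Detection and Classification using R-CNNs. The article is very well written, so I am reposting it here to share; I will translate it when I have time (you know how that goes~~~).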

Object Detection and Classification using R-CNNs

March 11, 2018, by ankur6ue. Filed under: Computer Vision, Machine Learning, object detection

In this post, I’ll describe in detail how R-CNN (Regions with CNN features), a recently introduced deep learning based object detection and classification method, works. R-CNNs have proved highly effective in detecting and classifying objects in natural images, achieving mAP scores far higher than previous techniques. The R-CNN method is described in the following series of papers by Ross Girshick et al.

R-CNN (Girshick et al. 2013)*
Fast R-CNN (Girshick 2015)*
Faster R-CNN (Ren et al. 2015)*

This post describes the final version of the R-CNN method described in the last paper. I first considered describing the evolution of the method from its introduction to the final version, but that turned out to be a very ambitious undertaking, so I settled on describing the final version in detail.

Fortunately, there are many implementations of the R-CNN algorithm available on the web in TensorFlow, PyTorch and other machine learning libraries. I used the following implementation:

https://github.com/ruotianluo/pytorch-faster-rcnn

Much of the terminology used in this post (for example the names of different layers) follows the terminology used in the code. Understanding the information presented in this post should make it much easier to follow the PyTorch implementation and make your own modifications.

Table of Contents

Post Organization
Image Pre-Processing
Network Organization
- Network Architecture
Implementation Details: Training
- Anchor Generation Layer
- Region Proposal Layer
  - Region Proposal Network
- Proposal Layer
- Anchor Target Layer
  - Calculating RPN Loss
- Calculating Classification Layer Loss
- Proposal Target Layer
- Crop Pooling
- Classification Layer
Implementation Details: Inference
Appendix
- ResNet 50 Network Architecture
- Non-Maximum Suppression (NMS)
Bibliography

Post Organization

Section 1 – Image Pre-Processing: In this section, we’ll describe the pre-processing steps that are applied to an input image. These steps include subtracting a mean pixel value and scaling the image. The pre-processing steps must be identical between training and inference.
Section 2 – Network Organization: In this section, we’ll describe the three main components of the network – the “head” network, the region proposal network (RPN) and the classification network.
Section 3 – Implementation Details (Training): This is the longest section of the post and describes in detail the steps involved in training an R-CNN network.
Section 4 – Implementation Details (Inference): In this section, we’ll describe the steps involved during inference – i.e., using the trained R-CNN network to identify promising regions and classify the objects in those regions.
Appendix: Here we’ll cover the details of some of the algorithms used frequently during the operation of an R-CNN, such as non-maximum suppression, and the details of the ResNet 50 architecture.

Image Pre-Processing

The following pre-processing steps are applied to an image before it is sent through the network. These steps must be identical for both training and inference. The mean vector (one number corresponding to each color channel) is not the mean of the pixel values in the current image but a configuration value that is identical across all training and test images.

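The default values for the target size (applied to the shorter side of the image) and the maximum size (a cap on the longer side) parameters are 600 and 1000 respectively.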
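To make the scaling step concrete, here is a minimal sketch of how a scale factor could be chosen from those two defaults. The function name `compute_scale` and its argument names are hypothetical, and the rule itself (scale the shorter side to the target size unless that pushes the longer side past the maximum) is an assumption based on the usual Faster R-CNN convention, not a copy of the code this post follows.

```python
def compute_scale(height, width, target_size=600, max_size=1000):
    """Return the factor by which an image would be resized.

    Assumed rule: scale the shorter side to `target_size`, unless that
    makes the longer side exceed `max_size`, in which case scale the
    longer side to `max_size` instead.
    """
    short_side = min(height, width)
    long_side = max(height, width)
    scale = target_size / short_side
    if round(scale * long_side) > max_size:
        scale = max_size / long_side
    return scale


# A 500 x 375 image is scaled by 1.6 to 800 x 600.
scale = compute_scale(375, 500)
```

With these defaults a typical PASCAL VOC image ends up at roughly 800 × 600 pixels, and very elongated images are limited by the longer side instead.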

Network Organization

An R-CNN uses neural networks to solve two main problems:

Identify promising regions (Region of Interest – ROI) in an input image that are likely to contain foreground objects
Compute the object class probability distribution of each ROI – i.e., compute the probability that the ROI contains an object of a certain class. The user can then select the object class with the highest probability as the classification result.

R-CNNs consist of three main types of networks:

Head
Region Proposal Network (RPN)
Classification Network

R-CNNs use the first few layers of a pre-trained network such as ResNet 50 to identify promising features from an input image. Using a network trained on one dataset on a different problem is possible because neural networks exhibit “transfer learning” (Yosinski et al. 2014)*. The first few layers of the network learn to detect general features such as edges and color blobs that are good discriminating features across many different problems. The features learnt by the later layers are higher level, more problem specific features. These layers can either be removed or the weights for these layers can be fine-tuned during back-propagation. The first few layers that are initialized from a pre-trained network constitute the “head” network. The convolutional feature maps produced by the head network are then passed through the Region Proposal Network (RPN) which uses a series of convolutional and fully connected layers to produce promising ROIs that are likely to contain a foreground object (problem 1 mentioned above). These promising ROIs are then used to crop out corresponding regions from the feature maps produced by the head network. This is called “Crop Pooling”. The regions produced by crop pooling are then passed through a classification network which learns to classify the object contained in each ROI.

As an aside, you may notice that weights for a ResNet are initialized in a curious way:

    n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
    m.weight.data.normal_(0, math.sqrt(2. / n))

If you are interested in learning more about why this method works, read my post about initializing weights for convolutional and fully connected layers.

Network Architecture

The diagram below shows the individual components of the three network types described above. We show the dimensions of the input and output of each network layer, which helps in understanding how data is transformed by each layer of the network. Here w and h represent the width and height of the input image (after pre-processing).

Implementation Details: Training

In this section, we’ll describe in detail the steps involved in training an R-CNN. Once you understand how training works, understanding inference is a lot easier as it simply uses a subset of the steps involved in training. The goal of training is to adjust the weights in the RPN and Classification network and fine-tune the weights of the head network (these weights are initialized from a pre-trained network such as ResNet). Recall that the job of the RPN network is to produce promising ROIs and the job of the classification network is to assign object class scores to each ROI. Therefore, to train these networks, we need the corresponding ground truth, i.e., the coordinates of the bounding boxes around the objects present in an image and the class of those objects. This ground truth comes from freely available image databases that come with an annotation file for each image. This annotation file contains the coordinates of the bounding box and the object class label for each object present in the image (the object classes are from a list of pre-defined object classes). These image databases have been used to support a variety of object classification and detection challenges. Two commonly used databases are:

PASCAL VOC: The VOC 2007 database contains 9963 training/validation/test images with 24,640 annotations for 20 object classes.
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

COCO (Common Objects in Context): The COCO dataset is much larger. It contains more than 200K labelled images with 80 object categories (the category IDs run up to 90, which is why the number 90 is often quoted).

I used the smaller PASCAL VOC 2007 dataset for my training. R-CNN is able to train both the region proposal network and the classification network in the same step.

Let’s take a moment to go over the concepts of “bounding box regression coefficients” and “bounding box overlap” that are used extensively in the remainder of this post.

Bounding Box Regression Coefficients (also referred to as “regression coefficients” and “regression targets”): One of the goals of R-CNN is to produce good bounding boxes that closely fit object boundaries. R-CNN produces these bounding boxes by taking a given bounding box (defined by the coordinates of the top left corner, width and height) and tweaking its top left corner, width and height by applying a set of “regression coefficients”. These coefficients are computed as follows (Appendix C of (Anon. 2014)*). Let the x, y coordinates of the top left corner of the target and original bounding box be denoted by $T_x, T_y$ and $O_x, O_y$ respectively and the width/height of the target and original bounding box by $T_w, T_h$ and $O_w, O_h$ respectively. Then, the regression targets (coefficients of the function that transform the original bounding box to the target box) are given as:
- $t_x = \frac{T_x - O_x}{O_w}$, $t_y = \frac{T_y - O_y}{O_h}$, $t_w = \log\left(\frac{T_w}{O_w}\right)$, $t_h = \log\left(\frac{T_h}{O_h}\right)$. This function is readily invertible, i.e., given the regression coefficients and coordinates of the top left corner and the width and height of the original bounding box, the top left corner and width and height of the target box can be easily calculated. Note the regression coefficients are invariant to an affine transformation with no shear. This is an important point as while calculating the classification loss, the target regression coefficients are calculated in the original aspect ratio while the classification network output regression coefficients are calculated after the ROI pooling step on square feature maps (1:1 aspect ratio). This will become clearer when we discuss classification loss below.
Intersection over Union (IoU) Overlap: We need some measure of how close a given bounding box is to another bounding box that is independent of the units used (pixels etc.) to measure the dimensions of a bounding box. This measure should be intuitive (two coincident bounding boxes should have an overlap of 1 and two non-overlapping boxes should have an overlap of 0) and fast and easy to calculate. A commonly used overlap measure is the “Intersection over Union (IoU)” overlap: the area of the intersection of the two boxes divided by the area of their union.
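The IoU overlap is simple enough to write out directly. The sketch below assumes boxes given as (x1, y1, x2, y2) corner coordinates in continuous units; the function name `iou` is illustrative, and some implementations add 1 to widths and heights to count pixels inclusively, which this sketch does not do.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two coincident boxes give 1, two disjoint boxes give 0, and two equal-sized boxes that overlap by half give 1/3, since the union is one and a half times the area of one box.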

With these preliminaries out of the way, let’s now dive into the implementation details for training an R-CNN. In the software implementation, R-CNN execution is broken down into several layers, as shown below. A layer encapsulates a sequence of logical steps that can involve running data through one of the neural networks and other steps such as comparing overlap between bounding boxes, performing non-maximum suppression etc.

Anchor Generation Layer: This layer generates a fixed number of “anchors” (bounding boxes) by first generating 9 anchors of different scales and aspect ratios and then replicating these anchors by translating them across uniformly spaced grid points spanning the input image.
Proposal Layer: Transform the anchors according to the bounding box regression coefficients to generate transformed anchors. Then prune the number of anchors by applying non-maximum suppression (see Appendix) using the probability of an anchor being a foreground region.
Anchor Target Layer: The goal of the anchor target layer is to produce a set of “good” anchors and the corresponding foreground/background labels and target regression coefficients to train the Region Proposal Network. The output of this layer is only used to train the RPN network and is not used by the classification layer. Given a set of anchors (produced by the anchor generation layer), the anchor target layer identifies promising foreground and background anchors. Promising foreground anchors are those whose overlap with some ground truth box is higher than a threshold. Background boxes are those whose overlap with any ground truth box is lower than a threshold. The anchor target layer also outputs a set of bounding box regressors, i.e., a measure of how far each anchor target is from the closest bounding box. These regressors only make sense for the foreground boxes as there is no notion of “closest bounding box” for a background box.
RPN Loss: The RPN loss function is the metric that is minimized during optimization to train the RPN network. The loss function is a combination of:
- The proportion of bounding boxes produced by RPN that are correctly classified as foreground/background
- Some distance measure between the predicted and target regression coefficients.
Proposal Target Layer: The goal of the proposal target layer is to prune the list of anchors produced by the proposal layer and produce class specific bounding box regression targets that can be used to train the classification layer to produce good class labels and regression targets.
ROI Pooling Layer: Implements a spatial transformation network that samples the input feature map given the bounding box coordinates of the region proposals produced by the proposal target layer. These coordinates will generally not lie on integer boundaries, thus interpolation based sampling is required.
Classification Layer: The classification layer takes the output feature maps produced by the ROI Pooling Layer and passes them through a series of convolutional layers. The output is fed through two fully connected layers. The first layer produces the class probability distribution for each region proposal and the second layer produces a set of class specific bounding box regressors.
Classification Loss: Similar to RPN loss, classification loss is the metric that is minimized during optimization to train the classification network. During back propagation, the error gradients flow to the RPN network as well, so training the classification layer modifies the weights of the RPN network as well. We’ll have more to say about this point later. The classification loss is a combination of:
- The proportion of bounding boxes produced by RPN that are correctly classified (as the correct object class)
- Some distance measure between the predicted and target regression coefficients.

We’ll now go through each of these layers in detail.

Anchor Generation Layer

The anchor generation layer produces a set of bounding boxes (called “anchor boxes”) of varying sizes and aspect ratios spread all over the input image. These bounding boxes are the same for all images i.e., they are agnostic of the content of an image. Some of these bounding boxes will enclose foreground objects while most won’t. The goal of the RPN network is to learn to identify which of these boxes are good boxes – i.e., likely to contain a foreground object and to produce target regression coefficients, which when applied to an anchor box turns the anchor box into a better bounding box (fits the enclosed foreground object more closely).

The diagram below demonstrates how these anchor boxes are generated.
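As a rough illustration of the procedure, the sketch below builds 9 base anchors and shifts them over a grid. The specific values (base size 16, scales 8/16/32, aspect ratios 0.5/1/2, stride 16) are assumptions matching the commonly used Faster R-CNN defaults, the function names are hypothetical, and real implementations round coordinates slightly differently.

```python
import math


def base_anchors(base_size=16, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return 9 anchors as (x1, y1, x2, y2), centred on the origin."""
    anchors = []
    for ratio in ratios:              # ratio = height / width
        for scale in scales:
            area = (base_size * scale) ** 2
            w = math.sqrt(area / ratio)
            h = w * ratio
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors


def all_anchors(feat_h, feat_w, stride=16):
    """Replicate the base anchors at every grid point of the feature map."""
    base = base_anchors()
    shifted = []
    for gy in range(feat_h):
        for gx in range(feat_w):
            cx, cy = gx * stride, gy * stride   # grid point in image pixels
            for x1, y1, x2, y2 in base:
                shifted.append((x1 + cx, y1 + cy, x2 + cx, y2 + cy))
    return shifted


anchors = all_anchors(feat_h=2, feat_w=3)   # 2 * 3 * 9 = 54 anchors
```

Because the anchors depend only on the grid size and these fixed parameters, they can be generated once per image size and reused, which is what makes them agnostic of image content.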

Region Proposal Layer

Object detection methods need as input a “region proposal system” that produces either a sparse (for example selective search (Anon.)*) or a dense (for example features used in deformable part models (Anon.)*) set of features. The first version of the R-CNN system used the selective search method for generating region proposals. In the current version (known as “Faster R-CNN”), a “sliding window” based technique (described in the previous section) is used to generate a set of dense candidate regions and then a neural network driven region proposal network is used to rank region proposals according to the probability of a region containing a foreground object. The region proposal layer has two goals:

From a list of anchors, identify background and foreground anchors
Modify the position, width and height of the anchors by applying a set of “regression coefficients” to improve the quality of the anchors (for example, make them fit the boundaries of objects better)

The region proposal layer consists of a Region Proposal Network and three layers – Proposal Layer, Anchor Target Layer and Proposal Target Layer. These three layers are described in detail in the following sections.

Region Proposal Network

The region proposal layer runs feature maps produced by the head network through a convolutional layer (called rpn_net in code) followed by ReLU. The output of rpn_net is run through two (1,1) kernel convolutional layers to produce background/foreground class scores and probabilities and corresponding bounding box regression coefficients. The stride length of the head network matches the stride used while generating the anchors, so the anchor boxes are in 1-1 correspondence with the information produced by the region proposal network (number of anchor boxes = number of class scores = number of bounding box regression coefficients = $\frac{w}{16} \times \frac{h}{16} \times 9$, where $w$ and $h$ are the width and height of the pre-processed input image).
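The bookkeeping behind this correspondence is plain arithmetic. The sketch below assumes a stride of 16, 9 anchors per feature map location, and floor division for the feature map size (the exact rounding depends on the padding used by the head network); the function name is hypothetical. Each anchor gets one background/foreground score pair and one set of four regression coefficients.

```python
def rpn_output_sizes(img_w, img_h, stride=16, anchors_per_location=9):
    """Count the anchors and RPN outputs for a pre-processed image."""
    feat_w = img_w // stride   # assumption: floor division
    feat_h = img_h // stride
    num_anchors = feat_w * feat_h * anchors_per_location
    return {
        "feature_map": (feat_w, feat_h),
        "anchors": num_anchors,
        "class_scores": 2 * num_anchors,       # background + foreground
        "bbox_coefficients": 4 * num_anchors,  # one set of four per anchor
    }


sizes = rpn_output_sizes(800, 600)
```

For an 800 × 600 input this gives a 50 × 37 grid and 16,650 anchors under the stated assumptions.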

Proposal Layer

The proposal layer takes the anchor boxes produced by the anchor generation layer and prunes the number of boxes by applying non-maximum suppression based on the foreground scores (see appendix for details). It also generates transformed bounding boxes by applying the regression coefficients generated by the RPN to the corresponding anchor boxes.
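A minimal greedy version of the pruning step can be sketched as follows. This is a plain-Python illustration rather than the routine used by any particular implementation; the function names and the 0.7 default threshold are assumptions, and boxes are (x1, y1, x2, y2) tuples.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.7):
    """Return indices of the boxes kept, highest foreground score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_thresh=0.5)   # the second box is suppressed
```

The quadratic loop is fine for illustration; production code typically uses a vectorized or compiled version because the RPN produces thousands of candidate boxes per image.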

Anchor Target Layer

The goal of the anchor target layer is to select promising anchors that can be used to train the RPN network to:

distinguish between foreground and background regions and
generate good bounding box regression coefficients for the foreground boxes.

It is useful to first look at how the RPN Loss is calculated. This will reveal the information needed to calculate the RPN loss which makes it easy to follow the operation of the Anchor Target Layer.

Calculating RPN Loss

Remember the goal of the RPN layer is to generate good bounding boxes. To do so from a set of anchor boxes, the RPN layer must learn to classify an anchor box as background or foreground and calculate the regression coefficients to modify the position, width and height of a foreground anchor box to make it a “better” foreground box (fit a foreground object more closely). RPN Loss is formulated in such a way to encourage the network to learn this behaviour.

RPN loss is a sum of the classification loss and bounding box regression loss. The classification loss uses cross entropy loss to penalize incorrectly classified boxes and the regression loss uses a function of the distance between the true regression coefficients (calculated using the closest matching ground truth box for a foreground anchor box) and the regression coefficients predicted by the network (see rpn_bbx_pred_net in the RPN network architecture diagram).

Classification Loss:

cross_entropy(predicted_class, actual_class)

Bounding Box Regression Loss:

$$L_{loc} = \sum_{u \in \text{all foreground anchors}} l_u$$

Sum over the regression losses for all foreground anchors. Doing this for background anchors doesn’t make sense as there is no associated ground truth box for a background anchor.

$$l_u = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L1}\left(u_i(\text{predicted}) - u_i(\text{target})\right)$$

This shows how the regression loss for a given foreground anchor is calculated. We take the difference between the predicted (by the RPN) and target (calculated using the closest ground truth box to the anchor box) regression coefficients. There are four components – corresponding to the coordinates of the top left corner and the width/height of the bounding box. The smooth L1 function is defined as follows:

$$\mathrm{smooth}_{L1}(x) = \begin{cases} \dfrac{\sigma^2 x^2}{2} & \text{if } |x| < \dfrac{1}{\sigma^2} \\ |x| - \dfrac{0.5}{\sigma^2} & \text{otherwise} \end{cases}$$

Here $\sigma$ is chosen arbitrarily (set to 3 in my code). Note that in the Python implementation, a mask array for the foreground anchors (called “bbox_inside_weights”) is used to calculate the loss as a vector operation and avoid for-if loops.
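The regression part of the RPN loss can be sketched in a few lines. The sketch assumes the sigma-parameterized smooth L1 form, 0.5·σ²·x² for |x| < 1/σ² and |x| − 0.5/σ² otherwise, with σ = 3, and uses an explicit loop with a boolean foreground flag for readability where real code would use a mask array. The function names are illustrative.

```python
def smooth_l1(x, sigma=3.0):
    """Smooth L1: quadratic near zero, linear elsewhere (assumed form)."""
    sigma2 = sigma ** 2
    if abs(x) < 1.0 / sigma2:
        return 0.5 * sigma2 * x * x
    return abs(x) - 0.5 / sigma2


def rpn_bbox_loss(predicted, target, is_foreground, sigma=3.0):
    """Sum smooth L1 over the four coefficients of each foreground anchor."""
    total = 0.0
    for pred, tgt, fg in zip(predicted, target, is_foreground):
        if not fg:
            continue   # background anchors have no regression target
        total += sum(smooth_l1(p - t, sigma) for p, t in zip(pred, tgt))
    return total


loss = rpn_bbox_loss(
    predicted=[[1.0, 0.0, 0.0, 0.0], [5.0, 5.0, 5.0, 5.0]],
    target=[[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
    is_foreground=[True, False],
)
```

The two branches meet at |x| = 1/σ², where both evaluate to 0.5/σ², so the function is continuous; a larger σ narrows the quadratic region and makes the loss behave more like plain L1.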

Thus, to calculate the loss we need to calculate the following quantities:

Class labels (background or foreground) and scores for the anchor boxes
Target regression coefficients for the foreground anchor boxes

We’ll now follow the implementation of the anchor target layer to see how these quantities are calculated. We first select the anchor boxes that lie within the image extent. Then, good foreground boxes are selected by first computing the IoU (Intersection over Union) overlap of all anchor boxes (within the image) with all ground truth boxes. Using this overlap information, two types of boxes are marked as foreground:

type A: For each ground truth box, all foreground boxes that have the max IoU overlap with the ground truth box
type B: Anchor boxes whose maximum overlap with some ground truth box exceeds a threshold

these boxes are shown in the image below:

Note that only anchor boxes whose overlap with some ground truth box exceeds a threshold are selected as foreground boxes. This is done to avoid presenting the RPN with the “hopeless learning task” of learning the regression coefficients of boxes that are too far from the best match ground truth box. Similarly, boxes whose overlap is less than a negative threshold are labeled background boxes. Not all boxes that are not foreground boxes are labeled background. Boxes that are neither foreground nor background are labeled “don’t care”. These boxes are not included in the calculation of RPN loss.

There are two additional thresholds related to the total number of background and foreground boxes we want to achieve and the fraction of this number that should be foreground. If the number of foreground boxes that pass the test exceeds the threshold, we randomly mark the excess foreground boxes to “don’t care”. Similar logic is applied to the background boxes.

Next, we compute bounding box regression coefficients between the foreground boxes and the corresponding ground truth box with maximum overlap. This is easy and one just needs to follow the formula to calculate the regression coefficients.
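A small sketch of the forward and inverse transforms follows, using the top-left-corner convention of this post: the offsets are (T_x − O_x)/O_w and (T_y − O_y)/O_h, and the size terms are log(T_w/O_w) and log(T_h/O_h). The function names are hypothetical, and many implementations use box centres instead of top-left corners.

```python
import math


def bbox_transform(orig, target):
    """Regression coefficients taking `orig` to `target`.

    Boxes are (x, y, w, h) with (x, y) the top left corner.
    """
    ox, oy, ow, oh = orig
    tx, ty, tw, th = target
    return ((tx - ox) / ow, (ty - oy) / oh,
            math.log(tw / ow), math.log(th / oh))


def bbox_transform_inv(orig, coeffs):
    """Apply regression coefficients to `orig` to recover the target box."""
    ox, oy, ow, oh = orig
    dx, dy, dw, dh = coeffs
    return (ox + dx * ow, oy + dy * oh, ow * math.exp(dw), oh * math.exp(dh))


anchor = (10.0, 20.0, 100.0, 50.0)
gt_box = (20.0, 30.0, 120.0, 40.0)
coeffs = bbox_transform(anchor, gt_box)
recovered = bbox_transform_inv(anchor, coeffs)   # equals gt_box up to rounding

# Scaling x by 2 and y by 3 (an affine map with no shear) leaves coeffs unchanged.
scaled = bbox_transform((20.0, 60.0, 200.0, 150.0), (40.0, 90.0, 240.0, 120.0))
```

The last two lines illustrate the invariance property: because every coefficient is a ratio of lengths along the same axis, independent scaling of the x and y axes cancels out.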

This concludes our discussion of the anchor target layer. To recap, let’s list the parameters and input/output for this layer:

Parameters:

TRAIN.RPN_POSITIVE_OVERLAP: Threshold used to select if an anchor box is a good foreground box (Default: 0.7)
TRAIN.RPN_NEGATIVE_OVERLAP: If the max overlap of an anchor with a ground truth box is lower than this threshold, it is marked as background. Boxes whose overlap is greater than RPN_NEGATIVE_OVERLAP but less than RPN_POSITIVE_OVERLAP are marked “don’t care” (default: 0.3)
TRAIN.RPN_BATCHSIZE: Total number of background and foreground anchors (default: 256)
TRAIN.RPN_FG_FRACTION: fraction of the batch size that is foreground anchors (default: 0.5). If the number of foreground anchors found is larger than TRAIN.RPN_BATCHSIZE * TRAIN.RPN_FG_FRACTION, the excess (indices are selected randomly) is marked “don’t care”.

Input:

RPN Network Outputs (predicted foreground/background class labels, regression coefficients)
Anchor boxes (generated by the anchor generation layer)
Ground truth boxes

Output

Good foreground/background boxes and associated class labels
Target regression coefficients
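The labelling logic of the anchor target layer can be sketched as below, given a precomputed matrix of IoU overlaps. It is a plain-Python illustration under the default thresholds listed above; the function names, the label encoding (1 foreground, 0 background, -1 “don’t care”) and the guard against ground truth boxes with zero overlap are assumptions of this sketch.

```python
import random


def label_anchors(overlaps, pos_thresh=0.7, neg_thresh=0.3):
    """overlaps[i][j] is the IoU of anchor i with ground truth box j."""
    num_anchors, num_gt = len(overlaps), len(overlaps[0])
    labels = [-1] * num_anchors
    max_per_anchor = [max(row) for row in overlaps]

    for i, best in enumerate(max_per_anchor):
        if best < neg_thresh:
            labels[i] = 0                      # background
    for j in range(num_gt):                    # type A: best anchor per gt box
        best_for_gt = max(overlaps[i][j] for i in range(num_anchors))
        for i in range(num_anchors):
            if best_for_gt > 0 and overlaps[i][j] == best_for_gt:
                labels[i] = 1
    for i, best in enumerate(max_per_anchor):  # type B: overlap above threshold
        if best >= pos_thresh:
            labels[i] = 1
    return labels


def subsample(labels, batch_size=256, fg_fraction=0.5, rng=random):
    """Randomly mark excess foreground/background anchors as don't care."""
    labels = list(labels)
    fg = [i for i, lab in enumerate(labels) if lab == 1]
    max_fg = int(batch_size * fg_fraction)
    if len(fg) > max_fg:
        for i in rng.sample(fg, len(fg) - max_fg):
            labels[i] = -1
    bg = [i for i, lab in enumerate(labels) if lab == 0]
    max_bg = batch_size - labels.count(1)
    if len(bg) > max_bg:
        for i in rng.sample(bg, len(bg) - max_bg):
            labels[i] = -1
    return labels
```

Note that a type A anchor is marked foreground even when its overlap is below the positive threshold, which guarantees every ground truth box has at least one anchor to learn from.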

The other layers (the proposal target layer, ROI Pooling layer and classification layer) are meant to generate the information needed to calculate classification loss. Just as we did for the anchor target layer, let’s first look at how classification loss is calculated and what information is needed to calculate it.

Calculating Classification Layer Loss

Similar to the RPN Loss, classification layer loss has two components – classification loss and bounding box regression loss.

The key difference between the RPN layer and the classification layer is that while the RPN layer dealt with just two classes – foreground and background, the classification layer deals with all the object classes (plus background) that our network is being trained to classify.

The classification loss is the cross entropy loss with the true object class and predicted class score as the parameters. It is calculated as shown below.

The bounding box regression loss is also calculated similarly to the RPN loss, except that now the regression coefficients are class specific. The network calculates regression coefficients for each object class. The target regression coefficients are obviously only available for the correct class, which is the object class of the ground truth bounding box that has the maximum overlap with a given anchor box. While calculating the loss, a mask array which marks the correct object class for each anchor box is used. The regression coefficients for the incorrect object classes are ignored. This mask array allows the computation of loss to be a matrix multiplication as opposed to requiring a for-each loop.

Thus the following quantities are needed to calculate classification layer loss:

Predicted class labels and bounding box regression coefficients (these are outputs of the classification network)
class labels for each anchor box
Target bounding box regression coefficients

Let’s now look at how these quantities are calculated in the proposal target and classification layers.

Proposal Target Layer

The goal of the proposal target layer is to select promising ROIs from the list of ROIs output by the proposal layer. These promising ROIs will be used to perform crop pooling from the feature maps produced by the head layer and passed to the rest of the network (head_to_tail) that calculates predicted class scores and box regression coefficients.

Similar to the anchor target layer, it is important to select good proposals (those that have significant overlap with gt boxes) to pass on to the classification layer. Otherwise, we’ll be asking the classification layer to learn a “hopeless learning task”.

The proposal target layer starts with the ROIs computed by the proposal layer. Using the max overlap of each ROI with all ground truth boxes, it categorizes the ROIs into background and foreground ROIs. Foreground ROIs are those for which max overlap exceeds a threshold (TRAIN.FG_THRESH, default: 0.5). Background ROIs are those whose max overlap falls between TRAIN.BG_THRESH_LO and TRAIN.BG_THRESH_HI (default 0.1, 0.5 respectively). This is an example of “hard negative mining” used to present difficult background examples to the classifier.

There is some additional logic that tries to make sure that the total number of foreground and background region is constant. In case too few background regions are found, it tries to fill in the batch by randomly repeating some background indices to make up for the shortfall.
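The selection logic described above can be sketched as follows. The function name `sample_rois` is hypothetical, the defaults mirror the thresholds quoted in this section (with the batch size and foreground fraction listed among this layer's parameters), and the handling of edge cases such as an empty background pool is an assumption of this sketch.

```python
import random


def sample_rois(max_overlaps, batch_size=128, fg_fraction=0.25,
                fg_thresh=0.5, bg_thresh_hi=0.5, bg_thresh_lo=0.1, rng=random):
    """Pick foreground/background ROI indices from per-ROI max IoU values."""
    fg = [i for i, o in enumerate(max_overlaps) if o >= fg_thresh]
    bg = [i for i, o in enumerate(max_overlaps)
          if bg_thresh_lo <= o < bg_thresh_hi]

    # Foreground ROIs may fill at most fg_fraction of the batch.
    num_fg = min(int(batch_size * fg_fraction), len(fg))
    fg_keep = rng.sample(fg, num_fg)

    # Background ROIs fill the remainder, repeating indices if too few exist.
    num_bg = batch_size - num_fg
    if not bg:
        bg_keep = []
    elif len(bg) >= num_bg:
        bg_keep = rng.sample(bg, num_bg)
    else:
        bg_keep = [rng.choice(bg) for _ in range(num_bg)]
    return fg_keep, bg_keep


fg_idx, bg_idx = sample_rois([0.9, 0.6, 0.3, 0.2, 0.05], batch_size=4)
```

In the example, the ROI with overlap 0.05 is never selected because it falls below the lower background threshold, and the two eligible background ROIs are repeated to fill three background slots.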

Next, bounding box regression targets are computed between each ROI and the closest matching ground truth box (this includes the background ROIs also, as an overlapping ground truth box exists for these ROIs also). These regression targets are expanded for all classes as shown in the figure below.

The bbox_inside_weights array acts as a mask. It is 1 only for the correct class for each foreground ROI. It is zero for the background ROIs as well. Thus, while computing the bounding box regression component of the classification layer loss, only the regression coefficients for the foreground regions are taken into account. This is not the case for the classification loss – the background ROIs are included as well as they belong to the “background” class.
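The expansion of per-ROI targets into class specific targets, together with the mask, can be sketched as below. The layout (four consecutive slots per class, class 0 meaning background) and the function name are assumptions made for illustration.

```python
def expand_targets(labels, targets, num_classes):
    """Spread each ROI's 4 targets into the slots of its own class.

    labels[i]  : class index of ROI i (0 = background)
    targets[i] : the 4 regression coefficients of ROI i
    Returns (expanded_targets, inside_weights), each of shape
    [num_rois][4 * num_classes].
    """
    expanded = [[0.0] * (4 * num_classes) for _ in labels]
    inside_weights = [[0.0] * (4 * num_classes) for _ in labels]
    for i, cls in enumerate(labels):
        if cls == 0:
            continue                      # background: mask stays all zero
        start = 4 * cls
        expanded[i][start:start + 4] = targets[i]
        inside_weights[i][start:start + 4] = [1.0] * 4
    return expanded, inside_weights


expanded, weights = expand_targets(
    labels=[2, 0],
    targets=[[0.1, 0.2, 0.3, 0.4], [0.5, 0.5, 0.5, 0.5]],
    num_classes=3,
)
```

Multiplying the elementwise difference between predictions and expanded targets by the mask zeroes out every slot except the four belonging to the correct class of a foreground ROI, so the loss can be computed without any per-ROI branching.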

Input:

ROIs produced by the proposal layer
ground truth information

Output:

Selected foreground and background ROIs that meet overlap criteria.
Class specific target regression coefficients for the ROIs

Parameters:

TRAIN.FG_THRESH: (default: 0.5) Used to select foreground ROIs. ROIs whose max overlap with a ground truth box exceeds FG_THRESH are marked foreground
TRAIN.BG_THRESH_HI: (default 0.5)
TRAIN.BG_THRESH_LO: (default 0.1) These two thresholds are used to select background ROIs. ROIs whose max overlap falls between BG_THRESH_HI and BG_THRESH_LO are marked background
TRAIN.BATCH_SIZE: (default 128) Maximum number of foreground and background boxes selected.
TRAIN.FG_FRACTION: (default 0.25). Number of foreground boxes can’t exceed BATCH_SIZE*FG_FRACTION

Crop Pooling

Proposal target layer produces promising ROIs for us to classify along with the associated class labels and regression coefficients that are used during training. The next step is to extract the regions corresponding to these ROIs from the convolutional feature maps produced by the head network. The extracted feature maps are then run through the rest of the network (“tail” in the network diagram shown above) to produce object class probability distribution and regression coefficients for each ROI. The job of the Crop Pooling layer is to perform region extraction from the convolutional feature maps.

The key ideas behind crop pooling are described in the paper on “Spatial Transformation Networks” (Anon. 2016)*. The goal is to apply a warping function (described by an affine transformation matrix) to an input feature map to output a warped feature map. This is shown in the figure below.

There are two steps involved in crop pooling:

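For a set of target coordinates, apply the given affine transformation to produce a grid of source coordinates:
$$\begin{bmatrix} x_i^s \\ y_i^s \end{bmatrix} = \begin{bmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \end{bmatrix} \begin{bmatrix} x_i^t \\ y_i^t \\ 1 \end{bmatrix}$$
Here $x_i^s, y_i^s, x_i^t, y_i^t$ are height/width normalized coordinates (similar to the texture coordinates used in graphics), so $-1 \le x_i^s, y_i^s, x_i^t, y_i^t \le 1$.
In the second step, the input (source) map is sampled at the source coordinates to produce the output (destination) map. In this step, each $(x_i^s, y_i^s)$ coordinate defines the spatial location in the input where a sampling kernel (for example bi-linear sampling kernel) is applied to get the value at a particular pixel in the output feature map.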

The sampling methodology described in the spatial transformation gives a differentiable sampling mechanism allowing for loss gradients to flow back to the input feature map and the sampling grid coordinates.

Fortunately, crop pooling is implemented in PyTorch and the API consists of two functions that mirror these two steps. torch.nn.functional.affine_grid takes an affine transformation matrix and produces a set of sampling coordinates and torch.nn.functional.grid_sample samples the grid at those coordinates. Back-propagating gradients during the backward step is handled automatically by PyTorch.

To use crop pooling, we need to do the following:

Divide the ROI coordinates by the stride length of the “head” network. The coordinates of the ROIs produced by the proposal target layer are in the original image space (approximately 800 × 600). To bring these coordinates into the space of the output feature maps produced by “head”, we must divide them by the stride length (16 in the current implementation).
To use the API shown above, we need the affine transformation matrix. This affine transformation matrix is computed as shown below.
We also need the number of points in the x and y dimensions on the target feature map. This is provided by the configuration parameter cfg.POOLING_SIZE (default 7). Thus, during crop pooling, non-square ROIs are used to crop out regions from the convolution feature map which are warped to square windows of constant size. This warping must be done as the output of crop pooling is passed to further convolutional and fully connected layers which need input of a fixed dimension.
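One way to build such an affine matrix for an axis-aligned ROI is sketched below. It assumes normalized coordinates in which -1 and +1 correspond to the centres of the first and last feature map cells, and the function names are hypothetical; this is an illustration of the idea, not necessarily the exact matrix used in the implementation this post follows.

```python
def image_roi_to_feature_roi(roi, stride=16):
    """Bring an (x1, y1, x2, y2) ROI from image space to feature map space."""
    return tuple(c / stride for c in roi)


def roi_to_theta(roi, feat_w, feat_h):
    """2x3 affine matrix mapping the target square [-1, 1]^2 onto the ROI."""
    x1, y1, x2, y2 = roi
    return [
        [(x2 - x1) / (feat_w - 1), 0.0, (x1 + x2 - feat_w + 1) / (feat_w - 1)],
        [0.0, (y2 - y1) / (feat_h - 1), (y1 + y2 - feat_h + 1) / (feat_h - 1)],
    ]


def source_cell(theta, xt, yt, feat_w, feat_h):
    """Map a normalized target point to feature map cell coordinates."""
    xs = theta[0][0] * xt + theta[0][1] * yt + theta[0][2]
    ys = theta[1][0] * xt + theta[1][1] * yt + theta[1][2]
    # Convert from [-1, 1] back to cell indices in [0, size - 1].
    return ((xs + 1) / 2 * (feat_w - 1), (ys + 1) / 2 * (feat_h - 1))


roi = image_roi_to_feature_roi((160, 80, 480, 320))    # (10, 5, 30, 20)
theta = roi_to_theta(roi, feat_w=50, feat_h=38)
top_left = source_cell(theta, -1.0, -1.0, 50, 38)       # close to (10, 5)
bottom_right = source_cell(theta, 1.0, 1.0, 50, 38)     # close to (30, 20)
```

The matrix has no off-diagonal terms because the ROI is axis-aligned: it only scales and translates, so the corners of the target square land on the corners of the ROI and a 7 × 7 target grid samples the region evenly regardless of its aspect ratio.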

Classification Layer

The crop pooling layer takes the ROI boxes output by the proposal target layer and the convolutional feature maps output by the “head” network and outputs square feature maps. The feature maps are then passed through layer 4 of ResNet followed by average pooling along the spatial dimensions. The result (called “fc7” in code) is a one-dimensional feature vector for each ROI. This process is shown below.

The feature vector is then passed through two fully connected layers – bbox_pred_net and cls_score_net. The cls_score_net layer produces the class scores for each bounding box (which can be converted into probabilities by applying softmax). The bbox_pred_net layer produces the class specific bounding box regression coefficients which are combined with the original bounding box coordinates produced by the proposal target layer to produce the final bounding boxes. These steps are shown below.

It’s good to recall the difference between the two sets of bounding box regression coefficients: one set produced by the RPN network and the second set produced by the classification network. The first set is used to train the RPN layer to produce good foreground bounding boxes (that fit more tightly around object boundaries). The target regression coefficients, i.e., the coefficients needed to align an anchor box with its closest matching ground truth bounding box, are generated by the anchor target layer. It is difficult to identify precisely how this learning takes place, but I’d imagine the RPN convolutional and fully connected layers learn how to turn the various image features generated by the neural network into good object bounding boxes. When we consider Inference in the next section, we’ll see how these regression coefficients are used.

The second set of bounding box coefficients is generated by the classification layer. These coefficients are class specific, i.e., one set of coefficients is generated per object class for each ROI box. The target regression coefficients for these are generated by the proposal target layer. Note that the classification network operates on square feature maps that are a result of the affine transformation (described above) applied to the head network output. However, since the regression coefficients are invariant to an affine transformation with no shear, the target regression coefficients computed by the proposal target layer can be compared with those produced by the classification network and act as a valid learning signal. This point seems obvious in hindsight, but took me some time to understand.
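
The invariance claim can be checked numerically. The sketch below assumes the standard parametrization of the coefficients (center offsets divided by box size, log ratios of widths and heights) and uses made-up boxes. It computes the targets for a ROI and a ground truth box, applies the same axis-aligned scaling and translation to both, and computes the targets again.

```python
import math


def regression_targets(roi, gt):
    """Coefficients (tx, ty, tw, th) that align roi with gt; boxes are (x1, y1, x2, y2)."""
    rw, rh = roi[2] - roi[0], roi[3] - roi[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    rcx, rcy = roi[0] + 0.5 * rw, roi[1] + 0.5 * rh
    gcx, gcy = gt[0] + 0.5 * gw, gt[1] + 0.5 * gh
    return ((gcx - rcx) / rw, (gcy - rcy) / rh, math.log(gw / rw), math.log(gh / rh))


def scale_translate(box, sx, sy, dx, dy):
    """Axis-aligned affine map with no shear: x -> sx * x + dx, y -> sy * y + dy."""
    return (sx * box[0] + dx, sy * box[1] + dy, sx * box[2] + dx, sy * box[3] + dy)


roi = (40.0, 60.0, 200.0, 180.0)  # hypothetical ROI
gt = (50.0, 50.0, 220.0, 200.0)   # hypothetical ground truth box

before = regression_targets(roi, gt)
after = regression_targets(scale_translate(roi, 0.5, 0.25, -3.0, 7.0),
                           scale_translate(gt, 0.5, 0.25, -3.0, 7.0))
```

The translation cancels in the center differences and the (positive) scale factors cancel in every ratio, so before and after agree up to floating-point error.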

It is interesting to note that while training the classification layer, the error gradients propagate to the RPN network as well. This is because the ROI box coordinates used during crop pooling are themselves network outputs, as they are a result of applying the regression coefficients generated by the RPN network to the anchor boxes. During back-propagation, the error gradients will propagate back through the crop-pooling layer to the RPN layer. Calculating and applying these gradients would be quite tricky to implement. Thankfully, the crop pooling API is provided by PyTorch as a built-in module, and the details of calculating and applying the gradients are handled internally. This point is discussed in Section 3.2 (iii) of the Faster RCNN paper (Ren et al. 2015)*.

Implementation Details: Inference

The steps carried out during inference are shown below

Anchor target and proposal target layers are not used. The RPN network is supposed to have learnt how to classify the anchor boxes into background and foreground boxes and generate good bounding box coefficients. The proposal layer simply applies the bounding box coefficients to the top ranking anchor boxes and performs NMS to eliminate boxes with a large amount of overlap. The output of these steps is shown below for additional clarity. The resulting boxes are sent to the classification layer where class scores and class specific bounding box regression coefficients are generated.

The red boxes show the top 6 anchors ranked by score. Green boxes show the anchor boxes after applying the regression parameters computed by the RPN network. The green boxes appear to fit the underlying object more tightly. Note that after applying the regression parameters, a rectangle remains a rectangle, i.e., there is no shear. Also note the significant overlap between rectangles. This redundancy is addressed by applying non-maxima suppression.

Red boxes show the top 5 bounding boxes before NMS, green boxes show the top 5 boxes after NMS. By suppressing overlapping boxes, other boxes (lower in the scores list) get a chance to move up.

From the final classification scores array (dim: n, 21), we select the column corresponding to a certain foreground object, say car. Then, we select the row corresponding to the max score in this column. This row corresponds to the proposal that is most likely to be a car. Let the index of this row be car_score_max_idx. Now, let the array of final bounding box coordinates (after applying the regression coefficients) be bboxes (dim: n, 21*4). From this array, we select the row corresponding to car_score_max_idx. We expect that the bounding box corresponding to the car column should fit the car in the test image better than the other bounding boxes (which correspond to the wrong object classes). This is indeed the case. The red box corresponds to the original proposal box, the blue box is the calculated bounding box for the car class, and the white boxes correspond to the other (incorrect) foreground classes. It can be seen that the blue box fits the actual car better than the other boxes.
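
The indexing described above can be sketched with plain Python lists. The toy data uses 3 proposals and 3 classes instead of 21, and the class order (0 = background, 1 = car, 2 = person) and the helper name best_box_for_class are assumptions made for illustration.

```python
def best_box_for_class(scores, bboxes, class_idx):
    """Return (row index, box) of the proposal with the highest score for class_idx.

    scores has one row per proposal and one column per class; bboxes has one
    row per proposal and four columns (x1, y1, x2, y2) per class.
    """
    column = [row[class_idx] for row in scores]
    best_row = max(range(len(column)), key=column.__getitem__)
    start = 4 * class_idx
    return best_row, bboxes[best_row][start:start + 4]


# Toy data: 3 proposals, 3 classes (0 = background, 1 = "car", 2 = "person").
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]
bboxes = [
    [0, 0, 0, 0, 10, 10, 50, 40, 12, 8, 48, 60],
    [0, 0, 0, 0, 20, 30, 90, 70, 25, 20, 80, 95],
    [0, 0, 0, 0, 5, 5, 30, 30, 6, 4, 28, 45],
]

car_score_max_idx, car_box = best_box_for_class(scores, bboxes, class_idx=1)
```

The other four-column slices of the same row are the boxes the network would predict if the proposal belonged to a different class, which is why they fit the car less well.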

For showing the final classification results, we apply another round of NMS and apply an object detection threshold to the class scores. We then draw all transformed bounding boxes corresponding to the ROIs that meet the detection threshold. The result is shown below.

Appendix

ResNet 50 Network Architecture

Non-Maximum Suppression (NMS)

Non-maximum suppression is a technique used to reduce the number of candidate boxes by eliminating boxes that overlap by an amount larger than a threshold. The boxes are first sorted by some criterion (usually the y coordinate of the bottom right corner). We then go through the list of boxes and suppress those boxes whose IoU overlap with the box under consideration exceeds a threshold. Sorting the boxes by the y coordinate results in the lowest box among a set of overlapping boxes being retained. This may not always be the desired outcome. NMS used in R-CNN sorts the boxes by the foreground score. This results in the box with the highest score among a set of overlapping boxes being retained. The figures below show the difference between the two approaches. The numbers in black are the foreground scores for each box. The image on the right shows the result of applying NMS to the image on the left. The first figure uses standard NMS (boxes are ranked by the y coordinate of the bottom right corner). This results in the box with a lower score being retained. The second figure uses modified NMS (boxes are ranked by foreground scores). This results in the box with the highest foreground score being retained, which is more desirable. In both cases, the overlap between the boxes is assumed to be higher than the NMS overlap threshold.
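
The difference between the two rankings can be reproduced with a small greedy NMS written in plain Python. This is a simplified sketch, not the optimized routine used in the implementation; the boxes, scores and the 0.5 threshold are made up for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, keys, threshold):
    """Greedy NMS: visit boxes by descending key, drop those overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: keys[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
    return keep


# Boxes 0 and 1 overlap heavily (IoU above 0.5); box 2 is far away.
boxes = [(10, 10, 110, 110), (20, 30, 120, 130), (300, 300, 400, 400)]
scores = [0.9, 0.6, 0.8]

by_score = nms(boxes, scores, threshold=0.5)                  # rank by foreground score
by_bottom = nms(boxes, [b[3] for b in boxes], threshold=0.5)  # rank by bottom-right y
```

Ranking by score keeps box 0 (score 0.9) and suppresses box 1, while ranking by the bottom-right y coordinate keeps the lower box 1 (score 0.6) and suppresses box 0, which mirrors the two figures described above.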

Bibliography

Anon. 2014. . October 23. https://arxiv.org/pdf/1311.2524.pdf.

Anon. 2016. . February 5. https://arxiv.org/pdf/1506.02025.pdf.

Anon. . http://link.springer.com/article/10.1007/s11263-013-0620-5.

Anon. Object Detection with Discriminatively Trained Part-Based Models - IEEE Journals & Magazine. https://doi.org/10.1109/TPAMI.2009.167.

Girshick, Ross. 2015. Fast R-CNN. arXiv.org. April 30. https://arxiv.org/abs/1504.08083.

Girshick, Ross, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2013. Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv.org. November 11. https://arxiv.org/abs/1311.2524.

Ren, Shaoqing, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv.org. June 4. https://arxiv.org/abs/1506.01497.

Yosinski, Jason, Jeff Clune, Yoshua Bengio, and Hod Lipson. 2014. How transferable are features in deep neural networks? arXiv.org. November 6. https://arxiv.org/abs/1411.1792.
