Faster R-CNN: an implementation walkthrough

Contents
- Notes on the Faster R-CNN paper
- A brief introduction to the Caffe code framework
- Faster R-CNN code analysis
- Afterword
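Notes on the Faster R-CNN paper
- Introduction
  The Faster R-CNN paper (Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun) speeds up detection by generating region proposals with a CNN. The main ideas are to reuse the feature map produced by the shared convolutional layers and to emit all candidate boxes in one pass: one branch on the feature map generates boxes, and another branch combines those boxes with the same feature map for classification. In particular, computing proposals at test time costs very little ("By sharing convolutions at test-time, the marginal cost for computing proposals is small (e.g., 10ms per image)").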
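- Main components

  This figure is taken from the Faster R-CNN paper.
  The model mainly consists of the following components:
  1. Input layer, used only during training. Each step takes one configured batch of images from the current epoch and rescales each image so that its shorter side is 600 pixels. The image order is reshuffled after every epoch.
  2. CNN layers. They take the resized image through convolution and pooling. Padding keeps the size unchanged after each convolution and each pooling halves it, so the final feature map is proportional to the input image. It is reused by both the RPN (Region Proposal Network) and the ROI layer.
  3. RPN (Region Proposal Network). Its input is an n×n sliding window over the feature map (n = 3 in the paper), and its output is a set of boxes with a score for each box. With VGG16, one sliding window has an effective receptive field of 228 pixels; with anchors, each window position maps to 9 candidate regions. This layer contributes 2 losses and sends its boxes to the ROI layer.
  4. ROI layer. It takes the proposals from the RPN and the feature map from the CNN, extracts the feature map of each proposal, and feeds it to the classifier.
  5. Classification layer. It takes the features from the ROI layer and outputs the classification result. It has two losses: a classification loss and a box loss.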
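- CNN layers. The structure of the convolutional layers is as follows:

  Faster R-CNN convolutional layers

  There are 13 convolutional layers, each followed by a ReLU activation, and 4 pooling layers. This is the Caffe prototxt for the CNN part: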
```
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
# intermediate layers omitted here
layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu5_3"
  type: "ReLU"
  bottom: "conv5_3"
  top: "conv5_3"
}
```
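  Every convolution uses kernel size 3 and pad 1. From the output-size formula in the cs231n convolution notes, (W - 3 + 2)/1 + 1 = W, so the width and height of a convolution's output equal those of its input. The pooling layers use kernel size = 2 and stride = 2 with max pooling, so each pooling halves the width and height.
  There are 4 poolings in total, the final convolution outputs 512 channels (VGG16), and the feature map is 1/16 the size of the rescaled input image. The final output of the convolutional layers is 'conv5_3'. It feeds two branches: one goes to the RPN to compute boxes, and the other goes to the ROI layer, which extracts the corresponding features for classification.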
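
The size arithmetic above can be checked with a short sketch. The helper names are hypothetical, and the pooling formula assumes that Caffe rounds the pooled size up (ceil), which is why a side of 600 gives 38 rather than 37.

```python
def conv_out(size, kernel=3, pad=1, stride=1):
    """Output width/height of a convolution: (W - F + 2P) / S + 1."""
    return (size - kernel + 2 * pad) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output size of a pooling layer, assuming the size is rounded up."""
    return -(-(size - kernel) // stride) + 1


def vgg16_conv5_size(size):
    """Side length of conv5_3 for an input side of `size` pixels."""
    # VGG16 up to conv5_3: blocks of 2, 2, 3, 3, 3 convolutions, with a
    # 2x2 / stride-2 max pool after each of the first four blocks.
    convs_per_block = [2, 2, 3, 3, 3]
    for block, n_convs in enumerate(convs_per_block):
        for _ in range(n_convs):
            size = conv_out(size)
        if block < 4:
            size = pool_out(size)
    return size


print(vgg16_conv5_size(600), vgg16_conv5_size(1000))
```

For a 600x1000 input this gives a feature map of roughly 1/16 the input size in each dimension.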
- Region Proposal Networks(RPN)
  
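  This is the network in the model that generates boxes. Its input is an n×n sliding window over the CNN feature map, and its output is the boxes believed to contain an object together with their scores. One sliding window effectively covers 228x228 pixels. After mapping through the anchors (by default 3 scales and 3 aspect ratios: 0.5, 1 and 2), each window becomes 9 boxes. The figure below is from the paper and is for VGG.
  The figure shows how well the anchor boxes adapt to different sizes and aspect ratios. A feature map typically has about 2,400 sliding-window positions, so there are about 20k anchors in total ("For a convolutional feature map of a size W × H (typically ∼2,400), there are WHk anchors in total."). The anchor design is a key point: the image does not have to be resized to several scales with features recomputed each time, because all anchor predictions are based on the same features ("The design of multiscale anchors is a key component for sharing features without extra cost for addressing scales.").
  The RPN takes a 512xHxW feature map. After one convolution it splits into 2 branches: one produces the scores of the k anchors (2 cls values each, the FG and BG scores), and the other produces the corresponding boxes (4 bbox values each, describing a rectangle). The network structure is as follows: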
```
layer {
  name: "rpn_conv/3x3"
  type: "Convolution"
  bottom: "conv5_3"
  top: "rpn/output"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 512
    kernel_size: 3 pad: 1 stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "rpn_relu/3x3"
  type: "ReLU"
  bottom: "rpn/output"
  top: "rpn/output"
}
```
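
The nine anchors per sliding-window position described above can be sketched as follows. This is a simplified illustration, not the project's generate_anchors: the function names are hypothetical, no rounding is applied, and the base size of 16 with scales (8, 16, 32) and ratios (0.5, 1, 2) is assumed.

```python
import numpy as np


def make_anchors(base_size=16, ratios=(0.5, 1.0, 2.0), scales=(8, 16, 32)):
    """Return a (len(ratios) * len(scales), 4) array of (x1, y1, x2, y2)
    boxes centered on one base_size x base_size cell."""
    center = (base_size - 1) / 2.0
    anchors = []
    for ratio in ratios:              # ratio = height / width
        for scale in scales:
            area = float(base_size * scale) ** 2
            w = np.sqrt(area / ratio)
            h = w * ratio
            anchors.append([center - w / 2, center - h / 2,
                            center + w / 2, center + h / 2])
    return np.array(anchors)


def shift_anchors(anchors, feat_h, feat_w, feat_stride=16):
    """Replicate the base anchors at every feature-map position."""
    shift_x, shift_y = np.meshgrid(np.arange(feat_w) * feat_stride,
                                   np.arange(feat_h) * feat_stride)
    shifts = np.stack([shift_x.ravel(), shift_y.ravel(),
                       shift_x.ravel(), shift_y.ravel()], axis=1)
    # (positions, 1, 4) + (1, k, 4) -> (positions, k, 4) -> (positions * k, 4)
    return (shifts[:, None, :] + anchors[None, :, :]).reshape(-1, 4)


base = make_anchors()
all_anchors = shift_anchors(base, feat_h=38, feat_w=50)
print(base.shape, all_anchors.shape)
```

A 38x50 feature map with 9 anchors per position gives 17,100 anchors, the same order of magnitude as the "about 20k" quoted above.
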
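  Suppose the original training image has shape (3, h_original, w_original) and each batch holds one image. After resizing it becomes (1, 3, h_resized, w_resized), and after the CNN convolutions and poolings it becomes (1, 512, h_conv, w_conv), where w_resized/16 = w_conv and h_resized/16 = h_conv. After 'rpn_conv/3x3' (F = 3, P = 1, S = 1) the size is still (1, 512, h_conv, w_conv), but the content has been transformed from the image feature map into the base features for the RPN (one extra transformation of the CNN feature map to suit the RPN loss). The number of sliding windows equals w_conv×h_conv and the total number of anchors is w_conv×h_conv×k (k = 9). In other words, a single RPN convolution covers proposal generation for the whole feature map, which is very cheap thanks to GPU parallelism.
  The output of 'rpn_conv/3x3' is the input of both 'rpn_bbox_pred' and 'rpn_cls_score'. 'rpn_cls_score' outputs shape (1, 18, h_conv, w_conv), where 18 corresponds to 2 scores for each of the 9 anchors. For an input blob of shape (N, C, H, W), NxHxW must equal the number of predictions/labels, so a reshape is needed here (the parameter is shape { dim: 0 dim: 2 dim: -1 dim: 0 }). Before the cls loss and the softmax, the shape therefore becomes (1, 2, 9×h_conv, w_conv). See the explanation in softmax_loss_layer.cpp: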
  
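  All anchor scores for the image then go two ways: one to the loss, and one through softmax to obtain the FG and BG probabilities. 'rpn_cls_prob' has shape (1, 2, 9*h_conv, w_conv). Reshaping it back to (1, 18, h_conv, w_conv) gives the probabilities of the 9 anchors at every window, which go to the proposal layer together with the corresponding boxes. The other output of 'rpn_conv/3x3' goes to 'rpn_bbox_pred', which computes the boxes with shape (1, 36, h_conv, w_conv). 'rpn_bbox_pred' feeds the box loss on one side and the proposal layer on the other. The proposal layer combines the probabilities and the boxes to generate proposals for the ROI layer. The overall flow is as follows: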
  
  RPN network
  
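  This part is easy to get confused about, especially because the layers inside mix Python layers with losses implemented in C++. It is much easier to follow alongside a diagram of the prototxt.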
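
The reshape, softmax, reshape-back sequence described above can be reproduced with NumPy on a small random tensor. This is only a shape-level sketch: the sizes are made up, and it assumes that channels 0 to 8 hold the BG scores and channels 9 to 17 the FG scores of the 9 anchors.

```python
import numpy as np

k, h_conv, w_conv = 9, 4, 5
rng = np.random.default_rng(0)
rpn_cls_score = rng.standard_normal((1, 2 * k, h_conv, w_conv))

# Equivalent of the Caffe reshape { dim: 0 dim: 2 dim: -1 dim: 0 }:
# keep N, force 2 channels, infer the third axis, keep W.
scores = rpn_cls_score.reshape(1, 2, -1, w_conv)

# Softmax over the channel axis (axis 1): one BG/FG pair per anchor.
exp = np.exp(scores - scores.max(axis=1, keepdims=True))
prob = exp / exp.sum(axis=1, keepdims=True)

# Back to one channel per (class, anchor) pair.
rpn_cls_prob = prob.reshape(1, 2 * k, h_conv, w_conv)
print(scores.shape, rpn_cls_prob.shape)
```

After the round trip, channel a and channel k + a hold the BG and FG probabilities of anchor a at each position, so they sum to 1.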
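- Loss computation and training
  The RPN loss has two parts, the score loss and the bbox loss. Quoting the paper:
  $$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\frac{1}{N_{reg}}\sum_i p_i^* L_{reg}(t_i, t_i^*)$$
  Here $i$ is the index of an anchor in the mini-batch and $p_i$ is the predicted probability that anchor $i$ is an object. $p_i^* = 1$ if anchor $i$ is a positive (ground-truth) anchor, else 0. $t_i$ holds the 4 values of the predicted box and $t_i^*$ the 4 values of the ground-truth box of a positive anchor. $L_{cls}$ is a two-class log loss, and $L_{reg}(t_i, t_i^*) = R(t_i - t_i^*)$, where $R$ is the robust loss (smooth L1). $\lambda$ balances the two losses and defaults to 10. The box regression uses the following parameterization. For the predicted box: $t_x = (x - x_a)/w_a$, $t_y = (y - y_a)/h_a$, $t_w = \log(w/w_a)$, $t_h = \log(h/h_a)$. For the ground-truth box: $t_x^* = (x^* - x_a)/w_a$, $t_y^* = (y^* - y_a)/h_a$, $t_w^* = \log(w^*/w_a)$, $t_h^* = \log(h^*/h_a)$. Here $x, y$ are the box center coordinates and $w, h$ its width and height; $x$ is the predicted coordinate, $x_a$ the anchor coordinate, and $x^*$ the ground-truth coordinate, and likewise for $y, w, h$. As the paper puts it, the purpose is that 'This can be thought of as bounding-box regression from an anchor box to a nearby ground-truth box.' Bounding-box regression is based on the same feature map; the regressors for each scale and ratio do not share weights, and each independently regresses its own box. Starting from reference boxes of different sizes and aspect ratios, the convolutional regressors yield k (9) boxes that approximate the ground truth. The original text is as follows: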
  
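  Most anchors in an image are negatives, which makes the data unbalanced, and about 20k anchors is too many to train on. So 128 positive and 128 negative anchors are sampled at random, and if there are fewer than 128 positives the rest are filled with negatives. Both the paper and the code train on one image at a time with SGD. Training can alternate between the RPN and the RCNN, iterating back and forth, or merge them into one large network in which each part computes its own loss. The author's experiments show that the merged network trains 1~1.5 times faster at about the same accuracy.
  In addition, anchors that cross the image boundary are dropped. Many anchors overlap the same ground-truth region, so non-maximum suppression (NMS) with an IoU threshold of 0.7 is applied, which leaves about 2k proposals per image. The author notes that NMS does not noticeably hurt accuracy while significantly improving efficiency. The author then gives ablation experiments that show the effect of each point, for example comparing whether the RPN and the RCNN share convolutional layers, or validating the RPN by replacing it with SS followed by ZF/VGG16 and looking at the accuracy. This pluggable way of assembling experiments is very good because it verifies what each point actually contributes, but ablation experiments are often costly to implement. The paper only gives the ideas and the key points; for the details in practice you still have to read the code.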
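
The box parameterization and the robust loss from the loss discussion above can be written as a minimal sketch. The function names are hypothetical, and widths are taken as x2 - x1 without the +1 pixel convention that the project code may use.

```python
import numpy as np


def to_center(box):
    """(x1, y1, x2, y2) -> (center x, center y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + 0.5 * w, y1 + 0.5 * h, w, h


def encode(anchor, box):
    """Regression targets (tx, ty, tw, th) of `box` relative to `anchor`."""
    xa, ya, wa, ha = to_center(anchor)
    x, y, w, h = to_center(box)
    return np.array([(x - xa) / wa, (y - ya) / ha,
                     np.log(w / wa), np.log(h / ha)])


def decode(anchor, t):
    """Inverse of encode: apply (tx, ty, tw, th) to `anchor`."""
    xa, ya, wa, ha = to_center(anchor)
    x, y = t[0] * wa + xa, t[1] * ha + ya
    w, h = np.exp(t[2]) * wa, np.exp(t[3]) * ha
    return np.array([x - 0.5 * w, y - 0.5 * h, x + 0.5 * w, y + 0.5 * h])


def smooth_l1(x):
    """Robust loss R: 0.5 * x^2 if |x| < 1, else |x| - 0.5."""
    x = np.abs(x)
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)


anchor = np.array([0.0, 0.0, 100.0, 100.0])
gt = np.array([10.0, 20.0, 130.0, 100.0])
t_star = encode(anchor, gt)                 # ground-truth targets
t_pred = np.zeros(4)                        # a prediction of "no change"
print(smooth_l1(t_pred - t_star).sum())     # regression loss for this anchor
```

Decoding the targets against the same anchor recovers the ground-truth box, which is the "regression from an anchor box to a nearby ground-truth box" quoted above.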

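A brief introduction to the Caffe code framework

Overall structure of Caffe

To understand the implementation details of Faster R-CNN you need to understand the structure of Caffe and how to write your own custom layer.

Source layout

The main directory structure is:
- include holds the exposed C++ interfaces and classes
- python is the Python interface, which uses Boost.Python wrappers to translate Python calls into C++ calls
- matlab is the MATLAB interface layer
- src is the Caffe implementation

  The structure is as follows: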

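Constructing the Solver and the Net
Solver is a base class that wraps Caffe's external training and testing operations, similar to an optimizer in TensorFlow. Concrete solvers such as SGD and Adam sit on top of it and differ in how they update parameters during backpropagation. Besides constructing the SGDSolver class directly, you can create one from Python: self.solver = caffe.SGDSolver(solver_prototxt). The common base operations are maintained in the Solver class.
Take the construction of an SGDSolver to see the structure and operations involved. The SGDSolver constructor is implemented directly in the header. It mainly clears the history, update, and temporary backup parameters; most of the work is done in Solver.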

```
template <typename Dtype>
class SGDSolver : public Solver<Dtype> {
 public:
  explicit SGDSolver(const SolverParameter& param)
      : Solver<Dtype>(param) { PreSolve(); }
  explicit SGDSolver(const string& param_file)
      : Solver<Dtype>(param_file) { PreSolve(); }
  virtual inline const char* type() const { return "SGD"; }
  // (other members omitted)

 protected:
  // history maintains the historical momentum data.
  // update maintains update related data and is not needed in snapshots.
  // temp maintains other information that might be needed in computation
  //   of gradients/updates and is not needed in snapshots
  vector<shared_ptr<Blob<Dtype> > > history_, update_, temp_;
};

template <typename Dtype>
void SGDSolver<Dtype>::PreSolve() {
  // Initialize the history
  const vector<Blob<Dtype>*>& net_params = this->net_->learnable_params();
  history_.clear();
  update_.clear();
  temp_.clear();
  for (int i = 0; i < net_params.size(); ++i) {
    const vector<int>& shape = net_params[i]->shape();
    history_.push_back(shared_ptr<Blob<Dtype> >(new Blob<Dtype>(shape)));
    update_.push_back(shared_ptr<Blob<Dtype> >(new Blob<Dtype>(shape)));
    temp_.push_back(shared_ptr<Blob<Dtype> >(new Blob<Dtype>(shape)));
  }
}
```

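Now look at the Solver constructor. By default root_solver = nullptr. void ReadSolverParamsFromTextFileOrDie(const string& param_file, SolverParameter* param) mainly deserializes the proto into a SolverParameter object and handles compatibility with older versions. The main code is in Init.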

```
template <typename Dtype>
Solver<Dtype>::Solver(const string& param_file, const Solver* root_solver)
    : net_(), callbacks_(), root_solver_(root_solver),
      requested_early_exit_(false) {
  SolverParameter param;
  ReadSolverParamsFromTextFileOrDie(param_file, &param);
  Init(param);
}
```

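Init() performs the necessary initialization and checks, for example of iter_ and current_step_. The two are related by this->current_step_ = this->iter_ / this->param_.stepsize(); stepsize is specified in solver.prototxt and governs learning-rate changes.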

    template <typename Dtype>
    void Solver<Dtype>::Init(const SolverParameter& param) {
      CHECK(Caffe::root_solver() || root_solver_)
          << "root_solver_ needs to be set for all non-root solvers";
      LOG_IF(INFO, Caffe::root_solver()) << "Initializing solver from parameters: "
        << std::endl << param.DebugString();
      param_ = param;
      CHECK_GE(param_.average_loss(), 1) << "average_loss should be non-negative.";
      CheckSnapshotWritePermissions();
      if (Caffe::root_solver() && param_.random_seed() >= 0) {
        Caffe::set_random_seed(param_.random_seed());
      }
      // Scaffolding code
      InitTrainNet();
      if (Caffe::root_solver()) {
        InitTestNets();
        LOG(INFO) << "Solver scaffolding done.";
      }
      iter_ = 0;
      current_step_ = 0;
    }

Next look at InitTrainNet(). Pseudo-code is used here to highlight the key points and the flow; the log line below shows where the code goes:

solver.cpp:81] Creating training net from train_net file: models/pascal_voc/VGG16/faster_rcnn_end2end/train.prototxt

    void Solver<Dtype>::InitTrainNet() {
      // check the training parameters: which net parameters are given,
      // whether a training file is specified, and so on
      deserialize train net file -> net_param
      net_.reset(new Net<Dtype>(net_param));
    }

The important part is the Net initialization. The extracted pseudo-code is:

```
template <typename Dtype>
void Net<Dtype>::Init(const NetParameter& in_param) {
  // Filter the parameters
  NetParameter filtered_param;
  FilterNet(in_param, &filtered_param);
  // Create a copy of filtered_param with splits added where necessary.
  NetParameter param;
  InsertSplits(filtered_param, &param);
  memory_used_ = 0;
  // set the input blobs
  for (int input_id = 0; input_id < param.input_size(); ++input_id) {
    // inputs have fake layer ID -1; this sets up the input data blobs
    const int layer_id = -1;
    // Helper for Net::Init: add a new input or top blob to the net.  (Inputs have
    // layer_id == -1, tops have layer_id >= 0.)
    // Fills in key members: vector<shared_ptr<Blob<Dtype> > > blobs_ (the blobs
    // storing intermediate results between the layers), blob_names_,
    // blob_need_backward_, net_input_blob_indices_, net_input_blobs_, etc.
    AppendTop(param, layer_id, input_id, &available_blobs, &blob_name_to_idx);
  }
  for (int layer_id = 0; layer_id < param.layer_size(); ++layer_id) {
    const LayerParameter& layer_param = param.layer(layer_id);
    // Construct each layer. This uses the class-factory pattern: macros put the
    // creator functions into a registry. The layer sets up its blobs_, which
    // the net later shares by reference along several dimensions.
    layers_.push_back(LayerRegistry<Dtype>::CreateLayer(layer_param));
    // Figure out this layer's input and output
    for (int bottom_id = 0; bottom_id < layer_param.bottom_size();
         ++bottom_id) {
      // Set up each input blob of the layer; bottom_vecs_ and blobs_ share
      // the blob objects through pointers
      const int blob_id = AppendBottom(param, layer_id, bottom_id,
                                       &available_blobs, &blob_name_to_idx);
      // If a blob needs backward, this layer should provide it.
      need_backward |= blob_need_backward_[blob_id];
    }
    // Set up each output of the layer; top_vecs_ and blobs_ share the blob
    // objects through pointers
    for (int top_id = 0; top_id < num_top; ++top_id) {
      AppendTop(param, layer_id, top_id, &available_blobs, &blob_name_to_idx);
    }
    // Depending on layer->AutoTopBlobs(), create the automatic top blobs
    // Call the layer's own setup function
    layers_[layer_id]->SetUp(bottom_vecs_[layer_id], top_vecs_[layer_id]);
    // Set the backward flag from whether each parameter has a learning rate,
    // and register the layer's parameters
    for (int param_id = 0; param_id < num_param_blobs; ++param_id) {
      layers_[layer_id]->set_param_propagate_down(param_id, param_need_backward);
      AppendParam(param, layer_id, param_id);
    }
  }
  // Handle force_backward if needed.
  for (int layer_id = layers_.size() - 1; layer_id >= 0; --layer_id) {
    // set layer_contributes_loss flag
    // set layer_need_backward_
  }
  // In the end, all remaining blobs are considered output blobs.
  for (set<string>::iterator it = available_blobs.begin();
       it != available_blobs.end(); ++it) {
    net_output_blobs_.push_back(blobs_[blob_name_to_idx[*it]].get());
    net_output_blob_indices_.push_back(blob_name_to_idx[*it]);
  }
  LOG_IF(INFO, Caffe::root_solver()) << "Network initialization done.";
}
```

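This completes the solver -> net -> layer initialization. How each custom layer implementation (convolution, pooling, custom layers) plugs into the framework is analyzed later. The whole process is illustrated below:

SGDSolver construction

One training step
Once the network is built, training can start. The usual procedure is: read a batch of data -> forward pass -> compute the loss against the ground truth -> backpropagate the partial derivatives to every trainable layer and update the parameters according to the training strategy.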

  while (cur < max_repeat){
    data, ground_truth = read_data()
    result_calc = forward_propagation(data);
    loss = calc_loss(result_calc, ground_truth);
    dws = compute_partial_derivative_4w(loss)
    update_w_by_strategy()
  }
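
The same loop can be made concrete on a toy problem. The sketch below is not Caffe code: it fits a two-parameter linear model with plain mini-batch SGD in NumPy, and the data, learning rate, and iteration count are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -3.0])
X = rng.standard_normal((256, 2))
y = X @ true_w                                   # noise-free ground truth

w = np.zeros(2)
lr, batch_size, max_iters = 0.1, 32, 200
for it in range(max_iters):
    idx = rng.integers(0, len(X), size=batch_size)
    xb, yb = X[idx], y[idx]                      # read a batch of data
    pred = xb @ w                                # forward pass
    loss = 0.5 * np.mean((pred - yb) ** 2)       # loss against ground truth
    grad = xb.T @ (pred - yb) / batch_size       # backward pass: dloss/dw
    w -= lr * grad                               # update by the SGD rule

print(w, loss)
```

Each line of the loop body corresponds to one step of the pseudo-code above.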

Caffe wraps one training iteration in a step; SGDSolver directly calls Solver's Step. The key parts are extracted below:

void Solver::Step(int iters) {
    end_iter  = cur + iters
    while (cur < end_iter){
        clear_up()
        insert_test_if_need()
        hookup_before()
        Dtype loss = 0;
        for (int i = 0; i < param_.iter_size(); ++i) {
            loss += net_->ForwardBackward(bottom_vec);
        }
        loss /= param_.iter_size();
        // average the loss across iterations for smoothed reporting: if average_loss is n,
        // the loss container keeps the previous n loss values and smoothed_loss_ is their average
        UpdateSmoothedLoss(loss, start_iter, average_loss);
        hookup_after()
        ApplyUpdate();
        take_snapshot_if_necessary()
    }
}

Clearly the important parts are net_'s `ForwardBackward(const vector<Blob<Dtype>*>& bottom)` and ApplyUpdate().
First look at Net's ForwardBackward. The code is very simple:

```
Dtype ForwardBackward(const vector<Blob<Dtype>*>& bottom) {
  Dtype loss;
  Forward(bottom, &loss);
  Backward();
  return loss;
}
```

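One thing looks odd here: the `vector<Blob<Dtype>*> bottom_vec;` declared in Step(int iters) is passed straight into the forward pass without receiving any input. Following the code, an empty vector really is fed into the network's input blobs 'net_input_blobs_'. Taking the Faster R-CNN training network as an example, the network itself contains a data input layer (which handles reading the data, shuffling, and so on), and ForwardBackward() gets no extra initialization in any of the test cases either.
So nothing is put into net_input_blobs_.

```
template <typename Dtype>
const vector<Blob<Dtype>*>& Net<Dtype>::Forward(
    const vector<Blob<Dtype>*>& bottom, Dtype* loss) {
  // Copy bottom to internal bottom
  for (int i = 0; i < bottom.size(); ++i) {
    net_input_blobs_[i]->CopyFrom(*bottom[i]);
  }
  return ForwardPrefilled(loss);
}
```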

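ForwardPrefilled(Dtype* loss) calls ForwardFromTo(int start, int end). A forward pass over the whole network is needed here, so it is *loss = ForwardFromTo(0, layers_.size() - 1);. With the redundant checks and debug output removed, the code is very compact. This is where the layers are chained in order for the forward pass and their losses are summed; each layer only has to implement its own Forward function.

```
template <typename Dtype>
Dtype Net<Dtype>::ForwardFromTo(int start, int end) {
  Dtype loss = 0;
  for (int i = start; i <= end; ++i) {
    // LOG(ERROR) << "Forwarding " << layer_names_[i];
    Dtype layer_loss = layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]);
    loss += layer_loss;
  }
  return loss;
}
```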

After Forward(bottom, &loss); completes, backpropagation follows with Backward(). Apart from printing debug information, Backward() just calls BackwardFromTo(layers_.size() - 1, 0);

```
template <typename Dtype>
void Net<Dtype>::BackwardFromTo(int start, int end) {
  for (int i = start; i >= end; --i) {
    if (layer_need_backward_[i]) {
      layers_[i]->Backward(top_vecs_[i], bottom_need_backward_[i], bottom_vecs_[i]);
    }
  }
}
```

Every custom Caffe layer implements its own Backward function: from the loss partial derivatives (error gradients) of the layer above, it computes the partial derivatives for this layer's inputs. propagate_down indicates whether the loss gradient is computed for the corresponding 'bottom'. The function prototype is:

```
/**
 * @brief Given the top blob error gradients, compute the bottom blob error
 *        gradients.
 *
 * @param top
 *     the output blobs, whose diff fields store the gradient of the error
 *     with respect to themselves
 * @param propagate_down
 *     a vector with equal length to bottom, with each index indicating
 *     whether to propagate the error gradients down to the bottom blob at
 *     the corresponding index
 * @param bottom
 *     the input blobs, whose diff fields will store the gradient of the error
 *     with respect to themselves after Backward is run
 *
 * The Backward wrapper calls the relevant device wrapper function
 * (Backward_cpu or Backward_gpu) to compute the bottom blob diffs given the
 * top blob diffs.
 *
 * Your layer should implement Backward_cpu and (optionally) Backward_gpu.
 */
inline void Backward(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom);
```

After one backward sweep, bottom_vecs_ holds the gradient information. One point worth noting: net_ contains everything (gradients, parameters, intermediate inputs and outputs), and bottom_vecs_ points at some of the blobs in blobs_.

```
/// @brief the blobs storing intermediate results between the layer.
vector<shared_ptr<Blob<Dtype> > > blobs_;

/// bottom_vecs stores the vectors containing the input for each layer.
/// They don't actually host the blobs (blobs_ does), so we simply store
/// pointers.
vector<vector<Blob<Dtype>*> > bottom_vecs_;
bottom_vecs_[layer_id].push_back(blobs_[blob_id].get());
```

That completes one forward pass to compute the loss and one backward pass to compute the error gradients. What remains is how to update the parameters, taking plain SGD as the example:

```
template <typename Dtype>
void SGDSolver<Dtype>::ApplyUpdate() {
  Dtype rate = GetLearningRate();
  ClipGradients();
  for (int param_id = 0; param_id < this->net_->learnable_params().size();
       ++param_id) {
    Normalize(param_id);
    Regularize(param_id);
    ComputeUpdateValue(param_id, rate);
  }
  this->net_->Update();
}
```

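For what clip gradients means in Caffe, see the discussion "What does clip gradient mean in Caffe?". Roughly, it limits the gradient magnitude, and it does not affect the main flow.
Every learnable parameter goes through Normalize and Regularize, and then its update value is computed. Earlier, during Init, AppendParam(net_param, layer_id, param_id); set up the mapping for each layer: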

params_.push_back(layers_[layer_id]->blobs()[param_id]);
if (xx condition){
    ...
    const int learnable_param_id = learnable_params_.size();
    learnable_params_.push_back(params_[net_param_id].get());
    ...
}

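Updating the parameters means applying an axpy operation to the learnable blobs. In CPU mode this usually calls BLAS cblas_daxpy(N, alpha, X, 1, Y, 1); in GPU mode it is cublasSaxpy(Caffe::cublas_handle(), N, &alpha, X, 1, Y, 1). The operation is data = alpha*diff + data, which completes the parameter update:

Blob parameter update based on the error gradient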
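
The axpy update above can be mimicked in NumPy. This is a sketch of the semantics only; it assumes that diff already holds the learning-rate-scaled update and that alpha is -1, and the values are made up.

```python
import numpy as np


def axpy(alpha, x, y):
    """In-place y <- alpha * x + y, the semantics of BLAS axpy."""
    y += alpha * x
    return y


data = np.array([1.0, 2.0, 3.0])    # parameter values of a blob
diff = np.array([0.1, -0.2, 0.3])   # update value stored in the blob's diff
axpy(-1.0, diff, data)              # data = -1 * diff + data
print(data)
```

The update happens in place, which matches the way a blob's data is overwritten by the solver.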

That gives a rough picture of one iteration: forward -> loss and backward -> update.

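Writing your own Caffe layer

Plugging in a custom C++ layer
As mentioned in the discussion of Solver Init, a Layer is instantiated through a class factory in which each Layer registers a pointer to its creator function. In Net::Init this takes a single line: `layers_.push_back(LayerRegistry<Dtype>::CreateLayer(layer_param));`
A brief look at the structure of LayerRegistry: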

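```
template <typename Dtype>
class LayerRegistry {
 public:
  // function pointer type
  typedef shared_ptr<Layer<Dtype> > (*Creator)(const LayerParameter&);
  typedef std::map<string, Creator> CreatorRegistry;

  static CreatorRegistry& Registry() {
    // global map from a type name to the function that constructs the layer
    static CreatorRegistry* g_registry_ = new CreatorRegistry();
    return *g_registry_;
  }

  // Adds a creator, i.e. registers a layer type.
  static void AddCreator(const string& type, Creator creator) {
    CreatorRegistry& registry = Registry();
    // check that the type is not registered yet ...
    registry[type] = creator;
  }

  // Get a layer using a LayerParameter: constructs a new layer object.
  static shared_ptr<Layer<Dtype> > CreateLayer(const LayerParameter& param) {
    const string& type = param.type();
    CreatorRegistry& registry = Registry();
    // routine checks ...
    return registry[type](param);
  }

 private:
  // private constructor: the registry is never instantiated
  LayerRegistry() {}
};
```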

LayerRegistry holds the registered entries, and LayerRegisterer adds them. The code is:

```
template <typename Dtype>
class LayerRegisterer {
 public:
  LayerRegisterer(const string& type,
                  shared_ptr<Layer<Dtype> > (*creator)(const LayerParameter&)) {
    LayerRegistry<Dtype>::AddCreator(type, creator);
  }
};

#define REGISTER_LAYER_CREATOR(type, creator)                                  \
  static LayerRegisterer<float> g_creator_f_##type(#type, creator<float>);     \
  static LayerRegisterer<double> g_creator_d_##type(#type, creator<double>)    \

#define REGISTER_LAYER_CLASS(type)                                             \
  template <typename Dtype>                                                    \
  shared_ptr<Layer<Dtype> > Creator_##type##Layer(const LayerParameter& param) \
  {                                                                            \
    return shared_ptr<Layer<Dtype> >(new type##Layer<Dtype>(param));           \
  }                                                                            \
  REGISTER_LAYER_CREATOR(type, Creator_##type##Layer)
```

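As soon as a LayerRegisterer constructor runs, the creator is placed in the LayerRegistry class factory, and objects can be instantiated from it afterwards. Caffe uses macros to generate this code and so adds custom layers to the framework; see the comments in layer_factory.hpp.

layer_factory.hpp

In other words, adding the REGISTER_LAYER_CLASS macro to the layer's cpp file is enough. nginx uses a similar macro technique to control which code is compiled when you build your own plug-in and specify which phases it covers.

roi_pooling_layer.cpp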

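REGISTER_LAYER_CLASS(ROIPooling); expands to the following code:

```
template <typename Dtype>
shared_ptr<Layer<Dtype> > Creator_ROIPoolingLayer(const LayerParameter& param)
{
  return shared_ptr<Layer<Dtype> >(new ROIPoolingLayer<Dtype>(param));
}
// This calls the LayerRegisterer constructor, which adds the creator to the
// LayerRegistry. One entry is created for float and one for double.
static LayerRegisterer<float> g_creator_f_ROIPooling("ROIPooling", Creator_ROIPoolingLayer<float>);
static LayerRegisterer<double> g_creator_d_ROIPooling("ROIPooling", Creator_ROIPoolingLayer<double>);
```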

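Custom Python layers. Caffe natively has a layer type 'Python' so that Python programmers can write their own layers. It is implemented in PythonLayer using Boost.Python. First look at the definition of a simple Python layer in Faster R-CNN: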

```
layer {
  name: 'input-data'
  # the layer type
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    # the Python module (file)
    module: 'roi_data_layer.layer'
    # the class inside that module
    layer: 'RoIDataLayer'
    # parameters passed to the Python code
    param_str: "'num_classes': 21"
  }
}
```

That is the definition of a simple Python layer. Leaving aside its specific meaning, look at the interface first: like a C++ layer, it must implement forward, backward, setup, and reshape.

```
import caffe
import numpy as np
import yaml


class RoIDataLayer(caffe.Layer):
    def setup(self, bottom, top):
        """Setup the RoIDataLayer."""
        layer_params = yaml.load(self.param_str_)
        # parameters defined in the prototxt are passed into the code
        self._num_classes = layer_params['num_classes']
        # ...

    def forward(self, bottom, top):
        """Get blobs and copy them into this layer's top blob vector."""
        blobs = self._get_next_minibatch()

        for blob_name, blob in blobs.iteritems():
            top_ind = self._name_to_top_map[blob_name]
            # Reshape net's input blobs
            top[top_ind].reshape(*(blob.shape))
            # Copy data into net's input blobs
            top[top_ind].data[...] = blob.astype(np.float32, copy=False)

    def backward(self, top, propagate_down, bottom):
        """This layer does not propagate gradients."""
        pass

    def reshape(self, bottom, top):
        """Reshaping happens during the call to forward."""
        pass
```

Note that a Python layer can only run in CPU mode and cannot use the GPU efficiently, so there is a trade-off to make when using one.

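Faster R-CNN code analysis

Training
With the convolutional layers collapsed, the training network has the structure below. The loss has 4 parts in total. The RPN part corresponds to the paper's $L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\frac{1}{N_{reg}}\sum_i p_i^* L_{reg}(t_i, t_i^*)$. Besides the built-in convolution, pooling, and ReLU layers, there are custom Python layers and C++ layers.
Data starts at the input layer (implemented in Python), which reads one batch of images. After the convolutions, the feature map goes to the RPN on one branch and to the ROI layer (a custom C++ layer) on the other. The ROI layer uses the proposals sent by the RPN to extract each proposal's feature map for classification, which yields the classification loss and the loss of the second bbox regression.

Code analysis starting from Python
Faster R-CNN can be trained either in alternating stages or as one unified network. Because the two give similar accuracy and the latter is 1~1.5 times faster, this article analyzes the unified network throughout. The training and testing procedure is described in the earlier post on Faster R-CNN training and detection with Python + Caffe. First, look at how training enters Caffe's internals. The entry point is the faster_rcnn_end2end.sh script, whose main code is: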

time ./tools/train_net.py --gpu ${GPU_ID} \
--solver models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt \
--weights data/imagenet_models/${NET}.v2.caffemodel \
--imdb ${TRAIN_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/faster_rcnn_end2end.yml \
${EXTRA_ARGS}

time ./tools/test_net.py --gpu ${GPU_ID} \
--def models/${PT_DIR}/${NET}/faster_rcnn_end2end/test.prototxt \
--net ${NET_FINAL} \
--imdb ${TEST_IMDB} \
--cfg experiments/cfgs/faster_rcnn_end2end.yml \
${EXTRA_ARGS}

Training enters through train_net.py and testing through test_net.py. The key logic extracted from train_net.py is:

```
import caffe
self.solver = caffe.SGDSolver(solver_prototxt)
while self.solver.iter < max_iters:
    # Make one SGD update
    self.solver.step(1)
    take_snapshot_if_necessary()
return model_paths
```

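The SGDSolver initialization and the Step flow were covered above. The single line import caffe pulls in everything needed, yet there is no caffe.py anywhere under Caffe's python directory. In fact import can import a directory as well as a .py file, as long as the directory contains `__init__.py` (I would not have learned this Python feature without studying Caffe; see what-is-init-py-for).

Directory structure of python/caffe

python/caffe

Look at `__init__.py`: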

from .pycaffe import Net, SGDSolver,   NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver
from ._caffe import set_mode_cpu, set_mode_gpu, set_device, Layer, get_solver, layer_type_list, set_random_seed
from ._caffe import __version__
from .proto.caffe_pb2 import TRAIN, TEST
from .classifier import Classifier
from .detector import Detector
from . import io
from .net_spec import layers, params,     NetSpec, to_proto

So SGDSolver comes from pycaffe, which in turn imports it from _caffe:

from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
    RMSPropSolver, AdaDeltaSolver, AdamSolver

_caffe.so is compiled from _caffe.cpp. _caffe.cpp builds a Python module with Boost.Python that maps Python functions and classes to C++ functions and classes. The key code is:

```
namespace bp = boost::python;
// Selecting mode.
void set_mode_gpu() { Caffe::set_mode(Caffe::GPU); }
// the module is named _caffe, so the build produces the Python module _caffe.so
BOOST_PYTHON_MODULE(_caffe) {
  // attribute of the imported caffe module
  bp::scope().attr("__version__") = AS_STRING(CAFFE_VERSION);
  // function mapping
  bp::def("set_mode_gpu", &set_mode_gpu);
  // class mapping; bp::no_init means Solver itself cannot be constructed from Python
  bp::class_<Solver<Dtype>, shared_ptr<Solver<Dtype> >, boost::noncopyable>(
    "Solver", bp::no_init)
    // property mapping
    .add_property("net", &Solver<Dtype>::net)
    .add_property("test_nets", bp::make_function(&Solver<Dtype>::test_nets,
      bp::return_internal_reference<>()))
    .add_property("iter", &Solver<Dtype>::iter)
    .def("solve", static_cast<void (Solver<Dtype>::*)(const char*)>(
      &Solver<Dtype>::Solve), SolveOverloads())
    // the key function
    .def("step", &Solver<Dtype>::Step)
    .def("restore", &Solver<Dtype>::Restore)
    .def("snapshot", &Solver<Dtype>::Snapshot);
  // SGDSolver inherits from Solver and needs a constructor that takes a string:
  // explicit SGDSolver(const string& param_file) : Solver<Dtype>(param_file) { PreSolve(); }
  bp::class_<SGDSolver<Dtype>, bp::bases<Solver<Dtype> >,
    shared_ptr<SGDSolver<Dtype> >, boost::noncopyable>(
      "SGDSolver", bp::init<string>());
}
```

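This completes the chain from Python to C++.
For how to use Boost.Python, see boost_python_tutorial.

Python layers

The input-data layer

The purpose of this layer is to read the data, preprocess it, and output: the image content (index 0, name 'data'); the image height and width and the scale factor (index 1, name 'im_info'); and the labels and ground-truth boxes (index 2, name 'gt_boxes'), as shown in the figure.

Outputs of the input layer
data/im_info/gt_boxes go to the 'rpn-data' layer, which feeds the score loss; data goes to the convolutional layers; gt_boxes goes to the 'roi-data' layer (which combines it with the proposals to output ROIs); and im_info goes to the 'proposal' layer to generate proposals.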

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
  module: 'roi_data_layer.layer'
  layer: 'RoIDataLayer'
  param_str: "'num_classes': N"
  }  
}

The code is in roi_data_layer/layer.py:

def forward(self, bottom, top):
    """Get blobs and copy them into this layer's top blob vector."""
    # Get the blob data as key-value pairs; the name decides which top each blob goes to.
    blobs = self._get_next_minibatch()

    for blob_name, blob in blobs.iteritems():
        top_ind = self._name_to_top_map[blob_name]
        # Reshape net's input blobs
        top[top_ind].reshape(*(blob.shape))
        # Copy data into net's input blobs
        top[top_ind].data[...] = blob.astype(np.float32, copy=False)

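In _get_next_minibatch, USE_PREFETCH is off by default; the author found it of little use ('So far I haven't found this useful; likely more engineering work is required'). The code checks whether the current batch starts a new epoch and, if so, shuffles. For better performance the shuffle groups images into landscape and portrait. What comes back are roidb entries, and get_minibatch in minibatch.py loads the full data. One thing to watch: the effective configuration is config.py merged with the experiments/cfgs/faster_rcnn_end2end.yml given in the script, so check the log for the values actually in effect ('IMS_PER_BATCH': 1).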

```
def _get_next_minibatch_inds(self):
    """Return the roidb indices for the next minibatch."""
    if self._cur + cfg.TRAIN.IMS_PER_BATCH >= len(self._roidb):
        self._shuffle_roidb_inds()
    # _perm holds the shuffled order of the indices
    db_inds = self._perm[self._cur:self._cur + cfg.TRAIN.IMS_PER_BATCH]
    self._cur += cfg.TRAIN.IMS_PER_BATCH
    return db_inds

def _get_next_minibatch(self):
    """Return the blobs to be used for the next minibatch.

    If cfg.TRAIN.USE_PREFETCH is True, then blobs will be computed in a
    separate process and made available through self._blob_queue.
    """
    if cfg.TRAIN.USE_PREFETCH:
        return self._blob_queue.get()
    else:
        # indices of the roidb records in this batch
        db_inds = self._get_next_minibatch_inds()
        # the roidb records themselves
        minibatch_db = [self._roidb[i] for i in db_inds]
        # turn the records into image data, box and label info, and image size and scale info
        return get_minibatch(minibatch_db, self._num_classes)

def _shuffle_roidb_inds(self):
    """Randomly permute the training roidb."""
    # Make minibatches from images that have similar aspect ratios (i.e. both
    # tall and thin or both short and wide) in order to avoid wasting
    # computation on zero-padding.
    if cfg.TRAIN.ASPECT_GROUPING:
        widths = np.array([r['width'] for r in self._roidb])
        heights = np.array([r['height'] for r in self._roidb])
        horz = (widths >= heights)
        vert = np.logical_not(horz)
        # landscape images
        horz_inds = np.where(horz)[0]
        # portrait images
        vert_inds = np.where(vert)[0]
        inds = np.hstack((
            np.random.permutation(horz_inds),
            np.random.permutation(vert_inds)))
        # pairs of two; in almost every pair both images have the same orientation
        inds = np.reshape(inds, (-1, 2))
        row_perm = np.random.permutation(np.arange(inds.shape[0]))
        # shuffle the pairs as units and flatten, so neighbours share an orientation;
        # pairs of two are presumably used because the default __C.TRAIN.IMS_PER_BATCH = 2
        inds = np.reshape(inds[row_perm, :], (-1,))
        self._perm = inds
    else:
        self._perm = np.random.permutation(np.arange(len(self._roidb)))
    self._cur = 0
```

Here is some log output to help follow the code:

    horz = [ True  True  True ...,  True  True  True], vert = [False False False ..., False False False]
    horz_inds = [     0      1      2 ..., 186205 186206 186207], vert_inds  = [     6     43     65 ..., 186176 186186 186194]
    inds = [163257  59770  49424 ...,  56475  31817 126653]
    inds = [[163257  59770]
     [ 49424  41168]
     [156295   1803]
     ...,
     [ 99367  20315]
     [142904  56475]
     [ 31817 126653]]
    row_perm  = [77629 51661 58201 ..., 91810 47169 48787]
    inds = [118195 143322 121405 ...,  19415  18933  26468]

This returns the indices of one batch of roidb records. The records are looked up in _roidb, and get_minibatch reads them. Pseudo-code:

```
def get_minibatch(roidb, num_classes):
    """Given a roidb, construct a minibatch sampled from it."""
    num_images = len(roidb)
    # Sample random scales to use for each image in this batch.
    # SCALES actually holds a single value, 600; it is written this way
    # to support rescaling to several sizes.
    random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
                                    size=num_images)
    # num_images equals IMS_PER_BATCH, which the yml sets to 1
    rois_per_image = cfg.TRAIN.BATCH_SIZE / num_images
    fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)

    # Get the input image blob, formatted for caffe.
    # Pass in the roidb records and the scale indices.
    im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)
    # data layout: batch index, C, H, W
    blobs = {'data': im_blob}
    # Faster R-CNN is essentially the case that uses the RPN
    if cfg.TRAIN.HAS_RPN:
        gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
        gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
        # label boxes times the scale factor = boxes in the rescaled input
        gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
        # the class goes into the fifth column
        gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
        blobs['gt_boxes'] = gt_boxes
        # 'im_info' = (H, W, im_scale)
        blobs['im_info'] = np.array(
            [[im_blob.shape[2], im_blob.shape[3], im_scales[0]]],
            dtype=np.float32)
    # (the branch without RPN is omitted here)
    return blobs
```

_get_image_blob, in minibatch.py, handles scaling and converts the image data read by OpenCV imread into a blob:

    def _get_image_blob(roidb, scale_inds):
        """Builds an input blob from the images in the roidb at the specified
        scales.
        """
        num_images = len(roidb)
        processed_ims = []
        im_scales = []
        for i in xrange(num_images):
            im = cv2.imread(roidb[i]['image'])
            # target_size = 600
            target_size = cfg.TRAIN.SCALES[scale_inds[i]]
            # rescale; returns the image and the scale factor
            im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
                                            cfg.TRAIN.MAX_SIZE)
            im_scales.append(im_scale)
            processed_ims.append(im)

        # Create a blob to hold the input images
        # convert the format
        blob = im_list_to_blob(processed_ims)
        return blob, im_scales

prep_im_for_blob and im_list_to_blob are both functions in blob.py under utils:

    def im_list_to_blob(ims):
        """Convert a list of images into a network input.

        Assumes images are already prepared (means subtracted, BGR order, ...).
        """
        # An image has shape H * W * channels; take the largest shape over the images
        # (np.array([(100, 5, 3), (110, 4, 3)]).max(axis=0) --> array([110,   5,   3]))
        max_shape = np.array([im.shape for im in ims]).max(axis=0)
        num_images = len(ims)
        blob = np.zeros((num_images, max_shape[0], max_shape[1], 3),
                        dtype=np.float32)
        for i in xrange(num_images):
            im = ims[i]
            # layout: index, H, W, C
            blob[i, 0:im.shape[0], 0:im.shape[1], :] = im
        # Move channels (axis 3) to axis 1
        # Axis order will become: (batch elem, channel, height, width)
        channel_swap = (0, 3, 1, 2)
        # reorder the axes of the blob
        blob = blob.transpose(channel_swap)
        return blob

    def prep_im_for_blob(im, pixel_means, target_size, max_size):
        """Mean subtract and scale an image for use in a blob."""
        # type(im) = numpy array, uint8 -> float
        im = im.astype(np.float32, copy=False)
        # preprocessing: subtract the pixel means
        im -= pixel_means
        im_shape = im.shape
        im_size_min = np.min(im_shape[0:2])
        im_size_max = np.max(im_shape[0:2])
        # scale factor: shorter side of the original image * im_scale = target size (the shorter side is scaled to 600)
        im_scale = float(target_size) / float(im_size_min)
        # Prevent the biggest axis from being more than MAX_SIZE
        # the longer side is capped (1000 by default); if the scale above would exceed the cap, scale by the cap instead
        if np.round(im_scale * im_size_max) > max_size:
            im_scale = float(max_size) / float(im_size_max)
        im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,
                        interpolation=cv2.INTER_LINEAR)

        return im, im_scale
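
To make the scaling rule concrete, here is a small sketch that repeats only the scale computation of prep_im_for_blob, without OpenCV. The function name compute_im_scale and the two image sizes are assumptions for illustration; the defaults 600 and 1000 are the values mentioned in the comments above.

```python
def compute_im_scale(height, width, target_size=600, max_size=1000):
    """Scale factor that brings the shorter side to target_size,
    unless that would push the longer side past max_size."""
    im_size_min = min(height, width)
    im_size_max = max(height, width)
    im_scale = float(target_size) / float(im_size_min)
    if round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / float(im_size_max)
    return im_scale


# 375 x 500 image: the shorter side goes to 600, the longer side to 800.
scale_a = compute_im_scale(375, 500)
# 500 x 2000 image: a factor of 1.2 would give a 2400 px long side, so the cap applies.
scale_b = compute_im_scale(500, 2000)
```

In the second case the shorter side ends up well below 600 pixels, which is why im_scale has to be returned and carried along in im_info: the ground-truth boxes must be scaled by the same factor.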

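At this point the input layer is mostly clear. It also explains why no input blob was assigned in the forward pass seen earlier (Dtype ForwardBackward(const vector<Blob<Dtype>* > & bottom)): the Python input layer has already done the reading, shuffling, blob conversion, scaling and box-info processing.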

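rpn-data layer
The rpn-data layer receives rpn_cls_score (from the rpn_cls_score layer: the anchor scores), gt_boxes (from the input layer: the annotated boxes) and im_info (from the input layer: H and W plus the scale relative to the original image); the prototxt also lists data as a fourth bottom. The prototxt and the data-flow diagram are as follows: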

    layer {
      name: 'rpn-data'
      type: 'Python'
      bottom: 'rpn_cls_score'
      bottom: 'gt_boxes'
      bottom: 'im_info'
      bottom: 'data'
      top: 'rpn_labels'
      top: 'rpn_bbox_targets'
      top: 'rpn_bbox_inside_weights'
      top: 'rpn_bbox_outside_weights'
      python_param {
        module: 'rpn.anchor_target_layer'
        layer: 'AnchorTargetLayer'
        param_str: "'feat_stride': 16"
      }
    }

rpn-data

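The only parameter is the stride (feat_stride). The class is AnchorTargetLayer in anchor_target_layer.py, and it implements setup and forward. This layer outputs the boxes and labels used to compute the loss further down; it has nothing to train, so backward and reshape are empty implementations. The setup code comes first: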

    def setup(self, bottom, top):
        layer_params = yaml.load(self.param_str_)
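        # anchor scales; the prototxt does not set them, so the defaults are used
        anchor_scales = layer_params.get('scales', (8, 16, 32))
        # the K (= 9) anchors of one sliding-window position, each as (top-left, bottom-right) coordinates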
        self._anchors = generate_anchors(scales=np.array(anchor_scales))
        self._num_anchors = self._anchors.shape[0]
        self._feat_stride = layer_params['feat_stride']

        # allow boxes to sit over the edge by a small amount
        self._allowed_border = layer_params.get('allowed_border', 0)

        height, width = bottom[0].data.shape[-2:]

        A = self._num_anchors
        # labels
        top[0].reshape(1, 1, A * height, width)
        # bbox_targets
        top[1].reshape(1, A * 4, height, width)
        # bbox_inside_weights
        top[2].reshape(1, A * 4, height, width)
        # bbox_outside_weights
        top[3].reshape(1, A * 4, height, width)
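
The reshape calls at the end of setup are easier to follow with numbers. The sketch below only mirrors the shape arithmetic; the helper name anchor_target_top_shapes and the 38 x 50 score map (roughly what a 600 x 800 input gives at stride 16) are assumptions for the example.

```python
def anchor_target_top_shapes(height, width, num_anchors=9):
    """Shapes of the four top blobs for an H x W score map, as in setup()."""
    return {
        "rpn_labels": (1, 1, num_anchors * height, width),
        "rpn_bbox_targets": (1, num_anchors * 4, height, width),
        "rpn_bbox_inside_weights": (1, num_anchors * 4, height, width),
        "rpn_bbox_outside_weights": (1, num_anchors * 4, height, width),
    }


# Assumed example: a 38 x 50 score map with the default 9 anchors per position.
shapes = anchor_target_top_shapes(38, 50)
total_anchors = 38 * 50 * 9   # one label per anchor per position
```

The labels blob stacks the A anchors along the height axis, while the three box blobs keep H x W and use A * 4 channels, four regression values per anchor.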

generate_anchors lives in generate_anchors.py and is implemented with numpy:

    def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
                         scales=2**np.arange(3, 6)):
        """
        Generate anchor (reference) windows by enumerating aspect ratios X
        scales wrt a reference (0, 0, 15, 15) window.
        """
        # base anchor :np array [0,0, 15, 15]
        base_anchor = np.array([1, 1, base_size, base_size]) - 1
        # enumerate aspect ratios (h / w = 0.5, 1, 2): wide, square and tall boxes
        ratio_anchors = _ratio_enum(base_anchor, ratios)
        # enumerate sizes on top of each ratio anchor: x8, x16, x32
        anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
                             for i in xrange(ratio_anchors.shape[0])])
        return anchors
    def _ratio_enum(anchor, ratios):
        """
        Enumerate a set of anchors for each aspect ratio wrt an anchor.
        """
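        # convert to width, height and center coordinates
        w, h, x_ctr, y_ctr = _whctrs(anchor)
        # area of the original anchor
        size = w * h
        # The base anchor is a square of side n. The new sides are w = n / sqrt(ratio) and
        # h = n * sqrt(ratio), so the area stays roughly the same (up to rounding) while
        # h / w = ratio. In effect the square is stretched or squashed to the given ratio.
        size_ratios = size / ratios
        ws = np.round(np.sqrt(size_ratios))
        hs = np.round(ws * ratios)
        # back to corner coordinates; the inverse of _whctrs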
        anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
        return anchors
    # scale enumeration: w and h are each multiplied by the scale, so the area grows with the square of the scale
    def _scale_enum(anchor, scales):
        """
        Enumerate a set of anchors for each scale wrt an anchor.
        """

        w, h, x_ctr, y_ctr = _whctrs(anchor)
        ws = w * scales
        hs = h * scales
        anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
        return anchors
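
The listing above calls two helpers, _whctrs and _mkanchors, that are not shown. The following dependency-free sketch fills them in under the assumption that they convert between corner form and width/height/center form with inclusive pixel coordinates (as the comments above describe), and then enumerates the anchors with plain Python instead of numpy. The names whctrs and mkanchor are illustrative. It relies on Python 3, where round() rounds halves to even like np.round.

```python
import math


def whctrs(anchor):
    """(x1, y1, x2, y2) -> width, height, center x, center y."""
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    return w, h, anchor[0] + 0.5 * (w - 1), anchor[1] + 0.5 * (h - 1)


def mkanchor(w, h, x_ctr, y_ctr):
    """Inverse of whctrs: corner coordinates of a w x h box around a center."""
    return (x_ctr - 0.5 * (w - 1), y_ctr - 0.5 * (h - 1),
            x_ctr + 0.5 * (w - 1), y_ctr + 0.5 * (h - 1))


def generate_anchors(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Enumerate ratios first, then scales, in the same order as the original."""
    w, h, x_ctr, y_ctr = whctrs((0, 0, base_size - 1, base_size - 1))
    anchors = []
    for ratio in ratios:
        # keep the area roughly constant while forcing h / w = ratio
        rw = round(math.sqrt(w * h / ratio))
        rh = round(rw * ratio)
        for scale in scales:
            anchors.append(mkanchor(rw * scale, rh * scale, x_ctr, y_ctr))
    return anchors


anchors = generate_anchors()
for box in anchors:
    print(box)
```

With the defaults this gives 9 boxes centered on (7.5, 7.5). The first three are the wide ones (ratio 0.5), for example (-84, -40, 99, 55) at scale 8, followed by three squares and three tall boxes.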

Next comes forward. The code is fairly complex, so the listing below is an excerpt that focuses on the idea and the method.

    def forward(self, bottom, top):
        # Algorithm:
        #
        # for each (H, W) location i
        #   generate 9 anchor boxes centered on cell i
        #   apply predicted bbox deltas at cell i to each of the 9 anchors
        # filter out-of-image anchors
        # measure GT overlap

        assert bottom[0].data.shape[0] == 1, \
            'Only single item batches are supported'

        # map of shape (..., H, W); here it holds the anchor scores, shape (1, 18, H, W)
        height, width = bottom[0].data.shape[-2:]
        # GT boxes (x1, y1, x2, y2, label)
        gt_boxes = bottom[1].data
        # im_info
        im_info = bottom[2].data[0, :]
        
        # 1. Generate proposals from bbox deltas and shifted anchors
        # idea: build all shifts, then add each shift to the 9 base anchors to get the 9 boxes at every position
        shift_x = np.arange(0, width) * self._feat_stride
        shift_y = np.arange(0, height) * self._feat_stride
        shift_x, shift_y = np.meshgrid(shift_x, shift_y)
        # after meshgrid, every row of shift_x is [  0  16  32 ..., 560 576 592]
        # and every column of shift_y is [  0  16  32 ..., 560 576 592]
        shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),
                            shift_x.ravel(), shift_y.ravel())).transpose()
        # after the transpose, each row is one shift
        # add A anchors (1, A, 4) to
        # cell K shifts (K, 1, 4) to get
        # shift anchors (K, A, 4)
        # reshape to (K*A, 4) shifted anchors
        A = self._num_anchors
        K = shifts.shape[0]
        # numpy broadcasting: the + adds every anchor in _anchors to every shift
        all_anchors = (self._anchors.reshape((1, A, 4)) +
                       shifts.reshape((1, K, 4)).transpose((1, 0, 2)))
        # K shifts, A anchors per shift
        all_anchors = all_anchors.reshape((K * A, 4))
        total_anchors = int(K * A)

        # only keep anchors inside the image
        inds_inside = np.where(
            (all_anchors[:, 0] >= -self._allowed_border) & 
            (all_anchors[:, 1] >= -self._allowed_border) &
            (all_anchors[:, 2] < im_info[1] + self._allowed_border) &  # width
            (all_anchors[:, 3] < im_info[0] + self._allowed_border)    # height
        )[0]

        # keep only inside anchors
        anchors = all_anchors[inds_inside, :]

        # label: 1 is positive, 0 is negative, -1 is dont care
        labels = np.empty((len(inds_inside), ), dtype=np.float32)
        labels.fill(-1)

        # overlaps between the anchors and the gt boxes
        # overlaps (ex, gt): overlap (IoU) of every anchor with every gt box, shape [num_anchors, num_gt_boxes]
        overlaps = bbox_overlaps(
            np.ascontiguousarray(anchors, dtype=np.float),
            np.ascontiguousarray(gt_boxes, dtype=np.float))
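        # for each anchor, the index of the gt box with the highest overlap
        argmax_overlaps = overlaps.argmax(axis=1)
        # look up that overlap: the best overlap of each anchor with any gt box
        max_overlaps = overlaps[np.arange(len(inds_inside)), argmax_overlaps]
        # for each gt box, the index of the anchor that overlaps it best
        gt_argmax_overlaps = overlaps.argmax(axis=0)
        # the best overlap value achieved for each gt box
        gt_max_overlaps = overlaps[gt_argmax_overlaps,
                                   np.arange(overlaps.shape[1])]
        # all anchors that reach the best overlap of some gt box (ties included)
        gt_argmax_overlaps = np.where(overlaps == gt_max_overlaps)[0]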

        if not cfg.TRAIN.RPN_CLOBBER_POSITIVES:
            # assign bg labels first so that positive labels can clobber them
            labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0

        # fg label: for each gt, anchor with highest overlap
        labels[gt_argmax_overlaps] = 1

        # fg label: above threshold IOU
        labels[max_overlaps >= cfg.TRAIN.RPN_POSITIVE_OVERLAP] = 1

        if cfg.TRAIN.RPN_CLOBBER_POSITIVES:
            # assign bg labels last so that negative labels can clobber positives
            labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0

        # subsample positive labels if we have too many
        # ideally fg and bg each take half of the batch; if there are too few fg, bg fills the rest
        num_fg = int(cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE)
        fg_inds = np.where(labels == 1)[0]
        if len(fg_inds) > num_fg:
            disable_inds = npr.choice(
                fg_inds, size=(len(fg_inds) - num_fg), replace=False)
            labels[disable_inds] = -1

        # subsample negative labels if we have too many
        num_bg = cfg.TRAIN.RPN_BATCHSIZE - np.sum(labels == 1)
        bg_inds = np.where(labels == 0)[0]
        if len(bg_inds) > num_bg:
            disable_inds = npr.choice(
                bg_inds, size=(len(bg_inds) - num_bg), replace=False)
            labels[disable_inds] = -1
     
        # compute the (dx, dy, dw, dh) offsets between each anchor and its ground-truth box
        bbox_targets = _compute_targets(anchors, gt_boxes[argmax_overlaps, :])

        bbox_inside_weights = np.zeros((len(inds_inside), 4), dtype=np.float32)
        bbox_inside_weights[labels == 1, :] = np.array(cfg.TRAIN.RPN_BBOX_INSIDE_WEIGHTS)

        bbox_outside_weights = np.zeros((len(inds_inside), 4), dtype=np.float32)
        if cfg.TRAIN.RPN_POSITIVE_WEIGHT < 0:
            # uniform weighting of examples (given non-uniform sampling)
            num_examples = np.sum(labels >= 0)
            positive_weights = np.ones((1, 4)) * 1.0 / num_examples
            negative_weights = np.ones((1, 4)) * 1.0 / num_examples
        else:
            assert ((cfg.TRAIN.RPN_POSITIVE_WEIGHT > 0) &
                    (cfg.TRAIN.RPN_POSITIVE_WEIGHT < 1))
            positive_weights = (cfg.TRAIN.RPN_POSITIVE_WEIGHT /
                                np.sum(labels == 1))
            negative_weights = ((1.0 - cfg.TRAIN.RPN_POSITIVE_WEIGHT) /
                                np.sum(labels == 0))
        bbox_outside_weights[labels == 1, :] = positive_weights
        bbox_outside_weights[labels == 0, :] = negative_weights

        # map up to original set of anchors
        labels = _unmap(labels, total_anchors, inds_inside, fill=-1)
        bbox_targets = _unmap(bbox_targets, total_anchors, inds_inside, fill=0)
        bbox_inside_weights = _unmap(bbox_inside_weights, total_anchors, inds_inside, fill=0)
        bbox_outside_weights = _unmap(bbox_outside_weights, total_anchors, inds_inside, fill=0)

        # labels
        labels = labels.reshape((1, height, width, A)).transpose(0, 3, 1, 2)
        labels = labels.reshape((1, 1, A * height, width))
        top[0].reshape(*labels.shape)
        top[0].data[...] = labels

        # bbox_targets
        bbox_targets = bbox_targets \
            .reshape((1, height, width, A * 4)).transpose(0, 3, 1, 2)
        top[1].reshape(*bbox_targets.shape)
        top[1].data[...] = bbox_targets

        # bbox_inside_weights
        bbox_inside_weights = bbox_inside_weights \
            .reshape((1, height, width, A * 4)).transpose(0, 3, 1, 2)
        assert bbox_inside_weights.shape[2] == height
        assert bbox_inside_weights.shape[3] == width
        top[2].reshape(*bbox_inside_weights.shape)
        top[2].data[...] = bbox_inside_weights

        # bbox_outside_weights
        bbox_outside_weights = bbox_outside_weights \
            .reshape((1, height, width, A * 4)).transpose(0, 3, 1, 2)
        assert bbox_outside_weights.shape[2] == height
        assert bbox_outside_weights.shape[3] == width
        top[3].reshape(*bbox_outside_weights.shape)
        top[3].data[...] = bbox_outside_weights
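
The core of forward (enumerate shifted anchors, drop those outside the image, label the rest by overlap) can be traced on a toy case. The sketch below is a simplified, dependency-free illustration, not the layer's code: it skips the random subsampling and the regression targets, uses two made-up base anchors and one made-up ground-truth box on a 64 x 64 image, and assumes thresholds of 0.7 and 0.3 for RPN_POSITIVE_OVERLAP and RPN_NEGATIVE_OVERLAP.

```python
def iou(a, b):
    """IoU of two boxes (x1, y1, x2, y2) with inclusive pixel coordinates."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / float(area_a + area_b - inter)


def shifted_anchors(base_anchors, height, width, feat_stride):
    """All anchors, ordered as in the layer: K shifts outer, A anchors inner."""
    out = []
    for y in range(height):        # rows of the feature map
        for x in range(width):     # columns of the feature map
            sx, sy = x * feat_stride, y * feat_stride
            for (x1, y1, x2, y2) in base_anchors:
                out.append((x1 + sx, y1 + sy, x2 + sx, y2 + sy))
    return out


def label_anchors(anchors, gt_boxes, im_h, im_w, pos=0.7, neg=0.3):
    """One label per anchor: 1 = fg, 0 = bg, -1 = ignored or outside the image."""
    labels = [-1] * len(anchors)
    inside = [i for i, a in enumerate(anchors)
              if a[0] >= 0 and a[1] >= 0 and a[2] < im_w and a[3] < im_h]
    overlaps = {i: [iou(anchors[i], g) for g in gt_boxes] for i in inside}
    # background first, so that positives can overwrite it
    for i in inside:
        if max(overlaps[i]) < neg:
            labels[i] = 0
    # for each gt box, every anchor that ties for the best overlap is foreground
    for j in range(len(gt_boxes)):
        best = max(overlaps[i][j] for i in inside)
        for i in inside:
            if overlaps[i][j] == best:
                labels[i] = 1
    # anchors above the positive threshold are foreground as well
    for i in inside:
        if max(overlaps[i]) >= pos:
            labels[i] = 1
    return labels


# Toy setup (all values hypothetical): 4 x 4 feature map, stride 16, 64 x 64 image.
base = [(0, 0, 15, 15), (-8, -8, 23, 23)]    # a 16 px and a 32 px square
anchors = shifted_anchors(base, 4, 4, 16)
labels = label_anchors(anchors, [(16, 16, 47, 47)], im_h=64, im_w=64)
fg_indices = [i for i, l in enumerate(labels) if l == 1]
```

In this toy case no anchor reaches an overlap of 0.7, yet the four 32 px anchors around the box tie for the best overlap and become foreground. That is the purpose of the "for each gt, anchor with highest overlap" rule: every ground-truth box gets at least one positive anchor.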

* proposal layer
* roi-data layer

C++ layers & loss (to be continued...)
- SmoothL1LossLayer
- ROIPoolingLayer
Testing

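Afterword
Every article explaining Faster R-CNN pays homage to Ross Girshick, and I will do the same here: the work is genuinely impressive, and the paper has real depth.
The algorithm did not appear overnight; it evolved from R-CNN to Fast R-CNN to Faster R-CNN. The defining feature of Faster R-CNN is the anchor design: without any resizing, regressors on the same feature map produce boxes of different shapes, so a single pass yields all proposals.
When I studied RL I was already a little surprised by this: whatever comes out of the CNN is declared to be whatever you want, a loss is then used to shape it, and it ends up with a reasonable interpretation. Here the network is split into parts with different meanings, and each part is again shaped by its own loss.
Starting from the input image, the feature map's W and H keep halving while its channel count keeps doubling, and the map is then projected back onto the input pixel grid. Predicting boxes on the input image this way feels a little magical, because an arbitrary image can contain all kinds of things; give it a reasonable loss and it becomes reasonable.
Finally, the authors also report ablation experiments, replacing components of the algorithm to verify that each is necessary, which is rigorous work.
Studying Faster R-CNN also gave me a look at Caffe's structure. The framework is clearly organized and fairly clean, it does not try to be all-encompassing, and the code is concise, so anyone with deep learning knowledge can pick it up easily. That is probably why Ross Girshick wrote the Faster R-CNN demo on Caffe. When first learning an unfamiliar framework, look at the whole picture rather than fixating on one local detail; an overall understanding brings more information, and more information in turn helps with the details.
Speeding up Caffe on CPU is also an interesting and meaningful problem, because GPU setups are too large and too expensive for many environments.
The notes on the Python and C++ layers customized for Faster R-CNN are still unwritten; I will add them when I find the time.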

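Author: db24cc
Link: https://www.jianshu.com/p/00a6a6efd83d
Source: Jianshu (简书)
Copyright belongs to the author. For reproduction in any form, contact the author for authorization and cite the source.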
