Implementation Details of the RPN

Reference implementation: https://github.com/abbyQu/Mask_RCNN

The input to the RPN is the last feature map of the convolutional backbone. The "sliding window" from the paper is, in practice, just another convolution: the n×n window size the paper describes is simply the convolution kernel size, with n = 3 here.
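As a minimal sketch of this point (toy shapes; the 32×32×256 feature map is an assumption, not a value from the repo), a 3×3 convolution with stride 1 and 'same' padding produces one 512-d "window" feature per spatial location:

import keras.layers as KL

# Assumed backbone output: a 32x32 feature map with 256 channels.
x = KL.Input(shape=(32, 32, 256))
# The 3x3 "sliding window" is an ordinary convolution.
y = KL.Conv2D(512, (3, 3), padding='same', activation='relu')(x)
print(y.shape)  # (None, 32, 32, 512): one 512-d window feature per location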

This convolution yields a shared layer, called shared in the code. "Shared" means it feeds both the classification branch (is the anchor an RoI, i.e. foreground vs. background) and the regression branch (regression of the 4 bounding-box values).

The paper describes the mapping from shared to the cls and reg heads as fully convolutional, and the code accordingly implements each head with a single 1×1 Conv2D. After the convolution, the outputs are reshaped to their final dimensions: [batch_size, num_anchors, 4] for the reg head and [batch_size, num_anchors, 2] for the cls head, as sketched below.
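The reshape arithmetic in a self-contained sketch (toy numbers; H, W, and k here are assumptions, not values from the repo): the 1×1 cls conv emits 2 scores per anchor per location, so [batch, H, W, 2k] flattens into [batch, H*W*k, 2].

import tensorflow as tf

batch, H, W, k = 1, 32, 32, 3                        # k = anchors_per_location
raw = tf.zeros([batch, H, W, 2 * k])                 # output of the 1x1 cls conv
logits = tf.reshape(raw, [tf.shape(raw)[0], -1, 2])  # flatten H*W*k anchor slots
print(logits.shape)  # (1, 3072, 2), since 32 * 32 * 3 = 3072 anchors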

import tensorflow as tf
import keras.layers as KL  # the repo's model.py imports Keras layers this way


def rpn_graph(feature_map, anchors_per_location, anchor_stride):
    """Builds the RPN computation graph on top of a backbone feature map.

    Returns [rpn_class_logits, rpn_probs, rpn_bbox] as described above.
    """
    # The 3x3 "sliding window": one 512-d feature per spatial location.
    shared = KL.Conv2D(512, (3, 3), padding='same', activation='relu',
                       strides=anchor_stride,
                       name='rpn_conv_shared')(feature_map)

    # Anchor scores. [batch, height, width, anchors per location * 2].
    x = KL.Conv2D(2 * anchors_per_location, (1, 1), padding='valid',
                  activation='linear', name='rpn_class_raw')(shared)

    # Reshape to [batch, anchors, 2].
    rpn_class_logits = KL.Lambda(
        lambda t: tf.reshape(t, [tf.shape(t)[0], -1, 2]))(x)

    # Softmax over the last dimension: BG/FG probabilities.
    rpn_probs = KL.Activation(
        "softmax", name="rpn_class_xxx")(rpn_class_logits)

    # Bounding box refinement. [batch, H, W, anchors per location * depth],
    # where depth is [x, y, log(w), log(h)].
    x = KL.Conv2D(anchors_per_location * 4, (1, 1), padding="valid",
                  activation='linear', name='rpn_bbox_pred')(shared)

    # Reshape to [batch, anchors, 4].
    rpn_bbox = KL.Lambda(lambda t: tf.reshape(t, [tf.shape(t)[0], -1, 4]))(x)

    return [rpn_class_logits, rpn_probs, rpn_bbox]
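A hypothetical smoke test (the input shape and the keras.models import are assumptions, not part of the repo): wire rpn_graph into a small model and check that a 32×32 feature map with 3 anchors per location yields 32 * 32 * 3 = 3072 anchors.

import numpy as np
import keras.layers as KL
import keras.models as KM

inp = KL.Input(shape=(32, 32, 256))  # assumed backbone feature map
logits, probs, bbox = rpn_graph(inp, anchors_per_location=3, anchor_stride=1)
model = KM.Model(inp, [logits, probs, bbox])

outs = model.predict(np.zeros((1, 32, 32, 256), dtype=np.float32))
for name, out in zip(["rpn_class_logits", "rpn_probs", "rpn_bbox"], outs):
    print(name, out.shape)
# rpn_class_logits (1, 3072, 2)
# rpn_probs (1, 3072, 2)
# rpn_bbox (1, 3072, 4)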

 
