py-faster-rcnn---output_alt_opt

1. output_alt_opt

faster_rcnn_alt_opt.sh
train_faster_rcnn_alt_opt.py

Stage 1 RPN, init from ImageNet model

RPN training procedure:
In train_rpn:

cfg.TRAIN.PROPOSAL_METHOD = 'gt': sets the proposal mode, so gt_roidb in pascal_voc.py is called later
cfg.TRAIN.IMS_PER_BATCH = 1
get_roidb: prepares roidb and imdb
train_net: trains the RPN

In get_roidb:

imdb = get_imdb(imdb_name): instantiates the imdb class via factory.py and pascal_voc.py
When training the RPN, rpn_file=None, so only ground-truth boxes are used
roidb = get_training_roidb(imdb): calls get_training_roidb in train.py to obtain the roidb

In get_training_roidb:

imdb.append_flipped_images() (in imdb.py): horizontal flipping serves as data augmentation. It flips the roidb[i]['boxes'] produced by gt_roidb (pascal_voc.py), which doubles the number of image indices
rdl_roidb.prepare_roidb(imdb) (in roidb.py): adds roidb[i]['max_classes'], roidb[i]['max_overlaps'], roidb[i]['image'], roidb[i]['width'], roidb[i]['height']
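
The flipping step can be illustrated with a minimal sketch. It assumes 0-based inclusive [x1, y1, x2, y2] boxes and uses plain Python lists instead of NumPy arrays; the helper name flip_boxes is hypothetical and is not part of imdb.py.

```python
def flip_boxes(boxes, width):
    """Mirror 0-based [x1, y1, x2, y2] boxes horizontally in an image `width` pixels wide."""
    flipped = []
    for x1, y1, x2, y2 in boxes:
        # The old right edge becomes the new left edge, and vice versa.
        new_x1 = width - x2 - 1
        new_x2 = width - x1 - 1
        flipped.append([new_x1, y1, new_x2, y2])
    return flipped


boxes = [[10, 20, 109, 219]]
flipped = flip_boxes(boxes, width=500)
print(flipped)  # [[390, 20, 489, 219]]
```

Flipping twice returns the original boxes, which is a quick sanity check on the index arithmetic.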

In gt_roidb:

Parses the annotation XML files (ground truth) under /data/VOCdevkit2007/VOC2007/Annotations to build gt_roidb
Each gt_roidb entry contains:
{'boxes' : boxes,
'gt_classes': gt_classes,
'gt_overlaps' : overlaps,
'flipped' : False,
'seg_areas' : seg_areas}

The arrays in gt_roidb have the following shapes (function _load_pascal_annotation):

boxes = np.zeros((num_objs, 4), dtype=np.uint16)
gt_classes = np.zeros((num_objs), dtype=np.int32)
overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
# "Seg" area for pascal is just the box area
seg_areas = np.zeros((num_objs), dtype=np.float32)
# Load object bounding boxes into a data frame.
for ix, obj in enumerate(objs):
    bbox = obj.find('bndbox')
    # Make pixel indexes 0-based
    x1 = float(bbox.find('xmin').text) - 1
    y1 = float(bbox.find('ymin').text) - 1
    x2 = float(bbox.find('xmax').text) - 1
    y2 = float(bbox.find('ymax').text) - 1
    cls = self._class_to_ind[obj.find('name').text.lower().strip()]
    boxes[ix, :] = [x1, y1, x2, y2]
    gt_classes[ix] = cls
    overlaps[ix, cls] = 1.0
    seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)
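
To make the parsing step concrete, here is a self-contained sketch that reads a small, made-up annotation string with the standard library. The sample XML, the 3-class mapping, and the function name load_annotation are assumptions for illustration; the record uses plain lists where pascal_voc.py uses NumPy arrays.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """<annotation>
  <object>
    <name>Dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>"""

# Hypothetical 3-class mapping; the real one covers the 20 VOC classes plus background.
CLASS_TO_IND = {'__background__': 0, 'dog': 1, 'person': 2}


def load_annotation(xml_text, class_to_ind):
    """Build a gt_roidb-style record (plain lists instead of NumPy arrays)."""
    root = ET.fromstring(xml_text)
    record = {'boxes': [], 'gt_classes': [], 'gt_overlaps': [],
              'flipped': False, 'seg_areas': []}
    for obj in root.findall('object'):
        bbox = obj.find('bndbox')
        # VOC coordinates are 1-based; subtract 1 to make them 0-based.
        x1, y1, x2, y2 = [float(bbox.find(tag).text) - 1
                          for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        cls = class_to_ind[obj.find('name').text.lower().strip()]
        one_hot = [0.0] * len(class_to_ind)
        one_hot[cls] = 1.0  # a ground-truth box fully overlaps its own class
        record['boxes'].append([x1, y1, x2, y2])
        record['gt_classes'].append(cls)
        record['gt_overlaps'].append(one_hot)
        record['seg_areas'].append((x2 - x1 + 1) * (y2 - y1 + 1))
    return record


record = load_annotation(SAMPLE_XML, CLASS_TO_IND)
print(record['boxes'][0], record['gt_classes'])
```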

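After get_roidb, train_rpn calls train_net to train:
model_paths = train_net(solver, roidb, output_dir, pretrained_model=init_model, max_iters=max_iters)
train_net in train.py:

roidb = filter_roidb(roidb): removes images that have neither a foreground RoI nor a background RoI (overlap in [0, 0.5) counts as background, [0.5, 1] as foreground). During RPN training every box is ground truth with overlap 1, so this filter removes nothing
sw = SolverWrapper(solver_prototxt, roidb, output_dir, pretrained_model=pretrained_model): loads the pretrained model, solver_prototxt, and so on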

stage1_rpn_train.pt

name: "ZF"
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 21"
  }
}

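This invokes the roi_data_layer.layer module, which in turn calls get_minibatch in minibatch.py.
get_minibatch

rois_per_image: maximum number of boxes per image, here 128/1 = 128 because IMS_PER_BATCH = 1 (not used when HAS_RPN is set)
fg_rois_per_image: 0.25 of rois_per_image, i.e. 32 foreground boxes (not used when HAS_RPN is set)
im_blob, im_scales = _get_image_blob(roidb, random_scale_inds): im_scales holds the ratio between the network input image and the original image; im_blob is the network input blob
if cfg.TRAIN.HAS_RPN: the RPN training case
gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]: the positive (non-background) samples
gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
blobs['gt_boxes'] = gt_boxes: 5 columns; the first 4 are coordinates and the last is the class; one row per positive sample
blobs['im_info'] = np.array([[im_blob.shape[2], im_blob.shape[3], im_scales[0]]], dtype=np.float32): for an original image of size P x Q, im_scale = float(target_size) / float(im_size_min) = 600/min(P,Q); (im_blob.shape[2], im_blob.shape[3]) = (M,N), with min(M,N) = 600 unless the cfg.TRAIN.MAX_SIZE cap on the longer side reduces the scale
The function returns blobs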
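
The scale computation and the gt_boxes layout can be sketched as follows. This assumes the default cfg.TRAIN.SCALES = (600,) and cfg.TRAIN.MAX_SIZE = 1000; the helper names compute_im_scale and make_gt_boxes are hypothetical, and plain lists replace the NumPy arrays.

```python
def compute_im_scale(height, width, target_size=600, max_size=1000):
    """Scale the shorter side to target_size, capped so the longer side stays <= max_size."""
    im_size_min = min(height, width)
    im_size_max = max(height, width)
    im_scale = float(target_size) / float(im_size_min)
    if round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / float(im_size_max)
    return im_scale


def make_gt_boxes(boxes, gt_classes, im_scale):
    """Rows of (x1, y1, x2, y2, cls) in network-input coordinates, skipping class 0."""
    return [[c * im_scale for c in box] + [float(cls)]
            for box, cls in zip(boxes, gt_classes) if cls != 0]


# A 375 x 500 image: the shorter side becomes 600, the longer side 800.
print(compute_im_scale(375, 500))

# A 200 x 500 image: 600 / 200 = 3 would give a longer side of 1500,
# so the cap applies and the scale becomes 1000 / 500 = 2.0.
capped = compute_im_scale(200, 500)
gt_boxes = make_gt_boxes([[50, 100, 150, 180]], [7], capped)
print(capped, gt_boxes)
```

The second case shows why min(M,N) is not always 600: very elongated images are limited by the longer side instead.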

layer {
  name: 'rpn-data'
  type: 'Python'
  bottom: 'rpn_cls_score'
  bottom: 'gt_boxes'
  bottom: 'im_info'
  bottom: 'data'
  top: 'rpn_labels'
  top: 'rpn_bbox_targets'
  top: 'rpn_bbox_inside_weights'
  top: 'rpn_bbox_outside_weights'
  python_param {
    module: 'rpn.anchor_target_layer'
    layer: 'AnchorTargetLayer'
    param_str: "'feat_stride': 16"
  }
}

generate_anchors(base_size=16, ratios=[0.5, 1, 2], scales=2**np.arange(3, 6)):

The 9 anchor shapes are defined once when the network is set up; these 9 anchors belong to the first cell of the feature map

def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
                     scales=2**np.arange(3, 6)):
    """
    Generate anchor (reference) windows by enumerating aspect ratios X
    scales wrt a reference (0, 0, 15, 15) window.
    """

    base_anchor = np.array([1, 1, base_size, base_size]) - 1  # [0, 0, 15, 15]
    ratio_anchors = _ratio_enum(base_anchor, ratios)
    # ratio_anchors:
    # [[ -3.5,   2. ,  18.5,  13. ],
    #  [  0. ,   0. ,  15. ,  15. ],
    #  [  2.5,  -3. ,  12.5,  18. ]]
    anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
                         for i in xrange(ratio_anchors.shape[0])])
    # anchors:
    # [[ -84.  -40.   99.   55.]
    #  [-176.  -88.  191.  103.]
    #  [-360. -184.  375.  199.]
    #  [ -56.  -56.   71.   71.]
    #  [-120. -120.  135.  135.]
    #  [-248. -248.  263.  263.]
    #  [ -36.  -80.   51.   95.]
    #  [ -80. -168.   95.  183.]
    #  [-168. -344.  183.  359.]]
    return anchors
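
The helpers _ratio_enum and _scale_enum are not shown above. The following dependency-free sketch is one way to reproduce the same 9 anchors; it is a reconstruction under stated assumptions, not the repository code. It assumes Python 3, where round() rounds halves to even as np.round does, and the function names here are illustrative.

```python
import math


def _whctrs(anchor):
    """Width, height, and center of an [x1, y1, x2, y2] anchor."""
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    return w, h, anchor[0] + 0.5 * (w - 1), anchor[1] + 0.5 * (h - 1)


def _mkanchor(w, h, x_ctr, y_ctr):
    """Anchor of size w x h centered on (x_ctr, y_ctr)."""
    return [x_ctr - 0.5 * (w - 1), y_ctr - 0.5 * (h - 1),
            x_ctr + 0.5 * (w - 1), y_ctr + 0.5 * (h - 1)]


def ratio_enum(anchor, ratios):
    """Keep the area roughly constant while varying height/width ratio."""
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    size = w * h
    out = []
    for r in ratios:
        ws = round(math.sqrt(size / r))
        hs = round(ws * r)
        out.append(_mkanchor(ws, hs, x_ctr, y_ctr))
    return out


def scale_enum(anchor, scales):
    """Multiply width and height by each scale, keeping the center fixed."""
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    return [_mkanchor(w * s, h * s, x_ctr, y_ctr) for s in scales]


def generate_anchors_py(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    base = [0, 0, base_size - 1, base_size - 1]
    anchors = []
    for ratio_anchor in ratio_enum(base, ratios):
        anchors.extend(scale_enum(ratio_anchor, scales))
    return anchors


anchors = generate_anchors_py()
print(anchors[0])  # [-84.0, -40.0, 99.0, 55.0]
```

All 9 anchors share the center (7.5, 7.5) of the first 16 x 16 cell, which is why they can later be shifted to any other cell by adding a multiple of feat_stride.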

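anchor_target_layer.py:

Produces the training targets and labels for every anchor, classifying each as 1 (object), 0 (not object), or -1 (ignore). Box regression is applied only where label > 0, i.e. where there is an object.
forward: places 9 anchors at every cell, computes their coordinates, filters out anchors that extend beyond the image, and measures their overlap with the GT boxes.
1. Generate proposals from A anchors and K shifts, where A = 9 and K = H*W, with W and H the width and height of the feature map. The points are sampled uniformly over the image (61 x 36 of them in this example); shift_x and shift_y are the offsets of these points in the image. Shifting the nine base anchors by these offsets gives every feature-map cell its own 9 anchors. H x feat_stride and W x feat_stride are approximately the size of the rescaled image, with feat_stride=16
2. Remove anchors that cross the image boundary (about 2/3 are discarded)
3. Initialize all labels to -1
4. Compute the overlaps between anchors and gt_boxes
5. labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0 ==>0.3
labels[max_overlaps >= cfg.TRAIN.RPN_POSITIVE_OVERLAP] = 1 ==>0.7
6. Subsample positives and negatives (positive-to-negative ratio of at most 1:1)
num_fg = int(cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE): num_fg is at most 0.5*256

num_bg = cfg.TRAIN.RPN_BATCHSIZE - np.sum(labels == 1): the number of negatives
7. Compute bbox_targets ((len(inds_inside), 4)), which is all zeros for negatives; bbox_inside_weights ((len(inds_inside), 4)), where the four entries of each positive are set to cfg.TRAIN.RPN_BBOX_INSIDE_WEIGHTS = 1; and bbox_outside_weights ((len(inds_inside), 4)), computed as follows: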

    if cfg.TRAIN.RPN_POSITIVE_WEIGHT < 0:#(-1)
        # uniform weighting of examples (given non-uniform sampling)
        num_examples = np.sum(labels >= 0)
        positive_weights = np.ones((1, 4)) * 1.0 / num_examples
        negative_weights = np.ones((1, 4)) * 1.0 / num_examples
    bbox_outside_weights[labels == 1, :] = positive_weights
    bbox_outside_weights[labels == 0, :] = negative_weights

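8. _unmap: about 2/3 of all_anchors were discarded, keeping only the anchors inside the image; this step maps the results back to the full anchor set as input for the next layer and reshapes them to the expected layout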
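
Steps 2 to 5, the uniform weighting of step 7, and the unmapping of step 8 can be sketched on a toy example. This is a simplified illustration, not the layer itself: it skips the subsampling of step 6 and the regression targets, uses plain lists, and the function names iou and label_anchors are hypothetical.

```python
def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes using the +1 pixel convention."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / float(area_a + area_b - inter)


def label_anchors(all_anchors, gt_boxes, im_h, im_w,
                  neg_thresh=0.3, pos_thresh=0.7):
    # Step 2: keep only anchors that lie fully inside the image.
    inds_inside = [i for i, a in enumerate(all_anchors)
                   if a[0] >= 0 and a[1] >= 0 and a[2] < im_w and a[3] < im_h]
    # Step 3: every label starts as -1 (ignore).
    labels = [-1] * len(inds_inside)
    # Steps 4-5: threshold the best overlap with any ground-truth box.
    for k, i in enumerate(inds_inside):
        max_overlap = max(iou(all_anchors[i], g) for g in gt_boxes)
        if max_overlap < neg_thresh:
            labels[k] = 0
        if max_overlap >= pos_thresh:
            labels[k] = 1
    # Step 7 (uniform case): each labeled example gets weight 1 / num_examples.
    num_examples = sum(1 for label in labels if label >= 0)
    weight = 1.0 / num_examples if num_examples else 0.0
    # Step 8: map back to the full anchor set, filling dropped anchors with -1.
    full_labels = [-1] * len(all_anchors)
    for k, i in enumerate(inds_inside):
        full_labels[i] = labels[k]
    return inds_inside, full_labels, weight


toy_anchors = [[0, 0, 49, 49], [10, 10, 59, 59], [60, 60, 99, 99], [-10, 0, 39, 49]]
inds_inside, full_labels, weight = label_anchors(
    toy_anchors, gt_boxes=[[0, 0, 49, 49]], im_h=100, im_w=100)
print(inds_inside, full_labels, weight)  # [0, 1, 2] [1, -1, 0, -1] 0.5
```

The second anchor has an IoU of about 0.47 with the ground-truth box, so it falls between the two thresholds and stays at -1, meaning it contributes nothing to the loss.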


Stage 1 RPN, generate proposals

In rpn_generate:

cfg.TEST.RPN_PRE_NMS_TOP_N = -1 # no pre NMS filtering
cfg.TEST.RPN_POST_NMS_TOP_N = 2000: at most 2000 proposals are kept in the end
rpn_net = caffe.Net(rpn_test_prototxt, rpn_model_path, caffe.TEST): runs the RPN model trained above with the prototxt rpn_test.pt to produce proposals

In rpn_test.pt:

layer {
  name: 'proposal'
  type: 'Python'
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'im_info'
  top: 'rois'
  top: 'scores'
  python_param {
    module: 'rpn.proposal_layer'
    layer: 'ProposalLayer'
    param_str: "'feat_stride': 16"
  }
}

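rpn.proposal_layer-->proposal_layer.py: this layer converts the RPN outputs into object proposals. The author added a ProposalLayer class that overrides the setup and forward functions

forward:
Generate the anchor boxes and apply the predicted box deltas to every anchor
Clip the predicted boxes to the image and remove boxes whose width or height is below the threshold (16 * im_info[2])
Sort all (proposal, score) pairs by score
Take the top pre_nms_topN proposals (skipped here, because cfg.TEST.RPN_PRE_NMS_TOP_N = -1)
Apply NMS (threshold 0.7)
Take the top after_nms_topN proposals; here cfg.TEST.RPN_POST_NMS_TOP_N = 2000, so the first 2000 are kept (the original note warns that problems may arise when fewer than 2000 remain)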
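
The post-processing steps listed for forward can be sketched without Caffe or NumPy. This is a simplified illustration under assumptions: it starts from already decoded boxes, uses a greedy pure-Python NMS in place of the repository's NMS routines, and the names clip_box, nms, and select_proposals are hypothetical.

```python
def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes using the +1 pixel convention."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / float(area_a + area_b - inter)


def clip_box(box, im_h, im_w):
    """Clip a box to the image boundaries."""
    x1, y1, x2, y2 = box
    return [min(max(x1, 0), im_w - 1), min(max(y1, 0), im_h - 1),
            min(max(x2, 0), im_w - 1), min(max(y2, 0), im_h - 1)]


def nms(boxes, scores, thresh):
    """Greedy NMS: keep the best box, drop boxes overlapping it by more than thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) <= thresh]
    return keep


def select_proposals(boxes, scores, im_h, im_w, im_scale,
                     min_size=16, nms_thresh=0.7, post_nms_top_n=2000):
    boxes = [clip_box(b, im_h, im_w) for b in boxes]
    # Drop boxes narrower or shorter than min_size in original-image pixels.
    min_side = min_size * im_scale
    kept = [(b, s) for b, s in zip(boxes, scores)
            if b[2] - b[0] + 1 >= min_side and b[3] - b[1] + 1 >= min_side]
    boxes = [b for b, _ in kept]
    scores = [s for _, s in kept]
    # Slicing returns fewer entries when fewer than post_nms_top_n survive.
    keep = nms(boxes, scores, nms_thresh)[:post_nms_top_n]
    return [boxes[i] for i in keep], [scores[i] for i in keep]


raw_boxes = [[0, 0, 99, 99], [5, 5, 104, 104], [150, 150, 160, 160], [120, 20, 260, 90]]
raw_scores = [0.9, 0.8, 0.7, 0.6]
rois, roi_scores = select_proposals(raw_boxes, raw_scores, im_h=200, im_w=200, im_scale=1.0)
print(rois, roi_scores)  # [[0, 0, 99, 99], [120, 20, 199, 90]] [0.9, 0.6]
```

In this toy run the second box is suppressed (IoU of about 0.82 with the first), the third is too small, and the fourth is clipped to the image before being kept.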


Stage 1 Fast R-CNN using RPN proposals, init from ImageNet model

In train_fast_rcnn:

cfg.TRAIN.PROPOSAL_METHOD = 'rpn': rpn_roidb in pascal_voc.py will be called
cfg.TRAIN.IMS_PER_BATCH = 2
get_roidb: prepares roidb and imdb; train_net: trains Fast R-CNN

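In get_roidb:

imdb = get_imdb(imdb_name): instantiates the imdb class via factory.py and pascal_voc.py
When training the RPN, rpn_file=None and only ground-truth boxes are used; here rpn_file points to the proposals generated in the previous step, so they are loaded in addition to the ground-truth boxes
roidb = get_training_roidb(imdb): calls get_training_roidb in train.py to obtain the roidb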

In get_training_roidb:

imdb.append_flipped_images() (in imdb.py): horizontal flipping serves as data augmentation. It flips the roidb[i]['boxes'] produced by rpn_roidb (pascal_voc.py), which doubles the number of image indices
rdl_roidb.prepare_roidb(imdb) (in roidb.py): adds roidb[i]['max_classes'], roidb[i]['max_overlaps'], roidb[i]['image'], roidb[i]['width'], roidb[i]['height']

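In rpn_roidb:

gt_roidb = self.gt_roidb(): first obtains gt_roidb, the ground truth
rpn_roidb = self._load_rpn_roidb(gt_roidb): builds rpn_roidb from rpn_file. 'gt_overlaps' holds each proposal's maximum overlap with the gt_roidb boxes; of the num_classes columns, only one is non-zero
roidb = imdb.merge_roidbs(gt_roidb, rpn_roidb): concatenates the two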

After get_roidb, train_fast_rcnn calls train_net to train:
model_paths = train_net(solver, roidb, output_dir, pretrained_model=init_model, max_iters=max_iters)
train_net in train.py:

roidb = filter_roidb(roidb): removes images that have neither a foreground RoI nor a background RoI (overlap in [0, 0.5) counts as background, [0.5, 1] as foreground)
sw = SolverWrapper(solver_prototxt, roidb, output_dir, pretrained_model=pretrained_model): loads the pretrained model and computes the box regression targets via rdl_roidb.add_bbox_regression_targets(roidb)

add_bbox_regression_targets in roidb.py:

roidb[im_i]['bbox_targets'] = _compute_targets(rois, max_overlaps, max_classes)
(The mean and standard-deviation computation that follows is not examined in these notes)

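_compute_targets:

gt_inds: the ground-truth ROIs
ex_inds: the foreground ROIs (selected by overlap >= cfg.TRAIN.BBOX_THRESH, 0.5 here, so they include gt_inds, because ground-truth boxes have overlap 1)
ex_gt_overlaps: overlaps between the ex ROIs and the gt ROIs, of shape num_ex * num_gt
gt_assignment: for each ex ROI, the index of the gt ROI with the largest overlap
gt_rois, ex_rois: the boxes of the assigned ground truth and of ex_inds

targets: rois.shape[0] * 5. The first column holds the label, written only at ex_inds; all other rows stay 0. The last four columns hold the 4 regression offsets, also written only at ex_inds. As noted, ex_inds includes gt_inds, but a ground-truth box matched to itself has offsets of 0, so it contributes no regression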

def _compute_targets(rois, overlaps, labels):
    """Compute bounding-box regression targets for an image."""
    # Indices of ground-truth ROIs
    gt_inds = np.where(overlaps == 1)[0]
    if len(gt_inds) == 0:
        # Bail if the image has no ground-truth ROIs
        return np.zeros((rois.shape[0], 5), dtype=np.float32)
    # Indices of examples for which we try to make predictions
    ex_inds = np.where(overlaps >= cfg.TRAIN.BBOX_THRESH)[0]

    # Get IoU overlap between each ex ROI and gt ROI
    ex_gt_overlaps = bbox_overlaps(
        np.ascontiguousarray(rois[ex_inds, :], dtype=np.float64),
        np.ascontiguousarray(rois[gt_inds, :], dtype=np.float64))

    # Find which gt ROI each ex ROI has max overlap with:
    # this will be the ex ROI's gt target
    gt_assignment = ex_gt_overlaps.argmax(axis=1)
    gt_rois = rois[gt_inds[gt_assignment], :]
    ex_rois = rois[ex_inds, :]

    targets = np.zeros((rois.shape[0], 5), dtype=np.float32)
    targets[ex_inds, 0] = labels[ex_inds]
    targets[ex_inds, 1:] = bbox_transform(ex_rois, gt_rois)
    return targets
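
The bbox_transform call above is not shown in these notes. The sketch below gives the usual R-CNN box parameterization for a single pair of boxes, which is the assumed behavior here: center offsets normalized by the example box size, and log ratios of widths and heights. The name bbox_transform_single is hypothetical.

```python
import math


def bbox_transform_single(ex_roi, gt_roi):
    """Regression targets (dx, dy, dw, dh) that map ex_roi onto gt_roi."""
    ex_w = ex_roi[2] - ex_roi[0] + 1.0
    ex_h = ex_roi[3] - ex_roi[1] + 1.0
    ex_ctr_x = ex_roi[0] + 0.5 * ex_w
    ex_ctr_y = ex_roi[1] + 0.5 * ex_h

    gt_w = gt_roi[2] - gt_roi[0] + 1.0
    gt_h = gt_roi[3] - gt_roi[1] + 1.0
    gt_ctr_x = gt_roi[0] + 0.5 * gt_w
    gt_ctr_y = gt_roi[1] + 0.5 * gt_h

    dx = (gt_ctr_x - ex_ctr_x) / ex_w
    dy = (gt_ctr_y - ex_ctr_y) / ex_h
    dw = math.log(gt_w / ex_w)
    dh = math.log(gt_h / ex_h)
    return [dx, dy, dw, dh]


# A ground-truth box matched to itself yields all-zero targets.
print(bbox_transform_single([10, 10, 59, 59], [10, 10, 59, 59]))  # [0.0, 0.0, 0.0, 0.0]

# A box that must double its width: the center shifts by half a width, dw = log(2).
print(bbox_transform_single([0, 0, 9, 9], [0, 0, 19, 9]))
```

The first call illustrates the remark above: when a ground-truth box is selected as an example, its offsets are all zero.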

In stage1_fast_rcnn_train.pt:

name: "ZF"
layer {
  name: 'data'
  type: 'Python'
  top: 'data'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 21"
  }
}

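This invokes the roi_data_layer.layer module, which in turn calls get_minibatch in minibatch.py. (In the end2end method the ProposalTargetLayer plays the same role.)
get_minibatch

rois_per_image: maximum number of boxes per image, here 128/2 = 64 because IMS_PER_BATCH = 2
fg_rois_per_image: 0.25 of rois_per_image, i.e. 16 foreground boxes (positive-to-negative ratio 1:3)
im_blob, im_scales = _get_image_blob(roidb, random_scale_inds): im_scales holds the ratio between the network input image and the original image; im_blob is the network input blob
if cfg.TRAIN.HAS_RPN: the RPN training case
else: the Fast R-CNN training case
Mainly calls _sample_rois to obtain labels, overlaps, im_rois, bbox_targets, bbox_inside_weights. Coordinates in roidb refer to the original image (P*Q), and so do im_rois
rois = _project_im_rois(im_rois, im_scales[im_i]): rescales the RoIs to the network input size (M*N)
rois_blob: a 2-D array with 5 columns. The first column is the index of the image within the batch and the last four are coordinates; the RoIs of all images in the batch are stacked together and assigned to blobs['rois']
labels_blob, bbox_targets_blob, and bbox_inside_blob are likewise stacked over the batch and assigned to blobs['labels'], blobs['bbox_targets'], and blobs['bbox_inside_weights']
blobs['bbox_outside_weights'] = np.array(bbox_inside_blob > 0).astype(np.float32): bbox_outside_weights is 1 wherever bbox_inside_weights is positive and 0 elsewhere (the original notes mark its purpose as unknown)
The function returns blobs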
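
The layout of rois_blob and the derivation of bbox_outside_weights can be sketched with plain lists. The helper names build_rois_blob and outside_weights are hypothetical, and the input values are made up for illustration.

```python
def build_rois_blob(rois_per_image_list, im_scales):
    """Stack RoIs of all images; each row is (batch_index, x1, y1, x2, y2).

    rois_per_image_list[i] holds the RoIs sampled from image i in
    original-image coordinates; im_scales[i] is that image's scale factor.
    """
    rois_blob = []
    for im_i, (im_rois, scale) in enumerate(zip(rois_per_image_list, im_scales)):
        for roi in im_rois:
            rois_blob.append([float(im_i)] + [c * scale for c in roi])
    return rois_blob


def outside_weights(bbox_inside_blob):
    """1.0 wherever the inside weight is positive, 0.0 elsewhere."""
    return [[1.0 if w > 0 else 0.0 for w in row] for row in bbox_inside_blob]


rois_blob = build_rois_blob(
    [[[10, 20, 30, 40]], [[0, 0, 5, 5], [1, 2, 3, 4]]],
    im_scales=[2.0, 1.5])
print(rois_blob)
# [[0.0, 20.0, 40.0, 60.0, 80.0], [1.0, 0.0, 0.0, 7.5, 7.5], [1.0, 1.5, 3.0, 4.5, 6.0]]

print(outside_weights([[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]))
# [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
```

The batch index in the first column is what lets a single stacked blob carry RoIs from several images.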

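_sample_rois

fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]: overlap of at least 0.5 counts as foreground (one caveat: this threshold also selects the ground-truth boxes, whose overlap is 1, so the random positive sampling may pick a gt box, whose regression offsets are 0)
fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_inds.size): the smaller of fg_rois_per_image (16 here, given IMS_PER_BATCH = 2) and the number of foreground RoIs
fg_inds = npr.choice(fg_inds, size=fg_rois_per_this_image, replace=False): randomly picks fg_rois_per_this_image foreground RoIs
bg_rois_per_this_image: number of background RoIs, the smaller of 64 - fg_rois_per_this_image and the number of RoIs with overlap in [BG_THRESH_LO, BG_THRESH_HI), i.e. below 0.5
bg_inds: indices of the randomly chosen background RoIs
keep_inds = np.append(fg_inds, bg_inds): concatenates the foreground and background indices
labels = labels[keep_inds]: the labels
labels[fg_rois_per_this_image:] = 0: background labels are set to 0
overlaps = overlaps[keep_inds]: the overlaps
rois = rois[keep_inds]: the RoIs
bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(roidb['bbox_targets'][keep_inds, :], num_classes): bbox_targets has shape len(keep_inds) x 84 (84 = 4 * num_classes), expanded from the original 5 columns to 84. For each foreground row the 4 offsets are written into the columns of its class, and all other columns are 0. bbox_inside_weights has the same shape as bbox_targets and is set to cfg.TRAIN.BBOX_INSIDE_WEIGHTS (1 here) at exactly the positions where bbox_targets holds the four offsets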
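
The foreground and background counts can be checked with a small sketch. It assumes 64 RoIs per image with at most 16 foreground, uses random.sample in place of npr.choice, and treats bg_thresh_lo = 0.0 as an assumed value for cfg.TRAIN.BG_THRESH_LO; the function name sample_roi_indices is hypothetical.

```python
import random


def sample_roi_indices(max_overlaps, fg_rois_per_image=16, rois_per_image=64,
                       fg_thresh=0.5, bg_thresh_hi=0.5, bg_thresh_lo=0.0,
                       rng=None):
    """Return (fg_indices, bg_indices) sampled without replacement."""
    rng = rng or random.Random()
    fg_inds = [i for i, ov in enumerate(max_overlaps) if ov >= fg_thresh]
    bg_inds = [i for i, ov in enumerate(max_overlaps)
               if bg_thresh_lo <= ov < bg_thresh_hi]
    # Guard against images with fewer foreground RoIs than requested.
    n_fg = min(fg_rois_per_image, len(fg_inds))
    fg_sel = rng.sample(fg_inds, n_fg)
    # Background fills the rest of the per-image budget, if enough exist.
    n_bg = min(rois_per_image - n_fg, len(bg_inds))
    bg_sel = rng.sample(bg_inds, n_bg)
    return fg_sel, bg_sel


# Only 5 foreground RoIs are available, so background fills 64 - 5 = 59 slots.
overlaps = [0.8] * 5 + [0.2] * 100
fg, bg = sample_roi_indices(overlaps, rng=random.Random(0))
print(len(fg), len(bg))  # 5 59
```

With plenty of foreground RoIs the split becomes 16 foreground and 48 background, which is the 1:3 ratio mentioned above.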

"""Compute minibatch blobs for training a Fast R-CNN network."""
def get_minibatch(roidb, num_classes):
    """Given a roidb, construct a minibatch sampled from it."""
    num_images = len(roidb)
    # Sample random scales to use for each image in this batch
    random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
                                    size=num_images)
    assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
        'num_images ({}) must divide BATCH_SIZE ({})'. \
        format(num_images, cfg.TRAIN.BATCH_SIZE)
    # Integer division keeps the RoI count an int under both Python 2 and 3
    rois_per_image = cfg.TRAIN.BATCH_SIZE // num_images
    fg_rois_per_image = int(np.round(cfg.TRAIN.FG_FRACTION * rois_per_image))

    # Get the input image blob, formatted for caffe
    im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)

    blobs = {'data': im_blob}

    if cfg.TRAIN.HAS_RPN:
        assert len(im_scales) == 1, "Single batch only"
        assert len(roidb) == 1, "Single batch only"
        # gt boxes: (x1, y1, x2, y2, cls)
        gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
        gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
        gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
        gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
        blobs['gt_boxes'] = gt_boxes
        blobs['im_info'] = np.array(
            [[im_blob.shape[2], im_blob.shape[3], im_scales[0]]],
            dtype=np.float32)
    else: # not using RPN
        # Now, build the region of interest and label blobs
        rois_blob = np.zeros((0, 5), dtype=np.float32)
        labels_blob = np.zeros((0), dtype=np.float32)
        bbox_targets_blob = np.zeros((0, 4 * num_classes), dtype=np.float32)
        bbox_inside_blob = np.zeros(bbox_targets_blob.shape, dtype=np.float32)
        # all_overlaps = []
        for im_i in xrange(num_images):
            labels, overlaps, im_rois, bbox_targets, bbox_inside_weights \
                = _sample_rois(roidb[im_i], fg_rois_per_image, rois_per_image,
                               num_classes)

            # Add to RoIs blob
            rois = _project_im_rois(im_rois, im_scales[im_i])
            batch_ind = im_i * np.ones((rois.shape[0], 1))
            rois_blob_this_image = np.hstack((batch_ind, rois))
            rois_blob = np.vstack((rois_blob, rois_blob_this_image))

            # Add to labels, bbox targets, and bbox loss blobs
            labels_blob = np.hstack((labels_blob, labels))
            bbox_targets_blob = np.vstack((bbox_targets_blob, bbox_targets))
            bbox_inside_blob = np.vstack((bbox_inside_blob, bbox_inside_weights))
            # all_overlaps = np.hstack((all_overlaps, overlaps))

        # For debug visualizations
        # _vis_minibatch(im_blob, rois_blob, labels_blob, all_overlaps)

        blobs['rois'] = rois_blob
        blobs['labels'] = labels_blob

        if cfg.TRAIN.BBOX_REG:
            blobs['bbox_targets'] = bbox_targets_blob
            blobs['bbox_inside_weights'] = bbox_inside_blob
            blobs['bbox_outside_weights'] = \
                np.array(bbox_inside_blob > 0).astype(np.float32)

    return blobs

def _sample_rois(roidb, fg_rois_per_image, rois_per_image, num_classes):
    """Generate a random sample of RoIs comprising foreground and background
    examples.
    """
    # label = class RoI has max overlap with
    labels = roidb['max_classes']
    overlaps = roidb['max_overlaps']
    rois = roidb['boxes']

    # Select foreground RoIs as those with >= FG_THRESH overlap
    fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
    # Guard against the case when an image has fewer than fg_rois_per_image
    # foreground RoIs
    fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_inds.size)
    # Sample foreground regions without replacement
    if fg_inds.size > 0:
        fg_inds = npr.choice(
                fg_inds, size=fg_rois_per_this_image, replace=False)

    # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
    bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
                       (overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
    # Compute number of background RoIs to take from this image (guarding
    # against there being fewer than desired)
    bg_rois_per_this_image = rois_per_image - fg_rois_per_this_image
    bg_rois_per_this_image = np.minimum(bg_rois_per_this_image,
                                        bg_inds.size)
    # Sample background regions without replacement
    if bg_inds.size > 0:
        bg_inds = npr.choice(
                bg_inds, size=bg_rois_per_this_image, replace=False)

    # The indices that we're selecting (both fg and bg)
    keep_inds = np.append(fg_inds, bg_inds)
    # Select sampled values from various arrays:
    labels = labels[keep_inds]
    # Clamp labels for the background RoIs to 0
    labels[fg_rois_per_this_image:] = 0
    overlaps = overlaps[keep_inds]
    rois = rois[keep_inds]

    bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(
            roidb['bbox_targets'][keep_inds, :], num_classes)

    return labels, overlaps, rois, bbox_targets, bbox_inside_weights

def _get_image_blob(roidb, scale_inds):
    """Builds an input blob from the images in the roidb at the specified
    scales.
    """
    num_images = len(roidb)
    processed_ims = []
    im_scales = []
    for i in xrange(num_images):
        im = cv2.imread(roidb[i]['image'])
        if roidb[i]['flipped']:
            im = im[:, ::-1, :]
        target_size = cfg.TRAIN.SCALES[scale_inds[i]]
        im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
                                        cfg.TRAIN.MAX_SIZE)
        im_scales.append(im_scale)
        processed_ims.append(im)

    # Create a blob to hold the input images
    blob = im_list_to_blob(processed_ims)

    return blob, im_scales

def _project_im_rois(im_rois, im_scale_factor):
    """Project image RoIs into the rescaled training image."""
    rois = im_rois * im_scale_factor
    return rois

def _get_bbox_regression_labels(bbox_target_data, num_classes):
    """Bounding-box regression targets are stored in a compact form in the
    roidb.

    This function expands those targets into the 4-of-4*K representation used
    by the network (i.e. only one class has non-zero targets). The loss weights
    are similarly expanded.

    Returns:
        bbox_target_data (ndarray): N x 4K blob of regression targets
        bbox_inside_weights (ndarray): N x 4K blob of loss weights
    """
    clss = bbox_target_data[:, 0]
    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
    bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)
    inds = np.where(clss > 0)[0]
    # for ind in inds:
    #     cls = clss[ind]
    #     start = 4 * cls
    #     end = start + 4
    #     bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
    #     bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    # return bbox_targets, bbox_inside_weights
    for ind in inds:
        ind = int(ind)
        cls = clss[ind]
        start = int(4 * cls)
        end = int(start + 4)
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights
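
As a worked illustration of the 4-of-4*K expansion performed by _get_bbox_regression_labels, the sketch below uses plain lists and a made-up 3-class setup (so each row has 12 columns instead of 84). The function name expand_targets and the sample values are assumptions for illustration.

```python
def expand_targets(compact, num_classes, inside_weights=(1.0, 1.0, 1.0, 1.0)):
    """Expand rows of (cls, dx, dy, dw, dh) into N x 4K targets and weights."""
    targets = [[0.0] * (4 * num_classes) for _ in compact]
    weights = [[0.0] * (4 * num_classes) for _ in compact]
    for i, row in enumerate(compact):
        cls = int(row[0])
        if cls > 0:  # background rows (cls == 0) stay all zero
            start = 4 * cls
            targets[i][start:start + 4] = row[1:5]
            weights[i][start:start + 4] = list(inside_weights)
    return targets, weights


compact = [[2, 0.1, 0.2, 0.3, 0.4],   # foreground RoI of class 2
           [0, 0.0, 0.0, 0.0, 0.0]]   # background RoI
targets, weights = expand_targets(compact, num_classes=3)
print(targets[0])
# [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.3, 0.4]
print(weights[0])
# [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
```

Only the 4 columns belonging to the RoI's own class are filled, so the regression loss for every other class is masked out by the zero weights.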
