1. output_alt_opt
faster_rcnn_alt_opt.sh
train_faster_rcnn_alt_opt.py
Stage 1 RPN, init from ImageNet model
The RPN training procedure:
In train_rpn:
cfg.TRAIN.PROPOSAL_METHOD = 'gt'
Sets the proposal method; gt_roidb in pascal_voc.py will be called later
cfg.TRAIN.IMS_PER_BATCH = 1
get_roidb prepares the roidb and imdb
train_net trains the RPN
In get_roidb:
imdb = get_imdb(imdb_name)
Initializes the imdb class via factory.py and pascal_voc.py
When training the RPN, rpn_file=None, so only the ground-truth boxes are available
roidb = get_training_roidb(imdb)
Calls get_training_roidb in train.py to obtain the roidb
In get_training_roidb:
imdb.append_flipped_images()
(in imdb.py) Horizontal flipping serves as data augmentation: the roidb[i]['boxes'] obtained from gt_roidb (pascal_voc.py) are mirrored and the image index list is doubled; see the flip sketch after this block
rdl_roidb.prepare_roidb(imdb)
(in roidb.py) Fills in roidb[i]['max_classes'], roidb[i]['max_overlaps'], roidb[i]['image'], roidb[i]['width'], roidb[i]['height']
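A minimal sketch of the core of append_flipped_images (imdb.py); names follow the repo (including the _get_widths helper), but treat this as an illustration rather than the verbatim code:
def append_flipped_images(self):
    num_images = self.num_images
    widths = self._get_widths()
    for i in xrange(num_images):
        boxes = self.roidb[i]['boxes'].copy()
        oldx1 = boxes[:, 0].copy()
        oldx2 = boxes[:, 2].copy()
        # Mirror the x-coordinates around the image width; y is unchanged
        boxes[:, 0] = widths[i] - oldx2 - 1
        boxes[:, 2] = widths[i] - oldx1 - 1
        assert (boxes[:, 2] >= boxes[:, 0]).all()
        entry = {'boxes': boxes,
                 'gt_overlaps': self.roidb[i]['gt_overlaps'],
                 'gt_classes': self.roidb[i]['gt_classes'],
                 'flipped': True}
        self.roidb.append(entry)
    # Double the image index list so flipped copies get their own entries
    self._image_index = self._image_index * 2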
In gt_roidb:
Parses the annotation XML files (ground truth) under
/data/VOCdevkit2007/VOC2007/Annotations
to build gt_roidb
Each gt_roidb entry contains:
{'boxes' : boxes,
'gt_classes': gt_classes,
'gt_overlaps' : overlaps,
'flipped' : False,
'seg_areas' : seg_areas}
Array shapes inside gt_roidb (from _load_pascal_annotation):
boxes = np.zeros((num_objs, 4), dtype=np.uint16)
gt_classes = np.zeros((num_objs), dtype=np.int32)
overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
# "Seg" area for pascal is just the box area
seg_areas = np.zeros((num_objs), dtype=np.float32)
# Load object bounding boxes into a data frame.
for ix, obj in enumerate(objs):
    bbox = obj.find('bndbox')
    # Make pixel indexes 0-based
    x1 = float(bbox.find('xmin').text) - 1
    y1 = float(bbox.find('ymin').text) - 1
    x2 = float(bbox.find('xmax').text) - 1
    y2 = float(bbox.find('ymax').text) - 1
    cls = self._class_to_ind[obj.find('name').text.lower().strip()]
    boxes[ix, :] = [x1, y1, x2, y2]
    gt_classes[ix] = cls
    overlaps[ix, cls] = 1.0
    seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)
After get_roidb, train_rpn calls train_net to train:
model_paths = train_net(solver, roidb, output_dir,pretrained_model=init_model,max_iters=max_iters)
train_net in train.py:
roidb = filter_roidb(roidb)
Removes images that have neither at least one foreground RoI (overlap >= 0.5) nor at least one background RoI (overlap in [BG_THRESH_LO, 0.5)). During RPN training the roidb contains only ground truth, so every overlap is 1 and this filter removes nothing; a sketch of filter_roidb follows below.
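A sketch of filter_roidb (train.py), assuming the default thresholds from fast_rcnn.config (FG_THRESH = 0.5, background range [BG_THRESH_LO, BG_THRESH_HI) = [0.1, 0.5)):
import numpy as np
from fast_rcnn.config import cfg

def filter_roidb(roidb):
    """Remove roidb entries that have no usable RoIs (sketch)."""
    def is_valid(entry):
        overlaps = entry['max_overlaps']
        # Foreground: overlap >= FG_THRESH
        fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
        # Background: overlap in [BG_THRESH_LO, BG_THRESH_HI)
        bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
                           (overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
        # Valid if there is at least one foreground or background RoI
        return len(fg_inds) > 0 or len(bg_inds) > 0
    return [entry for entry in roidb if is_valid(entry)]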
sw = SolverWrapper(solver_prototxt, roidb, output_dir,pretrained_model=pretrained_model)
Loads the pretrained model, the solver_prototxt, and so on
stage1_rpn_train.pt
name: "ZF"
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 21"
  }
}
This instantiates the RoIDataLayer from roi_data_layer/layer.py, which in turn calls get_minibatch in minibatch.py.
In get_minibatch:
rois_per_image:
the maximum number of boxes per image, BATCH_SIZE / num_images = 128/1 = 128 here (unused when cfg.TRAIN.HAS_RPN is set)
fg_rois_per_image:
0.25 of rois_per_image, i.e. 32 foreground RoIs (also unused with RPN)
im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)
Computes the scale factors between the original images and the network input; im_blob is the network's input blob
if cfg.TRAIN.HAS_RPN:
the branch taken when training the RPN
gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
indices of the ground-truth objects (gt_classes != 0)
gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
blobs['gt_boxes'] = gt_boxes
5 columns: the first 4 are coordinates, the last is the class; one row per ground-truth object
blobs['im_info'] = np.array([[im_blob.shape[2], im_blob.shape[3], im_scales[0]]],dtype=np.float32)
im_scale = float(target_size) / float(im_size_min) = 600 / min(P, Q), where P x Q is the original image size; (im_blob.shape[2], im_blob.shape[3]) = (M, N) is the rescaled size, with min(M, N) = 600
The function returns the blobs dict. The scaling logic of prep_im_for_blob is sketched below.
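For reference, the scaling inside prep_im_for_blob (utils/blob.py) works roughly as follows; this is a sketch, using the defaults cfg.TRAIN.SCALES = (600,) and cfg.TRAIN.MAX_SIZE = 1000:
import numpy as np
import cv2

def prep_im_for_blob(im, pixel_means, target_size, max_size):
    """Mean-subtract and scale an image for use in a blob (sketch)."""
    im = im.astype(np.float32, copy=False)
    im -= pixel_means
    im_shape = im.shape
    im_size_min = np.min(im_shape[0:2])
    im_size_max = np.max(im_shape[0:2])
    # Scale the short side to target_size (600) ...
    im_scale = float(target_size) / float(im_size_min)
    # ... but cap the long side at max_size (1000)
    if np.round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / float(im_size_max)
    im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,
                    interpolation=cv2.INTER_LINEAR)
    return im, im_scale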
layer {
  name: 'rpn-data'
  type: 'Python'
  bottom: 'rpn_cls_score'
  bottom: 'gt_boxes'
  bottom: 'im_info'
  bottom: 'data'
  top: 'rpn_labels'
  top: 'rpn_bbox_targets'
  top: 'rpn_bbox_inside_weights'
  top: 'rpn_bbox_outside_weights'
  python_param {
    module: 'rpn.anchor_target_layer'
    layer: 'AnchorTargetLayer'
    param_str: "'feat_stride': 16"
  }
}
generate_anchors(base_size=16, ratios=[0.5, 1, 2], scales=2**np.arange(3, 6)):
The 9 base anchor windows are computed once at startup; they are the anchors of the first feature-map cell (every other cell gets shifted copies):
def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
                     scales=2**np.arange(3, 6)):
    """
    Generate anchor (reference) windows by enumerating aspect ratios X
    scales wrt a reference (0, 0, 15, 15) window.
    """
    base_anchor = np.array([1, 1, base_size, base_size]) - 1  # [0, 0, 15, 15]
    ratio_anchors = _ratio_enum(base_anchor, ratios)
    '''ratio_anchors:
    [[ -3.5,  2. , 18.5, 13. ],
     [  0. ,  0. , 15. , 15. ],
     [  2.5, -3. , 12.5, 18. ]]'''
    anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
                         for i in xrange(ratio_anchors.shape[0])])
    '''anchors:
    [[ -84.  -40.   99.   55.]
     [-176.  -88.  191.  103.]
     [-360. -184.  375.  199.]
     [ -56.  -56.   71.   71.]
     [-120. -120.  135.  135.]
     [-248. -248.  263.  263.]
     [ -36.  -80.   51.   95.]
     [ -80. -168.   95.  183.]
     [-168. -344.  183.  359.]]
    '''
    return anchors
anchor_target_layer.py:
Generates the training targets and labels for every anchor, classifying each as 1 (object), 0 (not object), or -1 (ignore). Box regression is trained only where label > 0, i.e. where there is an object.
forward: in every feature-map cell, generate the 9 anchors and their concrete coordinates, filter out anchors that extend beyond the image, and measure their overlap with the ground truth.
1. Generate the proposals: A anchors x K shifts, where A = 9 and K = H * W, with W and H the feature-map width and height (e.g. 61 x 36 points taken uniformly over the image). shift_x and shift_y are the offsets of these points within the image; shifting the 9 base anchors by each offset gives every feature-map cell its own 9 anchors. H * feat_stride and W * feat_stride roughly equal the rescaled image size; feat_stride = 16. A sketch of this enumeration follows after this list.
2. Discard anchors that cross the image boundary (roughly 2/3 are removed).
3. Initialize all labels to -1.
4. Compute the overlaps between the anchors and gt_boxes.
5. labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0 (threshold 0.3) and
labels[max_overlaps >= cfg.TRAIN.RPN_POSITIVE_OVERLAP] = 1 (threshold 0.7);
the layer also assigns label 1 to the anchor(s) with the highest overlap for each gt box, so every gt has at least one positive anchor.
6. Subsample the positives and negatives (keeping them roughly 1:1):
num_fg = int(cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE)
num_fg is at most 0.5 * 256 = 128
num_bg = cfg.TRAIN.RPN_BATCHSIZE - np.sum(labels == 1)
the number of negatives to keep
7. Build bbox_targets (shape (len(inds_inside), 4); all zeros for negatives), bbox_inside_weights (same shape; the 4 entries of each positive are set to cfg.TRAIN.RPN_BBOX_INSIDE_WEIGHTS = 1), and bbox_outside_weights (same shape), computed as follows:
if cfg.TRAIN.RPN_POSITIVE_WEIGHT < 0:  # default: -1
    # uniform weighting of examples (given non-uniform sampling)
    num_examples = np.sum(labels >= 0)
    positive_weights = np.ones((1, 4)) * 1.0 / num_examples
    negative_weights = np.ones((1, 4)) * 1.0 / num_examples
bbox_outside_weights[labels == 1, :] = positive_weights
bbox_outside_weights[labels == 0, :] = negative_weights
8. _unmap:
all_anchors was clipped down by roughly 2/3, keeping only the anchors inside the image; _unmap maps the labels, bbox_targets, and weights back onto the full anchor set (filling the removed entries), and the results are reshaped into the layout the next layer expects.
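A minimal sketch of how the K*A anchors are enumerated in anchor_target_layer (the meshgrid step from step 1 above); the feature-map size here is an assumed example:
import numpy as np
from generate_anchors import generate_anchors

feat_stride = 16
height, width = 36, 61                    # example feature-map size (H, W)
anchors = generate_anchors()              # (A, 4) base anchors, A = 9
A = anchors.shape[0]

# Offsets of every cell, in input-image coordinates
shift_x = np.arange(0, width) * feat_stride
shift_y = np.arange(0, height) * feat_stride
shift_x, shift_y = np.meshgrid(shift_x, shift_y)
shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),
                    shift_x.ravel(), shift_y.ravel())).transpose()
K = shifts.shape[0]                       # K = H * W

# Broadcast: every cell gets its own shifted copy of the 9 base anchors
all_anchors = (anchors.reshape((1, A, 4)) +
               shifts.reshape((1, K, 4)).transpose((1, 0, 2)))
all_anchors = all_anchors.reshape((K * A, 4))   # (K*A, 4)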
Stage 1 RPN, generate proposals
In rpn_generate:
cfg.TEST.RPN_PRE_NMS_TOP_N = -1 # no pre NMS filtering
cfg.TEST.RPN_POST_NMS_TOP_N = 2000
at most 2000 proposals are kept
rpn_net = caffe.Net(rpn_test_prototxt, rpn_model_path, caffe.TEST)
Runs the model trained by the RPN stage above with the rpn_test.pt prototxt to generate proposals
In rpn_test.pt:
layer {
  name: 'proposal'
  type: 'Python'
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'im_info'
  top: 'rois'
  top: 'scores'
  python_param {
    module: 'rpn.proposal_layer'
    layer: 'ProposalLayer'
    param_str: "'feat_stride': 16"
  }
}
rpn.proposal_layer --> proposal_layer.py: converts the RPN outputs into object proposals. The authors add a ProposalLayer class that implements the setup and forward functions.
forward:
Generate the anchor boxes and apply the predicted box deltas to every anchor
Clip the predicted boxes to the image and remove boxes whose width or height is below the threshold (16 * im_info[2]); see the sketch after this list
Sort all (proposal, score) pairs by score
Take the top pre_nms_topN proposals (skipped here, since cfg.TEST.RPN_PRE_NMS_TOP_N = -1)
Apply NMS with threshold 0.7
Take the top after_nms_topN proposals; cfg.TEST.RPN_POST_NMS_TOP_N = 2000, so the first 2000 are kept (if fewer than 2000 survive NMS, all of them are kept)
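A sketch of the clipping and size filtering used by proposal_layer.py (clip_boxes lives in fast_rcnn/bbox_transform.py); names follow the repo, but treat this as an illustration:
import numpy as np

def clip_boxes(boxes, im_shape):
    """Clip boxes to image boundaries."""
    boxes[:, 0::4] = np.maximum(np.minimum(boxes[:, 0::4], im_shape[1] - 1), 0)  # x1
    boxes[:, 1::4] = np.maximum(np.minimum(boxes[:, 1::4], im_shape[0] - 1), 0)  # y1
    boxes[:, 2::4] = np.maximum(np.minimum(boxes[:, 2::4], im_shape[1] - 1), 0)  # x2
    boxes[:, 3::4] = np.maximum(np.minimum(boxes[:, 3::4], im_shape[0] - 1), 0)  # y2
    return boxes

def _filter_boxes(boxes, min_size):
    """Keep only boxes with both sides >= min_size (16 * im_info[2] here)."""
    ws = boxes[:, 2] - boxes[:, 0] + 1
    hs = boxes[:, 3] - boxes[:, 1] + 1
    keep = np.where((ws >= min_size) & (hs >= min_size))[0]
    return keep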
Stage 1 Fast R-CNN using RPN proposals, init from ImageNet model
In train_fast_rcnn:
cfg.TRAIN.PROPOSAL_METHOD = 'rpn'
rpn_roidb in pascal_voc.py will be called
cfg.TRAIN.IMS_PER_BATCH = 2
get_roidb
prepares the roidb and imdb; train_net then trains Fast R-CNN
In get_roidb:
imdb = get_imdb(imdb_name)
Initializes the imdb class via factory.py and pascal_voc.py
Here rpn_file points to the proposal file generated in Stage 1, so the RPN proposals are used alongside the ground-truth boxes
roidb = get_training_roidb(imdb)
Calls get_training_roidb in train.py to obtain the roidb
In get_training_roidb:
imdb.append_flipped_images()
(in imdb.py) Horizontal flipping serves as data augmentation: the roidb[i]['boxes'] obtained from rpn_roidb (pascal_voc.py) are mirrored and the image index list is doubled
rdl_roidb.prepare_roidb(imdb)
(in roidb.py) Fills in roidb[i]['max_classes'], roidb[i]['max_overlaps'], roidb[i]['image'], roidb[i]['width'], roidb[i]['height']
In rpn_roidb:
gt_roidb = self.gt_roidb()
first builds gt_roidb, the ground truth
rpn_roidb = self._load_rpn_roidb(gt_roidb)
builds rpn_roidb from the proposals in rpn_file; 'gt_overlaps' holds each proposal's maximum overlap with the ground-truth boxes (of the num_classes columns, only the matched class column is non-zero)
roidb = imdb.merge_roidbs(gt_roidb, rpn_roidb)
stacks the two roidbs together; a sketch of merge_roidbs follows below
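Roughly what imdb.merge_roidbs does (in the repo it is a static method on the imdb class; scipy.sparse is used because 'gt_overlaps' is stored as a sparse matrix). A sketch:
import numpy as np
import scipy.sparse

def merge_roidbs(a, b):
    """Merge roidb b into roidb a, image by image (sketch)."""
    assert len(a) == len(b)
    for i in xrange(len(a)):
        a[i]['boxes'] = np.vstack((a[i]['boxes'], b[i]['boxes']))
        a[i]['gt_classes'] = np.hstack((a[i]['gt_classes'],
                                        b[i]['gt_classes']))
        a[i]['gt_overlaps'] = scipy.sparse.vstack([a[i]['gt_overlaps'],
                                                   b[i]['gt_overlaps']])
        a[i]['seg_areas'] = np.hstack((a[i]['seg_areas'],
                                       b[i]['seg_areas']))
    return a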
After get_roidb, train_fast_rcnn calls train_net to train:
model_paths = train_net(solver, roidb, output_dir,pretrained_model=init_model,max_iters=max_iters)
train_net in train.py:
roidb = filter_roidb(roidb)
Removes images that have neither at least one foreground RoI (overlap >= 0.5) nor at least one background RoI (overlap in [BG_THRESH_LO, 0.5))
sw = SolverWrapper(solver_prototxt, roidb, output_dir,pretrained_model=pretrained_model)
Loads the pretrained model and computes the box-regression targets via rdl_roidb.add_bbox_regression_targets(roidb)
add_bbox_regression_targets in roidb.py:
roidb[im_i]['bbox_targets'] = _compute_targets(rois, max_overlaps, max_classes)
afterwards the targets are normalized by per-class means and standard deviations, as sketched below
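A sketch of that normalization step (roi_data_layer/roidb.py); this assumes cfg.TRAIN.BBOX_NORMALIZE_TARGETS is set and, for brevity, uses the branch with precomputed means/stds from the config:
import numpy as np
from fast_rcnn.config import cfg

# Inside add_bbox_regression_targets, after _compute_targets has filled
# roidb[im_i]['bbox_targets'] for every image:
num_classes = roidb[0]['gt_overlaps'].shape[1]
means = np.tile(np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS), (num_classes, 1))
stds = np.tile(np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS), (num_classes, 1))
for im_i in xrange(len(roidb)):
    targets = roidb[im_i]['bbox_targets']
    for cls in xrange(1, num_classes):
        cls_inds = np.where(targets[:, 0] == cls)[0]
        # Normalize each class's offsets toward zero mean / unit variance
        roidb[im_i]['bbox_targets'][cls_inds, 1:] -= means[cls, :]
        roidb[im_i]['bbox_targets'][cls_inds, 1:] /= stds[cls, :]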
_compute_targets:
gt_inds:
indices of the ground-truth ROIs (overlap == 1)
ex_inds:
indices of the fg ROIs (overlap >= 0.5; this includes gt_inds, since ground-truth overlaps are 1)
ex_gt_overlaps:
overlaps between every ex ROI and every gt ROI, shape num_ex x num_gt
gt_assignment:
for each ex ROI, the index of the gt ROI it overlaps most
gt_rois, ex_rois:
the boxes at gt_inds and ex_inds
targets:
shape rois.shape[0] x 5; column 0 holds the label (written only at ex_inds, 0 elsewhere), columns 1-4 hold the 4 regression offsets, also written only at ex_inds. As noted, ex_inds contains gt_inds, but a gt ROI's offsets to itself are 0, so no regression is learned for it
def _compute_targets(rois, overlaps, labels):
    """Compute bounding-box regression targets for an image."""
    # Indices of ground-truth ROIs
    gt_inds = np.where(overlaps == 1)[0]
    if len(gt_inds) == 0:
        # Bail if the image has no ground-truth ROIs
        return np.zeros((rois.shape[0], 5), dtype=np.float32)
    # Indices of examples for which we try to make predictions
    ex_inds = np.where(overlaps >= cfg.TRAIN.BBOX_THRESH)[0]
    # Get IoU overlap between each ex ROI and gt ROI
    ex_gt_overlaps = bbox_overlaps(
        np.ascontiguousarray(rois[ex_inds, :], dtype=np.float),
        np.ascontiguousarray(rois[gt_inds, :], dtype=np.float))
    # Find which gt ROI each ex ROI has max overlap with:
    # this will be the ex ROI's gt target
    gt_assignment = ex_gt_overlaps.argmax(axis=1)
    gt_rois = rois[gt_inds[gt_assignment], :]
    ex_rois = rois[ex_inds, :]
    targets = np.zeros((rois.shape[0], 5), dtype=np.float32)
    targets[ex_inds, 0] = labels[ex_inds]
    targets[ex_inds, 1:] = bbox_transform(ex_rois, gt_rois)
    return targets
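For reference, bbox_transform (fast_rcnn/bbox_transform.py) computes the standard (dx, dy, dw, dh) offsets of each gt box relative to its ex box, roughly:
import numpy as np

def bbox_transform(ex_rois, gt_rois):
    """Regression targets from ex boxes to their assigned gt boxes (sketch)."""
    ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0
    ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0
    ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths
    ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights

    gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
    gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
    gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths
    gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights

    # Center offsets normalized by the ex box size; log-scale size ratios
    targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths
    targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights
    targets_dw = np.log(gt_widths / ex_widths)
    targets_dh = np.log(gt_heights / ex_heights)

    targets = np.vstack((targets_dx, targets_dy,
                         targets_dw, targets_dh)).transpose()
    return targets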
In stage1_fast_rcnn_train.pt:
name: "ZF"
layer {
  name: 'data'
  type: 'Python'
  top: 'data'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 21"
  }
}
This again instantiates the RoIDataLayer from roi_data_layer/layer.py, which calls get_minibatch in minibatch.py. (In the end-to-end training scheme, the ProposalTargetLayer plays the same role.)
In get_minibatch:
rois_per_image:
the maximum number of boxes per image, BATCH_SIZE / num_images = 128/2 = 64 here
fg_rois_per_image:
0.25 of rois_per_image, i.e. 16 foreground RoIs (fg:bg ratio 1:3)
im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)
Computes the scale factors between the original images and the network input; im_blob is the network's input blob
if cfg.TRAIN.HAS_RPN:
the branch taken when training the RPN
else:
the branch taken when training Fast R-CNN
mainly calls _sample_rois
to obtain labels, overlaps, im_rois, bbox_targets, bbox_inside_weights; the roidb coordinates refer to the original image (P x Q), so im_rois does too
rois = _project_im_rois(im_rois, im_scales[im_i])
scales the RoIs to the network input size (M x N)
rois_blob:
a 2-D array with 5 columns: the first column is the image's index within the batch, the remaining 4 are coordinates; the RoIs of all images in the batch are stacked together and assigned to blobs['rois']
labels_blob, bbox_targets_blob, bbox_inside_blob
are likewise stacked across the batch and assigned to blobs['labels'], blobs['bbox_targets'], blobs['bbox_inside_weights'], blobs['bbox_outside_weights']
blobs['bbox_outside_weights'] = np.array(bbox_inside_blob > 0).astype(np.float32)
bbox_outside_weights simply mirrors the mask of non-zero inside weights; the SmoothL1 loss layer multiplies each element's loss by these weights
The function returns the blobs dict
In _sample_rois:
fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
overlap >= 0.5 counts as foreground (note that this threshold also selects the ground-truth boxes themselves, whose overlap is 1, so the random foreground sampling may pick a gt box; its regression offsets are then 0)
fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_inds.size)
take the smaller of 16 and the number of foreground RoIs
fg_inds = npr.choice(fg_inds, size=fg_rois_per_this_image, replace=False)
randomly sample fg_rois_per_this_image foreground RoIs
bg_rois_per_this_image:
the number of background RoIs: the smaller of (64 - fg_rois_per_this_image) and the number of RoIs whose overlap falls in the background range
bg_inds:
indices of the randomly sampled background RoIs
keep_inds = np.append(fg_inds, bg_inds)
concatenate the foreground and background indices
labels = labels[keep_inds]
select the sampled labels
labels[fg_rois_per_this_image:] = 0
set the background labels to 0
overlaps = overlaps[keep_inds]
keep the sampled overlaps
rois = rois[keep_inds]
keep the sampled rois
bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(roidb['bbox_targets'][keep_inds, :], num_classes)
bbox_targets has shape len(keep_inds) x 84 (84 = 4 * num_classes): the compact 5-column form is expanded so that each foreground row carries its 4 offsets in the columns of its class, with all other columns 0; bbox_inside_weights has the same shape and is set to 1 exactly where bbox_targets holds the 4 offsets
"""Compute minibatch blobs for training a Fast R-CNN network."""
def get_minibatch(roidb, num_classes):
"""Given a roidb, construct a minibatch sampled from it."""
num_images = len(roidb)
# Sample random scales to use for each image in this batch
random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
size=num_images)
assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
'num_images ({}) must divide BATCH_SIZE ({})'. \
format(num_images, cfg.TRAIN.BATCH_SIZE)
rois_per_image = cfg.TRAIN.BATCH_SIZE / num_images
# fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(np.int)
# Get the input image blob, formatted for caffe
im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)
blobs = {'data': im_blob}
if cfg.TRAIN.HAS_RPN:
assert len(im_scales) == 1, "Single batch only"
assert len(roidb) == 1, "Single batch only"
# gt boxes: (x1, y1, x2, y2, cls)
gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
blobs['gt_boxes'] = gt_boxes
blobs['im_info'] = np.array(
[[im_blob.shape[2], im_blob.shape[3], im_scales[0]]],
dtype=np.float32)
else: # not using RPN
# Now, build the region of interest and label blobs
rois_blob = np.zeros((0, 5), dtype=np.float32)
labels_blob = np.zeros((0), dtype=np.float32)
bbox_targets_blob = np.zeros((0, 4 * num_classes), dtype=np.float32)
bbox_inside_blob = np.zeros(bbox_targets_blob.shape, dtype=np.float32)
# all_overlaps = []
for im_i in xrange(num_images):
labels, overlaps, im_rois, bbox_targets, bbox_inside_weights \
= _sample_rois(roidb[im_i], fg_rois_per_image, rois_per_image,
num_classes)
# Add to RoIs blob
rois = _project_im_rois(im_rois, im_scales[im_i])
batch_ind = im_i * np.ones((rois.shape[0], 1))
rois_blob_this_image = np.hstack((batch_ind, rois))
rois_blob = np.vstack((rois_blob, rois_blob_this_image))
# Add to labels, bbox targets, and bbox loss blobs
labels_blob = np.hstack((labels_blob, labels))
bbox_targets_blob = np.vstack((bbox_targets_blob, bbox_targets))
bbox_inside_blob = np.vstack((bbox_inside_blob, bbox_inside_weights))
# all_overlaps = np.hstack((all_overlaps, overlaps))
# For debug visualizations
# _vis_minibatch(im_blob, rois_blob, labels_blob, all_overlaps)
blobs['rois'] = rois_blob
blobs['labels'] = labels_blob
if cfg.TRAIN.BBOX_REG:
blobs['bbox_targets'] = bbox_targets_blob
blobs['bbox_inside_weights'] = bbox_inside_blob
blobs['bbox_outside_weights'] = \
np.array(bbox_inside_blob > 0).astype(np.float32)
return blobs
def _sample_rois(roidb, fg_rois_per_image, rois_per_image, num_classes):
    """Generate a random sample of RoIs comprising foreground and background
    examples.
    """
    # label = class RoI has max overlap with
    labels = roidb['max_classes']
    overlaps = roidb['max_overlaps']
    rois = roidb['boxes']
    # Select foreground RoIs as those with >= FG_THRESH overlap
    fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
    # Guard against the case when an image has fewer than fg_rois_per_image
    # foreground RoIs
    fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_inds.size)
    # Sample foreground regions without replacement
    if fg_inds.size > 0:
        fg_inds = npr.choice(
            fg_inds, size=fg_rois_per_this_image, replace=False)
    # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
    bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
                       (overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
    # Compute number of background RoIs to take from this image (guarding
    # against there being fewer than desired)
    bg_rois_per_this_image = rois_per_image - fg_rois_per_this_image
    bg_rois_per_this_image = np.minimum(bg_rois_per_this_image,
                                        bg_inds.size)
    # Sample background regions without replacement
    if bg_inds.size > 0:
        bg_inds = npr.choice(
            bg_inds, size=bg_rois_per_this_image, replace=False)
    # The indices that we're selecting (both fg and bg)
    keep_inds = np.append(fg_inds, bg_inds)
    # Select sampled values from various arrays:
    labels = labels[keep_inds]
    # Clamp labels for the background RoIs to 0
    labels[fg_rois_per_this_image:] = 0
    overlaps = overlaps[keep_inds]
    rois = rois[keep_inds]
    bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(
        roidb['bbox_targets'][keep_inds, :], num_classes)
    return labels, overlaps, rois, bbox_targets, bbox_inside_weights
def _get_image_blob(roidb, scale_inds):
    """Builds an input blob from the images in the roidb at the specified
    scales.
    """
    num_images = len(roidb)
    processed_ims = []
    im_scales = []
    for i in xrange(num_images):
        im = cv2.imread(roidb[i]['image'])
        if roidb[i]['flipped']:
            im = im[:, ::-1, :]
        target_size = cfg.TRAIN.SCALES[scale_inds[i]]
        im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
                                        cfg.TRAIN.MAX_SIZE)
        im_scales.append(im_scale)
        processed_ims.append(im)
    # Create a blob to hold the input images
    blob = im_list_to_blob(processed_ims)
    return blob, im_scales
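For reference, im_list_to_blob (utils/blob.py) pads the batch's images to a common size and converts them to NCHW order, roughly:
def im_list_to_blob(ims):
    """Pad a list of images to the max shape and stack into an NCHW blob
    (sketch)."""
    max_shape = np.array([im.shape for im in ims]).max(axis=0)
    num_images = len(ims)
    blob = np.zeros((num_images, max_shape[0], max_shape[1], 3),
                    dtype=np.float32)
    for i in xrange(num_images):
        im = ims[i]
        blob[i, 0:im.shape[0], 0:im.shape[1], :] = im
    # Move channels from axis 3 to axis 1: (N, H, W, C) -> (N, C, H, W)
    blob = blob.transpose((0, 3, 1, 2))
    return blob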
def _project_im_rois(im_rois, im_scale_factor):
    """Project image RoIs into the rescaled training image."""
    rois = im_rois * im_scale_factor
    return rois
def _get_bbox_regression_labels(bbox_target_data, num_classes):
    """Bounding-box regression targets are stored in a compact form in the
    roidb.
    This function expands those targets into the 4-of-4*K representation used
    by the network (i.e. only one class has non-zero targets). The loss weights
    are similarly expanded.
    Returns:
        bbox_target_data (ndarray): N x 4K blob of regression targets
        bbox_inside_weights (ndarray): N x 4K blob of loss weights
    """
    clss = bbox_target_data[:, 0]
    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
    bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)
    inds = np.where(clss > 0)[0]
    # Original loop (kept for reference; it breaks on float indices):
    # for ind in inds:
    #     cls = clss[ind]
    #     start = 4 * cls
    #     end = start + 4
    #     bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
    #     bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    for ind in inds:
        ind = int(ind)
        cls = clss[ind]
        start = int(4 * cls)
        end = int(start + 4)
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights