faster_rcnn:
SimpsonRecognition analysis
config.py
Define the anchor sizes:
self.anchor_box_scales = [64, 128, 256, 512]
self.anchor_box_ratio = [[1, 1], [1, 2], [2, 1]]
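A minimal sketch of how the scales and ratios above combine into anchor shapes. It assumes width = scale * ratio[0] and height = scale * ratio[1], which is the usual convention in keras-frcnn-style code; the helper name enumerate_anchor_shapes is hypothetical.

```python
# Anchor scales and ratios copied from the config values above.
anchor_box_scales = [64, 128, 256, 512]
anchor_box_ratio = [[1, 1], [1, 2], [2, 1]]


def enumerate_anchor_shapes(scales, ratios):
    """Return (width, height) for every scale/ratio combination.

    Assumption: width = scale * ratio[0], height = scale * ratio[1].
    """
    return [(scale * rw, scale * rh) for scale in scales for rw, rh in ratios]


anchor_shapes = enumerate_anchor_shapes(anchor_box_scales, anchor_box_ratio)
print(len(anchor_shapes))  # 12 shapes: 4 scales x 3 ratios
print(anchor_shapes[:3])   # [(64, 64), (64, 128), (128, 64)]
```

With 4 scales and 3 ratios, each anchor position carries 12 candidate boxes.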
Define the size of the image's shortest side: self.im_size = 300
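A rough sketch of what resizing to a fixed shortest side means: the shorter side is scaled to im_size and the longer side is scaled by the same factor. The function name get_new_img_size and the truncation with int() are assumptions for illustration.

```python
def get_new_img_size(width, height, img_min_side=300):
    """Scale (width, height) so that the shorter side equals img_min_side."""
    if width <= height:
        scale = float(img_min_side) / width
        return img_min_side, int(scale * height)
    scale = float(img_min_side) / height
    return int(scale * width), img_min_side


print(get_new_img_size(640, 480))  # (400, 300)
```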
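Per-channel image mean (ImageNet; the values are listed in B, G, R order): self.img_channel_mean = [103.939, 116.779, 123.68]
RPN stride: self.rpn_stride = 16  # anchors are not placed at every pixel; one set of bboxes is placed every 16 pixels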
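To make the stride concrete, here is a small sketch of how many anchor positions and anchors a resized image would get. It assumes the feature map size is the image size floor-divided by the stride (the exact size depends on the backbone's padding) and that anchor centers sit at stride * (i + 0.5); both helper names are hypothetical.

```python
rpn_stride = 16
num_anchor_shapes = 4 * 3  # 4 scales x 3 ratios from the config


def feature_map_size(width, height, stride):
    """Assumption: one anchor position per stride x stride block (floor division)."""
    return width // stride, height // stride


def anchor_centers(length, stride):
    """Center coordinate (in image pixels) of each feature-map cell along one axis."""
    return [stride * (i + 0.5) for i in range(length)]


fm_w, fm_h = feature_map_size(400, 300, rpn_stride)
total_anchors = fm_w * fm_h * num_anchor_shapes
print(fm_w, fm_h)                          # 25 18
print(total_anchors)                       # 5400
print(anchor_centers(fm_w, rpn_stride)[:3])  # [8.0, 24.0, 40.0]
```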
RPN overlap threshold:
self.rpn_min_overlap = 0.3
self.rpn_max_overlap = 0.7
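A sketch of how the two RPN thresholds are typically used: an anchor whose best IoU with any ground-truth box exceeds rpn_max_overlap is positive, one below rpn_min_overlap is negative, and anything in between is ignored during training. This is the standard Faster R-CNN rule, not necessarily the exact logic of this repo; the extra rule that the best anchor for each ground-truth box is also made positive is left out. The names iou and label_anchor are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_anchor(best_iou, min_overlap=0.3, max_overlap=0.7):
    """Map an anchor's best IoU to 'pos', 'neg' or 'neutral' (ignored)."""
    if best_iou > max_overlap:
        return 'pos'
    if best_iou < min_overlap:
        return 'neg'
    return 'neutral'


gt_box = (0, 0, 100, 100)
print(label_anchor(iou((0, 0, 100, 80), gt_box)))      # pos (IoU 0.8)
print(label_anchor(iou((0, 0, 100, 50), gt_box)))      # neutral (IoU 0.5)
print(label_anchor(iou((50, 50, 150, 150), gt_box)))   # neg (IoU about 0.14)
```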
Classifier ROI overlap thresholds:
self.classifier_min_overlap = 0.1
self.classifier_max_overlap = 0.5
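A sketch of how the classifier thresholds might be applied to a proposed ROI, assuming the convention used in keras-frcnn-style code: ROIs below classifier_min_overlap are dropped, ROIs between the two thresholds are labelled as background, and ROIs at or above classifier_max_overlap take the class of the best-matching ground-truth box. The function name and the class name in the example are hypothetical.

```python
def label_roi(best_iou, best_class, min_overlap=0.1, max_overlap=0.5):
    """Return the training label for an ROI, or None if the ROI is discarded."""
    if best_iou < min_overlap:
        return None          # too little overlap: not used for training
    if best_iou < max_overlap:
        return 'bg'          # partial overlap: used as a background sample
    return best_class        # strong overlap: used as an object sample


print(label_roi(0.05, 'bart_simpson'))  # None
print(label_roi(0.30, 'bart_simpson'))  # bg
print(label_roi(0.65, 'bart_simpson'))  # bart_simpson
```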
self.std_scalling = 4.0
self.classifier_regr_std = [80.0, 8.0, 4.0, 4.0]
parser.py
def get_data
Define three dicts:
all_imgs = {}
if filename not in all_imgs:
    all_imgs[filename] = {}
all_imgs is a dict that holds each image's width and height, the positions of its bounding boxes, and the class of each box
all_imgs[filename]['bboxes'].append({'class': class_name,
                                     'x1': int(x1),
                                     'x2': int(x2),
                                     'y1': int(y1),
                                     'y2': int(y2)})
classes_count = {}  # number of images per class
class_mapping = {}
if class_name not in class_mapping:
    class_mapping[class_name] = len(class_mapping)
Each image in the dataset is assigned to the training set with probability 5/6; the remaining 1/6 go to the test set:
if np.random.randint(0, 6) > 0:
    all_imgs[filename]['imageset'] = 'trainval'
else:
    all_imgs[filename]['imageset'] = 'test'
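A quick way to check the 5/6 split empirically. This sketch swaps np.random.randint(0, 6) for the standard library's random.randrange(6), which draws from the same range 0..5; the seed and the helper name assign_imageset are illustrative assumptions.

```python
import random


def assign_imageset(rng):
    """Return 'trainval' for draws 1..5 and 'test' for a draw of 0."""
    return 'trainval' if rng.randrange(6) > 0 else 'test'


rng = random.Random(0)  # fixed seed so the check is repeatable
labels = [assign_imageset(rng) for _ in range(6000)]
train_fraction = labels.count('trainval') / len(labels)
print(round(train_fraction, 2))  # close to 5/6
```

Because the split is random per image, the actual proportions vary from run to run and are only approximately 5/6 and 1/6.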
Define the background class:
classes_count['bg'] = 0
class_mapping['bg'] = len(class_mapping)  # added last, after all images have been read
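Put together, the counting and mapping logic could look like the sketch below. It takes a plain list of class names instead of parsing an annotation file, and the character names are made-up sample data; build_class_tables is a hypothetical helper.

```python
def build_class_tables(class_names):
    """Build classes_count and class_mapping, appending 'bg' at the end."""
    classes_count = {}
    class_mapping = {}
    for class_name in class_names:
        classes_count[class_name] = classes_count.get(class_name, 0) + 1
        if class_name not in class_mapping:
            # Each new class gets the next free integer index.
            class_mapping[class_name] = len(class_mapping)
    # Background is added only after every annotation has been seen,
    # so it always receives the largest index.
    classes_count['bg'] = 0
    class_mapping['bg'] = len(class_mapping)
    return classes_count, class_mapping


counts, mapping = build_class_tables(['homer_simpson', 'bart_simpson', 'homer_simpson'])
print(counts)   # {'homer_simpson': 2, 'bart_simpson': 1, 'bg': 0}
print(mapping)  # {'homer_simpson': 0, 'bart_simpson': 1, 'bg': 2}
```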
Define a list: all_data = []
It stores the values of all_imgs: ['filepath'] = filename, ['width'] = cols, ['height'] = rows, ['bboxes'], ['imageset'] = 'trainval'/'test'
for key in all_imgs:
    all_data.append(all_imgs[key])
Shuffle all_data randomly:
random.shuffle(all_data)
return all_data, classes_count, class_mapping
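A sketch of how the returned all_data list can be split by its 'imageset' field. The two entries below are hand-written stand-ins with hypothetical file names and boxes, not real output of get_data.

```python
# Hand-written sample in the same shape as the entries described above.
all_data = [
    {'filepath': 'a.jpg', 'width': 400, 'height': 300,
     'bboxes': [{'class': 'homer_simpson', 'x1': 10, 'x2': 120, 'y1': 20, 'y2': 200}],
     'imageset': 'trainval'},
    {'filepath': 'b.jpg', 'width': 300, 'height': 300,
     'bboxes': [{'class': 'bart_simpson', 'x1': 5, 'x2': 90, 'y1': 15, 'y2': 180}],
     'imageset': 'test'},
]

# Separate the records by the split assigned during parsing.
train_imgs = [d for d in all_data if d['imageset'] == 'trainval']
test_imgs = [d for d in all_data if d['imageset'] == 'test']

print(len(train_imgs), len(test_imgs))  # 1 1
```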
def get_anchor_gt(all_img_data, class_count, C, mode='train')
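Gets the anchors in the image and the parameters of the bboxes that correspond to them.
This function returns:
x_img, [y_rpn_cls, y_rpn_regr], img_data_aug
as a tuple, which is assigned to data_gen_train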
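A sketch of how such a three-part tuple is typically consumed. It assumes, as in keras-frcnn-style code, that get_anchor_gt is a generator and that data_gen_train holds the generator object; fake_anchor_gt is a hypothetical stand-in that yields placeholder strings instead of real arrays.

```python
def fake_anchor_gt(all_img_data):
    """Hypothetical stand-in for get_anchor_gt: yields placeholder values."""
    for img_data in all_img_data:
        x_img = 'image array for ' + img_data['filepath']
        y_rpn_cls = 'rpn class targets'
        y_rpn_regr = 'rpn regression targets'
        # Same layout as described above: input, [two RPN targets], metadata.
        yield x_img, [y_rpn_cls, y_rpn_regr], img_data


data_gen_train = fake_anchor_gt([{'filepath': 'a.jpg'}])
X, Y, img_data = next(data_gen_train)
print(X)       # image array for a.jpg
print(len(Y))  # 2
```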
Next comes the base_layer part.
It uses ResNet.