Text detection tricks

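Data augmentation:
FOTS: pretrained on the Synth800k training set plus the MLT training and validation sets, then fine-tuned further. Images are rotated by −10° to 10° as augmentation.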

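Multi-scale testing:
FOTS: short side 1260, long side 2240 (ICDAR2015); the real-time version runs at the original image size, 1280×720. Images are rescaled aggressively during training, from a long side of 640 pixels up to a long side of 2560 pixels. The height is then rescaled by a ratio of 0.8 to 1.2 while the width stays unchanged. Small 640×640 patches are cropped for training.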
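The FOTS training-time geometry above (long-side resize, rotation, height-only rescale, 640×640 crop) can be sketched as a parameter sampler. This is only a sketch: the function name, the returned keys, uniform sampling within each range, and clamping the crop origin to 0 when the image is smaller than the crop are all assumptions, not details from the paper. It samples parameters only; the actual image warping is left to whichever image library is used.

```python
import random

def sample_fots_augmentation(width, height, rng, crop=640):
    """Sample one set of augmentation parameters (hypothetical helper)."""
    # Resize so the longer side falls somewhere in [640, 2560] pixels.
    # Uniform sampling over that range is an assumption.
    target_long = rng.uniform(640, 2560)
    scale = target_long / max(width, height)
    resized_w = round(width * scale)
    resized_h = round(height * scale)

    # Rotation angle in degrees, within [-10, 10].
    angle = rng.uniform(-10.0, 10.0)

    # Rescale the height only; the width stays unchanged.
    height_ratio = rng.uniform(0.8, 1.2)
    final_w = resized_w
    final_h = round(resized_h * height_ratio)

    # Top-left corner of the crop. If the image is smaller than the crop
    # along an axis, the origin is 0 and the caller has to pad (assumption).
    x0 = rng.randint(0, max(final_w - crop, 0))
    y0 = rng.randint(0, max(final_h - crop, 0))

    return {
        "resized": (resized_w, resized_h),
        "angle": angle,
        "height_ratio": height_ratio,
        "size": (final_w, final_h),
        "crop_origin": (x0, y0),
    }

params = sample_fots_augmentation(1280, 720, random.Random(0))
print(params)
```

Passing a seeded `random.Random` keeps the sampled parameters reproducible across runs.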
FSTN: Data augmentation: multi-scale training, rotation, and color jittering are applied during training. Scales are randomly chosen from [600, 720, 960, 1100], where each number is the short edge of the input image. Rotations of 15°, 30°, and 45° are applied together with horizontal flips, which enlarges the dataset to 8x its original size. Random brightness, contrast, and saturation jittering are applied to the input images.
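One way to read the 8x figure is four orientations (the unrotated image plus the three rotations) times two flip states. That reading is an assumption, as are the helper names below; the snippet enumerates those variants and shows the short-edge resize arithmetic.

```python
import itertools
import random

SHORT_EDGES = [600, 720, 960, 1100]   # candidate short-edge lengths
ANGLES = [0, 15, 30, 45]              # 0 = unrotated original (assumed)
FLIPS = [False, True]                 # without / with horizontal flip

def offline_variants():
    """All (angle, flip) combinations generated per source image."""
    return list(itertools.product(ANGLES, FLIPS))

def resize_to_short_edge(width, height, short_edge):
    """Scale (width, height) so the shorter side equals short_edge."""
    scale = short_edge / min(width, height)
    return round(width * scale), round(height * scale)

def sample_train_size(width, height, rng):
    """Pick one training scale at random for this image."""
    return resize_to_short_edge(width, height, rng.choice(SHORT_EDGES))

print(len(offline_variants()))
print(sample_train_size(1280, 720, random.Random(0)))
```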

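Other parameter settings:
FOTS uses hard example mining: for each classification step it takes 512 random negative samples, 512 hard negative samples, and all positive samples. Hard example mining is also applied separately for regression.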
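A minimal sketch of that selection for the classification branch, assuming each candidate (for example, a pixel of the score map) has a scalar loss and a 0/1 label. The function name is hypothetical, and drawing the random negatives only from those not already chosen as hard negatives is an assumption the notes do not settle.

```python
import random

def select_training_samples(losses, labels, rng, n_hard=512, n_random=512):
    """Return indices of (positives, hard negatives, random negatives)."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]

    # Hard negatives: the negatives with the highest loss.
    negatives.sort(key=lambda i: losses[i], reverse=True)
    hard = negatives[:n_hard]

    # Random negatives: sampled from the remaining negatives (assumption).
    rest = negatives[n_hard:]
    rand = rng.sample(rest, min(n_random, len(rest)))

    return positives, hard, rand

rng = random.Random(0)
labels = [1] * 100 + [0] * 1900
losses = [rng.random() for _ in labels]
pos, hard, rand = select_training_samples(losses, labels, rng)
print(len(pos), len(hard), len(rand))
```

The loss is then averaged over the union of the three index sets rather than over every candidate.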

PSENet

  1. the images are rescaled with a ratio chosen randomly from {0.5, 1.0, 2.0, 3.0};
  2. the images are horizontally flipped and rotated randomly in the range [−10°, 10°];
  3. 640 × 640 random samples are cropped from the transformed images;
  4. the images are normalized using the channel means and standard deviations.
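The discrete rescale ratio, the flip, and the normalization step can be sketched in plain Python on a nested-list image. The helper names, the 0.5 flip probability, and the ImageNet RGB statistics used for the means and standard deviations are assumptions; the notes do not give the actual values.

```python
import random

RATIOS = [0.5, 1.0, 2.0, 3.0]
# ImageNet RGB statistics, used here only as a stand-in (assumption).
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def sample_psenet_params(rng):
    """Sample the random choices for steps 1 and 2."""
    return {
        "ratio": rng.choice(RATIOS),
        "flip": rng.random() < 0.5,      # flip probability is assumed
        "angle": rng.uniform(-10.0, 10.0),
    }

def hflip(image):
    """Horizontally flip an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]

def normalize(image):
    """Normalize (r, g, b) pixels in [0, 1] per channel: (x - mean) / std."""
    return [
        [tuple((c - m) / s for c, m, s in zip(px, MEAN, STD)) for px in row]
        for row in image
    ]

image = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]
print(sample_psenet_params(random.Random(0)))
print(normalize(hflip(image)))
```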
