
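FOTS: pretrained on the Synth800k training set plus the MLT training and validation sets, then fine-tuned further. Data augmentation uses rotations of -10° to 10°.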

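FOTS: short side 1260, long side 2240 (ICDAR2015); the real-time version uses the original image scale, 1280×720. Images are rescaled aggressively, from a long side of 640 pixels up to a long side of 2560 pixels. With the width kept unchanged, the height is rescaled by a factor of 0.8~1.2. During training, small 640×640 patches are cropped.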
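The resize-and-crop recipe above can be sketched as a parameter sampler. This is a minimal sketch, not the FOTS implementation: the function name `sample_fots_geometry`, the uniform sampling of the long side, and the clamping of the crop window (instead of padding small images) are all assumptions made for illustration.

```python
import random

def sample_fots_geometry(width, height, rng, crop=640):
    """Sample a resized size and a crop box for one training image (hypothetical helper)."""
    # Target long side between 640 and 2560 pixels; uniform sampling is an assumption.
    long_side = rng.uniform(640, 2560)
    scale = long_side / max(width, height)
    # Width follows the sampled scale; height gets an extra factor in [0.8, 1.2].
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale * rng.uniform(0.8, 1.2)))
    # Clamp the crop so the window always fits (assumption; padding is the alternative).
    crop_w, crop_h = min(crop, new_w), min(crop, new_h)
    x0 = rng.randint(0, new_w - crop_w)  # randint bounds are inclusive
    y0 = rng.randint(0, new_h - crop_h)
    return (new_w, new_h), (x0, y0, x0 + crop_w, y0 + crop_h)

# Example: an ICDAR2015-sized image (1280×720).
size, box = sample_fots_geometry(1280, 720, random.Random(0))
```

The returned box is in the coordinates of the resized image, so the resize has to be applied before the crop.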
FSTN: Data augmentation. Multi-scale training, rotation and color
jittering are applied during training. Scales are randomly
chosen from [600, 720, 960, 1100], where each number is
the short edge of the input image. Rotations of 15°, 30° and 45°
are applied together with horizontal flip. Consequently, the dataset
becomes 8x larger than the original one. Random brightness, contrast
and saturation jittering are applied to the input images.
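One reading that reproduces the stated 8x factor is four rotation settings (the unrotated image plus 15°, 30° and 45°) combined with two flip settings. The sketch below enumerates those variants and samples a short-edge scale; counting the unrotated image as a fourth setting is an assumption, and the names `geometric_variants` and `sample_short_edge_size` are hypothetical.

```python
import itertools
import random

SHORT_EDGES = [600, 720, 960, 1100]  # candidate short-edge lengths in pixels
ROTATIONS = [0, 15, 30, 45]          # degrees; 0 (no rotation) is an assumed fourth setting
FLIPS = [False, True]                # without / with horizontal flip

def geometric_variants():
    """All (rotation, flip) combinations: 4 rotations x 2 flips."""
    return list(itertools.product(ROTATIONS, FLIPS))

def sample_short_edge_size(width, height, rng):
    """Pick a short-edge length at random and rescale, keeping the aspect ratio."""
    short = rng.choice(SHORT_EDGES)
    scale = short / min(width, height)
    return round(width * scale), round(height * scale)

variants = geometric_variants()
new_size = sample_short_edge_size(1280, 720, random.Random(0))
```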

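FOTS uses hard example mining: for classification, each step uses 512 random negative samples, 512 hard negative samples, and all positive samples. For regression, hard example mining is also performed separately.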
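The classification sampling can be sketched as follows: sort the negatives by loss, keep the 512 with the highest loss as hard negatives, draw 512 more at random, and keep every positive. This is a minimal sketch; the function name `select_ohem_samples` is hypothetical, and drawing the random negatives only from those not already chosen as hard is an assumption.

```python
import random

def select_ohem_samples(labels, losses, rng, n_hard=512, n_random=512):
    """Return indices of (positives, hard negatives, random negatives)."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Hard negatives: the negatives with the largest per-sample loss.
    neg.sort(key=lambda i: losses[i], reverse=True)
    hard = neg[:n_hard]
    # Random negatives: drawn from the remaining negatives (assumption).
    rest = neg[n_hard:]
    rand = rng.sample(rest, min(n_random, len(rest)))
    return pos, hard, rand

# Toy data: 100 positives, 5000 negatives, random per-sample losses.
rng = random.Random(0)
labels = [1] * 100 + [0] * 5000
losses = [rng.random() for _ in labels]
pos, hard, rand = select_ohem_samples(labels, losses, rng)
```

Only the selected indices would then contribute to the classification loss; the regression loss would need its own selection along the same lines.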


  1. the images are rescaled randomly with a ratio from {0.5, 1.0, 2.0, 3.0};
  2. the images are horizontally flipped and rotated randomly in the range [−10°, 10°];
  3. 640 × 640 random samples are cropped from the transformed images;
  4. the images are normalized using the channel means and standard deviations.
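The four steps can be sketched as a parameter sampler plus a per-pixel normalization. This is a minimal sketch under stated assumptions: the names `sample_transform` and `normalize_pixel` are hypothetical, the flip probability of 0.5 and the uniform rotation angle are assumptions, the crop is clamped to the image instead of padded, and the channel means and standard deviations are supplied by the caller because the notes do not give their values.

```python
import random

SCALE_RATIOS = [0.5, 1.0, 2.0, 3.0]

def sample_transform(width, height, rng, crop=640):
    """Sample the parameters for steps 1 to 3 for one image."""
    ratio = rng.choice(SCALE_RATIOS)            # step 1: random rescale ratio
    new_w, new_h = round(width * ratio), round(height * ratio)
    flip = rng.random() < 0.5                   # step 2: flip (probability assumed)
    angle = rng.uniform(-10.0, 10.0)            # step 2: rotation in degrees
    # Step 3: crop window, clamped so it fits inside the rescaled image (assumption).
    crop_w, crop_h = min(crop, new_w), min(crop, new_h)
    x0 = rng.randint(0, new_w - crop_w)
    y0 = rng.randint(0, new_h - crop_h)
    return {"size": (new_w, new_h), "flip": flip, "angle": angle,
            "box": (x0, y0, x0 + crop_w, y0 + crop_h)}

def normalize_pixel(pixel, means, stds):
    """Step 4: subtract the channel mean and divide by the channel std."""
    return tuple((v - m) / s for v, m, s in zip(pixel, means, stds))

params = sample_transform(1280, 720, random.Random(0))
```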
