Contents
Background
Why does text matter?
Problem definition
What are the challenges?
Recent advances and representative algorithms
Holistic, Multi-Channel Prediction
TextBoxes
Rotation Proposals
Corner Localization and Region Segmentation (A Megvii work in CVPR 2018)
Simpler Pipelines
EAST (A Megvii work in CVPR 2017)
Detecting text of arbitrary shapes
TextSnake (A Megvii work in ECCV 2018)
Mask TextSpotter (A Megvii work in ECCV 2018)
Text recognition
CRNN
ASTER
FAN
Recommended resources
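Because text is a human creation, it has two characteristics:
In natural scenes, text also serves as a visual cue that complements other cues such as edges and textures.
Text detection is the process of using an algorithm to locate text in an image and detect its characters.
Unlike traditional OCR:
Natural scenes are more cluttered, whereas OCR input is more regular.
Text varies widely in type, format, color, and so on.
The specific challenges fall into three categories:
Some algorithms draw inspiration from object detection and semantic segmentation: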
Yao et al. Scene Text Detection via Holistic, Multi-Channel Prediction. arXiv preprint arXiv:1606.09002, 2016.
Liao et al. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. AAAI, 2017.
Ma et al. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. arXiv, 2017.
Lyu et al. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. CVPR, 2018.
Zhou et al. EAST: An Efficient and Accurate Scene Text Detector. CVPR, 2017.
• Main idea: predict the location, scale, and orientation of text with a single model and multiple loss functions (multi-task training)
• Advantages:
(a) Accuracy: allows end-to-end training and optimization
(b) Efficiency: removes redundant stages and processing
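The multi-task idea above can be sketched as a single scalar loss that sums a text/non-text score term and a geometry term covering box size and orientation. This is only a minimal illustration in plain Python, not the formulation from the EAST paper: the Dice-style score loss, the function names, the per-pixel `(top, right, bottom, left, theta)` layout, and the weights `lambda_geo` and `lambda_theta` are all assumptions made for the example.

```python
import math

def dice_loss(pred, target, eps=1e-6):
    """Score term over flattened per-pixel text probabilities (illustrative choice)."""
    inter = sum(p * t for p, t in zip(pred, target))
    return 1.0 - 2.0 * inter / (sum(pred) + sum(target) + eps)

def iou_loss(pred_d, gt_d, eps=1e-6):
    """Location/scale term: boxes are (top, right, bottom, left) distances from one pixel."""
    pt, pr, pb, pl = pred_d
    gt_t, gt_r, gt_b, gt_l = gt_d
    area_pred = (pt + pb) * (pr + pl)
    area_gt = (gt_t + gt_b) * (gt_r + gt_l)
    inter = (min(pt, gt_t) + min(pb, gt_b)) * (min(pr, gt_r) + min(pl, gt_l))
    union = area_pred + area_gt - inter
    return -math.log((inter + eps) / (union + eps))

def angle_loss(pred_theta, gt_theta):
    """Orientation term: zero when the angles agree."""
    return 1.0 - math.cos(pred_theta - gt_theta)

def multitask_loss(score_pred, score_gt, geo_pred, geo_gt,
                   lambda_geo=1.0, lambda_theta=10.0):
    """Sum the score term and the geometry term (averaged over text pixels).

    The weights are hypothetical defaults chosen for this sketch.
    """
    score_term = dice_loss(score_pred, score_gt)
    positives = [i for i, s in enumerate(score_gt) if s > 0]
    geo_term = 0.0
    for i in positives:
        *pred_d, pred_theta = geo_pred[i]
        *gt_d, gt_theta = geo_gt[i]
        geo_term += iou_loss(pred_d, gt_d) + lambda_theta * angle_loss(pred_theta, gt_theta)
    if positives:
        geo_term /= len(positives)
    return score_term + lambda_geo * geo_term

# Toy data: three pixels, two of which lie on text.
score_gt = [1.0, 0.0, 1.0]
geo_gt = [(2, 3, 2, 3, 0.0), (0, 0, 0, 0, 0.0), (1, 4, 3, 2, 0.1)]
score_pred = [0.7, 0.2, 0.6]
geo_pred = [(2, 2, 2, 3, 0.2), (0, 0, 0, 0, 0.0), (1, 3, 3, 2, 0.0)]

perfect = multitask_loss(score_gt, score_gt, geo_gt, geo_gt)
noisy = multitask_loss(score_pred, score_gt, geo_pred, geo_gt)
print(perfect, noisy)
```

Because every term feeds one scalar, a single backward pass can update the whole model, which is what makes end-to-end optimization possible without separate intermediate stages.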
Long et al. TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. ECCV, 2018.
• A text instance is described as a sequence of ordered, overlapping disks centered at symmetric axes, each of which is associated with a potentially variable radius and orientation
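The disk-based representation described above can be illustrated with a small data structure. The sketch below is not taken from the TextSnake code; the `Disk` class, its field names, and the helper functions are hypothetical, and the text region is simply approximated as the union of the disks.

```python
import math
from dataclasses import dataclass

@dataclass
class Disk:
    cx: float      # centre x, a point on the text centre line
    cy: float      # centre y
    radius: float  # local half-height of the text, may vary along the line
    theta: float   # local orientation in radians (stored, unused by the helpers below)

def covers(disks, x, y):
    """True if the point (x, y) falls inside at least one disk."""
    return any(math.hypot(x - d.cx, y - d.cy) <= d.radius for d in disks)

def rasterize(disks, width, height):
    """Binary mask (list of rows) of the region covered by the disk sequence."""
    return [[1 if covers(disks, x, y) else 0 for x in range(width)]
            for y in range(height)]

def axis_length(disks):
    """Length of the polyline through consecutive disk centres."""
    return sum(math.hypot(b.cx - a.cx, b.cy - a.cy)
               for a, b in zip(disks, disks[1:]))

# A short, slightly curved text instance made of three overlapping disks.
snake = [
    Disk(2.0, 2.0, 2.0, 0.0),
    Disk(5.0, 3.0, 2.0, 0.3),
    Disk(8.0, 5.0, 2.0, 0.6),
]

mask = rasterize(snake, width=12, height=9)
for row in mask:
    print("".join("#" if v else "." for v in row))
```

Since each disk carries its own centre and radius, the same structure can describe straight, rotated, and curved text without changing the representation.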
Lyu et al. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. ECCV, 2018.
Shi et al. An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition. TPAMI, 2017.
Shi et al. ASTER: An Attentional Scene Text Recognizer with Flexible Rectification. TPAMI, 2018.
• Survey
  • Scene Text Detection and Recognition: The Deep Learning Era
    • arXiv: https://arxiv.org/abs/1811.04256 (draft version)
    • GitHub: https://github.com/Jyouhou/SceneTextPapers (compiled papers, datasets, and code)
• Laboratories and Papers
  • https://github.com/chongyangtao/Awesome-Scene-Text-Recognition
• Datasets and Code
  • https://github.com/seungwooYoo/Curated-scene-text-recognition-analysis
• Projects and Products
  • https://github.com/wanghaisheng/awesome-ocr