【Paper roundup】A collection of character recognition papers!

OCR papers

Some paper and dataset links, collected from:

  • [1] wanghaisheng/awesome-ocr
  • [2] kba/awesome-ocr
  • [3] chongyangtao/Awesome-Scene-Text-Recognition
  • [4] whitelok/image-text-localization-recognition
  • [5] Text detection and recognition resources
  • [6] OCR material
  • [7] handong1587
  • [8] hs105/Deep-Learning-for-OCR
  • [9] Collected materials on text detection and recognition
  • [10] hwalsuklee/awesome-deep-text-detection-recognition

You can also visit the ICDAR website and find some excellent OCR models in the “Ranking Table” on each competition’s results page.


2009

  • 【Synthetic data】de T. Campos, B. R. Babu, and M. Varma. Character recognition in natural images. In VISAPP, 2009

2010

  • Epshtein B, Ofek E, Wexler Y. Detecting text in natural scenes with stroke width transform[C]//Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010: 2963-2970.

         code:[code]

2011

  • Rusinol M, Aldavert D, Toledo R, et al. Browsing heterogeneous document collections by a segmentation-free word spotting method[C]//Document Analysis and Recognition (ICDAR), 2011 International Conference on. IEEE, 2011: 63-67.
  • Neumann L, Matas J. Text localization in real-world images using efficiently pruned exhaustive search[C]//Document Analysis and Recognition (ICDAR), 2011 International Conference on. IEEE, 2011: 687-691.

2012

  • 【Synthetic data】Wang T, Wu D J, Coates A, et al. End-to-end text recognition with convolutional neural networks[C]//Pattern Recognition (ICPR), 2012 21st International Conference on. IEEE, 2012: 3304-3308.

         code:[code]
  • Elagouni K, Garcia C, Mamalet F, et al. Text recognition in videos using a recurrent connectionist approach[C]//International Conference on Artificial Neural Networks. Springer, Berlin, Heidelberg, 2012: 172-179.
  • Frinken V, Fischer A, Manmatha R, et al. A novel word spotting method based on recurrent neural networks[J]. IEEE transactions on pattern analysis and machine intelligence, 2012, 34(2): 211-224.
  • Neumann L, Matas J. Real-time scene text localization and recognition[C]//Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012: 3538-3545.

         code:[code]
  • Mishra A, Alahari K, Jawahar C V. Top-down and bottom-up cues for scene text recognition[C]//Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012: 2687-2694.

2013

  • Yin X C, Yin X, Huang K, et al. Robust text detection in natural scene images[J]. IEEE transactions on pattern analysis and machine intelligence, 2014, 36(5): 970-983.
  • Bissacco A, Cummins M, Netzer Y, et al. Photoocr: Reading text in uncontrolled conditions[C]//Proceedings of the IEEE International Conference on Computer Vision. 2013: 785-792.
  • Breuel T M, Ul-Hasan A, Al-Azawi M A, et al. High-performance OCR for printed English and Fraktur using LSTM networks[C]//Document Analysis and Recognition (ICDAR), 2013 12th International Conference on. IEEE, 2013: 683-687.

         code:[code]
  • Milyaev S, Barinova O, Novikova T, et al. Image binarization for end-to-end text understanding in natural images[C]//Document Analysis and Recognition (ICDAR), 2013 12th International Conference on. IEEE, 2013: 128-132.
  • Neumann L, Matas J. On combining multiple segmentations in scene text recognition[C]//Document Analysis and Recognition (ICDAR), 2013 12th International Conference on. IEEE, 2013: 523-527.
  • Koo H I, Kim D H. Scene text detection via connected component clustering and nontext filtering[J]. IEEE transactions on image processing, 2013, 22(6): 2296-2305.
  • Shi C, Wang C, Xiao B, et al. Scene text recognition using part-based tree-structured character detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2013: 2961-2968.
  • Halima M B, Karray H, Alimi A M. Arabic text recognition in video sequences[J]. arXiv preprint arXiv:1308.3243, 2013.
  • Zaghden N, Khelifi B, Alimi A M, et al. Text Recognition in both ancient and cartographic documents[J]. arXiv preprint arXiv:1308.6309, 2013.
  • Alsharif O, Pineau J. End-to-end text recognition with hybrid HMM maxout models[J]. arXiv preprint arXiv:1310.1811, 2013.
  • Louradour J, Kermorvant C. Curriculum learning for handwritten text line recognition[C]//Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on. IEEE, 2014: 56-60.
  • Goodfellow I J, Bulatov Y, Ibarz J, et al. Multi-digit number recognition from street view imagery using deep convolutional neural networks[J]. arXiv preprint arXiv:1312.6082, 2013.

2014

  • Bušta M, Drtina T, Helekal D, et al. Efficient character skew rectification in scene text images[C]//Asian Conference on Computer Vision. Springer, Cham, 2014: 134-146.
  • Almazán J, Gordo A, Fornés A, et al. Word spotting and recognition with embedded attributes[J]. IEEE transactions on pattern analysis and machine intelligence, 2014, 36(12): 2552-2566.

         code:[code]
  • Jaderberg M, Vedaldi A, Zisserman A. Deep features for text spotting[C]//European conference on computer vision. Springer, Cham, 2014: 512-528.

         code:[code]
  • Bluche T, Ney H, Kermorvant C. A comparison of sequence-trained deep neural networks and recurrent neural networks optical modeling for handwriting recognition[C]//International Conference on Statistical Language and Speech Processing. Springer, Cham, 2014: 199-210.
  • Yao C, Bai X, Liu W. A unified framework for multioriented text detection and recognition[J]. IEEE Transactions on Image Processing, 2014, 23(11): 4737-4749.
  • Huang W, Qiao Y, Tang X. Robust scene text detection with convolution neural network induced mser trees[C]//European Conference on Computer Vision. Springer, Cham, 2014: 497-511.
  • Bhowmick S, Banerjee P. Bangla text recognition from video sequence: A new focus[J]. arXiv preprint arXiv:1401.1190, 2014.
  • 【Synthetic data】Jaderberg M, Simonyan K, Vedaldi A, et al. Synthetic data and artificial neural networks for natural scene text recognition[J]. arXiv preprint arXiv:1406.2227, 2014.

         code:[model;official website]
  • Jaderberg M, Simonyan K, Vedaldi A, et al. Reading text in the wild with convolutional neural networks[J]. International Journal of Computer Vision, 2016, 116(1): 1-20.

         official website:[official website]
  • Jaderberg M, Simonyan K, Vedaldi A, et al. Deep structured output learning for unconstrained text recognition[J]. arXiv preprint arXiv:1412.5903, 2014.

2015

  • Kim B S, Koo H I, Cho N I. Document dewarping via text-line based optimization[J]. Pattern Recognition, 2015, 48(11): 3600-3614.
  • Ye Q, Doermann D. Text detection and recognition in imagery: A survey[J]. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(7): 1480-1500.
  • Jaderberg M. Deep learning for text spotting[D]. University of Oxford, 2015.
  • Ren X, Chen K, Yang X, et al. A new unsupervised convolutional neural network model for Chinese scene text detection[C]//Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on. IEEE, 2015: 428-432.
  • Wang Z, Yang J, Jin H, et al. Deepfont: Identify your font from an image[C]//Proceedings of the 23rd ACM international conference on Multimedia. ACM, 2015: 451-459.
  • Gomez L, Karatzas D. Object proposals for text extraction in the wild[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 206-210.[code]
  • Shi B, Yao C, Zhang C, et al. Automatic script identification in the wild[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 531-535.
  • Busta M, Neumann L, Matas J. Fastext: Efficient unconstrained scene text detector[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 1206-1214.[code]
  • Zhang Z, Shen W, Yao C, et al. Symmetry-based text line detection in natural scenes[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 2558-2567.

         code:[code]
  • Ray A, Rajeswar S, Chaudhury S. A hypothesize-and-verify framework for text recognition using deep recurrent neural networks[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 936-940.
  • Neumann L, Matas J. Efficient scene text localization and recognition with local character refinement[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 746-750.
  • Visin F, Kastner K, Cho K, et al. Renet: A recurrent neural network based alternative to convolutional networks[J]. arXiv preprint arXiv:1505.00393, 2015.
  • Zhong Z, Jin L, Xie Z. High performance offline handwritten chinese character recognition using googlenet and directional feature maps[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 846-850.

         code:[code]
  • 【CRNN】Shi B, Bai X, Yao C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(11): 2298-2304.

         code:【1 - official】; 【2 - crnn.pytorch】; 【3 - unfinished】; 【4 - crnn.pytorch-chinese】; 【5 - crnn+stn-tf】; 【6 - lstm+ctc】; 【7 - ctpn+crnn-merge-cannot-train】; 【8 - crnn-mnist-keras】; 【9 - crnn-tf】; 【10 - crnn-tf-could-be-better】; 【11 - crnn.mxnet】; 【12 - crnn-tf-estimators】; 【13 - crnn-attention-tf】; 【14 - crnn.caffe】; 【15 - chinese.ocr-ctpn+crnn-tf+pytorch】; 【16 - another.crnn-attentive pooling】; 【17 - crnn-tf-music】; 【18 - crnn-tf-developing】; 【19 - crnn-torch】; 【20 - crnn-tf-developing】; 【21 - chinese-ocr-keras】; 【22 - crnn-tf-developing】; 【23 - ctpn+crnn-cannot-train-7】; 【24 - crnn-pytorch】; 【25 - cnn+lstm+ctc-tf】; 【26 - crnn-tf-resnet】; 【27 - caffe_ocr】
  • He T, Huang W, Qiao Y, et al. Text-attentional convolutional neural network for scene text detection[J]. IEEE transactions on image processing, 2016, 25(6): 2529-2541.
  • Sahu D K, Sukhwani M. Sequence to sequence learning for optical character recognition[J]. arXiv preprint arXiv:1511.04176, 2015.
  • Hosseini-Asl E, Guha A. Similarity-based Text Recognition by Deeply Supervised Siamese Network[J]. arXiv preprint arXiv:1511.04397, 2015.
  • Wang D H, Wang H, Zhang D, et al. Robust Scene Text Recognition Using Sparse Coding based Features[J]. arXiv preprint arXiv:1512.08669, 2015.

2016

  • Yin X C, Zuo Z Y, Tian S, et al. Text detection, tracking and recognition in video: a comprehensive survey[J]. IEEE Transactions on Image Processing, 2016, 25(6): 2752-2773.
  • Zhu Y, Yao C, Bai X. Scene text detection and recognition: Recent advances and future trends[J]. Frontiers of Computer Science, 2016, 10(1): 19-36.
  • He P, Huang W, Qiao Y, et al. Reading Scene Text in Deep Convolutional Sequences[C]//AAAI. 2016: 3501-3508.

         code:[code]
  • Lee C Y, Osindero S. Recursive recurrent nets with attention modeling for OCR in the wild[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2231-2239.
  • 【Synthetic data】Gupta A, Vedaldi A, Zisserman A. Synthetic data for text localisation in natural images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2315-2324.

         code:[official;vgg;other]
  • Sivakorn S, Polakis J, Keromytis A D. I’m not a human: Breaking the Google reCAPTCHA[J]. Black Hat,(i), 2016: 1-12.
  • Sivakorn S, Polakis I, Keromytis A D. I am robot: (deep) learning to break semantic image captchas[C]//Security and Privacy (EuroS&P), 2016 IEEE European Symposium on. IEEE, 2016: 388-403.
  • Neumann L, Matas J. Real-time lexicon-free scene text localization and recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2016, 38(9): 1872-1885.
  • Zhang Z, Zhang C, Shen W, et al. Multi-oriented text detection with fully convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 4159-4167.
  • Fabrizio J, Robert-Seidowsky M, Dubuisson S, et al. TextCatcher: a method to detect curved and challenging text in natural scenes[J]. International Journal on Document Analysis and Recognition (IJDAR), 2016, 19(2): 99-117.
  • Cho H, Sung M, Jun B. Canny text detector: Fast and robust scene text localization algorithm[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 3566-3573.
  • Qiang G, Dan T, Guohui L, et al. Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition[J]. arXiv preprint arXiv:1601.01100, 2016.
  • Mishra A, Alahari K, Jawahar C V. Enhancing energy minimization framework for scene text recognition with top-down cues[J]. Computer Vision and Image Understanding, 2016, 145: 30-42.
  • Li H, Shen C. Reading car license plates using deep convolutional neural networks and lstms[J]. arXiv preprint arXiv:1601.05610, 2016.
  • Veit A, Matera T, Neumann L, et al. Coco-text: Dataset and benchmark for text detection and recognition in natural images[J]. arXiv preprint arXiv:1601.07140, 2016.
  • Huang W. Context modeling for semantic text matching and scene text detection[M]. The Pennsylvania State University, 2016.
  • Tian S, Pei W Y, Zuo Z Y, et al. Scene Text Detection in Video by Learning Locally and Globally[C]//IJCAI. 2016: 2647-2653.
  • Shi B, Wang X, Lyu P, et al. Robust scene text recognition with automatic rectification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 4168-4176.
  • Shuye Zhang, Mude Lin, Tianshui Chen, Lianwen Jin, Liang Lin. Character Proposal Network for Robust Text Extraction. arXiv preprint arXiv:1602.04348, 2016.
  • Lluis Gomez, Dimosthenis Karatzas. A fine-grained approach to scene text script identification. arXiv preprint arXiv:1602.07475, 2016.
  • Lluis Gomez, Anguelos Nicolaou, Dimosthenis Karatzas. Improving patch-based scene text script identification with ensembles of conjoined networks. arXiv preprint arXiv:1602.07480, 2016.
  • He T, Huang W, Qiao Y, et al. Accurate text localization in natural image with cascaded convolutional text network[J]. arXiv preprint arXiv:1603.09423, 2016.
  • Hafemann L G, Sabourin R, Oliveira L S. Writer-independent feature learning for offline signature verification using deep convolutional neural networks[C]//Neural Networks (IJCNN), 2016 International Joint Conference on. IEEE, 2016: 2576-2583.
  • Ren X, Chen K, Sun J. A CNN Based Scene Chinese Text Recognition Algorithm With Synthetic Data Engine[J]. arXiv preprint arXiv:1604.01891, 2016.
  • Xiaohang Ren, Kai Chen, Jun Sun. A Novel Scene Text Detection Algorithm Based On Convolutional Neural Network. arXiv preprint arXiv:1604.01894, 2016.
  • Gómez L, Karatzas D. Textproposals: a text-specific selective search algorithm for word spotting in the wild[J]. Pattern Recognition, 2017, 70: 60-74.[code]
  • Bluche T, Louradour J, Messina R. Scan, attend and read: End-to-end handwritten paragraph recognition with mdlstm attention[J]. arXiv preprint arXiv:1604.03286, 2016.
  • Zheng Zhang, Chengquan Zhang, Wei Shen, Cong Yao, Wenyu Liu, Xiang Bai. Multi-Oriented Text Detection with Fully Convolutional Networks. arXiv preprint arXiv:1604.04018, 2016.
  • Xie Z, Sun Z, Jin L, et al. Fully convolutional recurrent network for handwritten Chinese text recognition[C]//Pattern Recognition (ICPR), 2016 23rd International Conference on. IEEE, 2016: 4011-4016.
  • Shangxuan Tian, Yifeng Pan, Chang Huang, Shijian Lu, Kai Yu, Chew Lim Tan. Text Flow: A Unified Text Detection System in Natural Scene Images. arXiv preprint arXiv:1604.06877, 2016.
  • Zhong Z, Jin L, Zhang S, et al. Deeptext: A unified framework for text proposal generation and text detection in natural images[J]. arXiv preprint arXiv:1605.07314, 2016.
  • Zhang X Y, Yin F, Zhang Y M, et al. Drawing and recognizing chinese characters with recurrent neural network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
  • Yao C, Bai X, Sang N, et al. Scene text detection via holistic, multi-channel prediction[J]. arXiv preprint arXiv:1606.09002, 2016.
  • Hassanien A M A. Sequence to sequence learning for unconstrained scene text recognition[J]. arXiv preprint arXiv:1607.06125, 2016.
  • Nitigya Sambyal, Pawanesh Abrol. Automatic text extraction and character segmentation using maximally stable extremal regions. arXiv preprint arXiv:1608.03374, 2016.
  • 【Synthetic data】 Krishnan P, Jawahar C V. Generating Synthetic Data for Text Recognition[J]. arXiv preprint arXiv:1608.04224, 2016.
  • 【CTPN】Tian Z, Huang W, He T, et al. Detecting text in natural image with connectionist text proposal network[C]//European Conference on Computer Vision. Springer International Publishing, 2016: 56-72.

         code:[code;cuda8-caffe;official;ocr_detection_ctpn;keras_ocr]

         dataset:[ICDAR 2011; ICDAR 2013; ICDAR 2015; SWT; Multilingual dataset]
  • Xie Z, Sun Z, Jin L, et al. Learning spatial-semantic context with fully convolutional recurrent network for online handwritten chinese text recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2017.
  • Hu B, Liu X, Wu X, et al. Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition[J]. arXiv preprint arXiv:1610.04057, 2016.
  • Ahmed Ibrahim, A. Lynn Abbott, Mohamed E. Hussein. An Image Dataset of Text Patches in Everyday Scenes. arXiv preprint arXiv:1610.06494, 2016.
  • Lou X, Kansky K, Lehrach W, et al. Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data[C]//Advances in Neural Information Processing Systems. 2016: 2793-2801.
  • Xu Y, Shan S, Qiu Z, et al. End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance[J]. arXiv preprint arXiv:1611.06159, 2016.
  • Chengzhe Yan, Jie Hu, Changshui Zhang. A DNN Framework For Text Image Rectification From Planar Transformations. arXiv preprint arXiv:1611.04298, 2016.
  • Minghui Liao, Baoguang Shi, Xiang Bai, Xinggang Wang, Wenyu Liu. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. arXiv preprint arXiv:1611.06779, 2016.
  • Jie Mei, Aminul Islam, Yajing Wu, Abidalrahman Moh’d, Evangelos E. Milios. Statistical Learning for OCR Text Correction. arXiv preprint arXiv:1611.06950, 2016.
  • Yang X, He D, Huang W, et al. Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading[J]. arXiv preprint arXiv:1611.07385, 2016.
  • Junnan Yu, Xuna Ma, Ting Han. Usability Investigation on the Localization of Text CAPTCHAs: Take Chinese Characters as a Case Study. arXiv preprint arXiv:1612.01070, 2016.
  • Singh Vijendra, Nisha Vasudeva, Hem Jyotsana Parashar. Recognition of Text Image Using Multilayer Perceptron. arXiv preprint arXiv:1612.00625, 2016.
  • Zichuan Liu, Yixing Li, Fengbo Ren, Hao Yu. A Binary Convolutional Encoder-decoder Network for Real-time Natural Scene Text Processing. arXiv preprint arXiv:1612.03630, 2016.

2017

  • Kil T, Seo W, Koo H I, et al. Robust Document Image Dewarping Method Using Text-Lines and Line Segments[C]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2017, 1: 865-870.
    [code:xellows1305/Document-Image-Dewarping]
  • Raj D, Sahu S, Anand A. Learning local and global contexts using a convolutional recurrent network model for relation classification in biomedical text[C]//Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017). 2017: 311-321.

         code:[code]
  • Florian Fink, Klaus-U. Schulz, Uwe Springmann. Profiling of OCR’ed Historical Texts Revisited. arXiv preprint arXiv:1701.05377, 2017.
  • Cheang T K, Chong Y S, Tay Y H. Segmentation-free Vehicle License Plate Recognition using ConvNet-RNN[J]. arXiv preprint arXiv:1701.06439, 2017.
  • Shahin A A. Printed Arabic Text Recognition using Linear and Nonlinear Regression[J]. arXiv preprint arXiv:1702.01444, 2017.
  • Smith R, Gu C, Lee D S, et al. End-to-end interpretation of the french street name signs dataset[C]//European Conference on Computer Vision. Springer International Publishing, 2016: 411-426.

         code:[code]
  • Bazazian D, Gomez R, Nicolaou A, et al. Improving Text Proposals for Scene Images with Fully Convolutional Networks[J]. arXiv preprint arXiv:1702.05089, 2017.
  • 【synthetic Captcha】Le T A, Baydin A G, Zinkov R, et al. Using Synthetic Data to Train Neural Networks is Model-Based Reasoning[J]. arXiv preprint arXiv:1703.00868, 2017.
  • Jianqi Ma, Weiyuan Shao, Hao Ye, Li Wang, Hong Wang, Yingbin Zheng, Xiangyang Xue. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. arXiv preprint arXiv:1703.01086, 2017.
  • Liu Y, Jin L. Deep matching prior network: Toward tighter multi-oriented text detection[J]. arXiv preprint arXiv:1703.01425, 2017.
  • Shi B, Bai X, Belongie S. Detecting Oriented Text in Natural Images by Linking Segments[J]. arXiv preprint arXiv:1703.06520, 2017.

         code:[code]
  • Masood S Z, Shu G, Dehghan A, et al. License Plate Detection and Recognition Using Deeply Learned Convolutional Neural Networks[J]. arXiv preprint arXiv:1703.07330, 2017.
  • Liao M, Shi B, Bai X, et al. TextBoxes: A Fast Text Detector with a Single Deep Neural Network[C]//AAAI. 2017: 4161-4167.

         code:[code;code]
  • He W, Zhang X Y, Yin F, et al. Deep Direct Regression for Multi-Oriented Scene Text Detection[J]. arXiv preprint arXiv:1703.08289, 2017.
  • Qin S, Manduchi R. Cascaded Segmentation-Detection Networks for Word-Level Text Spotting[J]. arXiv preprint arXiv:1704.00834, 2017.
  • Zhou X, Yao C, Wen H, et al. EAST: An Efficient and Accurate Scene Text Detector[J]. arXiv preprint arXiv:1704.03155, 2017.

         code:[code]
  • Wojna Z, Gorban A, Lee D S, et al. Attention-based Extraction of Structured Information from Street View Imagery[J]. arXiv preprint arXiv:1704.03549, 2017.

         code:[official;similar]
  • Moysset B, Kermorvant C, Wolf C. Full-Page Text Recognition: Learning Where to Start and When to Stop[J]. arXiv preprint arXiv:1704.08628, 2017.
  • Nakamura T, Zhu A, Yanai K, et al. Scene Text Eraser[J]. arXiv preprint arXiv:1705.02772, 2017.
  • Xiao X, Yang Y, Ahmad T, et al. Design of a Very Compact CNN Classifier for Online Handwritten Chinese Character Recognition Using DropWeight and Global Pooling[J]. arXiv preprint arXiv:1705.05207, 2017.
  • Polzounov A, Ablavatski A, Escalera S, et al. WordFence: Text Detection in Natural Images with Border Awareness[J]. arXiv preprint arXiv:1705.05483, 2017.
  • Ghosh S K, Valveny E, Bagdanov A D. Visual attention models for scene text recognition[J]. arXiv preprint arXiv:1706.01487, 2017.
  • Lyu P, Bai X, Yao C, et al. Auto-Encoder Guided GAN for Chinese Calligraphy Synthesis[J]. arXiv preprint arXiv:1706.04041, 2017.
  • Shervin Minaee, Yao Wang. Text Extraction From Texture Images Using Masked Signal Decomposition. arXiv preprint arXiv:1706.08789, 2017.
  • Jiang Y, Zhu X, Wang X, et al. R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection[J]. arXiv preprint arXiv:1706.09579, 2017.
  • Ghosh S, Valveny E. R-PHOC: Segmentation-Free Word Spotting using CNN[J]. arXiv preprint arXiv:1707.01294, 2017.
  • Wang X, You M, Shen C. Adversarial generation of training examples for vehicle license plate recognition[J]. arXiv preprint arXiv:1707.03124, 2017.
  • Li H, Wang P, Shen C. Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks[J]. arXiv preprint arXiv:1707.03985, 2017.
  • Aneeshan Sain, Ayan Kumar Bhunia, Partha Pratim Roy, Umapada Pal. Multi-Oriented Text Detection and Verification in Video Frames and Scene Images. arXiv preprint arXiv:1707.07150, 2017.
  • Bhunia A K, Kumar G, Roy P P, et al. Text recognition in scene image and video frame using Color Channel selection[J]. Multimedia Tools and Applications, 2017: 1-28.
  • Partha Pratim Roy, Ayan Kumar Bhunia, Umapada Pal. Date-Field Retrieval in Scene Image and Video Frames using Text Enhancement and Shape Coding. arXiv preprint arXiv:1707.06833, 2017.
  • Bartz C, Yang H, Meinel C. STN-OCR: A single Neural Network for Text Detection and Text Recognition[J]. arXiv preprint arXiv:1707.08831, 2017.

         code:[code]
  • Jiang F, Hao Z, Liu X. Deep Scene Text Detection with Connected Component Proposals[J]. arXiv preprint arXiv:1708.05133, 2017.
  • Amarnath R, P. Nagabhushan. Spotting Separator Points at Line Terminals in Compressed Document Images for Text-line Segmentation. arXiv preprint arXiv:1708.05545, 2017.
  • P. Shivakumara, D. S. Guru, H.T. Basavaraju. Color and Gradient Features for Text Segmentation from Video Frames. arXiv preprint arXiv:1708.06561, 2017.
  • Hu H, Zhang C, Luo Y, et al. Wordsup: Exploiting word annotations for character based text detection[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017.
  • He P, Huang W, He T, et al. Single shot text detector with regional attention[C]//The IEEE International Conference on Computer Vision (ICCV). 2017.

         code:[code;code]
  • Yin F, Wu Y C, Zhang X Y, et al. Scene Text Recognition with Sliding Convolutional Character Models[J]. arXiv preprint arXiv:1709.01727, 2017.
  • Ekta Vats, Anders Hast. On-the-fly Historical Handwritten Text Annotation. arXiv preprint arXiv:1709.01775, 2017.
  • Cheng Z, Bai F, Xu Y, et al. Focusing Attention: Towards Accurate Text Recognition in Natural Images[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 5086-5094.
  • Dai Y, Huang Z, Gao Y, et al. Fused Text Segmentation Networks for Multi-oriented Scene Text Detection[J]. arXiv preprint arXiv:1709.03272, 2017.
  • Teresa Nicole Brooks. Exploring Geometric Property Thresholds For Filtering Non-Text Regions In A Connected Component Based Text Detection Application. arXiv preprint arXiv:1709.03548, 2017.
  • Yunze Gao, Yingying Chen, Jinqiao Wang, Hanqing Lu. Reading Scene Text with Attention Convolutional Sequence Modeling. arXiv preprint arXiv:1709.04303, 2017.
  • Li H, Wang P, Shen C. Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks[J]. arXiv preprint arXiv:1709.08828, 2017.
  • Kazem Qazanfari, Saeed Shiri. Real time text localization for Indoor Mobile Robot Navigation. arXiv preprint arXiv:1709.09634, 2017.
  • Zhan H, Wang Q, Lu Y. Handwritten digit string recognition by combination of residual network and RNN-CTC[C]//International Conference on Neural Information Processing. Springer, Cham, 2017: 583-591.
  • Yang C, Yin X C, Li Z, et al. AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition[J]. arXiv preprint arXiv:1710.03425, 2017.
  • Tian S, Lu S, Li C. WeText: Scene Text Detection under Weak Supervision[J]. arXiv preprint arXiv:1710.04826, 2017.
  • Kheng Chng C, Chan C S. Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition[J]. arXiv preprint arXiv:1710.10400, 2017.
  • Jain M, Mathew M, Jawahar C V. Unconstrained scene text and video text recognition for Arabic script[C]//Arabic Script Analysis and Recognition (ASAR), 2017 1st International Workshop on. IEEE, 2017: 26-30.
  • Ren H, Wang W. A New Hybrid-parameter Recurrent Neural Networks for Online Handwritten Chinese Character Recognition[J]. arXiv preprint arXiv:1711.02809, 2017.
  • Zhu X, Jiang Y, Yang S, et al. Deep Residual Text Detection Network for Scene Text[J]. arXiv preprint arXiv:1711.04147, 2017.
  • Cheng Z, Liu X, Bai F, et al. Arbitrarily-Oriented Text Recognition[J]. arXiv preprint arXiv:1711.04226, 2017.
  • Zhang S, Liu Y, Jin L, et al. Feature Enhancement Network: A Refined Scene Text Detector[J]. arXiv preprint arXiv:1711.04249, 2017.
  • Xing D, Li Z, Chen X, et al. ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene[J]. arXiv preprint arXiv:1711.11249, 2017.
  • Yuliang L, Lianwen J, Shuaitao Z, et al. Detecting Curve Text in the Wild: New Dataset and New Solution[J]. arXiv preprint arXiv:1712.02170, 2017.

         code:[code]
  • Jason Poulos, Rafael Valle. Attention networks for image-to-text. arXiv preprint arXiv:1712.04046, 2017.
  • Aarushi Agrawal, Prerana Mukherjee, Siddharth Srivastava, Brejesh Lall. Enhanced Characterness for Text Detection in the Wild. arXiv preprint arXiv:1712.04927, 2017.
  • Bartz C, Yang H, Meinel C. SEE: Towards Semi-Supervised End-to-End Scene Text Recognition[J]. arXiv preprint arXiv:1712.05404, 2017.
  • Kang C, Kim G, Yoo S I. Detection and Recognition of Text Embedded in Online Images via Neural Context Models[C]//AAAI. 2017: 4103-4110.

         code:[code]
  • Busta M, Neumann L, Matas J. Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2204-2212.[code]
  • Wu Y, Natarajan P. Self-organized Text Detection with Minimal Post-processing via Border Learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 5000-5009.
  • Rong X, Yi C, Tian Y. Unambiguous text localization and retrieval for cluttered scenes[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 3279-3287.

2018

  • Deng D, Liu H, Li X, et al. PixelLink: Detecting Scene Text via Instance Segmentation[J]. arXiv preprint arXiv:1801.01315, 2018.
  • Agnese Chiatti, Mu Jung Cho, Anupriya Gagneja, Xiao Yang, Miriam Brinberg, Katie Roehrick, Sagnik Ray Choudhury, Nilam Ram, Byron Reeves, C. Lee Giles. Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media. arXiv preprint arXiv:1801.01316, 2018.
  • Liu X, Liang D, Yan S, et al. FOTS: Fast Oriented Text Spotting with a Unified Network[J]. arXiv preprint arXiv:1801.01671, 2018.
  • Liao M, Shi B, Bai X. TextBoxes++: A Single-Shot Oriented Scene Text Detector[J]. arXiv preprint arXiv:1801.02765, 2018.
  • Anders Hast, Per Cullhed, Ekta Vats. TexT - Text Extractor Tool for Handwritten Document Transcription and Annotation. arXiv preprint arXiv:1801.05367, 2018.
  • Yash Patel, Michal Bušta, Jiri Matas. E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text. arXiv preprint arXiv:1801.09919, 2018.
  • Yixing Zhu, Jun Du. Sliding Line Point Regression for Shape Robust Scene Text Detection. arXiv preprint arXiv:1801.09969, 2018.
  • Tobias Grüning, Gundram Leifert, Tobias Strauß, Roger Labahn. A Two-Stage Method for Text Line Detection in Historical Documents. arXiv preprint arXiv:1802.03345, 2018.
  • Congzheng Song, Vitaly Shmatikov. Fooling OCR Systems with Adversarial Text Images. arXiv preprint arXiv:1802.05385, 2018.
  • Pengyuan Lyu, Cong Yao, Wenhao Wu, Shuicheng Yan, Xiang Bai. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. arXiv preprint arXiv:1802.08948, 2018.
  • Tai-Ling Yuan, Zhe Zhu, Kun Xu, Cheng-Jun Li, Shi-Min Hu. Chinese Text in the Wild. arXiv preprint arXiv:1803.00085, 2018.
  • Liao M, Zhu Z, Shi B, et al. Rotation-Sensitive Regression for Oriented Scene Text Detection[J]. arXiv preprint arXiv:1803.05265, 2018.
  • Carbonell M, Villegas M, Fornés A, et al. Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model[J]. arXiv preprint arXiv:1803.06252, 2018.
  • Goswami T, Barad Z, Desai P, et al. Text Detection and Recognition in images: A survey[J]. arXiv preprint arXiv:1803.07278, 2018.
  • José Carlos Aradillas, Juan José Murillo-Fuentes, Pablo M. Olmos. Boosting Handwriting Text Recognition in Small Databases with Transfer Learning[J]. arXiv preprint arXiv:1803.01527, 2018.
  • Linjie Deng, Yanxiang Gong, Yi Lin, Jingwen Shuai, Xiaoguang Tu, Yufei Zhang, Zheng Ma, Mei Xie. Detecting Multi-Oriented Text with Corner-based Region Proposals[J]. arXiv preprint arXiv:1804.02690, 2018.
  • Partha Pratim Roy, Akash Mohta, Bidyut B. Chaudhuri. Synthetic data generation for Indic handwritten text recognition[J]. arXiv preprint arXiv:1804.06254, 2018.
  • Dafang He, Yeqing Li, Alexander Gorban, Derrall Heath, Julian Ibarz, Qian Yu, Daniel Kifer, C. Lee Giles. Guided Attention for Large Scale Scene Text Verification[J]. arXiv preprint arXiv:1804.08588, 2018.
  • Zhuoyao Zhong, Lei Sun, Qiang Huo. An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches[J]. arXiv preprint arXiv:1804.09003, 2018.
  • 【Alibaba】Qiangpeng Yang, Mengli Cheng, Wenmeng Zhou, Yan Chen, Minghui Qiu, Wei Lin, Wei Chu. IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection[J]. arXiv preprint arXiv:1805.01167, 2018.
  • Francisco Cruz, Oriol Ramos Terrades. A probabilistic framework for handwritten text line segmentation[J]. arXiv preprint arXiv:1805.02536, 2018.
  • Fan Bai, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Shuigeng Zhou. Edit Probability for Scene Text Recognition[J]. arXiv preprint arXiv:1805.03384, 2018.
  • Xiaoyu Yue, Zhanghui Kuang, Zhaoyang Zhang, Zhenfang Chen, Pan He, Yu Qiao, Wei Zhang. Boosting up Scene Text Detectors with Guided CNN[J]. arXiv preprint arXiv:1805.04132, 2018.
  • Zichuan Liu, Guosheng Lin, Sheng Yang, Jiashi Feng, Weisi Lin, Wang Ling Goh. Learning Markov Clustering Networks for Scene Text Detection[J]. arXiv preprint arXiv:1805.08365, 2018.
  • Yi-Chao Wu, Fei Yin, Xu-Yao Zhang, Li Liu, Cheng-Lin Liu. SCAN: Sliding Convolutional Attention Network for Scene Text Recognition[J]. arXiv preprint arXiv:1806.00578, 2018.
  • Fenfen Sheng, Zhineng Chen, Bo Xu. NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition[J]. arXiv preprint arXiv:1806.00926, 2018.
  • Xiang Li, Wenhai Wang, Wenbo Hou, Ruo-Ze Liu, Tong Lu, Jian Yang. Shape Robust Text Detection with Progressive Scale Expansion Network[J]. arXiv preprint arXiv:1806.02559, 2018.
  • Sauradip Nag, Pallab Kumar Ganguly, Sumit Roy, Sourab Jha, Krishna Bose, Abhishek Jha, Kousik Dasgupta. Offline Extraction of Indic Regional Language from Natural Scene Image using Text Segmentation and Deep Convolutional Sequence[J]. arXiv preprint arXiv:1806.06208, 2018.
  • Arka Ujjal Dey, Suman K. Ghosh, Ernest Valveny. Don’t only Feel Read: Using Scene text to understand advertisements[J]. arXiv preprint arXiv:1806.08279, 2018.
  • Shangbang Long, Jiaqiang Ruan, Wenjie Zhang, Xin He, Wenhao Wu, Cong Yao. TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes[J]. arXiv preprint arXiv:1807.01544, 2018.
  • Qi Yuan, Bingwang Zhang, Haojie Li, Zhihui Wang, Zhongxuan Luo. A Single Shot Text Detector with Scale-adaptive Anchors[J]. arXiv preprint arXiv:1807.01884, 2018.
  • Pengyuan Lyu, Minghui Liao, Cong Yao, Wenhao Wu, Xiang Bai. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes[J]. arXiv preprint arXiv:1807.02242, 2018.
  • Fangneng Zhan, Shijian Lu, Chuhui Xue. Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes[J]. arXiv preprint arXiv:1807.03021, 2018.
  • Xiaoyong Yuan, Pan He, Xiaolin Andy Li. Adaptive Adversarial Attack on Scene Text Recognition[J]. arXiv preprint arXiv:1807.03326, 2018.
  • Chuhui Xue, Shijian Lu, Fangneng Zhan. Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping[J]. arXiv preprint arXiv:1807.03547, 2018.
  • Arindam Chowdhury, Lovekesh Vig. An Efficient End-to-End Neural Model for Handwritten Text Recognition[J]. arXiv preprint arXiv:1807.07965, 2018.
  • Yuting Gao, Zheng Huang, Yuchen Dai. Double Supervised Network with Attention Mechanism for Scene Text Recognition[J]. arXiv preprint arXiv:1808.00677, 2018.
  • Wenchao Wang, Jun Du, Zi-Rui Wang. Parsimonious HMMs for Offline Handwritten Chinese Text Recognition[J]. arXiv preprint arXiv:1808.04138, 2018.
  • Lluís Gómez, Andrés Mafla, Marçal Rusiñol, Dimosthenis Karatzas. Single Shot Scene Text Retrieval[J]. arXiv preprint arXiv:1808.09044, 2018.
  • Dafang He, Xiao Yang, Daniel Kifer, C. Lee Giles. TextContourNet: a Flexible and Effective Framework for Improving Scene Text Detection Architecture with a Multi-task Cascade[J]. arXiv preprint arXiv:1809.03050, 2018.
  • Minghui Liao, Jian Zhang, Zhaoyi Wan, Fengming Xie, Jiajun Liang, Pengyuan Lyu, Cong Yao, Xiang Bai. Scene Text Recognition from Two-Dimensional Perspective[J]. arXiv preprint arXiv:1809.06508, 2018.
  • Mayank Gupta, Abhinav Kumar, Sriganesh Madhvanath. Parametric Synthesis of Text on Stylized Backgrounds using PGGANs[J]. arXiv preprint arXiv:1809.08488, 2018.
  • Saad Bin Ahmed, Saeeda Naz, Muhammad Imran Razzak, Rubiyah Yusof. Cursive Scene Text Analysis by Deep Convolutional Linear Pyramids[J]. arXiv preprint arXiv:1809.10792, 2018.
  • Zichuan Liu, Guosheng Lin, Wang Ling Goh, Fayao Liu, Chunhua Shen, Xiaokang Yang. Correlation Propagation Networks for Scene Text Detection[J]. arXiv preprint arXiv:1810.00304, 2018.
  • Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró. Visual Semantic Re-ranker for Text Spotting[J]. arXiv preprint arXiv:1810.09776, 2018.
  • Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró. Visual Re-ranking with Natural Language Understanding for Text Spotting[J]. arXiv preprint arXiv:1810.12738, 2018.
  • Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang. Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition[J]. arXiv preprint arXiv:1811.00751, 2018.
  • Shangbang Long, Xin He, Cong Yao. Scene Text Detection and Recognition: The Deep Learning Era[J]. arXiv preprint arXiv:1811.04256, 2018.
  • Jing Huang, Viswanath Sivakumar, Mher Mnatsakanyan, Guan Pang. Improving Rotated Text Detection with Rotation Region Proposal Networks[J]. arXiv preprint arXiv:1811.07031, 2018.
  • Yuan Li, Yuanjie Yu, Zefeng Li, Yangkun Lin, Meifang Xu, Jiwei Li, Xi Zhou. Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks[J]. arXiv preprint arXiv:1811.07432, 2018.
  • Wanchen Sui, Qing Zhang, Jun Yang, Wei Chu. A Novel Integrated Framework for Learning both Text Detection and Recognition[J]. arXiv preprint arXiv:1811.08611, 2018.
  • Zhida Huang, Zhuoyao Zhong, Lei Sun, Qiang Huo. Mask R-CNN with Pyramid Attention Network for Scene Text Detection[J]. arXiv preprint arXiv:1811.09058, 2018.
  • Dinh NguyenVan, Shijian Lu, Shangxuan Tian, Nizar Ouarti, Mounir Mokhtari. A pooling based scene text proposal technique for scene text reading in the wild[J]. arXiv preprint arXiv:1811.10003, 2018.
  • Hanh T. M. Tran, Tien Ho-Phuoc. Deep Laplacian Pyramid Network for Text Images Super-Resolution[J]. arXiv preprint arXiv:1811.10449, 2018.
  • Yixing Zhu, Jun Du. TextMountain: Accurate Scene Text Detection via Instance Segmentation[J]. arXiv preprint arXiv:1811.12786, 2018.
  • Shuaitao Zhang, Yuliang Liu, Lianwen Jin, Yaoxiong Huang, Songxuan Lai. EnsNet: Ensconce Text in the Wild[J]. arXiv preprint arXiv:1812.00723, 2018.
  • Yongchao Xu, Yukang Wang, Wei Zhou, Yongpan Wang, Zhibo Yang, Xiang Bai. TextField: Learning A Deep Direction Field for Irregular Scene Text Detection[J]. arXiv preprint arXiv:1812.01393, 2018.
  • Najoua Rahal, Maroua Tounsi, Adel M. Alimi. Auto-Encoder-BoF/HMM System for Arabic Text Recognition[J]. arXiv preprint arXiv:1812.03680, 2018.
  • 【Dataset】Masakazu Iwamura. Advances of Scene Text Datasets[J]. arXiv preprint arXiv:1812.05219, 2018.
  • Fangneng Zhan, Shijian Lu. ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification[J]. arXiv preprint arXiv:1812.05824, 2018.
  • Shuai Yang, Jiaying Liu, Wenjing Wang, Zongming Guo. TET-GAN: Text Effects Transfer via Stylization and Destylization[J]. arXiv preprint arXiv:1812.06384, 2018.
  • Chankyu Choi, Youngmin Yoon, Junsu Lee, Junseok Kim. Simultaneous Recognition of Horizontal and Vertical Text in Natural Images[J]. arXiv preprint arXiv:1812.07059, 2018.
  • Yunze Gao, Yingying Chen, Jinqiao Wang, Zhen Lei, Xiao-Yu Zhang, Hanqing Lu. Recurrent Calibration Network for Irregular Text Recognition[J]. arXiv preprint arXiv:1812.07145, 2018.
  • Zi-Rui Wang, Jun Du, Jia-Ming Wang. Writer-Aware CNN for Parsimonious HMM-Based Offline Handwritten Chinese Text Recognition[J]. arXiv preprint arXiv:1812.09809, 2018.
  • Yipeng Sun, Chengquan Zhang, Zuming Huang, Jiaming Liu, Junyu Han, Errui Ding. TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network[J]. arXiv preprint arXiv:1812.09900, 2018.
  • Mohamed Yousef, Khaled F. Hussain, Usama S. Mohammed. Accurate, Data-Efficient, Unconstrained Text Recognition with Convolutional Neural Networks[J]. arXiv preprint arXiv:1812.11894, 2018.

2019

  • Jiaming Liu, Chengquan Zhang, Yipeng Sun, Junyu Han, Errui Ding. Detecting Text in the Wild with Deep Character Embedding Network[J]. arXiv preprint arXiv:1901.00363, 2019.
  • Chuhui Xue, Shijian Lu, Wei Zhang. MSR: Multi-Scale Shape Regression for Scene Text Detection[J]. arXiv preprint arXiv:1901.02596, 2019.
  • 【MORAN】Canjie Luo, Lianwen Jin, Zenghui Sun. A Multi-Object Rectified Attention Network for Scene Text Recognition[J]. arXiv preprint arXiv:1901.03003, 2019.
    [code: Canjie-Luo/MORAN_v2]
  • Wei Liu, Chaofeng Chen, Kwan-Yee K. Wong. SAFE: Scale Aware Feature Encoder for Scene Text Recognition[J]. arXiv preprint arXiv:1901.05770, 2019.
  • Yanxiang Gong, Linjie Deng, Zheng Ma, Mei Xie. Generating Text Sequence Images for Recognition[J]. arXiv preprint arXiv:1901.06782, 2019.
  • Fangneng Zhan, Hongyuan Zhu, Shijian Lu. Scene Text Synthesis for Efficient and Effective Deep Network Training[J]. arXiv preprint arXiv:1901.09193, 2019.
  • Amarnath R, P Nagabhushan. Text line Segmentation in Compressed Representation of Handwritten Document using Tunneling Algorithm[J]. arXiv preprint arXiv:1901.11477, 2019.
  • Eloi Alonso, Bastien Moysset, Ronaldo Messina. Adversarial Generation of Handwritten Text Images Conditioned on Sequences[J]. arXiv preprint arXiv:1903.00277, 2019.
  • Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal. STEFANN: Scene Text Editor using Font Adaptive Neural Network[J]. arXiv preprint arXiv:1903.01192, 2019.
  • Zhanzhan Cheng, Jing Lu, Jianwen Xie, Yi Niu, Shiliang Pu, Fei Wu. Efficient Video Scene Text Spotting: Unifying Detection, Tracking, and Recognition[J]. arXiv preprint arXiv:1903.03299, 2019.
  • Bastien Moysset, Ronaldo Messina. Manifold Mixup improves text recognition with CTC loss[J]. arXiv preprint arXiv:1903.04246, 2019.
  • Johannes Michael, Roger Labahn, Tobias Grüning, Jochen Zöllner. Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition[J]. arXiv preprint arXiv:1903.07377, 2019.
  • Zichuan Liu, Guosheng Lin, Sheng Yang, Fayao Liu, Weisi Lin, Wang Ling Goh. Towards Robust Curve Text Detection with Conditional Spatial Expansion[J]. arXiv preprint arXiv:1903.08836, 2019.
  • Zhao Zhou, Shufan Wu, Shuchen Kong, Yingbin Zheng, Hao Ye, Luhui Chen, Jian Pu. Curve Text Detection with Local Segmentation Network and Curve Connection[J]. arXiv preprint arXiv:1903.09837, 2019.
  • 【Dataset】Chongsheng Zhang, Guowen Peng, Yuefeng Tao, Feifei Fu, Wei Jiang, George Almpanidis, Ke Chen. ShopSign: a Diverse Scene Text Dataset of Chinese Shop Signs in Street Views[J]. arXiv preprint arXiv:1903.10412, 2019.
  • Jingchao Liu, Xuebo Liu, Jie Sheng, Ding Liang, Xin Li, Qingjie Liu. Pyramid Mask Text Detector[J]. arXiv preprint arXiv:1903.11800, 2019.
  • Xiaohui Zhao, Zhuo Wu, Xiaoguang Wang. CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor[J]. arXiv preprint arXiv:1903.12363, 2019.
  • Wenhai Wang, Enze Xie, Xiang Li, Wenbo Hou, Tong Lu, Gang Yu, Shuai Shao. Shape Robust Text Detection with Progressive Scale Expansion Network[J]. arXiv preprint arXiv:1903.12473, 2019.
  • Yuliang Liu, Lianwen Jin, Zecheng Xie, Canjie Luo, Shuaitao Zhang, Lele Xie. Tightness-aware Evaluation Protocol for Scene Text Detection[J]. arXiv preprint arXiv:1904.00813, 2019.
  • 【Dataset】Simone Bonechi, Paolo Andreini, Monica Bianchini, Franco Scarselli. COCO_TS Dataset: Pixel-level Annotations Based on Weak Supervision for Scene Text Segmentation[J]. arXiv preprint arXiv:1904.00818, 2019.
  • Peng Wang, Lu Yang, Hui Li, Yuyan Deng, Chunhua Shen, Yanning Zhang. A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition[J]. arXiv preprint arXiv:1904.01375, 2019.
  • Jeonghun Baek, Geewook Kim, Junyeop Lee, Sungrae Park, Dongyoon Han, Sangdoo Yun, Seong Joon Oh, Hwalsuk Lee. What is wrong with scene text recognition model comparisons? dataset and model analysis[J]. arXiv preprint arXiv:1904.01906, 2019.
  • Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee. Character Region Awareness for Text Detection[J]. arXiv preprint arXiv:1904.01941, 2019.
  • Chengquan Zhang, Borong Liang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding, Xinghao Ding. Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes[J]. arXiv preprint arXiv:1904.06535, 2019.
  • Vinoj Jayasundara, Sandaru Jayasekara, Hirunima Jayasekara, Jathushan Rajasegaran, Suranga Seneviratne, Ranga Rodrigo. TextCaps: Handwritten Character Recognition with Very Small Datasets[J]. arXiv preprint arXiv:1904.08095, 2019.
  • R. Reeve Ingle, Yasuhisa Fujii, Thomas Deselaers, Jonathan Baccash, Ashok C. Popat. A Scalable Handwritten Text Recognition System[J]. arXiv preprint arXiv:1904.09150, 2019.
  • Qingqing Wang, Wenjing Jia, Xiangjian He, Yue Lu, Michael Blumenstein, Ye Huang. FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition[J]. arXiv preprint arXiv:1904.09405, 2019.
  • Fady Medhat, Mahnaz Mohammadi, Sardar Jaf, Chris G. Willcocks, Toby P. Breckon, Peter Matthews, Andrew Stephen McGough, Georgios Theodoropoulos, Boguslaw Obara. TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text[J]. arXiv preprint arXiv:1904.12387, 2019.
  • Weijia Wu, Jici Xing, Hong Zhou. TextCohesion: Detecting Text for Arbitrary Shapes[J]. arXiv preprint arXiv:1904.12640, 2019.

Datasets

Three websites maintain lists of datasets for different data types:
1 - www.iapr-tc11.org
2 - tc11.cvc.uab.es
3 - rrc.cvc.uab.es

  • 2017 COCO-Text
    2017 DeTEXT
    2017 DOST
    2017 FSNS
    2017 MLT
    2017 IEHHR
    2011-2015 Born-Digital Images
    2013-2015 Focused Scene Text
    2013-2015 Text in Videos
    2015 Incidental Scene Text

  • ICDAR Chinese 2017

    • more than 12,000 images. Most of the images are collected in the wild by phone cameras.
    • Task: Chinese Text in the Wild.
  • Chinese Text in the Wild 2017

    • 32,285 high resolution images, 1,018,402 character instances, 3,850 character categories, 6 kinds of attributes
  • Total-Text 2017

    • 1,555 images, 11,459 text instances, including curved text
  • SCUT_FORU_DB_Release 2016

    • FORU contains two parts: the Chinese2k and English2k datasets.
  • SynthText in the Wild Dataset 2016

    • 800 thousand images, 8 million synthetic word instances.
    • Each text instance is annotated with its text string and with word-level and character-level bounding boxes.
  • COCO-Text (Computer Vision Group, Cornell) 2016

    • 63,686 images, 173,589 text instances, 3 fine-grained text attributes.
    • Task: text location and recognition
    • COCO-Text API
  • USTB-SV1k 2014

    • 1000 street view (patch) images (500 for training and 500 for testing) from 6 US cities
  • Synthetic Word Dataset (Oxford, VGG) 2014

    • 9 million images covering 90k English words
    • Task: text recognition, segmentation
    • download
  • IIIT 5K-Words 2012

    • 5000 images from scene text and born-digital sources (2k training and 3k testing images)
    • Each image is a cropped word image of scene text with case-insensitive labels
    • Task: text recognition
    • download
  • StanfordSynth (Stanford, AI Group) 2012

    • Small single-character images of 62 characters (0-9, a-z, A-Z)
    • Task: text recognition
    • download
  • MSRA Text Detection 500 Database (MSRA-TD500) 2012

    • 500 natural images (resolutions of the images vary from 1296x864 to 1920x1280)
    • Chinese, English or mixture of both
    • Task: text detection
  • OSTD 2011

    • Download link not found
  • Traffic Guide Panel Text Dataset (TGPT) 2016

    • 3841 high-resolution individual images: 2315 with traffic-guide-panel-level annotations (1911 for training and 404 for testing; all testing images are manually labeled with tight ground-truth text-region bounding boxes) and 1526 containing no traffic signs.
  • Street View Text (SVT) 2010

    • 350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
    • Only word level bounding boxes are provided with case-insensitive labels
    • Task: text location
  • KAIST Scene_Text Database 2010

    • 3000 images of indoor and outdoor scenes containing text
    • Korean, English (Number), and Mixed (Korean + English + Number)
    • Task: text location, segmentation and recognition
  • Chars74k 2009

    • Over 74K images from natural images, as well as a set of synthetically generated characters
    • Small single-character images of 62 characters (0-9, a-z, A-Z)
    • Task: text recognition
  • ICDAR Benchmark Datasets

Dataset | Description | Competition Paper
ICDAR 2015 | 1000 training images and 500 testing images | paper link
ICDAR 2013 | 229 training images and 233 testing images | paper link
ICDAR 2011 | 229 training images and 255 testing images | paper link
ICDAR 2005 | 1001 training images and 489 testing images | paper link
ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper link
