Journey of scene text components recognition: Progress and open issues

Sengupta, Payel; Mollah, Ayatullah Faruk

doi:10.1007/s11042-020-09862-x

Cited by 5 publications

(3 citation statements)

References 66 publications

“…Several architectures have been proposed over the years for recognition of scene text [40,12,53] and handwritten text [58,43]. Throughout this work, we focus on a general text recognition framework, which was proposed by [3].…”

Section: Text Recognition Background (mentioning)

confidence: 99%

Sequence-to-Sequence Contrastive Learning for Text Recognition

Aberdam, Litman, Tsiper et al. 2021

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Section: Text Recognition Background (mentioning)

confidence: 99%

Sequence-to-Sequence Contrastive Learning for Text Recognition

Aberdam, Litman, Tsiper et al. 2020

Preprint

We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual representations, which we apply to text recognition. To account for the sequence-to-sequence structure, each feature map is divided into different instances over which the contrastive loss is computed. This operation enables us to contrast at a sub-word level, where from each image we extract several positive pairs and multiple negative examples. To yield effective visual representations for text recognition, we further suggest novel augmentation heuristics, different encoder architectures and custom projection heads. Experiments on handwritten text and on scene text show that when a text decoder is trained on the learned representations, our method outperforms non-sequential contrastive methods. In addition, when the amount of supervision is reduced, SeqCLR significantly improves performance compared with supervised training, and when fine-tuned with 100% of the labels, our method achieves state-of-the-art results on standard handwritten text recognition benchmarks.
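The idea of dividing a feature map into instances and contrasting them at a sub-word level can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how instances are formed or which loss is used, so the average pooling over contiguous windows, the NT-Xent-style loss, and the names `split_into_instances` and `nt_xent` are all assumptions made here for illustration.

```python
import math


def split_into_instances(frames, num_instances):
    """Average-pool a sequence of frame vectors into contiguous windows.

    Assumption: one "instance" is the mean of a window of adjacent frames.
    """
    n = len(frames)
    if not 0 < num_instances <= n:
        raise ValueError("num_instances must be between 1 and len(frames)")
    dim = len(frames[0])
    instances = []
    for k in range(num_instances):
        window = frames[k * n // num_instances:(k + 1) * n // num_instances]
        instances.append(
            [sum(f[d] for f in window) / len(window) for d in range(dim)]
        )
    return instances


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def nt_xent(view_a, view_b, temperature=0.5):
    """NT-Xent-style contrastive loss over instances from two augmented views.

    view_a[i] and view_b[i] form a positive pair; every other instance
    in either view acts as a negative for that pair.
    """
    z = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of instance i
        positive = cosine(z[i], z[j]) / temperature
        others = [cosine(z[i], z[k]) / temperature for k in range(2 * n) if k != i]
        total += math.log(sum(math.exp(s) for s in others)) - positive
    return total / (2 * n)


# Toy frame sequence: 8 frames of dimension 2, pooled into 4 instances.
frames = [[1, 0], [1, 0], [0, 1], [0, 1], [-1, 0], [-1, 0], [0, -1], [0, -1]]
instances = split_into_instances(frames, 4)
aligned_loss = nt_xent(instances, instances)
misaligned_loss = nt_xent(instances, instances[::-1])
```

In this toy setting the loss is lower when the two views are aligned instance by instance than when the second view is reversed, which is the behaviour a sequence-level contrastive objective is meant to reward.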

“…It can play a crucial role in both general scene understanding and in specific applications, such as autonomous driving [1] and robotic navigation [2]. Significant progress has been made in STR-based research since the development of deep learning [3–6], object detection [7, 8], and text detection [9–12]. However, a range of circumstances, such as uneven illumination, poor image quality, perspective distortion, and orientation, continue to make recognizing texts from real images challenging.…”

Section: Introduction (mentioning)

confidence: 99%

Multilingual semantic fusion network for text recognition in the wild

et al. 2023

Most current approaches in the literature of scene text recognition train the language model via a text dataset far sparser than in natural language processing, resulting in inadequate training. Therefore, we propose a simple transformer encoder-decoder model called the multilingual semantic fusion network (MSFN) that can leverage prior linguistic knowledge to learn robust language features. First, we label the text dataset with forward, backward sequences, and subwords, which are extracted by tokenization with linguistic information. Then we introduce a multilingual model to the decoder corresponding to three different channels of the labeled dataset. The final output is fused by different channels to get more accurate results. In experiments, MSFN achieves cutting-edge performance across six benchmark datasets, and extensive ablative studies have proven the effectiveness of the proposed method. Code is available at https://github.com/lclee0577/MLViT.
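The three label channels mentioned in the abstract (forward sequence, backward sequence, and subwords) can be pictured with a minimal sketch. The abstract does not say which tokenizer is used, so the greedy longest-match segmentation below is only a stand-in, and `greedy_subwords`, `make_labels`, and the toy vocabulary are hypothetical names introduced for illustration rather than anything taken from the MSFN code.

```python
def greedy_subwords(word, vocab):
    """Toy longest-match-first segmentation (a stand-in tokenizer).

    Falls back to a single character when no vocabulary entry matches.
    """
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces


def make_labels(word, vocab):
    """Build the three label channels for one word image transcription."""
    return {
        "forward": list(word),         # characters, left to right
        "backward": list(word[::-1]),  # characters, right to left
        "subword": greedy_subwords(word, vocab),
    }


# Hypothetical vocabulary, for illustration only.
toy_vocab = {"park", "ing"}
labels = make_labels("parking", toy_vocab)
```

How the decoder fuses the three channels into a final prediction is not described in the abstract, so this sketch stops at label construction.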
