2021
DOI: 10.48550/arxiv.2102.10884
Preprint

Revisiting Classification Perspective on Scene Text Recognition

Hongxiang Cai,
Jun Sun,
Yichao Xiong

Abstract: The prevalent perspectives on scene text recognition are sequence to sequence (seq2seq) and segmentation. In this paper, we propose a new perspective on scene text recognition, in which we model scene text recognition as an image classification problem. Based on the image classification perspective, a scene text recognition model named CSTR is proposed. The CSTR model consists of a series of convolutional layers and a global average pooling layer at the end, followed by independent multi-c…
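
The classification view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the truncated phrase refers to one independent classifier per character position, and it replaces the convolutional layers with a hand-written feature map. The names `global_average_pool`, `linear_head`, `predict_word`, and the padding symbol `_` are hypothetical.

```python
def global_average_pool(feature_map):
    """Average a C x H x W feature map over H and W, giving a length-C vector."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def linear_head(vector, weights, bias):
    """One classification head: logits[k] = sum_c weights[k][c] * vector[c] + bias[k]."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def predict_word(feature_map, heads, charset):
    """Pool once, then let each head pick the character for its own position."""
    pooled = global_average_pool(feature_map)
    chars = []
    for weights, bias in heads:
        logits = linear_head(pooled, weights, bias)
        best = max(range(len(logits)), key=logits.__getitem__)
        chars.append(charset[best])
    return "".join(chars)


# Toy stand-in for the output of the convolutional layers (2 channels, 2 x 2).
FEATURES = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, 0.0], [0.0, 4.0]]]

# Assumed character set; "_" is a hypothetical padding class for unused positions.
CHARSET = "ab_"

# Three independent heads (one per character position), with hand-set weights.
HEADS = [
    ([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[0.0, 1.0], [1.0, 0.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], [0.0, 0.0, 1.0]),
]

WORD = predict_word(FEATURES, HEADS, CHARSET).rstrip("_")
```

Every head reads the same pooled vector and makes its decision without looking at the other positions, which is the property that distinguishes this view from seq2seq decoding. In a trained model the weights would be learned rather than set by hand.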

Cited by 2 publications (2 citation statements)
References 31 publications
“…In order to find a suitable STR component for our pipeline, we compare the performance of several methods proposed over the last few years. Specifically, we compare CRNN [19], TPS-ResNet-BiLSTM-CTC (TRBC), TPS-ResNet-BiLSTM-Attn (TRBA), SARN [9], CSTR [2] and the recognition module of EasyOCR (based on CRNN). All these methods except for the EasyOCR recognition module are trained on Synthtext [6] to allow for a fair comparison.…”
Section: Scene-text Recognition (mentioning)
confidence: 99%
“…The PhotoOCR application in [12] demonstrated the use of deep neural networks without convolutional operations for character recognition with raw and edge-based feature representations. The recent work by Cai et al [14] explored the image classification methodology used for scene text recognition using convolutional neural architecture. The authors in [15], demonstrated an early application of recurrent neural networks for modeling scene text using orientation features.…”
Section: Literature Survey (mentioning)
confidence: 99%