2022
DOI: 10.48550/arxiv.2208.03364
Preprint

GLASS: Global to Local Attention for Scene-Text Spotting

Abstract: In recent years, the dominant paradigm for text spotting is to combine the tasks of text detection and recognition into a single end-to-end framework. Under this paradigm, both tasks are accomplished by operating over a shared global feature map extracted from the input image. Among the main challenges that end-to-end approaches face is the performance degradation when recognizing text across scale variations (smaller or larger text), and arbitrary word rotation angles. In this work, we address these challenges…

citations
Cited by 2 publications
(4 citation statements)
references
References 41 publications
(111 reference statements)
“…As shown in Table 2, we have summarized various text spotting methods. We see that A3S surpasses the recent competitive methods [1,5,14,15] on most of the metrics. In CTW1500, our method achieves 64.4 and 82.3 for "None" and "Full", respectively.…”
Section: Results (mentioning)
confidence: 80%
“…Similar to CTW1500, we confirm that A3S performs well when unavailable lexicons. The proposed method performs better than GLASS [15], which enhances features before text recognition based on the global information in an input image. It is suggested that not only visual information but also semantic one is effective in scene-text spotting.…”
Section: Results (mentioning)
confidence: 96%
“…Text detection and recognition systems [11] and geometric layout analysis techniques [12,13] have long been developed separately as independent tasks. Research on text detection and recognition [14,15,16,17] has mainly focused on the domain of natural images and aimed at single level text spotting (mostly, word-level). Conversely, research on geometric layout analysis [12,13,18,19], which is targeted at parsing text paragraphs and forming text clusters, has assumed document images as input and taken OCR results as fixed and given by independent systems.…”
Section: Introduction (mentioning)
confidence: 99%