2022
DOI: 10.1609/aaai.v36i1.19904

Text Gestalt: Stroke-Aware Scene Text Image Super-resolution

Abstract: In the last decade, the blossom of deep learning has witnessed the rapid development of scene text recognition. However, the recognition of low-resolution scene text images remains a challenge. Even though some super-resolution methods have been proposed to tackle this problem, they usually treat text images as general images while ignoring the fact that the visual quality of strokes (the atomic unit of text) plays an essential role for text recognition. According to Gestalt Psychology, humans are capable of c…

Cited by 34 publications (21 citation statements)
References 32 publications
“…TPGSR [28] employs a text prior generator to extract categorical probability distribution as guidance for the text image reconstruction process. Text Gestalt [29] pre-trains a text recognizer to highlight the stroke-level details. All of the previous works concentrate on recovering SR text images in a fully supervised manner, that is, with all the LR-HR pairs being used.…”
Section: Scene Text Image Super Resolution (STISR)
type: mentioning
confidence: 99%
“…Inspired by the success of TSRN, many researchers have started to investigate real-world STISR to improve the quality of LR text images, thus improving recognition accuracy. However, all of the current works concentrate on recovering LR scene text images in a fully supervised manner, that is, with all the LR-HR pairs being used [25][26][27][28][29].…”
Section: Introduction
type: mentioning
confidence: 99%
“…C3-STISR [14] is proposed to learn triple clues, including recognition clue from a STR, linguistical clue from a language model, and a visual clue from a skeleton painter to rich the representation of the text-specific clue. TG [9] and [11] exploit stroke-level information from HR images via stroke-focused module and skeleton loss for more fine-grained super-resolution. Compared with generic image super-resolution approaches, these methods greatly advance the recognition accuracy through various text-specific information extraction techniques.…”
Section: B Scene Text Image Super-resolution
type: mentioning
confidence: 99%
“…STT [8] exploits character-level attention maps from HR images to assist the recovery. [11] and TG [9] extract stroke-level information from HR images through specific networks to provide more fine-grained supervision information. [12], [13], [14] additionally introduce external modules to extract various text-specific clues to facilitate the recovery and use the supervision from HR images to finetune their modules.…”
type: mentioning
confidence: 99%
“…The same authors also focused on the internal stroke-level structures of characters in text images. Thus, they designed rules for decomposing English characters and digits at the stroke level and proposed using a pretrained text recognizer to provide stroke-level attention maps as positional cues [19]. Ma et al [9] provided guidance to recover HR text images by introducing an explicit text prior to the character probability sequence obtained from a text recognition model.…”
Section: Deep Learning-based Text Image Super-resolution
type: mentioning
confidence: 99%