2020
DOI: 10.1007/s11432-019-2713-1

FACLSTM: ConvLSTM with focused attention for scene text recognition

Abstract: Scene text recognition has recently been widely treated as a sequence-to-sequence prediction problem, in which the traditional fully-connected LSTM (FC-LSTM) has played a critical role. Because of the limitations of FC-LSTM, existing methods have to convert 2-D feature maps into 1-D sequential feature vectors, which severely damages the valuable spatial and structural information of text images. In this paper, we argue that scene text recognition is essentially a spatiotemporal prediction problem for its 2-D image…

Citations: cited by 32 publications (14 citation statements)
References: 29 publications
“…Compared with regular text recognition, it is much more challenging to recognize irregular text of arbitrary shape for a model. The methods towards irregular scene text recognition usually exploit deep backbone network (e.g., ResNet101 [36]) for image feature extraction, or parallel convolutional layers [37], [39] to learn attention weights, or need per-pixel annotation for supervision [45], [46]. The gauge images utilized in AGC usually appear in regular arrangements, thus the benefits of these methods are limited at a cost of model complexity as well as inference time.…”
Section: A Methods Towards Scene Text Recognition
Type: mentioning
confidence: 99%
“…For identifying corrupted characters from the Character-Aware Neural Network (Char-Net) is presented [69]. The focus attention convolution LSTM (FACLSTM) for text recognition [70] was suggested after considering scene text recognition as a spatial-temporal prediction issue. A dynamic log-polar transformer and a sequence recognition network are combined to build a novel scale-adaptive orientation attention network for recognizing the randomly aligned text in an image [71].…”
Section: Text Recognition Using Deep Learning
Type: mentioning
confidence: 99%
“…Text recognition in natural scenes is widely used, and it has a wide range of applications in the current instant translation of photos, image retrieval and other aspects, not only the above mentioned several algorithms, but also based on RARE [14] Network, FAN [15] Network, FACLSTM [16] Network and other text recognition algorithms.…”
Section: Research On Text Recognition
Type: mentioning
confidence: 99%