2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)
DOI: 10.1109/icmla51294.2020.00223
SAFL: A Self-Attention Scene Text Recognizer with Focal Loss

Abstract: In recent decades, scene text recognition has gained worldwide attention from both the academic community and actual users due to its importance in a wide range of applications. Despite achievements in optical character recognition, scene text recognition remains challenging due to inherent problems such as distortions or irregular layout. Most of the existing approaches mainly leverage recurrence or convolution-based neural networks. However, while recurrent neural networks (RNNs) usually suffer from slow t…

Cited by 3 publications (2 citation statements) · References 32 publications (30 reference statements)
“…This method identifies characters one by one, resulting in low speed. Tran [4] proposed SAFL, a self-attention-based neural network model with focal loss for scene text recognition. SAFL utilized focal loss, which allows the model to focus more on training low-frequency samples.…”
Section: Related Work (mentioning, confidence: 99%)
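
For context, the focal loss referenced in this statement is the standard formulation FL(p_t) = -(1 - p_t)^γ log(p_t), which down-weights easy, well-classified examples so training concentrates on hard or low-frequency samples. A minimal PyTorch sketch of that idea follows; it is illustrative only, and the function name and gamma default are assumptions rather than the SAFL authors' code:

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # Illustrative sketch of focal loss, not the SAFL implementation.
    # logits: (batch, num_classes) raw scores; targets: (batch,) class indices.
    log_probs = F.log_softmax(logits, dim=-1)                      # log-probabilities
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p_t of the true class
    pt = log_pt.exp()                                              # p_t
    # (1 - p_t)^gamma shrinks the loss of confident predictions toward zero,
    # so hard, low-frequency samples dominate the gradient.
    return (-(1.0 - pt) ** gamma * log_pt).mean()

With gamma = 0 this reduces to ordinary cross-entropy, which makes the focusing effect easy to verify in a quick test.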
“…In STR, recurrent neural networks (RNNs) are a natural fit for capturing context and dependencies in sequential data, while convolutional neural networks (CNNs) excel at finding hidden patterns using local spatial information in the input [4]. RNNs, with their recurrent connections, are well suited to handling sequential data such as text, because they can retain information from previous time steps and use it to make predictions.…”
Section: Introduction (mentioning, confidence: 99%)
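
The self-attention mechanism that SAFL substitutes for recurrence (per the paper's title and abstract) processes all positions of a sequence in parallel rather than step by step. Below is a minimal sketch of standard scaled dot-product self-attention; the names and shapes are illustrative assumptions, not the SAFL code:

import math
import torch

def self_attention(x, w_q, w_k, w_v):
    # x: (batch, seq_len, d_model); w_q/w_k/w_v: (d_model, d_k) projection matrices.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Every position scores its compatibility with every other position,
    # so long-range dependencies are captured without recurrent steps.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v  # (batch, seq_len, d_k)

Because the whole sequence is handled in one matrix product rather than sequentially, this avoids the slow training that the abstract attributes to RNNs.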