2018 25th IEEE International Conference on Image Processing (ICIP)
DOI: 10.1109/icip.2018.8451273
Dense Chained Attention Network for Scene Text Recognition

Cited by 10 publications (15 citation statements)
References 13 publications
“…Comparison with Methods Based on the Traditional FC-LSTM: As previously introduced, the traditional FC-LSTM is widely used in existing recognizers. Among the methods listed in Table 1, RARE [8], AON [6] and FAN [5] combined FC-LSTM with the attention mechanism in a fully connected way when performing sequential transcription, while CRNN [7], R²AM [17], Gao's model [4] and SqueezedText [20] utilized FC-LSTM for frame-level prediction, sequential feature encoding or other purposes. As shown in Table 1, our proposed FACLSTM outperforms these FC-LSTM-based methods by large margins on both the regular-text dataset IIIT5K (90.5% vs. 87.4%) and the curved-text dataset CUTE (83.33% vs. 76.8%) when no lexicon is used.…”
Section: Results
confidence: 99%
“…Methods Based on LSTM: LSTM is widely used in existing state-of-the-art recognizers for three purposes, i.e., producing the frame-level predictions required by a subsequent sequential transcription module [4,7], encoding sequential features while taking historical information into account [8,16], and directly generating sequential predictions in cooperation with the attention mechanism [5,6,13,16,17]. For example, the CRNN proposed by Shi et al. [7] was composed of three parts: a convolution module used to extract features from input images, a bi-LSTM layer built to make predictions for individual frames, and a CTC-based sequential transcription component utilized to infer sequential outputs from the frame-level predictions.…”
Section: Related Work
confidence: 99%
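The quoted passage describes how a CTC-based transcription component turns per-frame predictions from the bi-LSTM into a final text string. A minimal sketch of that last step is CTC greedy decoding: collapse consecutive repeated labels, then drop the blank symbol. The frame labels and blank symbol below are made-up illustrations, not data from the paper.

```python
BLANK = "-"  # hypothetical CTC blank symbol

def ctc_greedy_decode(frame_labels):
    """Infer a sequential output from frame-level predictions:
    collapse consecutive repeats, then remove blanks."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != BLANK:
            out.append(lab)
        prev = lab
    return "".join(out)

# Hypothetical per-frame argmax labels from a bi-LSTM over 10 image frames:
frames = ["-", "t", "t", "-", "e", "x", "x", "-", "t", "t"]
print(ctc_greedy_decode(frames))  # -> "text"
```

Note that the repeated "t"/"x" frames collapse to single characters, while the blank between the two "t" runs is what allows a genuinely doubled letter to survive decoding.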