2019 International Conference on Document Analysis and Recognition Workshops (ICDARW)
DOI: 10.1109/icdarw.2019.40084
ReELFA: A Scene Text Recognizer with Encoded Location and Focused Attention

Abstract: LSTMs and attention mechanisms have been widely used for scene text recognition. However, existing LSTM-based recognizers usually convert 2D feature maps into 1D space by flattening or pooling operations, neglecting the spatial information of text images. Additionally, the attention drift problem, where models fail to align targets with the proper feature regions, seriously degrades the recognition performance of existing models. To tackle the above problems, in this paper, we propose a scene text Re…
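To make the abstract's first complaint concrete, here is a minimal PyTorch sketch (not from the paper; tensor sizes are assumptions) of the 2D-to-1D collapse it refers to: a CNN feature map of shape (batch, channels, height, width) is pooled over the height axis so an LSTM can consume it as a sequence of width steps, discarding vertical spatial information.

```python
import torch

# Hypothetical CNN backbone output: (B, C, H, W)
feature_map = torch.randn(1, 512, 8, 32)

# Collapse the height dimension by average pooling, then treat each
# image column as one time step of a 1D sequence. The vertical layout
# of the glyphs is lost here, which is the problem the paper targets.
pooled = feature_map.mean(dim=2)        # (B, C, W)
sequence = pooled.permute(0, 2, 1)      # (B, W, C): one step per column

lstm = torch.nn.LSTM(input_size=512, hidden_size=256, batch_first=True)
outputs, _ = lstm(sequence)             # (B, W, 256)
```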

Cited by 5 publications (3 citation statements) · References 20 publications (46 reference statements)
“…Motivated by the successes of [180], [181], [68], Su et al [131] used the histogram of oriented gradients (HOG) feature [182] in their STR system to construct sequential features of word images. Later, CNNs [138], [140], [141], [143], [72] have been widely used for feature representation stage, such as the VGGNet [183], [116], [178], [133], [71], [137], [162]. For more powerful feature representation, some complex neural networks were applied in STR algorithms, such as ResNet [184] [135], [145], [148], [73], [151], [166], [152], [154], [48], [155], [157], [158], [161], [163], [164] and DenseNet [185], [150], [160].…”
Section: Feature Representation Stage
Mentioning (confidence: 99%)
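The quoted survey mentions Su et al. building sequential features of word images from HOG descriptors. A hedged sketch of that general idea follows (window size, stride, and HOG parameters are assumptions, not taken from Su et al.): slide a narrow window across the word image and describe each window with HOG, yielding one feature vector per horizontal step.

```python
import numpy as np
from skimage.feature import hog

def hog_sequence(word_image: np.ndarray, window: int = 8, step: int = 4):
    """Turn a grayscale word image (H, W) into a sequence of HOG vectors,
    one per sliding-window position along the horizontal axis."""
    h, w = word_image.shape
    features = []
    for x in range(0, w - window + 1, step):
        patch = word_image[:, x:x + window]
        features.append(hog(patch, orientations=9,
                            pixels_per_cell=(4, 4),
                            cells_per_block=(2, 2)))
    return np.stack(features)            # (num_steps, feature_dim)

# Example: a 32x128 grayscale word image yields a sequence of 31 vectors.
seq = hog_sequence(np.random.rand(32, 128))
```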
“…Multiple bidirectional long short term memory (BiLSTM) model was introduced in [188] and widely used in [131], [116], [178], [135], [136], [139], [140], [141], [148], [150], [73], [72], [151], [154], [48], [155], [157], [158], [162], [166] as the sequence modeling module because of its ability to capture long-range dependencies. Litman et al [169] added intermediate supervisions along the network layers and successfully trained a deeper BiLSTM model to improve the encoding of contextual dependencies.…”
Section: Sequence Modeling Stage
Mentioning (confidence: 99%)
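The BiLSTM sequence modeling module this statement describes is a standard component; below is a minimal PyTorch sketch (layer sizes and the projection layer are assumptions, not a specific recognizer's configuration) of stacked bidirectional LSTMs refining per-column CNN features with long-range context.

```python
import torch
from torch import nn

class BiLSTMEncoder(nn.Module):
    """Stacked bidirectional LSTMs over a (B, T, C) feature sequence."""
    def __init__(self, in_dim: int = 512, hidden: int = 256):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden, in_dim)  # fuse both directions

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.rnn(x)     # (B, T, 2 * hidden)
        return self.proj(out)    # (B, T, in_dim), context-enriched features

encoder = BiLSTMEncoder()
contextual = encoder(torch.randn(2, 32, 512))   # (2, 32, 512)
```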