Revisiting Classification Perspective on Scene Text Recognition

Cai, Hongxiang; Sun, Jun; Xiong, Yichao

doi:10.48550/arxiv.2102.10884

Cited by 2 publications

(2 citation statements)

References 31 publications


“…In order to find a suitable STR component for our pipeline, we compare the performance of several methods proposed over the last few years. Specifically, we compare CRNN [19], TPS-ResNet-BiLSTM-CTC (TRBC), TPS-ResNet-BiLSTM-Attn (TRBA), SARN [9], CSTR [2] and the recognition module of EasyOCR (based on CRNN). All these methods except for the EasyOCR recognition module are trained on Synthtext [6] to allow for a fair comparison.…”

Section: Scene-text Recognition (mentioning)

confidence: 99%

HyText – A Scene-Text Extraction Method for Video Retrieval

Theus

Rossetto

Bernstein

2022

MultiMedia Modeling


Scene-text has been shown to be an effective query target for video retrieval applications in a known-item search context. While much progress has been made in scene-text extraction from individual pictures, the special case of video has so far received less attention. This paper introduces HyText, a scene-text extraction method for video with a focus on retrieval applications. HyText uses intermittent scene-text detection in combination with bi-directional tracking in order to increase throughput without reducing detection accuracy.


“…The PhotoOCR application in [12] demonstrated the use of deep neural networks without convolutional operations for character recognition with raw and edge-based feature representations. The recent work by Cai et al [14] explored the image classification methodology used for scene text recognition using convolutional neural architecture. The authors in [15], demonstrated an early application of recurrent neural networks for modeling scene text using orientation features.…”

Section: Literature Survey (mentioning)

confidence: 99%

Attention Guided Feature Encoding for Scene Text Recognition

Hassan

Lekshmi

2022

J. Imaging


The real-life scene images exhibit a range of variations in text appearances, including complex shapes, variations in sizes, and fancy font properties. Consequently, text recognition from scene images remains a challenging problem in computer vision research. We present a scene text recognition methodology by designing a novel feature-enhanced convolutional recurrent neural network architecture. Our work addresses scene text recognition as well as sequence-to-sequence modeling, where a novel deep encoder–decoder network is proposed. The encoder in the proposed network is designed around a hierarchy of convolutional blocks enabled with spatial attention blocks, followed by bidirectional long short-term memory layers. In contrast to existing methods for scene text recognition, which incorporate temporal attention on the decoder side of the entire architecture, our convolutional architecture incorporates novel spatial attention design to guide feature extraction onto textual details in scene text images. The experiments and analysis demonstrate that our approach learns robust text-specific feature sequences for input images, as the convolution architecture designed for feature extraction is tuned to capture a broader spatial text context. With extensive experiments on ICDAR2013, ICDAR2015, IIIT5K and SVT datasets, the paper demonstrates an improvement over many important state-of-the-art methods.
