FOTS: Fast Oriented Text Spotting with a Unified Network

Liu, Xuebo; Ding, Liang; Shi, Yan; Chen, Dagui; Qiao, Yu; Yan, Junjie

doi:10.1109/cvpr.2018.00595

Cited by 462 publications

(360 citation statements)

References 54 publications

Supporting

Mentioning: 360

Contrasting

“…In this work, we present Convolutional Character Networks (referred as CharNet) for joint text detection and recognition, by leveraging character as basic unit. Moreover, for the first time, we provide an one-stage CNN model for the joint tasks, with significant performance improvements over the state-of-the-art results achieved by a more complex two-stage framework, such as [12], [25] and [24]. The proposed CharNet implements direct character detection and recognition, jointly with text instance (e.g., word) detection.…”

Section: Contributions

mentioning

confidence: 99%

“…First, learning the two tasks independently would result in a sub-optimization problem, making it difficult to fully explore the potential of text nature. For example, text detection and recognition can work collaboratively by providing strong context and complementary information to each other, which is critical to improving the performance, as substantiated by recent work [12,24]. Second, it often requires to implement multiple sequential steps, resulting in a relatively complicated system, where the performance of text recognition is heavily relied on text detection results.…”

Section: Introduction

mentioning

confidence: 99%

“…To overcome the limitations of RoI cropping and pooling for two-stage framework, He et al [12] proposed a text-alignment layer to precisely compute the convolutional features for a text instance of arbitrary orientation, which boosted the performance. In [24], multiple affinity transformations were applied to the convolutional features for enhancing text information in the RoI regions. However, these methods failed to work on curved text.…”

Section: Introduction

mentioning

confidence: 99%

“…These technical improvements result in a simple, compact, yet powerful one-stage model that works reliably on multi-orientation and curved text. We evaluate CharNet on three standard benchmarks, where it consistently outperforms the state-of-the-art approaches [25,24] by a large margin, e.g., with improvements of 65.33%→71.08% (with generic lexicon) on ICDAR 2015, and 54.0%→69.23% on Total-Text, on end-to-end text recognition. Code is available at: https://github.com/MalongTech/research-charnet.…”

mentioning

confidence: 99%

Convolutional Character Networks

Xing, Tian, Huang, et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

149

Recent progress has been made on developing a unified framework for joint text detection and recognition in natural images, but existing joint models were mostly built on two-stage framework by involving ROI pooling, which can degrade the performance on recognition task. In this work, we propose convolutional character networks, referred as CharNet, which is an one-stage model that can process two tasks simultaneously in one pass. CharNet directly outputs bounding boxes of words and characters, with corresponding character labels. We utilize character as basic element, allowing us to overcome the main difficulty of existing approaches that attempted to optimize text detection jointly with a RNN-based recognition branch. In addition, we develop an iterative character detection approach able to transform the ability of character detection learned from synthetic data to real-world images. These technical improvements result in a simple, compact, yet powerful one-stage model that works reliably on multi-orientation and curved text. We evaluate CharNet on three standard benchmarks, where it consistently outperforms the state-of-the-art approaches [25,24] by a large margin, e.g., with improvements of 65.33%→71.08% (with generic lexicon) on ICDAR 2015, and 54.0%→69.23% on Total-Text, on end-to-end text recognition. Code is available at: https://github.com/MalongTech/research-charnet.

“…The detection branches in our proposed method are denoted as DET. Inspired by [1,9], we use rotated box (RBOX) to describe text regions. Thus the DET branch is simply 1 × 1 convolutions to map final feature to detections.…”

Section: Architecture Overview

mentioning

confidence: 99%
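
The statement above describes a detection branch that maps the final feature map to rotated-box (RBOX) detections using only 1 × 1 convolutions. The following is a minimal sketch of that idea in plain Python, not the cited authors' implementation. The six-channel layout (one score, four edge distances, one angle), the rotation convention, and the helper names `conv1x1` and `decode_rbox` are all assumptions made for illustration.

```python
import math


def conv1x1(feature_map, weights, bias):
    """Apply a 1x1 convolution: the same linear map at every pixel.

    feature_map: H x W x C_in nested lists
    weights:     C_out x C_in nested lists
    bias:        list of length C_out
    returns:     H x W x C_out nested lists
    """
    return [
        [
            [sum(w * f for w, f in zip(w_row, pixel)) + b
             for w_row, b in zip(weights, bias)]
            for pixel in row
        ]
        for row in feature_map
    ]


def decode_rbox(x, y, top, right, bottom, left, angle):
    """Convert one pixel's RBOX prediction into four corner points.

    Assumed convention (hypothetical): the four distances run from the
    pixel (x, y) to the box edges in the box's own frame, and the box is
    then rotated about the pixel by `angle` radians using the standard
    2D rotation matrix in image coordinates (y pointing down).
    """
    # Corner offsets before rotation: top-left, top-right,
    # bottom-right, bottom-left.
    local = [(-left, -top), (right, -top), (right, bottom), (-left, bottom)]
    c, s = math.cos(angle), math.sin(angle)
    return [(x + c * dx - s * dy, y + s * dx + c * dy) for dx, dy in local]


# Toy example: a 1x1 feature map with 2 input channels mapped to
# 6 output channels (score, top, right, bottom, left, angle).
features = [[[1.0, 2.0]]]
weights = [[0.5, 0.0], [1.0, 0.5], [1.0, 1.0], [1.0, 0.5], [1.0, 1.0], [0.0, 0.0]]
bias = [0.0] * 6

prediction = conv1x1(features, weights, bias)[0][0]
score, top, right, bottom, left, angle = prediction
corners = decode_rbox(10.0, 10.0, top, right, bottom, left, angle)
```

Because a 1 × 1 convolution has no spatial extent, each pixel's detection depends only on that pixel's feature vector; all spatial context has to be gathered by the layers that produce the final feature map.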

Efficient Scene Text Detection with Textual Attention Tower

Zhang, Liu, Xiao, et al. 2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Scene text detection has received attention for years and achieved an impressive performance across various benchmarks. In this work, we propose an efficient and accurate approach to detect multi-oriented text in scene images. The proposed feature fusion mechanism allows us to use a shallower network to reduce the computational complexity. A self-attention mechanism is adopted to suppress false positive detections. Experiments on public benchmarks including ICDAR 2013, ICDAR 2015 and MSRA-TD500 show that our proposed approach can achieve better or comparable performances with fewer parameters and less computational cost.
