2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018)
DOI: 10.1109/cvpr.2018.00595

FOTS: Fast Oriented Text Spotting with a Unified Network

Abstract: Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community. Most existing methods treat text detection and recognition as separate tasks. In this work, we propose a unified end-to-end trainable Fast Oriented Text Spotting (FOTS) network for simultaneous detection and recognition, sharing computation and visual information among the two complementary tasks. Specially, RoIRotate is introduced to share convolutional features between detection …

Cited by 462 publications (360 citation statements)
References: 54 publications
“…In this work, we present Convolutional Character Networks (referred as CharNet) for joint text detection and recognition, by leveraging character as basic unit. Moreover, for the first time, we provide an one-stage CNN model for the joint tasks, with significant performance improvements over the state-of-the-art results achieved by a more complex two-stage framework, such as [12], [25] and [24]. The proposed CharNet implements direct character detection and recognition, jointly with text instance (e.g., word) detection.…”
Section: Contributions (mentioning)
confidence: 99%
“…First, learning the two tasks independently would result in a sub-optimization problem, making it difficult to fully explore the potential of text nature. For example, text detection and recognition can work collaboratively by providing strong context and complementary information to each other, which is critical to improving the performance, as substantiated by recent work [12,24]. Second, it often requires to implement multiple sequential steps, resulting in a relatively complicated system, where the performance of text recognition is heavily relied on text detection results.…”
Section: Introduction (mentioning)
confidence: 99%
“…The detection branches in our proposed method are denoted as DET. Inspired by [1,9], we use rotated box (RBOX) to describe text regions. Thus the DET branch is simply 1 × 1 convolutions to map final feature to detections.…”
Section: Architecture Overview (mentioning)
confidence: 99%
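
The last statement above describes a detection branch that is "simply 1 × 1 convolutions" producing rotated boxes (RBOX). As a rough illustration only, the NumPy sketch below treats a 1 × 1 convolution as an independent linear map at every pixel. The six-channel layout (one text score, four side distances, one angle), the activations, and the names `conv1x1` and `det_branch` are assumptions made for this sketch; the citing paper's actual layer configuration is not given in the quoted text.

```python
import numpy as np


def conv1x1(features, weight, bias):
    """1x1 convolution: the same linear map applied at every pixel.

    features: (C_in, H, W), weight: (C_out, C_in), bias: (C_out,)
    """
    return np.einsum("oc,chw->ohw", weight, features) + bias[:, None, None]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def det_branch(features, weight, bias):
    """Hypothetical DET head mapping shared features to per-pixel RBOX outputs.

    Assumed channel layout (not taken from the source):
      channel 0    -> text / non-text score
      channels 1-4 -> distances from the pixel to the four box sides
      channel 5    -> box rotation angle
    """
    out = conv1x1(features, weight, bias)         # (6, H, W)
    score = sigmoid(out[0])                       # probability in (0, 1)
    distances = np.maximum(out[1:5], 0.0)         # assumed: non-negative distances
    angle = (sigmoid(out[5]) - 0.5) * np.pi / 2   # assumed: range (-pi/4, pi/4)
    return score, distances, angle


# Toy input: a 32-channel feature map of size 8x8 with random weights.
rng = np.random.default_rng(0)
features = rng.standard_normal((32, 8, 8))
weight = rng.standard_normal((6, 32)) * 0.1
bias = np.zeros(6)

score, distances, angle = det_branch(features, weight, bias)
print(score.shape, distances.shape, angle.shape)
```

Because the kernel is 1 × 1, the head adds only `C_out * C_in + C_out` parameters on top of the shared feature map, which is why such a branch is cheap compared with the backbone.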