2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00922

Convolutional Character Networks

Abstract: Recent progress has been made on developing a unified framework for joint text detection and recognition in natural images, but existing joint models were mostly built on a two-stage framework involving ROI pooling, which can degrade performance on the recognition task. In this work, we propose convolutional character networks, referred to as CharNet, a one-stage model that can process the two tasks simultaneously in one pass. CharNet directly outputs bounding boxes of words and characters, with correspond…

Citations: cited by 162 publications (108 citation statements)
References: 38 publications (115 reference statements)
“…Recently, in order to sufficiently exploit the complementarity between detection and recognition, many methods [45], [4], [5], [6], [46], [7], [17], [47], [37], [48], [49], [50] are proposed to spot text in an end-to-end manner, which utilize the recognition information to optimize the localization task.…”
Section: A Text Reading In Single Images (mentioning)
confidence: 99%
“…He et al. [31] proposed an end-to-end framework based on SSD by introducing a text attention module, which enables a direct text mask supervision and achieves strong performance improvements by training text detection and recognition jointly. Xing [32] proposed a one-stage model that processes text detection and recognition simultaneously.…”
Section: Scene Text Spotting With Deep Learning (mentioning)
confidence: 99%
“…However, most existing scene text benchmarks do not include character-level annotations. To solve this problem, we use the iterative learning approach [9] to obtain character-level data. Instead of iterating from synthetic data in [9], we directly utilize their trained model to get character labels for further iterations.…”
Section: Implementation Details (mentioning)
confidence: 99%
“…To remove the background area and more precisely locate characters rather than text, Mask TextSpotter [7], inspired by Mask-RCNN [8], proposes to detect all characters for each bounding box proposal and then perform character recognition. Recently, based on the observation that the two-stage framework which involves ROI pooling degrades the text recognition performance, CharNet [9] proposes a one-stage architecture to achieve higher efficiency. CharNet follows the pipeline of Mask TextSpotter and groups characters to text by the guidance of the relative position between detection results of characters and texts.…”
Section: Introduction (mentioning)
confidence: 99%