2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
DOI: 10.1109/icdar.2017.233
ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17)

Abstract: Chinese is the most widely used language in the world. Algorithms that read Chinese text in natural images facilitate applications of various kinds. Despite this large potential value, past datasets and competitions have primarily focused on English, which bears very different characteristics from Chinese. This report introduces RCTW, a new competition that focuses on Chinese text reading. The competition features a large-scale dataset with 12,263 annotated images. Two tasks, namely text localization and end-to-…

Cited by 177 publications (109 citation statements). References 9 publications.
“…The standard evaluation protocol of MSRA-TD500 based on F-measure is used. [40] is also a long text detection dataset, consisting of 8034 training images and 4229 test images annotated with text lines. This dataset is very challenging due to very large text scale variances.…”
Section: Datasets and Evaluation Protocols
confidence: 99%
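The F-measure mentioned in the citation above is the standard harmonic mean of precision and recall used in scene-text detection protocols. A minimal sketch of the computation (function name is illustrative, not from any particular benchmark toolkit):

```python
def f_measure(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall (also called F1 or H-mean),
    # the standard ranking score in scene-text detection evaluation.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

The harmonic mean penalizes imbalance, so a detector cannot score well by maximizing only one of precision or recall.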
“…For oriented scene text detection on MSRA-TD500 [39] and RCTW-17 [40], we apply the same data augmentation as [20]. Besides, we also randomly rotate the images with π/2 to better handle vertical texts.…”
Section: Long Text Detection in Natural Scenes
confidence: 99%
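The π/2 rotation augmentation described above has to transform the box annotations together with the image. A minimal sketch, assuming images are HxWxC NumPy arrays and boxes are (N, 4, 2) arrays of (x, y) quadrilateral corners (the function name and shapes are assumptions, not from the cited paper):

```python
import numpy as np

def rotate90(image: np.ndarray, boxes: np.ndarray):
    # Rotate an HxWxC image by 90 degrees counter-clockwise and
    # remap quadrilateral box corners given as (x, y) coordinates.
    h, w = image.shape[:2]
    rotated = np.rot90(image)  # new spatial shape: (W, H)
    x, y = boxes[..., 0], boxes[..., 1]
    # Under a CCW rotation of the pixel grid, (x, y) -> (y, W - 1 - x).
    new_boxes = np.stack([y, w - 1 - x], axis=-1)
    return rotated, new_boxes
```

Rotating both the image and its annotations lets a horizontal-text detector see vertical text lines as horizontal ones during training.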
“…:,*"()[]/' ) at the beginning and at the end of both the ground truth and the submissions are removed. For Task 2.2, the Normalized Edit Distance metric (1-N.E.D specifically, which is also used in the ICDAR 2017 competition, RCTW-17 [12]) is treated as the ranking metric. The reason for utilizing 1-N.E.D as the official ranking metric for Task 2.2 is that Chinese scripts usually contain more characters than Latin scripts, which makes a word-accuracy metric too harsh to evaluate Task 2.2 fairly.…”
Section: B Evaluation Metrics
confidence: 99%
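The 1-N.E.D metric in the citation above scores each prediction by its edit distance to the ground truth, normalized by string length, so partially correct transcriptions of long Chinese text lines still earn credit. A minimal sketch of the idea, assuming normalization by the longer of the two strings and averaging over all pairs (details may differ from the official evaluation script):

```python
def edit_distance(a: str, b: str) -> int:
    # Classic dynamic-programming Levenshtein distance,
    # kept to a single rolling row for O(len(b)) memory.
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,       # deletion
                                     dp[j - 1] + 1,   # insertion
                                     prev + (ca != cb))  # substitution
    return dp[len(b)]

def one_minus_ned(preds, gts):
    # 1 - N.E.D averaged over prediction/ground-truth pairs; each
    # distance is normalized by the length of the longer string.
    total = 0.0
    for p, g in zip(preds, gts):
        denom = max(len(p), len(g)) or 1
        total += edit_distance(p, g) / denom
    return 1.0 - total / len(gts)
```

Unlike exact word accuracy, a transcription with one wrong character out of ten still scores 0.9 under this metric rather than 0.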
“…Previous end-to-end trainable text reading models [40,14,22,28,29] only utilize images with full annotations provided by the previous benchmarks [36,3,43]. Improving the performance of these models requires more fully annotated training data, which is extremely expensive and inefficient to annotate.…”
Section: Partially Supervised Learning
confidence: 99%