2017
DOI: 10.48550/arxiv.1703.08289
Preprint
Deep Direct Regression for Multi-Oriented Scene Text Detection

Cited by 38 publications (40 citation statements). References 15 publications.
“…2, such as [2,11,23,24,29,32]. As its name suggests, the direct regression method directly calculates the error between the prediction and the ground truth [7,8,35].…”
Section: RBox Regression Parameters (mentioning; confidence: 99%)
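The direct-regression idea quoted above can be sketched as follows. The per-pixel geometry layout (offsets to four vertices) and the use of a smooth-L1 error are illustrative assumptions, not details taken from the cited papers:

```python
import numpy as np

def smooth_l1(pred, gt, beta=1.0):
    """Smooth-L1 error between predicted and ground-truth geometry."""
    diff = np.abs(pred - gt)
    return np.where(diff < beta, 0.5 * diff**2 / beta, diff - 0.5 * beta)

# Each positive pixel directly predicts offsets to the 4 box vertices
# (8 values), so the error is computed straight between prediction and
# ground truth, with no anchor boxes involved.
pred = np.array([[ 9.5, -4.2, 10.1, 4.0,  -9.8, 4.1, -10.0, -3.9]])
gt   = np.array([[10.0, -4.0, 10.0, 4.0, -10.0, 4.0, -10.0, -4.0]])
loss = smooth_l1(pred, gt).mean()
```

This is what distinguishes direct regression from anchor-based detectors, which instead regress offsets relative to a set of predefined boxes.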
“…The aspect ratio of text lines varies greatly, and limited anchors cannot cover the size or aspect ratio of all objects; thus, many methods are anchor-free. Both [4] and [1] generate labels with shrunk segmentation maps, and regress the vertices or angles of the bounding box on positive pixels. [29] generates a corner map and a position-sensitive segmentation map, calculates oriented bounding boxes based on the corner map, and calculates the score for each bounding box using the position-sensitive segmentation map.…”
Section: B. Oriented Objects Detection (mentioning; confidence: 99%)
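The shrunk-segmentation-map labelling described above can be sketched for the simple axis-aligned case; the shrink ratio and the rectangular box shape are assumptions for illustration (the cited methods shrink arbitrary text polygons):

```python
import numpy as np

def shrunk_box_mask(h, w, box, ratio=0.3):
    """Positive-pixel mask from a shrunk axis-aligned box.

    `box` is (x0, y0, x1, y1); each side is moved inward by `ratio`
    times the half-extent, so only central pixels are labelled positive.
    """
    x0, y0, x1, y1 = box
    dx = ratio * (x1 - x0) / 2.0
    dy = ratio * (y1 - y0) / 2.0
    mask = np.zeros((h, w), dtype=bool)
    mask[int(y0 + dy):int(y1 - dy), int(x0 + dx):int(x1 - dx)] = True
    return mask

mask = shrunk_box_mask(20, 40, (4, 4, 36, 16), ratio=0.25)
# On each positive pixel, a detector would then regress the vertices
# (or the angle) of the full, un-shrunk bounding box.
```

Shrinking keeps positive labels away from text borders, so pixels of adjacent text lines are less likely to be merged into one region.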
“…[23] classifies text lines and regresses their locations with different features, which achieves a significant improvement on oriented text lines. [15] and [48] investigate generating shrunk text-line segmentation maps and then regressing text sides or vertices at the text center.…”
Section: Related Work (mentioning; confidence: 99%)
“…Following [6,29,28] we resize images to 1280 × 768 in inference and report the single-scale result. We compare our method with other state-of-the-art methods and show results in Table 3:

Methods           P      R      F      FPS
Zhang et al [45]  71     43     54     -
SegLink [35]      73.1   76.8   75.0   -
EAST [48]         83.57  73.47  78.20  13.2
EAST [48]         83.27  78.33  80.72  -
He et al [15]     82     80     81     -
PixelLink [6]     85.5   82.0   83.7   3.0
Lyu et al [29]    89.5   79.7   84.3   1
TextSnake [28]    84.9   [remaining values truncated in source]

Our method achieves better performance (precision: 88.51%, recall: 84.16%, and F-measure: 86.28%) compared with other segmentation-based methods [6,29].…”
Section: Experiments on ICDAR2015 (mentioning; confidence: 99%)
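The F-measure figures quoted above are the harmonic mean of precision and recall; the standard formula (not code from the cited paper) reproduces the reported number:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall, as reported in the table."""
    return 2 * precision * recall / (precision + recall)

# P = 88.51, R = 84.16 gives F ≈ 86.28, matching the quoted result.
f = f_measure(88.51, 84.16)
```

The harmonic mean penalizes imbalance, so a detector cannot score well by trading recall away for precision (or vice versa).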