Tong He scite author profile

We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction fashion, analogous to semantic segmentation. Almost all state-of-the-art object detectors, such as RetinaNet, SSD, YOLOv3, and Faster R-CNN, rely on pre-defined anchor boxes. In contrast, our proposed detector FCOS is anchor-box free as well as proposal free. By eliminating the pre-defined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes, such as calculating overlap during training. More importantly, we also avoid all hyper-parameters related to anchor boxes, to which the final detection performance is often very sensitive. With non-maximum suppression (NMS) as the only post-processing step, FCOS with ResNeXt-64x4d-101 achieves 44.7% AP with single-model and single-scale testing, surpassing previous one-stage detectors with the advantage of being much simpler. For the first time, we demonstrate a much simpler and more flexible detection framework achieving improved detection accuracy. We hope that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks.
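The abstract names non-maximum suppression (NMS) as the only post-processing step. As a rough illustration of what that step does, here is a minimal greedy NMS sketch in plain Python. It is not taken from the FCOS code base: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are all assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    The threshold value is an illustrative assumption, not a setting
    reported in the abstract.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this usage example the second box overlaps the first heavily and has a lower score, so only the first and third boxes survive.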


Detecting Text in Natural Image with Connectionist Text Proposal Network

Tian

Huang

et al. 2016

766

450


We propose a novel Connectionist Text Proposal Network (CTPN) that accurately localizes text lines in natural images. The CTPN detects a text line in a sequence of fine-scale text proposals directly in convolutional feature maps. We develop a vertical anchor mechanism that jointly predicts the location and text/non-text score of each fixed-width proposal, considerably improving localization accuracy. The sequential proposals are naturally connected by a recurrent neural network, which is seamlessly incorporated into the convolutional network, resulting in an end-to-end trainable model. This allows the CTPN to explore rich context information of the image, which makes it effective at detecting extremely ambiguous text. The CTPN works reliably on multi-scale and multi-language text without further post-processing, departing from previous bottom-up methods requiring multi-step post filtering. It achieves 0.88 and 0.61 F-measure on the ICDAR 2013 and 2015 benchmarks, surpassing recent results [8,35] by a large margin. The CTPN is computationally efficient with 0.14s/image, by using the very deep VGG16 model [27]. An online demo is available at: http://textdet.com/.
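The results above are reported as F-measure, which is the harmonic mean of precision and recall. The sketch below shows only that final formula. The function name `f_measure` is an assumption for this example, and the ICDAR protocols that decide which detections count as correct are not reproduced here.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score).

    Returns 0.0 when both inputs are zero to avoid dividing by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the paper.
    print(f_measure(0.5, 0.5))
```

Because it is a harmonic mean, the score is pulled toward the lower of the two inputs, so a detector cannot reach a high F-measure with strong precision alone.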


Bag of Tricks for Image Classification with Convolutional Neural Networks

Zhang

et al. 2019

991

392


Much of the recent progress made in image classification research can be credited to training procedure refinements, such as changes in data augmentations and optimization methods. In the literature, however, most refinements are either briefly mentioned as implementation details or only visible in source code. In this paper, we will examine a collection of such refinements and empirically evaluate their impact on the final model accuracy through ablation studies. We will show that, by combining these refinements, we are able to improve various CNN models significantly. For example, we raise ResNet-50's top-1 validation accuracy from 75.3% to 79.29% on ImageNet. We will also demonstrate that improvements in image classification accuracy lead to better transfer learning performance in other application domains such as object detection and semantic segmentation.


Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks

Wang

Zheng

Ye

et al. 2019

Preprint

276

321


ABCNet: Real-Time Scene Text Spotting With Adaptive Bezier-Curve Network

et al. 2020


Deep neural networks and kernel regression achieve comparable accuracies for functional connectivity prediction of behavior and demographics

et al. 2020


FCOS: Fully Convolutional One-Stage Object Detection

Tian

Shen

Chen

et al. 2019

Preprint

113

188


FCOS: A Simple and Strong Anchor-free Object Detector

Tian

Shen

Chen

et al. 2020

IEEE Trans. Pattern Anal. Mach. Intell.

245

160


