2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018)
DOI: 10.1109/cvpr.2018.00119

R-FCN-3000 at 30fps: Decoupling Detection and Classification

Abstract: We present R-FCN-3000, a large-scale real-time object detector in which objectness detection and classification are decoupled. To obtain the detection score for an RoI, we multiply the objectness score with the fine-grained classification score. Our approach is a modification of the R-FCN architecture in which position-sensitive filters are shared across different object classes for performing localization. For fine-grained classification, these position-sensitive filters are not needed. R-FCN-3000 obtains an …
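The scoring rule described in the abstract (detection score = objectness score × fine-grained classification score) can be sketched in a few lines. The NumPy sketch below is illustrative only and assumes the position-sensitive objectness bins and the per-class logits for one RoI have already been pooled; the function name, shapes, and the simplified pooling step are hypothetical and not taken from the authors' implementation.

import numpy as np

def rfcn3000_roi_scores(ps_objectness_bins, cls_logits):
    """Sketch of the decoupled scoring for a single RoI.

    ps_objectness_bins : class-agnostic position-sensitive objectness values
                         already pooled over the RoI, e.g. shape (k*k,).
    cls_logits         : fine-grained classification logits for the same RoI,
                         e.g. shape (3000,).
    """
    # Objectness branch: average the k*k position-sensitive bins and squash
    # to a probability (the actual position-sensitive pooling is simplified).
    objectness = 1.0 / (1.0 + np.exp(-ps_objectness_bins.mean()))

    # Classification branch: plain softmax over the fine-grained classes;
    # no position-sensitive filters are needed for this branch.
    exp = np.exp(cls_logits - cls_logits.max())
    class_probs = exp / exp.sum()

    # Final detection score per class = objectness * classification score.
    return objectness * class_probs

# Example usage with random inputs (illustrative only).
scores = rfcn3000_roi_scores(np.random.randn(49), np.random.randn(3000))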


Cited by 97 publications (70 citation statements)
References 45 publications
“…In 2016, Dai et al. [11] proposed R-FCN (Region-based Fully Convolutional Networks) to solve the problem that the RoI-wise subnetwork of Faster R-CNN did not share computation across different region proposals. In the past two years, object proposal-based approaches built on Faster R-CNN and R-FCN, such as RRPN (Rotation Region Proposal Networks) [12], R-FCN-3000 [13], and others [14], [15], further improved detection accuracy. However, the frameworks of these proposal-based approaches, with their two stages of region proposal generation and subsequent feature resampling, were much more complex than the regression-based approaches, which resulted in low speed and difficulty in achieving real-time performance.…”
mentioning
confidence: 99%
“…[Table 3 excerpt] Method | Training Data/Label | mAP-CG | mAP-FG
SNIPER [28] | CG-Fully | 54.0 | -
SNIPER [28] | 3k-FG-Fully | - | 41.6
YOLO-9000* [24] | COCO+9k-FG-Weakly | 19.9 | -
R-FCN-3000* [27] | 3k-FG-Fully | … | …
We then run experiments on the ImageNet dataset. As shown in Table 3, we use the ILSVRC 2014 Detection set with 200 classes as the coarse-grained set.…”
Section: Methods
mentioning
confidence: 99%
“…In [41], a weakly-supervised object detector is trained on a weakly-labeled web dataset to generate pseudo ground-truths for the target detection task. [37] combines region-level semantic similarity and common-sense information learned from some external knowledge bases to train the detector with just image-level labels.…”
Section: Related Work
mentioning
confidence: 99%
“…For example, YOLO-9000 [33] extends the detector's class coverage by concurrently training on bounding box-level data and image-level data, such that the image-level data contribute only to the classification loss. By decoupling the detection network into two branches (position-sensitive & semantic-focused), R-FCN-3K [37] is able to scale detection up to 3000 classes despite being trained with bounding box annotations for only a limited set of object classes. In contrast to these, we focus on large-scale object detection without having access to additional (classification) data sources during training.…”
Section: Related Work
mentioning
confidence: 99%