2018
DOI: 10.1007/978-3-030-01264-9_48

Acquisition of Localization Confidence for Accurate Object Detection

Abstract: Modern CNN-based object detectors rely on bounding box regression and non-maximum suppression to localize objects. While the probabilities for class labels naturally reflect classification confidence, localization confidence is absent. This makes properly localized bounding boxes degenerate during iterative regression or even suppressed during NMS. In the paper we propose IoU-Net learning to predict the IoU between each detected bounding box and the matched ground-truth. The network acquires this confidence of…
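
The quantity the abstract refers to, the IoU (intersection over union) between a detected box and its matched ground-truth box, can be illustrated with a minimal sketch. The function name `box_iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for this illustration; they are not taken from the paper.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed convention for this sketch: x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union is the sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

A network that predicts this value for each detection would use it as a regression target in the range [0, 1]; how the prediction head is built and trained is not described in the excerpt above.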

Citations: cited by 748 publications (574 citation statements)
References: 37 publications (73 reference statements)
“…For bounding box estimation, we train the IoU-Net [14] based architecture proposed in [4], employing features from the same backbone network used for target classification. The training procedure in [4] is extended to image sets by computing the modulation vector on the first frame in M_train and sampling proposal boxes from all images in M_test.…”
Section: Offline Training (mentioning)
confidence: 99%
“…Upon publication of the conference version of this manuscript, several works have pursued the idea behind Cascade R-CNN [5], [32], [41], [55]. [41], [55] applied it to single-shot object detectors, showing nontrivial improvements for high quality single-shot detection, for general objects and pedestrians, respectively.…”
Section: Related Work (mentioning)
confidence: 99%
“…[41], [55] applied it to single-shot object detectors, showing nontrivial improvements for high quality single-shot detection, for general objects and pedestrians, respectively. The IoU-Net [32] explored in greater detail high-quality localization, achieving some gains over the Cascade R-CNN by cascading more bounding box regression steps. [24] showed it is possible to achieve state-of-the-art object detectors without ImageNet pretraining, with a help of the Cascade R-CNN.…”
Section: Related Work (mentioning)
confidence: 99%
“…[15] proposes an object relation module to learn the NMS function as an end-to-end general object detector. [41] and [17] replace the classification scores of proposals used in the NMS process with learned localization confidences to guide NMS to preserve more accurately localized bounding boxes. These methods prove effective in general object detection, but as we state, pedestrian detection in a crowd has its own challenge.…”
Section: Related Work (mentioning)
confidence: 99%
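
The idea in the last statement, ranking boxes during NMS by a learned localization confidence instead of the classification score, can be sketched as follows. This is a simplified greedy variant written for illustration under stated assumptions, not the exact procedure of any cited paper; the names `nms_by_localization_confidence`, `loc_conf`, and `iou_threshold`, and the `(x1, y1, x2, y2)` box format, are hypothetical.

```python
def _iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_by_localization_confidence(boxes, loc_conf, iou_threshold=0.5):
    """Greedy NMS that ranks boxes by localization confidence.

    boxes: list of (x1, y1, x2, y2) tuples.
    loc_conf: one predicted localization confidence per box (assumed input).
    Returns the indices of the kept boxes, highest confidence first.
    """
    # Visit boxes from most to least confident in their localization.
    order = sorted(range(len(boxes)), key=lambda i: loc_conf[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

With classification-score ranking, a well-localized box can be suppressed by a poorly localized neighbor that happens to score higher; ranking by localization confidence is meant to avoid that case.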