2012 IEEE International Conference on Robotics and Automation
DOI: 10.1109/icra.2012.6225316

Detection-based object labeling in 3D scenes

Abstract: We propose a view-based approach for labeling objects in 3D scenes reconstructed from RGB-D (color+depth) videos. We utilize sliding window detectors trained from object views to assign class probabilities to pixels in every RGB-D frame. These probabilities are projected into the reconstructed 3D scene and integrated using a voxel representation. We perform efficient inference on a Markov Random Field over the voxels, combining cues from view-based detection and 3D shape, to label the scene. Our detec…
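The core projection-and-integration step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a pinhole camera model, a known camera-to-world pose per frame, and a simple running sum of per-pixel class probabilities per voxel (the paper's MRF inference would run on the accumulated grid afterwards). All function and parameter names here are hypothetical.

```python
import numpy as np

def project_probs_to_voxels(depth, probs, K, pose, grid, counts,
                            voxel_size=0.05, origin=np.zeros(3)):
    """Back-project per-pixel class probabilities from one RGB-D frame
    into a voxel grid and accumulate them.

    depth:  (H, W) depth map in meters (0 = invalid pixel)
    probs:  (H, W, C) per-pixel class probabilities from the detectors
    K:      (3, 3) camera intrinsics (pinhole model)
    pose:   (4, 4) camera-to-world transform for this frame
    grid:   (X, Y, Z, C) accumulated class-probability sums
    counts: (X, Y, Z) number of observations per voxel
    """
    H, W = depth.shape
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    valid = depth > 0
    z = depth[valid]
    # Back-project valid pixels into camera coordinates.
    x = (us[valid] - K[0, 2]) * z / K[0, 0]
    y = (vs[valid] - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)
    # Transform to world coordinates and map to voxel indices.
    pts_world = (pose @ pts_cam.T).T[:, :3]
    idx = np.floor((pts_world - origin) / voxel_size).astype(int)
    in_bounds = np.all((idx >= 0) & (idx < np.array(grid.shape[:3])), axis=1)
    idx = idx[in_bounds]
    p = probs[valid][in_bounds]
    # Accumulate probabilities and observation counts per voxel;
    # a per-voxel average is grid / counts[..., None] where counts > 0.
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), p)
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return grid, counts
```

Calling this once per RGB-D frame fuses detector evidence across viewpoints; the voxel-level unary potentials for the MRF would then be derived from the averaged probabilities.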

Cited by 147 publications (126 citation statements)
References 25 publications
“…The recent availability of affordable depth sensors has also led to an increase in the use of RGB-D data [21,11,3,30,26] to extract various types of 3D information from scenes such as depth, object pose, and intrinsic images. Our method can be easily modified to apply to this domain.…”
Section: Related Work
confidence: 99%
“…For example, [18,17] use objects from Google's 3D Warehouse to train an object detection system for 3D point clouds collected by robots navigating through urban and indoor environments. [28] uses 3D CAD models as templates for a sliding window search.…”
Section: Related Work
confidence: 99%
“…The conventional way to approach this problem is to constrain the representation into only one of the modalities while integrating information from the other discarded domain as features. That is, the approach can be 2-D driven [5, 8–12, 1], in that reasoning is done in the image while integrating 3-D features, or the approach can be 3-D driven [7, 13–15], in that the predictions are made on the 3-D data while integrating 2-D features. These approaches are typically only applicable when the two modalities are in correspondence.…”
Section: Motivation and Related Work
confidence: 99%