2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00783
Stereo R-CNN Based 3D Object Detection for Autonomous Driving

Abstract: We propose a 3D object detection method for autonomous driving by fully exploiting the sparse and dense, semantic and geometric information in stereo imagery. Our method, called Stereo R-CNN, extends Faster R-CNN to stereo inputs so as to simultaneously detect and associate objects in the left and right images. We add extra branches after the stereo Region Proposal Network (RPN) to predict sparse keypoints, viewpoints, and object dimensions, which are combined with the 2D left-right boxes to calculate a coarse 3D object bounding box…
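The coarse 3D estimate the abstract describes ultimately rests on rectified stereo geometry: a horizontal offset (disparity) between matched left and right image coordinates determines depth. A minimal sketch of that triangulation step, with the focal length and baseline values chosen here only as illustrative, KITTI-like numbers (not taken from the paper):

```python
def stereo_depth(u_left, u_right, focal_px, baseline_m):
    """Triangulate depth from the horizontal disparity between matched
    left/right image x-coordinates, assuming a rectified pinhole stereo rig:
    z = f * b / d."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline_m / disparity

# Illustrative values: focal length ~721 px, baseline ~0.54 m,
# a 20 px disparity puts the object at roughly 19.5 m.
z = stereo_depth(620.0, 600.0, focal_px=721.0, baseline_m=0.54)
```

Stereo R-CNN refines such a coarse estimate further (e.g. with keypoint constraints), but the depth-from-disparity relation above is the geometric core that makes stereo boxes informative in 3D at all.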

Cited by 550 publications (395 citation statements). References 32 publications.
“…3DOP [5] exploits stereo images and prior knowledge about the scene to reason directly in 3D. Stereo R-CNN [16] tackles 3D object detection by exploiting stereo imagery and produces stereo boxes, keypoints, dimensions, and viewpoint angles, which are combined in a learned 3D box estimation module. In MV3D [6], a sensor-fusion approach for LiDAR and RGB images is presented, approaching 3D object proposal generation and multi-view feature fusion via individual sub-networks.…”
Section: Related Work
confidence: 99%
“…We compare our approach with state-of-the-art methods [2], [6], [12], [13], [15], [16], [31], which are divided into two groups depending on the input (i.e., point clouds or camera images). One group consists of MonoPSR [31] (monocular) and Stereo R-CNN [6] (stereo), which process camera images with RGB information. The other group includes MV3D (LiDAR) [2], BirdNet [12], RT3D [13], VeloFCN [15], and LMNet [16], which are based on point clouds only.…”
Section: B. Comparison With State-of-the-Art Methods
confidence: 99%
“…3D object detection [1]–[6] has proved to be increasingly important in many fields, such as autonomous driving [7], mobile robots [8], and virtual/augmented reality [9]. While there has been remarkable progress in the field of image-based 2D object detection, 3D object detection remains far less explored than its 2D counterpart.…”
Section: Introduction
confidence: 99%
“…Each frame of 3D point cloud data is processed in the same way, and one common key step in the process is to generate feature maps from the point cloud data. Due to the popularity of CNN-based solutions to object detection for autonomous driving vehicles [4, 15, 25], in this paper we focus on the feature maps generated by the convolutional layers of CNN networks.…”
Section: Towards Feature-Based Fusion of Vehicle Data, 2.1 Convolution
confidence: 99%
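A common way to turn a point cloud into a CNN-ready feature map, as the statement above alludes to, is to project the points onto a bird's-eye-view (BEV) grid. A minimal sketch under assumed range and cell-size parameters (real detectors such as MV3D use multiple height, density, and intensity channels rather than a single occupancy channel):

```python
import numpy as np

def bev_occupancy(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=0.5):
    """Project an (N, 3) point cloud onto a single-channel bird's-eye-view
    occupancy grid. Ranges and cell size here are illustrative defaults,
    not values from any particular paper."""
    h = int((x_range[1] - x_range[0]) / cell)  # rows index forward distance x
    w = int((y_range[1] - y_range[0]) / cell)  # cols index lateral offset y
    grid = np.zeros((h, w), dtype=np.float32)
    xs, ys = points[:, 0], points[:, 1]
    # Discard points outside the region of interest.
    mask = (xs >= x_range[0]) & (xs < x_range[1]) & \
           (ys >= y_range[0]) & (ys < y_range[1])
    ix = ((xs[mask] - x_range[0]) / cell).astype(int)
    iy = ((ys[mask] - y_range[0]) / cell).astype(int)
    grid[ix, iy] = 1.0  # mark occupied cells
    return grid

# Two nearby points fall in the same cell; the third is out of range.
pts = np.array([[10.0, 0.0, -1.5], [10.2, 0.1, -1.4], [55.0, 0.0, 0.0]])
bev = bev_occupancy(pts)  # shape (80, 80), exactly one occupied cell
```

The resulting grid can be fed to ordinary 2D convolutional layers, which is precisely what makes the convolutional feature maps discussed in the quoted section applicable to LiDAR data.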