2019 IEEE Intelligent Vehicles Symposium (IV) 2019
DOI: 10.1109/ivs.2019.8813895
RoarNet: A Robust 3D Object Detection based on RegiOn Approximation Refinement

Abstract: We present RoarNet, a new approach for 3D object detection from a 2D image and 3D LiDAR point clouds. Based on a two-stage object detection framework ([1], [2]) with PointNet [3] as our backbone network, we suggest several novel ideas to improve 3D object detection performance. The first part of our method, RoarNet 2D, estimates the 3D poses of objects from a monocular image, which approximates where to examine further, and derives multiple candidates that are geometrically feasible. This step significantly narrows…

Cited by 173 publications (102 citation statements) · References 23 publications (40 reference statements)
“…LiDAR-Based 3D Object Detection. Existing works have explored three ways of processing the LiDAR data for 3D object detection: (1) As the convolutional neural networks (CNNs) can naturally process images, many works focus on projecting the LiDAR point cloud into the bird's eye view (BEV) images as a pre-processing step and then regressing the 3D bounding box based on the features extracted from the BEV images [2,56,57,24,20,64,59,63]; (2) On the other hand, one can divide the LiDAR point cloud into equally spaced 3D voxels and then apply 3D CNNs for 3D bounding box prediction [25,62,73]; (3) The most popular approach so far is to directly process the LiDAR point cloud through the neural network without pre-processing [22,10,45,65,61,40,41,44,11,71,16,54,34,23]. To this end, novel neural networks that can directly consume the point cloud are developed [7,35,47,69,18,53,15].…”
Section: Related Work
confidence: 99%
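The first of the three processing routes above, projecting the point cloud into a bird's-eye-view (BEV) image so that ordinary 2D CNNs can consume it, can be illustrated with a minimal sketch. This is not the implementation of any cited paper; the ranges, resolution, and max-height encoding are common but assumed choices:

```python
import numpy as np

def points_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0), res=0.1):
    """Project an (N, 3) LiDAR point cloud to a bird's-eye-view height
    image at `res` meters per pixel, keeping the max height per cell."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # Keep only points inside the region of interest.
    mask = (x >= x_range[0]) & (x < x_range[1]) & \
           (y >= y_range[0]) & (y < y_range[1])
    x, y, z = x[mask], y[mask], z[mask]
    # Discretize metric coordinates into integer pixel indices.
    xi = ((x - x_range[0]) / res).astype(int)
    yi = ((y - y_range[0]) / res).astype(int)
    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((h, w), dtype=np.float32)
    # Unbuffered in-place max: each cell ends up with its tallest point.
    np.maximum.at(bev, (xi, yi), z)
    return bev
```

The resulting 2D grid can then be fed to a standard image detector for 3D box regression, which is the appeal of this route.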
“…Following the tremendous advances in deep learning methods for computer vision, a large body of literature has investigated to what extent this technology could be applied towards object detection from lidar point clouds [31,29,30,11,2,21,15,28,26,25]. While there are many similarities between the modalities, there are two key differences: 1) the point cloud is a sparse representation, while an image is dense and 2) the point cloud is 3D, while the image is 2D.…”
Section: Introduction
confidence: 99%
“…PointNet [139] and its improved version PointNet++ [140] propose to predict individual features for each point and aggregate the features from several points via max pooling. This method was first introduced for 3D object recognition and later extended by Qi et al [105], Xu et al [104] and Shin et al [141] to 3D object detection in combination with RGB images. Furthermore, Wang et al [142] propose a new learnable operator called Parametric Continuous Convolution to aggregate points via a weighted sum, and Li et al [143] propose to learn a χ-transformation before applying a standard CNN to the transformed point-cloud features.…”
Section: LiDAR Point Clouds
confidence: 99%
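The PointNet idea quoted above, a shared per-point MLP followed by symmetric max pooling, can be sketched in a few lines. This is a toy illustration with random weights, not the published architecture; the layer sizes are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def pointnet_features(points, w1, w2):
    """Toy PointNet-style encoder: a shared per-point MLP (applied
    identically to every point) followed by max pooling, so the
    global feature is invariant to the ordering of the points."""
    h = np.maximum(points @ w1, 0.0)  # shared MLP layer 1 + ReLU
    h = np.maximum(h @ w2, 0.0)       # shared MLP layer 2 + ReLU
    return h.max(axis=0)              # symmetric (order-invariant) pooling

# Random weights stand in for trained parameters (illustration only).
w1 = rng.normal(size=(3, 16))
w2 = rng.normal(size=(16, 32))
pts = rng.normal(size=(128, 3))

f = pointnet_features(pts, w1, w2)
# Permuting the input points leaves the global feature unchanged.
perm = rng.permutation(128)
assert np.allclose(f, pointnet_features(pts[perm], w1, w2))
```

The max pooling is what makes the representation a set function: any permutation of the 128 points yields the same 32-dimensional feature, which is why this family of networks can consume raw, unordered point clouds.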