2020
DOI: 10.48550/arxiv.2001.03343
Preprint

RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving

Abstract: In this work, we propose an efficient and accurate monocular 3D detection framework in single shot. Most successful 3D detectors take the projection constraint from the 3D bounding box to the 2D box as an important component. Four edges of a 2D box provide only four constraints and the performance deteriorates dramatically with the small error of the 2D detector. Different from these approaches, our method predicts the nine perspective keypoints of a 3D bounding box in image space, and then utilize the geometr…
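To make the "nine perspective keypoints" in the abstract concrete, here is a minimal sketch of how the eight corners and the center of a 3D box map to nine image points under a pinhole camera. It is an illustration only, not the paper's code: the function names `box_keypoints_3d` and `project`, the axis convention (x right, y down, z forward), the `(h, w, l)` dimension order, the yaw-only rotation, and the intrinsics `K` are all assumptions. The network described in the abstract predicts these image points; this sketch only shows the forward geometry that relates them to a 3D box.

```python
import numpy as np

def box_keypoints_3d(center, dims, yaw):
    """Return 8 box corners plus the box center, shape (9, 3).

    Assumed convention: camera coordinates (x right, y down, z forward),
    dims = (h, w, l), center = geometric center, yaw = rotation about y.
    """
    h, w, l = dims
    # Corner offsets in the box's own frame (length along x, height along y).
    x = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    y = np.array([h, h, h, h, -h, -h, -h, -h]) / 2.0
    z = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    pts = np.stack([x, y, z], axis=1)
    pts = np.vstack([pts, np.zeros(3)])  # ninth keypoint: the box center
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return pts @ R.T + np.asarray(center, dtype=float)

def project(points, K):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels."""
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

# Hypothetical intrinsics and box, chosen only for illustration.
K = np.array([[700.0, 0.0, 600.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])
keypoints_2d = project(box_keypoints_3d((0.0, 0.0, 20.0), (1.5, 1.6, 4.0), 0.0), K)
```

Each keypoint contributes two image coordinates, so nine keypoints give eighteen constraints on the box, compared with the four constraints from the edges of a 2D box mentioned in the abstract.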

Cited by 32 publications (49 citation statements)
References 47 publications
“…A quantitative comparison between 2D and 3D detectors is included in Section 5.2. methods. On state-of-the-art 3D detection benchmarks [2,12], state-of-the-art monocular methods [26,41] achieve about half the mAP detection accuracy, of standard Lidar based baselines [60]. Pseudo-Lidar [56] based methods produce a virtual point cloud from RGB images, similar to our approach.…”
Section: Related Work (mentioning)
confidence: 73%
“…Such methods are useful for traffic studies that require counts of vehicle turns and traffic queues, and not necessarily full 3D bounding boxes to represent moving objects. While the sensors' mounting is not the same, methods for 3D object detection in the context of autonomous driving have developed interesting ideas such as carefully designed loss functions [22], the use of auxiliary CAD models [23], geometric constraints between 2D and 3D bounding boxes [24], stereo imaging [25], and keypoint detection [26] [27]. However, most of these methods are designed for 3D detection up to 40m [25][28] and their direct implementation at a longer range produces significantly less accurate results, especially when looking at roads with changing slope.…”
Section: B Image-based 3D Object Detection (mentioning)
confidence: 99%
“…There are mainly two types of methods: RGB image-based and pseudo-LiDAR based. Former employ detection networks like CenterNet [34] to predict the bounding boxes [35,36,37,38] directly from images. The latter perform the detection over the pseudo-LiDAR representation, which is lifted from the dense depth prediction [39,40,41,42].…”
Section: Scale-invariant Loss (mentioning)
confidence: 99%
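
The last statement describes a pseudo-LiDAR representation "lifted from the dense depth prediction". As a rough illustration of that lifting step only, the sketch below back-projects a depth map into a 3D point cloud with a pinhole model. The function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions made for this example; it is not taken from any of the cited methods.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map to camera-frame points, shape (N, 3).

    Assumes a pinhole camera with depth measured along the optical axis;
    pixels with non-positive depth are dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column and row grids
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]

# Tiny hypothetical depth map: every pixel is 10 m away.
depth = np.full((2, 3), 10.0)
cloud = depth_to_points(depth, fx=100.0, fy=100.0, cx=1.0, cy=0.5)
```

A LiDAR-style 3D detector can then consume `cloud` in place of a measured point cloud, which is the general idea behind the pseudo-LiDAR family the statement refers to.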