2022
DOI: 10.48550/arxiv.2206.10092
Preprint

BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection

Cited by 15 publications (97 citation statements)
References 0 publications
“…Among them, BEV grid augmentation is particularly important for this paradigm, which is also mentioned in [62]. In addition, for class-imbalanced issues, similar to LiDAR-based approaches, some methods [62], [82], [129] exploit CBGS [130] to increase the number of samples for long-tailed categories. However, to our best knowledge, there are still very few works targeting this problem.…”
Section: Training Details (mentioning)
confidence: 99%
“…To learn BEV representation from surrounding views, depth-based methods [33,36,21,16,43,35] infer depth in image views and project them to the BEV plane with the extrinsics and intrinsics, where Unsupervised depth estimation remains challenges. Although BEVdepth [21] improves feature extracting and downstream tasks performance compared with other paradigms, additional supervision is a key issue, relatively more difficult to obtain. In contrast, inspired by the transformer framework for 2D object detection [55,6], some works such as BEVformer [22,17,32,8] emphasizes directly learning the transformation relationship between image view and BEV based on the attention mechanism.…”
Section: Related Work (mentioning)
confidence: 99%
“…Deep neural network-based 3D object detectors, such as those presented in [13,15,18,21,37-40,43,44], have demonstrated promising performance on various challenging real-world benchmarks, including the KITTI [9], Waymo [32], and nuScenes [3] datasets. These popular approaches utilize either point clouds [15,37,43,44] or images [13,17,18,21,38-40] as inputs for detection tasks. In comparison to LIDAR-based methods, camera-based approaches have garnered significant attention due to their low deployment cost, high computational efficiency, and dense semantic information.…”
Section: Introduction (mentioning)
confidence: 99%