2022
DOI: 10.48550/arxiv.2208.02797
Preprint

Vision-Centric BEV Perception: A Survey

Abstract: Vision-centric BEV perception has recently received increased attention from both industry and academia due to its inherent merits, including presenting a natural representation of the world and being fusion-friendly. With the rapid development of deep learning, numerous methods have been proposed to address the vision-centric BEV perception. However, there is no recent survey for this novel and growing research field. To stimulate its future research, this paper presents a comprehensive survey of recent progr…

Cited by 6 publications (8 citation statements)
References 83 publications (138 reference statements)

“…Projection-based methods generate dense voxel or BEV representation from image features through 3D-to-2D projection [1]. ImVoxelNet [16] aggregated the projected features from several images via a simple element-wise averaging, where spatial information might not be exploited sufficiently.…”
Section: B Camera-based Feature Fusion; citation type: mentioning
confidence: 99%
“…Subject to sensor limitations, autonomous vehicles lack a global perception capability for monitoring holistic road conditions and accurately detecting surrounding objects, which bears great safety risks [1], [2]. Vehicle-to-everything (V2X) [3], [4] aims to build a communication system between vehicles and other devices in a complex traffic environment.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Online HD map construction can be seen as an advanced setting of lane detection, consisting of lines and polygons with various semantics in the local 360° FOV perception range of ego-vehicle. With advanced 2D-to-BEV modules [30], previous online HD map construction methods cast it into semantic segmentation task on the transformed BEV features [27,38,19,28,26,29]. To build vectorized semantic HD map, HDMapNet [18] follows a segmentation-then-vectorization paradigm.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…Projection-based methods generate dense voxel or BEV representation from image features through 3D-to-2D projection [19]. ImVoxelNet [25] aggregated the projected features from several images via a simple element-wise averaging, where spatial information might not be exploited sufficiently.…”
Section: Multi-view Camera Fusion; citation type: mentioning
confidence: 99%