“…Considering the dimension gap between the 2D image input and the 3D prediction, recent studies on vision-based 3D perception first construct BEV feature representations and then perform various downstream tasks in the BEV space [20,29,31,39,60,40,62,42,19,1,44]. To transform perspective image features into BEV features, LSS [40] and its follow-ups [42,29,19,60] predict a pixel-wise depth distribution to lift the image features into 3D points, which are then voxelized into BEV features.…”
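The lift-splat operation described above can be sketched in a few lines of NumPy. This is a simplified illustration, not LSS's actual implementation: the function name `lift_splat`, the tensor shapes, and the precomputed frustum point coordinates `frustum_xyz` (which in practice come from camera intrinsics/extrinsics) are all assumptions made for the example.

```python
import numpy as np

def lift_splat(img_feats, depth_logits, frustum_xyz, bev_shape, voxel_size):
    """Lift image features to 3D with a predicted depth distribution,
    then splat the resulting points onto a BEV grid (LSS-style sketch)."""
    C, H, W = img_feats.shape
    # 1. Lift: softmax over D depth bins gives a per-pixel depth distribution.
    e = np.exp(depth_logits - depth_logits.max(axis=0, keepdims=True))
    depth_prob = e / e.sum(axis=0, keepdims=True)             # (D, H, W)
    # Outer product: each candidate 3D point carries the pixel's features
    # weighted by the probability of its depth bin.
    frustum_feats = depth_prob[None] * img_feats[:, None]     # (C, D, H, W)
    # 2. Splat: voxelize points into BEV cells and sum their features.
    xs = np.floor(frustum_xyz[..., 0] / voxel_size).astype(int)
    ys = np.floor(frustum_xyz[..., 1] / voxel_size).astype(int)
    keep = (xs >= 0) & (xs < bev_shape[0]) & (ys >= 0) & (ys < bev_shape[1])
    bev = np.zeros((bev_shape[0], bev_shape[1], C))
    np.add.at(bev, (xs[keep], ys[keep]), frustum_feats[:, keep].T)
    return bev.transpose(2, 0, 1)                             # (C, X, Y)

# Toy example: random features/logits, frustum points inside a 16x16 grid.
rng = np.random.default_rng(0)
C, D, H, W = 4, 8, 3, 3
feats = rng.normal(size=(C, H, W))
logits = rng.normal(size=(D, H, W))
xyz = rng.uniform(0.0, 16.0, size=(D, H, W, 3))
bev_out = lift_splat(feats, logits, xyz, (16, 16), voxel_size=1.0)
```

Because the depth distribution sums to one per pixel, the total feature mass per channel is preserved when every frustum point lands inside the grid, which makes the splat step easy to sanity-check.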