Self-Supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map

Amirloo, Elmira; Rohani, Mohsen; Banijamali, Ershad; Luo, Jun; Poupart, Pascal

doi:10.1109/cvpr46437.2021.00839

Cited by 4 publications (4 citation statements)

References 37 publications (44 reference statements)


“…Trajectory prediction is of vital importance to many artificial intelligent applications. There is a large body of works on this topic designed to predict behavior of pedestrians [21,28,37,45,50] and vehicles [1,10,11,13,19,36].…”

Section: Trajectory Prediction Methods (mentioning)

confidence: 99%

LatentFormer: Multi-Agent Transformer-Based Interaction Modeling and Trajectory Prediction

Amirloo, Rasouli, Lakner et al. 2022

Preprint

Self Cite


Multi-agent trajectory prediction is a fundamental problem in autonomous driving. The key challenges in prediction are accurately anticipating the behavior of surrounding agents and understanding the scene context. To address these problems, we propose LatentFormer, a transformer-based model for predicting future vehicle trajectories. The proposed method leverages a novel technique for modeling interactions among dynamic objects in the scene. Contrary to many existing approaches which model cross-agent interactions during the observation time, our method additionally exploits the future states of the agents. This is accomplished using a hierarchical attention mechanism where the evolving states of the agents autoregressively control the contributions of past trajectories and scene encodings in the final prediction. Furthermore, we propose a multiresolution map encoding scheme that relies on a vision transformer module to effectively capture both local and global scene context to guide the generation of more admissible future trajectories. We evaluate the proposed method on the nuScenes benchmark dataset and show that our approach achieves state-of-the-art performance and improves upon trajectory metrics by up to 40%. We further investigate the contributions of various components of the proposed technique via extensive ablation studies.


“…With the objective of encoding traffic scenes similar to ours (albeit not in the context of motion planning), encoder architectures for learning representations of occupancy maps have been proposed [50]-[52]. Using graphical or otherwise spatially-aware encoders similar to ours, recent works such as [53]-[57] predict occupancy grids [58] as an intermediate learning target for guiding the training of neural motion planners. However, these approaches do not provide global, low-dimensional representations appropriate for decoupled RL agents.…”

Section: Applications To Motion Planning (mentioning)

confidence: 99%

Deep Occupancy-Predictive Representations for Autonomous Driving

Meyer, Peiss, Althoff 2023

Preprint


Manually specifying features that capture the diversity in traffic environments is impractical. Consequently, learning-based agents cannot realize their full potential as neural motion planners for autonomous vehicles. Instead, this work proposes to learn which features are task-relevant. Given its immediate relevance to motion planning, our proposed architecture encodes the probabilistic occupancy map as a proxy for obtaining pre-trained state representations of the environment. By leveraging a map-aware traffic graph formulation, our agent-centric encoder generalizes to arbitrary road networks and traffic situations. We show that our approach significantly improves the downstream performance of a reinforcement learning agent operating in urban traffic environments.


“…Semantic scene completion is a similar task to semantic mapping, however it is generally defined using sensor data from only a single frame. While there exist some deep learning 3D mapping methods, it is not a common task due to the lack of accurate outdoor dynamic data to supervise and quantify performance on [17, 45-48]. SSC is currently a difficult task with minimal generalizability to real life due to the lack of accurately labeled dynamic scenes, as discussed in the previous section.…”

Section: B Semantic Scene Completion (mentioning)

confidence: 99%

“…Dynamic occupancy maps construct binary labels for cells indicating free or occupied, and extend their domain to scenes with dynamic actors by incorporating scene dynamics [13-15]. While learning-based approaches have been attempted in 2D [16,17], most 3D maps rely on feature engineering which can decrease performance and efficiency.…”

Section: Introduction (mentioning)

confidence: 99%

MotionSC: Data Set and Network for Real-Time Semantic Mapping in Dynamic Environments

Wilson, Song, Fu et al. 2022

Preprint


This work addresses a gap in semantic scene completion (SSC) data by creating a novel outdoor data set with accurate and complete dynamic scenes. Our data set is formed from randomly sampled views of the world at each time step, which supervises generalizability to complete scenes without occlusions or traces. We create SSC baselines from state-of-the-art open source networks and construct a benchmark real-time dense local semantic mapping algorithm, MotionSC, by leveraging recent 3D deep learning architectures to enhance SSC with temporal information. Our network shows that the proposed data set can quantify and supervise accurate scene completion in the presence of dynamic objects, which can lead to the development of improved dynamic mapping algorithms. All software is available at https://github.com/UMich-CURLY/3DMapping.

