2021
DOI: 10.48550/arxiv.2104.00563
Preprint
Latent Variable Sequential Set Transformers For Joint Multi-Agent Motion Prediction

Cited by 4 publications
(6 citation statements)
References 0 publications
“…As Transformers (Vaswani et al 2017) have gained popularity, an increasing number of studies (Liu et al 2021; Ngiam et al 2021; Jia et al 2023) have utilized the attention mechanism to encode scene context. Encouraged by the successful application of DETR (Carion et al 2020), many Transformer-based models (Girgis et al 2021; Varadarajan et al 2022; Nayakanti et al 2023) have adopted learnable queries in the decoder to generate multiple potential future trajectories. In our study, we utilize the architecture presented in MTR (Shi et al 2022a), an advanced Transformer framework incorporating a local-attention-based encoder and a decoder with intention queries.…”
Section: Architectures For Motion Prediction
confidence: 99%
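The DETR-style decoding described above can be sketched as follows: a minimal NumPy illustration of learnable mode queries cross-attending to encoded scene context to produce K candidate trajectories. All shapes, names, and random weights here are illustrative assumptions, not the MTR or AutoBots implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def query_decoder(scene_ctx, queries, w_out):
    """One DETR-style decoding step: each learnable query cross-attends
    to the encoded scene tokens, then is projected to a trajectory.

    scene_ctx: (N, d)   encoded scene tokens
    queries:   (K, d)   learnable mode queries (one per future mode)
    w_out:     (d, T*2) projection to T future (x, y) waypoints
    """
    d = scene_ctx.shape[-1]
    attn = softmax(queries @ scene_ctx.T / np.sqrt(d), axis=-1)  # (K, N)
    mode_feats = attn @ scene_ctx                                 # (K, d)
    return (mode_feats @ w_out).reshape(queries.shape[0], -1, 2)  # (K, T, 2)

rng = np.random.default_rng(0)
N, d, K, T = 16, 32, 6, 12
ctx = rng.normal(size=(N, d))
q = rng.normal(size=(K, d))          # learned in practice; random stand-in here
w = rng.normal(size=(d, T * 2))
trajs = query_decoder(ctx, q, w)
print(trajs.shape)                   # (6, 12, 2): K trajectories of T waypoints
```

Each query specializes (through training, in a real model) to one trajectory mode, which is why a single forward pass yields all K hypotheses at once.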
“…In prediction-based matching methods (Ngiam et al 2021; Varadarajan et al 2022; Nayakanti et al 2023), the positive mixture component is chosen by directly comparing predicted trajectories to the ground truth. Some models (Tang and Salakhutdinov 2019; Girgis et al 2021) that use a loss based on the EM algorithm can also be viewed as prediction-based matching once the KL term converges. Due to the challenge of selecting representative future trajectories, these methods have opted to use well-designed aggregation techniques (Varadarajan et al 2022; Nayakanti et al 2023), or to directly utilize an end-to-end version (Ngiam et al 2021; Girgis et al 2021).…”
Section: Modeling For Multimodal Future Motion
confidence: 99%
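The prediction-based matching idea above can be sketched as a winner-takes-all selection, assuming average displacement error as the comparison metric (an illustrative choice; the cited papers each define their own criterion and losses):

```python
import numpy as np

def wta_match(pred_trajs, gt_traj):
    """Prediction-based matching (winner-takes-all): pick the mixture
    component whose predicted trajectory lies closest to the ground
    truth; only that mode would receive the regression loss.

    pred_trajs: (K, T, 2) candidate trajectories
    gt_traj:    (T, 2)    ground-truth future trajectory
    """
    # average displacement error (ADE) of each mode vs. the ground truth
    ade = np.linalg.norm(pred_trajs - gt_traj[None], axis=-1).mean(axis=-1)
    best = int(np.argmin(ade))       # index of the positive component
    return best, float(ade[best])

preds = np.zeros((3, 4, 2))
preds[1] += 1.0                      # mode 1 matches the ground truth exactly
gt = np.ones((4, 2))
best, loss = wta_match(preds, gt)
print(best, loss)                    # 1 0.0
```

Backpropagating only through the best-matching mode is what lets the remaining modes stay free to cover other plausible futures.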
“…Hence, the multimodal trajectory prediction for agent i can be regarded as estimating a mixture distribution. Note that we primarily focus on marginal motion prediction in this paper, but our approach can be smoothly extended to joint motion prediction tasks by involving scene-level loss functions [8,33]. We leave this as important future work.…”
Section: A Problem Formulation
confidence: 99%
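One common concrete form of such a mixture objective is the negative log-likelihood of the ground-truth trajectory under a Gaussian mixture centred on the K predicted trajectories. The sketch below is an illustrative assumption (isotropic Gaussians with a fixed `sigma`), not the exact loss from any cited paper:

```python
import numpy as np

def mixture_nll(pred_trajs, mode_probs, gt_traj, sigma=1.0):
    """NLL of the ground truth under a mixture of isotropic Gaussians,
    one component per predicted trajectory.

    pred_trajs: (K, T, 2) predicted trajectories (component means)
    mode_probs: (K,)      mixture weights, summing to 1
    gt_traj:    (T, 2)    ground-truth future trajectory
    """
    sq_err = ((pred_trajs - gt_traj[None]) ** 2).sum(axis=(-1, -2))  # (K,)
    log_comp = np.log(mode_probs) - sq_err / (2 * sigma ** 2)
    m = log_comp.max()                       # log-sum-exp for stability
    return float(-(m + np.log(np.exp(log_comp - m).sum())))

preds = np.zeros((2, 3, 2))
preds[1] += 100.0                            # mode 1 is far from the truth
gt = np.zeros((3, 2))                        # mode 0 matches exactly
nll_val = mixture_nll(preds, np.array([0.5, 0.5]), gt)
print(round(nll_val, 4))                     # 0.6931, i.e. -log(0.5)
```

When one component matches the ground truth exactly, the loss reduces to the negative log of that component's mixture weight, which is what pushes probability mass toward the realized mode.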
“…Mercat et al [20] achieved trajectory prediction for multiple vehicle agents by introducing self-attention mechanisms that consider interactions between vehicles. Girgis et al [21] proposed the AutoBots model, which uses social multi-head self-attention (MHSA) modules to efficiently perform single-pass forward inference for the entire future scene, demonstrating high performance in complex traffic scenarios with multi-agent interactions. The adoption of deep learning models like Transformers and MHSA modules has significantly advanced multi-agent trajectory prediction in complex traffic scenarios.…”
Section: Introduction
confidence: 99%
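The social attention idea referenced above (every agent attending to every other agent in a single pass) can be sketched as plain multi-head self-attention over per-agent feature vectors. This is a minimal NumPy illustration, not the AutoBots module; the random projection weights stand in for learned parameters:

```python
import numpy as np

def social_mhsa(agent_feats, n_heads=4):
    """Multi-head self-attention across agents: each agent's output
    feature is an attention-weighted mix of all agents' features.

    agent_feats: (A, d) one feature vector per agent; d % n_heads == 0
    """
    A, d = agent_feats.shape
    dh = d // n_heads
    rng = np.random.default_rng(0)
    wq, wk, wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
    # project and split into heads: (H, A, dh)
    q = (agent_feats @ wq).reshape(A, n_heads, dh).transpose(1, 0, 2)
    k = (agent_feats @ wk).reshape(A, n_heads, dh).transpose(1, 0, 2)
    v = (agent_feats @ wv).reshape(A, n_heads, dh).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)        # (H, A, A)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn = e / e.sum(axis=-1, keepdims=True)               # softmax over agents
    out = (attn @ v).transpose(1, 0, 2).reshape(A, d)      # merge heads: (A, d)
    return out

feats = np.random.default_rng(1).normal(size=(5, 16))      # 5 agents, d = 16
out = social_mhsa(feats)
print(out.shape)                                           # (5, 16)
```

Because the (A, A) attention matrix couples all agents at once, interaction modeling for the whole scene happens in one forward pass rather than per-agent loops.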