2021
DOI: 10.48550/arxiv.2103.14023
Preprint
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting

Abstract: Predicting accurate future trajectories of multiple agents is essential for autonomous systems but is challenging due to the complex interaction between agents and the uncertainty in each agent's future behavior. Forecasting multi-agent trajectories requires modeling two key dimensions: (1) the time dimension, where we model the influence of past agent states over future states; (2) the social dimension, where we model how the state of each agent affects others. Most prior methods model these two dimensions separately, …
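The abstract's central idea — modeling the time and social dimensions jointly rather than separately — can be illustrated with a small sketch. The snippet below flattens all agents' states across timesteps into one sequence and applies attention that uses different projections for same-agent and other-agent pairs, so temporal and social influence are handled in a single operation. This is an illustration of the idea only, not the authors' implementation; all weight matrices are random stand-ins for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def agent_aware_attention(states, agent_ids, d=16, seed=0):
    """Toy socio-temporal attention over a flattened (time x agent) sequence.

    states:    (L, d_in) agent states, all timesteps of all agents in one sequence
    agent_ids: (L,) which agent each row belongs to
    Separate query/key projections are used for same-agent vs. other-agent
    pairs, so attention can distinguish an agent's own past (temporal
    influence) from the states of other agents (social influence).
    """
    rng = np.random.default_rng(seed)
    L, d_in = states.shape
    # Random projections stand in for learned weights (illustration only).
    Wq_self, Wk_self = rng.standard_normal((2, d_in, d))
    Wq_other, Wk_other = rng.standard_normal((2, d_in, d))
    Wv = rng.standard_normal((d_in, d))

    same = agent_ids[:, None] == agent_ids[None, :]              # (L, L) mask
    scores_self = (states @ Wq_self) @ (states @ Wk_self).T      # intra-agent
    scores_other = (states @ Wq_other) @ (states @ Wk_other).T   # inter-agent
    scores = np.where(same, scores_self, scores_other) / np.sqrt(d)
    return softmax(scores, axis=-1) @ (states @ Wv)              # (L, d)

# Two agents, three timesteps each, flattened into one length-6 sequence.
states = np.random.default_rng(1).standard_normal((6, 4))
ids = np.array([0, 1, 0, 1, 0, 1])  # rows interleaved by timestep
out = agent_aware_attention(states, ids)
print(out.shape)  # (6, 16)
```

Because the sequence mixes timesteps and agents, every attention row can draw on both an agent's own history and its neighbors' states at once, which is the coupling the abstract argues prior separate-dimension models lose.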

Cited by 8 publications (21 citation statements)
References 46 publications
“…In inference, we use greedy decoding to decode the future trajectory, and based on the experiments, using greedy decoding in training is able to increase the trajectory prediction accuracy during inference. This increase is also observed in [22]. Therefore, in both training and inference, the decoder decodes the future trajectory of the camera wearer in a greedy autoregressive manner.…”
Section: Model Structure
confidence: 82%
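The citation statement above describes greedy autoregressive decoding: the model predicts one future step at a time and feeds each prediction back as input for the next. A minimal sketch of that loop, using a hypothetical constant-velocity one-step model purely as a stand-in for a learned predictor:

```python
import numpy as np

def greedy_autoregressive_decode(history, step_fn, horizon):
    """Decode a future trajectory one step at a time, feeding each prediction
    back as input to the next step (greedy: no sampling, no beam search)."""
    traj = list(history)
    preds = []
    for _ in range(horizon):
        nxt = step_fn(np.asarray(traj))  # one-step model: past -> next position
        preds.append(nxt)
        traj.append(nxt)                 # autoregressive feedback
    return np.asarray(preds)

# Hypothetical one-step model: extrapolate constant velocity
# from the last two observed/predicted positions.
def constant_velocity_step(traj):
    return traj[-1] + (traj[-1] - traj[-2])

history = [np.array([0.0, 0.0]), np.array([1.0, 0.5])]
future = greedy_autoregressive_decode(history, constant_velocity_step, horizon=3)
print(future)  # [[2. 1.] [3. 1.5] [4. 2.]]
```

Swapping the greedy `step_fn` for a sampling one would yield diverse futures; the quoted work reports that committing to the greedy step during training as well as inference improves accuracy.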
“…STAR [21] predicts pedestrian trajectories with only the attention mechanism, which is achieved by a graph-based spatial transformer and a temporal transformer. AgentFormer [22], a transformer-based framework, jointly models temporal and social dimensions in human motion dynamics to predict future trajectories. Our model is also based on transformer, but we differ from them in that 1) we target for egocentric scenarios, and 2) our model encodes multiple modalities with a novel cascaded cross-attention mechanism, whereas their models are designed to use the past trajectories as the only cue for the future trajectory prediction.…”
Section: A Non-egocentric Human Trajectory Prediction
confidence: 99%
“…Recently, people have found transformers to have stronger encoding abilities and the advantage of avoiding recursion [13]. The current state-of-the-art algorithm of predicting human trajectories on ETH dataset, AgentFormer, uses transformer from end to end [14]. Therefore, we also experiment using transformer structure as encoder in addition to lstm to extract information from observed trajectories.…”
Section: Related Work
confidence: 99%
“…In general, researchers use transformers to process the node embedding in two orthogonal directions: first, through the node-wise residual feature transformation, an arbitrary type of intra-node transformation is enabled [18,47,48]; second, through the attention mechanism, features from different nodes are dynamically aggregated and the inter-nodes relationships are captured [48]. Previous efforts have shown the potential of transformers in multi-agent system [49], by flattening connections features across time and agents.…”
Section: Introduction
confidence: 99%
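The two "orthogonal directions" named in the last citation statement — a node-wise residual feature transformation and attention-based inter-node aggregation — correspond to the two sublayers of a standard transformer block. A minimal sketch under that reading, with random matrices standing in for learned weights:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transformer_block(nodes, d=8, seed=0):
    """Sketch of the two directions described above:
    (1) attention dynamically aggregates features across nodes, capturing
        inter-node relationships;
    (2) a node-wise residual feed-forward transform is applied to each node
        independently (intra-node transformation)."""
    rng = np.random.default_rng(seed)
    n, d_in = nodes.shape
    Wq, Wk, Wv = rng.standard_normal((3, d_in, d_in))
    # (1) inter-node aggregation via attention, with a residual connection
    attn = softmax((nodes @ Wq) @ (nodes @ Wk).T / np.sqrt(d_in), axis=-1)
    nodes = nodes + attn @ (nodes @ Wv)
    # (2) node-wise residual feature transformation (same MLP for every node)
    W1 = rng.standard_normal((d_in, d))
    W2 = rng.standard_normal((d, d_in))
    nodes = nodes + np.maximum(nodes @ W1, 0) @ W2  # ReLU MLP + residual
    return nodes

out = transformer_block(np.random.default_rng(1).standard_normal((5, 4)))
print(out.shape)  # (5, 4)
```

In the multi-agent setting the citation describes, the "nodes" would be agent states flattened across time and agents, so the same block captures connections across both axes.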