2021
DOI: 10.48550/arxiv.2110.07288
Preprint

View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums

Citations: cited by 3 publications (19 citation statements)
References: 22 publications
“…Finally, some methods use non-recurrent methods to encode agents' historical trajectories (e.g., Hasan et al, 2018; Mangalam et al, 2020a; Wong et al, 2021). While Hasan et al (2018) incorporates the position and gaze direction of all agents in the scene, at each timestep, into a simple kinematic model that calculates their velocity, acceleration, and orientation at each point in time, Mangalam et al (2020a) plots each history position onto a birds-eye-view heatmap (which are the same dimensions as the input map) where the values decrease inverse proportionally to the distance from that position, and then concatenates these heatmaps with the input map encoding to put through a U-net (Ronneberger et al, 2015) style architecture.…”
Section: Trajectory Prediction (mentioning)
confidence: 99%
“…This non-quantized way of representing past histories allows aleatoric uncertainty (noise) to be incorporated within the system, and facilitates the combining of history and map encodings when predicting the future (Mangalam et al, 2020a). Wong et al (2021), however, encodes history coordinates by their discrete Fourier transform (DFT) frequencies (similar to the DCT in human motion prediction) by applying a 1D-DFT on each dimension of the trajectory (x and y), obtaining a magnitude and phase sequence for each dimension, and embedding the concatenated magnitude and phase sequences into a higher dimension through an MLP. Since pedestrians typically make a coarse overall motion decision first, and then respond to potential emergencies (like interactive behaviors) with quicker maneuvers, Wong et al (2021) use the DFT frequencies to take advantage of the fact that low-frequency portions in the spectrums of agents' observed trajectories reflect coarse motion trends and can potentially predict global future motion, while the high-frequency portions reflect quicker movements, by separating the motion forecasting into a coarse trajectory prediction which is then refined by a fine interpolation network (Wong et al, 2021).…”
Section: Trajectory Prediction (mentioning)
confidence: 99%
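
The encoding described in the statement above (a 1D DFT per coordinate axis, then concatenated magnitude and phase sequences passed to an MLP) can be illustrated with a small NumPy sketch. This is not the cited authors' implementation: the function names `dft_features` and `embed`, the single `tanh` layer standing in for the MLP, the embedding width of 16, and the random weights are all assumptions made for illustration only.

```python
import numpy as np

def dft_features(traj):
    """Map a (T, 2) trajectory of x, y positions to a (T, 4) array:
    magnitudes for x and y, followed by phases for x and y."""
    spec = np.fft.fft(traj, axis=0)      # 1D DFT along time, per axis
    return np.concatenate([np.abs(spec), np.angle(spec)], axis=1)

def inverse_dft(feats):
    """Recover the (T, 2) trajectory from magnitude/phase features."""
    mag, phase = feats[:, :2], feats[:, 2:]
    return np.fft.ifft(mag * np.exp(1j * phase), axis=0).real

def embed(feats, w, b):
    """Hypothetical one-layer stand-in for the MLP embedding."""
    return np.tanh(feats @ w + b)

# Toy observation: 8 steps moving along x at unit speed, y fixed at 0.
traj = np.stack([np.arange(8.0), np.zeros(8)], axis=1)
feats = dft_features(traj)

rng = np.random.default_rng(0)           # random weights, illustration only
w, b = rng.normal(size=(4, 16)), np.zeros(16)
emb = embed(feats, w, b)

recovered = inverse_dft(feats)
print(feats.shape, emb.shape, np.allclose(recovered, traj))
```

Because magnitude and phase together determine the complex spectrum, the representation is lossless: the inverse DFT returns the original coordinates. The zero-frequency bin carries the sum of the positions along each axis, and the low-frequency bins above it describe the coarse trend that the statement refers to.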
“…Many works have tackled the problem of path prediction in recent years. Most of them only focus on multi-future path prediction [42,44,26,36], and evaluate the model by picking the best out of several predicted paths. However, in real-time scenarios generating multiple outputs per subject is not insightful.…”
Section: Pedestrian Bird's-eye View Path Prediction (mentioning)
confidence: 99%
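
The "best out of several predicted paths" evaluation mentioned in the statement above is commonly computed as a best-of-K average displacement error. The sketch below is one plausible reading under that assumption, not the metric definition of any specific cited work; the function names `ade` and `best_of_k_ade` and the toy data are hypothetical.

```python
import numpy as np

def ade(pred, gt):
    """Average displacement error between two (T, 2) paths."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def best_of_k_ade(preds, gt):
    """preds: (K, T, 2) candidate paths, gt: (T, 2) ground truth.
    Returns the lowest ADE over the K candidates and its index."""
    errors = np.linalg.norm(preds - gt[None], axis=-1).mean(axis=1)
    return float(errors.min()), int(errors.argmin())

# Toy ground truth: 4 steps along x. Candidates are shifted in y by 1, 0, 2.
gt = np.stack([np.arange(4.0), np.zeros(4)], axis=1)
preds = np.stack([gt + [0.0, 1.0], gt, gt + [0.0, 2.0]])

best_err, best_idx = best_of_k_ade(preds, gt)
single_err = ade(preds[0], gt)           # error if only one path is emitted
print(best_err, best_idx, single_err)
```

The gap between `best_err` and `single_err` shows why the two protocols are not comparable: the best-of-K score uses the ground truth to select a candidate, which a deployed system that must commit to one path cannot do.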
“…However, many existing path predictions rely on a spectrum of predictions with significantly large model sizes. Most works in this context predict multiple future trajectories and choose the best to assess their accuracy and performance [42,26,36]. Predicting the spectrum of possibilities and picking the best one makes the real-world implementation for real-world applications with the demand for a single prediction per each subject infeasible.…”
Section: Introduction (mentioning)
confidence: 99%
“…V [49] is a concurrent method proposing a two-stage Transformer network to model the trajectory and its Fourier spectrum in the keypoints and interactions levels, respectively.…”
Section: Performance Evaluation (mentioning)
confidence: 99%