View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums

Wong, Conghao; Xia, Beihao; Hong, Ziming; Peng, Qinmu; You, Xinge

doi:10.48550/arxiv.2110.07288

Cited by 3 publications

(19 citation statements)

References 22 publications


“…Finally, some methods use non-recurrent methods to encode agents' historical trajectories (e.g., Hasan et al, 2018; Mangalam et al, 2020a; Wong et al, 2021). While Hasan et al (2018) incorporates the position and gaze direction of all agents in the scene, at each timestep, into a simple kinematic model that calculates their velocity, acceleration, and orientation at each point in time, Mangalam et al (2020a) plots each history position onto a birds-eye-view heatmap (which are the same dimensions as the input map) where the values decrease inverse proportionally to the distance from that position, and then concatenates these heatmaps with the input map encoding to put through a U-net (Ronneberger et al, 2015) style architecture.…”

Section: Trajectory Prediction (mentioning)

confidence: 99%

“…This non-quantized way of representing past histories allows aleatoric uncertainty (noise) to be incorporated within the system, and facilitates the combining of history and map encodings when predicting the future (Mangalam et al, 2020a). Wong et al (2021), however, encodes history coordinates by their discrete fourier transform (DFT) frequencies (similar to the DCT in human motion prediction) by applying a 1D-DFT on each dimension of the trajectory (x and y), obtaining a magnitude and phase sequence for each dimension, and embedding the concatenated magnitude and phase sequences into a higher dimension through an MLP. Since pedestrians typically make a coarse overall motion decision first, and then respond to potential emergencies (like interactive behaviors) with quicker maneuvers, Wong et al (2021) use the DFT frequencies to take advantage of the fact that low-frequency portions in the spectrums of agents' observed trajectories reflect coarse motion trends and can potentially predict global future motion, while the high-frequency portions reflect quicker movements, by separating the motion forecasting into a coarse trajectory prediction which is then refined by a fine interpolation network (Wong et al, 2021).…”

Section: Trajectory Prediction (mentioning)

confidence: 99%
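
The Fourier encoding described in the statement above can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function names (`dft_encode`, `dft_decode`, `low_pass`, `embed`), the feature layout, and the untrained random-weight layer standing in for the MLP are all assumptions made for illustration.

```python
import numpy as np


def dft_encode(traj):
    """Apply a 1D DFT along time to each coordinate of a (T, 2) trajectory.

    Returns a (T, 4) array laid out as [mag_x, mag_y, phase_x, phase_y]
    (this layout is an assumption, not taken from the paper).
    """
    traj = np.asarray(traj, dtype=float)
    spec = np.fft.fft(traj, axis=0)          # one spectrum per coordinate
    return np.concatenate([np.abs(spec), np.angle(spec)], axis=1)


def dft_decode(feat):
    """Invert dft_encode: rebuild the complex spectrum, then inverse DFT."""
    half = feat.shape[1] // 2
    spec = feat[:, :half] * np.exp(1j * feat[:, half:])
    return np.fft.ifft(spec, axis=0).real


def low_pass(traj, keep):
    """Keep only frequency bins with index <= keep (coarse motion trend)."""
    traj = np.asarray(traj, dtype=float)
    spec = np.fft.fft(traj, axis=0)
    n = spec.shape[0]
    k = np.arange(n)
    freq_index = np.minimum(k, n - k)        # bin k and bin n-k share a frequency
    spec[freq_index > keep] = 0
    return np.fft.ifft(spec, axis=0).real


def embed(feat, dim, seed=0):
    """Hypothetical stand-in for the MLP: one untrained tanh layer."""
    rng = np.random.default_rng(seed)
    flat = feat.reshape(-1)
    weights = rng.normal(scale=0.1, size=(dim, flat.size))
    return np.tanh(weights @ flat)


# Toy 8-step observed trajectory (made-up numbers).
traj = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.5], [3.0, 1.0],
                 [4.0, 2.0], [5.0, 2.5], [6.0, 2.0], [7.0, 3.0]])
feat = dft_encode(traj)
coarse = low_pass(traj, keep=1)              # low-frequency trend only
vector = embed(feat, dim=16)
```

Keeping only the lowest bins yields a smoothed path that mirrors the "coarse motion trend" idea, while the discarded high-frequency bins carry the quicker maneuvers; how the paper actually splits the two stages is not specified in the statement.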


Prediction of Social Dynamic Agents and Long-Tailed Learning Challenges: A Survey

Thuremella, Kunze. 2023. JAIR.


Autonomous robots that can perform common tasks like driving, surveillance, and chores have the biggest potential for impact due to frequency of usage, and the biggest potential for risk due to direct interaction with humans. These tasks take place in open-ended environments where humans socially interact and pursue their goals in complex and diverse ways. To operate in such environments, such systems must predict this behavior, especially when the behavior is unexpected and potentially dangerous. Therefore, we summarize trends in various types of tasks, modeling methods, datasets, and social interaction modules aimed at predicting the future location of dynamic, socially interactive agents. Furthermore, we describe long-tailed learning techniques from classification and regression problems that can be applied to prediction problems. To our knowledge this is the first work that reviews social interaction modeling within prediction, and long-tailed learning techniques within regression and prediction.


“…Many works have tackled the problem of path prediction in recent years. Most of them only focus on multi-future path prediction [42,44,26,36], and evaluate the model by picking the best out of several predicted paths. However, in real-time scenarios generating multiple outputs per subject is not insightful.…”

Section: Pedestrian Bird's-eye View Path Prediction (mentioning)

confidence: 99%

“…However, many existing path predictions rely on a spectrum of predictions with significantly large model sizes. Most works in this context predict multiple future trajectories and choose the best to assess their accuracy and performance [42,26,36]. Predicting the spectrum of possibilities and picking the best one makes the real-world implementation for real-world applications with the demand for a single prediction per each subject infeasible.…”

Section: Introduction (mentioning)

confidence: 99%
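
The best-of-several evaluation that the two statements above criticize can be sketched as follows. This is a minimal illustration, assuming the usual definitions of ADE (mean point-wise Euclidean error) and FDE (error at the final step); the function name `best_of_k` and the toy numbers are hypothetical, and taking the two minima independently is an assumption about the protocol.

```python
import numpy as np


def best_of_k(preds, gt):
    """Return (minADE, minFDE) over K candidate paths.

    preds: (K, T, 2) candidate future paths; gt: (T, 2) ground truth.
    """
    preds = np.asarray(preds, dtype=float)
    gt = np.asarray(gt, dtype=float)
    errs = np.linalg.norm(preds - gt[None], axis=-1)   # (K, T) point errors
    ade_per_path = errs.mean(axis=1)                   # average over time
    fde_per_path = errs[:, -1]                         # final step only
    return float(ade_per_path.min()), float(fde_per_path.min())


# Toy example: ground truth moves along the x axis.
gt = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
path_a = gt + np.array([0.0, 1.0])    # constant 1.0 offset
path_b = gt + np.array([0.0, 0.5])    # constant 0.5 offset

single = best_of_k(path_a[None], gt)              # one prediction per subject
multi = best_of_k(np.stack([path_a, path_b]), gt)  # best of two
```

With a single prediction the score is that of the one path; with several candidates the reported score is that of whichever candidate happens to be closest, which requires the ground truth and so is not available at deployment time.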

Pishgu: Universal Path Prediction Architecture through Graph Isomorphism and Attentive Convolution

Noghre, Vinit, Pazho et al. 2022. Preprint.


Path prediction is an essential task for several real-world real-time applications, from autonomous driving and video surveillance to environmental monitoring. Most existing approaches are computation-intensive and only target a narrow domain (e.g., a specific point of view for a particular subject). However, many real-time applications demand a universal path predictor that can work across different subjects (vehicles, pedestrians), perspectives (bird's-eye, high-angle), and scenes (sidewalk, highway). This article proposes Pishgu, a universal graph isomorphism approach for attentive path prediction that accounts for environmental challenges. Pishgu captures the inter-dependencies within the subjects in each frame by taking advantage of Graph Isomorphism Networks. In addition, an attention module is adopted to represent the intrinsic relations of the subjects of interest with their surroundings. We evaluate the adaptability of our approach to multiple publicly available vehicle (bird's-eye view) and pedestrian (bird's-eye and high-angle view) path prediction datasets. Pishgu's universal solution outperforms existing domain-focused methods by producing state-of-the-art results for vehicle bird's-eye view by 42% and 61% and pedestrian high-angle views by 23% and 22% in terms of ADE and FDE, respectively. Moreover, we analyze the domain-specific details for various datasets to understand their effect on path prediction and model interpretation. Although our model is a single solution for path prediction problems and defines a new standard in multiple domains, it still has a comparable complexity to state-of-the-art models, which makes it suitable for real-world application. We also report the latency and throughput for all three domains on multiple embedded processors.


“…V [49] is a concurrent method proposing a two-stage Transformer network to model the trajectory and its Fourier spectrum in the keypoints and interactions levels, respectively.…”

Section: Performance Evaluation (mentioning)

confidence: 99%

End-to-End Trajectory Distribution Prediction Based on Occupancy Grid Maps

Guo, Liu, Jia. 2022. Preprint.


In this paper, we aim to forecast a future trajectory distribution of a moving agent in the real world, given the social scene images and historical trajectories. Yet, it is a challenging task because the ground-truth distribution is unknown and unobservable, while only one of its samples can be applied for supervising model learning, which is prone to bias. Most recent works focus on predicting diverse trajectories in order to cover all modes of the real distribution, but they may despise the precision and thus give too much credit to unrealistic predictions. To address the issue, we learn the distribution with symmetric cross-entropy using occupancy grid maps as an explicit and scene-compliant approximation to the ground-truth distribution, which can effectively penalize unlikely predictions. In specific, we present an inverse reinforcement learning based multi-modal trajectory distribution forecasting framework that learns to plan by an approximate value iteration network in an end-to-end manner. Besides, based on the predicted distribution, we generate a small set of representative trajectories through a differentiable Transformer-based network, whose attention mechanism helps to model the relations of trajectories. In experiments, our method achieves state-of-the-art performance on the Stanford Drone Dataset and Intersection Drone Dataset.
