Abstract: Multimodal prediction results are essential for the trajectory prediction task, as there is no single correct answer for the future. Previous frameworks can be divided into three categories: regression, generation, and classification. However, each of these has weaknesses in different aspects, so none of them can model the multimodal prediction task comprehensively. In this paper, we present a novel insight along with a brand-new prediction framework by formulating multimodal prediction into three step…
“…Method, ADE, FDE: DESIRE [19], 19.25, 34.05; Ridel et al. [68], 14.92, 27.97; MANTRA [14], 13.51, 27.34; PECNet [34], 12.79, 25.98; PCCSNet [69], 12.54, -; TNT [32], 12.23 … benefit from relying on a working memory in which data can be explicitly stored instead of blended in a unique latent vector.…”
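The ADE and FDE columns in the comparison quoted above are displacement errors. As a minimal sketch, assuming the usual definitions (ADE is the mean Euclidean distance between predicted and ground-truth positions over all future timesteps, FDE is that distance at the final timestep only), they could be computed as below. The function names are illustrative, and the best-of-K helper reflects a common protocol for multimodal predictors that the quoted snippet does not confirm.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred and truth are equal-length sequences of (x, y) points.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and equal in length")
    # Per-timestep Euclidean distance between prediction and ground truth.
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # average over the whole horizon
    fde = dists[-1]                # error at the last timestep only
    return ade, fde


def best_of_k(candidates, truth):
    """Assumed protocol: report the minimum ADE and minimum FDE over K candidates."""
    errors = [displacement_errors(c, truth) for c in candidates]
    return min(e[0] for e in errors), min(e[1] for e in errors)


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    print(displacement_errors(pred, truth))
```

Lower values are better for both metrics; the unit (pixels or meters) depends on the dataset and is not stated in the snippet.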
Effective modeling of human interactions is of utmost importance when forecasting behaviors such as future trajectories. Each individual's motion influences surrounding agents, since everyone obeys unwritten social rules such as collision avoidance or group following. In this paper we model such interactions, which constantly evolve through time, by looking at the problem from an algorithmic point of view, i.e., as a data manipulation task. We present a neural network based on an end-to-end trainable working memory, which acts as an external storage where information about each agent can be continuously written, updated, and recalled. We show that our method is capable of learning explainable cause-effect relationships between motions of different agents, obtaining state-of-the-art results on multiple trajectory forecasting datasets.
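The write, update, and recall operations mentioned in this abstract can be illustrated with a toy external memory. This is only a sketch under assumptions: the abstract does not describe the actual read and write mechanics, so the class `ToyMemory`, the gated update, and the softmax-weighted read are hypothetical stand-ins, and nothing here is learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMemory:
    """Hypothetical key-value store with one slot per written entry."""

    def __init__(self):
        self.keys = []
        self.values = []

    def write(self, key, value):
        """Append a new slot and return its index."""
        self.keys.append(list(key))
        self.values.append(list(value))
        return len(self.keys) - 1

    def update(self, index, value, gate):
        """Blend new content into a slot; gate in [0, 1] controls how much is overwritten."""
        old = self.values[index]
        self.values[index] = [(1.0 - gate) * o + gate * v for o, v in zip(old, value)]

    def read(self, query):
        """Recall a softmax-weighted mix of all stored values, scored by dot product with the keys."""
        if not self.keys:
            raise ValueError("memory is empty")
        scores = [sum(q * k for q, k in zip(query, key)) for key in self.keys]
        weights = softmax(scores)
        dim = len(self.values[0])
        return [sum(w * v[d] for w, v in zip(weights, self.values)) for d in range(dim)]


if __name__ == "__main__":
    memory = ToyMemory()
    slot = memory.write(key=[1.0, 0.0], value=[2.0, 4.0])  # store one agent's state
    memory.update(slot, value=[0.0, 0.0], gate=0.5)        # partially overwrite it
    print(memory.read(query=[1.0, 0.0]))
```

In a trainable version, the keys, gates, and queries would be produced by network layers so that the whole read and write path stays differentiable.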
“…Method, ADE, FDE: DESIRE [19], 19.25, 34.05; Ridel et al. [55], 14.92, 27.97; MANTRA [14], 13.51, 27.34; PECNet [31], 12.79, 25.98; PCCSNet [56], 12.54, -; TNT [29], 12.23 … What SMEMO is learning is a chain of cause-effect relationships that makes an agent stop depending on the position of another.…”
“…Social interaction with multiple persons. Multi-person trajectory prediction has been a long-standing problem for decades [25,48,68,53,76,7,6,24,19,50,37,43,57,8,35,62,70,63].…”
We propose a novel framework for multi-person 3D motion trajectory prediction. Our key observation is that a human's actions and behaviors may depend heavily on the other persons nearby. Thus, instead of predicting each human pose trajectory in isolation, we introduce a Multi-Range Transformers model, which consists of a local-range encoder for individual motion and a global-range encoder for social interactions. The Transformer decoder then performs prediction for each person by taking a corresponding pose as a query which attends to both local and global-range encoder features. Our model not only outperforms state-of-the-art methods on long-term 3D motion prediction, but also generates diverse social interactions. More interestingly, our model can even predict 15-person motion simultaneously by automatically dividing the persons into different interaction groups. Project page with code is available at https://jiashunwang.github.io/MRT/.
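The decoding step described in this abstract, where a pose query attends to both local-range and global-range encoder features, can be sketched with plain scaled dot-product attention over the two feature sets joined together. This is an illustration under assumptions, not the released model: there are no learned projections or multiple heads, the raw vectors stand in for encoder outputs, and the function name `attend_local_global` is hypothetical.

```python
import math


def attend_local_global(query, local_feats, global_feats):
    """Single-head scaled dot-product attention over local and global features.

    Returns (output_vector, attention_weights). Features double as keys and values here.
    """
    features = list(local_feats) + list(global_feats)
    if not features:
        raise ValueError("need at least one feature vector")
    scale = math.sqrt(len(query))
    scores = [sum(q * f for q, f in zip(query, feat)) / scale for feat in features]
    # Stable softmax over all local and global positions at once.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    output = [sum(w * feat[d] for w, feat in zip(weights, features)) for d in range(dim)]
    return output, weights


if __name__ == "__main__":
    pose_query = [1.0, 0.0]
    local = [[1.0, 0.0], [0.5, 0.5]]   # stand-in for one person's own motion features
    social = [[0.0, 1.0]]              # stand-in for features of other people
    output, weights = attend_local_global(pose_query, local, social)
    print(output, weights)
```

Because the softmax runs over both feature sets together, the weights show how much a given prediction draws on the person's own motion versus the surrounding people.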