2023
DOI: 10.1109/tpami.2022.3165153

PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning

Cited by 296 publications (404 citation statements). References 28 publications.
“…From the perspective of models, Shi et al (2015) proposed Convolutional Long Short-Term Memory (ConvLSTM), a spatiotemporal-forecasting neural network model, as a first attempt at DL-based precipitation nowcasting. This work was then followed by a number of studies (Shi et al, 2017; Sønderby et al, 2020; Wang et al, 2017). However, since the structure of ConvLSTM is relatively fixed, current ConvLSTM-based nowcasting methods hardly have a specially designed structure for multiple sources of input information.…”
mentioning
confidence: 99%
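The ConvLSTM cell mentioned in the excerpt above replaces the fully connected transitions of a standard LSTM with convolutions, so the gates and cell state keep a spatial layout. A minimal NumPy sketch of one cell step follows; the class name, shapes, and random initialization are illustrative assumptions, not the reference implementation from Shi et al. (2015):

```python
import numpy as np

def conv2d(x, w):
    # Naive 'same'-padded 2D convolution (cross-correlation).
    # x: (C_in, H, W), w: (C_out, C_in, k, k) -> (C_out, H, W)
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    H, W = x.shape[1], x.shape[2]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, H, W))
    for co in range(c_out):
        for i in range(H):
            for j in range(W):
                out[co, i, j] = np.sum(w[co] * xp[:, i:i + k, j:j + k])
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConvLSTMCell:
    """Illustrative ConvLSTM cell: LSTM gates computed by convolutions."""

    def __init__(self, in_ch, hid_ch, k=3, seed=0):
        rng = np.random.default_rng(seed)
        # Stacked weights for the four gates (i, f, g, o); biases omitted.
        self.Wx = rng.normal(0.0, 0.1, (4 * hid_ch, in_ch, k, k))
        self.Wh = rng.normal(0.0, 0.1, (4 * hid_ch, hid_ch, k, k))

    def step(self, x, h, c):
        # Gate pre-activations from the input frame and previous hidden state.
        z = conv2d(x, self.Wx) + conv2d(h, self.Wh)
        i, f, g, o = np.split(z, 4, axis=0)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c_new = f * c + i * g          # convolutionally-gated cell update
        h_new = o * np.tanh(c_new)     # hidden state keeps (C, H, W) layout
        return h_new, c_new
```

Stacking such cells and feeding each predicted frame back as input yields the basic encoder-forecaster used in nowcasting; PredRNN's ST-LSTM extends this cell with an additional spatiotemporal memory flowing across layers.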
“…Similarly, Ballas et al [11] also applied convolutional layers to Gated Recurrent Units (GRUs) for video prediction. However, Wang et al [14] argued that temporal and spatial information should be considered equally, and proposed a spatial module for ConvLSTMs (ST-LSTM) to help model the spatial representation of each frame. They then further proposed the Causal LSTM [15] to increase model depth in the temporal domain, along with a Gradient Highway Unit to alleviate gradient propagation difficulties in deep predictive models.…”
Section: Related Work
mentioning
confidence: 99%
“…Moving MNIST results:

Method                         SSIM/frame ↑   MSE/frame ↓
ConvLSTM (NeurIPS 2015) [8]        0.707         103.3
FRNN (ECCV 2018) [9]               0.819          68.4
VPN (ICML 2017) [48]               0.870          70.0
PredRNN (NeurIPS 2017) [14]        0.869          56.8
PredRNN++ (ICML 2018) [15]         0.898          46.5
MIM (CVPR 2019) [16]               0.910          44.2
E3D-LSTM (ICLR 2019) [21]          0.910          41.3
CrevNet (ICLR 2020) [17]           0.928          38.5
MAU (NeurIPS 2021) [39]            0.931          29.5
STAU                               0.939          27.1…”
Section: Moving MNIST
mentioning
confidence: 99%
“…A straightforward deep learning solution to visual control problems is to learn action-conditioned video prediction models [38,14,8,53] and then perform Monte-Carlo importance sampling and optimization algorithms, such as the cross-entropy method, over available behaviors [15,12,29]. Hot topics in video prediction mainly include long-term and high-fidelity future frame generation [44,43,51,5,52,50,54,41,40,36,56,28,2], dynamics uncertainty modeling [1,10,48,31,7,16,55], object-centric scene decomposition [47,27,18,58,3], and space-time disentanglement [49,27,19,6]. The corresponding technical improvements mainly involve more effective neural architectures, novel probabilistic modeling methods, and specific forms of video representation.…”
Section: Related Work
mentioning
confidence: 99%