Figure 1: Our proposed approach aggregates past optical flow features using a convolutional LSTM to predict future optical flow, which is used by a learnable warp layer to produce future segmentation. [Diagram labels: input frames X_{t-6}, X_{t-3}, X_t; optical flows O_{t-6,t-3}, O_{t-3,t}, O_{t,t+3}; segmentations S_t, S_{t+3}; blocks LSTM and Warp.]
Abstract

Understanding the world around us and making decisions about the future are critical components of human intelligence. As autonomous systems continue to develop, their ability to reason about the future will be key to their success. Semantic anticipation is a relatively underexplored area that autonomous vehicles could take advantage of (e.g., forecasting pedestrian trajectories). Motivated by the need for real-time prediction in autonomous systems, we propose to decompose the challenging semantic forecasting task into two subtasks: current frame segmentation and future optical flow prediction. Through this decomposition, we build an efficient, effective, low-overhead model with three main components: a flow prediction network, a feature-flow aggregation LSTM, and an end-to-end learnable warp layer. Our proposed method achieves state-of-the-art accuracy on short-term and moving-object semantic forecasting while simultaneously reducing model parameters by up to 95% and increasing efficiency by more than 40x.

arXiv:1809.08318v2 [cs.CV] 21 Nov 2018

[Figure: block diagrams of segmentation forecasting pipelines that map RGB input frames to Seg t+1, built from Flow CNN, Flow LSTM, Seg CNN, Seg Pred CNN, and Warp blocks.]
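To make the decomposition concrete, the following minimal sketch shows how a predicted flow field can carry a current-frame segmentation forward in time. It is an illustration only, not the warp layer proposed here: it uses fixed nearest-neighbour lookup rather than a learnable, end-to-end trained layer. The function name `warp_segmentation`, the `fill` label, and the backward-lookup flow convention are all assumptions made for this example.

```python
# Illustrative sketch only: nearest-neighbour backward warping of a label map.
# This is NOT the learnable warp layer described in the paper.

def warp_segmentation(seg, flow, fill=-1):
    """Warp a 2D label map `seg` with a dense flow field.

    seg  : list of rows of integer class labels, shape H x W.
    flow : list of rows of (dx, dy) tuples, shape H x W. Assumed convention:
           the label at output pixel (x, y) is read from source pixel
           (x - dx, y - dy) in the current frame.
    fill : hypothetical label used when the source falls outside the frame.
    """
    height, width = len(seg), len(seg[0])
    warped = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            src_x = int(round(x - dx))
            src_y = int(round(y - dy))
            # Copy the label only if the source pixel lies inside the frame.
            if 0 <= src_x < width and 0 <= src_y < height:
                warped[y][x] = seg[src_y][src_x]
    return warped


# Toy example: class 1 occupies column 1, and the (assumed) predicted flow
# moves every pixel one step to the right.
current_seg = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
predicted_flow = [[(1.0, 0.0)] * 4 for _ in range(2)]
future_seg = warp_segmentation(current_seg, predicted_flow)
```

In this toy case the column labelled 1 shifts from index 1 to index 2, and the leftmost column receives the `fill` label because no source pixel maps to it. A differentiable variant would typically replace the rounding step with bilinear interpolation over class scores so that gradients can flow back to the flow predictor.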