We propose a novel approach for optical flow estima tion, targeted at large displacements with significant oc clusions. It consists of two steps: i) dense matching by edge-preserving interpolation from a sparse set of matches; ii) variational energy minimization initialized with the dense matches. The sparse-to-dense interpolation relies on an appropriate choice of the distance, namely an edge aware geodesic distance. This distance is tailored to han dle occlusions and motion boundaries -two common and difficult issues for optical flow computation. We also pro pose an approximation scheme for the geodesic distance to allow fast computation without loss of performance. Sub sequent to the dense interpolation step, standard one-level variational energy minimization is carried out on the dense matches to obtain the final flow estimation. The proposed approach, called Edge-Preserving Interpolation of Corre spondences (EpicFlow) is fast and robust to large displace ments. It significantly outperforms the state of the art on MPI-Sintel and performs on par on Kitti and Middlebury.
Current state-of-the-art approaches for spatio-temporal action localization rely on detections at the frame level that are then linked or tracked across time. In this paper, we leverage the temporal continuity of videos instead of operating at the frame level. We propose the ACtion Tubelet detector (ACT-detector) that takes as input a sequence of frames and outputs tubelets, i.e., sequences of bounding boxes with associated scores. The same way state-of-theart object detectors rely on anchor boxes, our ACT-detector is based on anchor cuboids. We build upon the SSD framework [18]. Convolutional features are extracted for each frame, while scores and regressions are based on the temporal stacking of these features, thus exploiting information from a sequence. Our experimental results show that leveraging sequences of frames significantly improves detection performance over using individual frames. The gain of our tubelet detector can be explained by both more accurate scores and more precise localization. Our ACT-detector outperforms the state-of-the-art methods for frame-mAP and video-mAP on the J-HMDB [12] and UCF-101 [30] datasets, in particular at high overlap thresholds.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.