Abstract. We propose a method for large displacement optical flow in which local matching costs are learned by a convolutional neural network (CNN) and a smoothness prior is imposed by a conditional random field (CRF). We tackle the computation- and memory-intensive operations on the 4D cost volume by a min-projection, which reduces memory complexity from quadratic to linear, and by binary descriptors for efficient matching. This enables evaluation of the cost on the fly and allows us to perform learning and CRF infere…
“…cost volumes are constructed. We employ the method of [23] and generate two 3D volumes from one complete 4D cost volume (cf. Sec.…”
Section: Methods
“…For feature generation we follow [23] and employ a Siamese network consisting of two convolutional branches with shared parameters. In our implementation we utilize a feed-forward network comprising 5 convolutional layers with a filter size of 3 × 3 and 64 channels, followed by a single 1-D convolution.…”
Section: Feature Generation
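The Siamese feature extractor described above can be sketched as follows. This is a minimal numpy illustration, not the authors' implementation: the channel count is reduced from 64 to 8 for brevity, and the final "1-D convolution" is modeled as a per-pixel linear projection. All weights here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv3x3(x, w):
    """Valid 3x3 convolution: x is (C_in, H, W), w is (C_out, C_in, 3, 3)."""
    c_in, h, wd = x.shape
    out = np.zeros((w.shape[0], h - 2, wd - 2))
    for o in range(w.shape[0]):
        for i in range(c_in):
            for dy in range(3):
                for dx in range(3):
                    out[o] += w[o, i, dy, dx] * x[i, dy:dy + h - 2, dx:dx + wd - 2]
    return out

def branch(img, convs, proj):
    """One Siamese branch: 5 conv layers (3x3 + ReLU), then a 1x1 projection."""
    f = img
    for w in convs:
        f = np.maximum(conv3x3(f, w), 0.0)      # conv + ReLU
    # per-pixel linear projection, standing in for the final 1-D convolution
    return np.einsum('oc,chw->ohw', proj, f)

C = 8                                            # 64 channels in the paper; reduced here
convs = [rng.normal(0, 0.1, (C, 1, 3, 3))] \
      + [rng.normal(0, 0.1, (C, C, 3, 3)) for _ in range(4)]
proj = rng.normal(0, 0.1, (C, C))

img0 = rng.random((1, 16, 16))                   # two toy grayscale frames
img1 = rng.random((1, 16, 16))
f0 = branch(img0, convs, proj)                   # identical weights for both
f1 = branch(img1, convs, proj)                   # branches -> Siamese network
print(f0.shape)                                  # (8, 6, 6): each 3x3 layer shrinks by 2
```

The shared-parameter property is what makes the network Siamese: both frames are mapped through exactly the same weights, so their features live in a common space and can be compared by a simple correlation.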
“…(3). To that end, [23] proposes to use a min-projection to eliminate one of the motion directions via corᵢʲ(x, y, uᵢ) := min_{u₁₋ᵢ ∈ H} corʲ(x, y, u₀, u₁). Most prominently, this reduces the memory complexity of storing the cost volume from quadratic to linear.…”
Section: Feature Generation
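The min-projection of [23] quoted above can be sketched in a few lines of numpy. This is a toy illustration with made-up sizes: minimizing the 4D correlation volume over one displacement axis at a time yields two 3D volumes, one per motion direction.

```python
import numpy as np

# Toy 4D cost volume: cor[y, x, u0, u1] is the matching cost of displacing
# pixel (x, y) by (u0, u1). Sizes are illustrative, not from the paper.
H, W, D = 8, 8, 5                  # image height/width, displacements per axis
rng = np.random.default_rng(0)
cor = rng.random((H, W, D, D))

# Min-projection: eliminate one motion direction at a time, turning the
# quadratic O(D^2) storage into two linear O(D) volumes.
cor0 = cor.min(axis=3)             # cor0[y, x, u0] = min over u1
cor1 = cor.min(axis=2)             # cor1[y, x, u1] = min over u0

print(cor0.shape, cor1.shape)      # (8, 8, 5) (8, 8, 5)
```

In practice the full 4D volume is never materialized; the two projections are computed on the fly, which is what makes the linear memory footprint possible.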
“…Compared to nearest-neighbor-field methods [18,36], we make use of a complete cost volume and avoid a coarse-to-fine scheme or hashing. To that end, we make use of a recent result [23] that allows for a low memory footprint of the cost volume. In contrast to network-based solutions to inpainting [41], we maintain the interpretability of an energy-based framework.…”
Modern optical flow methods are often composed of a cascade of many independent steps or formulated as a black-box neural network that is hard to interpret and analyze. In this work we seek a plain, interpretable, yet learnable solution. We propose a novel inpainting-based algorithm that approaches the problem in three steps: feature selection and matching, selection of supporting points, and energy-based inpainting. To facilitate inference we propose an optimization layer that allows backpropagation through 10K iterations of a first-order method without any numerical or memory problems. Compared to recent state-of-the-art networks, our modular CNN is very lightweight and competitive with other, more involved, inpainting-based methods.
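The energy-based inpainting step can be illustrated with a generic quadratic smoothness energy; this is a hypothetical sketch, not the paper's energy or its optimization layer. Given a sparse set of trusted flow values (the supporting points), the remaining pixels are filled in by first-order gradient descent on the sum of squared neighbor differences, re-imposing the supporting points after every step.

```python
import numpy as np

H, W = 32, 32
rng = np.random.default_rng(0)
mask = rng.random((H, W)) < 0.05         # ~5% supporting points (assumed density)
seed = rng.random((H, W))                # their fixed flow values (toy data)
u = np.where(mask, seed, 0.0)            # initialize unknowns to zero

step = 0.2                               # step size of the first-order method
for _ in range(2000):                    # the paper unrolls ~10K such iterations
    # gradient of the smoothness energy = discrete Laplacian
    # (periodic boundary via np.roll, chosen purely for brevity)
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
         + np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
    u = u + step * lap
    u[mask] = seed[mask]                 # keep the supporting points fixed

print(u.shape)                           # (32, 32)
```

Each iteration is cheap and differentiable, which is what makes it plausible to backpropagate through thousands of them when the unrolled loop is implemented with constant memory.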
“…which follows the scalable model of Munda et al. [34], avoiding the storage of all matching scores, which for an M × N image have size M × N × D². The inner maximization steps correspond to the first iteration of an approximate MAP inference [34].…”
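A back-of-the-envelope calculation shows why storing all M × N × D² scores is prohibitive. The frame size below is Sintel-like and the displacement range D = 128 is an assumption for illustration only.

```python
# Memory for the matching scores of an M x N image with D candidate
# displacements per axis, stored as float32 (4 bytes each).
M, N, D = 1024, 436, 128                # illustrative sizes; D is assumed

full_4d = M * N * D**2 * 4              # all scores: M x N x D^2
two_3d  = 2 * M * N * D * 4             # after min-projection: two M x N x D volumes

print(full_4d / 2**30, two_3d / 2**20)  # → 27.25 436.0 (GiB vs MiB)
```

The quadratic-in-D term is what pushes the full volume past any GPU's memory, while the two projected volumes fit comfortably.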
It has been proposed by many researchers that combining deep neural networks with graphical models can create more efficient and better-regularized composite models. The main difficulties in implementing this in practice are a discrepancy in suitable learning objectives and the necessity of approximations for inference. In this work we take one of the simplest inference methods, a truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model: we connect it to learning formulations with losses on marginals and derive the backpropagation operation. This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs), allowing us to design a hierarchical model composing BP inference and CNNs at different scale levels. The model is applicable to a range of dense prediction problems, is well-trainable, and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
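The core inference primitive can be sketched on a 1-D chain, where max-product BP (min-sum in the log domain) is exact after one forward and one backward sweep. This loosely mirrors the row/column sweeps a BP-Layer runs on images, but it is a generic textbook sketch, not the paper's implementation; the truncated-linear pairwise cost and all sizes are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
T, L = 6, 4                              # chain length, number of labels
unary = rng.random((T, L))               # per-node label costs (toy data)
lam, tau = 0.5, 2.0                      # pairwise: lam * min(|i - j|, tau)
labels = np.arange(L)
pair = lam * np.minimum(np.abs(labels[:, None] - labels[None, :]), tau)

fwd = np.zeros((T, L))                   # messages passed left -> right
for t in range(1, T):
    # fwd[t][j] = min_i (fwd[t-1][i] + unary[t-1][i] + pair[i, j])
    fwd[t] = np.min(fwd[t - 1] + unary[t - 1] + pair.T, axis=1)
bwd = np.zeros((T, L))                   # messages passed right -> left
for t in range(T - 2, -1, -1):
    bwd[t] = np.min(bwd[t + 1] + unary[t + 1] + pair, axis=1)

belief = unary + fwd + bwd               # min-marginals at every node
map_labels = belief.argmin(axis=1)       # MAP labeling (exact on a chain)
print(map_labels)
```

The beliefs are min-marginals, which is exactly what a loss on marginals needs; on a chain every node's smallest belief equals the optimal total energy, a handy sanity check for an implementation.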