Abstract: Establishing dense correspondences between a pair of images is an important and general problem, covering geometric matching, optical flow and semantic correspondences. While these applications share fundamental challenges, such as large displacements, pixel-accuracy, and appearance changes, they are currently addressed with specialized network architectures, designed for only one particular task. This severely limits the generalization capabilities of such networks to new scenarios, where e.g. robustness to l…
“…Fundamentally, the goal of probabilistic deep learning is to achieve a predictive model p(y|X; θ) that coincides with empirical probabilities as well as possible. We can get important insights into this problem by studying the empirical error distribution of a state-of-the-art matching model, in this case GLU-Net [64], as shown in Fig. 3.…” [Figure caption fragment interleaved in the extracted quote: “… [34] between the flow y estimated by GLU-Net [64] and the ground-truth y.”]
Section: Constrained Mixture Model Prediction (mentioning)
confidence: 99%
“…This forces the network to focus on the appearance of the image region in order to predict its motion and uncertainty. Given a base flow Ỹ relating Ĩr to Ĩq and representing a simple transformation such as a homography as in prior works [40,46,63,64], we create a residual flow ε = Σ_i ε_i, by adding small local perturbations ε_i. The query image Iq = Ĩq is left unchanged while the reference Ir is generated by warping Ĩr according to the residual flow ε.…”
Section: Data For Self-supervised Uncertainty (mentioning)
confidence: 99%
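The perturbation step described in the quote above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Gaussian-bump shape of each perturbation, the function names `residual_flow` and `warp_bilinear`, and all parameter values are assumptions made only for illustration. The sketch builds a residual flow as a sum of small, spatially localised perturbations and backward-warps an image with it.

```python
import numpy as np

def residual_flow(height, width, num_perturbations=5, max_magnitude=2.0,
                  sigma=8.0, seed=0):
    """Residual flow eps = sum_i eps_i, shape (H, W, 2), channels (dx, dy).

    Assumption: each eps_i is a Gaussian bump times a random 2D direction.
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    eps = np.zeros((height, width, 2))
    for _ in range(num_perturbations):
        cy, cx = rng.uniform(0, height), rng.uniform(0, width)
        direction = rng.uniform(-max_magnitude, max_magnitude, size=2)
        bump = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
        eps += bump[..., None] * direction  # eps_i is non-negligible only near (cy, cx)
    return eps

def warp_bilinear(image, flow):
    """Backward warp of a (H, W) image: out[y, x] = image[y + dy, x + dx]."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    x = np.clip(xs + flow[..., 0], 0, w - 1)  # sample positions, clamped to the image
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom

# Usage: a random array stands in for the base reference image.
base_reference = np.random.default_rng(1).random((64, 64))
eps = residual_flow(64, 64)
reference = warp_bilinear(base_reference, eps)
```

Clamping the sampling positions to the image border is one simple boundary policy; padding or masking out-of-view pixels would be equally valid choices.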
“…We adopt the recent GLU-Net-GOCor [63,64] as our base architecture. It consists in a four-level pyramidal network operating at two image resolutions and employing a VGG-16 network [5] pre-trained on ImageNet for feature extraction.…”
Section: Implementation Details (mentioning)
confidence: 99%
“…This leads to large displacements and significant appearance transformations between the frames. In contrast to optical flow, the more general dense correspondence problem has received much less attention [40,48,53,64]. Dense flow estimation is prone to errors in the presence of large displacements, appearance changes, or homogeneous regions.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, learning reliable and generalizable uncertainties without densely annotated real-world training data is a highly challenging problem. Standard self-supervised techniques [40,46,64] do not faithfully model real motion patterns, appearance changes, and occlusions. We tackle this challenge by introducing a carefully designed architecture and improved self-supervision to ensure robust and generalizable uncertainty predictions.…”
Establishing dense correspondences between a pair of images is an important and general problem. However, dense flow estimation is often inaccurate in the case of large displacements or homogeneous regions. For most applications and down-stream tasks, such as pose estimation, image manipulation, or 3D reconstruction, it is crucial to know when and where to trust the estimated matches. In this work, we aim to estimate a dense flow field relating two images, coupled with a robust pixel-wise confidence map indicating the reliability and accuracy of the prediction. We develop a flexible probabilistic approach that jointly learns the flow prediction and its uncertainty. In particular, we parametrize the predictive distribution as a constrained mixture model, ensuring better modelling of both accurate flow predictions and outliers. Moreover, we develop an architecture and training strategy tailored for robust and generalizable uncertainty prediction in the context of self-supervised training. Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets. We further validate the usefulness of our probabilistic confidence estimation for the task of pose estimation. Code and models are available at https://github.com/PruneTruong/PDCNet.
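To make the idea of a mixture-model predictive distribution concrete, the following NumPy sketch computes a per-pixel negative log-likelihood for a 2D flow under a mixture of Laplace components that share one predicted mean. It is an illustration under stated assumptions, not the formulation from the paper or its repository: the choice of Laplace components, the use of fixed per-component scales as a stand-in for the "constraint", and the name `laplace_mixture_nll` are all hypothetical.

```python
import numpy as np

def laplace_mixture_nll(flow_gt, flow_pred, log_weights, scales):
    """Per-pixel NLL under a mixture of M Laplace components with a shared mean.

    flow_gt, flow_pred: (H, W, 2) arrays.
    log_weights: (H, W, M) unnormalised log mixture weights.
    scales: (M,) positive Laplace scales b_m, fixed here (assumption), e.g. a
            narrow component for accurate pixels and a wide one for outliers.
    """
    scales = np.asarray(scales, dtype=np.float64)
    # L1 flow error per pixel, kept as (H, W, 1) so it broadcasts over components.
    abs_err = np.abs(flow_gt - flow_pred).sum(axis=-1, keepdims=True)
    # Each component is a product of two independent 1D Laplace densities:
    # log p_m = -2 * log(2 * b_m) - (|dx| + |dy|) / b_m
    log_comp = -2.0 * np.log(2.0 * scales) - abs_err / scales
    # Normalise the weights with a log-softmax over the component axis.
    log_w = log_weights - np.logaddexp.reduce(log_weights, axis=-1, keepdims=True)
    # Log-sum-exp over components gives the mixture log-likelihood.
    return -np.logaddexp.reduce(log_w + log_comp, axis=-1)

# Usage: two components, one narrow (b = 1) and one wide (b = 10).
gt = np.zeros((4, 4, 2))
pred = np.zeros((4, 4, 2))
pred[0, 0] = [3.0, -2.0]            # one pixel with a 5-pixel L1 error
logits = np.zeros((4, 4, 2))        # equal mixture weights
nll = laplace_mixture_nll(gt, pred, logits, scales=[1.0, 10.0])
```

With a narrow and a wide component, pixels with small errors are explained by the narrow component while large errors are absorbed by the wide one, so the loss grows more slowly for outliers than it would under a single narrow Laplace.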