2016 IEEE International Conference on Image Processing (ICIP)
DOI: 10.1109/icip.2016.7532634

Unsupervised convolutional neural networks for motion estimation

Abstract: Traditional methods for motion estimation recover the motion field F between a pair of images as the minimizer of a predesigned cost function. In this paper, we propose a direct method: we train a Convolutional Neural Network (CNN) that, given a pair of images as input at test time, produces a dense motion field F at its output layer. In the absence of large datasets with ground-truth motion that would allow classical supervised training, we propose to train the network in an unsupervised manner…
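The unsupervised objective described in the abstract is commonly realized as a photometric (brightness-constancy) loss: the second image is warped back toward the first by the predicted flow, and the reconstruction error is minimized. Below is a minimal NumPy sketch of that idea, assuming bilinear backward warping; the function names are illustrative and not the paper's actual implementation.

```python
import numpy as np

def warp_bilinear(img, flow):
    """Backward-warp img by flow: out[y, x] = img[y + v, x + u], bilinearly sampled."""
    h, w = img.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    x = np.clip(xs + flow[..., 0], 0, w - 1)  # horizontal component u
    y = np.clip(ys + flow[..., 1], 0, h - 1)  # vertical component v
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wy) * ((1 - wx) * img[y0, x0] + wx * img[y0, x1])
            + wy * ((1 - wx) * img[y1, x0] + wx * img[y1, x1]))

def photometric_loss(i1, i2, flow):
    """Mean squared brightness difference after warping i2 toward i1."""
    return np.mean((i1 - warp_bilinear(i2, flow)) ** 2)
```

A flow field that correctly aligns the two images drives this loss toward zero, which is what makes training possible without ground-truth motion.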

Cited by 87 publications (71 citation statements) | References 20 publications
“…These algorithms take a pair of images as input, and use a convolutional neural network to learn image features that capture the concept of optical flow from data. Several of these works require supervision in the form of ground truth flow fields [52], [53], [55], [56], while we build on a few that use an unsupervised objective [51], [54]. The spatial transform layer enables neural networks to perform both global parametric 2D image alignment [42] and dense spatial transformations [54], [57], [58] without requiring supervised labels.…”
Section: D Image Alignment
confidence: 99%
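The spatial transform layer mentioned in the statement above couples a parameter-generating network to a differentiable resampler: an affine matrix defines a sampling grid over the input image, which is then used to resample it. The following is a simplified NumPy sketch of that grid-generate-and-sample pattern (nearest-neighbour sampling stands in for the differentiable bilinear version; all names are hypothetical).

```python
import numpy as np

def affine_grid(theta, h, w):
    """Sampling grid for a 2x3 affine matrix theta, in normalized [-1, 1] coords."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (h, w, 3) homogeneous
    return pts @ theta.T  # (h, w, 2): source coordinates for each output pixel

def sample_nearest(img, grid):
    """Resample img at the grid locations (nearest neighbour for simplicity)."""
    h, w = img.shape
    x = np.clip(np.round((grid[..., 0] + 1) * (w - 1) / 2), 0, w - 1).astype(int)
    y = np.clip(np.round((grid[..., 1] + 1) * (h - 1) / 2), 0, h - 1).astype(int)
    return img[y, x]
```

With the identity matrix the image is returned unchanged; other parameter values realize global 2D alignments such as translation, rotation, or flips, while a per-pixel grid generalizes this to the dense transformations used for flow.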
“…They are based on an autoencoder design, allow supervised end-to-end training, and enable fast inference at test time. To alleviate the need for training data with ground truth in a specific domain, unsupervised [1,40,47,58,63,66] and semi-supervised [31,65] alternatives have also been developed.…”
Section: Related Work
confidence: 99%
“…We used FlowNetC trained by data scheduling without fine-tuning as a baseline in the evaluation. To obtain an unbiased evaluation result, we trained and tested each of these networks on both the Flying Chairs and Sintel [Butler et al., 2012] datasets three times. The average EPE is reported in Table 3.…”
Section: Methods
confidence: 99%
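The average EPE (endpoint error) reported in the statement above is the mean Euclidean distance between predicted and ground-truth flow vectors, averaged over all pixels; a minimal sketch:

```python
import numpy as np

def average_epe(flow_pred, flow_gt):
    """Average endpoint error: mean Euclidean distance between flow vectors.

    Both arguments have shape (h, w, 2), the last axis holding (u, v).
    """
    return np.mean(np.linalg.norm(flow_pred - flow_gt, axis=-1))
```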
“…We evaluate the performance of the proposed approach on three standard optical flow benchmarks: Flying Chairs [Dosovitskiy et al., 2015], Sintel [Butler et al., 2012], and KITTI [Geiger et al., 2012]. We compare the performance of the proposed approach to supervised methods such as FlowNet(S/C) […]; [Ilg et al., 2017] uses several FlowNets and performs cascade training of the FlowNets in different phases.…”
Section: Benchmark
confidence: 99%