BA-Net: Dense Bundle Adjustment Network

Tang, Chengzhou; Tan, Ping

doi:10.48550/arxiv.1806.04807

Cited by 48 publications

(75 citation statements)

References 50 publications

“…Our primary focus lies on the depth prediction performance and we include both multi-view and single-view comparisons. As common in deep monocular multi-view SfM we report results for the ground truth scale aligned depth [72,69,70]. We compare with current SotA multi-view frameworks and report results for their publicly available models.…”

Section: Methods (mentioning)

confidence: 99%

“…One of the first deep SfM systems was published by [72]. Since then, a series of frameworks combine multi-view image information for inferring camera motion and scene geometry [88,11,69,70,23,75]. While most works rely on generic network architectures, few combine learning with a traditional geometric optimization [70,69,11].…”

Section: Related Work (mentioning)

confidence: 99%

“…Since then, a series of frameworks combine multi-view image information for inferring camera motion and scene geometry [88,11,69,70,23,75]. While most works rely on generic network architectures, few combine learning with a traditional geometric optimization [70,69,11]. We base our model on DeepV2D [70], which couples supervised training of depth based on a cost volume architecture with a geometric pose graph optimization.…”

Section: Related Work (mentioning)

confidence: 99%

“…Single-view networks may not generalize well from one dataset to another [48,14] or need to be trained on massive datasets [59,58]. Using a true multi-view learning approach [72,69,70,75,23] turned out to be favorable in terms of performance. An…” (Footnote in the citing paper: Supported by Robert Bosch GmbH.)

Section: Introduction (mentioning)

confidence: 99%

“…Recent developments have shown, that modeling scene geometry explicitly inside the architecture [70,69,23] leads to better reconstruction results than loosely coupling neural networks with a common training loss. Inside this paradigm, learned features are aligned temporally and spatially based on scene geometry and camera motion.…”

Section: Introduction (mentioning)

confidence: 99%

Multi-view Monocular Depth and Uncertainty Prediction with Deep SfM in Dynamic Environments

Homeyer, Lange, Schnörr. 2022. Preprint.

3D reconstruction of depth and motion from monocular video in dynamic environments is a highly ill-posed problem due to scale ambiguities when projecting to the 2D image domain. In this work, we investigate the performance of the current State-of-the-Art (SotA) deep multi-view systems in such environments. We find that current supervised methods work surprisingly well despite not modelling individual object motions, but make systematic errors due to a lack of dense ground truth data. To detect such errors during usage, we extend the cost-volume-based Deep Video to Depth (DeepV2D) framework [70] with a learned uncertainty. Our Deep Video to certain Depth (DeepV2cD) model i) performs on par with or better than the current SotA and ii) achieves a better uncertainty measure than the naive Shannon entropy. Our experiments show that a simple filter strategy based on the uncertainty can significantly reduce systematic errors. This results in cleaner reconstructions on both static and dynamic parts of the scene.
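
The abstract above contrasts a learned uncertainty with "the naive Shannon entropy" and mentions a simple uncertainty-based filter. As a rough illustration of that baseline only (not the learned uncertainty of DeepV2cD, whose details are not given here), the sketch below assumes each pixel carries a discrete probability distribution over depth hypotheses, computes its normalized entropy, and drops pixels above a threshold. The function names, the depth bins, and the 0.5 threshold are all hypothetical choices made for this example.

```python
import math


def shannon_entropy(probs):
    """Entropy (in nats) of one pixel's discrete depth distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def expected_depth(probs, depth_bins):
    """Probability-weighted mean over the depth hypotheses."""
    return sum(p * d for p, d in zip(probs, depth_bins))


def filter_by_entropy(pixel_probs, depth_bins, max_normalized_entropy=0.5):
    """Keep a pixel's depth only if its normalized entropy is low enough.

    The threshold is a hypothetical value for illustration; rejected
    pixels are returned as None.
    """
    max_entropy = math.log(len(depth_bins))  # entropy of a uniform distribution
    depths = []
    for probs in pixel_probs:
        normalized = shannon_entropy(probs) / max_entropy
        if normalized <= max_normalized_entropy:
            depths.append(expected_depth(probs, depth_bins))
        else:
            depths.append(None)
    return depths


# Toy input: four depth hypotheses, two pixels.
depth_bins = [1.0, 2.0, 4.0, 8.0]
pixel_probs = [
    [0.97, 0.01, 0.01, 0.01],  # sharply peaked: low entropy, kept
    [0.25, 0.25, 0.25, 0.25],  # uniform: maximal entropy, rejected
]
filtered = filter_by_entropy(pixel_probs, depth_bins)
```

In this toy input the peaked distribution survives the filter and the uniform one is discarded, which is the behaviour an entropy-based filter is meant to have; a learned uncertainty would replace `shannon_entropy` with a network output.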

CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth

Fácil, Ummenhofer, Zhou, et al. 2019. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

Single-view depth estimation suffers from the problem that a network trained on images from one camera does not generalize to images taken with a different camera model. Thus, changing the camera model requires collecting an entirely new training dataset. In this work, we propose a new type of convolution that can take the camera parameters into account, thus allowing neural networks to learn calibration-aware patterns. Experiments confirm that this improves the generalization capabilities of depth prediction networks considerably, and clearly outperforms the state of the art when the train and test images are acquired with different cameras.

VOLDOR: Visual Odometry From Log-Logistic Dense Optical Flow Residuals

Min, Yang, Dunn. 2020. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

We propose a dense indirect visual odometry method taking as input externally estimated optical flow fields instead of hand-crafted feature correspondences. We define our problem as a probabilistic model and develop a generalized-EM formulation for the joint inference of camera motion, pixel depth, and motion-track confidence. Contrary to traditional methods assuming Gaussian-distributed observation errors, we supervise our inference framework under an (empirically validated) adaptive log-logistic distribution model. Moreover, the log-logistic residual model generalizes well to different state-of-the-art optical flow methods, making our approach modular and agnostic to the choice of optical flow estimators. Our method achieved top-ranking results on both TUM RGB-D and KITTI odometry benchmarks. Our open-sourced implementation is inherently GPU-friendly with only linear computational and storage growth.
