The success of monocular depth estimation relies on large and diverse training sets. Due to the challenges associated with acquiring dense ground-truth depth across different environments at scale, a number of datasets with distinct characteristics and biases have emerged. We develop tools that enable mixing multiple datasets during training, even if their annotations are incompatible. In particular, we propose a robust training objective that is invariant to changes in depth range and scale, advocate the use of principled multi-objective learning to combine data from different sources, and highlight the importance of pretraining encoders on auxiliary tasks. Armed with these tools, we experiment with five diverse training datasets, including a new, massive data source: 3D films. To demonstrate the generalization power of our approach we use zero-shot cross-dataset transfer, i.e. we evaluate on datasets that were not seen during training. The experiments confirm that mixing data from complementary sources greatly improves monocular depth estimation. Our approach clearly outperforms competing methods across diverse datasets, setting a new state of the art for monocular depth estimation.
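One way such a scale- and shift-invariant objective can be realised, sketched here under assumptions rather than as the paper's exact formulation, is to align the prediction to the ground truth with a closed-form least-squares scale and shift before measuring the residual:

```python
import numpy as np

def scale_shift_invariant_loss(pred, gt):
    """Measure the residual after optimally aligning pred to gt with a
    scalar scale s and shift t (least squares).  Any prediction that is
    correct up to an affine rescaling of depth incurs zero loss.
    Illustrative sketch only, not the paper's exact objective."""
    pred = pred.ravel().astype(float)
    gt = gt.ravel().astype(float)
    # Solve min_{s,t} || s * pred + t - gt ||^2 in closed form.
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, gt, rcond=None)
    return np.mean((s * pred + t - gt) ** 2)
```

Because the alignment is recomputed per sample, predictions from datasets with incompatible depth scales can be supervised with the same loss.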
The impact of individual scientists is commonly quantified using citation-based measures. The most common such measure is the h-index. A scientist’s h-index affects hiring, promotion, and funding decisions, and thus shapes the progress of science. Here we report a large-scale study of scientometric measures, analyzing millions of articles and hundreds of millions of citations across four scientific fields and two data platforms. We find that the correlation of the h-index with awards that indicate recognition by the scientific community has substantially declined. These trends are associated with changing authorship patterns. We show that these declines can be mitigated by fractional allocation of citations among authors, which has been discussed in the literature but not implemented at scale. We find that a fractional analogue of the h-index outperforms other measures as a correlate and predictor of scientific awards. Our results suggest that the use of the h-index in ranking scientists should be reconsidered, and that fractional allocation measures such as h-frac provide more robust alternatives.
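The fractional allocation described above can be sketched as follows. Here `h_frac` simply divides each paper's citation count by its number of authors before applying the usual h-index rule; this is one plausible reading of such a measure, not necessarily the exact published definition:

```python
def h_index(citations):
    """Classic h-index: the largest h such that h papers each
    have at least h citations."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
    return h

def h_frac(papers):
    """Fractional analogue: each paper's citations are split equally
    among its authors before the h-index rule is applied.
    `papers` is a list of (citation_count, num_authors) pairs.
    Hedged sketch of the idea, not the study's exact formula."""
    return h_index([c / n_authors for c, n_authors in papers])
```

Under this scheme, a heavily cited paper with many authors contributes far less to each author's score than a solo paper with the same citation count.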
The census transform is becoming increasingly popular in the context of optic flow computation in image sequences. Since it is invariant under monotonically increasing grey-value transformations, it forms the basis of an illumination-robust constancy assumption. However, its underlying mathematical concepts have not been studied so far. The goal of our paper is to provide this missing theoretical foundation. We study the continuous limit of the inherently discrete census transform and embed it into a variational setting. Our analysis yields two surprising results: the census-based technique enforces the matching of extrema, and it induces an anisotropy in the data term by acting along level lines. Last but not least, we establish links to the widely used gradient constancy assumption and present experiments that confirm our findings.
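The discrete census transform discussed above can be sketched compactly: each pixel receives a binary signature recording, for every neighbour, whether that neighbour is brighter than the centre. The implementation below, with wrap-around borders via `np.roll`, is an illustrative assumption rather than code from the paper:

```python
import numpy as np

def census_transform(img, radius=1):
    """Census signature per pixel: one bit per neighbour, set when the
    neighbour is brighter than the centre.  Since only the sign of
    intensity differences is kept, the signature is invariant under any
    monotonically increasing grey-value transformation."""
    sig = np.zeros(img.shape, dtype=np.int64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # skip the centre pixel itself
            neighbour = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            sig = 2 * sig + (neighbour > img)  # append one comparison bit
    return sig
```

Applying any strictly increasing rescaling to the image leaves the signature unchanged, which is exactly the invariance underlying the illumination-robust constancy assumption.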
Camera shake and moving objects pose a severe problem in high dynamic range (HDR) reconstruction from differently exposed images. We present the first approach that simultaneously computes the aligned HDR composite as well as accurate displacement maps. In this way, we can not only cope with dynamic scenes but even precisely represent the underlying motion. We design our fully coupled model transparently in a well-founded variational framework. The proposed joint optimisation has beneficial effects, such as intrinsic ghost removal and HDR-coupled smoothing. Both the HDR images and the optic flows benefit substantially from these features and the induced mutual feedback. We demonstrate this with synthetic and real-world experiments.
Most researchers agree that invariances are desirable in computer vision systems. However, one always has to keep in mind that they come at the expense of accuracy: by construction, every invariance discards information. The concept of morphological invariance is a good example of this trade-off and is the focus of this paper. Our goal is to develop a descriptor of local image structure that carries the maximum possible amount of local image information under this invariance. To fulfill this requirement, our descriptor has to encode the full ordering of the pixel intensities in the local neighbourhood. As a solution, we introduce the complete rank transform, which stores the intensity rank of every pixel in the local patch. As a proof of concept, we embed our novel descriptor in a prototypical TV-L1-type energy functional for optical flow computation, which we minimise with a traditional coarse-to-fine warping scheme. In this straightforward framework, we demonstrate that our descriptor is preferable to related features that exhibit the same invariance. Finally, we show by means of public benchmarks that our method, in spite of its simplicity, produces results of competitive quality.
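The complete rank transform described above can be sketched in a few lines: every pixel in the patch is replaced by the rank of its intensity among all pixels of the patch, which preserves the full intensity ordering. A minimal sketch, assuming distinct intensities and ignoring tie-breaking details from the paper:

```python
import numpy as np

def complete_rank_transform(patch):
    """Replace every pixel by the number of strictly darker pixels in
    the patch, i.e. its intensity rank.  The result is invariant under
    monotonically increasing intensity rescalings while encoding the
    complete ordering of the patch."""
    flat = patch.ravel()
    # ranks[i] = number of pixels strictly darker than pixel i
    ranks = (flat[:, None] > flat[None, :]).sum(axis=1)
    return ranks.reshape(patch.shape)
```

Since only pairwise order relations enter the result, any strictly increasing intensity rescaling of the patch yields the identical descriptor.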
Invariances are one of the key concepts for rendering computer vision algorithms robust against severe illumination changes. However, there is no free lunch: with any invariance comes an unavoidable loss of information. The goal of our paper is to introduce two novel descriptors that minimise this loss: the complete rank transform and the complete census transform. They are invariant under monotonically increasing intensity rescalings while containing the maximum possible amount of information. To analyse our descriptors, we embed them as constancy assumptions into a variational framework for optic flow computation. As a suitable regularisation term, we choose the total generalised variation, which favours piecewise affine solutions. Our experiments focus on the KITTI benchmark, where robustness w.r.t. illumination changes is one of the main issues. The results demonstrate that our descriptors yield state-of-the-art accuracy.