While deep learning has led to significant advances in visual recognition over the past few years, such advances often require large amounts of annotated data. Unsupervised domain adaptation has emerged as an alternative approach that does not require as much annotated data. However, prior evaluations of domain adaptation approaches have been limited to relatively similar datasets, e.g., source and target domains that are samples captured by different cameras. A new dataset suite is proposed that comprehensively evaluates cross-modality domain adaptation problems. This work pushes the limit of unsupervised domain adaptation through an in-depth evaluation of several state-of-the-art methods on benchmark datasets and the new dataset suite. We also propose a new domain adaptation network called "Deep MagNet" that effectively transfers knowledge for cross-modality domain adaptation problems. Deep MagNet achieves state-of-the-art performance on two benchmark datasets. More importantly, the proposed method shows consistent improvements in performance on the newly proposed dataset suite.
Leveraging synthetically rendered data offers great potential to improve monocular depth estimation, but closing the synthetic-real domain gap is a non-trivial and important task. While much recent work has focused on unsupervised domain adaptation, we consider a more realistic scenario where a large amount of synthetic training data is supplemented by a small set of real images with ground truth. In this setting we find that existing domain translation approaches are difficult to train and offer little advantage over simple baselines that use a mix of real and synthetic data. A key failure mode is that real-world images contain novel objects and clutter not present in synthetic training. This high-level domain shift isn't handled by existing image translation models. Based on these observations, we develop an attentional module that learns to identify and remove (hard) out-of-domain regions in real images in order to improve depth prediction for a model trained primarily on synthetic data. We carry out extensive experiments to validate our attend-remove-complete approach (ARC) and find that it significantly outperforms state-of-the-art domain adaptation methods for depth prediction. Visualizing the removed regions provides interpretable insights into the synthetic-real domain gap.