Abstract: We address the issue of domain gap when making use of synthetic data to train a scene-specific object detector and pose estimator. While previous works have shown that the constraints of learning a scene-specific model can be leveraged to create geometrically and photometrically consistent synthetic data, care must be taken to design synthetic content which is as close as possible to the real-world data distribution. In this work, we propose to solve domain gap through the use of appearance randomization to ge…
“…As shown in Fig. 3, we categorize them into three groups, namely: Domain randomization [24,25,26,27,28,29,30]; Adversarial data augmentation [31,32,33]; Data generation [34,35,36,37,38,39,30,40,41,42,43,44,45,46,157]; Representation learning…”
Section: Methods (mentioning)
confidence: 99%
“…Tobin et al. [25] first used this method to generate more training data from the simulated environment for generalization to the real environment. Similar techniques were also used in [26,27,28,24] to strengthen the generalization capability of the models. Prakash et al. [29] further took the structure of the scene into account when randomly placing objects for data generation, which enables the neural network to learn to use context when detecting objects.…”
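The snippet above describes the core recipe of domain randomization: draw each training scene from wide distributions over nuisance factors (lighting, textures, poses, distractors) so that a model trained purely in simulation transfers to the real world. A minimal sketch of such a sampler is below; all parameter names and ranges are illustrative assumptions, not values from any of the cited papers.

```python
import random

def sample_randomized_scene():
    """Draw one randomized scene configuration in the spirit of
    domain randomization: vary nuisance factors independently per
    sample. Every knob and range here is an illustrative assumption."""
    return {
        # Photometric randomization: lighting and surface appearance.
        "light_intensity": random.uniform(0.2, 2.0),
        "light_position": [random.uniform(-1.0, 1.0) for _ in range(3)],
        "texture_id": random.randrange(1000),  # random texture per object
        # Geometric randomization: camera and object placement.
        "camera_jitter_deg": random.uniform(-10.0, 10.0),
        "object_pose_xy": (random.uniform(-0.5, 0.5),
                           random.uniform(-0.5, 0.5)),
        # Distractor objects encourage robustness to clutter.
        "num_distractors": random.randint(0, 10),
    }

def generate_training_set(n):
    """A synthetic training set is simply n independently randomized
    scenes, each of which would be rendered and labeled automatically."""
    return [sample_randomized_scene() for _ in range(n)]
```

In a real pipeline each configuration dictionary would drive a renderer; structured variants (cf. Prakash et al.) would additionally constrain object placement to respect scene context rather than sampling poses uniformly.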
Domain generalization (DG), i.e., out-of-distribution generalization, has attracted increasing interest in recent years. Domain generalization deals with a challenging setting where one or several different but related source domain(s) are given, and the goal is to learn a model that generalizes to an unseen test domain. Great progress has been made over the years. This paper presents the first review of recent advances in domain generalization. First, we provide a formal definition of domain generalization and discuss several related fields. Second, we categorize recent algorithms into three classes, data manipulation, representation learning, and learning strategy, and present each in detail, together with its popular algorithms. Third, we introduce the commonly used datasets and applications. Finally, we summarize the existing literature and present some potential research topics for the future.
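Of the three classes named in this survey abstract, data manipulation is the most direct to illustrate. One widely used data-manipulation technique for DG is inter-domain mixup: mixing examples drawn from two different source domains so the model never trains on a "pure" domain. The sketch below operates on plain Python lists for self-containment; a real pipeline would use tensors, and the `alpha` default is an illustrative choice, not a value from the survey.

```python
import random

def interdomain_mixup(sample_a, sample_b, alpha=0.4):
    """Mix one (features, one-hot label) pair from each of two source
    domains. The mixing weight follows the usual mixup recipe of a
    Beta(alpha, alpha) draw; labels are mixed into soft labels."""
    lam = random.betavariate(alpha, alpha)
    x_a, y_a = sample_a
    x_b, y_b = sample_b
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]  # soft label
    return x, y, lam
```

Representation-learning and learning-strategy methods (the survey's other two classes) instead change the model's features or training loop rather than the data itself.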
“…They rely on model data during inference, which we refrain from for the reasons stated in Section 2.4. The works of Ren [67], Khirodkar [68], and Tremblay [69] focus on detection and 3-DoF estimation. Yet they offer a great contribution, showing the potential of DR in overcoming the reality gap and increasing overall accuracy.…”
Augmented reality applications use object tracking to estimate the pose of a camera and to superimpose virtual content onto the observed object. Today, a number of tracking systems are available, ready to be used in industrial applications. However, such systems are hard for a service maintenance engineer to handle, due to obscure configuration procedures. In this paper, we investigate options for replacing the manual configuration process with a machine learning approach based on automatically synthesized data. We present an automated process that creates object tracker facilities exclusively from synthetic data. The data is heavily augmented to train a convolutional neural network that still produces reliable and robust results in real-world applications using only simple RGB cameras. A comparison against related work on the LINEMOD dataset showed that we outperform similar approaches. For our intended industrial applications with high accuracy demands, its performance is still lower than that of common object tracking methods with manual configuration. Yet it can greatly support those methods as an add-on during initialization, due to its higher reliability.
“…Once trained, Task2Sim can be used not only for "seen" tasks but also, in one shot, to generate simulation parameters for novel "unseen" tasks. Similar to ours is domain randomization [2,26,46,61,77], which learns pre-trained models from datasets generated by randomly varying simulator parameters. In contrast, Task2Sim learns simulator parameters to generate synthetic datasets that maximize transfer learning performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…Our extensive experiments using 20 downstream classification datasets show that on seen tasks, given a number of images per category, Task2Sim's output parameters generate pre-training datasets that are much better for downstream performance than approaches like domain randomization [2,26,77] that are not task-adaptive. Moreover, we show Task2Sim also generalizes well to unseen tasks, maintaining an edge over non-adaptive approaches while being competitive with Imagenet pre-training.…”
Pre-training models on Imagenet or other massive datasets of real images has led to major advances in computer vision, albeit accompanied by shortcomings related to curation cost, privacy, usage rights, and ethical issues. In this paper, for the first time, we study the transferability of pre-trained models based on synthetic data generated by graphics simulators to downstream tasks from very different domains. In using such synthetic data for pre-training, we find that downstream performance on different tasks is favored by different configurations of simulation parameters (e.g. lighting, object pose, backgrounds, etc.), and that there is no one-size-fits-all solution. It is thus better to tailor synthetic pre-training data to a specific downstream task for best performance. We introduce Task2Sim, a unified model mapping downstream task representations to optimal simulation parameters that generate synthetic pre-training data for them. Task2Sim learns this mapping by training to find the set of best parameters on a set of "seen" tasks. Once trained, it can then be used to predict the best simulation parameters for novel "unseen" tasks in one shot, without requiring additional training. Given a budget in number of images per class, our extensive experiments with 20 diverse downstream tasks show that Task2Sim's task-adaptive pre-training data results in significantly better downstream performance than non-adaptively choosing simulation parameters, on both seen and unseen tasks. It is even competitive with pre-training on real images from Imagenet.
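The contrast the abstract draws can be made concrete: a non-adaptive baseline fixes (or fully randomizes) the simulator's knobs, while a task-adaptive approach selects the configuration that maximizes estimated transfer performance for the task at hand. The sketch below shows the selection step only, over a toy discrete parameter grid; the knob names, the options, and the scoring stub are all illustrative assumptions, not Task2Sim's actual parameter set or training procedure (which amortizes this search with a learned model).

```python
import itertools
import random

# Discrete simulator "knobs". The specific names and options below are
# illustrative assumptions, not the paper's exact parameterization.
SIM_PARAMS = {
    "lighting": ["fixed", "varied"],
    "pose": ["canonical", "random"],
    "background": ["plain", "textured"],
}

def downstream_score(task_embedding, config):
    """Stand-in for the expensive step a learned mapping amortizes:
    pre-train on data generated with `config`, then evaluate transfer
    to the task. Here it is just a deterministic toy score."""
    seed = hash((tuple(task_embedding), tuple(sorted(config.items())))) & 0xFFFF
    return random.Random(seed).random()

def best_config_exhaustive(task_embedding):
    """Task-adaptive selection by brute force: score every simulator
    configuration for this task and keep the best one."""
    configs = [dict(zip(SIM_PARAMS, values))
               for values in itertools.product(*SIM_PARAMS.values())]
    return max(configs, key=lambda c: downstream_score(task_embedding, c))
```

The exhaustive search scales exponentially in the number of knobs, which motivates learning a one-shot mapping from task representation to parameters instead of searching per task.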