Unsupervised image-to-image translation methods have received considerable attention in recent years. Multiple techniques have emerged to tackle this challenge from different perspectives: some focus on learning as much as possible about the target style by using several images of that style for each translation, while others employ object detection to produce more realistic results on content-rich scenes. In this paper, we examine several frameworks that rely on these different paradigms and assess how one of them, originally developed for single-object translation, performs on more diverse and content-rich images. Our work builds on an existing framework: we explore its versatility by training it on a more diverse dataset than the one it was designed and tuned for, which helps us understand how such methods behave beyond their original application. We also explore how to make the most of the available datasets despite our limited computational resources. We present a way to extend a dataset by passing it through an object detector, which provides new and diverse dataset classes. Finally, we propose a way to adapt the framework to leverage object detection by integrating it into the architecture, as is done in other methods.
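The dataset-extension step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the detector and crop functions are stubs (in practice a pretrained detection model would supply class labels and bounding boxes), and the grouping of per-class crops into new dataset classes is the part being demonstrated.

```python
# Hypothetical sketch: run an object detector over each source image and
# group the resulting crops by predicted class; each sufficiently large
# group becomes a new dataset class for translation training.
from collections import defaultdict

def detect_objects(image):
    """Stub detector returning (class_name, bounding_box) pairs.
    A real implementation would call a pretrained detection model."""
    return image.get("annotations", [])

def crop(image, box):
    """Stub crop. A real implementation would slice the pixel array."""
    return {"source": image["id"], "box": box}

def extend_dataset(images, min_crops_per_class=2):
    """Group detector crops by predicted class; keep only classes with
    enough crops to form a usable new dataset class."""
    crops = defaultdict(list)
    for image in images:
        for cls, box in detect_objects(image):
            crops[cls].append(crop(image, box))
    return {cls: c for cls, c in crops.items() if len(c) >= min_crops_per_class}

# Toy usage: inline annotations stand in for real detector output.
images = [
    {"id": 0, "annotations": [("dog", (0, 0, 32, 32)), ("car", (40, 40, 64, 64))]},
    {"id": 1, "annotations": [("dog", (10, 10, 48, 48))]},
]
new_classes = extend_dataset(images)
```

Here `new_classes` retains only the "dog" class (two crops), while the single "car" crop is filtered out by the `min_crops_per_class` threshold, a hypothetical parameter introduced for this sketch.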