2020
DOI: 10.48550/arxiv.2006.06500
Preprint

Rethinking the Truly Unsupervised Image-to-Image Translation

Abstract: Figure 1. Our model conducts image-to-image translation without any supervision. The output images are generated from the source image and the average style code of each estimated domain. The breed of the output cat changes with the domain while the pose of the source image is preserved.
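The caption describes rendering a source image with the mean style code of each estimated domain. A minimal sketch of that averaging step is below; the function and variable names are illustrative, not TUNIT's actual API, and the generator call is only indicated in a comment:

```python
import numpy as np

def average_style_codes(style_codes, domain_labels, num_domains):
    """Average the style codes of all images assigned to each estimated domain."""
    return np.stack([
        style_codes[domain_labels == d].mean(axis=0)
        for d in range(num_domains)
    ])

# Toy example: 6 images with 4-dim style codes, split into 2 pseudo-domains.
rng = np.random.default_rng(0)
codes = rng.normal(size=(6, 4))
labels = np.array([0, 0, 0, 1, 1, 1])
avg = average_style_codes(codes, labels, num_domains=2)
# avg[d] is the mean style code for domain d; a generator G(content, avg[d])
# would then render the source image's content in domain d's average style.
```

Using the per-domain average rather than a single reference image's style makes the translation reflect the domain as a whole, which is what the figure shows for each cat breed.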

Cited by 10 publications (26 citation statements) | References 37 publications
“…For each task, we used a suitable baseline architecture, but replaced their content losses with our F/LSeSim loss. In addition, we are only interested in scenarios where scene structure is preserved during the translation [23,56,57], rather than investigating translations incorporating shape modification [9,10,37,28,4].…”
Section: Methods
confidence: 99%
“…Baselines. We use TUNIT [2] and SwapAE [31] as our unsupervised baselines. In the case of TUNIT, the number of clusters must be specified in advance, but it is difficult to know the optimal number for each dataset.…”
Section: Methods
confidence: 99%
“…While successful, these methods rely on a vast quantity of domain labels, which often becomes a serious bottleneck. To reduce such appetite for labels, more recent studies propose fully unsupervised methods that leverage pseudo-labels acquired by image clustering methods [3,2]. However, these methods easily yield unintended translation results if the clustering algorithms fail to produce consistent clusters.…”
Section: Multi-domain Image-to-Image Translation
confidence: 99%
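The statement above hinges on pseudo-labels obtained by clustering image features; if the clusters are inconsistent, the translator learns unintended mappings. A minimal sketch of that pseudo-labeling step, using plain k-means on feature vectors (the function name and the farthest-point initialisation are illustrative assumptions, not the clustering method of any cited paper):

```python
import numpy as np

def kmeans_pseudo_labels(features, k, iters=20):
    """Assign pseudo-domain labels by running plain k-means on image features."""
    # Farthest-point initialisation keeps the initial centers spread out.
    centers = [features[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.stack(centers)
    for _ in range(iters):
        # Assign each feature to its nearest center, then recompute centers.
        dists = np.linalg.norm(features[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels

# Toy features: two well-separated groups should receive consistent labels.
feats = np.concatenate([np.zeros((5, 2)), np.ones((5, 2)) * 10.0])
labels = kmeans_pseudo_labels(feats, k=2)
# Each group shares one pseudo-label; if clustering were inconsistent,
# the downstream translator would learn unintended domain mappings.
```

This also illustrates the TUNIT caveat quoted earlier: `k` must be chosen in advance, and a poor choice fragments or merges the true domains.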
“…However, the above scenario [111], [161] still assumes access to the domain labels of the training images. Some recent work aims to reduce the need for such supervision by using few [209] or even no [9] domain labels. Very recently, some works [15], [106], [147] are able to achieve image translation even when each domain only has a single image, inspired by recent advances that can train GANs on a single image [169].…”
Section: Unsupervised Image Translation
confidence: 99%