2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
DOI: 10.1109/cvprw.2018.00112

WESPE: Weakly Supervised Photo Enhancer for Digital Cameras

Abstract: Low-end and compact mobile cameras demonstrate limited photo quality, mainly due to space, hardware, and budget constraints. In this work, we propose a deep learning solution that automatically translates photos taken by cameras with limited capabilities into DSLR-quality photos. We tackle this problem by introducing the Weakly Supervised Photo Enhancer (WESPE), a novel image-to-image Generative Adversarial Network-based architecture. The proposed model is trained under weak supervision: unlike previous works, …
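The abstract describes enhancement without paired ground truth: outputs should look like the target (DSLR) domain while staying faithful to the source photo. A toy numpy sketch of how such an unpaired objective can be composed (the loss terms and weights here are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def tv_loss(img):
    """Total-variation surrogate: mean absolute difference between
    neighbouring pixels, discouraging noisy artifacts."""
    dh = np.abs(np.diff(img, axis=0)).mean()
    dw = np.abs(np.diff(img, axis=1)).mean()
    return dh + dw

def enhancement_loss(enhanced, source, d_score, w_content=10.0, w_tv=0.5):
    """Combine an adversarial term (discriminator score d_score in (0, 1],
    high when the output looks DSLR-like), a content term tying the output
    to the source photo, and a smoothness term."""
    adv = -np.log(d_score)
    content = ((enhanced - source) ** 2).mean()
    return adv + w_content * content + w_tv * tv_loss(enhanced)

src = np.zeros((4, 4))
# Ideal case: output equals source and fully fools the discriminator.
print(enhancement_loss(src, src, d_score=1.0))  # 0.0
```

No paired DSLR target appears anywhere in the objective; supervision comes only from the discriminator's judgement of the target domain, which is what "weakly supervised" refers to here.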

Cited by 172 publications (94 citation statements). References 32 publications.
“…For the albedo, we use a fully convolutional network without downsampling or upsampling blocks. This results in a small receptive field for the network and better preserves the texture details while avoiding large structural changes [16,17]. As shown in Figure 6, allowing downsampling blocks in the … [Figure 4 caption: To finish the backward cycle, the real image is first translated to the PBR domain.]”
Section: PBR-to-Real Image Translation
Confidence: 99%
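The quoted claim — stride-1 convolutions keep the receptive field small, which helps preserve texture and avoid large structural changes — can be checked with standard receptive-field arithmetic (a generic sketch; the layer counts are illustrative, not taken from the cited paper):

```python
def receptive_field(layers):
    """Receptive field of a stack of conv layers.

    layers: list of (kernel_size, stride) tuples, in order.
    Each layer widens the field by (k - 1) * jump, where jump is the
    product of all preceding strides.
    """
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Six plain 3x3 convs, stride 1 (no downsampling): small receptive field.
plain = [(3, 1)] * 6
# Same depth, but with two stride-2 downsampling convs up front.
downsampled = [(3, 2), (3, 2)] + [(3, 1)] * 4

print(receptive_field(plain))        # 13
print(receptive_field(downsampled))  # 39
```

Each downsampling step multiplies the growth rate of the receptive field, so a network without them only ever "sees" a small local patch per output pixel — exactly the property the citation relies on for preserving texture.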
“…Nowadays, various deep learning models can be found in nearly any mobile device. Among the most popular tasks are different computer vision problems like image classification [38,82,23], image enhancement [27,28,32,30], image super-resolution [17,42,83], bokeh simulation [85], object tracking [87,25], optical character recognition [56], face detection and recognition [44,70], augmented reality [3,16], etc. Another important group of tasks running on mobile devices is related to various NLP (Natural Language Processing) problems, such as natural language translation [80,7], sentence completion [52,24], sentence sentiment analysis [77,72,33], voice assistants [18] and interactive chatbots [71].…”
Section: Introduction
Confidence: 99%
“…The ability of humans to easily imagine how a black-haired person would look if they were blond, or with a different type of eyeglasses, or to imagine a winter scene as summer, is formulated as the image-to-image (I2I) translation problem in the computer vision community. Since the recent introduction of Generative Adversarial Networks (GANs) [19], a plethora of problems such as video analysis [51,7], super-resolution [33,9], semantic synthesis [26,10], photo enhancement [24,25], photo editing [49,14], and most recently domain adaptation [21,43] have been addressed as I2I translation problems.…”
Section: Introduction
Confidence: 99%
“…However, this approach is impractical because the full representation of the cross-domain mapping is, in most cases, intractable. Existing techniques try to perform deterministic I2I translation with unpaired images, mapping one domain into another (one-to-one) [55,4,37,25] or into multiple domains (one-to-many) [12,46,20]. Nevertheless, many problems are fundamentally stochastic, as there are countless mappings from one domain to another, e.g., a day↔night or cat↔dog translation.…”
Section: Introduction
Confidence: 99%
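The deterministic-versus-stochastic distinction in this citation can be illustrated with a toy numpy sketch (the translate functions below are hypothetical stand-ins, not any cited method): a deterministic mapping collapses each input to a single output, while conditioning on a sampled latent code yields many plausible outputs per input.

```python
import numpy as np

rng = np.random.default_rng(0)

def deterministic_translate(x):
    # One-to-one: the same input always yields the same output.
    return x * 0.5 + 1.0

def stochastic_translate(x, z):
    # Many-to-many: a latent code z selects one of many plausible outputs
    # (e.g., different plausible "night" renderings of the same "day" scene).
    return x * 0.5 + 1.0 + 0.3 * z

x = np.ones(4)

outs = {tuple(np.round(deterministic_translate(x), 3)) for _ in range(5)}
print(len(outs))  # 1: every call produces the identical output

outs_z = {tuple(np.round(stochastic_translate(x, rng.standard_normal(4)), 3))
          for _ in range(5)}
print(len(outs_z))  # > 1: each sampled z gives a different plausible output
```

This is the gap the citation points at: a deterministic unpaired translator can only ever produce `len(outs) == 1` behaviour, whereas fundamentally stochastic problems need the latent-code route to cover the many valid translations.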