Hao Tang scite author profile

Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation

Cross-view image translation is challenging because it involves images with drastically different views and severe deformation. In this paper, we propose a novel approach named Multi-Channel Attention SelectionGAN (SelectionGAN) that makes it possible to generate images of natural scenes from arbitrary viewpoints, based on an image of the scene and a novel semantic map. The proposed SelectionGAN explicitly utilizes the semantic information and consists of two stages. In the first stage, the condition image and the target semantic map are fed into a cycled semantic-guided generation network to produce initial coarse results. In the second stage, we refine the initial results by using a multi-channel attention selection mechanism. Moreover, uncertainty maps automatically learned from attentions are used to guide the pixel loss for better network optimization. Extensive experiments on the Dayton [42], CVUSA [44] and Ego2Top [1] datasets show that our model is able to generate significantly better results than the state-of-the-art methods. The source code, data and trained models are available at https://github.com/Ha0Tang/SelectionGAN.
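The abstract does not spell out how the multi-channel attention selection step works. The following is a minimal sketch under one common reading, assumed here rather than taken from the abstract: the second stage produces N candidate images plus N attention maps, the attention values are softmax-normalized per pixel, and the refined output is the attention-weighted sum of the candidates. The function names (softmax, select_pixel, select_image) are hypothetical and not taken from the authors' repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pixel(candidates, attention_logits):
    """Blend N candidate values for one pixel using N attention logits.

    Assumption: weights are a softmax over the N attention channels,
    so they are non-negative and sum to 1 at every pixel.
    """
    if len(candidates) != len(attention_logits):
        raise ValueError("need one attention logit per candidate")
    weights = softmax(attention_logits)
    return sum(w * c for w, c in zip(weights, candidates))


def select_image(candidate_images, attention_maps):
    """Apply select_pixel at every location.

    candidate_images and attention_maps are lists of N single-channel
    images, each given as a list of rows (H x W floats).
    """
    height = len(candidate_images[0])
    width = len(candidate_images[0][0])
    return [
        [
            select_pixel(
                [img[r][c] for img in candidate_images],
                [att[r][c] for att in attention_maps],
            )
            for c in range(width)
        ]
        for r in range(height)
    ]
```

With equal attention logits the output is the plain average of the candidates; as one logit dominates, the output approaches that candidate's pixel value, which is the sense in which the mechanism "selects" among channels.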

Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation

Wang, Tang, et al. 2018

298

169

Recent works have shown the benefit of integrating Conditional Random Field (CRF) models into deep architectures for improving pixel-level prediction tasks. Following this line of research, in this paper we introduce a novel approach for monocular depth estimation. As in previous works, our method employs a continuous CRF to fuse multi-scale information derived from different layers of a front-end Convolutional Neural Network (CNN). Unlike past works, our approach benefits from a structured attention model that automatically regulates the amount of information transferred between corresponding features at different scales. Importantly, the proposed attention model is seamlessly integrated into the CRF, allowing end-to-end training of the entire architecture. Our extensive experimental evaluation demonstrates the effectiveness of the proposed method, which is competitive with previous methods on the KITTI benchmark and outperforms the state of the art on the NYU Depth V2 dataset.

XingGAN for Person Image Generation

et al. 2020

GestureGAN for Hand Gesture-to-Gesture Translation in the Wild

Tang, Wang, et al. 2018

115

Hand gesture-to-gesture translation in the wild is a challenging task since hand gestures can have arbitrary poses, sizes, locations and self-occlusions. Therefore, this task requires a high-level understanding of the mapping between the input source gesture and the output target gesture. To tackle this problem, we propose a novel hand Gesture Generative Adversarial Network (GestureGAN). GestureGAN consists of a single generator G and a discriminator D, which take as input a conditional hand image and a target hand skeleton image. GestureGAN utilizes the hand skeleton information explicitly, and learns the gesture-to-gesture mapping through two novel losses, the color loss and the cycle-consistency loss. The proposed color loss handles the issue of "channel pollution" while back-propagating the gradients. In addition, we present the Fréchet ResNet Distance (FRD) to evaluate the quality of generated images. Extensive experiments on two widely used benchmark datasets demonstrate that the proposed GestureGAN achieves state-of-the-art performance on the unconstrained hand gesture-to-gesture translation task. Meanwhile, the generated images are of high quality and are photo-realistic, allowing them to be used as data augmentation to improve the performance of a hand gesture classifier. Our model and code are available at https://github.com/Ha0Tang/GestureGAN.

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

Tang, Yan, et al. 2020

129

Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation

Tang, Liu, et al. 2019

In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C2GAN) for the task of keypoint-guided image generation. The proposed C2GAN is a cross-modal framework that jointly exploits the keypoint and the image data in an interactive manner. C2GAN contains two different types of generators, i.e., a keypoint-oriented generator and an image-oriented generator. Both are mutually connected in an end-to-end learnable fashion and explicitly form three cycled sub-networks, i.e., one image generation cycle and two keypoint generation cycles. Each cycle not only aims at reconstructing the input domain, but also produces a useful output that is involved in the generation of another cycle. In this way, the cycles constrain each other implicitly, which provides complementary information from the two different modalities and brings extra supervision across cycles, thus facilitating more robust optimization of the whole network. Extensive experimental results on two publicly available datasets, i.e., Radboud Faces [19] and Market-1501 [58], demonstrate that our approach is effective at generating more photo-realistic images compared with state-of-the-art models.

Impaired prefrontal–amygdala effective connectivity is responsible for the dysfunction of emotion process in major depressive disorder: A dynamic causal modeling study on MEG

Lü, Luo, et al. 2012

Neuroscience Letters

112

AttentionGAN: Unpaired Image-to-Image Translation Using Attention-Guided Generative Adversarial Networks

Tang, Liu, et al. 2023

IEEE Trans. Neural Netw. Learning Syst.
