Qi She scite author profile

Generative Adversarial Networks in Computer Vision

Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision, where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation, and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here: (1) the generation of high-quality images, (2) diversity of image generation, and (3) stabilizing training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state of the art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews of GANs have been presented to date, none have considered the status of this field based on progress toward addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant and loss-variant GANs for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress toward critical computer vision application requirements. As we do this, we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success, along with some suggestions for future research directions. Code related to the GAN variants studied in this work is summarized at https://github.com/sheqi/GAN_Review.

Involution: Inverting the Inherence of Convolution for Visual Recognition

Wang et al. 2021

226

100

MINE: Towards Continuous Depth MPI with NeRF for Novel View Synthesis

Li¹, Feng², She³ et al. 2021

Are We Ready for Service Robots? The OpenLORIS-Scene Datasets for Lifelong SLAM

Shi¹, Zhao³ et al. 2020

Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Wang¹, She², Ward³ 2019

Preprint

Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most revolutionary applications are in computer vision, in areas such as plausible image generation, image-to-image translation, facial attribute manipulation, and similar domains. Despite the significant success achieved in the computer vision field, applying GANs to real-world problems still poses significant challenges, three of which we focus on here: (1) high-quality image generation; (2) diverse image generation; and (3) stable training. Through an in-depth review of GAN-related research in the literature, we provide an account of the architecture-variants and loss-variants that have been proposed to handle these three challenges from two perspectives. We propose loss-variants and architecture-variants for classifying the most popular GANs, and discuss potential improvements by focusing on these two aspects. While several reviews of GANs have been presented to date, none have focused on reviewing GAN-variants based on how they handle the challenges mentioned above. In this paper, we review and critically discuss 7 architecture-variant GANs and 9 loss-variant GANs for remedying those three challenges. The objective of this review is to provide insight into where current GAN research focuses its efforts on performance improvement. Code related to the GAN-variants studied in this work is summarized at https://github.com/sheqi/GAN_Review.

ACTION-Net: Multipath Excitation for Action Recognition

Wang, She², Smolić 2021

137

Avalanche: an End-to-End Library for Continual Learning

Lomonaco, Pellegrini, Cossu et al. 2021

Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.

Involution: Inverting the Inherence of Convolution for Visual Recognition

Li¹, Hu², Wang³ et al. 2021

Preprint

Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic and channel-specific. Instead, we present a novel atomic operation for deep neural networks by inverting the aforementioned design principles of convolution, coined as involution. We additionally demystify the recent popular self-attention operator and subsume it into our involution family as an over-complicated instantiation. The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks, including ImageNet classification, COCO detection and segmentation, together with Cityscapes segmentation. Our involution-based models improve the performance of convolutional baselines using ResNet-50 by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding box AP, and 4.7% mean IoU absolutely while compressing the computational cost to 66%, 65%, 72%, and 57% on the above benchmarks, respectively. Code and pre-trained models for all the tasks are available at https://github.com/d-li14/involution.
