Although Generative Adversarial Networks (GANs) have shown remarkable success in various tasks, they still face challenges in generating high-quality images. In this paper, we propose Stacked Generative Adversarial Networks (StackGANs) aimed at generating high-resolution photo-realistic images. First, we propose a two-stage generative adversarial network architecture, StackGAN-v1, for text-to-image synthesis. The Stage-I GAN sketches the primitive shape and colors of a scene based on a given text description, yielding low-resolution images. The Stage-II GAN takes the Stage-I results and the text description as inputs and generates high-resolution images with photo-realistic details. Second, an advanced multi-stage generative adversarial network architecture, StackGAN-v2, is proposed for both conditional and unconditional generative tasks. Our StackGAN-v2 consists of multiple generators and multiple discriminators arranged in a tree-like structure; images at multiple scales corresponding to the same scene are generated from different branches of the tree. StackGAN-v2 shows more stable training behavior than StackGAN-v1 by jointly approximating multiple distributions. Extensive experiments demonstrate that the proposed stacked generative adversarial networks significantly outperform other state-of-the-art methods in generating photo-realistic images.
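The tree-structured, multi-scale generation in StackGAN-v2 can be illustrated with a toy sketch. This is not the paper's networks: the learned generator branches are replaced by placeholder nearest-neighbor upsampling, and only the structural idea is shown, that a shared latent code seeds a low-resolution image and each subsequent branch emits the same scene at a larger scale for its own discriminator.

```python
import numpy as np

def upsample2x(img):
    # Nearest-neighbor upsampling as a stand-in for a learned generator branch.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def stacked_generation(z, stages=3, base=8):
    """Toy sketch of multi-scale stacked generation (placeholder ops,
    not the real networks): the latent z seeds a low-resolution image,
    and each branch upsamples it, so every stage emits an image of the
    same scene at a growing scale."""
    img = z.reshape(base, base)      # lowest-resolution "Stage-I" sketch
    outputs = [img]
    for _ in range(stages - 1):
        img = upsample2x(img)        # stand-in for the next generator branch
        outputs.append(img)
    return outputs                   # images at 8x8, 16x16, 32x32, ...
```

In the actual model each branch is a trained generator and each scale has its own discriminator; jointly training all scales is what the abstract credits for the more stable optimization.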
Segmentation of pneumonia lesions from CT scans of COVID-19 patients is important for accurate diagnosis and follow-up. Deep learning has the potential to automate this task but requires a large set of high-quality annotations that are difficult to collect. Learning from noisy training labels, which are easier to obtain, has the potential to alleviate this problem. To this end, we propose a novel noise-robust framework to learn from noisy labels for the segmentation task. We first introduce a noise-robust Dice loss that generalizes the Dice loss (for segmentation) and the Mean Absolute Error (MAE) loss (for robustness against noise), then propose a novel COVID-19 Pneumonia Lesion segmentation network (COPLE-Net) to better deal with lesions of various scales and appearances. The noise-robust Dice loss and COPLE-Net are combined with an adaptive self-ensembling framework for training, where an Exponential Moving Average (EMA) of a student model is used as a teacher model that is adaptively updated by suppressing the contribution of the student to the EMA when the student has a large training loss.
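A minimal sketch of the two ingredients described above, under stated assumptions: the noise-robust Dice loss follows the form |p - g|^γ summed over voxels and normalized by the squared magnitudes, so γ = 2 recovers a Dice-style loss and γ → 1 behaves like MAE; the adaptive EMA rule shown here (increasing α toward 1 when the student's loss exceeds a running reference) is an illustrative assumption of how "suppressing the student's contribution" could be implemented, not the paper's exact schedule.

```python
import numpy as np

def noise_robust_dice_loss(pred, target, gamma=1.5, eps=1e-5):
    """Noise-robust Dice loss: sum |p - g|^gamma over voxels, normalized by
    (sum p^2 + sum g^2).  gamma=2 recovers a Dice-like loss; gamma near 1
    behaves like MAE, which is more robust to label noise."""
    num = np.sum(np.abs(pred - target) ** gamma)
    den = np.sum(pred ** 2) + np.sum(target ** 2) + eps
    return num / den

def adaptive_ema_update(teacher, student, student_loss, running_loss, alpha=0.99):
    """Hypothetical adaptive EMA: when the student's loss exceeds the running
    reference loss, push alpha toward 1 so the (likely noisy) student update
    is suppressed.  teacher/student are dicts of parameter arrays."""
    if student_loss > running_loss:
        excess = (student_loss - running_loss) / max(running_loss, 1e-8)
        alpha = alpha + (1.0 - alpha) * min(1.0, excess)
    return {k: alpha * teacher[k] + (1.0 - alpha) * student[k] for k in teacher}
```

With γ = 2 the loss equals 1 minus the (squared-denominator) Dice coefficient, so the function interpolates between Dice and MAE behavior as γ moves from 2 to 1.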
Multispectral pedestrian detection is essential for around-the-clock applications, e.g., surveillance and autonomous driving. Color and thermal images provide complementary visual information. As shown in Figure 1, thermal images usually present clear silhouettes of human objects [1] but lose fine visual details (e.g., clothing) that can be captured by RGB cameras, depending on external illumination. Nevertheless, except for very recent efforts (e.g., [2]), most previous studies concentrated on detecting pedestrians with color or thermal images alone, and it remains unknown how the color and thermal image channels can be properly fused in DNNs to achieve the best pedestrian detection synergy. In this paper, we focus on making the most of multispectral (color and thermal) images for pedestrian detection. Given the recent success of DNNs on generic object detection, it is natural and interesting to exploit their effectiveness for multispectral pedestrian detection. We analyze Faster R-CNN [3] in depth for this task and cast it as a convolutional network (ConvNet) fusion problem. We carefully design four distinct ConvNet fusion architectures that integrate two-branch ConvNets at different DNN stages, i.e., convolutional stages, fully-connected stages, and the decision stage, corresponding to information fusion at the low, middle, high, and confidence levels. All of these models outperform the strong baseline Faster R-CNN detector on the KAIST multispectral pedestrian dataset [4]. We find that our Halfway Fusion model, which fuses middle-level convolutional features, provides the best performance on multispectral pedestrian detection: it reduces the miss rate of the baseline Faster R-CNN by 11%, yielding a 37% overall miss rate on KAIST, which is also 3.5% lower than that of the other proposed fusion models.
We speculate that middle-level convolutional features from the color and thermal branches are more compatible for fusion: they carry some semantic meaning while not yet discarding all fine visual details.
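The halfway-fusion idea above can be sketched as follows. This is an illustration under assumed shapes, not the paper's exact layers: middle-level feature maps from the color and thermal branches are concatenated along the channel axis, then a 1x1 convolution (here written as a tensor contraction) fuses them and reduces the channel count before the shared detection head.

```python
import numpy as np

def halfway_fusion(color_feat, thermal_feat, w, b):
    """Sketch of halfway fusion: concatenate middle-level conv features from
    the color and thermal branches channel-wise, then apply a 1x1 convolution
    to fuse and reduce dimensionality.

    color_feat, thermal_feat: (C, H, W) feature maps
    w: (C_out, 2*C) 1x1-conv weights; b: (C_out,) bias
    """
    fused = np.concatenate([color_feat, thermal_feat], axis=0)       # (2C, H, W)
    out = np.tensordot(w, fused, axes=([1], [0])) + b[:, None, None]  # (C_out, H, W)
    return np.maximum(out, 0.0)  # ReLU
```

Fusing at this depth (rather than at the input, fully-connected, or decision stage) is exactly the design choice the abstract credits with the best miss rate.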