Zhengming Ding scite author profile

Conventional zero-shot learning (ZSL) methods generally learn an embedding, e.g., visual-semantic mapping, to handle the unseen visual samples via an indirect manner. In this paper, we take the advantage of generative adversarial networks (GANs) and propose a novel method, named leveraging invariant side GAN (LisGAN), which can directly generate the unseen features from random noises which are conditioned by the semantic descriptions. Specifically, we train a conditional Wasserstein GANs in which the generator synthesizes fake unseen features from noises and the discriminator distinguishes the fake from real via a minimax game. Considering that one semantic description can correspond to various synthesized visual samples, and the semantic description, figuratively, is the soul of the generated features, we introduce soul samples as the invariant side of generative zero-shot learning in this paper. A soul sample is the meta-representation of one class. It visualizes the most semantically-meaningful aspects of each sample in the same category. We regularize that each generated sample (the varying side of generative ZSL) should be close to at least one soul sample (the invariant side) which has the same class label with it. At the zero-shot recognition stage, we propose to use two classifiers, which are deployed in a cascade way, to achieve a coarse-to-fine result. Experiments on five popular benchmarks verify that our proposed approach can outperform state-of-the-art methods with significant improvements 1 .

show abstract

MOGONET integrates multi-omics data using graph convolutional networks allowing patient classification and biomarker identification

Wang

et al. 2021

View full text Add to dashboard Cite

To fully utilize the advances in omics technologies and achieve a more comprehensive understanding of human diseases, novel computational methods are required for integrative analysis of multiple types of omics data. Here, we present a novel multi-omics integrative method named Multi-Omics Graph cOnvolutional NETworks (MOGONET) for biomedical classification. MOGONET jointly explores omics-specific learning and cross-omics correlation learning for effective multi-omics data classification. We demonstrate that MOGONET outperforms other state-of-the-art supervised multi-omics integrative analysis approaches from different biomedical classification applications using mRNA expression data, DNA methylation data, and microRNA expression data. Furthermore, MOGONET can identify important biomarkers from different omics data types related to the investigated biomedical problems.

show abstract

3D Human Pose Estimation with Spatial and Temporal Transformers

et al. 2021

View full text Add to dashboard Cite

Domain Invariant and Class Discriminative Feature Learning for Visual Domain Adaptation

Song

Huang

et al. 2018

IEEE Trans. on Image Process.

222

103

View full text Add to dashboard Cite

Domain adaptation manages to build an effective target classifier or regression model for unlabeled target data by utilizing the well-labeled source data but lying different distributions. Intuitively, to address domain shift problem, it is crucial to learn domain invariant features across domains, and most existing approaches have concentrated on it. However, they often do not directly constrain the learned features to be class discriminative for both source and target data, which is of vital importance for the final classification. Therefore, in this paper, we put forward a novel feature learning method for domain adaptation to construct both domain invariant and class discriminative representations, referred to as DICD. Specifically, DICD is to learn a latent feature space with important data properties preserved, which reduces the domain difference by jointly matching the marginal and class-conditional distributions of both domains, and simultaneously maximizes the inter-class dispersion and minimizes the intra-class scatter as much as possible. Experiments in this paper have demonstrated that the class discriminative properties will dramatically alleviate the cross-domain distribution inconsistency, which further boosts the classification performance. Moreover, we show that exploring both domain invariance and class discriminativeness of the learned representations can be integrated into one optimization framework, and the optimal solution can be derived effectively by solving a generalized eigen-decomposition problem. Comprehensive experiments on several visual cross-domain classification tasks verify that DICD can outperform the competitors significantly.

show abstract

Low-Rank Common Subspace for Multi-view Learning

Ding

2014

117

View full text Add to dashboard Cite

Multi-view data is very popular in real-world applications, as different view-points and various types of sensors help to better represent data when fused across views or modalities. Samples from different views of the same class are less similar than those with the same view but different class. We consider a more general case that prior view information of testing data is inaccessible in multi-view learning. Traditional multiview learning algorithms were designed to obtain multiple viewspecific linear projections and would fail without this prior information available. That was because they assumed the probe and gallery views were known in advance, so the correct viewspecific projections were to be applied in order to better learn low-dimensional features. To address this, we propose a LowRank Common Subspace (LRCS) for multi-view data analysis, which seeks a common low-rank linear projection to mitigate the semantic gap among different views. The low-rank common projection is able to capture compatible intrinsic information across different views and also well-align the within-class samples from different views. Furthermore, with a low-rank constraint on the view-specific projected data and that transformed by the common subspace, the within-class samples from multiple views would concentrate together. Different from the traditional supervised multi-view algorithms, our LRCS works in a weakly supervised way, where only the view information gets observed. Such a common projection can make our model more flexible when dealing with the problem of lacking prior view information of testing data. Two scenarios of experiments, robust subspace learning and transfer learning, are conducted to evaluate our algorithm. Experimental results on several multi-view datasets reveal that our proposed method outperforms state-of-the-art, even when compared with some supervised learning methods.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Zhengming Ding

Leveraging the Invariant Side of Generative Zero-Shot Learning

MOGONET integrates multi-omics data using graph convolutional networks allowing patient classification and biomarker identification

3D Human Pose Estimation with Spatial and Temporal Transformers

Domain Invariant and Class Discriminative Feature Learning for Visual Domain Adaptation

Low-Rank Common Subspace for Multi-view Learning

Contact Info

Product

Resources

About