Zhenyong Fu scite author profile

Zero-shot learning (ZSL) can be considered a special case of transfer learning where the source and target domains have different tasks/label spaces and the target domain is unlabelled, providing little guidance for the knowledge transfer. A ZSL method typically assumes that the two domains share a common semantic representation space, where a visual feature vector extracted from an image/video can be projected/embedded using a projection function. Existing approaches learn the projection function from the source domain and apply it without adaptation to the target domain. They are thus based on naive knowledge transfer, and the learned projections are prone to the domain shift problem. In this paper a novel ZSL method is proposed based on unsupervised domain adaptation. Specifically, we formulate a novel regularised sparse coding framework which uses the target domain class labels' projections in the semantic space to regularise the learned target domain projection, thus effectively overcoming the projection domain shift problem. Extensive experiments on four object and action recognition benchmark datasets show that the proposed ZSL method significantly outperforms the state of the art.
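The baseline this abstract criticises (project a visual feature into the semantic space, then pick the closest class prototype) can be sketched as follows. This is a minimal illustration only, not the paper's regularised sparse coding method: the projection matrix `W`, the class names, and all numbers are made-up assumptions, and cosine similarity is simply one common choice of metric.

```python
import math

def project(W, x):
    """Apply a linear projection W (one row per semantic dimension) to a visual feature x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either vector has zero norm."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def zero_shot_predict(W, x, prototypes):
    """Return the unseen class whose semantic prototype is most similar to the projected feature."""
    z = project(W, x)
    return max(prototypes, key=lambda name: cosine(z, prototypes[name]))

# Hypothetical toy data: W would normally be learned on the source (seen) classes
# and, in the naive setting, reused unchanged on the target (unseen) classes.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]
prototypes = {"zebra": [1.0, 0.2], "whale": [0.1, 1.0]}
label = zero_shot_predict(W, [0.9, 0.1, 0.2], prototypes)
```

Because `W` is fitted only on source classes, projected target features can land systematically away from the target prototypes, which is the domain shift the abstract refers to.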


Transductive Multi-view Embedding for Zero-Shot Recognition and Annotation

Hospedales, Xiang, et al. 2014

194

205


Most existing zero-shot learning approaches exploit transfer learning via an intermediate-level semantic representation such as visual attributes or semantic word vectors. Such a semantic representation is shared between an annotated auxiliary dataset and a target dataset with no annotation. A projection from a low-level feature space to the semantic space is learned from the auxiliary dataset and is applied without adaptation to the target dataset. In this paper we identify an inherent limitation with this approach. That is, due to having disjoint and potentially unrelated classes, the projection functions learned from the auxiliary dataset/domain are biased when applied directly to the target dataset/domain. We call this problem the projection domain shift problem and propose a novel framework, transductive multi-view embedding, to solve it. It is 'transductive' in that unlabelled target data points are explored for projection adaptation, and 'multi-view' in that both the low-level feature (view) and multiple semantic representations (views) are embedded to rectify the projection shift. We demonstrate through extensive experiments that our framework (1) rectifies the projection shift between the auxiliary and target domains, (2) exploits the complementarity of multiple semantic representations, (3) achieves state-of-the-art recognition results on image and video benchmark datasets, and (4) enables novel cross-view annotation tasks.


Zero-shot object recognition by semantic manifold distance

et al. 2015


Object recognition by zero-shot learning (ZSL) aims to recognise objects without seeing any visual examples by learning knowledge transfer between seen and unseen object classes. This is typically achieved by exploring a semantic embedding space such as attribute space or semantic word vector space. In such a space, both seen and unseen class labels, as well as image features, can be embedded (projected), and the similarity between them can thus be measured directly. Existing works differ in what embedding space is used and how to project the visual data into the semantic embedding space. Yet, they all measure the similarity in the space using a conventional distance metric (e.g. cosine) that does not consider the rich intrinsic structure, i.e. semantic manifold, of the semantic categories in the embedding space. In this paper we propose to model the semantic manifold in an embedding space using a semantic class label graph. The semantic manifold structure is used to redefine the distance metric in the semantic embedding space for more effective ZSL. The proposed semantic manifold distance is computed using a novel absorbing Markov chain process (AMP), which has a very efficient closed-form solution. The proposed new model improves upon and seamlessly unifies various existing ZSL algorithms. Extensive experiments on both the large-scale ImageNet dataset and the widely used Animals with Attributes (AwA) dataset show that our model significantly outperforms the state of the art.
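To give a sense of what a closed-form solution for an absorbing Markov chain looks like, the sketch below computes the standard absorption probabilities B = (I - Q)^{-1} R, where Q holds transition probabilities among transient states and R holds transitions from transient to absorbing states. This is the textbook result for generic absorbing chains, not the paper's exact AMP construction; how the graph nodes are assigned to transient and absorbing states, and the toy numbers, are assumptions made for illustration.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def absorption_probabilities(Q, R):
    """Return B = (I - Q)^{-1} R; B[i][k] is the chance that a walk from transient state i ends in absorbing state k."""
    n = len(Q)
    i_minus_q = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)]
                 for i in range(n)]
    return solve(i_minus_q, R)

# Hypothetical toy chain: two transient nodes and two absorbing nodes
# (for example, two class prototypes). Each row of [Q | R] sums to 1.
Q = [[0.0, 0.5],
     [0.5, 0.0]]
R = [[0.5, 0.0],
     [0.0, 0.5]]
B = absorption_probabilities(Q, R)
```

In this toy chain each transient node is absorbed by its directly connected absorbing node with probability 2/3 and by the other with probability 1/3, so the absorption probabilities reflect graph structure rather than a single pairwise distance.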


Person Re-Identification by Unsupervised $$\ell _1$$ Graph Learning

Kodirov, Xiang, et al. 2016

111


Learning the Redundancy-Free Features for Generalized Zero-Shot Object Recognition

Han, Yang 2020


Learning from Weak and Noisy Labels for Semantic Segmentation

Xiang, et al. 2017, IEEE Trans. Pattern Anal. Mach. Intell.

109


Multi-Scale Dynamic Feature Encoding Network for Image Demoiréing

Cheng, Yang 2019


The prevalence of digital sensors, such as digital cameras and mobile phones, simplifies the acquisition of photos. Digital sensors, however, tend to produce moiré when photographing objects with complex textures, which degrades photo quality. Moiré spreads across various frequency bands of an image and is a dynamic texture with varying colors and shapes; these two properties pose the main challenges in demoiréing, an important task in image restoration. In this paper, to address the first challenge, we design a multi-scale network that processes images at different spatial resolutions, obtaining features in different frequency bands, so that our method can jointly remove moiré in different frequency bands. To address the second challenge, we propose a dynamic feature encoding module (DFE), embedded in each scale, to handle the dynamic texture. Moiré patterns can be eliminated more effectively via DFE. Our proposed method, termed Multi-scale convolutional network with Dynamic feature encoding for image DeMoiréing (MDDM), can outperform the state of the art in both fidelity and perceptual quality on benchmarks.


Zero-Shot Learning on Semantic Class Prototype Graph

Xiang, Kodirov, et al. 2018, IEEE Trans. Pattern Anal. Mach. Intell.


Zero-Shot Learning (ZSL) for visual recognition is typically achieved by exploiting a semantic embedding space. In such a space, both seen and unseen class labels as well as image features can be embedded so that the similarity among them can be measured directly. In this work, we consider that the key to effective ZSL is to compute an optimal distance metric in the semantic embedding space. Existing ZSL works employ either Euclidean or cosine distances. However, in a high-dimensional space where the projected class labels (prototypes) are sparse, these distances are suboptimal, resulting in a number of problems including hubness and domain shift. To overcome these problems, a novel manifold distance computed on a semantic class prototype graph is proposed, which takes into account the rich intrinsic semantic structure, i.e., semantic manifold, of the class prototype distribution. To further alleviate the domain shift problem, a new regularisation term is introduced into a ranking-loss-based embedding model. Specifically, the ranking loss objective is regularised by unseen class prototypes to prevent the projected object features from being biased towards the seen prototypes. Extensive experiments on four benchmarks show that our method significantly outperforms the state of the art.


Unsupervised Domain Adaptation for Zero-Shot Learning
