2014
DOI: 10.1007/978-3-319-10605-2_38

Transductive Multi-view Embedding for Zero-Shot Recognition and Annotation

Abstract: Most existing zero-shot learning approaches exploit transfer learning via an intermediate-level semantic representation such as visual attributes or semantic word vectors. Such a semantic representation is shared between an annotated auxiliary dataset and a target dataset with no annotation. A projection from a low-level feature space to the semantic space is learned from the auxiliary dataset and is applied without adaptation to the target dataset. In this paper we identify an inherent limitation wi…

Cited by 194 publications (207 citation statements)
References 26 publications
“…In [2], Akata et al. propose an embedding-based framework that regards all of the defined attributes as a whole representation. Many recent approaches adopt such an embedding manner and achieve promising results [13,4,33,15,7,19,39,8,23]. Besides, similarity-based frameworks also adopt the embedding approach [24,40,41,34,8,25].…”
Section: Related Work (mentioning)
confidence: 99%
“…For embedding-based approaches, direct attribute prediction (DAP) and indirect attribute prediction (IAP) [22] are the two fundamental paradigms. More sophisticated zero-shot models are also proposed, such as max-margin semi-supervised learning for exploiting the unlabeled data [23], and multi-view zero-shot learning for utilizing multiple data sources [15]. Multiple knowledge bases such as Wikipedia [14,26], web search logs [25] and human-annotated images [22] are compared.…”
Section: Problem Setting 2: Zero-shot Learning (mentioning)
confidence: 99%
“…These methods focus on issues such as prediction time complexity and tail labels that consist of most of the labels but have scarce true positives [1,4–8,19,28,33,39]. Methods in the second category make the weaker assumption that only a small number of classes have labeled data (the "seen" classes), while the vast remaining classes have zero labeled data (the "unseen" classes) and need to be predicted, resulting in the so-called zero-shot learning problem [12,14,15,22–26,32,35]. Both lines of research share the common assumption that the label space is in a low-dimensional space and the original labels can be expressed using a much smaller set of signals, and model training and prediction in this compressed space are thus more efficient.…”
Section: Introduction (mentioning)
confidence: 99%
“…Semantic attributes [22] have gained wide popularity as an effective representation in the broader vision community with application to object recognition [22,23], person identification [21], and action recognition [24,25]. However application of attributes to faces [26] or face recognition [27] has been relatively limited, and their potential for bridging the cross-modal gap is not yet explored.…”
Section: Semantic Attributes (mentioning)
confidence: 99%