2015 IEEE International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2015.466
Multi-label Cross-Modal Retrieval


Cited by 195 publications (91 citation statements). References 26 publications.
“…A similar phenomenon can also be observed for other extensions of harmonized GPLVM models. Compared to existing subspace learning approaches [29], [64], which usually fix the dimensionality of the common space to 10 as reported in their papers, we obtain a lower-dimensional embedding that summarizes the high-dimensional data, which also demonstrates the remarkable representation learning ability of our non-linear, nonparametric model. Thus we can improve the efficiency of our model with latent embeddings of lower dimensionality.…”
Section: Dimensionality of the Latent Space
confidence: 81%
“…In a pioneering work [20], Canonical Correlation Analysis (CCA) [8] was used to learn linear projections for each modality by finding a set of canonical coefficients that define a subspace in which the modalities are maximally correlated. This approach was extended to the multi-label scenario by using label information to establish correspondences between instances [18]. A multi-view kernel CCA formulation is proposed in [4], where a joint space for visual, textual and semantic information is learned.…”
Section: Related Work
confidence: 99%
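
As a rough illustration of the CCA-based cross-modal projection described in the statement above, the following minimal sketch uses scikit-learn's CCA to project toy image and text features into a shared subspace and rank one modality against the other; the feature dimensions, random data, and 10-dimensional common space are illustrative assumptions, not the exact setup of the cited works.

```python
# Minimal sketch of CCA-based cross-modal subspace learning
# (hypothetical data and dimensions, not the cited papers' setup).
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_pairs = 200
img_feats = rng.standard_normal((n_pairs, 128))   # e.g. visual descriptors
txt_feats = rng.standard_normal((n_pairs, 64))    # e.g. text/tag descriptors

# Learn linear projections that maximize correlation between the two views.
cca = CCA(n_components=10)                        # common-space dimensionality
cca.fit(img_feats, txt_feats)
img_proj, txt_proj = cca.transform(img_feats, txt_feats)

# Cross-modal retrieval: rank text items by cosine similarity to an image query.
def cosine_sim(a, b):
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

ranking = np.argsort(-cosine_sim(img_proj[:1], txt_proj), axis=1)
print(ranking[0][:5])  # indices of the top-5 retrieved text items
```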
“…First, these methods simply and directly adopt single-class labels to measure semantic relevance across modalities [9], [12]. In fact, in standard cross-modal benchmark datasets such as NUS-WIDE [6] and Microsoft COCO [15], an image instance can be assigned multiple category labels [27], which is beneficial because it allows semantic relevance across modalities to be described more accurately. Second, these methods narrow the modality gap by constraining the corresponding hash codes with certain pre-defined loss functions [4].…”
Section: Introduction
confidence: 99%
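
To make the multi-label point concrete, the sketch below contrasts a single-label relevance test with a graded multi-label relevance score computed as the cosine similarity of multi-hot label vectors; the label vocabulary and example annotations are hypothetical and not taken from the cited methods.

```python
# Illustrative comparison of single-label vs. multi-label semantic relevance
# (hypothetical labels; not the exact measure used in the cited methods).
import numpy as np

# Multi-hot label vectors over a small vocabulary, e.g. NUS-WIDE-style tags.
labels = ["person", "beach", "dog", "sunset", "car"]
img_labels = np.array([1, 1, 0, 1, 0])   # image tagged person/beach/sunset
txt_labels = np.array([1, 1, 0, 0, 0])   # caption mentioning person/beach

# Single-label view: relevant only if one dominant label matches exactly.
single_label_relevant = img_labels.argmax() == txt_labels.argmax()

# Multi-label view: graded relevance from label-vector overlap.
def label_cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

multi_label_relevance = label_cosine(img_labels, txt_labels)  # ~0.82 here
print(single_label_relevant, round(multi_label_relevance, 2))
```

The graded score distinguishes partially related image-text pairs from unrelated ones, which is the finer notion of cross-modal relevance the statement attributes to multi-label annotations.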