2021
DOI: 10.1109/tpami.2019.2942028

Harmonized Multimodal Learning with Gaussian Process Latent Variable Models

Abstract: Multimodal learning aims to discover the relationship between multiple modalities. It has become an important research topic due to extensive multimodal applications such as cross-modal retrieval. This paper attempts to address the modality heterogeneity problem based on Gaussian process latent variable models (GPLVMs) to represent multimodal data in a common space. Previous multimodal GPLVM extensions generally adopt individual learning schemes on latent representations and kernel hyperparameters, which ignor…

Cited by 18 publications (15 citation statements)
References 45 publications (82 reference statements)
“…In this paper, Eq. (18) can be solved by the scaled conjugate gradient (SCG) technique [24] in the same manner as previous GPLVM-based approaches [22,25]. Since the proposed method assumes that the similarity matrices of the observations are generated from the latent variables in Eq.…”
Section: Feature Integration via Semi-OMGP
confidence: 99%
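As a rough illustration of this optimization step, the sketch below fits GPLVM latent variables with a gradient-based conjugate-gradient routine. SciPy's standard nonlinear CG method is used here as a stand-in for the scaled conjugate gradient (SCG) of [24], and the RBF kernel, data shapes, and variable names are illustrative assumptions rather than the cited method's exact setup.

```python
# Minimal sketch: optimizing GPLVM latent variables with a conjugate-gradient
# routine (SciPy's 'CG' used as a stand-in for scaled conjugate gradient).
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, lengthscale=1.0, variance=1.0, noise=1e-2):
    # Squared-exponential kernel on the latent points plus a small noise term.
    sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    return variance * np.exp(-0.5 * sq / lengthscale**2) + noise * np.eye(len(X))

def gplvm_nll(x_flat, Y, Q):
    # Negative GPLVM log-likelihood (additive constants dropped):
    # Y is the N x D observation matrix, X the N x Q latent matrix.
    N, D = Y.shape
    X = x_flat.reshape(N, Q)
    K = rbf_kernel(X)
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * D * logdet + 0.5 * np.trace(np.linalg.solve(K, Y @ Y.T))

# Toy data: 50 points, 5 observed dimensions, 2 latent dimensions (assumed).
rng = np.random.default_rng(0)
Y = rng.standard_normal((50, 5))
Q = 2
x0 = 0.1 * rng.standard_normal(50 * Q)
res = minimize(gplvm_nll, x0, args=(Y, Q), method="CG",
               options={"maxiter": 200})
X_latent = res.x.reshape(50, Q)
print("final objective:", res.fun)
```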
“…Each image is also annotated by 5 independent sentences via Amazon Mechanical Turk. Following [54], after removing images without labels, we randomly select 10,000 image-text pairs for testing, and the remaining 72,081 pairs are used for training. For image representation, we use the multi-layer CNN features extracted from conv4, conv5, fc6 and fc7 of a pre-trained VGG-19 model.…”
Section: E. Performance Evaluation on MSCOCO
confidence: 99%
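A plausible way to reproduce this kind of multi-layer feature extraction is sketched below using torchvision's pre-trained VGG-19 and forward hooks. The specific layer indices, the preprocessing, and the image path "example.jpg" are assumptions for illustration, not the cited paper's exact pipeline.

```python
# Sketch: capturing intermediate VGG-19 activations (last conv of blocks 4/5
# and fc6/fc7) with forward hooks; layer indices assume torchvision's layout.
import torch
from torchvision import models, transforms
from PIL import Image

vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).eval()

captured = {}
def save_to(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

vgg.features[25].register_forward_hook(save_to("conv4"))   # last conv, block 4
vgg.features[34].register_forward_hook(save_to("conv5"))   # last conv, block 5
vgg.classifier[0].register_forward_hook(save_to("fc6"))
vgg.classifier[3].register_forward_hook(save_to("fc7"))

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# "example.jpg" is a placeholder image path.
img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    vgg(img)

for name, feat in captured.items():
    print(name, tuple(feat.shape))
```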
“…(2) Harmonized GPLVM [54]: a harmonized multi-modal GPLVM which performs topological alignment between the hyperparameter space of multi-modal GPLVM and the kernel matrix of the joint latent space. We report the results using the best variants hm-SimGP (tr) and hm-RSimGP (tr) with trace-norm kernel alignment.…”
Section: E. Performance Evaluation on MSCOCO
confidence: 99%
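The alignment idea can be illustrated with a generic kernel-alignment score between two kernel matrices, as in the hedged sketch below. This uses the standard centered kernel-alignment formula on two kernels built from related point sets; it is not necessarily the exact trace-norm alignment term used by hm-SimGP (tr) / hm-RSimGP (tr).

```python
# Sketch: a generic kernel-alignment score between two kernel matrices, e.g.
# one built from latent points and one built elsewhere (such as a
# hyperparameter-space kernel). The centering and alignment formula follow the
# standard kernel-alignment definition, not the paper's specific variant.
import numpy as np

def center(K):
    # Double-center a kernel matrix: H K H with H = I - (1/n) 11^T.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_alignment(K1, K2):
    # <K1, K2>_F / (||K1||_F ||K2||_F) on centered kernels; larger values mean
    # the two kernels induce a more similar geometry on the same points.
    K1c, K2c = center(K1), center(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def rbf(A):
    sq = np.sum(A**2, 1)[:, None] + np.sum(A**2, 1)[None, :] - 2 * A @ A.T
    return np.exp(-0.5 * sq)

# Toy example: two RBF kernels from a point set and a perturbed copy of it.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 2))
Z = X + 0.1 * rng.standard_normal((30, 2))
print("alignment:", kernel_alignment(rbf(X), rbf(Z)))
```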
“…The GPJM framework is based on Gaussian process latent variable models (GPLVMs) and their hierarchical extensions (22-25), which pursue commonly shared latent representations across simultaneous observations from multiple measurement sources. GPLVMs have been successfully used in many machine-learning applications (26-31).…”
Section: Gaussian Process as a Linking Function
confidence: 99%
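As a rough picture of such commonly shared latent representations, a generic shared-latent GPLVM objective can be written as below, where M modalities Y^{(m)} (each N x D_m) are generated from one common latent matrix X. The notation is a standard shared-GPLVM formulation assumed for illustration, not the GPJM's exact construction.

$$
\log p\big(Y^{(1)},\dots,Y^{(M)} \mid X\big)
= \sum_{m=1}^{M} \sum_{d=1}^{D_m}
\log \mathcal{N}\!\big(y^{(m)}_{:,d} \;\big|\; \mathbf{0},\; K_m(X,X) + \sigma_m^2 I\big),
$$

where $K_m(X,X)$ is modality $m$'s kernel matrix evaluated on the shared latent points and $\sigma_m^2$ its noise variance; maximizing this joint likelihood couples all modalities through the single latent matrix $X$.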