2021
DOI: 10.1109/tpami.2019.2942028

Harmonized Multimodal Learning with Gaussian Process Latent Variable Models

Abstract: Multimodal learning aims to discover the relationship between multiple modalities. It has become an important research topic due to extensive multimodal applications such as cross-modal retrieval. This paper attempts to address the modality heterogeneity problem based on Gaussian process latent variable models (GPLVMs) to represent multimodal data in a common space. Previous multimodal GPLVM extensions generally adopt individual learning schemes on latent representations and kernel hyperparameters, which ignor…

Cited by 18 publications (15 citation statements)
References 45 publications (82 reference statements)
“…In this paper, Eq. (18) can be solved by the scaled conjugate gradient (SCG) technique [24] in the same manner as previous GPLVM-based approaches [22,25]. Since the proposed method assumes that the similarity matrices of the observations are generated from the latent variables in Eq.…”
Section: Feature Integration via Semi-OMGP
confidence: 99%
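As a rough illustration of this optimization step, the sketch below fits GPLVM latent variables with a gradient-based conjugate-gradient routine. SciPy's standard nonlinear CG method is used here as a stand-in for the scaled conjugate gradient (SCG) of [24], and the RBF kernel, data shapes, and variable names are illustrative assumptions rather than the cited method's exact setup.

```python
# Minimal sketch: optimizing GPLVM latent variables with a conjugate-gradient
# routine (SciPy's 'CG' used as a stand-in for scaled conjugate gradient).
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, lengthscale=1.0, variance=1.0, noise=1e-2):
    # Squared-exponential kernel on the latent points plus a small noise term.
    sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    return variance * np.exp(-0.5 * sq / lengthscale**2) + noise * np.eye(len(X))

def gplvm_nll(x_flat, Y, Q):
    # Negative GPLVM log-likelihood (additive constants dropped):
    # Y is the N x D observation matrix, X the N x Q latent matrix.
    N, D = Y.shape
    X = x_flat.reshape(N, Q)
    K = rbf_kernel(X)
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * D * logdet + 0.5 * np.trace(np.linalg.solve(K, Y @ Y.T))

# Toy data: 50 points, 5 observed dimensions, 2 latent dimensions (assumed).
rng = np.random.default_rng(0)
Y = rng.standard_normal((50, 5))
Q = 2
x0 = 0.1 * rng.standard_normal(50 * Q)
res = minimize(gplvm_nll, x0, args=(Y, Q), method="CG",
               options={"maxiter": 200})
X_latent = res.x.reshape(50, Q)
print("final objective:", res.fun)
```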
“…Each image is also annotated by 5 independent sentences via Amazon Mechanical Turk. Following [54], after removing images without labels, we randomly select 10,000 image-text pairs for testing, and the remaining 72,081 pairs are used for training. For image representation, we use the multi-layer CNN features extracted from conv4, conv5, fc6 and fc7 of a pre-trained VGG-19 model.…”
Section: E. Performance Evaluation on MSCOCO
confidence: 99%
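A plausible way to reproduce this kind of multi-layer feature extraction is sketched below using torchvision's pre-trained VGG-19 and forward hooks. The specific layer indices, the preprocessing, and the image path "example.jpg" are assumptions for illustration, not the cited paper's exact pipeline.

```python
# Sketch: capturing intermediate VGG-19 activations (last conv of blocks 4/5
# and fc6/fc7) with forward hooks; layer indices assume torchvision's layout.
import torch
from torchvision import models, transforms
from PIL import Image

vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).eval()

captured = {}
def save_to(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

vgg.features[25].register_forward_hook(save_to("conv4"))   # last conv, block 4
vgg.features[34].register_forward_hook(save_to("conv5"))   # last conv, block 5
vgg.classifier[0].register_forward_hook(save_to("fc6"))
vgg.classifier[3].register_forward_hook(save_to("fc7"))

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# "example.jpg" is a placeholder image path.
img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    vgg(img)

for name, feat in captured.items():
    print(name, tuple(feat.shape))
```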
“…(2) Harmonized GPLVM [54]: a harmonized multi-modal GPLVM which performs topological alignment between the hyperparameter space of multi-modal GPLVM and the kernel matrix of the joint latent space. We report the results using the best variants hm-SimGP (tr) and hm-RSimGP (tr) with trace-norm kernel alignment.…”
Section: E. Performance Evaluation on MSCOCO
confidence: 99%
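The alignment idea can be illustrated with a generic kernel-alignment score between two kernel matrices, as in the hedged sketch below. This uses the standard centered kernel-alignment formula on two kernels built from related point sets; it is not necessarily the exact trace-norm alignment term used by hm-SimGP (tr) / hm-RSimGP (tr).

```python
# Sketch: a generic kernel-alignment score between two kernel matrices, e.g.
# one built from latent points and one built elsewhere (such as a
# hyperparameter-space kernel). The centering and alignment formula follow the
# standard kernel-alignment definition, not the paper's specific variant.
import numpy as np

def center(K):
    # Double-center a kernel matrix: H K H with H = I - (1/n) 11^T.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_alignment(K1, K2):
    # <K1, K2>_F / (||K1||_F ||K2||_F) on centered kernels; larger values mean
    # the two kernels induce a more similar geometry on the same points.
    K1c, K2c = center(K1), center(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def rbf(A):
    sq = np.sum(A**2, 1)[:, None] + np.sum(A**2, 1)[None, :] - 2 * A @ A.T
    return np.exp(-0.5 * sq)

# Toy example: two RBF kernels from a point set and a perturbed copy of it.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 2))
Z = X + 0.1 * rng.standard_normal((30, 2))
print("alignment:", kernel_alignment(rbf(X), rbf(Z)))
```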
“…The GPJM framework is based on Gaussian process latent variable models (GPLVMs) and their hierarchical extensions (22-25), which pursue commonly shared latent representations across simultaneous observations from multiple measurement sources. GPLVMs have been successfully used in many machine-learning applications (26-31).…”
Section: Gaussian Process as a Linking Function
confidence: 99%
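As a rough picture of such commonly shared latent representations, a generic shared-latent GPLVM objective can be written as below, where M modalities Y^{(m)} (each N x D_m) are generated from one common latent matrix X. The notation is a standard shared-GPLVM formulation assumed for illustration, not the GPJM's exact construction.

$$
\log p\big(Y^{(1)},\dots,Y^{(M)} \mid X\big)
= \sum_{m=1}^{M} \sum_{d=1}^{D_m}
\log \mathcal{N}\!\big(y^{(m)}_{:,d} \;\big|\; \mathbf{0},\; K_m(X,X) + \sigma_m^2 I\big),
$$

where $K_m(X,X)$ is modality $m$'s kernel matrix evaluated on the shared latent points and $\sigma_m^2$ its noise variance; maximizing this joint likelihood couples all modalities through the single latent matrix $X$.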