2017
DOI: 10.1007/s11042-017-4797-4

Multimedia retrieval based on non-linear graph-based fusion and partial least squares regression

Abstract: Heterogeneous sources of information, such as images, videos, text and metadata, are often used to describe different or complementary views of the same multimedia object, especially in the online news domain and in large annotated image collections. The retrieval of multimedia objects, given a multimodal query, requires the combination of several sources of information in an efficient and scalable way. Towards this direction, we provide a novel unsupervised framework for multimodal fusion of visual and textual…

Cited by 9 publications (6 citation statements)
References 32 publications (48 reference statements)
“…PLSR (Partial Least Squares Regression) [28] method is a regression modeling method of multidependent variable Y for multi-independent variable X. In the process of regression, the method considers both extracting the principal components in Y and X as much as possible (PCA-Principal Component Analysis) [29] and maximizing the correlation between the extracted principal components respectively (CCA).…”
Section: Introduction Of Classic Cross-media Retrieval Methods (mentioning)
confidence: 99%
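
As a rough illustration of the idea in the statement above, the sketch below computes only the first PLS component with one NIPALS-style step. It assumes a single response variable y (the quoted text describes the multi-response case), and the function names `center_columns`, `first_pls_component` and `predict` are hypothetical, not taken from the cited papers. The weight vector w is the direction in X whose projection has maximal covariance with y.

```python
import math


def center_columns(rows):
    """Subtract each column's mean so covariances become plain dot products."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows], means


def first_pls_component(X, y):
    """First PLS component for a single response y (assumption made for brevity)."""
    Xc, x_means = center_columns(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]

    # w is proportional to Xc^T yc: the unit direction in X-space whose
    # projection has maximal covariance with y.
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(len(Xc[0]))]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Scores: projection of each centered sample onto w.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]

    # Least-squares slope of the centered response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return w, t, q, x_means, y_mean


def predict(x, w, q, x_means, y_mean):
    """Predict y for a new sample x from the single fitted component."""
    t = sum((xi - m) * wi for xi, m, wi in zip(x, x_means, w))
    return y_mean + q * t


# Toy data (made up for illustration): y = 2 * x1, and x2 = 2 * x1.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [2.0, 4.0, 6.0, 8.0]
w, t, q, x_means, y_mean = first_pls_component(X, y)
print(predict([5.0, 10.0], w, q, x_means, y_mean))
```

A full PLSR would deflate X and y and repeat this step for further components; the sketch stops after one and does not guard against a zero-covariance input.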
“…In the work of [134], the multimodal fusion of visual and textual similarities was explored from the perspective of an unsupervised framework. These similarities are based on visual features and concepts, as well as textual metadata with the purpose of integrating nonlinear graph-based fusion and PLS regression.…”
Section: Fusion Solutions For Various Iot Environments (mentioning)
confidence: 99%
“…A unifying model for unsupervised fusion of all similarities per modality has been presented in [10] and has been generalized to a non-linear fusion approach [2] that combines cross-media similarities with diffusion-based scores on the graph of items, in a non-linear but scalable way, for several modalities. Other methodologies for combining heterogeneous modalities involve Partial Least Squares [11], [12] and correlation matching, mapping multiple modalities to points in a common linear subspace. In [13] a video retrieval framework is proposed, which fuses textual and visual information in a non-linear way.…”
Section: Related Work (mentioning)
confidence: 99%
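
To make the notion of diffusion-based scores on a graph of items more concrete, here is a generic random walk with restart over an item similarity matrix. This is a standard technique swapped in for illustration, not the non-linear fusion of the cited work, and the names `row_normalize` and `diffuse` as well as the `restart` and `steps` parameters are assumptions.

```python
def row_normalize(S):
    """Turn a non-negative similarity matrix into row-stochastic transition probabilities."""
    out = []
    for row in S:
        total = sum(row)
        out.append([v / total for v in row] if total else [0.0] * len(row))
    return out


def diffuse(S, query_scores, restart=0.5, steps=50):
    """Random walk with restart: spread query relevance along graph edges."""
    P = row_normalize(S)
    n = len(S)
    total = sum(query_scores)
    s = [v / total for v in query_scores]  # restart distribution
    x = list(s)
    for _ in range(steps):
        # One walk step: mass at item i moves to item j with probability P[i][j].
        walked = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Mix the walked mass with the original query similarities.
        x = [(1 - restart) * wv + restart * qv for wv, qv in zip(walked, s)]
    return x


# Toy chain graph 0 - 1 - 2 (made up): the query matches item 0 only.
S = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
scores = diffuse(S, [1.0, 0.0, 0.0])
print(scores)
```

In this toy graph item 2 has no direct similarity to the query, yet it receives a non-zero score through its neighbour, which is the effect that graph diffusion contributes on top of direct similarities.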
“…The values of α_m, β_m, γ_m, m = 1, 2, 3 parameters are optimized following the methodology presented in [12], which involves keeping constant a set of parameters while examining the effect on the change of the others in the retrieval.…”
Section: B Fusion Of Multiple Modalities (mentioning)
confidence: 99%
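
The tuning strategy described in the last statement, holding some parameters fixed while varying the others, can be sketched as a coordinate-wise grid search. The function `coordinate_search`, the three-parameter setup and the toy scoring function below are hypothetical stand-ins; the methodology in [12] may differ in its details, and a real score would be a retrieval metric measured on held-out queries.

```python
def coordinate_search(score, params, grid, sweeps=2):
    """Tune one parameter at a time while the others are held constant."""
    best = dict(params)
    best_score = score(best)
    names = list(params)
    for _ in range(sweeps):
        for name in names:
            for value in grid:
                trial = dict(best)
                trial[name] = value  # vary only this parameter
                s = score(trial)
                if s > best_score:
                    best, best_score = trial, s
    return best, best_score


def toy_score(p):
    """Made-up stand-in for retrieval quality, peaking at (0.3, 0.7, 0.5)."""
    return -((p["alpha"] - 0.3) ** 2 + (p["beta"] - 0.7) ** 2 + (p["gamma"] - 0.5) ** 2)


grid = [i / 10 for i in range(11)]  # candidate values 0.0, 0.1, ..., 1.0
start = {"alpha": 0.0, "beta": 0.0, "gamma": 0.0}
best, best_score = coordinate_search(toy_score, start, grid)
print(best)
```

Coordinate-wise search evaluates far fewer combinations than a full grid, at the cost of possibly missing optima that depend on interactions between parameters.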