Multimedia retrieval based on non-linear graph-based fusion and partial least squares regression

Gialampoukidis, Ilias; Moumtzidou, Anastasia; Liparas, Dimitris; Tsikrika, Theodora; Vrochidis, Stefanos; Kompatsiaris, Ioannis

doi:10.1007/s11042-017-4797-4

Cited by 9 publications

(6 citation statements)

References 32 publications

(48 reference statements)

“…PLSR (Partial Least Squares Regression) [28] method is a regression modeling method of multidependent variable Y for multi-independent variable X. In the process of regression, the method considers both extracting the principal components in Y and X as much as possible (PCA-Principal Component Analysis) [29] and maximizing the correlation between the extracted principal components respectively (CCA).…”

Section: Introduction Of Classic Cross-media Retrieval Methods (mentioning)

confidence: 99%

A Cross-Media Retrieval Method Based on Semisupervised Learning and Alternate Optimization

Zhu, Yang, et al. 2021

Mobile Information Systems

With the continuous advancement in Internet technology, we are gradually stepping into an era of big data where a large amount of multimedia data is produced every day at any given time. In order to properly utilize these data, the research on big data is also constantly evolving. Cross-media retrieval is a prime example, aiming at retrieving various forms of data, for example, text, image, audio, video, and other forms. The most difficult task for cross-media retrieval lies in the potential correlation between different modalities data and how to overcome the semantic gap. This paper proposes a cross-media retrieval method based on semisupervised learning and alternate optimization (SMDCR) to overcome the abovementioned difficulties, thereby improving the retrieval accuracy. The main advantage of this method is to make full use of the degree of correlation between the semantic information of the labeled data and unlabeled data. Simultaneously, we combine the linear regression term, correlation analysis term, and feature selection term into a joint cross-media learning framework. Furthermore, the projection matrices are trained with the alternate optimization method. Finally, experimental results on two public datasets demonstrate the effectiveness of the proposed method.

“…In the work of [134], the multimodal fusion of visual and textual similarities was explored from the perspective of an unsupervised framework. These similarities are based on visual features and concepts, as well as textual metadata with the purpose of integrating nonlinear graph-based fusion and PLS regression.…”

Section: Fusion Solutions For Various Iot Environments (mentioning)

confidence: 99%

A Review of Multisensor Data Fusion Solutions in Smart Manufacturing: Systems and Trends

Tsanousa, Bektsis, Kyriakopoulos, et al. 2022

Sensors

Self Cite

Manufacturing companies increasingly become “smarter” as a result of the Industry 4.0 revolution. Multiple sensors are used for industrial monitoring of machines and workers in order to detect events and consequently improve the manufacturing processes, lower the respective costs, and increase safety. Multisensor systems produce big amounts of heterogeneous data. Data fusion techniques address the issue of multimodality by combining data from different sources and improving the results of monitoring systems. The current paper presents a detailed review of state-of-the-art data fusion solutions, on data storage and indexing from various types of sensors, feature engineering, and multimodal data integration. The review aims to serve as a guide for the early stages of an analytic pipeline of manufacturing prognosis. The reviewed literature showed that in fusion and in preprocessing, the methods chosen to be applied in this sector are beyond the state-of-the-art. Existing weaknesses and gaps that lead to future research goals were also identified.

“…A unifying model for unsupervised fusion of all similarities per modality has been presented in [10] and has been generalized to a non-linear fusion approach [2] that combines cross-media similarities with diffusion-based scores on the graph of items, in a non-linear but scalable way, for several modalities. Other methodologies for combining heterogeneous modalities involve Partial Least Squares [11], [12] and correlation matching, mapping multiple modalities to points in a common linear subspace. In [13] a video retrieval framework is proposed, which fuses textual and visual information in a non-linear way.…”

Section: Related Work (mentioning)

confidence: 99%

“…The values of α_m, β_m, γ_m, m = 1, 2, 3 parameters are optimized following the methodology presented in [12], which involves keeping constant a set of parameters while examining the effect on the change of the others in the retrieval.…”

Section: B Fusion Of Multiple Modalities (mentioning)

confidence: 99%
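
The tuning strategy in the statement above (hold some parameters fixed, vary the others, watch the effect on retrieval) can be illustrated with a one-at-a-time search. This is only a sketch under stated assumptions, not the procedure from [12]: the function `one_at_a_time_search`, the three parameter names, the 0.0 to 1.0 grid, and `toy_score` (a stand-in for a real retrieval metric) are all hypothetical.

```python
def one_at_a_time_search(score, grid, start, sweeps=2):
    """Vary one parameter over `grid` while the others stay fixed.

    `score` maps a dict of parameter values to a number (higher is better).
    """
    names = list(start)
    params = dict(start)
    best = score(params)
    for _ in range(sweeps):
        for name in names:
            for value in grid:
                trial = dict(params)
                trial[name] = value  # only this parameter changes
                trial_score = score(trial)
                if trial_score > best:
                    best, params = trial_score, trial
    return params, best


def toy_score(p):
    # Hypothetical stand-in for a retrieval metric; it peaks at
    # alpha=0.4, beta=0.2, gamma=0.8 by construction.
    return -((p["alpha"] - 0.4) ** 2
             + (p["beta"] - 0.2) ** 2
             + (p["gamma"] - 0.8) ** 2)


grid = [i / 10 for i in range(11)]  # candidate values 0.0, 0.1, ..., 1.0
start = {"alpha": 0.0, "beta": 0.0, "gamma": 0.0}
best_params, best_score = one_at_a_time_search(toy_score, grid, start)
```

A search of this kind evaluates far fewer settings than a full grid, but it can miss the best combination when the parameters interact strongly; the toy score here is separable, so a single sweep already finds its peak.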

Fusion of Compound Queries with Multiple Modalities for Known Item Video Search

Gialampoukidis, Moumtzidou, Vrochidis, et al. 2018

2018 IEEE 13th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP)

Self Cite

Multimedia collections are ubiquitous and very often contain hundreds of hours of video information. The retrieval of a particular scene of a video (Known Item Search) in a large collection is a difficult problem, considering the multimodal character of all video shots and the complexity of the query, either visual or textual. We tackle these challenges by fusing, first, multiple modalities in a nonlinear graph-based way for each subtopic of the query. In addition, we fuse the top retrieved video shots per sub-query to provide the final list of retrieved shots, which is then re-ranked using temporal information. The framework is evaluated in popular Known Item Search tasks in the context of video shot retrieval and provides the largest Mean Reciprocal Rank scores.
