2019
DOI: 10.1109/access.2019.2955637

A Hybrid Latent Space Data Fusion Method for Multimodal Emotion Recognition

Abstract: Multimodal emotion recognition is an emerging interdisciplinary field of research in the area of affective computing and sentiment analysis. It aims at exploiting the information carried by signals of different nature to make emotion recognition systems more accurate. This is achieved by employing a powerful multimodal fusion method. In this study, a hybrid multimodal data fusion method is proposed in which the audio and visual modalities are fused using a latent space linear map and then, their projected feat…

Cited by 55 publications (25 citation statements)
References 79 publications (176 reference statements)
“…Data fusion is a critical step in multimodal emotion recognition for producing the final estimation. The literature on emotional data fusion distinguishes three techniques: early fusion (feature fusion) [42,43], late fusion (decision fusion) [44,45,46], and hybrid approaches [17,47,48].…”
Section: Background and Literature Review on Multimodal Emotion Recognition
confidence: 99%
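
To make the distinction quoted above concrete, the following minimal Python sketch contrasts early fusion (concatenating features into a single classifier) with late fusion (combining per-modality decision scores). The random feature arrays, dimensions, and choice of classifier are illustrative assumptions, not the setups of the cited works.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-modality features for 100 samples (illustrative only).
rng = np.random.default_rng(0)
audio = rng.standard_normal((100, 32))   # e.g., acoustic descriptors
visual = rng.standard_normal((100, 64))  # e.g., facial-expression features
labels = rng.integers(0, 2, 100)

# Early fusion (feature fusion): concatenate features, train one classifier.
early_model = LogisticRegression().fit(np.hstack([audio, visual]), labels)

# Late fusion (decision fusion): one classifier per modality, then combine
# their class-probability outputs (here, a simple average).
audio_model = LogisticRegression().fit(audio, labels)
visual_model = LogisticRegression().fit(visual, labels)
late_scores = (audio_model.predict_proba(audio) +
               visual_model.predict_proba(visual)) / 2
late_pred = late_scores.argmax(axis=1)
```

Hybrid approaches, as the next statement describes, chain the two levels: feature-level fusion first, with decision-level fusion on top.
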
“…In [64], a simple hybrid fusion was employed in which the output of an early-fusion classifier feeds a decision-level fusion system. A recent study in [48] uses a latent space map to fuse the audio and video modalities; then, using a Dempster-Shafer (DS) theory-based evidential fusion method, the features projected onto the cross-modal space are fused with the textual modality.…”
Section: Background and Literature Review on Multimodal Emotion Recognition
confidence: 99%
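
A rough sketch of such a staged hybrid pipeline, under the same illustrative assumptions as above; a plain weighted average stands in for the Dempster-Shafer combination used in the cited study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
audio = rng.standard_normal((100, 32))
visual = rng.standard_normal((100, 64))
text = rng.standard_normal((100, 50))  # e.g., textual embeddings
labels = rng.integers(0, 2, 100)

# Stage 1 (feature level): early-fuse audio and visual into one classifier.
av_features = np.hstack([audio, visual])
av_model = LogisticRegression().fit(av_features, labels)

# Stage 2 (decision level): combine the stage-1 scores with a text
# classifier's scores. The weighted average here is a stand-in for the
# evidential (Dempster-Shafer) combination of the cited study.
text_model = LogisticRegression().fit(text, labels)
hybrid_scores = (0.5 * av_model.predict_proba(av_features)
                 + 0.5 * text_model.predict_proba(text))
hybrid_pred = hybrid_scores.argmax(axis=1)
```
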
“…To address this problem, we exploit an evidential fusion method based on the Dempster-Shafer (D-S) theory. The D-S method is one of the most prominent score fusion methods and has been exploited in recent years for polarity detection [52], rating prediction [15], multimodal emotion recognition [53], and project risk assessment [54]. To handle uncertainty in the validity of hypotheses, Dempster and Shafer presented a general form of Bayesian theory in which multiple probabilities (e.g., derived from multiple classifiers' outputs) are used to determine the final output on the basis of evidence from uncertain outputs [51].…”
Section: Score Fusion Methods Using an Evidential Approach
confidence: 99%
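
For reference, Dempster's rule of combination keeps the products of intersecting hypotheses and renormalizes by the non-conflicting mass. A self-contained sketch follows; the frame of discernment and the mass values are made up for illustration.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset hypotheses to
    masses) with Dempster's rule: products of intersecting hypotheses are
    kept and renormalized by 1 - K, where K is the conflicting mass."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are irreconcilable")
    return {h: w / (1.0 - conflict) for h, w in combined.items()}

# Belief masses from two hypothetical classifiers over {happy, sad};
# mass on the full set encodes each source's remaining uncertainty.
m_audio = {frozenset({"happy"}): 0.7, frozenset({"happy", "sad"}): 0.3}
m_video = {frozenset({"sad"}): 0.4, frozenset({"happy", "sad"}): 0.6}
print(dempster_combine(m_audio, m_video))
# {happy}: ~0.583, {sad}: ~0.167, {happy, sad}: 0.25  (conflict K = 0.28)
```
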
“…This method assigns different weights based on the correlation and the level of confidence. It has recently been improved for the same task using a hybrid architecture consisting of latent information obtained through canonical correlation analysis (CCA) [61] and Marginal Fisher Analysis (MFA) [53]. As stated in some previous studies, the original D-S theory has some limitations [51]; one of the most influential is the production of contradictory results.…”
Section: Score Fusion Methods Using an Evidential Approach
confidence: 99%
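
The CCA step mentioned above learns paired linear maps that maximize the correlation between two views, which is one way to realize the latent space projection discussed earlier. A minimal sketch follows; the feature matrices, their dimensions, and the component count are assumptions for illustration.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(2)
# Hypothetical paired audio/visual features for the same 200 utterances.
audio = rng.standard_normal((200, 40))
visual = rng.standard_normal((200, 60))

# Learn linear maps that maximize correlation between the two views,
# projecting both modalities into a shared 10-dimensional latent space.
cca = CCA(n_components=10)
audio_latent, visual_latent = cca.fit_transform(audio, visual)

# One possible fused cross-modal representation: concatenated projections.
fused = np.hstack([audio_latent, visual_latent])
```
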
“…With regard to multi-modal data, whether for supervised or semi-supervised learning, we should effectively capture the knowledge that is independent within each modality [35]. It is worth stressing that capturing independent knowledge from a single modality is particularly challenging for this task, since multi-modal sentiment analysis is performed on spoken language.…”
Section: Introduction
confidence: 99%