Context-aware Interactive Attention for Multi-modal Sentiment and Emotion Analysis

Chauhan, Dushyant Singh; Akhtar, Md. Shad; Ekbal, Asif; Bhattacharyya, Pushpak

doi:10.18653/v1/d19-1566

Cited by 63 publications

(32 citation statements)

References 25 publications

“…These embeddings have been used to train a Logistic Regression model for the two downstream tasks of Sentiment Analysis and Emotion Recognition. Following the existing literature [8,2,5,15,3,16,10,17,9,13], we report the binary accuracy and weighted averaged f1 metrics on sentiment for MOSEI, and on each of the 4 emotions of IEMOCAP in an one-vs-all manner.…”

Section: Results On Downstream Classification (mentioning)

confidence: 99%

“…In this paper we follow a widely used approach for basic unimodal feature extraction and multimodal alignment, similar to the one in various proposed methods, such as [8,2,5,15,3,16,10,17,9,13]. More specifically, after extracting the features on each modality(visual, textual and aural), the procedure of word-level alignment that was firstly used for this task in [17], is performed.…”

Section: -D Multimodal Sequence Representation (mentioning)

confidence: 99%

“…Different types of fusion approaches, such as early fusion [1], memory fusion [2,3], multistage fusion [4] and tensor fusion [5] have been examined. Modifications on LSTM architectures for multiview learning [6] or context-dependent analysis [7] have also been proposed, as well as the concept of attention on recurrent networks [8], context-aware attention [9] and some transformer architectures [10] have all been researched in depth.…”

Section: Introduction (mentioning)

confidence: 99%

Unsupervised Multimodal Language Representations using Convolutional Autoencoders

Koromilas, Giannakopoulos. 2021. Preprint.

Multimodal Language Analysis is a demanding area of research, since it is associated with two requirements: combining different modalities and capturing temporal information. During the last years, several works have been proposed in the area, mostly centered around supervised learning in downstream tasks. In this paper we propose extracting unsupervised Multimodal Language representations that are universal and can be applied to different tasks. Towards this end, we map the word-level aligned multimodal sequences to 2-D matrices and then use Convolutional Autoencoders to learn embeddings by combining multiple datasets. Extensive experimentation on Sentiment Analysis (MOSEI) and Emotion Recognition (IEMOCAP) indicate that the learned representations can achieve near-state-of-the-art performance with just the use of a Logistic Regression algorithm for downstream classification. It is also shown that our method is extremely lightweight and can be easily generalized to other tasks and unseen data with small performance drop and almost the same number of parameters. The proposed multimodal representation models are open-sourced and will help grow the applicability of Multimodal Language.

“…The experimental results of their research show that if the multimodal features are contextually integrated, then it can result in better performance (up to an accuracy level of 91.39%) as compared to the features that are extracted by a unimodal (having an accuracy value of up to 89.24%). Chauhan et al [7] have devised a multimodal emotions analysis model based on the recurrent neural network (RNN). This model also learns the interaction within different participating models by using an auto-encoder algorithm-based mechanism.…”

Section: Literature Review (mentioning)

confidence: 99%

Urdu Sentiment Analysis via Multimodal Data Mining Based on Deep Learning Algorithms

et al. 2021

“…Sentiment analysis has been studied under different settings in the literature (e.g., sentence-level, aspect-level, cross-domain) (Chauhan et al, 2019; Hu et al, 2019). For ABSA, the early works have performed feature engineering to produce useful features for the statistical classification models (e.g., SVM) (Wagner et al, 2014).…”

Section: Related Work (mentioning)

confidence: 99%

Improving Aspect-based Sentiment Analysis with Gated Graph Convolutional Networks and Syntax-based Regulation

Veyseh, Nouri, Dernoncourt, et al. 2020

Findings of the Association for Computational Linguistics: EMNLP 2020

Aspect-based Sentiment Analysis (ABSA) seeks to predict the sentiment polarity of a sentence toward a specific aspect. Recently, it has been shown that dependency trees can be integrated into deep learning models to produce the state-of-the-art performance for ABSA. However, these models tend to compute the hidden/representation vectors without considering the aspect terms and fail to benefit from the overall contextual importance scores of the words that can be obtained from the dependency tree for ABSA. In this work, we propose a novel graph-based deep learning model to overcome these two issues of the prior work on ABSA. In our model, gate vectors are generated from the representation vectors of the aspect terms to customize the hidden vectors of the graph-based models toward the aspect terms. In addition, we propose a mechanism to obtain the importance scores for each word in the sentences based on the dependency trees that are then injected into the model to improve the representation vectors for ABSA. The proposed model achieves the state-of-the-art performance on three benchmark datasets.
