Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1207

Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

Abstract: Multimodal affective computing, learning to recognize and interpret human affect and subjective information from multiple data sources, is still challenging because: (i) it is hard to extract informative features representing human affect from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at abstract levels, ignoring time-dependent interactions between modalities. Addressing these issues, we introduce a hierarchical multimodal architecture with attention and word-level fusion…
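As a concrete illustration of what word-level fusion means here, the following is a minimal sketch, assuming acoustic frames pre-aligned and averaged per word so both streams share the word time axis. It illustrates the general technique the abstract describes, not the authors' released model; the module names, feature dimensions (300-d word vectors, 74-d acoustic frames) and class count are assumptions.

```python
# Hedged sketch: word-level fusion of text and audio, followed by an
# attention-weighted utterance representation (PyTorch).
import torch
import torch.nn as nn

class WordLevelFusion(nn.Module):
    def __init__(self, text_dim=300, audio_dim=74, hidden=128, n_classes=4):
        super().__init__()
        # One encoder per modality; inputs are assumed word-aligned.
        self.text_gru = nn.GRU(text_dim, hidden, bidirectional=True,
                               batch_first=True)
        self.audio_gru = nn.GRU(audio_dim, hidden, bidirectional=True,
                                batch_first=True)
        self.attn = nn.Linear(4 * hidden, 1)        # score per fused word
        self.clf = nn.Linear(4 * hidden, n_classes)

    def forward(self, text, audio):
        # text: (batch, words, text_dim); audio: (batch, words, audio_dim)
        h_t, _ = self.text_gru(text)                # (batch, words, 2*hidden)
        h_a, _ = self.audio_gru(audio)              # (batch, words, 2*hidden)
        fused = torch.cat([h_t, h_a], dim=-1)       # word-level fusion
        w = torch.softmax(self.attn(fused), dim=1)  # attention over words
        utt = (w * fused).sum(dim=1)                # utterance representation
        return self.clf(utt)                        # sentiment/emotion logits

model = WordLevelFusion()
logits = model(torch.randn(2, 20, 300), torch.randn(2, 20, 74))  # smoke test
```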


Cited by 121 publications (71 citation statements).
References 30 publications.
“…Following prior practice (Gu et al., 2018), we adopted the same feature extraction scheme for the language, visual and acoustic modalities.…”
Section: Unimodal Feature Representations (mentioning)
confidence: 99%
“…We compare HFFN with the following multimodal algorithms: RMFN (Liang et al., 2018a), MFN (Zadeh et al., 2018a), MCTN (Pham et al., 2019), BC-LSTM (Poria et al., 2017b), TFN, MARN (Zadeh et al., 2018b), LMF, MFM (Tsai et al., 2019), MR-RF (Barezi et al., 2018), FAF (Gu et al., 2018b), RAVEN (Wang et al., 2019), GMFN (Zadeh et al., 2018c), Memn2n (Sukhbaatar et al., 2015), MM-B2, CHFusion (Majumder et al., 2018), SVM Trees (Rozgic et al., 2012), CMN, C-MKL (Poria et al., 2016b) and CAT-LSTM (Poria et al., 2017c).…”
Section: Comparison With Baselines (mentioning)
confidence: 99%
“…The authors use the same feature set as the one described in subsection 3.1. FAF [20]: uses hierarchical attention with bidirectional gated recurrent units at the word level and a fine-tuning attention mechanism on each extracted representation. The extracted feature vector is passed to a CNN, which makes the final decision.…”
Section: Baseline Models (mentioning)
confidence: 99%
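A minimal sketch of the pipeline this description suggests: a bi-GRU with word-level attention per modality, a gating ("fine-tuning") attention over the resulting representations, and a small CNN for the final decision. All names and sizes are illustrative assumptions, not the FAF paper's exact design.

```python
# Hedged sketch of an FAF-style pipeline (PyTorch), under assumed sizes.
import torch
import torch.nn as nn

class AttnEncoder(nn.Module):
    """Bi-GRU encoder with additive attention over word positions."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, bidirectional=True, batch_first=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, x):                        # x: (batch, words, in_dim)
        h, _ = self.gru(x)                       # (batch, words, 2*hidden)
        a = torch.softmax(self.score(h), dim=1)  # word-level attention
        return (a * h).sum(dim=1)                # (batch, 2*hidden)

class FAFSketch(nn.Module):
    def __init__(self, text_dim=300, audio_dim=74, hidden=64, n_classes=4):
        super().__init__()
        self.text_enc = AttnEncoder(text_dim, hidden)
        self.audio_enc = AttnEncoder(audio_dim, hidden)
        # Assumed reading of the "fine-tuning attention": a learned gate
        # re-weighting each modality representation before the CNN.
        self.gate = nn.Linear(2 * hidden, 1)
        self.cnn = nn.Conv1d(2 * hidden, 32, kernel_size=2)
        self.clf = nn.Linear(32, n_classes)

    def forward(self, text, audio):
        reps = torch.stack([self.text_enc(text),
                            self.audio_enc(audio)], dim=2)
        # reps: (batch, 2*hidden, n_modalities)
        g = torch.softmax(self.gate(reps.transpose(1, 2)), dim=1)
        reps = reps * g.transpose(1, 2)               # gated representations
        feat = torch.relu(self.cnn(reps)).squeeze(-1) # (batch, 32)
        return self.clf(feat)                         # final decision

model = FAFSketch()
logits = model(torch.randn(2, 20, 300), torch.randn(2, 20, 74))  # smoke test
```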
“…In [19], a word-level alignment between all modalities is proposed. Following this idea, the authors in [20] use a hierarchical attention architecture. Specifically, they pretrain recurrent networks to perform single-modality sentiment classification.…”
Section: Introduction (mentioning)
confidence: 99%
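The pretraining step this statement refers to could look like the following sketch: train a unimodal recurrent classifier first, then transfer its encoder weights into the multimodal model. The class names, sizes and mean-pooling readout are assumptions for illustration; the training loop is elided.

```python
# Hedged sketch: unimodal pretraining followed by encoder transfer.
import torch
import torch.nn as nn

class UnimodalClassifier(nn.Module):
    def __init__(self, in_dim=300, hidden=64, n_classes=4):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, bidirectional=True, batch_first=True)
        self.clf = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                 # x: (batch, words, in_dim)
        h, _ = self.gru(x)                # (batch, words, 2*hidden)
        return self.clf(h.mean(dim=1))    # mean-pool words, then classify

# Pretrain on unimodal labels (loop elided), then reuse the encoder
# inside the multimodal architecture by copying its weights:
pretrained = UnimodalClassifier()
multimodal_text_gru = nn.GRU(300, 64, bidirectional=True, batch_first=True)
multimodal_text_gru.load_state_dict(pretrained.gru.state_dict())
```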