2019
DOI: 10.48550/arxiv.1908.05787
Preprint

Integrating Multimodal Information in Large Pretrained Transformers

Cited by 5 publications
(3 citation statements)
“…MAG-BERT. The multimodal adaptation gate for BERT uses a gate structure connected to the BERT model to continuously improve the multimodal recognition accuracy of the model by modifying the BERT model with attention and adaptive vectors conditional on non-verbal behavior [38]. The AFR-BERT model consists of the following four main components.…”
Section: Plos One (mentioning)
confidence: 99%
“…The goal of multimodal sentiment analysis is to regress or classify the overall sentiment of an utterance using acoustic, visual, and language cues. Because multimodal sentiment analysis is a large and well-established field, we direct the reader to [2,21,29] for an overview of the field, and MISA [8], MAG [31], and M3ER [20] as representative of recent state of the art works. We restrict our scope to describing differences and similarities between our setting and the classical multimodal sentiment analysis setting.…”
Section: Related Work 2.1 Multimodal Sentiment Classification (mentioning)
confidence: 99%
“…Besides, neural networks raise more attention in fusion especially since the appearance of RNN and LSTM [36,47]. More recently, transformer-based [51] fusion raises growing attention [1,48,37,16,21], especially after its application in vision [7]. In addition to that, there are also some model-agnostic fusion methods, including the simple concatenation [27,6,58] and element-wise operation [8,50].…”
Section: Related Work (mentioning)
confidence: 99%