2016 23rd International Conference on Pattern Recognition (ICPR)
DOI: 10.1109/icpr.2016.7899608
Fusion of classifier predictions for audio-visual emotion recognition

Cited by 18 publications (6 citation statements). References 27 publications.
“…In [136] the authors focused on uncovering the effect of emotion on the interrelation between speech and body gestures. They used prosody and mel-frequency cepstral coefficients (MFCCs) [137], [138] for speech, together with three types of body gesture (head motion and lower- and upper-body motion), to study how the emotional state affects the relationship between the two communication channels. Additionally, they proposed a framework for modeling the dynamics of speech-gesture interaction.…”
Section: Emotion Recognition (citation type: mentioning; confidence: 99%)
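The prosody-plus-MFCC speech front end described in the excerpt above is a standard recipe. Below is a minimal sketch of such a feature extractor; librosa is an assumed toolkit (the cited works do not name one), and the sample rate, pitch range, and summary statistics are illustrative choices, not values from the paper.

```python
# Minimal sketch of an utterance-level speech feature extractor combining
# MFCCs with simple prosodic descriptors (pitch and frame energy).
import numpy as np
import librosa

def speech_emotion_features(path: str, n_mfcc: int = 13) -> np.ndarray:
    """Return a fixed-length feature vector for one utterance."""
    y, sr = librosa.load(path, sr=16000)                    # mono, 16 kHz (assumed)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape (n_mfcc, T)
    f0 = librosa.yin(y, fmin=80, fmax=400, sr=sr)           # pitch track (prosody)
    rms = librosa.feature.rms(y=y)[0]                       # per-frame energy
    # Summarize frame-level tracks with mean and standard deviation
    # so every utterance maps to a vector of the same length.
    stats = lambda x: np.array([np.nanmean(x), np.nanstd(x)])
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                           stats(f0), stats(rms)])
```

A vector like this can then be paired with gesture features (head, lower- and upper-body motion) for the kind of cross-channel analysis the excerpt describes.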
“…In the early fusion model, features from both modalities are concatenated and fed as input to a three-layer multilayer perceptron (MLP) classifier trained with a simple cross-entropy loss. In the late fusion model, the decisions (i.e., the confidence outputs from each modality) are used as input to a fully connected layer for prediction in a stacked manner [79]. Tables 2 and 3 compare emotion-recognition accuracy between the convergence, enhancement, and synchrony models and the two CNN baselines above.…”
Section: Effectiveness in Multisensory Emotion Recognition (citation type: mentioning; confidence: 99%)
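The two fusion baselines contrasted in the excerpt above map directly onto a pair of small modules. The following is a schematic PyTorch sketch; the feature dimensions, layer widths, and class count are illustrative assumptions, not values from the cited paper.

```python
# Schematic sketch of the early- and late-fusion baselines described above.
import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """Concatenate audio and visual features; classify with a 3-layer MLP."""
    def __init__(self, audio_dim=128, visual_dim=128, n_classes=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(audio_dim + visual_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, audio_feat, visual_feat):
        # Feature-level fusion: concatenate, then classify.
        return self.mlp(torch.cat([audio_feat, visual_feat], dim=-1))

class LateFusion(nn.Module):
    """Stack per-modality confidence outputs; learn a fusion layer."""
    def __init__(self, n_classes=7):
        super().__init__()
        self.fuse = nn.Linear(2 * n_classes, n_classes)

    def forward(self, audio_probs, visual_probs):
        # Decision-level fusion: combine the two modalities' confidences.
        return self.fuse(torch.cat([audio_probs, visual_probs], dim=-1))
```

In both cases nn.CrossEntropyLoss would be applied to the returned logits, matching the "simple cross-entropy loss" mentioned in the excerpt.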
“…To create machines that can recognize emotions, a great body of research has been dedicated to investigating the neural correlates of emotions. In these efforts, emotions have been elicited by different pictorial [1], musical [2][3][4], and video [5][6][7] stimuli. Music listening comprises a variety of psychological processes, e.g.…”
Section: Introduction (citation type: mentioning; confidence: 99%)