Interspeech 2017
DOI: 10.21437/interspeech.2017-1494

Jointly Predicting Arousal, Valence and Dominance with Multi-Task Learning

Abstract: An appealing representation of emotions is the use of emotional attributes such as arousal (passive versus active), valence (negative versus positive) and dominance (weak versus strong). While previous studies have considered these dimensions as orthogonal descriptors to represent emotions, there is strong theoretical and practical evidence of the interrelation between these emotional attributes. This observation suggests that predicting emotional attributes with a unified framework should outperform ma…


Cited by 112 publications (86 citation statements); references 27 publications.
“…Most modern techniques of cross-corpus speech emotion recognition use deep learning to build representations over low-level acoustic features. Many of these techniques incorporate tasks in addition to emotion in order to learn more robust representations [6], [10].…”
Section: Introduction
confidence: 99%
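The auxiliary-task idea in the quote can be illustrated with a toy shared-encoder, multi-head forward pass: one shared layer builds a representation from low-level acoustic features, and separate heads predict the emotion attributes and an auxiliary task. All shapes, names, and the auxiliary task itself are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multi-task forward pass (illustrative shapes, random weights).
features = rng.normal(size=(8, 40))          # 8 utterances, 40-dim acoustic features
W_shared = rng.normal(size=(40, 16)) * 0.1   # shared representation layer
W_emotion = rng.normal(size=(16, 3)) * 0.1   # head 1: arousal/valence/dominance
W_aux = rng.normal(size=(16, 1)) * 0.1       # head 2: hypothetical auxiliary task

h = np.tanh(features @ W_shared)             # shared representation
emotion_pred = h @ W_emotion                 # shape (8, 3)
aux_pred = h @ W_aux                         # shape (8, 1)
print(emotion_pred.shape, aux_pred.shape)
```

Because both heads backpropagate through `W_shared` during training, the shared representation is pushed to encode information useful for more than one task, which is the robustness argument made in the quote.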
“…0.7, 0.2, and 1.0 for α, β, and γ, respectively. Our proposed MTL with three parameters outperforms STL and the previous MTL approach [4]. For the STL approaches, both arousal and valence obtained the highest CCC score when the corresponding attribute was optimized.…”
Section: Multitask Learning Results
confidence: 78%
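The CCC score mentioned in the quote is the concordance correlation coefficient, a standard agreement metric for dimensional emotion prediction. A minimal numpy sketch (the toy data are illustrative):

```python
import numpy as np

def ccc(x, y):
    """Concordance correlation coefficient between predictions x and labels y."""
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()        # population covariance
    return 2 * cov / (x.var() + y.var() + (mx - my) ** 2)

preds = np.array([0.1, 0.4, 0.35, 0.8])
labels = np.array([0.0, 0.5, 0.30, 0.9])
print(round(ccc(preds, labels), 3))           # → 0.952
```

Unlike Pearson correlation, CCC also penalizes differences in mean and scale between predictions and labels, so it rewards calibrated predictions rather than merely correlated ones.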
“…where α, β, and γ are the weighting factors for each emotion dimension's loss function. In the common approach, α, β, and γ are all set to 1, while in [4], γ is set to 1 − (α + β) to minimize the MSE. In that approach, all weighting factors lie in the range 0-1.…”
Section: Multitask Learning Based on CCC Loss
confidence: 99%
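The weighted multi-task CCC loss described in the quote can be sketched as follows; the per-attribute dictionary layout and function names are illustrative assumptions, and the default weights mirror the 0.7/0.2/1.0 values reported in the earlier excerpt.

```python
import numpy as np

def ccc(x, y):
    """Concordance correlation coefficient between predictions x and labels y."""
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return 2 * cov / (x.var() + y.var() + (mx - my) ** 2)

def mtl_ccc_loss(preds, labels, alpha=0.7, beta=0.2, gamma=1.0):
    """Weighted sum of per-attribute CCC losses: L = α(1-CCC_aro) + β(1-CCC_val) + γ(1-CCC_dom)."""
    return (alpha * (1 - ccc(preds['arousal'], labels['arousal']))
            + beta * (1 - ccc(preds['valence'], labels['valence']))
            + gamma * (1 - ccc(preds['dominance'], labels['dominance'])))

# Perfect predictions give CCC = 1 for every attribute, so the loss is 0.
y = {k: np.array([0.1, 0.5, 0.9]) for k in ('arousal', 'valence', 'dominance')}
print(mtl_ccc_loss(y, y))
```

Minimizing `1 - CCC` per attribute optimizes the evaluation metric directly, and the weights let training trade off the three attributes, e.g. emphasizing dominance with γ = 1.0 as in the quoted configuration.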
“…In the first experiment, we discuss the influence of different feature selection and prediction strategies. Since multi-task learning has demonstrated its effectiveness in [6,20,21], we treat multi-task learning as the comparison approach in the second experiment. In the last experiment, we show the advantages of adding a contrastive loss during the training phase.…”
Section: Evaluation Results
confidence: 99%