2010
DOI: 10.1109/t-affc.2010.8
Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies

Cited by 328 publications (288 citation statements)
References 60 publications
“…We quantified this fact by benchmarks reported on nine frequently used datasets in [7,8]. To raise these benchmarks and advance the performance of speech-based emotion recognition systems, we introduce Generalized Discriminant Analysis (GerDA) [13], a recently proposed machine learning tool based on Deep Neural Networks (DNNs) for discriminative feature extraction from arbitrarily distributed raw data.…”
Section: Introduction (mentioning; confidence: 99%)
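The excerpt above characterizes GerDA as a DNN-based tool for discriminative feature extraction. The sketch below is a minimal illustration of that idea in PyTorch: a deep encoder compresses acoustic features into a low-dimensional bottleneck that a classification loss pushes toward class separability. It is a simplified stand-in, not the published GerDA method (which optimizes a Fisher discriminant criterion and uses pre-training); the layer sizes, 384-dimensional input, and binary target are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscriminativeEncoder(nn.Module):
    """Deep encoder with a low-dimensional bottleneck for feature extraction."""
    def __init__(self, in_dim: int, feat_dim: int = 2, n_classes: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, feat_dim),   # extracted discriminative features
        )
        self.head = nn.Linear(feat_dim, n_classes)  # linear classification head

    def forward(self, x):
        z = self.encoder(x)
        return z, self.head(z)

# Hypothetical usage: 384 acoustic functionals per utterance (the
# dimensionality is an assumption, not taken from the paper).
model = DiscriminativeEncoder(in_dim=384)
x = torch.randn(8, 384)                      # batch of 8 utterances
z, logits = model(x)                         # z: (8, 2) extracted features
loss = F.cross_entropy(logits, torch.randint(0, 2, (8,)))
loss.backward()
```

After training, the bottleneck activations z would serve as the low-dimensional discriminative features for a downstream emotion classifier.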
“…Most studies tend toward overestimation in this respect: acted data are often used rather than spontaneous data, results are reported on preselected prototypical data, and truly speaker-disjunctive partitioning is still less common than simple cross-validation. Even speaker-disjunctive evaluation gives only limited insight into the generalization ability of today's emotion recognition engines, since the training and test data used for system development tend to be similar with respect to recording conditions, noise overlay, language, and types of emotions [2]. For example, if a system builds on a classifier trained with features extracted from adults' speech corpora to identify children's emotional state, its performance can be expected to be very low.…”
Section: Related Work (mentioning; confidence: 99%)
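The excerpt contrasts simple cross-validation with speaker-disjunctive partitioning. A minimal sketch of such a split, assuming scikit-learn and random placeholder data: GroupKFold guarantees that no speaker contributes utterances to both the training and the test fold.

```python
# Speaker-disjunctive evaluation sketch: utterances of a given speaker never
# appear in both train and test folds. X, y, and the speaker IDs below are
# random placeholders, not data from any of the cited corpora.
import numpy as np
from sklearn.model_selection import GroupKFold
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 384))               # acoustic features (placeholder)
y = rng.integers(0, 2, size=100)              # binary emotion labels
speakers = rng.integers(0, 10, size=100)      # one speaker ID per utterance

gkf = GroupKFold(n_splits=5)
for train_idx, test_idx in gkf.split(X, y, groups=speakers):
    clf = SVC().fit(X[train_idx], y[train_idx])
    acc = clf.score(X[test_idx], y[test_idx])
    # By construction, set(speakers[train_idx]) and set(speakers[test_idx])
    # are disjoint, unlike in plain (speaker-blind) cross-validation.
```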
“…For comparability with the FAU AEC, we additionally map the diverse emotion groups onto the valence axis of the dimensional emotion model. The mapping defined in [2] for cross-corpus experiments is used to derive binary valence labels from the emotion categories, yielding a unified label set. This mapping is given in Table I.…”
Section: Databases (mentioning; confidence: 99%)
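Table I itself is not reproduced in this excerpt, so the mapping sketched below is illustrative only: it follows the common convention of grouping clearly negative categories against the rest, in the spirit of the cross-corpus mapping in [2], and the exact per-corpus assignments may differ.

```python
# Illustrative mapping of categorical emotion labels onto binary valence.
# NOTE: these assignments are an assumption in the spirit of [2]; the actual
# per-corpus mapping is the one given in Table I of the citing paper.
VALENCE_MAP = {
    "anger": "negative",
    "sadness": "negative",
    "fear": "negative",
    "disgust": "negative",
    "boredom": "negative",
    "neutral": "positive",
    "happiness": "positive",
    "surprise": "positive",
}

def to_binary_valence(label: str) -> str:
    """Map a categorical emotion label to a unified binary valence label."""
    return VALENCE_MAP[label.lower()]

assert to_binary_valence("Anger") == "negative"
```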
“…Such features include, but are not limited to, speech and its content, prosodic and paralinguistic features, eye gaze, facial expressions, and body movements, as well as higher-level interpretations of such features, such as the affective state, personality, mood, or intentions of the user (e.g., [8,14,15]).…”
Section: Analysis of Natural Interactions (mentioning; confidence: 99%)