2009
DOI: 10.1007/978-3-642-04274-4_92
The GMM-SVM Supervector Approach for the Recognition of the Emotional Status from Speech


Cited by 19 publications (10 citation statements)
References 16 publications
“…Analysis of human emotions and processing recorded data, for instance speech, facial expressions, hand gestures, and body movements, is a multidisciplinary field that has been emerging as a rich area of research in recent times [5,11,20,24,21,27]. In this paper multiple classifier systems for the classification of audio-visual features have been investigated; the numerical evaluation of the proposed emotion recognition systems has been carried out on the data sets of the AVEC challenge [23].…”
Section: Introduction
confidence: 99%
“…Considering that the human perception rate for the Emo-DB was set to 84% [43], this mean value of 82.45% can be seen as a promising result. Moreover, this score outperforms the results of other works in the literature over the Emo-DB, like the scores obtained in [43,74], which reached accuracies of 79% and 77%, respectively, although these works analyzed the whole database and used different machine learning algorithms and audio features. The overall results demonstrate the good performance of the CSS stacking classification paradigm and confirm the robustness of this classification system to deal with emotion recognition in speech over several conditions and datasets.…”
Section: Results
confidence: 50%
“…Six class-specific π-ESNs were trained independently over the sequences of the corresponding class. The π-ESNs were initialized as follows. Fed by the 21 input units, the reservoir consisted of 100 state neurons (with transfer function tanh).…”
Section: Model Selection and Comparison with Static Classifiers
confidence: 99%
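The quoted passage only states the reservoir dimensions (21 input units, 100 tanh state neurons). As a rough illustration, the sketch below builds such a reservoir and collects its state sequence; the spectral radius, input scaling, and the idea of training one readout per emotion class are assumptions not taken from the source, and the specifics of the cited π-ESN training procedure are not reproduced here.

```python
import numpy as np


class SimpleESNReservoir:
    """Minimal echo state network reservoir sketch matching the quoted
    configuration: 21 input units, 100 tanh state neurons.
    Spectral radius and input scaling are assumed values."""

    def __init__(self, n_inputs=21, n_reservoir=100,
                 spectral_radius=0.9, input_scaling=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Random input weights, scaled by an assumed input scaling factor.
        self.W_in = rng.uniform(-1.0, 1.0, (n_reservoir, n_inputs)) * input_scaling
        # Random recurrent weights, rescaled to the desired spectral radius
        # (a common recipe for obtaining the echo state property).
        W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W
        self.n_reservoir = n_reservoir

    def run(self, inputs):
        """Drive the reservoir with an input sequence of shape (T, 21)
        and return the reservoir state sequence of shape (T, 100)."""
        x = np.zeros(self.n_reservoir)
        states = []
        for u in inputs:
            x = np.tanh(self.W_in @ u + self.W @ x)
            states.append(x.copy())
        return np.asarray(states)
```

In the class-specific setup described in the quote, one such reservoir (with its own readout) would be trained per emotion class on the sequences of that class only.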
“…Vlasenko et al. [1] apply Gaussian mixture models (GMM) and hidden Markov models (HMM) defined at both the frame- and turn-level representations of the audio signals, while Wagner et al. [2] thoroughly analyze the behavior of HMMs and support vector machines (SVM) using Mel-cepstra [3] and energy-based features. Schwenker et al. [4] investigate the use of the SVM-GMM supervector approach relying on PLP and ModSpec features [5]. Dellaert et al. [6] classify speech signals into 4 broad classes of emotions by applying a mixture of k-nearest neighbor [7] experts (with k = 11) estimated on different subsets of…”
Section: Introduction
confidence: 99%
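The quoted passage names the GMM-SVM supervector idea of the cited paper without spelling it out. As a rough illustration only, the sketch below shows the usual pipeline: fit a universal background model on pooled frame features, MAP-adapt its means to each utterance, stack the adapted means into a supervector, and classify with an SVM. It assumes scikit-learn, a diagonal-covariance UBM, mean-only adaptation with relevance factor 16, and a linear kernel; none of these specifics, nor the PLP/ModSpec feature extraction, come from the source.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC


def train_ubm(frames, n_components=64, seed=0):
    """Fit a diagonal-covariance GMM (universal background model) on
    frame-level features pooled over all training utterances."""
    return GaussianMixture(n_components=n_components,
                           covariance_type="diag",
                           random_state=seed).fit(frames)


def map_adapted_supervector(ubm, frames, relevance=16.0):
    """MAP-adapt the UBM means to one utterance and stack them into a
    single supervector (mean-only adaptation, as commonly used in
    GMM-SVM supervector systems)."""
    resp = ubm.predict_proba(frames)             # (T, K) responsibilities
    n_k = resp.sum(axis=0) + 1e-10               # soft counts per component
    e_k = resp.T @ frames / n_k[:, None]         # first-order statistics
    alpha = (n_k / (n_k + relevance))[:, None]   # adaptation coefficients
    adapted = alpha * e_k + (1.0 - alpha) * ubm.means_
    return adapted.ravel()                       # (K * D,) supervector


# Usage sketch: `train_frames`, `utterance_frames`, and `labels` are
# hypothetical placeholders for frame-level acoustic features and
# emotion labels; they are not provided by the source.
# ubm = train_ubm(np.vstack(train_frames))
# X = np.array([map_adapted_supervector(ubm, f) for f in utterance_frames])
# clf = SVC(kernel="linear").fit(X, labels)
```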