2013 IEEE International Conference on Acoustics, Speech and Signal Processing 2013
DOI: 10.1109/icassp.2013.6639126
Multiple windowed spectral features for emotion recognition

Abstract: MFCC (Mel-frequency cepstral coefficients) and PLP (perceptual linear prediction) or RASTA-PLP features have demonstrated good results, whether used in combination with prosodic features as suprasegmental (long-term) information or used stand-alone as segmental (short-time) information. MFCC and PLP parameterization aims to represent the speech parameters in a way similar to how sound is perceived by humans. However, MFCC and PLP are usually computed from a Hamming-windowed periodo…
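The abstract's contrast is between a single Hamming-windowed periodogram and a multiple-windowed (multitaper) spectrum estimate. A minimal sketch of the two, using SciPy's DPSS (Thomson) tapers; the frame length, `NW`, and number of tapers here are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np
from scipy.signal import windows

def hamming_periodogram(frame):
    """Single-taper estimate: one Hamming-windowed periodogram."""
    w = windows.hamming(len(frame), sym=False)
    X = np.fft.rfft(frame * w)
    return np.abs(X) ** 2 / len(frame)

def multitaper_spectrum(frame, n_tapers=6, nw=4.0):
    """Multi-taper estimate: average the periodograms obtained with
    K orthogonal DPSS windows, which reduces estimator variance
    relative to a single window."""
    tapers = windows.dpss(len(frame), NW=nw, Kmax=n_tapers)  # shape (K, N)
    spectra = np.abs(np.fft.rfft(frame * tapers, axis=1)) ** 2 / len(frame)
    return spectra.mean(axis=0)

# Toy frame: a sinusoid in noise.
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 0.1 * np.arange(256)) + 0.5 * rng.standard_normal(256)
S_single = hamming_periodogram(frame)
S_multi = multitaper_spectrum(frame)
```

In a feature pipeline, the multitaper estimate would simply replace the single periodogram before the Mel filterbank (for MFCC) or Bark-scale analysis (for PLP).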

Cited by 24 publications (9 citation statements); references 26 publications.
“…With speaker-specific z-normalization, we obtained a UAR of 46.36%. Respectively, these results are significantly better than 44.0% and 44.8% UAR, the current state of the art without [3] and with [4] speaker normalization (one-tailed binomial test, p ≈ 0.002).…”
Section: Introduction
confidence: 84%
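The statement above attributes the gain to speaker-specific z-normalization, i.e., standardizing each feature dimension using only that speaker's statistics. A minimal sketch (the function name and the variance guard are illustrative assumptions):

```python
import numpy as np

def speaker_znorm(features, speaker_ids):
    """Z-normalize each feature dimension per speaker:
    subtract the speaker's mean and divide by the speaker's
    standard deviation, so features become zero-mean and
    unit-variance within each speaker."""
    features = np.asarray(features, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    out = np.empty_like(features)
    for spk in np.unique(speaker_ids):
        idx = speaker_ids == spk
        mu = features[idx].mean(axis=0)
        sigma = features[idx].std(axis=0) + 1e-8  # guard against zero variance
        out[idx] = (features[idx] - mu) / sigma
    return out
```

This removes per-speaker offsets (e.g., habitual pitch or energy level) so the classifier sees deviations that are more likely to reflect emotion than speaker identity.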
“…Hassan et al [8] achieved a 42.7% UAR by applying importance weights within an SVM to compensate for differences between training and testing conditions. Attabi et al [3] used GMMs to model multiple windowed spectrum estimates of Perceptual Linear Prediction (PLP) coefficients, resulting in a 44.0% UAR. The best known result, 44.8% UAR, was achieved with a two-pass system in which a high-level SVM classified each test utterance using ranking scores obtained from five low-level SVMs, one for each emotion [4].…”
Section: Aibo Benchmark
confidence: 99%
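All the benchmark numbers quoted above are UAR (unweighted average recall): the mean of per-class recalls, so rare emotion classes count as much as frequent ones. A short sketch of the metric (function name is an illustrative assumption; scikit-learn's `balanced_accuracy_score` computes the same quantity):

```python
import numpy as np

def unweighted_average_recall(y_true, y_pred):
    """UAR: average the recall of each class with equal weight,
    regardless of how many samples the class has."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for c in np.unique(y_true):
        mask = y_true == c
        recalls.append(np.mean(y_pred[mask] == c))  # recall for class c
    return float(np.mean(recalls))
```

On the heavily imbalanced FAU Aibo corpus this is why UAR, rather than plain accuracy, is the standard figure of merit.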
“…viii) Spectral Slope SSL(m): it is a measure of voice quality found using linear regression, given by Eq. (16), and it represents the rate of decrease of spectral amplitude based on human perception.…”
Section: Spectral Features
confidence: 99%
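Eq. (16) of the cited work is not reproduced in the excerpt; assuming the standard least-squares definition, spectral slope is the slope of a straight line fitted to spectral magnitude as a function of frequency (a sketch, not the cited paper's exact formula):

```python
import numpy as np

def spectral_slope(magnitudes, freqs):
    """Spectral slope: slope of the least-squares line fit to
    magnitude vs. frequency. Negative values indicate spectral
    amplitude decreasing with frequency."""
    freqs = np.asarray(freqs, dtype=float)
    mags = np.asarray(magnitudes, dtype=float)
    slope, _intercept = np.polyfit(freqs, mags, deg=1)
    return slope
```

A steeper (more negative) slope corresponds to less high-frequency energy, which is why the feature is used as a voice-quality correlate.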
“…Multi-tapers have been widely used recently for speaker recognition and verification purposes [12][13][14]. In [15], the authors applied multi-tapers for emotion recognition purposes, but using only MFCC and perceptual linear prediction (PLP) features. In this paper, various spectral features are used from both conventional and multi-taper spectral estimates to recognize speech emotions.…”
Section: Introduction
confidence: 99%
“…The multitaper approach has been used in several domains, including geophysical applications [11], speaker verification [12], [13], and emotion recognition [14], [15], and it has been shown to improve the performance and robustness of different systems. However, this method has not been used in stressed-speech recognition applications.…”
Section: Introduction
confidence: 99%