Fourth International Conference on Information, Communications and Signal Processing, 2003 and the Fourth Pacific Rim Conferenc
DOI: 10.1109/icics.2003.1292741

Detection of stress and emotion in speech using traditional and FFT based log energy features

Abstract: In this paper, a novel system for detection of human stress and emotion in speech is proposed. The system makes use of FFT based linear short time Log Frequency Power Coefficients (LFPC) and TEO based nonlinear LFPC features in both time and frequency domains. The performance of the proposed system is compared with the traditional approaches which use features of LPCC and MFCC. The comparison of each approach is performed using SUSAS (Speech Under Simulated and Actual Stress) and ESMBS (Emotional Speech of Man…

Cited by 38 publications (16 citation statements)
References 11 publications
“…Short-term power spectrum is presented by Mel-frequency cepstral coefficients, whereas vocal tract characteristics are presented by linear prediction coefficients. Logarithmic filtering of auditory system is characterized by log-frequency power coefficients using Fourier transform [7]. Voice quality measurements, such as jitter, harmonics-to-noise ratio, and shimmer, exploit the relation between vocal tract characteristics and emotion content.…”
Section: Introduction
confidence: 99%
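
As a rough illustration of what "log-frequency power coefficients using Fourier transform" can look like, the sketch below averages the power spectrum of one frame inside bands spaced evenly on a log-frequency axis and reports each band in decibels. The band layout (12 bands from 100 Hz to 4 kHz), the function names, and the naive DFT are assumptions made for this example only; the filter bank actually used in the cited paper is not given here and may differ.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N//2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def log_band_edges(f_low, f_high, n_bands):
    """Return n_bands + 1 edges spaced uniformly in log frequency (assumed layout)."""
    ratio = (f_high / f_low) ** (1.0 / n_bands)
    return [f_low * ratio ** i for i in range(n_bands + 1)]


def log_frequency_power(frame, sample_rate, f_low=100.0, f_high=4000.0, n_bands=12):
    """Mean power per log-spaced band, in dB (hypothetical LFPC-style feature)."""
    n = len(frame)
    spec = power_spectrum(frame)
    edges = log_band_edges(f_low, f_high, n_bands)
    coeffs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Collect the DFT bins whose centre frequency falls inside this band.
        bins = [p for k, p in enumerate(spec) if lo <= k * sample_rate / n < hi]
        band_power = sum(bins) / len(bins) if bins else 0.0
        # Small offset avoids log10(0) for empty or silent bands.
        coeffs.append(10.0 * math.log10(band_power + 1e-12))
    return coeffs


# Usage: a 1 kHz tone sampled at 8 kHz should peak in the band containing 1 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
coeffs = log_frequency_power(frame, sample_rate)
peak_band = coeffs.index(max(coeffs))
```

A practical system would normally window the frame and use a fast FFT routine; the naive DFT is used here only to keep the sketch free of external dependencies.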
“…In [51], the authors demonstrated that HMM performs better on log frequency power coefficient features than LPCC and MFCC. The emotion classification was done based on text-independent methods.…”
Section: Hidden Markov Model (HMM)
confidence: 99%
“…Studies have shown that the low-level audio features of pitch, Zero-Crossing Rate (ZCR), Log-Energy (LE), Teager Energy Operator (TEO), and Mel-Frequency Cepstral Coefficients (MFCC) can determine the emotional state of music audio signals [60][61][62][63]. The extraction methods for these features are illustrated as follows:…”
Section: Low-level Audio Features Extraction
confidence: 99%
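
The extraction methods themselves are cut off in the excerpt above, so the following is only a minimal sketch of three of the named features using their common textbook definitions: the discrete Teager Energy Operator, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1), the zero-crossing rate as the fraction of adjacent sample pairs that change sign, and log-energy as the natural log of the summed squared samples. The function names and the `eps` guard are assumptions for illustration, not the citing paper's implementation.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator for the interior samples of x."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)


def log_energy(x, eps=1e-12):
    """Natural log of frame energy; eps (an assumed guard) avoids log(0) on silence."""
    return math.log(sum(v * v for v in x) + eps)


# Usage: for a unit-amplitude sinusoid cos(w*n), the TEO is the constant sin(w)**2.
w = 2 * math.pi * 0.05
tone = [math.cos(w * n) for n in range(50)]
teo = teager_energy(tone)
zcr = zero_crossing_rate([1.0, -1.0, 1.0, -1.0])
energy = log_energy([1.0, 1.0])
```

The constant TEO output for a pure tone is the property that makes the operator useful for tracking changes in amplitude and frequency.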