2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
DOI: 10.1109/icassp.2003.1202279

Hidden Markov model-based speech emotion recognition

Abstract: In this contribution we introduce speech emotion recognition by use of continuous hidden Markov models. Two methods are propagated and compared throughout the paper. Within the first method a global statistics framework of an utterance is classified by Gaussian mixture models using derived features of the raw pitch and energy contour of the speech signal. A second method introduces increased temporal complexity applying continuous hidden Markov models considering several states using low-level instantaneous fe…
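
The first method in the abstract (utterance-level statistics of the pitch and energy contours, scored by a per-class density model) can be illustrated with a small sketch. This is not the paper's implementation: the choice of statistics (mean, standard deviation, range), the use of a single diagonal Gaussian per emotion as a stand-in for a full Gaussian mixture model, the variance floor, and all contour values below are assumptions made up for illustration.

```python
import math
from statistics import mean, pstdev

def global_stats(contour):
    """Collapse a frame-level contour into utterance-level statistics.
    The three statistics used here are an illustrative assumption."""
    return [mean(contour), pstdev(contour), max(contour) - min(contour)]

def utterance_features(pitch, energy):
    """Concatenate the global statistics of the pitch and energy contours."""
    return global_stats(pitch) + global_stats(energy)

def fit_gaussian(vectors, floor=1e-3):
    """Fit one diagonal Gaussian (a single-component stand-in for a GMM).
    Variances are floored so that degenerate dimensions stay usable."""
    dims = list(zip(*vectors))
    means = [mean(d) for d in dims]
    variances = [max(pstdev(d) ** 2, floor) for d in dims]
    return means, variances

def log_likelihood(x, model):
    """Log-density of feature vector x under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )

def classify(x, models):
    """Pick the emotion whose model gives x the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))

# Toy (pitch in Hz, energy) contours per emotion; all values are invented.
train = {
    "neutral": [
        ([110, 112, 111, 113], [0.30, 0.32, 0.31, 0.30]),
        ([105, 108, 107, 106], [0.28, 0.30, 0.29, 0.31]),
    ],
    "anger": [
        ([180, 240, 210, 260], [0.70, 0.90, 0.80, 0.95]),
        ([160, 220, 255, 205], [0.65, 0.85, 0.90, 0.75]),
    ],
}

models = {
    label: fit_gaussian([utterance_features(p, e) for p, e in utterances])
    for label, utterances in train.items()
}

predicted = classify(
    utterance_features([175, 235, 220, 245], [0.68, 0.88, 0.82, 0.90]),
    models,
)
print(predicted)
```

Replacing the single Gaussian with a multi-component mixture changes only `fit_gaussian` and `log_likelihood`; the decision rule (maximum likelihood over the per-emotion models) stays the same. The second method in the abstract, which models frame-level features with continuous hidden Markov models, is not covered by this sketch.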

Citations: cited by 246 publications (191 citation statements)
References: 4 publications
“…Response of the subject is measured using different modalities, like psychophysiological responses (heart rate, blood pressure, skin conductance and temperature), but also through verbal and nonverbal responses. The analysis of vocal features can give important information about the emotional state of a person [11]. Emotions can be expressed non-verbally, through prosodic structure of an utterance, but also verbally, by directly expressing thoughts and feelings.…”
Section: Introduction (mentioning)
confidence: 99%
“…Although utterance level approaches are the most common (Schuller et al, 2005; Cichosz and Slot, 2005; Oudeyer, 2003), segment based approaches are becoming more popular. Segment based approaches try to model the shape of acoustic contours more closely as in (Katz et al, 1996; Schuller et al, 2003; Batliner et al, 2003; Batliner et al, 2005; Rotaru and Litman, 2005). In all of the mentioned studies, a single speech corpus is used for training and testing a machine learned classifier.…”
Section: Introduction (mentioning)
confidence: 99%
“…This approach employs machine learning techniques such as Hidden Markov Models [13]. Speech and speaker recognition techniques: short-term features and statistical modeling (GMM, HMM) have been successfully combined with a traditional turn based level approach [15].…”
Section: Machine Learning Based Units (mentioning)
confidence: 99%
“…Indeed, the standard unit is the speaker turn level [12][13][14] which consists in the characterization of a whole sentence by a large number of features. This approach assumes that the emotional state is not changing during the speaker turn level.…”
Section: Units For Emotional Speech Characterization (mentioning)
confidence: 99%