Interspeech 2015
DOI: 10.21437/interspeech.2015-329

Analysis of excitation source features of speech for emotion recognition

Cited by 27 publications (5 citation statements)
References 21 publications

“…These features were shown to be useful for discriminating phonation types in speech and singing [13], [41], [42]. SoE was shown to be proportional to the rate of glottal closure, the EoE feature was shown to capture vocal effort, and the loudness measure was shown to capture the abruptness of glottal closure [43], [44]. The energy of the ZFF signal at glottal closure is also used as a feature; it was shown to capture low-frequency energy [13].…”
Section: Zero Frequency Filtering (ZFF)
Citation type: mentioning (confidence: 99%)
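These ZFF-derived features are concrete enough to sketch. Below is a minimal Python illustration of the standard zero-frequency-filtering recipe (difference the signal, pass it through two cascaded zero-frequency resonators, then subtract a local mean to remove the trend), with SoE read off as the slope of the ZFF signal at each epoch. The function names, the 10 ms trend-removal window, and the NumPy/SciPy realization are assumptions for illustration, not code from the cited papers.

```python
import numpy as np
from scipy.signal import lfilter

def zff(speech, fs, win_ms=10.0):
    """Zero-frequency filtering of a speech signal (illustrative sketch).

    win_ms approximates 1-2 average pitch periods; 10 ms is an assumed
    default, to be tuned per speaker.
    """
    # Difference the signal to remove any DC offset before integration.
    x = np.diff(speech, prepend=speech[0])
    # Two cascaded zero-frequency resonators: H(z) = 1 / (1 - z^-1)^2 each.
    y = lfilter([1.0], [1.0, -2.0, 1.0], x)
    y = lfilter([1.0], [1.0, -2.0, 1.0], y)
    # Remove the slowly varying polynomial trend by subtracting a local
    # mean, iterated three times as in the usual ZFF formulation.
    win = max(3, int(fs * win_ms / 1000.0) | 1)   # odd window length
    kernel = np.ones(win) / win
    for _ in range(3):
        y = y - np.convolve(y, kernel, mode="same")
    return y

def epochs_and_soe(zff_sig):
    # Epochs (GCIs): negative-to-positive zero crossings of the ZFF signal.
    gci = np.where((zff_sig[:-1] < 0.0) & (zff_sig[1:] >= 0.0))[0] + 1
    # Strength of excitation (SoE): slope of the ZFF signal at each epoch.
    soe = np.abs(zff_sig[gci] - zff_sig[gci - 1])
    return gci, soe
```

The EoE and loudness measures mentioned in the excerpt involve additional processing around each epoch and are omitted here to keep the sketch short.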
“…In protocol 1, where the number of output classes M = 2, the cross-entropy was calculated using (12). Similarly, in the second protocol, as the number of output classes is three, the categorical cross-entropy was calculated by (13).…”
Section: H. Optimization Algorithm and Cost Function
Citation type: mentioning (confidence: 99%)
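The citing paper's equations (12) and (13) are not reproduced in the excerpt. For reference, a minimal sketch of the standard binary and categorical cross-entropy forms they presumably refer to (an assumption; N examples, one-hot targets y, predicted probabilities ŷ):

```latex
% Assumed standard forms; the citing paper's Eqs. (12)-(13) are not
% reproduced in the excerpt. N = number of training examples,
% y = one-hot targets, \hat{y} = predicted class probabilities.

% Binary cross-entropy (protocol 1, M = 2):
\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N}\sum_{i=1}^{N}
  \left[ y_i \log \hat{y}_i + (1 - y_i)\log\left(1 - \hat{y}_i\right) \right]

% Categorical cross-entropy (protocol 2, M = 3):
\mathcal{L}_{\mathrm{CCE}} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M}
  y_{i,c}\,\log \hat{y}_{i,c}
```

Note that for M = 2 the categorical form reduces to the binary form, which is why the two protocols can share one training setup with different output layers.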
“…To recognize human emotion, many scholars have used various types of raw signals [6]. Many used EEG signals [2], [7]-[9] and facial expressions [10], [11]; a few used gestures, speech signals [12], or autonomic nervous system signals [13]. Subjects need to express emotion explicitly when facial expressions and speech signals are used.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
“…Knowledge of GCIs helps in several speech processing tasks, such as detection of glottal activity regions [33], pitch extraction [49], estimation of formant frequencies [15], characterization of loudness [38], analysis of non-verbal sounds such as laughter [31] and shout [30], pitch extraction from multi-speaker data [50], and voice source analysis [47, 2, 3, 9]. GCI-based analysis of speech is also used in several applications, such as concatenative speech synthesis [42], parametric speech synthesis [1], analysis and detection of pathological speech [28, 39], analysis and detection of phonation types [22, 21] and emotions [13, 18], time delay estimation [51], determination of the number of speakers from mixed signals [43], multi-speaker separation [7], and prosody modification [37]. Owing to this wide range of applications, detection of GCIs directly from the speech signal has received considerable research attention.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
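Of the applications listed, pitch extraction follows almost directly from detected GCIs: the reciprocal of each inter-epoch interval is an instantaneous F0 estimate. A minimal sketch under that assumption (the 50-500 Hz gating and function name are illustrative, not from the cited papers):

```python
import numpy as np

def f0_from_gcis(gci_indices, fs):
    """Instantaneous F0 from successive epoch locations (illustrative).

    gci_indices: sample indices of detected GCIs (e.g., from the
    epochs_and_soe() sketch above); fs: sampling rate in Hz.
    """
    periods = np.diff(gci_indices) / float(fs)       # pitch periods (s)
    # Gate to a plausible speech F0 range (50-500 Hz; assumed bounds).
    valid = (periods > 1.0 / 500.0) & (periods < 1.0 / 50.0)
    return 1.0 / periods[valid]                      # instantaneous F0 (Hz)
```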