2019
DOI: 10.1134/s1064226919110184
Speaker Modeling Using Emotional Speech for More Robust Speaker Identification

Cited by 3 publications (1 citation statement)
References: 28 publications
“…The use of discrete Hidden Markov Models (HMMs) with Linear Prediction Cepstral Coefficients (LPCC), Log Frequency Power Coefficients (LFPC), total signal energy (E), Teager energy (TE), fundamental frequency (F0) and formant values (FF) reached a recognition rate of 72% (Nedeljković, Ðurović, 2015). With the Support Vector Machine (SVM) approach, results ranged from 62.78% to 91.3% depending on the test setup (Hassan, Damper, 2010; Milošević et al, 2016). Results reported so far on the Polish database PES vary widely: 50.73% using k Nearest Neighbours (kNN) with Mel Frequency Cepstral Coefficients (MFCC) (Kamińska et al, 2013), whereas phoneme-level formant features combined with Binary Decision Trees (BDT) give 81.9% (Ślot et al, 2009).…”
Section: Introduction (citation type: mentioning)
confidence: 99%
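To make the simplest baseline mentioned in the excerpt concrete, the sketch below pairs utterance-level MFCC features with a kNN classifier. This is a minimal illustration only, assuming the librosa and scikit-learn libraries; the file paths, labels, and parameter values are placeholders and are not taken from any of the cited studies or from the paper itself.

import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

def mfcc_features(path, n_mfcc=13):
    # Load the recording at its native sampling rate and average the
    # MFCC matrix over time to obtain one utterance-level feature vector.
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Placeholder corpus: replace with real (wav_path, label) pairs,
# e.g. speaker labels for recordings from an emotional speech database.
corpus = [
    ("recordings/speaker1_angry_01.wav", "speaker1"),
    ("recordings/speaker2_neutral_03.wav", "speaker2"),
    # ...
]

X = np.array([mfcc_features(path) for path, _ in corpus])
y = np.array([label for _, label in corpus])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print("recognition rate:", knn.score(X_test, y_test))

Averaging MFCCs over time discards temporal structure; the HMM- and formant-based approaches quoted above retain more of it, which is one reason their reported recognition rates differ.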