2006
DOI: 10.1007/11892755_44

Practical Considerations for Real-Time Implementation of Speech-Based Gender Detection

Abstract: This paper describes a detailed analysis and implementation of a robust gender detector for audio stream applications. The implementation, based on mel-cepstral features and a Gaussian mixture model classifier, is designed to maximize gender classification performance in continuous speech. The described detector outperforms other reported systems based on statistically significant numbers of gender verifications (2136 unique speakers) obtained from the FISHER speech corpus. The system yields high accuracies for…
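As a rough illustration of the kind of classifier the abstract names, the sketch below scores a sequence of feature frames under one Gaussian mixture model per gender and picks the label with the higher summed log-likelihood. It is not the paper's implementation: the diagonal covariances, the two-component mixtures, the 2-D toy frames standing in for mel-cepstral vectors, and all names (`TOY_MODELS`, `classify`, and so on) are assumptions made for the example.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) triples."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(logs)
    # Log-sum-exp keeps the mixture sum numerically stable.
    return top + math.log(sum(math.exp(value - top) for value in logs))


def classify(frames, models):
    """Return the label whose GMM gives the highest summed frame log-likelihood."""
    scores = {
        label: sum(gmm_log_likelihood(frame, gmm) for frame in frames)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy models: two components each, unit variances, 2-D features.
TOY_MODELS = {
    "female": [(0.5, [2.0, 1.0], [1.0, 1.0]), (0.5, [3.0, 2.0], [1.0, 1.0])],
    "male": [(0.5, [-2.0, -1.0], [1.0, 1.0]), (0.5, [-3.0, -2.0], [1.0, 1.0])],
}

if __name__ == "__main__":
    print(classify([[2.5, 1.5], [2.0, 1.0]], TOY_MODELS))
```

In a real system the mixture parameters would be trained on labeled speech rather than written by hand, and the frames would come from a mel-cepstral front end; summing frame log-likelihoods treats frames as independent, which is a common simplification.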

citations
Cited by 5 publications
(5 citation statements)
references
References 5 publications
“…This distinction, motivated by the higher cost incurred by moving erroneously than by not moving at all, leads to AER ignoring false negative predictions (incorrect predictions of no motion class). Equations (5) and (6) show the computation of TER and AER, respectively. Parameters n, N, p_n, and l_n are the index of the test feature vector, the total number of test feature vectors, the class prediction of the classifier for feature vector n, and the true class label of feature vector n, respectively.…”
Section: Performance Evaluation (mentioning)
confidence: 99%
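The quoted passage refers to Equations (5) and (6) without reproducing them, so the following is only one plausible reading of the two metrics. It assumes TER is the fraction of test vectors whose prediction differs from the true label, and that AER counts the same errors except those where the prediction is the no-motion class, still divided by the total number of test vectors N. The `NO_MOTION` label value, the function names, and the choice of denominator for AER are assumptions, not taken from the cited paper.

```python
NO_MOTION = 0  # hypothetical label for the no-motion class


def _check(pred, true):
    """Reject empty or mismatched inputs before computing a rate."""
    if len(pred) != len(true) or not true:
        raise ValueError("pred and true must be non-empty and of equal length")


def total_error_rate(pred, true):
    """Fraction of test vectors n with p_n != l_n (assumed form of TER)."""
    _check(pred, true)
    return sum(p != l for p, l in zip(pred, true)) / len(true)


def active_error_rate(pred, true, no_motion=NO_MOTION):
    """Like TER, but errors whose prediction is no-motion are not counted.

    Assumption: the denominator stays N, the total number of test vectors.
    """
    _check(pred, true)
    active_errors = sum(p != l and p != no_motion for p, l in zip(pred, true))
    return active_errors / len(true)


if __name__ == "__main__":
    predictions = [1, 0, 2, 0]
    labels = [1, 1, 1, 0]
    print(total_error_rate(predictions, labels))   # 2 errors out of 4
    print(active_error_rate(predictions, labels))  # 1 active error out of 4
```

In the toy data, the second prediction is a wrong no-motion output, so it counts toward TER but not toward AER, which mirrors the distinction the passage draws.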
“…With adequate spatial resolution and a proper EMG pattern recognition pipeline, motions can be deciphered with remarkably high accuracy (>90% accuracy). Myoelectric control has been used in a variety of human-computer interfaces such as upper-limb prostheses or orthoses [1,2], electric wheelchairs [3], muscle-derived speech decoding devices [4,5], virtual reality control devices [6] and other clinical and consumer device designs [7]. While myoelectric control has been touted for decades as an intuitive means of control for assistive-devices, performance of these devices in daily living conditions has been notably inferior to benchmarks achieved in controlled laboratory environments.…”
Section: Introduction (mentioning)
confidence: 99%
“…With adequate spatial resolution and a proper EMG pattern recognition pipeline, motions can be deciphered with remarkably high accuracy (>90% accuracy). Myoelectric control has been used in a variety of human-computer interfaces such as electric wheelchairs [1], orthoses [2,3], muscle-derived speech decoding devices [4,5], virtual reality control devices [6] and other clinical and consumer device designs [7]. While myoelectric control has been touted for decades as an intuitive means of control for assistive-devices, performance of these devices in daily living conditions has been notably inferior to benchmarks achieved in controlled laboratory environments.…”
Section: Introduction (mentioning)
confidence: 99%
“…Many approaches to selecting features and decision rules are known for the task of recognizing a speaker's gender from speech. In [1,2], estimates of the fundamental frequency (pitch) are used jointly with cepstral features as the features for determining speaker gender.…”
Section: Literature Review (unclassified)