2016
DOI: 10.1109/taslp.2016.2544660

Improving Short Utterance Speaker Recognition by Modeling Speech Unit Classes

Cited by 44 publications
(24 citation statements)
References 25 publications
“…373 nearest neighbors, vector quantization [6], hidden Markov model (HMM) [9], Gaussian mixture model (GMM) [10], artificial neural network [4], and deep neural network (DNN) [11]. Of the various classifiers available, in this research we selected GMM as our baseline for speaker recognition.…”
Section: Development Of Quranic Reciter Identification System Using M (mentioning)
confidence: 99%
“…In several ASR, the clients are hesitant to provide sufficient voice data, especially for testing, in phone banking. In different circumstances, it is profoundly hard to gather adequate speech data, for instance in legal applications [9].…”
Section: A. Challenges With Limited Speech Data In ASR (mentioning)
confidence: 99%
“…The recent research advocate if the speech data utilised during testing phase bring down 10% (from 20 sec of speech data to 2 sec of speech data) the performance of ASR degraded abruptly from 6.34% to 23.89% in terms of equal error rate (EER) [9]. In ASR application once testing speech data is less than 2 sec the performance of the system in terms of EER 35% has been reported by Mak et al [10].…”
Section: A Challenges With Limited Speech Data In Asrmentioning
confidence: 99%
“…Alternatively, several approaches have been proposed that leverage phonetic information to perform content matching. The work in Li et al (2016) proposes a GMM based subregion framework where speaker models are trained for each subregion defined by phonemes. Test utterances are then scored with subregion models.…”
Section: Introduction (mentioning)
confidence: 99%