2014
DOI: 10.3989/loquens.2014.007

Evaluating Automatic Speaker Recognition systems: An overview of the NIST Speaker Recognition Evaluations (1996-2014)

Abstract: Automatic Speaker Recognition systems show interesting properties, such as speed of processing or repeatability of results, in contrast to speaker recognition by humans. But they will be usable just if they are reliable. Testability, or the ability to extensively evaluate the goodness of the speaker detector decisions, becomes then critical. In the last 20 years, the US National Institute of Standards and Technology (NIST) has organized, providing the proper speech data and evaluation protocols, a series of te…

Cited by 28 publications (14 citation statements)
References 45 publications

“…State-of-the-art SV systems are trained to generally make two decisions, acceptance or rejection. There are two types of decision errors, the FAR referred to accept an impostor speaker (Type I error), and the FRR related to the incorrect rejection of a true speaker (Type II error) [1,2]. These errors compared to a threshold determines the system performance.…”
Section: Introduction
mentioning
confidence: 99%
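
The two error rates named in the statement above can be illustrated with a minimal sketch. It assumes that a trial is accepted when its score is greater than or equal to the threshold, and the function name `far_frr` and the example scores are hypothetical, chosen only for illustration; they do not come from the cited papers or from the NIST evaluation tools.

```python
from typing import Sequence, Tuple


def far_frr(
    target_scores: Sequence[float],
    impostor_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (FAR, FRR) for one decision threshold.

    Assumption: a trial is accepted when score >= threshold.
    FAR = fraction of impostor trials that are accepted.
    FRR = fraction of true-speaker (target) trials that are rejected.
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in target_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(target_scores)
    return far, frr


# Hypothetical scores, for illustration only.
target_scores = [0.9, 0.8, 0.6, 0.4]
impostor_scores = [0.7, 0.3, 0.2, 0.1]

far, frr = far_frr(target_scores, impostor_scores, threshold=0.5)
print(f"FAR={far:.2f} FRR={frr:.2f}")
```

Raising the threshold lowers the FAR and raises the FRR, and lowering it does the opposite; sweeping the threshold over the score range traces out the trade-off that curves such as the ROC summarize.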
“…Recent work in speaker identification relies on front-end processing that extracts short-time spectral information, usually using the MEL frequency cepstral coefficient (MFCC) features [1]. MFCC is the most prevalent short-term spectral feature despite the fact that it was developed for speaker-independent speech recognition.…”
Section: Introduction and Previous Work
mentioning
confidence: 99%
“…It is logical, then, that much speaker-related information is missing from MFCC. Attempts to add voice source information back into speaker identification systems to improve them have met limited success [1], [2], [3], [4], probably due to the difficulty and reliability of estimating the voice source waveform itself. We previously introduced the GLOMM method [5], which was based on detecting glottal events (glottal opening/closing) by detecting times of high linear prediction error.…”
Section: Introduction and Previous Work
mentioning
confidence: 99%
“…A sample ROC curve illustrating the trade-off between selectivity and sensitivity of a classifier [30]…”
mentioning
confidence: 99%