This paper describes an adaptive threshold estimation mechanism for speaker authentication systems. The mechanism estimates speaker-dependent thresholds from successful verifications by minimizing a cost function. Speaker authentication systems commonly use a threshold to decide whether a claimed identity matches a voice print previously enrolled. A speaker-independent threshold is a common option, but it does not consider speaker-specific characteristics that are relevant to achieving better system performance. A speaker-dependent threshold, on the contrary, uses speaker-specific data to estimate individual thresholds, but system performance can also suffer from a suboptimal threshold conditioned by the limited number of true scores. The algorithm reported in this paper starts with a speaker-dependent threshold and uses an adaptive algorithm to perform online re-estimation of the initial threshold based on speaker-dependent data. The threshold is re-estimated in each successful authentication transaction according to a custom-made confidence score. The reported technique keeps the voice print up-to-date while being less sensitive to score outliers than a traditional speaker-dependent threshold. The algorithm provided a performance enhancement of up to 36.2% compared to a traditional speaker-independent threshold. An ad hoc database obtained with a practical system was used, involving cellular and landline utterances from male and female speakers.
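The online re-estimation described above can be sketched as a confidence-weighted update of the speaker-dependent threshold after each successful verification. This is a minimal illustrative sketch, not the paper's actual method: the exponential update rule, the `alpha` parameter, and the function names are assumptions, and the paper's cost function and custom confidence score are not reproduced here.

```python
# Hypothetical sketch: confidence-weighted online threshold re-estimation.
# The update rule and parameters below are illustrative assumptions, not the
# cost-function-based procedure of the paper.

def update_threshold(theta, score, confidence, alpha=0.1):
    """Blend the current speaker-dependent threshold toward a new true score.

    theta      : current speaker-dependent threshold
    score      : verification score from a successful authentication
    confidence : custom confidence in [0, 1]; low confidence damps the update
    alpha      : base learning rate for the exponential update
    """
    step = alpha * confidence  # low-confidence (outlier-like) scores move theta less
    return (1.0 - step) * theta + step * score

# Example: successive successful verifications nudge the threshold, while an
# outlier score with low confidence barely moves it.
theta = 0.50
for s, c in [(0.62, 0.9), (0.58, 0.8), (0.30, 0.1)]:
    theta = update_threshold(theta, s, c)
```

Because the step size is scaled by the confidence score, a single outlier true score perturbs the threshold far less than it would under a plain re-estimation from all accumulated scores, which matches the robustness claim in the abstract.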
Audible inspiration is a type of speech perturbation used in conjunction with other acoustic observations to assess different pathologic conditions of speech associated with neurological or vocal cord disorders. The perception of this voice perturbation is highly subjective and difficult to appraise consistently across multiple utterances, subjects, and disorders. This work reports an algorithm to model the perception of audible inspirations. It automatically segments the inspirations in continuous speech based on time-frequency characteristics and estimates the magnitude of the perturbation through a linear combination of the number, duration, and intensity of the inspirations. The algorithm was evaluated with the Massachusetts Eye and Ear Infirmary Voice database and two other databases containing recordings of motor speech disorders. Results: a new method to automatically segment inspiratory phonation was developed. It provided an average segmentation accuracy of 84.4% and enabled accurate objective judgments of the perturbations associated with audible inspirations.
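The magnitude estimate described above can be sketched as a weighted sum of the three features computed from the segmented inspirations. This is an illustrative sketch only: the weights, feature scaling, and function names are assumptions, as the paper's fitted coefficients are not given in the abstract.

```python
# Hypothetical sketch: perturbation magnitude as a linear combination of the
# number, total duration, and mean intensity of segmented inspirations.
# Weights are illustrative placeholders, not the paper's fitted coefficients.

def inspiration_perturbation(segments, w_count=0.4, w_dur=0.3, w_int=0.3):
    """Estimate perturbation magnitude from segmented inspirations.

    segments : list of (duration_s, intensity_db) tuples, one per detected
               inspiration in the utterance
    """
    if not segments:
        return 0.0
    count = len(segments)                              # number of inspirations
    total_dur = sum(d for d, _ in segments)            # total duration (s)
    mean_int = sum(i for _, i in segments) / count     # mean intensity (dB)
    return w_count * count + w_dur * total_dur + w_int * mean_int
```

In practice the weights would be fit against listener ratings so that the linear score tracks perceived severity across utterances and speakers.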