Jiří Navrátil scite author profile

In this paper, we present a conditional pronunciation modeling method for the speaker detection task that does not rely on acoustic vectors. Aiming at exploiting higher-level information carried by the speech signal, it uses time-aligned streams of phones and phonemes to model a speaker's specific pronunciation. Our system uses phonemes drawn from a lexicon of pronunciations of words recognized by an automatic speech recognition system to generate the phoneme stream and an open-loop phone recognizer to generate a phone stream. The phoneme and phone streams are aligned at the frame level and conditional probabilities of a phone, given a phoneme, are estimated using co-occurrence counts. A likelihood detector is then applied to these probabilities. Performance is measured using the NIST Extended Data paradigm and the Switchboard-I corpus. Using 8 training conversations for enrollment, a 2.1% equal error rate was achieved. Extensions and alternatives, as well as fusion experiments, are presented and discussed.
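The counting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the add-one smoothing, the toy label streams, and the choice to score with an average log-likelihood ratio against a background model are all assumptions made for illustration, since the abstract only states that conditional probabilities are estimated from co-occurrence counts and passed to a likelihood detector.

```python
import math
from collections import Counter


def estimate_conditionals(phonemes, phones, inventory, smoothing=1.0):
    """Estimate P(phone | phoneme) from two frame-aligned label streams.

    Add-`smoothing` smoothing over a fixed phone inventory is an assumption
    of this sketch; the abstract does not say how zero counts are handled.
    """
    if len(phonemes) != len(phones):
        raise ValueError("streams must be aligned frame by frame")
    pair_counts = Counter(zip(phonemes, phones))  # co-occurrence counts
    phoneme_counts = Counter(phonemes)
    model = {}
    for phoneme, total in phoneme_counts.items():
        denom = total + smoothing * len(inventory)
        model[phoneme] = {
            phone: (pair_counts[(phoneme, phone)] + smoothing) / denom
            for phone in inventory
        }
    return model


def llr_score(phonemes, phones, speaker_model, background_model):
    """Average per-frame log-likelihood ratio (hypothetical detector form)."""
    total, frames = 0.0, 0
    for phoneme, phone in zip(phonemes, phones):
        p_spk = speaker_model.get(phoneme, {}).get(phone)
        p_bkg = background_model.get(phoneme, {}).get(phone)
        if p_spk is None or p_bkg is None:
            continue  # skip pairs that either model cannot score
        total += math.log(p_spk / p_bkg)
        frames += 1
    return total / frames if frames else 0.0


# Toy, made-up data: uppercase labels are lexicon phonemes, lowercase labels
# are open-loop recognizer phones, one label per frame.
INVENTORY = ["ae", "eh", "t"]
PHONEMES = ["AE", "AE", "AE", "AE", "T", "T"]
background = estimate_conditionals(PHONEMES, ["ae", "ae", "eh", "eh", "t", "t"], INVENTORY)
speaker = estimate_conditionals(PHONEMES, ["eh", "eh", "eh", "ae", "t", "t"], INVENTORY)

target_score = llr_score(["AE", "AE", "T"], ["eh", "eh", "t"], speaker, background)
impostor_score = llr_score(["AE", "AE", "T"], ["ae", "ae", "t"], speaker, background)
```

In this toy setup the enrolled speaker tends to realize the phoneme AE as the phone "eh", so a test segment showing the same habit scores above zero and one that does not scores below zero. A real system would compare the score to a threshold tuned on held-out data.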


Jiří Navrátil

The SuperSID project: exploiting high-level information for high-accuracy speaker recognition

Short-time Gaussianization for robust speaker verification

Using prosodic and conversational features for high-performance speaker recognition: report from JHU WS'02

A hybrid GMM/SVM approach to speaker identification

Conditional pronunciation modeling in speaker detection
