Phonetic baseforms are the basic recognition units in most large-vocabulary speech recognition systems. These baseforms are usually determined by hand once a vocabulary is chosen and are not modified thereafter. However, many applications of speech recognition, such as dictation transcription, are hampered by a fixed vocabulary and require that the user be able to add new words to the vocabulary. At least one phonetic baseform must be assigned to each new word to properly integrate the word into the recognition system. Dictionary lookup is often unsuccessful in determining a phonetic baseform because new words are often names or task-specific jargon; also, talkers tend to have idiosyncratic pronunciations for a substantial fraction of words. This paper describes a series of experiments in which the phonetic baseform is deduced automatically for new words by utilizing actual utterances of the new word in conjunction with a set of automatically derived spelling-to-sound rules. We evaluated recognition performance on new words spoken by two different talkers when the phonetic baseforms were extracted via this approach. The error rates on these new words were found to be comparable to or better than those obtained when the phonetic baseforms were derived by hand, thus validating the basic approach.