Abstract: This paper proposes a probabilistic framework to define and evaluate confidence measures for word recognition. We describe a novel method to combine different knowledge sources and estimate the confidence in a word hypothesis, via a neural network. We also propose a measure of the joint performance of the recognition and confidence systems. The definitions and algorithms are illustrated with results on the Switchboard Corpus.
“…The most popular paradigm to evaluate confidence is as a probability, [6,41,46,65,49,66,51,9,53,12,30,61]. Thus, each of these papers models confidence in a fashion such that confidence to conform to the axioms of probability.…”
Section: Confidence Paradigms
mentioning
confidence: 99%
“…However, it is also useful to have a way to measure how confident the system is on any given indication. Much of the current literature, [6,9,12,41,65,66], uses either a posterior probability or something similar to a posterior probability to measure confidence in a given declaration. However, there are problems with simply using posterior probabilities.…”
Section: Literature Overview
mentioning
confidence: 99%
“…Arribas and Cid-Sueiro [6] state right up front that a posterior probability can be used as a measure of confidence in a decision. Weintraub et al [66] use the posterior probability as the measure of the confidence in a word hypothesis. Goh et al [37] state that confidence is not equal to the posterior probability; instead, it is proportional to the maximum posterior probability across the possible classes.…”
“…In other words, these approaches do not propose to train the model so as to maximize the spotting performance, and the keyword spotting task is only introduced in the inference step after training. Only few studies have proposed discriminative parameter training approaches to circumvent this weakness (Benayed et al 2003; Sandness and Hetherington 2000; Sukkar et al 1996; Weintraub et al 1997). Sukkar et al (1996) proposed to maximize the likelihood ratio between the keyword and garbage models for keyword utterances and to minimize it over a set of false alarms generated by a first keyword spotter.…”
Section: Previous Work
mentioning
confidence: 99%
“…Other discriminative approaches have been focused on combining different HMM-based keyword detectors. For instance, Weintraub et al (1997) trained a neural network to combine likelihood ratios from different models. Benayed et al (2003) relied on support vector machines to combine different averages of phone-level likelihoods.…”
This chapter introduces a discriminative method for detecting and spotting keywords in spoken utterances. Given a word represented as a sequence of phonemes and a spoken utterance, the keyword spotter predicts the best time span of the phoneme sequence in the spoken utterance, along with a confidence. If the prediction confidence is above a certain level, the keyword is declared to be spoken in the utterance within the predicted time span; otherwise, the keyword is declared as not spoken. The problem of keyword spotting training is formulated as a discriminative task in which the model parameters are chosen so that an utterance in which the keyword is spoken has higher confidence than any other utterance in which the keyword is not spoken. It is shown theoretically and empirically that the proposed training method results in a high area under the receiver operating characteristic (ROC) curve, the most common measure for evaluating keyword spotters. We present an iterative algorithm to train the keyword spotter efficiently. The proposed approach contrasts with standard spotting strategies based on HMMs, for which the training procedure does not maximize a loss directly related to the spotting performance. Several experiments performed on the TIMIT and WSJ corpora show the advantage of our approach over HMM-based alternatives.
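The link between the ranking criterion and the area under the ROC curve described above can be illustrated with a small sketch. It is not the chapter's training algorithm: it only computes the AUC as the fraction of (keyword, non-keyword) utterance pairs whose confidences are ordered correctly, and applies the threshold decision rule. The function names `pairwise_auc` and `declare_keyword`, and all confidence values, are hypothetical and chosen for illustration.

```python
from itertools import product


def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly.

    A pair counts as correct when the utterance containing the keyword
    has the higher confidence; ties count as half a correct pair.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one score from each class")
    correct = 0.0
    for pos, neg in product(pos_scores, neg_scores):
        if pos > neg:
            correct += 1.0
        elif pos == neg:
            correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


def declare_keyword(confidence, threshold):
    """Declare the keyword as spoken only if confidence exceeds the threshold."""
    return confidence > threshold


# Hypothetical confidences (illustrative values, not from the chapter).
with_keyword = [0.9, 0.7, 0.4]      # utterances in which the keyword is spoken
without_keyword = [0.6, 0.3, 0.1]   # utterances in which it is not spoken

auc = pairwise_auc(with_keyword, without_keyword)
decisions = [declare_keyword(c, threshold=0.5) for c in with_keyword]
```

Under this pairwise view, a model that ranks every keyword utterance above every non-keyword utterance reaches an AUC of 1 regardless of where the decision threshold is placed, which is why a training objective over such pairs relates directly to the ROC-based evaluation.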