International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.1989.266505

Continuous hidden Markov modeling for speaker-independent word spotting

Cited by 197 publications (99 citation statements) · References 2 publications
“…If the area under the ROC is A up to a false alarm rate of f, then the percentage area under the ROC curve is given by P A_ROC = 100 · A / f. The purpose of this normalization is to remove variations in the area under the ROC caused by differences in the support of the ROC curves. Another performance measure typically used to evaluate KWS performance is the Figure of Merit (FOM) [12] score, which is the mean of a modified ROC curve sampled at ten points. We have found P A_ROC to be better than FOM at capturing gradual improvements of the model, because FOM takes the value of the ROC curve at a fixed number of points and does not quantify the overall change (improvement or deterioration) of the curve.…”
Section: Experiments and Results
confidence: 99%
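The P A_ROC measure quoted above can be sketched numerically. This is a minimal illustration, assuming trapezoidal integration and that the normalization is the raw area A divided by the support f; the helper name `pa_roc` and the sample curves are hypothetical, not taken from the cited work:

```python
import numpy as np

def pa_roc(false_alarm_rates, detection_rates, f_max):
    """Percentage area under the ROC curve up to false alarm rate f_max.

    Dividing the raw area A by the support f_max removes differences
    caused by ROC curves ending at different false alarm rates
    (assumed normalization: P_A_ROC = 100 * A / f_max).
    """
    fa = np.asarray(false_alarm_rates, dtype=float)
    det = np.asarray(detection_rates, dtype=float)
    mask = fa <= f_max
    fa, det = fa[mask], det[mask]
    # Raw area A via the trapezoidal rule.
    a = float(np.sum(0.5 * (det[1:] + det[:-1]) * np.diff(fa)))
    return 100.0 * a / f_max

# A perfect detector (detection rate 1 everywhere) scores 100;
# a chance-level diagonal ROC scores 50.
print(pa_roc([0.0, 0.5, 1.0], [1.0, 1.0, 1.0], 1.0))  # -> 100.0
print(pa_roc([0.0, 0.5, 1.0], [0.0, 0.5, 1.0], 1.0))  # -> 50.0
```

Because the score is normalized by the support, two systems evaluated up to different maximum false alarm rates remain directly comparable.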
“…These factors also influence the reliability of detections and can therefore be used to estimate detection confidence. Confidence estimation from various informative factors has been studied in both automatic speech recognition and keyword spotting, e.g., Rohlicek et al. (1989); Cox and Rose (1996); Bergen and Ward (1997); Kemp and Schaaf (1997); Ou et al. (2001); Ayed et al. (2002); Jiang (2005), and various methods have been employed to combine the heterogeneous informative factors, including decision trees (DT), general linear models (GLMs), generalized additive models (GAMs) and multi-layer perceptrons (MLPs) (Chase, 1997; Gillick et al., 1997; Zhang and Rudnicky, 2001). It has been found that features derived from multiple sources, with appropriate normalization, can be combined to serve as a good measure of confidence, which can in turn be used to evaluate the correctness of a recognition hypothesis or a keyword detection.…”
Section: Motivation and Organization of This Paper
confidence: 99%
“…For keyword spotting/spoken term detection, Rohlicek et al. (1989) proposed using duration-normalized acoustic likelihood as a confidence measure for each keyword in a filler-model-based keyword spotting system, and Manos and Zue (1997) studied various features for computing a per-word confidence score in a filler-model-based system, including a segment phonemic match score, a score based on the probability of the particular segmentation, a lexical weight, a phone-duration-based score, and a bigram transition score. Ou et al. (2001) employed word posterior likelihoods derived from keyword, anti-keyword and non-keyword models, together with duration, as features in a neural network classifier for utterance verification within a filler-model-based keyword spotting system. Ayed et al. (2002) used the number of frames, the phone posterior probability, the frame-based phone posterior probability and the duration-based phone posterior probability, along with the number of phones as a lexical feature, in an SVM classifier for utterance verification in an LVCSR-based keyword spotting system.…”
Section: Feature Collection
confidence: 99%
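The duration-normalized acoustic likelihood attributed to Rohlicek et al. (1989) above can be sketched as a per-frame average. This is a minimal illustration; the frame log-likelihood values are made up, and a real system would compare the resulting score against a tuned threshold:

```python
def duration_normalized_score(frame_log_likelihoods):
    """Confidence score for a putative keyword hit: the average per-frame
    acoustic log-likelihood over the hypothesized segment. Normalizing by
    duration keeps hits of different lengths directly comparable.
    """
    return sum(frame_log_likelihoods) / len(frame_log_likelihoods)

# Example: a 3-frame and a 5-frame hypothesis become directly comparable.
print(duration_normalized_score([-2.0, -1.5, -1.0]))              # -> -1.5
print(duration_normalized_score([-2.0, -1.5, -1.0, -1.5, -2.0]))  # -> -1.6
```

Without the division by segment length, longer hypotheses would accumulate more (negative) log-likelihood and be unfairly penalized relative to short ones.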
“…While KWS is an active research area, most techniques are not suitable under our constraints. A common KWS approach is the Keyword/Filler Hidden Markov Model (HMM) [3,4,5,6,7]. It first builds a special decoding graph that contains both keywords and filler words, and then uses Viterbi decoding to find the best path through the graph.…”
Section: Introduction
confidence: 99%
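The keyword/filler decoding described in that excerpt can be sketched with a toy graph: one filler state plus a two-state left-to-right keyword model, decoded with Viterbi in the log domain. All probabilities and the 2-symbol alphabet are illustrative assumptions, not parameters of any cited system:

```python
import math
import numpy as np

# Minimal keyword/filler HMM sketch: one filler state plus a two-state
# left-to-right keyword model over a toy 2-symbol alphabet.
states = ["filler", "kw_1", "kw_2"]
with np.errstate(divide="ignore"):      # log(0) -> -inf for forbidden arcs
    log_trans = np.log(np.array([
        [0.8, 0.2, 0.0],   # filler -> filler, or enter the keyword
        [0.0, 0.5, 0.5],   # kw_1 -> kw_1 or kw_2
        [0.5, 0.0, 0.5],   # kw_2 -> exit back to filler, or stay
    ]))
    log_emit = np.log(np.array([
        [0.5, 0.5],        # filler emits both symbols equally
        [0.9, 0.1],        # kw_1 prefers symbol 0
        [0.1, 0.9],        # kw_2 prefers symbol 1
    ]))

def viterbi(obs):
    """Best state path through the keyword/filler graph (log domain)."""
    n, t_len = len(states), len(obs)
    delta = np.full((t_len, n), -np.inf)   # best log score ending in state j
    back = np.zeros((t_len, n), dtype=int)
    delta[0] = math.log(1.0 / n) + log_emit[:, obs[0]]  # uniform start
    for t in range(1, t_len):
        for j in range(n):
            scores = delta[t - 1] + log_trans[:, j]
            back[t, j] = int(np.argmax(scores))
            delta[t, j] = scores[back[t, j]] + log_emit[j, obs[t]]
    path = [int(np.argmax(delta[-1]))]
    for t in range(t_len - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [states[s] for s in reversed(path)]

# Symbols 0,0,1,1 match the keyword model well, so the best path runs
# through kw_1 -> kw_2, which a spotter would report as a detection.
print(viterbi([0, 0, 1, 1]))  # -> ['kw_1', 'kw_1', 'kw_2', 'kw_2']
```

A spotter built this way reports a keyword hit whenever the decoded path traverses the full keyword state sequence; the filler state absorbs all non-keyword speech.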