2011
DOI: 10.1109/tasl.2010.2058800

Stochastic Pronunciation Modeling for Out-of-Vocabulary Spoken Term Detection

Abstract: Spoken term detection (STD) is the name given to the task of searching large amounts of audio for occurrences of spoken terms, which are typically single words or short phrases. One reason that STD is a hard task is that search terms tend to contain a disproportionate number of out-of-vocabulary (OOV) words. The most common approach to STD uses subword units. This, in conjunction with some method for predicting pronunciations of OOVs from their written form, enables the detection of OOV terms but perf…

Cited by 10 publications (8 citation statements)
References 36 publications (36 reference statements)
“…We tested two scenarios for OOV terms: one with 1-best pronunciations and the other with multiple pronunciations based on SPM [9]. We define a maximum group of overlapping detections as a cluster, the averaged number of detection per cluster as the 'overlap ratio', and the size of the largest cluster as 'max overlap'.…”
Section: Discussion (mentioning)
confidence: 99%
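The cluster statistics defined in the statement above can be illustrated with a minimal sketch. It assumes each detection is a `(start, end)` time interval in seconds, that two detections overlap when their intervals intersect, and that a cluster is a maximal chain of such overlapping detections. The citing paper's excerpt does not specify any of this, so the representation and the names `cluster_detections` and `overlap_stats` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: a detection is a (start, end) interval in seconds.
Detection = Tuple[float, float]


def cluster_detections(detections: List[Detection]) -> List[List[Detection]]:
    """Group detections into maximal chains of overlapping intervals.

    Assumption: intervals that merely touch at an endpoint do not overlap.
    """
    clusters: List[List[Detection]] = []
    cluster_end = float("-inf")
    for start, end in sorted(detections):
        if clusters and start < cluster_end:
            # Overlaps the current cluster: extend it.
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            # No overlap with the current cluster: start a new one.
            clusters.append([(start, end)])
            cluster_end = end
    return clusters


def overlap_stats(detections: List[Detection]) -> Tuple[float, int]:
    """Return (overlap ratio, max overlap) as defined in the quoted statement."""
    clusters = cluster_detections(detections)
    if not clusters:
        return 0.0, 0
    sizes = [len(cluster) for cluster in clusters]
    overlap_ratio = sum(sizes) / len(sizes)  # average detections per cluster
    max_overlap = max(sizes)                 # size of the largest cluster
    return overlap_ratio, max_overlap


if __name__ == "__main__":
    example = [(0.0, 1.0), (0.5, 1.5), (1.4, 2.0), (3.0, 4.0)]
    print(overlap_stats(example))
```

In this example the first three detections chain into one cluster and the fourth stands alone, giving two clusters of sizes 3 and 1.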
“…We have shown in previous studies [7] that an ATWV-oriented decision strategy [8] is essential for OOV STD, however accumulated confidences are not normalized and thus cannot be applied together with ATWV-oriented decisions, although normalization techniques may provide some compensation [7]. Furthermore, stochastic pronunciation modeling (SPM) [9], which has been shown to be highly effective for OOV term detection, may further complicate the pattern of overlaps since more pronunciation candidates are taken into account.…”
Section: Introduction (mentioning)
confidence: 99%
“…There have been many studies on speech recognition errors, OOV, and term pronunciation problems in STD [11], [12], [13]. This study addresses only the recognition errors and OOV problems.…”
Section: Related Work (mentioning)
confidence: 99%
“…These variants can occur in a continuum ranging from generally accepted alternate pronunciations to barely perceptible phonetic variations. While incorporating necessary variants can improve both automatic speech recognition (ASR) and spoken term detection (STD) performance [1,2], introducing superfluous variants can lead to increased confusability and a decrease in performance [3]. The use of pronunciation variants in state-of-the-art speech-recognition systems has received significant attention in recent years, with a number of studies investigating the effect of variant prediction on either ASR or STD accuracy [4,5,1,3,6].…”
Section: Introduction (mentioning)
confidence: 99%