2016
DOI: 10.1109/taslp.2015.2496222

Using Pronunciation-Based Morphological Subword Units to Improve OOV Handling in Keyword Search

Abstract: Out-of-vocabulary (OOV) keywords present a challenge for keyword search (KWS) systems especially in the low-resource setting. Previous research has centered around approaches that use a variety of subword units to recover OOV words. This work systematically investigates morphology-based subword modeling approaches on seven low-resource languages. We show that using morphological subword units (morphs) in speech recognition decoding is substantially better than expanding word-decoded lattices into subword units…

Cited by 14 publications (16 citation statements)
References 32 publications
“…Morphs therefore represent precisely the kind of unit that would be expected to be learned from passive exposure to ambient speech, and there exist a number of statistical learning algorithms that demonstrate how this process might work [30, 31]. Furthermore, morphs can be learnable even with a relatively low level of language exposure, and can provide a foundation for recognizing or analyzing words that have never been experienced before [32, 33].…”
Section: Experiments 2: Well-formedness Rating Tasks
Citation type: mentioning
confidence: 99%
“…The novel problem of lexeme-set KWS is related to work on out-of-vocabulary KWS, which has been approached by handling sub-word units such as syllables and morphemes (Trmal et al., 2014; Narasimhan et al., 2014; van Heerden et al., 2017; He et al., 2016). In contrast to KWS with sub-word granularity, our approach is to generate likely full-word inflections given a lemma.…”
Section: Abstract
Citation type: mentioning
confidence: 99%
“…As has been documented in a number of papers [1,2,3,4,5,6,7], the best techniques for OOV detection involve decoding with a variety of units, such as syllables, morphemes and 1-n phone units. The sets of hits (postings lists) generated from these decodings are subsequently combined together (system fusion) using techniques such as the one described in [8].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…A number of papers that utilize sub-word units for decoding and search [3,4] just search for the exact sequence of units. Other papers [6,9,10,1,11] use a confusion model to come up with "proxy" keywords which are phonetically close (or, alternatively, allow fuzzy match).…”
Section: Introduction
Citation type: mentioning
confidence: 99%