Interspeech 2017
DOI: 10.21437/interspeech.2017-1367

Fast and Accurate OOV Decoder on High-Level Features

Abstract: This work proposes a novel approach to the out-of-vocabulary (OOV) keyword search (KWS) task. The approach decodes on high-level features from an automatic speech recognition (ASR) system, the so-called phoneme-posterior-based (PPB) features. These features are obtained by calculating time-dependent phoneme posterior probabilities from word lattices and then smoothing them. For the PPB features we developed a novel OOV decoder that is very fast, simple and efficient. Experimental res…

Cited by 7 publications (3 citation statements)
References 31 publications
“…The aforementioned problems of OOV words handling and low-resource data conditions need to be addressed when building an ASR system. If the system is a conventional hybrid (HMM-DNN-based acoustic model and word-based n-gram language model), the OOV problem is often solved by dynamically expanding the system’s vocabulary and/or adapting the language model (e.g., [11, 12, 13]). A less common approach is to use a subword-based n-gram language model [14].…”
Section: Introduction
mentioning
confidence: 99%
“…The BBN system [10] combined several acoustic models based on DNN, LSTM and CNN on subword units to perform joint decoding and handled OOV queries on the sub-word unit. The STC keyword search system [11] combined 9 different acoustic models based on DNN and GMM with a phone-posterior based OOV decoder [12].…”
Section: Introduction
mentioning
confidence: 99%
“…If the system is a conventional hybrid (HMM-DNN-based acoustic model and word-based n-gram language model), the OOV problem is often solved by dynamic expanding the system's vocabulary and/or adapting the language model (e.g. Khokhlov et al, 2017; Gandhe et al, 2018; Malkovsky et al, 2020). A less common approach is to use a subword-based n-gram language model (Smit et al, 2017).…”
mentioning
confidence: 99%