2007
DOI: 10.1007/978-3-540-74695-9_23

An Application of Recurrent Neural Networks to Discriminative Keyword Spotting

Abstract: Keyword spotting is a detection task consisting in discovering the presence of specific spoken words in unconstrained speech. The majority of keyword spotting systems are based on generative hidden Markov models and lack discriminative capabilities. However, discriminative keyword spotting systems are based on the estimation of a posteriori probabilities at the frame-level, hence they make use of information from short time spans. This paper presents a discriminative keyword spotting system based on …

Cited by 196 publications (157 citation statements)
References: 13 publications
“…This performed better than traditional RNN as the network can learn from experience given appropriate input weight matrix to classify and predict time series in a long time variance sequence. It has outperformed HMM and RNN as a sequence learning method in some applications, such as in unsegmented cursive handwriting [34] and speech applications [35]. Its architecture is made up of RNN + LSTM blocks which augment the network by remembering arbitrary value in a long period of time.…”
Section: Neural Network and Some Extensions (mentioning)
confidence: 99%
“…Combining bidirectional networks with LSTM gives Bidirectional LSTM (BLSTM), which has demonstrated excellent performance in phoneme recognition [5] and keyword spotting [6].…”
Section: Bidirectional LSTM (mentioning)
confidence: 99%
“…In this work we build in context information by including the outputs of a bidirectional Long Short-Term Memory (BLSTM) recurrent neural network [4,5] in the feature functions. Similar neural network architectures have been successfully applied to speech or emotion recognition related tasks [6,5,7], where they exploit contextual information whenever speech production or perception is influenced by emotion, strong accents, or background noise. In contrast to [6], our keyword spotting approach uses BLSTM for phoneme discrimination and not for the recognition of whole keywords.…”
Section: Introduction (mentioning)
confidence: 99%
“…Hybrid or Tandem architectures that combine discriminatively trained neural networks with Gaussian mixture modeling are widely used for speech recognition [5,6]. However, BLSTM is a relatively new architecture that has so far been applied to keyword spotting in only three works: in [7] and [8] the framewise phoneme predictions of BLSTM (without CTC) were shown to enhance keyword spotting performance of discriminative and generative models, respectively; and in [9] a keyword spotter using only BLSTM-CTC was introduced. The disadvantage of the latter method is that it has a separate output unit for each keyword, which requires excessive amounts of training data for large vocabularies, and also means the network must be retrained when new keywords are added.…”
Section: Introduction (mentioning)
confidence: 99%