Cited by 8 publications (4 citation statements)
References 11 publications
“…Unlike [3] and [12] solutions that also use gated memory networks, the proposed approach does not require the use of a secondary algorithm like the CTC algorithm to carry out the sequence labeling over the classified phone labels. Instead, in our approach we directly classify the input data into either target or background words in a left-to-right online manner.…”
Section: Discussion
Citation type: mentioning (confidence: 99%)
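The direct, CTC-free labeling described in the statement above could look roughly like the following sketch. This is an assumption-laden illustration in PyTorch, not the cited authors' implementation: model sizes, class layout, and all names are hypothetical.

import torch
import torch.nn as nn

class OnlineWordSpotter(nn.Module):
    def __init__(self, n_features=40, hidden=128, n_target_words=10):
        super().__init__()
        # A unidirectional LSTM keeps the classifier strictly left-to-right (online).
        self.rnn = nn.LSTM(n_features, hidden, batch_first=True)
        # One class per target word plus one extra "background" class.
        self.head = nn.Linear(hidden, n_target_words + 1)

    def forward(self, frames):            # frames: (batch, time, n_features)
        h, _ = self.rnn(frames)
        return self.head(h)               # per-frame word/background logits

# Usage: each frame is labeled immediately, with no secondary CTC decoding pass.
spotter = OnlineWordSpotter()
logits = spotter(torch.randn(1, 200, 40))   # 200 frames of 40-dim features
frame_labels = logits.argmax(dim=-1)        # 0 = background, 1..10 = target words

Because every frame receives a word-level decision as it arrives, detection proceeds left to right without a lattice or sequence-labeling post-processor.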
“…Phonemes have regular, but relatively simple structures that are modeled well by the recurrence in BLSTMs. Word-spotting with BLSTMs has focused on text-or phoneme-sequence-based word specification: the BLSTM first generates a phoneme sequence or lattice from learned phoneme-level models, and the word spotter either scans the phoneme lattice generated with the BLSTMs for the specified words [12], or uses a secondlevel discriminative classifier that employs features derived from the lattice to detect the words [13]. As such, these methods are two-level classifiers.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
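For contrast, the two-level, phoneme-based scheme described in this statement could be sketched as follows. This is again a hedged illustration under assumptions: a 1-best phoneme string stands in for the lattice, and exact string matching stands in for the lattice scan or second-level discriminative classifier; all names and dimensions are hypothetical.

import torch
import torch.nn as nn

N_PHONES = 40  # hypothetical phoneme inventory size

# Level 1: a BLSTM emits per-frame phoneme posteriors.
blstm = nn.LSTM(input_size=40, hidden_size=128, batch_first=True, bidirectional=True)
phone_head = nn.Linear(2 * 128, N_PHONES)

def spot_keyword(frames, keyword_phones):
    """frames: (1, time, 40) acoustic features; keyword_phones: list of phoneme ids."""
    h, _ = blstm(frames)
    phone_ids = phone_head(h).argmax(dim=-1).squeeze(0).tolist()
    # Collapse frame-level repeats into a crude 1-best phoneme sequence.
    collapsed = [p for i, p in enumerate(phone_ids) if i == 0 or p != phone_ids[i - 1]]
    # Level 2: scan the phoneme sequence for the keyword's phoneme string.
    k = len(keyword_phones)
    return any(collapsed[i:i + k] == keyword_phones for i in range(len(collapsed) - k + 1))

found = spot_keyword(torch.randn(1, 200, 40), keyword_phones=[5, 12, 3])

The point of the sketch is the structure: phoneme-level recognition first, word spotting as a separate second stage on top of its output.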
“…Whole word (Morgan et al, 1990; Rose and Paul, 1990; Naylor et al, 1992; Rohlicek et al, 1993; Cuayáhuitl and Serridge, 2002; Baljekar et al, 2014; Chen et al, 2014a; Zehetner et al, 2014; Hou et al, 2016; Manor and Greenberg, 2017; Fernández-Marqués et al, 2018; Myer and Tomar, 2018)
Monophone (Rose and Paul, 1990; Rohlicek et al, 1993; Cuayáhuitl and Serridge, 2002; Heracleous and Shimizu, 2003; Szöke et al, 2005; Lehtonen, 2005; Silaghi and Vargiya, 2005; Wöllmer et al, 2009b; Jansen and Niyogi, 2009a,c; Wöllmer et al, 2009a; Szöke et al, 2010; Shokri et al, 2011; Tabibian et al, 2011; Hou et al, 2016; Kumatani et al, 2017; Gruenstein et al, 2017; Tabibian et al, 2018; Myer and Tomar, 2018)
Triphone (Rose and Paul, 1990; Szöke et al, 2005)
Part of the word (Naylor et al, 1992; Li and Wang, 2014; Chen et al, 2014a)
State unit (Zeppenfeld and Waibel, 1992)
Part of the phoneme (Rohlicek et al, 1989; Kosonocky and Mammone, 1995; Leow et al, 2012)
Syllable (Klemm et al, 1995; …”
Section: Acoustic Unit Sources
Citation type: mentioning (confidence: 99%)
“…The primary objective of these approaches is, by means of learning from large amounts of training data, to either obtain cleaner signals and features from noisy speech audio, or directly perform recognition of noisy speech. To this end, deep learning, which is mainly based on deep neural networks, has had a central role in the recent developments [13]- [16]. Deep learning has been consistently found to be a powerful learning approach in exploiting large-scale training data to build complex and dedicated analysis systems [17], and has achieved considerable success in a variety of fields, such as gaming [18], visual recognition [19], [20], language translation [21], music information retrieval [22], and ASR [23], [24].…”
Section: Introduction
Citation type: mentioning (confidence: 99%)