Interspeech 2010
DOI: 10.21437/interspeech.2010-383
A comparative large scale study of MLP features for Mandarin ASR

Abstract: MLP based front-ends have shown significant complementary properties to conventional spectral features. As part of the DARPA GALE program, different MLP features were developed for Mandarin ASR. In this paper, all the proposed front-ends are compared in a systematic manner, and we extensively investigate the scalability of these features in terms of the amount of training data (from 100 hours to 1600 hours) and system complexity (maximum likelihood training, SAT, lattice-level combination, and discriminative training).

Cited by 17 publications (5 citation statements)
References 17 publications
“…The cutoff frequency for both filter-banks is approximately 10 Hz. The output of the MRASTA filtering is then processed by a hierarchy of MLPs moving progressively from high to low modulation frequencies, or equivalently from short to long temporal context [7]. The effect of this sequential processing is that the first MLP, trained on a short temporal context, is effective on most phonetic classes apart from stops and affricates.…”
Section: MLP Architectures
confidence: 99%
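The hierarchical processing described in this citation can be sketched as follows: a first MLP operates on a short temporal context, and a second MLP re-processes its phone posteriors over a longer context. The layer sizes, context widths, and random weights below are illustrative assumptions, not the configuration used in the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, w_hid, w_out):
    """Single-hidden-layer MLP: sigmoid hidden units, softmax phone posteriors."""
    h = 1.0 / (1.0 + np.exp(-(x @ w_hid)))
    z = h @ w_out
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def stack_context(frames, left, right):
    """Concatenate each frame with its temporal context (edges are padded)."""
    padded = np.pad(frames, ((left, right), (0, 0)), mode="edge")
    return np.stack([padded[t:t + left + right + 1].ravel()
                     for t in range(len(frames))])

n_frames, n_bands, n_phones = 50, 19, 40

# Stage 1: MLP on a short temporal context (high modulation frequencies).
feats = rng.standard_normal((n_frames, n_bands))
x1 = stack_context(feats, 4, 4)                    # +/-4 frames of context
w1h = rng.standard_normal((x1.shape[1], 64)) * 0.1
w1o = rng.standard_normal((64, n_phones)) * 0.1
post1 = mlp_forward(x1, w1h, w1o)

# Stage 2: MLP fed with stage-1 posteriors over a longer context,
# targeting the slower modulations the first net handles poorly.
x2 = stack_context(post1, 10, 10)                  # +/-10 frames of context
w2h = rng.standard_normal((x2.shape[1], 64)) * 0.1
w2o = rng.standard_normal((64, n_phones)) * 0.1
post2 = mlp_forward(x2, w2h, w2o)

print(post2.shape)  # one posterior vector per frame: (50, 40)
```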
“…Because of the large dimension of these time windows, a number of techniques for efficiently encoding the information have been proposed, such as MRASTA [3], DCT-TRAPS [4], and wLP-TRAPS [5]. The second direction includes a number of heterogeneous techniques that aim at overcoming the pitfalls of the three-layer MLP classifier, including bottleneck architectures [6], hierarchical architectures [7], and multi-stream approaches [8].…”
Section: Introduction
confidence: 99%
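As an illustration of the first direction mentioned above, a DCT-TRAPS-style encoding compresses each critical-band energy trajectory with a truncated DCT before it is fed to the MLP. This is a minimal sketch; the window length, number of retained coefficients, and input dimensions are arbitrary assumptions.

```python
import numpy as np

def dct2_basis(n, k):
    """First k rows of the orthonormal DCT-II basis for length-n signals."""
    m = np.cos(np.pi / n * (np.arange(n) + 0.5)[None, :] * np.arange(k)[:, None])
    m *= np.sqrt(2.0 / n)
    m[0] /= np.sqrt(2.0)
    return m  # shape (k, n)

def dct_traps(log_energies, half_ctx=15, n_coeffs=6):
    """Encode each band's long temporal trajectory with a truncated DCT.

    For every frame and every band, take the (2*half_ctx+1)-frame trajectory
    centred on the frame and keep its first n_coeffs DCT coefficients,
    yielding (n_frames, n_bands * n_coeffs) features.
    """
    n_frames, n_bands = log_energies.shape
    win = 2 * half_ctx + 1
    basis = dct2_basis(win, n_coeffs)
    padded = np.pad(log_energies, ((half_ctx, half_ctx), (0, 0)), mode="edge")
    out = np.empty((n_frames, n_bands * n_coeffs))
    for t in range(n_frames):
        traj = padded[t:t + win]            # (win, n_bands) trajectory window
        out[t] = (basis @ traj).T.ravel()   # n_coeffs DCT coeffs per band
    return out

feats = dct_traps(np.random.default_rng(1).standard_normal((100, 19)))
print(feats.shape)  # (100, 114): 19 bands * 6 coefficients per frame
```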
“…Since in state-of-the-art ASR systems [9,10] MLPs are mainly used as features in the TANDEM approach, in this paper our main goal is to compare the two acoustic modeling techniques not only on MFCCs, but also on concatenated cepstral and posterior features. Therefore, evaluating several feature transformation techniques (SAT, LDA) developed for GMMs, the study is also extended to context-dependent MLP features.…”
Section: Introduction
confidence: 99%
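A minimal sketch of the TANDEM feature construction this citation refers to: MLP phone posteriors are log-compressed, decorrelated (PCA is used here as a stand-in for the KLT typically applied in TANDEM systems), and concatenated with the cepstral stream. All dimensions and the random inputs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_frames, n_mfcc, n_phones = 200, 39, 40

mfcc = rng.standard_normal((n_frames, n_mfcc))           # cepstral stream
posteriors = rng.dirichlet(np.ones(n_phones), n_frames)  # stand-in MLP outputs

# Log compression makes the posteriors more Gaussian-like for GMM modelling.
logp = np.log(posteriors + 1e-10)

# Decorrelate with PCA and keep the leading components.
centred = logp - logp.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
reduced = centred @ vt[:25].T                            # keep 25 dimensions

# Concatenate cepstral and posterior-derived features.
tandem = np.hstack([mfcc, reduced])
print(tandem.shape)  # (200, 64): 39 MFCCs + 25 posterior components
```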