2009
DOI: 10.1587/transinf.e92.d.671

Distinctive Phonetic Feature (DPF) Extraction Based on MLNs and Inhibition/Enhancement Network

Abstract: This paper describes a distinctive phonetic feature (DPF) extraction method for use in a phoneme recognition system; our method has a low computation cost. This method comprises three stages. The first stage uses two multilayer neural networks (MLNs): MLN_LF-DPF, which maps continuous acoustic features, or local features (LFs), onto discrete DPF features, and MLN_Dyn, which constrains the DPF context at the phoneme boundaries. The second stage incorporates inhibition/enhancement (In/En) functionalitie…

Cited by 14 publications (18 citation statements)
References 13 publications
“…In this method, three MLNs instead of two MLNs [5] MLN_LF-DPF, outputs DPFs [11,12] for the inputted acoustic features, LFs [15], while the second MLN, MLN_cntxt, reduces misclassification at phoneme boundaries by taking seven frame context (from t-3 to t+3) as input, and the third MLN, MLN_Dyn, restricts the DPF dynamics by incorporating dynamic parameters (ΔDPF and ΔΔDPF) into its input. Here, the MLN_LF-DPF, which is trained using the standard back-propagation learning algorithm, has two hidden layers of 256 and 96 units, respectively and takes three input vectors (t-3, t, t+3) of LFs of 25 dimensions each.…”
Section: A. DPF Extractor (mentioning)
confidence: 99%
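
The input layout described in the statement above (three 25-dimensional LF vectors taken at t-3, t, and t+3, feeding hidden layers of 256 and 96 units) can be illustrated with a minimal sketch. This is not the authors' implementation. The helper names `splice_frames` and `layer_shapes`, the clamping of indices at utterance edges, and the output size `N_DPF = 15` are assumptions introduced for illustration only; the quoted text does not state how edges are handled or how many DPF outputs the network has.

```python
LF_DIM = 25          # LF dimensionality per frame, from the quoted statement
HIDDEN = (256, 96)   # hidden layer sizes, from the quoted statement
N_DPF = 15           # ASSUMPTION: placeholder output size, not given in the quote


def splice_frames(frames, offsets=(-3, 0, 3)):
    """Concatenate the frames at t+offset for every frame index t.

    Indices that fall outside the utterance are clamped to the first or
    last frame. This edge policy is an assumption, not taken from the source.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        vec = []
        for off in offsets:
            idx = min(max(t + off, 0), n - 1)
            vec.extend(frames[idx])
        spliced.append(vec)
    return spliced


def layer_shapes(input_dim, hidden, output_dim):
    """Return (fan_in, fan_out) pairs for a fully connected network."""
    dims = [input_dim, *hidden, output_dim]
    return list(zip(dims[:-1], dims[1:]))


# Toy utterance: 10 frames, each LF vector filled with its frame index.
frames = [[float(t)] * LF_DIM for t in range(10)]
x = splice_frames(frames)                        # 10 vectors of 3 * 25 values
shapes = layer_shapes(len(x[0]), HIDDEN, N_DPF)  # weight matrix shapes
```

Calling the same helper with `offsets=range(-3, 4)` yields a seven-frame context window like the one the statement attributes to the second network, applied to whatever vectors that network receives.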
“…This phoneme misclassification sometimes occurs when the values of DPF peaks and DPF dips are closer to each other. Therefore, a mechanism, which is called In/En network [5,6,7], is needed to obtain clearly separable DPF peaks and dips. An algorithm for this network is given below:…”
Section: B. Inhibition/Enhancement Network (mentioning)
confidence: 99%
“…Various methods of phoneme recognition based on Inhibition/Enhancement (In/En) network were proposed by Huda, et al [1][2][3][4]. These papers introduced In/En functionality to discriminate whether the distinctive phonetic features (DPFs) dynamic patterns of trajectories are convex or concave.…”
Section: Introduction (mentioning)
confidence: 99%
“…These papers showed that In/En network has an effect of improving phoneme recognition performance in clean acoustic environment. The impact of In/En network in practical condition was not analyzed in [1][2][3][4].…”
Section: Introduction (mentioning)
confidence: 99%