Efficient Speech Recognition Techniques for the Finals of Mandarin Syllables

A Two-Level Time-Delay Neural Network (TLTDNN) technique has been developed to recognize all Mandarin Finals across the full set of Chinese syllables. The first level discriminates the vowel group, based on (a, e, i, o, u, v), and the nasal group, based on the nasal ending (-n, -ng, -others). Orthogonally combining the two first-level groupings enables second-level discrimination of all 35 Mandarin Finals. The technique was thoroughly tested with 8 sets of 1265 isolated Hanyu Pinyin syllables, with 6 sets used for training and 2 sets used for testing. Overall, recognition rates of 95.3% for inside testing and 93.9% for outside testing were achieved.
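The two-level grouping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule used here for the vowel group (the first of the letters a, e, i, o, u, v in the final) is an assumption, the function names are hypothetical, and the list of finals is a small illustrative subset rather than the 35 finals used in the paper. In the paper both levels are neural networks; the sketch only shows how the two first-level labels combine into cells that narrow the second-level choice.

```python
from collections import defaultdict

# Group labels taken from the abstract.
VOWEL_GROUPS = ("a", "e", "i", "o", "u", "v")
NASAL_GROUPS = ("n", "ng", "other")


def nasal_group(final: str) -> str:
    """Classify a final by its nasal ending: -ng, -n, or anything else."""
    if final.endswith("ng"):
        return "ng"
    if final.endswith("n"):
        return "n"
    return "other"


def vowel_group(final: str) -> str:
    """ASSUMPTION: the vowel group is the first vowel letter in the final."""
    for ch in final:
        if ch in VOWEL_GROUPS:
            return ch
    raise ValueError(f"no vowel letter found in final {final!r}")


def first_level(final: str) -> tuple:
    """Combine the two first-level labels into one (vowel, nasal) cell."""
    return vowel_group(final), nasal_group(final)


def build_cells(finals) -> dict:
    """Map each (vowel, nasal) cell to the finals a second level must separate."""
    cells = defaultdict(list)
    for final in finals:
        cells[first_level(final)].append(final)
    return dict(cells)


# Illustrative subset of finals in Pinyin spelling (v stands for ü).
SAMPLE_FINALS = ["a", "an", "ang", "e", "en", "eng", "i", "in", "ing",
                 "ian", "iang", "ong", "u", "uan", "v", "vn"]

if __name__ == "__main__":
    for cell, members in sorted(build_cells(SAMPLE_FINALS).items()):
        print(cell, members)
```

With 6 vowel groups and 3 nasal groups there are at most 18 cells, so a second-level classifier only has to choose among the few finals that share a cell.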

Linear Predictive Coefficients-Based Feature to Identify Top-Seven Spoken Languages

Mukherjee, Dhar, Obaidullah et al. 2019, Int. J. Patt. Recogn. Artif. Intell.

Speech recognition in a multilingual scenario is not trivial when multiple languages are used in one conversation. The language must be identified before speech recognition is performed, as such tools are language-dependent. We present a language identification system (or AI tool) to distinguish the top-seven world languages, namely Chinese, Spanish, English, Hindi, Arabic, Bangla and Portuguese [G. F. Simons and C. D. Fennig (eds.), Ethnologue: Languages of the Americas and the Pacific, Twentieth Edn. (SIL International, 2017)]. The system uses a linear predictive coefficients-based feature, i.e. the line spectral pair–grade ratio (LSP–GR) feature, and ensemble learning for classification. Experiments were performed on more than 200 h of real-world YouTube data, and the highest accuracy obtained was 96.95%. The results can be compared with other machine learning classifiers.

Efficient Speech Recognition Techniques for the Finals of Mandarin Syllables

Cited by 3 publications

References 0 publications

Large vocabulary Mandarin Final recognition based on Two-Level Time-Delay Neural Networks (TLTDNN)

A two-level TDNN (TLTDNN) technique for large vocabulary Mandarin final recognition

Linear Predictive Coefficients-Based Feature to Identify Top-Seven Spoken Languages
