Xie Sun scite author profile

Xie Sun

5 Publications

11 Citation Statements Received

52 Citation Statements Given

Affiliations

Nanjing University of Science and Technology, University of Missouri, Nuance Communications (United States)

Publications

Assessment of non-native speech using vowel space characteristics

Chen

Evanini

Sun

2010

In this paper, we propose the idea of using the characteristics of a speaker's vowel space for automated assessment of second language (L2) proficiency. Specifically, we adopt features that were shown in previous studies to be good indicators of native speaker intelligibility and clarity and apply them to L2 speech from non-native speakers. The features focus on three peripheral vowels (IY, AA, and OW) and measure a speaker's coverage of the vowel space. A pilot study and a large-scale corpus study involving read speech produced by native and non-native speakers were conducted in which the vowel space features were rank correlated with pronunciation scores provided by human listeners for the non-native speech and an assumed higher score for the native speech. The results of the studies show that several of the features achieve moderately high correlations with the pronunciation scores, supporting their usefulness for automated assessment of non-native speech. The feature with the best performance in the large-scale study was the F2 − F1 distance for IY, which achieved a correlation of 0.78 with pronunciation proficiency scores.
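
The rank correlation described in this abstract can be illustrated with a minimal sketch. It assumes each speaker is summarized by one mean (F1, F2) pair for IY and one human score; the function names and the sample values are invented for illustration and are not taken from the paper, whose exact feature extraction and correlation procedure are not given here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented illustrative values, NOT data from the paper:
# one (F1, F2) pair in Hz for IY per speaker, plus a human score.
iy_formants = [(300, 2300), (350, 2100), (400, 1900), (320, 2250)]
scores = [4, 3, 2, 4]

# The F2 - F1 distance for IY, the feature named in the abstract.
distances = [f2 - f1 for f1, f2 in iy_formants]
rho = spearman(distances, scores)
print(f"Spearman correlation: {rho:.2f}")
```

A rank correlation is a natural fit here because human pronunciation scores are ordinal, so only the ordering of speakers matters, not the spacing between score levels.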

Integrate template matching and statistical modeling for continuous speech recognition

Zhao

Sun

Integrated exemplar-based template matching and statistical modeling for continuous speech recognition

Sun

Zhao

2014

J AUDIO SPEECH MUSIC PROC.

We propose a novel approach of integrating exemplar-based template matching with statistical modeling to improve continuous speech recognition. We choose the template unit to be context-dependent phone segments (triphone context) and use multiple Gaussian mixture model (GMM) indices to represent each frame of speech templates. We investigate two different local distances, log likelihood ratio (LLR) and Kullback-Leibler (KL) divergence, for dynamic time warping (DTW)-based template matching. In order to reduce computation and storage complexities, we also propose two methods for template selection: minimum distance template selection (MDTS) and maximum likelihood template selection (MLTS). We further propose to fine tune the MLTS template representatives by using a GMM merging algorithm so that the GMMs can better represent the frames of the selected template representatives. Experimental results on the TIMIT phone recognition task and a large vocabulary continuous speech recognition (LVCSR) task of telehealth captioning demonstrated that the proposed approach of integrating template matching with statistical modeling significantly improved recognition accuracy over the hidden Markov modeling (HMM) baselines for both TIMIT and telehealth tasks. The template selection methods also provided significant accuracy gains over the HMM baseline while largely reducing the computation and storage complexities. When all templates or MDTS were used, using the LLR local distance gave better performance than the KL local distance. For MLTS and template compression, KL local distance gave better performance than the LLR local distance, and template compression further improved the recognition accuracy on top of MLTS while having less computational cost.
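
The DTW-based template matching with a KL local distance mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes each frame is a plain discrete probability vector rather than the multiple GMM indices the paper uses, and the function names, the smoothing constant `eps`, and the step pattern are assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-10):
    """KL divergence D(p || q) between two discrete distributions.

    eps is an assumed smoothing constant that avoids log(0).
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def dtw_distance(template, test, local_distance=kl_divergence):
    """Accumulated cost of the best DTW alignment of two frame sequences.

    Uses the basic step pattern (insertion, deletion, match) with no
    slope constraints; real recognizers typically add constraints.
    """
    n, m = len(template), len(test)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_distance(template[i - 1], test[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # template frame repeated against test
                cost[i][j - 1],      # test frame repeated against template
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Invented frames for illustration only.
a = [0.9, 0.1]
b = [0.1, 0.9]
template = [a, b]
stretched = [a, a, b]  # same content, slower start
print(dtw_distance(template, stretched))
print(dtw_distance(template, [b, a]))
```

Because DTW lets one frame align with several frames of the other sequence, a template and a time-stretched copy of it align at zero accumulated cost, while a sequence with different content does not.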

Attention-Aware Feature Pyramid Ordinal Hashing for Image Retrieval

Sun

Lü

2019

On the effectiveness of statistical modeling based template matching approach for continuous speech recognition

Sun

Chen

Zhao

2011
