The 2010 CMU GALE speech-to-text system

Metze, Florian; Hsiao, Roger; Qin, Jin; Nallasamy, Udhyakumar; Schultz, Tanja

doi:10.21437/interspeech.2010-439

Cited by 10 publications

References 19 publications

(18 reference statements)


Towards single pass discriminative training for speech recognition

Hsiao

Schultz

2012

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self Cite


This paper describes how we can combine our previously proposed fast extended Baum-Welch algorithm and generalized discriminative feature transformation to achieve single pass discriminative training, in which we process the data only once. Compared to the state-of-the-art training procedure, which uses feature space maximum mutual information (fMMI) and boosted maximum mutual information (BMMI), our proposed training procedure can achieve around 80% of the improvement available from discriminative training. We also show that if we are allowed to process the data twice, it is possible to achieve almost all of the improvement. We evaluate different training procedures on various large scale tasks using Iraqi and Modern Standard Arabic speech recognition systems.


Active learning for accent adaptation in Automatic Speech Recognition

Nallasamy

Metze

Schultz

2012

2012 IEEE Spoken Language Technology Workshop (SLT)


We introduce a novel active learning algorithm for speech recognition in the context of accent adaptation. We adapt a source recognizer on the target accent by selecting a matched subset of utterances from a large, untranscribed and multiaccented corpus for human transcription. Traditionally, active learning in speech recognition has relied on uncertainty-based sampling to choose the most informative samples for manual labeling. Such an approach does not include an explicit relevance criterion during data selection, which is crucial for choosing utterances that match the target accent from datasets with wide-ranging speakers of different accents. We formulate a cross-entropy based relevance measure to complement uncertainty-based sampling for active learning to aid accent adaptation. We evaluate the algorithm on two different setups for Arabic and English accents and show that our approach compares favorably with conventional data selection. We analyze the results to show the effectiveness of our approach in finding the most relevant subset of utterances for improving the speech recognizer on the target accent.
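
The idea of combining an uncertainty score with a relevance score can be illustrated with a small sketch. The abstract does not give the paper's formula, so everything below is an assumption made for illustration: relevance is taken as a cross-entropy difference between two add-alpha smoothed unigram word models (one built from target-accent text, one from source text), uncertainty is one minus a recognizer confidence, and the two are mixed with a hypothetical `weight` parameter. The function names are invented here and are not from the paper.

```python
import math
from collections import Counter

UNK = "<unk>"

def unigram_model(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over vocab (plus UNK)."""
    words = set(vocab) | {UNK}
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts[w] for w in words) + alpha * len(words)
    return {w: (counts[w] + alpha) / total for w in words}

def cross_entropy(words, model):
    """Average negative log-probability per word under the model."""
    return -sum(math.log(model.get(w, model[UNK])) for w in words) / len(words)

def relevance(hypothesis, target_model, source_model):
    """Cross-entropy difference: positive when the hypothesis looks more
    like the target data than the source data (illustrative assumption)."""
    words = hypothesis.split()
    return cross_entropy(words, source_model) - cross_entropy(words, target_model)

def select(utterances, target_model, source_model, k, weight=0.5):
    """Rank (utt_id, hypothesis, confidence) triples by a weighted sum of
    uncertainty and relevance, and return the ids of the top k.
    Note: the two terms are on different scales; a real system would
    likely normalize them before mixing."""
    scored = []
    for utt_id, hypothesis, confidence in utterances:
        uncertainty = 1.0 - confidence
        rel = relevance(hypothesis, target_model, source_model)
        scored.append((weight * uncertainty + (1.0 - weight) * rel, utt_id))
    scored.sort(reverse=True)
    return [utt_id for _, utt_id in scored[:k]]
```

With `weight=1.0` this reduces to plain uncertainty-based sampling, which makes it easy to compare the two selection strategies on the same pool of utterances.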


Analysis of Dialectal Influence in Pan-Arabic ASR

Nallasamy

Garbus

Metze

et al. 2011

