David Qiu scite author profile

David Qiu

5 Publications

48 Citation Statements Received

98 Citation Statements Given

Affiliations

Google (United States), MIT Lincoln Laboratory, Massachusetts Institute of Technology

Publications

Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition

Qiu et al. 2021

For various speech-related tasks, confidence scores from a speech recogniser are a useful measure to assess the quality of transcriptions. In traditional hidden Markov model-based automatic speech recognition (ASR) systems, confidence scores can be reliably obtained from word posteriors in decoding lattices. However, for an ASR system with an auto-regressive decoder, such as an attention-based sequence-to-sequence model, computing word posteriors is difficult. An obvious alternative is to use the decoder softmax probability as the model confidence. In this paper, we first examine how some commonly used regularisation methods influence the softmax-based confidence scores and study the overconfident behaviour of end-to-end models. Then we propose a lightweight and effective approach named confidence estimation module (CEM) on top of an existing end-to-end ASR model. Experiments on LibriSpeech show that CEM can mitigate the overconfidence problem and can produce more reliable confidence scores with and without shallow fusion of a language model. Further analysis shows that CEM generalises well to speech from a moderately mismatched domain and can potentially improve downstream tasks such as semi-supervised learning.

Improving The Latency And Quality Of Cascaded Encoders

Sainath, Narayanan, et al. 2022

Learning Word-Level Confidence for Subword End-To-End ASR

Qiu et al. 2021

We study the problem of word-level confidence estimation in subword-based end-to-end (E2E) models for automatic speech recognition (ASR). Although prior works have proposed training auxiliary confidence models for ASR systems, they do not extend naturally to systems that operate on word-pieces (WP) as their vocabulary. In particular, ground truth WP correctness labels are needed for training confidence models, but the non-unique tokenization from word to WP causes inaccurate labels to be generated. This paper proposes and studies two confidence models of increasing complexity to solve this problem. The final model uses self-attention to directly learn word-level confidence without needing subword tokenization, and exploits full context features from multiple hypotheses to improve confidence accuracy. Experiments on Voice Search and long-tail test sets show standard metrics (e.g., NCE, AUC, RMSE) improving substantially. The proposed confidence module also enables a model selection approach to combine an on-device E2E model with a hybrid model on the server to address the rare word recognition problem for the E2E model.

Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition

Qiu et al. 2020

Preprint

Phase and power estimation for per-hop multi-user detection in frequency-hopping systems

Qiu, Royster, Block 2013
