Multi-Task Language Modeling for Improving Speech Recognition of Rare Words

Yang, Chao-Han Huck; Liu, Linda; Gandhe, Ankur; Gu, Yile; Raju, Anirudh; Filimonov, Denis; Bulyko, Ivan

doi:10.1109/asru51503.2021.9688282

Cited by 11 publications

(2 citation statements)

References 26 publications


“…[56] conducted research on the possibility of directly attaching an audio encoder to LLMs to convert them into automatic speech recognition (ASR) systems, allowing their text counterparts to be used in the exact same manner. It has also been confirmed that when combining LLM with prompt engineering and fine-tuning, they can function as post-recognition processors for speech, conducting revision and error adjustment [57]. Topic 3 was labelled as "Research on tuning LLM for improving the efficiency" through "high," "time," "parameter," "structure," "simulation," and "flow."…”

Section: Results Of the Topic Analysis For Web Of Science (mentioning)

confidence: 96%

Expansive data, extensive model: Investigating discussion topics around LLM through unsupervised machine learning in academic papers and news

Jung, Lee, Woo et al. 2024

PLoS ONE


This study presents a comprehensive exploration of topic modeling methods tailored for large language models (LLMs) using data obtained from Web of Science and LexisNexis from June 1, 2020, to December 31, 2023. The data collection process involved queries focusing on LLMs, including “Large language model,” “LLM,” and “ChatGPT.” Various topic modeling approaches were evaluated based on performance metrics, including diversity and coherence. Latent Dirichlet allocation (LDA), nonnegative matrix factorization (NMF), combined topic models (CTM), and bidirectional encoder representations from Transformers topic (BERTopic) were employed for performance evaluation. Evaluation metrics were computed across platforms, with BERTopic demonstrating superior performance in diversity and coherence across both LexisNexis and Web of Science. The experimental results reveal that news articles maintain a balanced coverage across various topics and mainly focus on efforts to utilize LLMs in specialized domains. Conversely, research papers are more concise and concentrated on the technology itself, emphasizing technical aspects. The insights gained in this study make it possible to investigate the future path of LLMs and the challenges they should tackle. Additionally, they could offer considerable value to enterprises that utilize LLMs to deliver services.


“…End-to-end automatic speech recognition [1,2] (ASR) has many applications in human society, empowering voice-based intelligent control, spoken language understanding [3], on-device services [4], and web-based speech interactions [5]. These high-performance speech applications benefit from neural network-based ASR systems that have highly accurate on-device performance with fixed model parameters.…”

Section: Introduction (mentioning)

confidence: 99%

Mitigating Closed-model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition

Yang, Ahmed, Gu et al. 2022

Preprint

Self Cite


In this work, we aim to enhance the system robustness of end-to-end automatic speech recognition (ASR) against adversarially-noisy speech examples. We focus on a rigorous and empirical "closed-model adversarial robustness" setting (e.g., on-device or cloud applications). The adversarial noise is only generated by closed-model optimization (e.g., evolutionary and zeroth-order estimation) without accessing gradient information of a targeted ASR model directly. We propose an advanced Bayesian neural network (BNN) based adversarial detector, which could model latent distributions against adaptive adversarial perturbation with divergence measurement. We further simulate deployment scenarios of RNN Transducer, Conformer, and wav2vec-2.0 based ASR systems with the proposed adversarial detection system. Leveraging the proposed BNN based detection system, we improve detection rate by +2.77 to +5.42% (relative +3.03 to +6.26%) and reduce the word error rate by 5.02 to 7.47% on LibriSpeech datasets compared to the current model enhancement methods against the adversarial speech examples.


Low-Rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition

Yu, Yang, Kolehmainen et al. 2023

2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
