2019
DOI: 10.1109/taslp.2018.2888814

Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment

Article: Deena, S. (orcid.org/0000-0001-5417-0556), Hasan, M., Doulaty, M. et al. (2 more authors) (2019) Recurrent neural network language model adaptation for multi-genre broadcast speech recognition and alignment. IEEE/ACM Transactions on Audio, Speech and Language Processing, 27 (3).

Abstract: Recurrent neural network language models (RNNLMs) generally outperform n-gram language models when used in automatic speech recognition. Adapting RNNLMs to new domains is an open problem and current approaches can be cate…
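As a minimal sketch of the feature-based adaptation idea the abstract alludes to, the snippet below shows an RNNLM that concatenates an auxiliary utterance-level vector (e.g. a genre one-hot or topic posterior) to the word embedding at every time step. This is an illustrative PyTorch implementation under assumed dimensions, not the paper's actual model; all names are hypothetical.

```python
import torch
import torch.nn as nn

class AdaptedRNNLM(nn.Module):
    """Sketch: RNNLM with an auxiliary feature input for adaptation.

    The auxiliary vector is tiled across time and concatenated with the
    word embedding, one common feature-based RNNLM adaptation scheme.
    """

    def __init__(self, vocab_size, embed_dim=256, aux_dim=8, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim + aux_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, aux):
        # words: (batch, seq_len) token ids
        # aux:   (batch, aux_dim) utterance-level feature vector
        emb = self.embed(words)                               # (B, T, E)
        aux_t = aux.unsqueeze(1).expand(-1, emb.size(1), -1)  # (B, T, A)
        h, _ = self.rnn(torch.cat([emb, aux_t], dim=-1))      # (B, T, H)
        return self.out(h)                                    # (B, T, V) logits
```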

Cited by 25 publications (13 citation statements) | References 35 publications
“…Neural network language modelling [270] has become state-of-the-art, in particular recurrent neural network language models (RNNLMs) [271]. There has been a range of work on adaptation of RNNLMs, including the use of topic or genre information as auxiliary features [272], [273] or combined as marginal distributions [274], domain specific embeddings [275], and the use of curriculum learning and fine-tuning to take account of shifting contexts [276], [277]. Approaches based on acoustic model adaptation, such as LHUC [277] and LHN [273], have also been explored.…”
Section: Language Model Adaptation
confidence: 99%
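The statement above mentions LHUC (Learning Hidden Unit Contributions), an acoustic-model adaptation technique carried over to RNNLMs. Below is a minimal sketch of an LHUC layer: each hidden unit gets a learnable amplitude, and during adaptation the base model is frozen while only these amplitudes are trained on target-domain data. This is an assumed, generic formulation rather than the cited papers' exact implementation.

```python
import torch
import torch.nn as nn

class LHUCLayer(nn.Module):
    """Sketch: Learning Hidden Unit Contributions (LHUC).

    Each hidden unit is rescaled by 2*sigmoid(a), which lies in (0, 2)
    and equals 1 at initialisation (a = 0), so the base model starts
    unchanged. For adaptation, freeze all other parameters and update
    only `a` on target-domain text.
    """

    def __init__(self, hidden_dim):
        super().__init__()
        self.a = nn.Parameter(torch.zeros(hidden_dim))  # identity at init

    def forward(self, h):
        return h * (2.0 * torch.sigmoid(self.a))
```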
“…The decoding process of the LAS model, which is output-based, differs from that of the weighted finite-state transducer (WFST), which is frame-based [31]-[33]. When the Speller starts decoding, the first output token is <sos>, which marks the beginning of the sentence.…”
Section: B. Decoding Strategy
confidence: 99%
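To make the decoding strategy described above concrete, here is a hedged sketch of a greedy decoding loop for a LAS-style Speller, starting from the <sos> token. The `speller(prev_tokens, listener_states)` interface is hypothetical (assumed to return next-token logits); it is not the API of any specific implementation.

```python
import torch

def greedy_decode(speller, listener_states, sos_id, eos_id, max_len=100):
    """Sketch: greedy output-based decoding for a LAS-style model.

    Starts from <sos> and emits one token per step, conditioned on the
    encoder ("Listener") states, until <eos> or max_len is reached.
    """
    tokens = [sos_id]
    for _ in range(max_len):
        prev = torch.tensor([tokens])            # (1, t) decoding history
        logits = speller(prev, listener_states)  # (1, vocab) next-token scores
        nxt = int(logits.argmax(dim=-1))
        if nxt == eos_id:
            break
        tokens.append(nxt)
    return tokens[1:]  # strip the leading <sos>
```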
“…The idea is that aLDA captures acoustic similarities in the data, while tLDA can further help with the similarity of the linguistic content. tLDA has already been shown to improve classification accuracy in LDA-based acoustic information retrieval [5] as well as in language modelling tasks [15,16,17,18,19]. Training tLDA models followed a similar procedure to aLDA, and a comparable number of latent topics and vocabulary size were used.…”
Section: Combining Text LDA With ALDA
confidence: 99%
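The tLDA features described above are per-document topic posteriors derived from text. A minimal sketch of how such features could be extracted, using scikit-learn's LDA on bag-of-words counts, is shown below; the hyperparameters and function name are illustrative, and this is not the cited work's training procedure.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def lda_topic_features(train_docs, test_docs, n_topics=20):
    """Sketch: derive tLDA-style topic posteriors from raw text.

    Fits LDA on bag-of-words counts from training documents and returns
    per-document topic distributions, which can serve as auxiliary
    features for language model adaptation.
    """
    vec = CountVectorizer(max_features=10000, stop_words="english")
    X_train = vec.fit_transform(train_docs)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    train_feats = lda.fit_transform(X_train)              # (n_docs, n_topics)
    test_feats = lda.transform(vec.transform(test_docs))  # posteriors for held-out text
    return train_feats, test_feats
```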