Interspeech 2022
DOI: 10.21437/interspeech.2022-11318

Low Resource Comparison of Attention-based and Hybrid ASR Exploiting wav2vec 2.0

Abstract: Low resource speech recognition can potentially benefit a lot from exploiting a pretrained model such as wav2vec 2.0. These pretrained models have learned useful representations in an unsupervised or self-supervised task, often leveraging a very large corpus of untranscribed speech. The pretrained models can then be used in various ways. In this work we compare two approaches which exploit wav2vec 2.0: an attention-based end-to-end model (AED), where the wav2vec 2.0 model is used in the model encoder, and a hybrid…
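The two uses of wav2vec 2.0 that the abstract contrasts — as the encoder of an AED model versus as a feature extractor for a hybrid HMM/DNN system — can be sketched schematically. The following is a minimal illustration only, not the paper's implementation: the "pretrained encoder" here is a hypothetical frozen random projection standing in for a real wav2vec 2.0 Transformer, and all dimensions and heads are made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def pretrained_encoder(waveform, frame_len=320, dim=16):
    # Stand-in for a frozen pretrained wav2vec 2.0 encoder: the real model
    # downsamples raw 16 kHz audio roughly 320x into frame-level vectors.
    # Here a fixed random projection just illustrates the data flow.
    n_frames = len(waveform) // frame_len
    frames = waveform[: n_frames * frame_len].reshape(n_frames, frame_len)
    W = rng.standard_normal((frame_len, dim)) * 0.01  # "pretrained" weights
    return frames @ W  # (n_frames, dim) representations shared by both systems

def aed_decoder_step(enc_out, query):
    # AED route: an attention-based decoder attends over the encoder frames
    # at each output step; this returns one context vector it conditions on.
    scores = enc_out @ query                 # (n_frames,) attention scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax attention weights
    return weights @ enc_out                 # (dim,) context vector

def hybrid_frame_scores(enc_out, n_senones=8):
    # Hybrid route: a frame-wise classifier over HMM states; the resulting
    # log-probabilities would feed an HMM decoder with a lexicon and LM.
    W = rng.standard_normal((enc_out.shape[1], n_senones)) * 0.1
    logits = enc_out @ W
    return logits - np.logaddexp.reduce(logits, axis=1, keepdims=True)

audio = rng.standard_normal(16000)           # 1 s of fake 16 kHz audio
reps = pretrained_encoder(audio)             # shared pretrained representations
context = aed_decoder_step(reps, rng.standard_normal(reps.shape[1]))
frame_logp = hybrid_frame_scores(reps)
print(reps.shape, context.shape, frame_logp.shape)
```

The structural difference the sketch highlights: the AED decoder consumes the representations through attention one output token at a time, while the hybrid system scores every frame independently and leaves sequence modeling to the HMM.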


Cited by 5 publications (4 citation statements)
References 21 publications
“…Previously, we applied wav2vec 2.0 pretrained Transformers to North Sámi [1]. We found that the hidden Markov model / deep neural network (HMM/DNN) approach outperformed the attention-based encoder-decoder (AED) approach, and thus are continuing our work here focusing on HMM/DNN-systems.…”
Section: Introduction
confidence: 82%
“…Here, we are able to leverage an additional North Sámi text resource. This resource, called Freecorpus (FC), consists of freely available texts, collected by Giellatekno and Divvun 1 .…”
Section: Data
confidence: 99%
“…To the best of our knowledge, this proposed multihead inference is a novel improvement for the HMM/DNN approach, though it resembles an efficient form of model combination. We presented initial results using this approach in [44] and explore it here in more detail. Table I compares the various acoustic model training criteria and output heads used during inference.…”
Section: A Hybrid Hidden Markov Model / Deep Neural Network Systems
confidence: 99%