Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages 2022
DOI: 10.18653/v1/2022.computel-1.21

Fine-tuning pre-trained models for Automatic Speech Recognition, experiments on a fieldwork corpus of Japhug (Trans-Himalayan family)

Abstract: This is a report on results obtained in the development of speech recognition tools intended to support linguistic documentation efforts. The test case is an extensive fieldwork corpus of Japhug, an endangered language of the Trans-Himalayan (Sino-Tibetan) family. The goal is to reduce the transcription workload of field linguists. The method used is a deep learning approach based on the language-specific tuning of a generic pre-trained representation model, XLS-R, using a Transformer architecture. We note dif…
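As a rough, hypothetical illustration of the kind of pipeline the abstract describes (not the authors' exact recipe), the sketch below fine-tunes an XLS-R checkpoint with a freshly initialized CTC head using the Hugging Face transformers API. The symbol inventory, file paths, and hyperparameters are placeholder assumptions; a real run would derive the vocabulary from the transcribed corpus and supply the fieldwork audio.

```python
# Hypothetical sketch: language-specific tuning of pre-trained XLS-R with a
# CTC head, via the Hugging Face `transformers` API. Vocabulary, paths, and
# hyperparameters are illustrative placeholders, not the paper's exact setup.
import json

import torch
from transformers import (
    Trainer,
    TrainingArguments,
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2ForCTC,
    Wav2Vec2Processor,
)

# Placeholder symbol inventory; a real run would build this from the
# target-language transcriptions (e.g., the Japhug phoneme set).
symbols = list("abcdefghijklmnopqrstuvwxyz") + ["|", "[UNK]", "[PAD]"]
with open("vocab.json", "w") as f:
    json.dump({s: i for i, s in enumerate(symbols)}, f)

tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
)
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16_000,
    padding_value=0.0,
    do_normalize=True,
    return_attention_mask=True,
)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Load the generic pre-trained XLS-R encoder and attach a fresh CTC head
# sized to the target vocabulary.
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-xls-r-300m",
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
    vocab_size=len(processor.tokenizer),
)
# Freeze the convolutional feature encoder; only the Transformer layers and
# the new CTC head are tuned on the target language.
model.freeze_feature_encoder()

training_args = TrainingArguments(
    output_dir="xlsr-japhug-demo",  # hypothetical output directory
    per_device_train_batch_size=8,
    learning_rate=3e-4,
    warmup_steps=500,
    num_train_epochs=30,
    fp16=torch.cuda.is_available(),
)

# `train_dataset` would hold {"input_values", "labels"} examples prepared
# with `processor` from the transcribed recordings (data loading not shown):
# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_dataset, data_collator=collate_fn)
# trainer.train()
```

Freezing the convolutional encoder and tuning only the Transformer layers plus the CTC head is the usual low-resource recipe for wav2vec 2.0-family models, which is what makes this approach attractive when only a few hours of transcribed fieldwork audio are available.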

Cited by 5 publications (7 citation statements); references 9 publications.
“…For speech processing, fine-tuning can work not only for language model adaptation [51], [54], but also for tuning acoustic models [52], [53], [55], [56]. Fine-tuning language models in speech processing is the same as its use in NLP.…”
Section: Fine-tuning In Speech Processing
confidence: 99%
“…Fine-tuning language models in speech processing is the same as its use in NLP. Guillaume et al. [54] developed a method using a transformer architecture to tune a generic pre-trained representation model for phonemic recognition. For acoustic model adaptation, Violeta et al. [52] proposed an intermediate fine-tuning step that uses imperfect synthetic speech to close the domain-shift gap between the pre-training and target data.…”
Section: Fine-tuning In Speech Processing
confidence: 99%
“…Automatic Speech Recognition of "minority", "underresourced" languages is not only extremely important for the field of language documentation [7,8,9]: it also raises various scientific challenges [10,11]. Specifically, this area constitutes a particularly interesting test bed for evaluating and analyzing the properties of unsupervised language representations uncovered by neural networks such as wav2vec.…”
Section: Language Documentation: a Task That Presents Major Challenge...
confidence: 99%
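The quoted passage treats fieldwork ASR as a test bed for analyzing the representations that models such as wav2vec learn without supervision. As a hedged illustration of what such an analysis starts from, the sketch below extracts layer-wise hidden states from a pre-trained wav2vec 2.0 checkpoint; the model name and the random stand-in audio are assumptions for the example, not drawn from the cited papers.

```python
# Minimal sketch: extract frame-level representations from a pre-trained
# wav2vec 2.0 model for analysis (e.g., probing what each layer encodes).
# The checkpoint name and the random "audio" are placeholder assumptions.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_name = "facebook/wav2vec2-xls-r-300m"  # any wav2vec 2.0-style checkpoint
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
model = Wav2Vec2Model.from_pretrained(model_name)
model.eval()

# One second of fake 16 kHz audio stands in for a real field recording.
waveform = torch.randn(16_000)
inputs = feature_extractor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# `hidden_states` is a tuple with one (batch, frames, dim) tensor per layer;
# layer-wise comparison of these tensors is the usual starting point for
# representation analysis.
for i, h in enumerate(outputs.hidden_states):
    print(f"layer {i:2d}: {tuple(h.shape)}")
```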
“…Transfer learning through pretrain-finetune is also prevalent in medical imaging (He et al, 2023) for tasks like disease diagnosis and organ segmentation. Additionally, it is widely employed in recommendation systems (Zhang et al, 2023b), speech recognition (Guillaume et al, 2022), and reinforcement learning (Luo et al, 2023).…”
Section: Applications Of Pretrain-finetune Framework
confidence: 99%