Proceedings of the Sixth International Workshop on Computational Linguistics of Uralic Languages 2020
DOI: 10.18653/v1/2020.iwclul-1.5

Towards a Speech Recognizer for Komi, an Endangered and Low-Resource Uralic Language

Abstract: In this paper, we present and evaluate a first-pass speech recognition model for Komi, an endangered and low-resource Uralic language spoken in Russia. We compare a transfer learning approach from English with a baseline model trained from scratch using DeepSpeech (an end-to-end ASR model) and evaluate the impact of fine-tuning a language model for correcting the output of the network. We also provide an overview of previous research and perform an error analysis with a focus on the language model and the cha…

Cited by 12 publications (7 citation statements)
References: 7 publications
“…For this reason, such materials are often referred to as "endangered data". It may be possible to solve (or at least reduce) the transcription bottleneck using speech-to-text technology, thus speeding up the process of language documentation (Hjortnaes, Partanen, Rießler and Tyers, 2020; Zahrer, Zgank and Schuppler, 2020). However, in order to reach high accuracy, traditional approaches require large amounts of annotated training data (on the order of thousands of hours (Baevski, Zhou, Mohamed and Auli, 2020)), which is typically not available in a language documentation scenario.…”
Section: Speech Transcription Technology For Language Documentation
Citation type: mentioning
confidence: 99%
“…The use of Transformer-type neural architectures to learn multilingual models of text and speech, combined with methods for specializing (fine-tuning) these generic representations, has opened up the possibility of developing automatic processing approaches for many languages for which only little annotated data exists. This approach is particularly attractive for tasks that support language documentation: developing semi-automatic, or even fully automatic, transcription and annotation methods from a small amount of annotated data would reduce the annotation effort of field linguists, who could then concentrate on tasks that are meaningful in linguistic and human terms (Michaud et al, 2018; Partanen et al, 2020; Prud'hommeaux et al, 2021; Morris et al, 2021).…”
Section: Introduction
Citation type: unclassified
“…Transcription of speech is an important part of language documentation, and yet speech recognition technology has not been widely harnessed to aid linguists. Despite revolutionary progress in the performance of speech recognition systems in the past decade (Hinton et al, 2012; Hannun et al, 2014; Zeyer et al, 2018; Hadian et al, 2018; Ravanelli et al, 2019; Zhou et al, 2020), including in the application to low-resource languages (Besacier et al, 2014; Blokland et al, 2015; Lim et al, 2018; van Esch et al, 2019; Hjortnaes et al, 2020), these advances are yet to play a common role in language documentation workflows. Speech recognition software often requires effective command line skills and a reasonable understanding of the underlying modeling.…”
Section: Introduction
Citation type: mentioning
confidence: 99%