ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp39728.2021.9413719
Cascaded Models with Cyclic Feedback for Direct Speech Translation

Abstract: Direct speech translation describes a scenario where only speech inputs and corresponding translations are available. Such data are notoriously limited. We present a technique that allows cascades of automatic speech recognition (ASR) and machine translation (MT) to exploit in-domain direct speech translation data in addition to out-of-domain MT and ASR data. After pre-training MT and ASR, we use a feedback cycle where the downstream performance of the MT system is used as a signal to improve the ASR system by…
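The abstract describes using the cascade's downstream MT performance as a signal for improving the ASR component. A minimal sketch of that idea, assuming an illustrative scoring metric, threshold, and hypothesis-selection rule (the abstract is truncated, so these details are placeholders, not the authors' method):

```python
# Sketch: score ASR hypotheses by the quality of their downstream
# translations, and keep the ones that score well. The kept transcripts
# could then serve as additional training signal for the ASR model.

def overlap_score(hypothesis, reference):
    """Toy stand-in for an MT quality metric such as BLEU: word-overlap ratio."""
    ref_words = reference.split()
    if not ref_words:
        return 0.0
    return sum(1 for w in hypothesis.split() if w in ref_words) / len(ref_words)

def select_by_downstream_mt(asr_nbest, translate, reference_translation,
                            threshold=0.5):
    """Keep ASR hypotheses whose MT output scores above the threshold
    against the reference translation (the only supervision available in
    the direct speech translation setting)."""
    kept = []
    for transcript in asr_nbest:
        if overlap_score(translate(transcript), reference_translation) >= threshold:
            kept.append(transcript)
    return kept
```

For example, with an identity function standing in for the MT system, a transcript whose "translation" matches the reference is kept while a garbled one is filtered out.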

Cited by 4 publications (2 citation statements)
References 23 publications
“…Inspired by random online backtranslation, we created our version, explained in Algorithm 1, to help our model better utilize the training dataset and the 892 monolingual Bambara sentences from Wikipedia. Our approach, dubbed Cyclic backtranslation (Lam et al., 2021), would theoretically enable the model to leverage the available training and monolingual datasets by compelling the MT model for each direction, at each step k, to learn from a concatenation of the original training dataset, its synthetically generated sentences, and those generated by the MT model of the opposite direction in the previous step. Despite its potential benefits, implementing backtranslation presented several challenges.…”
Section: Team Alpha
confidence: 99%
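The data composition the quote describes can be sketched as follows. This is a hedged illustration, assuming toy word-substitution "models" and simple (source, target) pair lists; the cited Algorithm 1 may differ in its details:

```python
# Sketch of one direction's step-k training data in cyclic backtranslation:
# original parallel data + its own synthetic pairs + the opposite
# direction's synthetic pairs from step k-1.

def back_translate(model, monolingual):
    """Generate synthetic source sides with a toy word-level 'model' (dict)."""
    return [" ".join(model.get(w, w) for w in sent.split())
            for sent in monolingual]

def training_set(original, own_synthetic, opposite_prev_synthetic):
    """Concatenate the three data sources the quote names; the opposite
    direction's pairs are flipped into this direction's orientation."""
    flipped = [(tgt, src) for src, tgt in opposite_prev_synthetic]
    return original + own_synthetic + flipped
```

Each direction's model would then be retrained on its concatenated set, and its synthetic pairs handed to the opposite direction at the next step, closing the cycle.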
“…Speech translation can be broadly categorized into cascade systems and end-to-end speech translation (E2E ST). A cascade system (Sperber et al., 2017; Lam et al., 2021) usually combines automatic speech recognition (ASR) and machine translation (MT). The MT subsystem uses ASR transcripts as input, which provide a clear textual representation but may contain errors stemming from ASR.…”
Section: Introduction
confidence: 99%