ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp40776.2020.9054585

Re-Translation Strategies for Long Form, Simultaneous, Spoken Language Translation

Abstract: There has been great progress in improving streaming machine translation, a simultaneous paradigm where the system appends to a growing hypothesis as more source content becomes available. We study a related problem in which revisions to the hypothesis beyond strictly appending words are permitted. This is suitable for applications such as live captioning an audio feed. In this setting, we compare custom streaming approaches to re-translation, a straightforward strategy where each new source token triggers a d…
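The re-translation strategy the abstract describes is easy to state in code. Below is a minimal, hypothetical Python sketch (the `translate` placeholder and the streaming loop are assumptions, not the paper's actual system): every newly arrived source token triggers a fresh translation of the entire source prefix, and the new hypothesis simply overwrites whatever was displayed before.

```python
# Minimal sketch of re-translation for simultaneous MT (illustrative only).
# `translate` stands in for any full-sentence MT system; it is an assumption,
# not the system evaluated in the paper.

def translate(source_tokens):
    # Placeholder: a real system would run an NMT model on the prefix.
    return " ".join(source_tokens).upper()

def retranslation_stream(source_token_stream):
    """Yield a full hypothesis after every new source token.

    Each hypothesis may revise (overwrite) the previous one, unlike
    append-only streaming decoders.
    """
    prefix = []
    for token in source_token_stream:
        prefix.append(token)
        # Re-translate the entire prefix from scratch and emit it,
        # replacing the previously displayed hypothesis.
        yield translate(prefix)

# Example: every emitted hypothesis replaces the one before it.
for hyp in retranslation_stream(["hello", "world", "again"]):
    print(hyp)
```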

Cited by 45 publications (41 citation statements)
References 15 publications
“…Context-aware ST extends sentence-level ST towards streaming ST, which allows models to access unlimited previous audio input. Instead of improving contextual modeling, many studies on streaming ST aim at developing better sentence/word segmentation policies to avoid the segmentation errors that greatly hurt translation (Matusov et al., 2007; Rangarajan Sridhar et al., 2013; Iranzo-Sánchez et al., 2020; Arivazhagan et al., 2020b). Very recently, Ma et al. (2020b) proposed a memory-augmented Transformer encoder for streaming ST, where previous audio features are summarized into a growing continuous memory to improve the model's context awareness.…”
Section: Related Work
Mentioning confidence: 99%
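As a rough illustration of the "growing continuous memory" idea in that excerpt, the sketch below maintains a bounded bank of summary vectors over past audio segments. The class name, mean-pooling rule, and slot-merging policy are all assumptions made for illustration; the actual model of Ma et al. (2020b) learns this summarization inside a Transformer encoder.

```python
import numpy as np

# Hypothetical sketch of a bounded "continuous memory" over streamed audio.
# Names and the pooling rule are illustrative assumptions, not the model's API.

class ContinuousMemory:
    def __init__(self, dim, max_slots=16):
        self.dim = dim
        self.max_slots = max_slots
        self.slots = []  # each slot summarizes one processed audio segment

    def update(self, segment_features):
        # Summarize the new segment (frames x dim) into one vector by
        # mean pooling, then append it as a new memory slot.
        self.slots.append(segment_features.mean(axis=0))
        # Keep the memory bounded by merging the two oldest slots.
        if len(self.slots) > self.max_slots:
            merged = (self.slots[0] + self.slots[1]) / 2.0
            self.slots = [merged] + self.slots[2:]

    def as_context(self):
        # An encoder would attend over these slots in addition to the
        # current segment's features.
        return np.stack(self.slots) if self.slots else np.zeros((0, self.dim))

mem = ContinuousMemory(dim=4)
for _ in range(20):
    mem.update(np.random.randn(50, 4))  # 50 frames of 4-dim features
print(mem.as_context().shape)  # stays bounded: (16, 4)
```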
“…Another type is re-translation based on dynamic refreshing, which requires no adjustment of the machine translation model: whenever the input grows, all of the input is translated again and the previously generated output is overwritten (Niehues et al., 2016; Arivazhagan et al., 2020b; Arivazhagan et al., 2020a).…”
Section: Related Work
Mentioning confidence: 99%
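This dynamic-refresh (re-translation) scheme trades simplicity for instability: the displayed output can flicker as new hypotheses overwrite old ones. One simple, hypothetical way to quantify that cost, loosely in the spirit of the erasure-style stability measures used in the re-translation literature, is to count how many displayed tokens are deleted at each refresh:

```python
# Hypothetical helper quantifying the flicker cost of dynamic refreshing:
# how many displayed target tokens must be erased when a new hypothesis
# overwrites the old one. This exact function is an illustration, not the
# paper's metric implementation.

def erased_tokens(prev_hyp, new_hyp):
    """Count tokens of prev_hyp deleted when new_hyp replaces it."""
    common = 0
    for a, b in zip(prev_hyp, new_hyp):
        if a != b:
            break
        common += 1
    # Everything after the longest common prefix is overwritten.
    return len(prev_hyp) - common

print(erased_tokens(["guten", "Tag", "Welt"], ["guten", "Morgen", "Welt"]))  # 2
```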
“…MT Wrapper has a parameter to control stability and latency. It can mask the last k words of incomplete sentences from the ASR output, as in Ma et al. (2019) and Arivazhagan et al. (2019), considering only the currently completed sentences, or only the "stable" sentences, which are beyond the ASR and punctuator processing window and never change. We do not tune these parameters in the validation.…”
Section: MT Wrapper
Mentioning confidence: 99%
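The masking heuristic this excerpt refers to can be illustrated in a few lines. The sketch below is a hedged illustration, not the MT Wrapper's actual API: for a still-incomplete sentence, the last k words of the hypothesis are withheld from display, so suffixes that are likely to change never flicker in front of the user.

```python
# Hedged sketch of the mask-k stability heuristic (as in Ma et al., 2019;
# Arivazhagan et al., 2019). Function name and interface are assumptions.

def mask_k(hypothesis_tokens, k, sentence_complete):
    """Return the displayable prefix of a streaming MT hypothesis."""
    if sentence_complete or k == 0:
        return hypothesis_tokens  # finalized text is shown in full
    # For incomplete sentences, withhold the k most recent words.
    return hypothesis_tokens[:max(0, len(hypothesis_tokens) - k)]

# Larger k -> more stability (fewer visible revisions) but higher latency.
print(mask_k(["the", "cat", "sat", "on", "the"], k=2, sentence_complete=False))
# ['the', 'cat', 'sat']
```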