Proceedings of the Third Workshop on Automatic Simultaneous Translation 2022
DOI: 10.18653/v1/2022.autosimtrans-1.2
Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

Abstract: Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL). In this paper we highlight that, despite its widespread adoption, AL provides underestimated scores for systems that generate longer predictions compared to the corresponding references. We also show that this problem has practical relevance, as recent SimulST systems have indeed a tendency to over-generate. As a solution, we propose LAAL…
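The abstract's point can be illustrated with a small sketch. The function below is an illustrative reimplementation of AL (Ma et al., 2019) and its length-adaptive variant LAAL (Papi et al., 2022), not the SimulEval code: it assumes per-token emission delays measured in the same units as the source length (words, frames, or milliseconds), and applies the LAAL correction of scaling by max(reference length, hypothesis length) so that over-generated tokens can no longer drive the score down.

```python
def average_lagging(delays, src_len, ref_len, hyp_len=None):
    """Average Lagging over a sequence of emission delays.

    delays[i] is how much source (words, frames, or ms) had been read
    when target token i+1 was emitted.  With hyp_len given, the scaling
    factor gamma uses max(ref_len, hyp_len) instead of ref_len, which is
    the LAAL correction: longer-than-reference outputs no longer shrink
    the reported lagging.
    """
    tgt_len = ref_len if hyp_len is None else max(ref_len, hyp_len)
    gamma = tgt_len / src_len
    # tau: first target position emitted after the whole source was consumed
    tau = next((i for i, d in enumerate(delays, 1) if d >= src_len),
               len(delays))
    return sum(d - (i - 1) / gamma
               for i, d in enumerate(delays[:tau], 1)) / tau


# An over-generating system: 4 source words, 4-word reference,
# but an 8-token hypothesis (the last 5 tokens emitted after the full source).
delays = [1, 2, 3, 4, 4, 4, 4, 4]
al = average_lagging(delays, src_len=4, ref_len=4)               # 1.0
laal = average_lagging(delays, src_len=4, ref_len=4, hyp_len=8)  # 1.75
```

In this toy example the reference-scaled AL (1.0) is lower than LAAL (1.75): the inflated hypothesis length would otherwise be rewarded with a smaller lagging score. LAAL is never smaller than AL and coincides with it whenever the hypothesis is not longer than the reference.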

Cited by 8 publications (6 citation statements). References 24 publications (42 reference statements).
“…• Average Lagging (AL; Ma et al., 2019) • Length-Adaptive Average Lagging (LAAL; Polák et al., 2022; Papi et al., 2022) • Average Token Delay (ATD; ) • Average Proportion (AP; Cho and Esipova, 2016) • Differentiable Average Lagging (DAL; Cherry and Foster, 2019). We also measured the computation-aware version of the latency metrics, as described by . However, due to the new synchronized SimulEval agent pipeline design, the actual computation-aware latency can be smaller with carefully designed parallelism.…”
Section: Discussion
confidence: 99%
“…All models are evaluated using the SimulEval [19] toolkit. For translation quality, we report detokenized case-sensitive BLEU [37], and for latency, we report length-adaptive average lagging (LAAL) [7,38]. In all our experiments, we use beam search with a beam size of 6.…”
Section: Models
confidence: 99%
“…This not only affects the resulting quality but also negatively impacts the reliability of the AL latency evaluation. Therefore, we proposed an improved version of the AL metric, which was later independently proposed under the name length-adaptive average lagging (LAAL; Papi et al., 2022). To remedy the over-generation problem, we proposed an improved version of the beam search algorithm in Polák et al. (2023b).…”
Section: Quality-Latency Tradeoff in SST
confidence: 99%