Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1289
STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework

Abstract: Simultaneous translation, which translates sentences before they are finished, is useful in many scenarios but is notoriously difficult due to word-order differences. While the conventional seq-to-seq framework is only suitable for full-sentence translation, we propose a novel prefix-to-prefix framework for simultaneous translation that implicitly learns to anticipate in a single translation model. Within this framework, we present a very simple yet surprisingly effective "wait-k" policy trained to generate the target sentence concurrently with the source sentence, but always k words behind. Experiments show our strategy achieves low latency and reasonable quality (compared to full-sentence translation) on four directions: zh↔en and de↔en.
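To make the abstract's "wait-k" policy concrete, here is a minimal decoding-loop sketch. The `model.predict_next` and `source_stream` names are illustrative assumptions, not the paper's actual code; the point is only the schedule: read k source words first, then alternate one WRITE per READ.

```python
# Minimal sketch of wait-k decoding; `model.predict_next` and `source_stream`
# are assumed stand-ins, not the STACL codebase's API.

def wait_k_decode(model, source_stream, k):
    """Emit target words while the source sentence is still arriving.

    The wait-k policy first reads k source words, then alternates
    WRITE (emit one target word) and READ (consume one source word),
    so the target always trails the source by about k words.
    """
    src_prefix, target = [], []
    source_done = False
    while True:
        # READ until the source prefix is k words ahead of the target.
        while not source_done and len(src_prefix) < len(target) + k:
            word = next(source_stream, None)
            if word is None:
                source_done = True  # source finished: decode the rest as usual
            else:
                src_prefix.append(word)
        # WRITE: predict the next target word from the source *prefix* only.
        next_word = model.predict_next(src_prefix, target)
        if next_word == "</s>":
            return target
        target.append(next_word)
```

Because the model is trained on prefixes rather than full sentences, it learns to anticipate missing source content implicitly, rather than delegating that decision to a separate agent.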

Cited by 143 publications (352 citation statements). References 16 publications (32 reference statements).
“…Our test-time wait-k results are better than those in Ma et al. (2019), because the added source-side <eos> token helps the full-sentence model to learn when the source sentence ends, and it achieves higher BLEU scores when the latency AL is small, which we think will be the most useful scenarios for simultaneous translation. Furthermore, this figure also shows that our model can achieve good performance under different latency conditions by controlling the threshold, so we do not need to train multiple models for different latency requirements.…”
Section: Performance Comparison (mentioning)
confidence: 88%
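The single-model, threshold-controlled behavior this statement describes can be sketched as a simple confidence rule: WRITE when the model's top next-word probability clears a threshold, otherwise READ more source. The names `next_word_distribution` and `rho` are assumptions for illustration, not the cited paper's API.

```python
# Illustrative threshold-based adaptive policy: one trained model, with
# latency tuned at decode time by a single threshold `rho` (assumed name).

def adaptive_decode(model, source_stream, rho):
    src_prefix, target = [], []
    source_done = False
    while True:
        probs = model.next_word_distribution(src_prefix, target)  # assumed helper
        best_word, best_prob = max(probs.items(), key=lambda kv: kv[1])
        if best_prob >= rho or source_done:
            # WRITE: confident enough, or nothing left to read.
            if best_word == "</s>":
                return target
            target.append(best_word)
        else:
            # READ: wait for one more source word before committing.
            word = next(source_stream, None)
            if word is None:
                source_done = True
            else:
                src_prefix.append(word)
```

Sweeping rho at decode time traces a latency-quality curve from a single model: a lower threshold writes sooner (lower AL, riskier guesses), while a higher one waits longer.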
“…Simultaneous translation is widely useful but remains challenging. Previous work falls into two main categories: (a) fixed-latency policies such as Ma et al. (2019) and (b) adaptive policies such as Gu et al. (2017). The former are simple and effective, but have to aggressively predict future content due to diverging source-target word order; the latter do not anticipate, but suffer from unstable and inefficient training.…”
mentioning
confidence: 99%
“…The schedule is simple and fixed and can thus be easily integrated into MT training, as typified by wait-k approaches (Dalvi et al., 2018; Ma et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%
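Because the wait-k schedule is a fixed, known function of the decoding step, it can be baked directly into seq-to-seq training, e.g. as a cross-attention mask over source prefixes. A sketch, with 1-indexed target positions as in the paper's g(t) notation:

```python
def g_wait_k(t, k, src_len):
    """Number of source words read when emitting the t-th target word."""
    return min(k + t - 1, src_len)

def wait_k_attention_mask(tgt_len, src_len, k):
    """mask[t][s] is True iff target position t+1 may attend to source word s+1.

    Training with this mask is the prefix-to-prefix objective: each target
    word is conditioned only on the source prefix available under wait-k,
    with no separate READ/WRITE agent to train.
    """
    return [[s < g_wait_k(t + 1, k, src_len) for s in range(src_len)]
            for t in range(tgt_len)]
```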
“…We extend the recently proposed Average Lagging latency metric (Ma et al., 2018), making it differentiable and calculable in expectation, which allows it to be used as a training objective.…”
Section: Introduction (mentioning)
confidence: 99%
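For reference, the original (non-differentiable) Average Lagging of Ma et al. can be computed from the read schedule g(t) as below; the cited extension, a differentiable version calculable in expectation, is not reproduced here. A sketch assuming g(t) eventually reaches the full source length:

```python
def average_lagging(g, src_len, tgt_len):
    """Average Lagging (AL), following Ma et al.'s definition.

    g[t-1] is the number of source words read when the t-th target word
    was emitted. AL averages, up to the first target word emitted with the
    full source in view, how far the system lags behind an ideal translator
    that stays perfectly in sync with the speaker.
    """
    gamma = tgt_len / src_len  # target-to-source length ratio
    # tau: first target step whose emission had seen the whole source
    tau = next(t for t in range(1, tgt_len + 1) if g[t - 1] >= src_len)
    return sum(g[t - 1] - (t - 1) / gamma for t in range(1, tau + 1)) / tau

# Sanity check: for wait-k on equal-length sentences, AL comes out to k.
n, k = 10, 3
schedule = [min(k + t - 1, n) for t in range(1, n + 1)]
assert average_lagging(schedule, n, n) == k
```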