Proceedings of the 22nd Conference on Computational Natural Language Learning 2018
DOI: 10.18653/v1/k18-1033
Learning to Actively Learn Neural Machine Translation

Abstract: Traditional active learning (AL) methods for machine translation (MT) rely on heuristics. However, these heuristics are of limited use when the characteristics of the MT problem change, e.g. with the language pair or the amount of initial bitext. In this paper, we present a framework to learn sentence selection strategies for neural MT. We train the AL query strategy on a high-resource language pair using AL simulations, and then transfer it to the low-resource language pair of interest. The learned query s…
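To make the setting concrete, the following is a minimal sketch of generic pool-based active learning for sentence selection, not the paper's learned policy: a scoring function ranks unlabeled source sentences, and the top-scoring ones are sent for annotation (translation). The names `score_sentence` and `active_learning_round` are hypothetical, and the unseen-word ratio stands in for whatever learned or heuristic query score is actually used.

```python
def score_sentence(sentence, model_state):
    # Hypothetical query score: fraction of words the model has not
    # seen yet (a crude stand-in for a learned uncertainty/utility score).
    vocab = model_state["seen_words"]
    words = sentence.split()
    unseen = sum(1 for w in words if w not in vocab)
    return unseen / max(len(words), 1)

def active_learning_round(pool, model_state, budget):
    # Pool-based selection: rank unlabeled source sentences by the
    # query score and pick the top `budget` for annotation.
    ranked = sorted(pool, key=lambda s: score_sentence(s, model_state),
                    reverse=True)
    selected = ranked[:budget]
    # "Annotate": in a real setup a human would translate these; here we
    # simply fold their words into the model's known vocabulary.
    for s in selected:
        model_state["seen_words"].update(s.split())
    remaining = [s for s in pool if s not in selected]
    return selected, remaining

pool = ["the cat sat", "quantum entanglement theory", "the dog ran"]
state = {"seen_words": {"the", "cat", "sat", "dog", "ran"}}
selected, remaining = active_learning_round(pool, state, budget=1)
# The all-unseen sentence is selected first.
```

Heuristic AL methods fix `score_sentence` in advance; the paper's contribution is to learn it from AL simulations on a high-resource pair and transfer it to a low-resource pair.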

Cited by 30 publications (23 citation statements)
References 22 publications
“…They assume that the classifier is a convolutional neural network and use expected gradient length (Settles et al, 2008) to choose sentences that contain words with the most label-discriminative embeddings. Besides text classification, AL has been applied to neural models for semantic parsing (Duong et al, 2018), named entity recognition (Shen et al, 2018), and machine translation (Liu et al, 2018).…”
Section: Related Work
confidence: 99%
“…Peris and Casacuberta (2018) applied attention based acquisition functions for NMT. Liu et al (2018) introduced reinforcement learning to actively train an NMT model. Wang and Neubig (2019) proposed a method to select relevant sentences from other languages to bring performance gains in low resource NMT.…”
Section: Related Work
confidence: 99%
“…More recent work has explored bandit optimization for scheduling tasks in a multi-task problem (Graves et al, 2017), and reinforcement learning for selecting examples in a co-trained classifier . Finally, Liu et al (2018) apply imitation learning to actively select monolingual training sentences for labeling in NMT, and show that the learned strategy can be transferred to a related language pair.…”
Section: Related Work
confidence: 99%