2020
DOI: 10.1609/aaai.v34i04.6097

Transductive Ensemble Learning for Neural Machine Translation

Abstract: Ensemble learning, which aggregates multiple diverse models for inference, is a common practice to improve the accuracy of machine learning tasks. However, it has been observed that conventional ensemble methods bring only marginal improvement for neural machine translation (NMT) when the individual models are strong or when there are a large number of individual models. In this paper, we study how to effectively aggregate multiple NMT models under the transductive setting, where the source sentences of the test set…
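To make the transductive setting concrete, below is a minimal Python sketch of the TEL recipe outlined in the abstract: every individual model decodes the test-set sources, the resulting (source, hypothesis) pairs form a synthetic corpus, and each model is fine-tuned on that corpus before final inference. The NMTModel interface and its translate/finetune/bleu methods are hypothetical placeholders, and the best-on-dev selection in the last step is an assumption for illustration, not necessarily the paper's exact procedure.

from typing import List, Protocol, Sequence, Tuple

class NMTModel(Protocol):
    # Assumed interface for an individual NMT model (hypothetical).
    def translate(self, src: str) -> str: ...
    def finetune(self, pairs: Sequence[Tuple[str, str]]) -> "NMTModel": ...
    def bleu(self, dev: Sequence[Tuple[str, str]]) -> float: ...

def transductive_ensemble(models: List[NMTModel],
                          test_sources: Sequence[str],
                          dev_set: Sequence[Tuple[str, str]]) -> List[str]:
    # 1) Each individual model translates the test-set source sentences,
    #    yielding a synthetic parallel corpus of (source, hypothesis) pairs.
    synthetic = [(s, m.translate(s)) for m in models for s in test_sources]
    # 2) Fine-tune every individual model on the synthetic corpus so each
    #    one absorbs the collective knowledge of the whole ensemble.
    finetuned = [m.finetune(synthetic) for m in models]
    # 3) Decode the test set with the fine-tuned model that scores best on
    #    a held-out dev set (this selection step is an assumption).
    best = max(finetuned, key=lambda m: m.bleu(dev_set))
    return [best.translate(s) for s in test_sources]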


Cited by 14 publications (7 citation statements) | References 13 publications
“…Ensemble Learning Ensemble learning has been applied across many applications and scenarios to enhance learning performance. In language translation tasks, transductive ensemble learning (TEL) was proposed to overcome the merely marginal accuracy improvements of traditional ensemble algorithms [21]. In zero-shot learning scenarios, multi-patch generative adversarial nets (MPGAN) with a novel weighted voting strategy were also proposed to improve on current ensemble learning algorithms [4].…”
Section: Related Work
confidence: 99%
“…There have been numerous works applying ensemble/knowledge distillation (Hinton et al., 2015) to machine translation (Kim and Rush, 2016; Freitag et al., 2017; Nguyen et al., 2020; Wang et al., 2020), dependency parsing (Kuncoro et al., 2016), and question answering (Mun et al., 2018; Ze et al., 2020; You et al., 2021; Chen et al., 2012). Regarding ensembling AMR graphs, Barzdins and Gosko (2016) propose choosing the AMR with the highest average sentence-level Smatch against all other AMRs.…”
Section: Related Work
confidence: 99%
“…In this paper, we use Transductive Ensemble Learning (TEL) [23] to aggregate multiple individual models for better performance. Note that TEL is applied under the transductive setting, i.e., the model can observe the input sentences of the test set.…”
Section: Combining Improvements
confidence: 99%
“…At last, we insert a constituent attention (CA) module [22] into the Transformer encoder, which adds an extra constraint on the attention heads to follow tree structures, better capturing the inherent dependency structure of input sentences. We also aggregate multiple models of these methods for inference, following transductive ensemble learning (TEL) [23].…”
Section: Introduction
confidence: 99%