Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.164

Exploring Supervised and Unsupervised Rewards in Machine Translation

Abstract: Reinforcement Learning (RL) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time. When applied to neural Machine Translation (MT), it minimises the mismatch between the cross-entropy loss and non-differentiable evaluation metrics like BLEU. However, the suitability of these metrics as reward function at training time is questionable: they tend to be sparse and biased towards the specific words used in the referen…
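To make the training/test mismatch described in the abstract concrete, below is a minimal self-critical REINFORCE sketch in Python, where sentence-level BLEU serves as the non-differentiable reward. The `model.sample` and `model.greedy_decode` calls are hypothetical stand-ins, not the paper's actual interface.

```python
# Minimal sketch: replace the differentiable cross-entropy objective
# with a policy-gradient loss whose reward is sentence-level BLEU.
import torch
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def reinforce_loss(model, src, ref_tokens):
    # Sample a translation, keeping the log-probability of each token.
    sample_tokens, log_probs = model.sample(src)      # hypothetical API
    # Sentence-level BLEU of the sample against the reference acts as
    # the (non-differentiable) reward.
    smooth = SmoothingFunction().method1
    reward = sentence_bleu([ref_tokens], sample_tokens,
                           smoothing_function=smooth)
    # A greedy decode serves as a simple baseline to reduce variance
    # (self-critical training).
    greedy_tokens = model.greedy_decode(src)          # hypothetical API
    baseline = sentence_bleu([ref_tokens], greedy_tokens,
                             smoothing_function=smooth)
    # REINFORCE: scale the summed log-probabilities by the advantage.
    advantage = reward - baseline
    return -(advantage * log_probs.sum())
```

The greedy baseline is one common variance-reduction choice; the abstract's point is that BLEU used this way is a sparse reward, biased towards the exact words of the reference.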

Cited by 2 publications (3 citation statements)
References 27 publications

Citation statements (ordered by relevance):
“…Finally, we adopt a multilingual approach to automatically analyze the sources of COVID‐19 legislation, limiting the bias associated with the translation of the original texts into English. This decision is based on previous work that has shown limitations in the use of human‐language technologies when it comes to domain‐specific texts (Duboue et al., 2016; Ive et al., 2020; Vieira et al., 2020). To select which of the NLP techniques is best suited to support human coding, we proceed in two steps.…”
Section: From Tool to Problem‐Driven Applications of NLP: Analyzing C…
confidence: 99%
“…More advanced AC models with Q-Learning are rarely applied to language generation problems. However, there are exceptions (e.g., entropy-regularised AC models that promote exploration of actions (Dai et al., 2018; Ive et al., 2021)). This could be explained by the difficulty of approximating the Q-function for a large action space.…”
Section: Reinforcement Learning Algorithms for Neural…
confidence: 99%
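For readers unfamiliar with the entropy-regularised objective this statement refers to, here is a rough Python sketch of an actor loss with an entropy bonus; the tensor shapes, names, and the `beta` weight are illustrative assumptions, not the cited papers' exact formulation.

```python
import torch
import torch.nn.functional as F

def entropy_regularised_ac_loss(logits, actions, q_values, beta=0.01):
    """Actor loss for one decoding step, with an entropy bonus that
    encourages exploration over the (large) vocabulary action space.

    logits:   (batch, vocab) unnormalised scores from the policy
    actions:  (batch,) sampled token ids
    q_values: (batch,) critic estimates Q(s, a) for the sampled tokens
    beta:     entropy-regulariser weight (an assumed value)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    # log pi(a|s) for the sampled actions.
    action_log_probs = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    # Policy-gradient term weighted by the critic's Q estimate.
    pg_term = -(q_values.detach() * action_log_probs).mean()
    # Entropy of the policy; subtracting it promotes exploration.
    entropy = -(probs * log_probs).sum(dim=-1).mean()
    return pg_term - beta * entropy
```

The entropy term counteracts premature collapse onto a few high-probability tokens, which is one reason such models remain hard to train when the action space is the full vocabulary.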
“…Unsupervised Rewards for Language Generation Tasks: Recent work on unsupervised rewards in NLP has explored both dynamic (Ive et al., 2021) and static rewards (Gao et al., 2020; Garg et al., 2021). For example, Ive et al. (2021) introduce a dynamic distribution over latent frequency classes as a reward signal. This distribution is shaped to promote rarer words in the policy search space.…”
Section: Reinforcement Learning Algorithms for Neural…
confidence: 99%
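As a loose illustration (not the authors' exact formulation) of rewarding rarer words via a distribution over frequency classes, one could score a hypothesis by how closely its token-frequency-class histogram matches a target distribution that up-weights rare classes; the class boundaries and target masses below are assumptions.

```python
from collections import Counter

def frequency_class(rank, boundaries=(100, 1000, 10000)):
    """Bucket a token's corpus-frequency rank (0 = most frequent)
    into a frequency class. Boundaries are illustrative."""
    for cls, bound in enumerate(boundaries):
        if rank < bound:
            return cls
    return len(boundaries)

def rare_word_reward(hyp_ranks, target_dist=(0.4, 0.3, 0.2, 0.1)):
    """Reward a hypothesis for matching a target distribution over
    frequency classes that puts mass on rarer words.

    hyp_ranks:   frequency ranks of the tokens in the hypothesis
    target_dist: desired mass per class (an illustrative choice)
    """
    counts = Counter(frequency_class(r) for r in hyp_ranks)
    total = sum(counts.values()) or 1
    # Negative L1 distance between observed and target class
    # distributions; a closer match yields a higher reward.
    return -sum(abs(counts.get(c, 0) / total - target_dist[c])
                for c in range(len(target_dist)))

# A hypothesis using some rarer tokens scores higher (-0.6) than one
# made only of very frequent tokens (-1.2).
print(rare_word_reward([5, 20, 3000, 50000]))
print(rare_word_reward([1, 2, 3, 4]))
```

A static reward would fix `target_dist` in advance; the dynamic variant the statement describes instead adapts the distribution during training.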