Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.164

Exploring Supervised and Unsupervised Rewards in Machine Translation

Abstract: Reinforcement Learning (RL) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time. When applied to neural Machine Translation (MT), it minimises the mismatch between the cross-entropy loss and non-differentiable evaluation metrics like BLEU. However, the suitability of these metrics as reward function at training time is questionable: they tend to be sparse and biased towards the specific words used in the referen…
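To make the training/test mismatch described in the abstract concrete, below is a minimal self-critical REINFORCE sketch in Python, where sentence-level BLEU serves as the non-differentiable reward. The `model.sample` and `model.greedy_decode` calls are hypothetical stand-ins, not the paper's actual interface.

```python
# Minimal sketch: replace the differentiable cross-entropy objective
# with a policy-gradient loss whose reward is sentence-level BLEU.
import torch
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def reinforce_loss(model, src, ref_tokens):
    # Sample a translation, keeping the log-probability of each token.
    sample_tokens, log_probs = model.sample(src)      # hypothetical API
    # Sentence-level BLEU of the sample against the reference acts as
    # the (non-differentiable) reward.
    smooth = SmoothingFunction().method1
    reward = sentence_bleu([ref_tokens], sample_tokens,
                           smoothing_function=smooth)
    # A greedy decode serves as a simple baseline to reduce variance
    # (self-critical training).
    greedy_tokens = model.greedy_decode(src)          # hypothetical API
    baseline = sentence_bleu([ref_tokens], greedy_tokens,
                             smoothing_function=smooth)
    # REINFORCE: scale the summed log-probabilities by the advantage.
    advantage = reward - baseline
    return -(advantage * log_probs.sum())
```

The greedy baseline is one common variance-reduction choice; the abstract's point is that BLEU used this way is a sparse reward, biased towards the exact words of the reference.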

Cited by 2 publications (3 citation statements)
References 27 publications

Citation statements (ordered by relevance):
“…Finally, we adopt a multilingual approach to automatically analyze the sources of COVID‐19 legislation, limiting the bias associated with the translation of the original texts into English. This decision is based on previous work that has shown limitations in the use of human‐language technologies when it comes to domain‐specific texts (Duboue et al., 2016; Ive et al., 2020; Vieira et al., 2020). To select which of the NLP techniques is best suited to support human coding, we proceed in two steps.…”
Section: From Tool to Problem‐Driven Applications of NLP: Analyzing C…
confidence: 99%
“…More advanced AC models with Q-Learning are rarely applied to language generation problems. However, there are exceptions (e.g., entropy-regularised AC models that promote exploration of actions (Dai et al., 2018; Ive et al., 2021)). This could be explained by the difficulty of approximating the Q-function for a large action space.…”
Section: Reinforcement Learning Algorithms for Neural…
confidence: 99%
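For readers unfamiliar with the entropy-regularised objective this statement refers to, here is a rough Python sketch of an actor loss with an entropy bonus; the tensor shapes, names, and the `beta` weight are illustrative assumptions, not the cited papers' exact formulation.

```python
import torch
import torch.nn.functional as F

def entropy_regularised_ac_loss(logits, actions, q_values, beta=0.01):
    """Actor loss for one decoding step, with an entropy bonus that
    encourages exploration over the (large) vocabulary action space.

    logits:   (batch, vocab) unnormalised scores from the policy
    actions:  (batch,) sampled token ids
    q_values: (batch,) critic estimates Q(s, a) for the sampled tokens
    beta:     entropy-regulariser weight (an assumed value)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    # log pi(a|s) for the sampled actions.
    action_log_probs = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    # Policy-gradient term weighted by the critic's Q estimate.
    pg_term = -(q_values.detach() * action_log_probs).mean()
    # Entropy of the policy; subtracting it promotes exploration.
    entropy = -(probs * log_probs).sum(dim=-1).mean()
    return pg_term - beta * entropy
```

The entropy term counteracts premature collapse onto a few high-probability tokens, which is one reason such models remain hard to train when the action space is the full vocabulary.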
“…Unsupervised Rewards for Language Generation Tasks: Recent work on unsupervised rewards in NLP has explored both dynamic (Ive et al., 2021) and static rewards (Gao et al., 2020; Garg et al., 2021). For example, Ive et al. (2021) introduce a dynamic distribution over latent frequency classes as a reward signal. This distribution is shaped to promote rarer words in the policy search space.…”
Section: Reinforcement Learning Algorithms for Neural…
confidence: 99%
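As a loose illustration (not the authors' exact formulation) of rewarding rarer words via a distribution over frequency classes, one could score a hypothesis by how closely its token-frequency-class histogram matches a target distribution that up-weights rare classes; the class boundaries and target masses below are assumptions.

```python
from collections import Counter

def frequency_class(rank, boundaries=(100, 1000, 10000)):
    """Bucket a token's corpus-frequency rank (0 = most frequent)
    into a frequency class. Boundaries are illustrative."""
    for cls, bound in enumerate(boundaries):
        if rank < bound:
            return cls
    return len(boundaries)

def rare_word_reward(hyp_ranks, target_dist=(0.4, 0.3, 0.2, 0.1)):
    """Reward a hypothesis for matching a target distribution over
    frequency classes that puts mass on rarer words.

    hyp_ranks:   frequency ranks of the tokens in the hypothesis
    target_dist: desired mass per class (an illustrative choice)
    """
    counts = Counter(frequency_class(r) for r in hyp_ranks)
    total = sum(counts.values()) or 1
    # Negative L1 distance between observed and target class
    # distributions; a closer match yields a higher reward.
    return -sum(abs(counts.get(c, 0) / total - target_dist[c])
                for c in range(len(target_dist)))

# A hypothesis using some rarer tokens scores higher (-0.6) than one
# made only of very frequent tokens (-1.2).
print(rare_word_reward([5, 20, 3000, 50000]))
print(rare_word_reward([1, 2, 3, 4]))
```

A static reward would fix `target_dist` in advance; the dynamic variant the statement describes instead adapts the distribution during training.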