Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.92

The Curious Case of Hallucinations in Neural Machine Translation

Abstract: In this work, we study hallucinations in Neural Machine Translation (NMT), which lie at an extreme end on the spectrum of NMT pathologies. Firstly, we connect the phenomenon of hallucinations under source perturbation to the Long-Tail theory of Feldman (2020), and present an empirically validated hypothesis that explains hallucinations under source perturbation. Secondly, we consider hallucinations under corpus-level noise (without any source perturbation) and demonstrate that two prominent types of natural hallucinations…

Cited by 75 publications (99 citation statements) | References 24 publications
“…Hallucinations. To estimate the number of hallucinations produced by the systems evaluated, we follow the procedure proposed by and used by Raunak et al. (2021). Although their interest was in detecting the sentences that induced the generation of hallucinations after spurious tokens were introduced into the input, we adapted it to automatically measure the number of input sentences in a test set for which the corresponding output appears to be a hallucination.…”
Section: Explainability
Mentioning confidence: 99%
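The perturbation-based procedure quoted above lends itself to a short sketch. The following is a minimal illustration rather than the cited implementation: `translate` is a hypothetical stand-in for any NMT system, the spurious tokens and the BLEU threshold are illustrative assumptions, and sacrebleu's sentence-level BLEU stands in for whatever overlap measure the original procedure uses.

```python
# Hedged sketch of a perturbation-based hallucination check in the spirit
# of Raunak et al. (2021). Tokens and threshold are illustrative only.
from sacrebleu import sentence_bleu

PERTURBATION_TOKENS = ["qz", "#", "0"]  # illustrative spurious tokens
BLEU_THRESHOLD = 3.0                    # illustrative: near-zero overlap

def is_hallucination_under_perturbation(src: str, translate) -> bool:
    """Flag `src` if inserting a spurious token makes the output diverge
    almost completely from the unperturbed translation."""
    base_hyp = translate(src)  # `translate` is a hypothetical NMT callable
    for tok in PERTURBATION_TOKENS:
        perturbed_hyp = translate(f"{tok} {src}")
        # Very low overlap with the original output suggests the model
        # detached from the source and hallucinated.
        if sentence_bleu(perturbed_hyp, [base_hyp]).score < BLEU_THRESHOLD:
            return True
    return False
```

Counting the flagged sentences over a test set then yields the adapted corpus-level measure the citing authors describe.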
“…First, the back-translation transformation removes content style but does not necessarily replace attribute markers the way style-transfer models do; for example, given the text "me and my husband ...", style-transfer models are likely to change "husband" to "wife", but back-translation will not. Second, our back-translation technique also inherits some of the problems of machine-translated text, such as hallucination (Raunak et al., 2021). We provide examples highlighting these issues in Appendix C.…”
Section: Discussion
Mentioning confidence: 99%
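To make the round-trip idea concrete, here is a minimal back-translation sketch. The pivot language (French) and the Helsinki-NLP MarianMT checkpoints are assumptions chosen for illustration; the cited work does not necessarily use these models.

```python
# Minimal round-trip back-translation sketch. Pivot language and model
# checkpoints are illustrative assumptions, not the cited setup.
from transformers import MarianMTModel, MarianTokenizer

def _translate(texts, model_name):
    tok = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
    out = model.generate(**batch)
    return [tok.decode(t, skip_special_tokens=True) for t in out]

def back_translate(texts):
    """English -> French -> English: paraphrases content but, as noted
    above, does not deliberately rewrite attribute markers and can
    inherit MT pathologies such as hallucination."""
    pivot = _translate(texts, "Helsinki-NLP/opus-mt-en-fr")
    return _translate(pivot, "Helsinki-NLP/opus-mt-fr-en")

print(back_translate(["me and my husband went to the market"]))
```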
“…Lastly, unlike the aforementioned tasks, the categorizations of hallucinations in machine translation vary within the task. Most of the relevant literature agrees that a translation is considered a hallucination when it is completely disconnected from the source text [91,125,145]. For further details, please refer to Section 11.…”
Section: Task Comparison
Mentioning confidence: 99%
“…They discovered that such likelihood-maximization approaches can result in degeneration, which refers to generated output that is bland, incoherent, or stuck in repetitive loops [71,185]. Concurrently, it was discovered that NLG models often generate text that is nonsensical or unfaithful to the provided source input [82,145,150,178]. Researchers began referring to such undesirable generation as hallucination [117].…”
Section: Introduction
Mentioning confidence: 99%
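The "repetitive loops" symptom of degeneration mentioned above can be quantified with a crude n-gram statistic. The function below is an illustrative heuristic, not a metric from the cited papers; any threshold on it would need tuning per task.

```python
# Illustrative heuristic for the repetition symptom of degeneration:
# the fraction of duplicate n-grams in an output string.
def repeated_ngram_fraction(text: str, n: int = 4) -> float:
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)

# A degenerate, looping output scores far higher than fluent text.
print(repeated_ngram_fraction("I don't know. I don't know. I don't know."))   # 0.5
print(repeated_ngram_fraction("The cat sat quietly on the warm windowsill."))  # 0.0
```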