Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1256
A Simple Recipe towards Reducing Hallucination in Neural Surface Realisation

Abstract: Recent neural language generation systems often hallucinate contents (i.e., producing irrelevant or contradicted facts), especially when trained on loosely corresponding pairs of the input structure and text. To mitigate this issue, we propose to integrate a language understanding module for data refinement with self-training iterations to effectively induce strong equivalence between the input data and the paired text. Experiments on the E2E challenge dataset show that our proposed framework can reduce more th…
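The refinement loop the abstract describes — bootstrap an NLU model from the noisy (input, text) pairs, then use it to re-induce each input from its paired text over several iterations — can be sketched as follows. This is a minimal toy illustration, not the authors' code: the "NLU" here simply learns which slot values appear in training inputs and keeps only those a text literally mentions.

```python
# Hypothetical sketch of data refinement via self-training, as described
# in the abstract. train_nlu/refine and the toy slot-matching "NLU" are
# illustrative assumptions, not the paper's actual model.

def train_nlu(pairs):
    """Toy NLU: collect every slot value seen in the inputs, then
    'parse' a text by keeping the values it literally mentions."""
    vocab = {value for slots, _ in pairs for value in slots}

    def parse(text):
        return frozenset(v for v in vocab if v in text)

    return parse

def refine(pairs, n_iters=2):
    """Each iteration re-labels every input from its paired text,
    tightening the equivalence between input data and text."""
    for _ in range(n_iters):
        nlu = train_nlu(pairs)
        pairs = [(nlu(text), text) for _, text in pairs]
    return pairs

# Noisy pair: the input claims "riverside" but the text never says it.
noisy = [(frozenset({"Italian", "riverside"}),
          "An Italian restaurant in town.")]
print(refine(noisy))  # the unsupported "riverside" slot is dropped
```

A generator trained on the refined pairs then only ever sees inputs its target text actually supports, which is the intended hallucination-reduction mechanism.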


Cited by 60 publications (64 citation statements)
References 21 publications
“…Contemporaneous with our work is the effort of Nie et al. (2019), who focus on automatic data cleaning using an NLU iteratively bootstrapped from the noisy data. Their analysis similarly finds that omissions are more common than hallucinations.…”
Section: Discussion and Related Work
confidence: 99%
“…The former path is risky as it easily results in ungrammatical targets. The latter approach of enforcing a stronger alignment between inputs and outputs has been tried previously, but it assumes a moderate amount of noise in the data (Nie et al., 2019; Dušek et al., 2019). Alternatively, one can leave the data as is and try to put more pressure on the decoder to pay attention to the input at every generation step (Tian et al., 2019).…”
Section: Introduction
confidence: 99%
“…It is worth noting that SR can be regarded as a variant of self-training due to its structural similarity, except that it takes the target sentences rather than the source sentences as input to the model. The algorithm itself is the key difference from existing methods based on self-training (Wang, 2019; Nie et al., 2019; Xie et al., 2020).…”
Section: Proposed Denoising Methods
confidence: 99%