Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.44
Improving Factual Consistency Between a Response and Persona Facts

Abstract: Neural models for response generation produce responses that are semantically plausible but not necessarily factually consistent with the facts describing the speaker's persona. These models are trained with fully supervised learning, where the objective function barely captures factual consistency. We propose to fine-tune these models by reinforcement learning with an efficient reward function that explicitly captures the consistency between a response and persona facts as well as semantic plausibility. Our autom…
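
The abstract describes fine-tuning the generator with reinforcement learning, using a reward that combines persona-fact consistency with semantic plausibility. A minimal REINFORCE-style sketch of such a loop is shown below; the tiny policy and the consistency/fluency scorers are illustrative stand-ins under assumed interfaces, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): REINFORCE fine-tuning where the reward
# mixes a persona-consistency score with a plausibility/fluency score.
# `consistency_score` and `fluency_score` are hypothetical placeholders.
import torch
import torch.nn as nn

vocab_size, hidden = 100, 32
policy = nn.Sequential(nn.Embedding(vocab_size, hidden),
                       nn.Flatten(),
                       nn.Linear(hidden, vocab_size))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def consistency_score(response_ids, persona_ids):
    # Placeholder: in practice an entailment/NLI model would score the pair.
    overlap = set(response_ids.tolist()) & set(persona_ids.tolist())
    return len(overlap) / len(response_ids)

def fluency_score(response_ids):
    # Placeholder for a language-model plausibility signal.
    return 1.0

persona = torch.randint(0, vocab_size, (8,))   # toy persona-fact token ids
inp = torch.randint(0, vocab_size, (1, 1))     # toy dialogue-context token

log_probs, response = [], []
for _ in range(5):                              # sample a short response
    dist = torch.distributions.Categorical(logits=policy(inp))
    tok = dist.sample()
    log_probs.append(dist.log_prob(tok))
    response.append(tok)
    inp = tok.unsqueeze(0)

response_ids = torch.cat(response)
reward = consistency_score(response_ids, persona) + fluency_score(response_ids)
loss = -reward * torch.stack(log_probs).sum()   # REINFORCE objective
optimizer.zero_grad()
loss.backward()
optimizer.step()
```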

Cited by 17 publications (13 citation statements) · References 19 publications (33 reference statements)
“…As pointed out by Ranzato et al. [142], word-level maximum likelihood training leads to the problem of exposure bias. Some research [3,73,84,102,120,135,163] adopts reinforcement learning to address the hallucination problem, using different rewards to optimize the model. The reward is the crucial bottleneck of reinforcement learning, and the way the reward score is computed is closely tied to exploring automatic metrics for evaluating the generated results.…”
Section: Training (mentioning, confidence: 99%)
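
The excerpt stresses that the reward, and how it is computed from automatic metrics, is the bottleneck of RL training. Below is a hedged sketch of one common choice, using the entailment probability from an off-the-shelf NLI model as the consistency reward; the checkpoint name and label handling are assumptions, not necessarily the setup of the cited works.

```python
# Sketch: entailment probability of a response given a persona fact as an RL reward.
# "roberta-large-mnli" is an assumed public NLI checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

nli_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(nli_name)
nli = AutoModelForSequenceClassification.from_pretrained(nli_name).eval()

def entailment_reward(persona_fact: str, response: str) -> float:
    """Probability that the persona fact entails the response."""
    enc = tokenizer(persona_fact, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = nli(**enc).logits.softmax(dim=-1)[0]
    labels = {k.lower(): v for k, v in nli.config.label2id.items()}
    return probs[labels.get("entailment", probs.argmax().item())].item()

print(entailment_reward("I have two dogs.", "My two dogs keep me busy."))
```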
“…By conditioning response generation on the persona description, a chit-chat model is expected to generate more persona-consistent responses. Lately, the application of NLI methods [100,159] and reinforcement learning frameworks [120] has been investigated. Although these conditioning methods using the PersonaChat dataset are successful, approaches that do not rely on a given set of persona descriptions need further investigation, because such descriptions are not always available and cannot cover every aspect of a persona.…”
Section: External Consistency (mentioning, confidence: 99%)
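
The excerpt refers to conditioning response generation on persona descriptions, as in PersonaChat-style models. Here is a minimal sketch of how such conditioning is often realized by concatenating persona facts with the dialogue history; the special tokens are illustrative assumptions, not any specific model's vocabulary.

```python
# Sketch of persona-conditioned input construction (TransferTransfo-style concatenation).
def build_input(persona_facts, history, bos="<bos>", eos="<eos>",
                speaker1="<speaker1>", speaker2="<speaker2>"):
    parts = [bos] + list(persona_facts)
    for i, turn in enumerate(history):
        speaker = speaker1 if i % 2 == 0 else speaker2
        parts.append(f"{speaker} {turn}")
    parts.append(eos)
    return " ".join(parts)

persona = ["i have two dogs.", "i work as a nurse."]
history = ["hi, how are you?", "great, just back from a walk with my dogs."]
print(build_input(persona, history))
```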
“…[Flattened table excerpt listing consistency-related methods by knowledge source (history dialogue, structured knowledge such as a knowledge graph, user query): DialogNLI (Welleck et al., 2019b), Arun et al. (2020), Ghazvininejad et al. (2018), DECODE (Nie et al., 2021), CI-ToD (Qin et al., 2021), TransferTransfo (Mesgar et al., 2021), UL (Li et al., 2020a), Blender (Roller et al., 2021), KvBERT (Song et al., 2020a), RCDG, GDR (Song et al., 2020b), NPH (Dziri et al., 2021a).]…”
Section: History Dialogue (mentioning, confidence: 99%)
“…The consistency evaluation is based on an NLI classifier that computes the entailment score. Mesgar et al. (2021) also propose an RL-based model, TransferTransfo-RL, for improving consistency between generated responses and personas. Differently, TransferTransfo-RL takes advantage of the Actor-Critic (Mnih et al., 2016) learning approach, which also uses the entailment score as the reward.…”
Section: Auxiliary Tasks (mentioning, confidence: 99%)
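
The excerpt describes TransferTransfo-RL as using an Actor-Critic learning approach with the entailment score as reward. The toy sketch below shows the generic actor-critic update under such a scalar reward; the linear actor/critic and the fixed reward value are placeholders, not the model of Mesgar et al. (2021).

```python
# Toy actor-critic step with an entailment-style scalar reward.
import torch
import torch.nn as nn

vocab, hidden = 100, 32
actor = nn.Linear(hidden, vocab)   # toy policy head over an encoded context
critic = nn.Linear(hidden, 1)      # value baseline for variance reduction
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)

state = torch.randn(1, hidden)                 # stand-in for an encoded dialogue state
dist = torch.distributions.Categorical(logits=actor(state))
action = dist.sample()                         # stand-in for a sampled response token
reward = torch.tensor([0.8])                   # e.g. an entailment probability

value = critic(state).squeeze(-1)
advantage = (reward - value).detach()
actor_loss = -(advantage * dist.log_prob(action)).mean()
critic_loss = (reward - value).pow(2).mean()

opt.zero_grad()
(actor_loss + critic_loss).backward()
opt.step()
```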
“…Existing works on building reliable dialog systems are generally divided into two categories: chit-chat open-domain dialog generation [38,72,73] and task-oriented dialog generation [79]. Attempts at open-domain dialog generation include generating more coherent [1,41,42], diverse [5,77], and personalized [40,55] utterances. With the emergence of task-oriented datasets [7,17,66,74], more work has been devoted to task-oriented dialog generation, which usually involves a pipeline of intent classification [67], dialog state tracking [25-27], dialog policy making [10,45], and dialog generation [15].…”
Section: Textual Dialog Generation (mentioning, confidence: 99%)
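
The excerpt names the usual task-oriented pipeline of intent classification, dialog state tracking, dialog policy, and generation. Below is a rule-based toy sketch of how those stages chain together, purely for illustration and unrelated to any cited system.

```python
# Illustrative task-oriented pipeline: intent -> state tracking -> policy -> generation.
def classify_intent(utterance):
    return "book_restaurant" if "table" in utterance else "chitchat"

def track_state(state, utterance, intent):
    state = dict(state, intent=intent)
    if "for two" in utterance:
        state["party_size"] = 2
    return state

def choose_action(state):
    if state.get("intent") == "book_restaurant" and "party_size" not in state:
        return "request_party_size"
    return "confirm_booking"

def generate(action, state):
    templates = {
        "request_party_size": "For how many people?",
        "confirm_booking": f"Booking a table for {state.get('party_size', '?')}.",
    }
    return templates[action]

state = {}
user = "I'd like a table for two tonight."
intent = classify_intent(user)
state = track_state(state, user, intent)
print(generate(choose_action(state), state))
```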