Findings of the Association for Computational Linguistics: ACL 2022
DOI: 10.18653/v1/2022.findings-acl.61
Read before Generate! Faithful Long Form Question Answering with Machine Reading

Abstract: Long-form question answering (LFQA) aims to generate a paragraph-length answer for a given question. While current work on LFQA using large pre-trained models for generation is effective at producing fluent and somewhat relevant content, one primary challenge lies in how to generate a faithful answer with less hallucinated content. We propose a new end-to-end framework that jointly models answer generation and machine reading. The key idea is to augment the generation model with fine-grained, answer-relate…


Cited by 15 publications (15 citation statements)
References 14 publications (16 reference statements)
“…This shares motivation with a line of work studying query-focused summarization (Xu and Lapata, 2020). Concurrent to our work, Su et al (2022) studies improving faithfulness of long-form answers through predicting and focusing on salient information in the retrieved evidence document. Lastly, our work builds up on three datasets containing long-form answers (Kwiatkowski et al, 2019; Fan et al, 2019; Nakano et al, 2021) and extends the analysis of long-form answers from earlier studies (Krishna et al, 2021).…”
Section: Related Work
confidence: 73%
“…Most developed language models produced nonfactual information [23], [24], [25] and [26]. Similar to other LLMs [27], [28], [29] and [30], the ChatGPT tool hallucinated some facts. For example, in Figure 6 below, when the researchers asked the tool about the flag of Yemen, a part of the generated response was correct, while another part of the information was revealed to be incorrect when verifying with the source.…”
Section: Redundancy, Authenticity and Relatedness Principles
confidence: 99%
“…We choose BART-large (Lewis et al, 2020), a transformer-based (Vaswani et al, 2017) generative pre-trained language model, as our backbone model for the generator because of its remarkable performance on text summarization benchmarks. Following the idea of Fusion-in-Decoder (FiD) and its applications in generation tasks (Izacard and Grave, 2021; Su et al, 2022; Vig et al, 2022), we employ FiD-BART, encoding multiple segments independently in the encoder and fusing information from all segments jointly in the decoder through the encoder-decoder attention.…”
Section: Generatormentioning
confidence: 99%
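The Fusion-in-Decoder pattern described in the statement above can be sketched in a few lines: each segment is encoded independently, and the resulting encoder states are concatenated along the sequence axis so the decoder's encoder-decoder attention sees all segments at once. The sketch below is a minimal, framework-free illustration of that fusion step; the `toy_encode` function and all names are hypothetical stand-ins, not the actual FiD-BART implementation.

```python
import numpy as np

def fid_encode_and_fuse(segments, encode):
    """Encode each segment independently, then concatenate the encoder
    states along the sequence axis. The decoder can then attend over all
    segments jointly via encoder-decoder attention (Fusion-in-Decoder)."""
    encoded = [encode(seg) for seg in segments]   # each: (seg_len, d_model)
    return np.concatenate(encoded, axis=0)        # (sum of seg_lens, d_model)

# Hypothetical toy "encoder": one random state vector per token.
D_MODEL = 4
def toy_encode(segment_tokens):
    rng = np.random.default_rng(0)
    return rng.standard_normal((len(segment_tokens), D_MODEL))

# Each segment is the question prepended to one retrieved passage.
segments = [["q", "a", "b"], ["q", "c"], ["q", "d", "e", "f"]]
fused = fid_encode_and_fuse(segments, toy_encode)
print(fused.shape)  # (9, 4): 3 + 2 + 4 token states, fused for the decoder
```

The key design point is that encoding cost stays linear in the number of segments (each is encoded alone), while cross-segment reasoning happens only in the decoder's attention over the concatenated states.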