2022
DOI: 10.48550/arxiv.2203.11147
Preprint

Teaching language models to support answers with verified quotes

Abstract: Recent large language models often answer factual questions correctly. But users can't trust any given claim a model makes without fact-checking, because language models can hallucinate convincing nonsense. In this work we use reinforcement learning from human preferences (RLHP) to train "open-book" QA models that generate answers whilst also citing specific evidence for their claims, which aids in the appraisal of correctness. Supporting evidence is drawn from multiple documents found via a search engine, or …
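To make the quote-supported answer format described in the abstract concrete, the following is a minimal Python sketch of the verification step such a system implies: an answer is treated as supported only if its cited quote appears verbatim in one of the retrieved documents. The data structure, function names, and toy documents below are illustrative assumptions for this sketch, not the paper's actual implementation (which trains the generator with RLHP and retrieves evidence via a search engine).

# Minimal sketch (not the paper's code): accept an answer only if its
# supporting quote is a verbatim span of a retrieved source document.
from dataclasses import dataclass


@dataclass
class SupportedAnswer:
    claim: str          # the model's natural-language answer
    quote: str          # the evidence quoted in support of the claim
    source_title: str   # retrieved document the quote is said to come from


def verify_quote(answer: SupportedAnswer, documents: dict) -> bool:
    """Return True only if the cited quote appears verbatim in the named source."""
    source_text = documents.get(answer.source_title, "")
    return bool(answer.quote) and answer.quote in source_text


if __name__ == "__main__":
    # Toy stand-in for documents that the paper retrieves with a search engine.
    documents = {
        "Solar System overview": "Jupiter is the largest planet in the Solar System.",
    }
    answer = SupportedAnswer(
        claim="Jupiter is the largest planet.",
        quote="Jupiter is the largest planet in the Solar System.",
        source_title="Solar System overview",
    )
    if verify_quote(answer, documents):
        print(f'{answer.claim}\n  Supported by: "{answer.quote}" ({answer.source_title})')
    else:
        print("Quote could not be verified against the retrieved documents.")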

Cited by 18 publications (20 citation statements)
References 20 publications
“…A similar approach could be derived for our setting. Enabling the model to support outputs with reference to particular locations within the visual inputs, or to external verified quotes, is also an interesting direction (Menick et al., 2022; Thoppilan et al., 2022). Finally, in Figure 11, we provide qualitative examples demonstrating that Flamingo can explain its own outputs, suggesting avenues to explainability and interpretability using the model's text interface.…”
Section: Risks and Mitigation Strategies
confidence: 91%
“…Rationales are typically defined as masks on the input passage (Lei et al., 2016), with the goal of finding the minimal rationale that is sufficient to identify the ground truth label (DeYoung et al., 2020). Such masks can be learned from human annotations (Zaidan et al., 2007; Menick et al., 2022) or from unsupervised objectives such as information bottleneck (Paranjape et al., 2020). We depart from fully extractive rationales by adding decontextualizing markup, unlike prior work in which decontextualization is performed inline (Choi et al., 2021), obscuring the relationship to the original text.…”
Section: Related Work
confidence: 99%
“…1 and 2a), which contains all the information necessary to solve the problem, as well as potentially irrelevant distractors. In the future this assumption can be relaxed, for example by extracting the necessary information through search (Lazaridou et al., 2022; Menick et al., 2022). We also assume that all questions are well posed and definitively answerable given the context.…”
Section: The Selection-Inference (SI) Framework
confidence: 99%