Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2022)
DOI: 10.18653/v1/2022.naacl-main.47
Reframing Human-AI Collaboration for Generating Free-Text Explanations

Abstract: Large language models are increasingly capable of generating fluent-appearing text with relatively little task-specific supervision. But can these models accurately explain classification decisions? We consider the task of generating free-text explanations using human-written examples in a few-shot manner. We find that (1) authoring higher quality prompts results in higher quality generations; and (2) surprisingly, in a head-to-head comparison, crowdworkers often prefer explanations generated by GPT-3 to crowd…


Cited by 34 publications (39 citation statements). References 24 publications.
“…Concurrent to our work, Yordanov et al. (2021) study self-rationalization transfer from a high-resource task to a task with only a few human-authored explanations. Wiegreffe et al. (2022) analyze explanations obtained by prompting GPT-3 multiple times to get multiple explanation candidates, and then filter these candidates using a model trained to predict the acceptability of explanations. Their prompt consists of a few examples with high-quality explanations written by the authors and a new instance together with its gold label.…”
Section: Related Work
confidence: 99%
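
The overgenerate-and-filter pipeline summarized in that statement can be sketched briefly. This is a minimal illustration under stated assumptions, not the authors' released code: the names query_lm, AcceptabilityFilter, build_prompt, and PROMPT_EXAMPLES are hypothetical placeholders, and the language model call and filter model are left unimplemented.

```python
# Minimal sketch of the overgenerate-and-filter idea described above.
# All names here are hypothetical placeholders, not the authors' released code.
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# A few author-written (instance-with-gold-answer, explanation) pairs used as the prompt.
PROMPT_EXAMPLES: List[Tuple[str, str]] = [
    ("Q: Where would you find a seat belt? A: car",
     "Cars are required by law to have seat belts for passenger safety."),
]

def build_prompt(examples: Sequence[Tuple[str, str]], instance: str, gold_label: str) -> str:
    """Few-shot prompt: worked examples, then the new instance together with its gold label."""
    shots = "\n\n".join(f"{x}\nWhy? {e}" for x, e in examples)
    return f"{shots}\n\n{instance} A: {gold_label}\nWhy?"

def query_lm(prompt: str, n: int = 8, temperature: float = 0.9) -> List[str]:
    """Placeholder: sample n explanation candidates from a large language model."""
    raise NotImplementedError

@dataclass
class AcceptabilityFilter:
    """Placeholder for a supervised model trained on binary human acceptability judgments."""
    threshold: float = 0.5

    def score(self, instance: str, explanation: str) -> float:
        raise NotImplementedError  # estimated probability that humans find the explanation acceptable

    def best(self, instance: str, candidates: List[str]) -> str:
        """Keep the highest-scoring candidate only if it clears the acceptability threshold."""
        score, top = max((self.score(instance, c), c) for c in candidates)
        return top if score >= self.threshold else ""

def explain(instance: str, gold_label: str, filt: AcceptabilityFilter) -> str:
    prompt = build_prompt(PROMPT_EXAMPLES, instance, gold_label)
    candidates = query_lm(prompt)            # overgenerate several explanation candidates
    return filt.best(instance, candidates)   # filter them with the acceptability model
```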
“…has shown that task performance can be improved by sampling multiple language model outputs for ensembling, (2) prompt-order ensembling, where previous work (Lu et al., 2021; Zhao et al., 2021) has shown that task performance is sensitive to the order of the exemplars in the prompts, and (3) input-rationale ensembling, where human-written rationales can be replaced by model-generated rationales, leveraging the ability of language models to generate high-quality explanations (Wiegreffe et al., 2022). Figure 1 provides an overview of rationale-augmented ensembling approaches.…”
Section: Language Model
confidence: 99%
“…Table 2: Methods for generating rationale-augmented ensembles in language models.

Method | Input/Prompt | Output
Self-consistency (Wang et al., 2022) | fixed | sampled
Prompt-order ensemble (Lu et al., 2021; Zhao et al., 2021) | shuffled | greedy/sampled
Input-rationale ensemble, adapted from Wiegreffe et al. (2022) | sampled | greedy/sampled
…”
Section: Rationale-augmented Ensembles in Language Models
confidence: 99%
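
A compact sketch may make the three ensembling variants in Table 2 concrete. The helper names sample_answer, sample_rationale, and build_prompt are hypothetical placeholders for language model calls, and majority voting over sampled final answers is assumed as the aggregation step.

```python
# Illustrative sketch of the three rationale-augmented ensembling variants in Table 2.
# sample_answer and sample_rationale are hypothetical placeholders for LM calls.
import random
from collections import Counter
from typing import List, Sequence, Tuple

Exemplar = Tuple[str, str, str]  # (question, rationale, answer)

def sample_answer(prompt: str, temperature: float) -> str:
    """Placeholder: sample one rationale-plus-answer completion and return the parsed final answer."""
    raise NotImplementedError

def sample_rationale(question: str, answer: str) -> str:
    """Placeholder: have the LM write a rationale for (question, answer), for input-rationale ensembling."""
    raise NotImplementedError

def build_prompt(exemplars: Sequence[Exemplar], question: str) -> str:
    shots = "\n\n".join(f"Q: {q}\nA: {r} So the answer is {a}." for q, r, a in exemplars)
    return f"{shots}\n\nQ: {question}\nA:"

def ensemble(question: str, exemplars: List[Exemplar], variant: str, n: int = 20) -> str:
    answers = []
    for _ in range(n):
        if variant == "self-consistency":        # fixed prompt, sampled outputs
            prompt, temp = build_prompt(exemplars, question), 0.7
        elif variant == "prompt-order":          # shuffled exemplar order
            shuffled = random.sample(exemplars, len(exemplars))
            prompt, temp = build_prompt(shuffled, question), 0.0
        elif variant == "input-rationale":       # model-generated rationales swapped into the prompt
            regenerated = [(q, sample_rationale(q, a), a) for q, _, a in exemplars]
            prompt, temp = build_prompt(regenerated, question), 0.0
        else:
            raise ValueError(f"unknown variant: {variant}")
        answers.append(sample_answer(prompt, temp))
    return Counter(answers).most_common(1)[0][0]  # majority vote over the ensemble
```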
“…GPT-3 Rationales for Gold Labels. Wiegreffe et al. (2022) collected 250 high-quality free-text rationales generated by few-shot prompting GPT-3 (Brown et al., 2020) for CQA (given gold labels). Each example was assessed by 3 crowdworkers.…”
Section: Evaluating Rationales in Few-shot Prompting
confidence: 99%