Findings of the Association for Computational Linguistics: EACL 2023
DOI: 10.18653/v1/2023.findings-eacl.63

PLACES: Prompting Language Models for Social Conversation Synthesis

Maximillian Chen, Alexandros Papangelis, Chenyang Tao, et al.

Abstract: Collecting high-quality conversational data can be very expensive for most applications and infeasible for others due to privacy, ethical, or similar concerns. A promising direction to tackle this problem is to generate synthetic dialogues by prompting large language models. In this work, we use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting. We perform several thorough evaluations of our synthetic conversations compared to human-co…

Cited by 2 publications (3 citation statements)
References 20 publications
“…These conversations contain dialogues and chitchat from sources such as TV shows, vlogs, and other types of videos from Bilibili, a Chinese video-sharing platform. For English dialogues, as no domain-comparable dialogue dataset exists, we compare against DailyDialog (Li et al., 2017), a set of 200 realistic, human-written dialogues reflecting daily communication across various everyday topics; this comparison has been used in previous evaluations of synthetic dialogue quality (Chen et al., 2023).…”
Section: Discussion
Confidence: 99%
“…LLMs for Synthetic Data Generation. Prompting LLMs to synthesize and augment language data for existing tasks (Li et al., 2022; Møller et al., 2023; Chen et al., 2023) has emerged as a viable, cost-effective alternative to crowd-sourced annotation at scale and to strategies such as fine-tuning language generators (Papangelis et al., 2021; Zhang et al., 2020) in the dialogue domain. LLMs, trained on massive amounts of web text, suffer from representational and allocational harms (Blodgett et al., 2020; Weidinger et al., 2021).…”
Section: Background and Related Work
Confidence: 99%