Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning

Adewumi, Tosin; Abid, Nosheen; Pahlavan, Maryam; Brännvall, Rickard; Sabry, Sana Sabah; Liwicki, Foteini; Liwicki, Marcus

doi:10.7557/18.6231

Cited by 12 publications

(16 citation statements)

References 15 publications

“…A turn (or utterance) in a conversation is each single contribution from a speaker (Schegloff, 1968; Jurafsky and Martin, 2020). The data may be from written conversations, such as the MultiWOZ (Eric et al, 2020), transcripts of human-human spoken conversations, such as the Gothenburg Dialogue Corpus (GDC) (Allwood et al, 2003), crowdsourced conversations, such as the EmpatheticDialogues (Rashkin et al, 2019), and social media conversations like Familjeliv or Reddit (Adewumi et al, 2022c; Adewumi et al, 2022a). As already acknowledged that the amount of data needed for training deep ML models is usually large, they are normally first pretrained on large, unstructured text or conversations before being finetuned on specific conversational data.…”

Section: Introduction (mentioning)

confidence: 99%

State-of-the-art in Open-domain Conversational AI: A Survey

Adewumi, Liwicki, Liwicki

2022

Preprint

Self Cite

We survey SoTA open-domain conversational AI models with the purpose of presenting the prevailing challenges that still exist to spur future research. In addition, we provide statistics on the gender of conversational AI in order to guide the ethics discussion surrounding the issue. Open-domain conversational AI are known to have several challenges, including bland responses and performance degradation when prompted with figurative language, among others. First, we provide some background by discussing some topics of interest in conversational AI. We then discuss the method applied to the two investigations carried out that make up this study. The first investigation involves a search for recent SoTA open-domain conversational AI models while the second involves the search for 100 conversational AI to assess their gender. Results of the survey show that progress has been made with recent SoTA conversational AI, but there are still persistent challenges that need to be solved, and the female gender is more common than the male for conversational AI. One main take-away is that hybrid models of conversational AI offer more advantages than any single architecture. The key contributions of this survey are 1) the identification of prevailing challenges in SoTA open-domain conversational AI, 2) the unusual discussion about open-domain conversational AI for low-resource languages, and 3) the discussion about the ethics surrounding the gender of conversational AI.

“…A turn (or utterance) in a conversation is each single contribution from a speaker [2,9]. The data may be from written conversations, such as the MultiWOZ [10], transcripts of human-human spoken conversations, such as the Gothenburg Dialogue Corpus (GDC) [11], crowdsourced conversations, such as the EmpatheticDialogues [12], and social media conversations such as Familjeliv (familjeliv.se) or Reddit (reddit.com) [13,14]. As already acknowledged that the amount of data needed for training deep ML models is usually large, they are normally first pretrained on large, unstructured text or conversations before being fine-tuned on specific conversational data.…”

Section: Introduction (mentioning)

confidence: 99%

State-of-the-Art in Open-Domain Conversational AI: A Survey

Adewumi

Liwicki

2022

Information

Self Cite

We survey SoTA open-domain conversational AI models with the objective of presenting the prevailing challenges that still exist to spur future research. In addition, we provide statistics on the gender of conversational AI in order to guide the ethics discussion surrounding the issue. Open-domain conversational AI models are known to have several challenges, including bland, repetitive responses and performance degradation when prompted with figurative language, among others. First, we provide some background by discussing some topics of interest in conversational AI. We then discuss the method applied to the two investigations carried out that make up this study. The first investigation involves a search for recent SoTA open-domain conversational AI models, while the second involves the search for 100 conversational AI to assess their gender. Results of the survey show that progress has been made with recent SoTA conversational AI, but there are still persistent challenges that need to be solved, and the female gender is more common than the male for conversational AI. One main takeaway is that hybrid models of conversational AI offer more advantages than any single architecture. The key contributions of this survey are (1) the identification of prevailing challenges in SoTA open-domain conversational AI, (2) the rarely held discussion on open-domain conversational AI for low-resource languages, and (3) the discussion about the ethics surrounding the gender of conversational AI.

“…The rest of the paper is organized as follows. The Background Section (2) presents brief details about some topics in conversational AI; the Benefits of Conversational AI Section (3) highlights some of the benefits that motivate research in conversational AI; the Methods Section (4) describes the details of the approach for the two investigations carried out in this survey; two Results of the Survey Sections (5 & 6) then follow with details of the outcome of the methods; thereafter, the Existing Challenges Section (7) shares the prevailing challenges to obtaining "human" performance; Open-domain Conversational AI for Low-resource Languages Section (8) discusses this critical challenge and some of the attempts at solving it; the Related Work Section (9) highlights previous related reviews; the Conclusion Section (11) summarizes the study after the limitations are given in the Limitation Section.…”

Section: Introduction (mentioning)

confidence: 99%

“…A turn (or utterance) in a conversation is each single contribution from a speaker [2,7]. The data may be from written conversations, such as the MultiWOZ [8], transcripts of human-human spoken conversations, such as the Gothenburg Dialogue Corpus (GDC) [9], crowdsourced conversations, such as the EmpatheticDialogues [10], and social media conversations like Familjeliv or Reddit [11,12]. As already acknowledged that the amount of data needed for training deep ML models is usually large, they are normally first pretrained on large, unstructured text or conversations before being finetuned on specific conversational data.…”

Section: Introduction (mentioning)

confidence: 99%

State-of-the-art in Open-domain Conversational AI: A Survey

Adewumi, Liwicki, Liwicki

2022

Preprint
