2022
DOI: 10.48550/arxiv.2208.03188
Preprint

BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

Abstract: We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user-defined tasks. We release both the model weights and code, and have also deployed the model on a public web page to interact with organic users. This technical report describes how the model was built (architecture, model and training scheme), and details of its deployment, including safety mechanisms. Human evaluations…
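The BB3-175B weights and serving code are released separately by the authors; as a minimal, assumption-laden sketch of the basic "utterance in, response out" interaction pattern only, the snippet below loads a much smaller, earlier BlenderBot checkpoint from the Hugging Face hub (not BB3 itself, and without BB3's internet search or long-term memory modules) and generates a single reply.

```python
# Minimal sketch: an earlier, small BlenderBot checkpoint stands in for BB3-175B
# purely to illustrate the interaction pattern; this is not the deployed BB3 system.
from transformers import BlenderbotForConditionalGeneration, BlenderbotTokenizer

name = "facebook/blenderbot-400M-distill"  # smaller, earlier BlenderBot; not BB3-175B
tokenizer = BlenderbotTokenizer.from_pretrained(name)
model = BlenderbotForConditionalGeneration.from_pretrained(name)

utterance = "My garden is overrun with weeds, any advice?"
inputs = tokenizer([utterance], return_tensors="pt")
reply_ids = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.batch_decode(reply_ids, skip_special_tokens=True)[0])
```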

Cited by 18 publications (28 citation statements) · References 56 publications

“…Recently several language models, such as Blenderbot [18], Lamda [20], and ChatGPT [13], have been introduced that are specifically tuned for dialog applications, but conversational interaction can also be achieved via prompt engineering with general-purpose large language models. Valvoda et al found that fine-tuning a large language model for dialog resulted in duller and more repetitive output, while generating dynamic prompts resulted in more novel and diverse responses [21].…”
Section: Related Work
confidence: 99%
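As a hedged illustration of the "prompt engineering with general-purpose large language models" route the excerpt above describes (not code from any of the cited papers), the sketch below dynamically builds a dialogue prompt from a persona string and the running conversation history and feeds it to an off-the-shelf causal LM; the model name, persona text, and prompt format are illustrative assumptions.

```python
# Sketch of steering a general-purpose causal LM into dialogue by dynamically
# constructing the prompt, rather than fine-tuning the model for dialog.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any general-purpose causal LM checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def build_prompt(history, persona="You are a helpful, friendly assistant."):
    # Dynamic prompt: persona + recent turns, ending with the bot's turn marker.
    turns = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    return f"{persona}\n{turns}\nBot:"

history = [("User", "Hi! What should I cook tonight?")]
inputs = tokenizer(build_prompt(history), return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
# Decode only the newly generated tokens after the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```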
“…• OPT-175B (Zhang et al, 2022) and BB3-175B (Shuster et al, 2022b): we compare the 175B language model OPT (either 0-shot or few-shot) with BlenderBot 3, which is fine-tuned with conversational datasets including modular supervision and internet-augmentation from our task. This setting examines if our experiments and results are applicable to very large language models.…”
Section: Deployed Models
confidence: 99%
“…Very large models benefit from feedback from smaller models. OPT-175B, either in zero-shot or few-shot variants, is only pre-trained on dialogue data and not fine-tuned on our task, and performs reasonably, but not better than smaller models that are fine-tuned. BlenderBot 3 (Shuster et al, 2022b) is trained with the modular supervision feedback data collected from the smaller (3B parameter) models, in addition to fine-tuning on other standard dialogue datasets. This model provides the best human evaluation metrics of all the systems we test, with a good response rate of 64.8% and a rating of 4.08.…”
Section: Modular Superior To Non-modular Feedback
confidence: 99%
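To make the zero-shot vs. few-shot distinction in the comparison above concrete, here is a minimal sketch of the two prompt-construction styles: zero-shot feeds the dialogue context alone, while few-shot prepends a handful of example exchanges before the same context. The demonstration pairs and formatting are invented for illustration and are not the cited paper's actual prompts.

```python
# Sketch only: assumed demonstration pairs and prompt layout for zero- vs. few-shot prompting.
few_shot_examples = [
    ("What's a good beginner houseplant?", "A pothos is hard to kill and grows fast."),
    ("Any tips for sleeping better?", "Keeping a fixed bedtime and dimming lights helps."),
]

def zero_shot_prompt(context: str) -> str:
    # No demonstrations: the model sees only the current context.
    return f"Context: {context}\nResponse:"

def few_shot_prompt(context: str) -> str:
    # A few worked examples precede the current context in the same format.
    demos = "\n".join(f"Context: {c}\nResponse: {r}" for c, r in few_shot_examples)
    return f"{demos}\nContext: {context}\nResponse:"

print(zero_shot_prompt("How do I keep bread from going stale?"))
print(few_shot_prompt("How do I keep bread from going stale?"))
```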
“…Our goal here was to learn a direct mapping from an intent-attribute-value input text to a human-readable Portuguese text. We tested four text-to-text transformer-based models: Bart [Lewis et al 2019], T5 [Xue et al 2020], Blenderbot [Shuster et al 2022] and GPT2 [Radford et al 2019]. While GPT2 utilizes a decoder-only module, the first three models use an encoder-decoder scheme [Cho et al 2014].…”
Section: End-to-end Architecture
confidence: 99%
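A small, hedged sketch of the architectural contrast this excerpt draws (the checkpoint names and input string are stand-ins, not the cited study's exact models or data): BART, T5, and BlenderBot load as encoder-decoder sequence-to-sequence models, while GPT-2 loads as a decoder-only causal LM, so the two families are driven through different model classes even though both map text to text.

```python
# Sketch contrasting the two model families named above; checkpoints are illustrative.
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

# Encoder-decoder: the input is encoded, and a separate decoder generates the output.
s2s_tok = AutoTokenizer.from_pretrained("facebook/bart-base")
s2s_model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Decoder-only: input and output share one left-to-right token stream.
clm_tok = AutoTokenizer.from_pretrained("gpt2")
clm_model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "intent=inform | attribute=price | value=low"  # assumed input format, for illustration
s2s_out = s2s_model.generate(**s2s_tok(prompt, return_tensors="pt"), max_new_tokens=30)
clm_out = clm_model.generate(**clm_tok(prompt, return_tensors="pt"), max_new_tokens=30)
print(s2s_tok.decode(s2s_out[0], skip_special_tokens=True))
print(clm_tok.decode(clm_out[0], skip_special_tokens=True))
```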