Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling
Preprint, 2021
DOI: 10.48550/arxiv.2107.03451

Abstract: Warning: this paper contains example data that may be offensive or upsetting. Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans. However, these models are often trained on large datasets from the internet, and as a result, may learn undesirable behaviors from this data, such as toxic or otherwise harmful language. Researchers must thus wrestle with the issue of how and when to release these models. In this pap…

Cited by 17 publications (26 citation statements)
References 98 publications
“…Similar issues have also been discussed specifically for dialog models [53]. For instance, examples of bias, offensiveness, and hate speech have been found both in training data drawn from social media, and consequently in the output of dialog models trained on such data [83].…”
Section: Related Work (mentioning, confidence: 74%)
“…Safety and safety of dialog models: Inappropriate and unsafe risks and behaviors of language models have been extensively discussed and studied in previous works (e.g., [53,54]). Issues encountered include toxicity (e.g., [55,56,57]), bias (e.g., [58,59,60,61,62,63,64,65,66,67,68,69,70,71,72]), and inappropriately revealing personally identifying information (PII) from training data [73].…”
Section: Related Work (mentioning, confidence: 99%)
“…Figure 7 (Left) shows that in recent years the compute required for large-scale AI experiments has increased by more than 300,000X relative to a decade ago. Along with this rise in resource intensity, we see a corresponding (and sharp) fall in the proportion of these results that come from academia (Figure 7, Right). This suggests that, although academics may be strongly motivated by scientific curiosity, and well-poised to research safety issues, they may be significantly challenged by the high financial and engineering costs.…”
Section: Rising Gap Between Industry and Academia (mentioning, confidence: 80%)
“…This lack of standards compounds the problems caused by the four distinguishing features of generative models we identify in Section 2, as well as the safety issues discussed above. At the same time, there's a growing field of research oriented around identifying the weaknesses of these models, as well as potential problems with their associated development practices [7,67,9,19,72,41,50,62,66].…”
Section: Lack of Standards and Norms (mentioning, confidence: 99%)