2021
DOI: 10.48550/arxiv.2110.08466
Preprint
On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark

Cited by 6 publications (15 citation statements) | References 0 publications
“…To further clarify what safety problems cover, [1496] proposes a classification of safety issues in open-domain conversational systems comprising three general categories and emphasizes the importance of context. More elaborately, [1497] recently proposed a finer-grained taxonomy of safety issues that distinguishes personal from non-personal unsafe behaviors in dialogues and defines 7 sub-categories of unsafe responses. In summary, dialogue systems face the following safety issues.…”
Section: Safety and Ethical Riskmentioning
confidence: 99%
“…Inheriting from pre-trained language models, dialog safety issues, including toxicity and offensiveness (Baheti et al., 2021; Cercas Curry and Rieser, 2018), bias (Henderson et al., 2018; Barikeri et al., 2021; Lee et al., 2019), privacy (Weidinger et al., 2021), and sensitive topics (Sun et al., 2021), are extensively studied and draw increasing attention. The unsafe-behavior detection task plays an important role in conversational unsafety measurement (Cercas Curry and Rieser, 2018; Sun et al., 2021; Edwards et al., 2021), in adversarial learning for safer bots (Gehman et al., 2020), and in bias mitigation strategies (Thoppilan et al., 2022).…”
Section: Dialog Safety and Social Biasmentioning
confidence: 99%
“…However, neural open-domain conversational agents trained on large-scale unlabeled data may pick up many unsafe features from the corpora, e.g., offensive language, social biases, and violence (Barikeri et al., 2021; Weidinger et al., 2021; Sun et al., 2021). Unlike other safety problems, social biases that convey negative stereotypes or prejudices against specific populations are usually stated in implicit expressions rather than explicit words (Blodgett et al., 2020), and are thus challenging to deal with.…”
Section: Introductionmentioning
confidence: 99%