Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.469

ParaDetox: Detoxification with Parallel Data

Cited by 15 publications (18 citation statements)
References 0 publications
“…Moreover, our results suggest that solely cleaning the generated continuations leads to a marked PPL deterioration. Since Logacheva et al. (2022) demonstrate that their method produces fluent cleaned text, this deterioration could be attributed to a loss of context relevance rather than to fluency issues. Further, this approach does not significantly reduce toxicity.…”
Section: Continuation (mentioning)
confidence: 99%
“…Nevertheless, one can obtain non-toxic texts by generating them and then detoxifying them. Thus, we provide an additional experiment comparing one text detoxification method, bart-detox-base (Logacheva et al., 2022), with our LM detoxification approach.…”
Section: Continuation (mentioning)
confidence: 99%
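
For illustration, here is a minimal sketch of applying such a detoxification model as a post-hoc rewriter over generated continuations. It assumes the released checkpoint is available on the Hugging Face Hub as "s-nlp/bart-base-detox"; the checkpoint id and decoding settings are assumptions, not taken from the quoted text:

# Hedged sketch: post-hoc detoxification of generated continuations with a
# BART-based seq2seq model. The checkpoint id below is an assumption.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "s-nlp/bart-base-detox"  # assumed Hub id for the bart-detox model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def detoxify(texts, max_length=128):
    # Rewrite each text into a neutral paraphrase with beam search.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**inputs, num_beams=5, max_length=max_length)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

# Example: clean LM continuations before measuring toxicity and perplexity.
continuations = ["<generated continuation 1>", "<generated continuation 2>"]
cleaned = detoxify(continuations)
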
“…Some researchers found that only a small proportion of the whole response (e.g., one or two words) needs to be fixed. Thus, in some works an editing module is applied after generation to fix such problems [75,78]. Similarly, text style transfer or rephrasing from toxic to non-toxic language can also be plugged in at this stage [26,66].…”
Section: Towards Pipeline-based System (mentioning)
confidence: 99%
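
The pipeline idea described above can be sketched as a two-stage system in which a detoxifying rewriter is applied only when the generated reply is flagged as toxic; the callables here are hypothetical placeholders, not APIs from the cited works:

# Hedged sketch of a pipeline-based system: generation followed by an optional
# detoxification (toxic -> non-toxic style transfer) stage.
from typing import Callable

def pipeline_reply(prompt: str,
                   generate_reply: Callable[[str], str],
                   is_toxic: Callable[[str], bool],
                   detoxify: Callable[[str], str]) -> str:
    # Stage 1: generate a candidate reply with the base model.
    reply = generate_reply(prompt)
    # Stage 2: edit/rephrase only if the reply is flagged as toxic.
    if is_toxic(reply):
        reply = detoxify(reply)
    return reply
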
“…In previous work on detoxification methods, datasets of this kind were used to develop and test unsupervised text style transfer approaches (Wu et al., 2019; Tran et al., 2020; Hallinan et al., 2022). Recently, however, a parallel dataset, ParaDetox, for training supervised text detoxification models for English was released (Logacheva et al., 2022b), similar to previous parallel TST datasets for formality (Rao and Tetreault, 2018; Briakou et al., 2021). Its pairs of toxic-neutral sentences were collected with a pipeline based on three crowdsourcing tasks.…”
Section: Related Work (mentioning)
confidence: 99%
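
As a concrete illustration of the parallel data described above, one could inspect the released toxic-neutral pairs; the dataset id below is an assumption about the published resource, and the column names are discovered at runtime rather than assumed:

# Hedged sketch: loading the parallel toxic <-> neutral pairs.
from datasets import load_dataset

paradetox = load_dataset("s-nlp/paradetox")  # assumed Hub id
print(paradetox["train"].column_names)       # discover the actual pair columns
print(paradetox["train"][0])                 # one toxic sentence and its neutral paraphrase
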
“…Firstly, several unsupervised methods based on masked language modelling (Tran et al., 2020; …) and on disentangled representations for style and content (John et al., 2019; dos Santos et al., 2018) were explored. More recently, Logacheva et al. (2022b) showed the superiority of supervised seq2seq models for detoxification trained on a parallel corpus of crowdsourced toxic ↔ neutral sentence pairs. Afterwards, there were experiments in multilingual detoxification.…”
Section: Introduction (mentioning)
confidence: 99%
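
A minimal sketch of the supervised seq2seq setup described above, i.e. fine-tuning a BART-style model on parallel toxic -> neutral pairs; the base checkpoint, column names, data, and hyperparameters are illustrative assumptions, not the authors' exact configuration:

# Hedged sketch: supervised seq2seq detoxification trained on toxic -> neutral
# sentence pairs. Data and hyperparameters are purely illustrative.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Tiny stand-in for the crowdsourced parallel corpus.
pairs = Dataset.from_dict({
    "toxic":   ["this is a dumb idea"],
    "neutral": ["this idea does not seem good"],
})

def preprocess(batch):
    # Tokenize toxic sources as inputs and neutral paraphrases as labels.
    model_inputs = tokenizer(batch["toxic"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["neutral"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = pairs.map(preprocess, batched=True, remove_columns=pairs.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="detox-seq2seq-sketch",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()  # learns to map toxic inputs to neutral paraphrases
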