2021
DOI: 10.48550/arxiv.2109.08914
Preprint

Text Detoxification using Large Pre-trained Neural Models

Abstract: We present two novel unsupervised methods for eliminating toxicity in text. Our first method combines two recent ideas: (1) guidance of the generation process with small style-conditional language models and (2) use of paraphrasing models to perform style transfer. We use a well-performing paraphraser guided by style-trained language models to keep the text content and remove toxicity. Our second method uses BERT to replace toxic words with their non-offensive synonyms. We make the method more flexible by enabl…
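
As a concrete illustration of the second method summarized above (masked-LM word replacement), here is a minimal sketch using the HuggingFace fill-mask pipeline. The model choice, the tiny toxic-word lexicon, and the whitespace tokenization are illustrative assumptions for demonstration, not the paper's actual setup:

```python
# Minimal sketch of masked-LM-based word replacement: mask a toxic word and
# let BERT propose a substitute. Lexicon and model choice are assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

TOXIC_WORDS = {"stupid", "idiotic"}  # hypothetical lexicon, illustration only

def detoxify(sentence: str) -> str:
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok.lower().strip(".,!?") in TOXIC_WORDS:
            masked = " ".join(
                tokens[:i] + [fill_mask.tokenizer.mask_token] + tokens[i + 1:]
            )
            # Pick the highest-scoring BERT candidate that is not itself toxic.
            for cand in fill_mask(masked):
                if cand["token_str"].lower() not in TOXIC_WORDS:
                    tokens[i] = cand["token_str"]
                    break
    return " ".join(tokens)

print(detoxify("That is a stupid idea."))
```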

Cited by 2 publications (4 citation statements). References 9 publications (15 reference statements).
“…In it, a language model performs text generation guided by another language model conditioned on a specific topic or style. More precisely, in our work, we adopt the extension of this method presented in (Dale et al., 2021), where the authors enable the model not only to generate but also to paraphrase the input text. Below, a brief description of the method is given.…”
Section: Methods
confidence: 99%
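
To make the mechanism behind this citation concrete: at each decoding step, the paraphraser's next-token distribution is re-weighted by a class posterior computed from two small class-conditional language models (neutral vs. toxic). The following is a minimal PyTorch sketch of that single combination step; the function name, the per-token (rather than prefix-level) posterior, and the guidance exponent `omega` are simplifying assumptions, not the exact ParaGeDi implementation:

```python
# Minimal sketch of one guided-decoding step (ParaGeDi-style): re-weight the
# paraphraser's next-token distribution with a class posterior computed from
# two class-conditional LMs. Names and shapes are illustrative assumptions.
import torch

def guided_logits(paraphraser_logits: torch.Tensor,
                  neutral_lm_logits: torch.Tensor,
                  toxic_lm_logits: torch.Tensor,
                  omega: float = 5.0) -> torch.Tensor:
    """All inputs are (vocab_size,) next-token logits; returns guided scores."""
    log_p_para = torch.log_softmax(paraphraser_logits, dim=-1)
    log_p_neutral = torch.log_softmax(neutral_lm_logits, dim=-1)
    log_p_toxic = torch.log_softmax(toxic_lm_logits, dim=-1)
    # Bayes rule with a uniform class prior: p(neutral | token) per token.
    log_posterior = log_p_neutral - torch.logaddexp(log_p_neutral, log_p_toxic)
    # Push the paraphraser's distribution toward the neutral class.
    return log_p_para + omega * log_posterior

# Toy usage with random logits over a 10-token vocabulary.
vocab = 10
scores = guided_logits(torch.randn(vocab), torch.randn(vocab), torch.randn(vocab))
next_token = scores.argmax().item()
```

In the full method, the class posterior is computed over the whole generated prefix rather than a single token; the next citation describes that classifier component.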
“…GeDi (Krause et al., 2020) uses a small external language model classifier (or simply GeDi-classifier) to guide the generation of the main language model, re-weighting next-token probabilities and thus increasing the probabilities of words in the given style. ParaGeDi (Dale et al., 2021) adapts this idea to the paraphrasing task by applying the GeDi approach in combination not with a standard language model but with a paraphraser fine-tuned to rephrase the original text while preserving its meaning.…”
Section: Related Work
confidence: 99%
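
The "GeDi-classifier" referred to above is not a separately trained discriminator: the two class-conditional LMs are turned into a classifier via Bayes' rule, by accumulating log-probabilities of the generated prefix under each class and normalizing. A hedged sketch under a uniform class prior (the names and shapes are our assumptions):

```python
# Sketch of how two class-conditional LMs act as the "GeDi-classifier":
# sum per-token log-probabilities of the prefix under each class, normalize.
import torch

def class_posterior(neutral_prefix_logprobs: torch.Tensor,
                    toxic_prefix_logprobs: torch.Tensor) -> float:
    """Inputs: (seq_len,) per-token log p(token_i | prefix, class).
    Returns p(neutral | prefix) under a uniform class prior."""
    log_joint = torch.stack([neutral_prefix_logprobs.sum(),
                             toxic_prefix_logprobs.sum()])
    return torch.softmax(log_joint, dim=0)[0].item()

# Toy example: a 4-token prefix that the neutral-conditioned LM finds likelier.
p = class_posterior(torch.tensor([-1.0, -2.0, -1.5, -0.5]),
                    torch.tensor([-3.0, -2.5, -4.0, -1.0]))
print(f"p(neutral | prefix) = {p:.3f}")
```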
“…Additionally, STRAP has been shown to require large amounts of style-specific training data (Patel, Andrews, and Callison-Burch 2022). Prior work has explored applying controllable text generation techniques to style transfer (Dale et al. 2021; Kumar et al. 2021; Mireshghallah, Goyal, and Berg-Kirkpatrick 2022). Our approach is most similar in spirit to Mireshghallah, Goyal, and Berg-Kirkpatrick (2022).…”
Section: Related Work
confidence: 99%