Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-long.522

DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts

Abstract: Despite recent advances in natural language generation, it remains challenging to control attributes of generated text. We propose DEXPERTS: Decoding-time Experts, a decoding-time method for controlled text generation that combines a pretrained language model with "expert" LMs and/or "anti-expert" LMs in a product of experts. Intuitively, under the ensemble, tokens only get high probability if they are considered likely by the experts and unlikely by the anti-experts. We apply DEXPERTS to language detoxificati…
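To make the product-of-experts intuition in the abstract concrete, below is a minimal decoding-step sketch, assuming GPT-2-family causal LMs loaded through HuggingFace transformers; the expert/anti-expert checkpoint paths and the weight alpha are illustrative placeholders rather than artifacts released with the paper.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
base = AutoModelForCausalLM.from_pretrained("gpt2-large")
expert = AutoModelForCausalLM.from_pretrained("path/to/nontoxic-expert")        # hypothetical checkpoint
antiexpert = AutoModelForCausalLM.from_pretrained("path/to/toxic-antiexpert")   # hypothetical checkpoint

def dexperts_step(input_ids, alpha=2.0):
    """Combine next-token logits as base + alpha * (expert - anti-expert)."""
    with torch.no_grad():
        z_base = base(input_ids).logits[:, -1, :]
        z_exp = expert(input_ids).logits[:, -1, :]
        z_anti = antiexpert(input_ids).logits[:, -1, :]
    return torch.softmax(z_base + alpha * (z_exp - z_anti), dim=-1)

prompt = tokenizer("The movie was", return_tensors="pt").input_ids
probs = dexperts_step(prompt)                          # distribution over the next token
next_token = torch.multinomial(probs, num_samples=1)   # sample one continuation token

The expert is typically fine-tuned on text exhibiting the desired attribute (e.g., non-toxic text) and the anti-expert on the undesired attribute, so the ensemble boosts tokens the expert favors and the anti-expert disfavors.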

Cited by 69 publications (148 citation statements) | References 34 publications (42 reference statements)
“…Decoding-time methods (Dathathri et al., 2019; Gehman et al., 2020; Schick et al., 2021; Krause et al., 2020; Xu et al., 2021; Liu et al., 2021a) focus on manipulating the decoding-time behavior of the LMs without changing the model parameters. Simple approaches such as word filtering and vocabulary shifting (Gehman et al., 2020) directly lower the probability of toxic words (e.g., swearwords, slurs, vulgar slang) being generated.…”
Section: Existing Detoxification Methods
confidence: 99%
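As an illustration of the word filtering and vocabulary shifting baselines mentioned in the excerpt above, a minimal sketch follows; the banned_token_ids argument and the penalty value are hypothetical, and the snippet is not taken from any of the cited works.

import torch

def filter_or_shift(next_token_logits, banned_token_ids, penalty=None):
    """Word filtering sets banned tokens to -inf; vocabulary shifting only
    subtracts a fixed penalty so they become unlikely but not impossible."""
    logits = next_token_logits.clone()            # 1-D tensor over the vocabulary
    if penalty is None:
        logits[banned_token_ids] = float("-inf")  # hard word filtering
    else:
        logits[banned_token_ids] -= penalty       # soft vocabulary shifting
    return logits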
“…However, it requires an external LM trained on non-toxic data, which is not easy to access in practice. DEXPERTS (Liu et al., 2021a) controls the generation of a large-scale pre-trained LM with an "expert" LM and an "anti-expert" LM in a product of experts (Hinton, 2002), which achieves the state-of-the-art detoxification results to date. In this work, we focus on exploring the limits of domain-adaptive training methods for reducing the toxicity of language models, because they have the advantages that 1) they achieve time- and memory-efficient inference, which is especially important for deploying large-scale LMs, 2) the detoxified LM checkpoints can be flexibly shared for future downstream tasks, and 3) they can largely reduce model toxicity while still maintaining good LM quality as measured by perplexity and downstream task performance, as we will show in the following section.…”
Section: Existing Detoxification Methods
confidence: 99%
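For reference, the product-of-experts combination this excerpt refers to can be written as follows; the notation is ours rather than quoted from the paper, with alpha denoting the control strength.

\tilde{P}(x_t \mid x_{<t}) \;\propto\; P_{\text{base}}(x_t \mid x_{<t}) \left( \frac{P_{\text{expert}}(x_t \mid x_{<t})}{P_{\text{anti}}(x_t \mid x_{<t})} \right)^{\alpha}

Equivalently, the combination can be applied in logit space as z_base + alpha * (z_expert - z_anti) followed by a softmax, which is how the decoding-step sketch after the abstract implements it.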
“…Specifically, these methods are compatible with any pre-trained language model for generation without additional training. Given the recent development of inference-time methods for control that can reduce toxicity (e.g., PPLM (Dathathri et al., 2019), GeDi (Krause et al., 2020), DExperts (Liu et al., 2021)), there is potential for extending these methods to bias mitigation. Bias Mitigation: for autocomplete and dialogue generation, bias triggers are formulated using the gradient-based methods of Wallace et al. (2019).…”
Section: Inference Methods
confidence: 99%
“…Inspired by GeDi [Krause et al. 2020], a number of similar works have emerged. DEXPERTS [Liu et al. 2021c] re-ranks the predictions of the PLM based on expert (and anti-expert) opinions during the decoding stage to steer the language model towards the desired generation. FUDGE [Yang and Klein 2021] learns an attribute predictor operating on a partial sequence to adjust the original PLM's probabilities, obtaining improved performance on couplet completion in poetry, topic control in language generation, and formality change in machine translation.…”
Section: Post-processing
confidence: 99%
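A rough sketch of the FUDGE-style reweighting described in the excerpt above, assuming log-probabilities from any causal LM and a partial-sequence attribute predictor scored per candidate token; both input tensors are placeholders rather than outputs of a specific library.

import torch

def fudge_reweight(lm_log_probs, attr_log_probs):
    """lm_log_probs: log P_LM(x_t | prefix) over the vocabulary.
    attr_log_probs: log P(attribute holds | prefix + x_t), one score per
    candidate token from the partial-sequence predictor."""
    combined = lm_log_probs + attr_log_probs    # Bayes-style combination in log space
    return torch.softmax(combined, dim=-1)      # renormalized next-token distribution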