Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-long.522

DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts

Abstract: Despite recent advances in natural language generation, it remains challenging to control attributes of generated text. We propose DEXPERTS: Decoding-time Experts, a decoding-time method for controlled text generation that combines a pretrained language model with "expert" LMs and/or "anti-expert" LMs in a product of experts. Intuitively, under the ensemble, tokens only get high probability if they are considered likely by the experts and unlikely by the anti-experts. We apply DEXPERTS to language detoxificati…
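To make the product-of-experts intuition in the abstract concrete, below is a minimal decoding-step sketch, assuming GPT-2-family causal LMs loaded through HuggingFace transformers; the expert/anti-expert checkpoint paths and the weight alpha are illustrative placeholders rather than artifacts released with the paper.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
base = AutoModelForCausalLM.from_pretrained("gpt2-large")
expert = AutoModelForCausalLM.from_pretrained("path/to/nontoxic-expert")        # hypothetical checkpoint
antiexpert = AutoModelForCausalLM.from_pretrained("path/to/toxic-antiexpert")   # hypothetical checkpoint

def dexperts_step(input_ids, alpha=2.0):
    """Combine next-token logits as base + alpha * (expert - anti-expert)."""
    with torch.no_grad():
        z_base = base(input_ids).logits[:, -1, :]
        z_exp = expert(input_ids).logits[:, -1, :]
        z_anti = antiexpert(input_ids).logits[:, -1, :]
    return torch.softmax(z_base + alpha * (z_exp - z_anti), dim=-1)

prompt = tokenizer("The movie was", return_tensors="pt").input_ids
probs = dexperts_step(prompt)                          # distribution over the next token
next_token = torch.multinomial(probs, num_samples=1)   # sample one continuation token

The expert is typically fine-tuned on text exhibiting the desired attribute (e.g., non-toxic text) and the anti-expert on the undesired attribute, so the ensemble boosts tokens the expert favors and the anti-expert disfavors.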

Cited by 69 publications (148 citation statements) | References 34 publications (42 reference statements)
“…Decoding-time methods (Dathathri et al., 2019; Gehman et al., 2020; Schick et al., 2021; Krause et al., 2020; Xu et al., 2021; Liu et al., 2021a) focus on manipulating the decoding-time behavior of the LMs without changing the model parameters. Simple approaches such as word filtering and vocabulary shifting (Gehman et al., 2020) directly lower the probability of toxic words (e.g., swearwords, slurs, vulgar slang) being generated.…”
Section: Existing Detoxification Methods
confidence: 99%
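As an illustration of the word filtering and vocabulary shifting baselines mentioned in the excerpt above, a minimal sketch follows; the banned_token_ids argument and the penalty value are hypothetical, and the snippet is not taken from any of the cited works.

import torch

def filter_or_shift(next_token_logits, banned_token_ids, penalty=None):
    """Word filtering sets banned tokens to -inf; vocabulary shifting only
    subtracts a fixed penalty so they become unlikely but not impossible."""
    logits = next_token_logits.clone()            # 1-D tensor over the vocabulary
    if penalty is None:
        logits[banned_token_ids] = float("-inf")  # hard word filtering
    else:
        logits[banned_token_ids] -= penalty       # soft vocabulary shifting
    return logits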
“…However, it requires an external LM trained on non-toxic data, which is not easy to access in practice. DEXPERTS (Liu et al., 2021a) controls the generation of a large-scale pre-trained LM with an "expert" LM and an "anti-expert" LM in a product of experts (Hinton, 2002), which achieves the state-of-the-art detoxification results to date. In this work, we focus on exploring the limits of domain-adaptive training methods for reducing the toxicity of language models, because they have the advantages that 1) they achieve time- and memory-efficient inference, which is especially important for deploying large-scale LMs, 2) the detoxified LM checkpoints can be flexibly shared for future downstream tasks, and 3) they can largely reduce model toxicity while still maintaining good LM quality as measured by perplexity and downstream task performance, as we will show in the following section.…”
Section: Existing Detoxification Methods
confidence: 99%
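For reference, the product-of-experts combination this excerpt refers to can be written as follows; the notation is ours rather than quoted from the paper, with alpha denoting the control strength.

\tilde{P}(x_t \mid x_{<t}) \;\propto\; P_{\text{base}}(x_t \mid x_{<t}) \left( \frac{P_{\text{expert}}(x_t \mid x_{<t})}{P_{\text{anti}}(x_t \mid x_{<t})} \right)^{\alpha}

Equivalently, the combination can be applied in logit space as z_base + alpha * (z_expert - z_anti) followed by a softmax, which is how the decoding-step sketch after the abstract implements it.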
“…Specifically, these methods are compatible with any pre-trained language model for generation without additional training. Given the recent development of inference-time methods for control that can reduce toxicity (e.g., PPLM (Dathathri et al., 2019), GeDi (Krause et al., 2020), DExperts (Liu et al., 2021)), there is potential for extending these methods to bias mitigation. Bias Mitigation: for autocomplete and dialogue generation, bias triggers are formulated using the gradient-based methods of Wallace et al. (2019).…”
Section: Inference Methods
confidence: 99%
“…Inspired by GeDi [Krause et al. 2020], a number of similar works have emerged. DEXPERTS [Liu et al. 2021c] re-ranks the predictions of the PLM based on expert (and anti-expert) opinions during the decoding stage to steer the language model towards the desired generation. FUDGE [Yang and Klein 2021] learns an attribute predictor operating on a partial sequence to adjust the original PLM's probabilities, obtaining improved performance on couplet completion in poetry, topic control in language generation, and formality change in machine translation.…”
Section: Post-processing
confidence: 99%
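A rough sketch of the FUDGE-style reweighting described in the excerpt above, assuming log-probabilities from any causal LM and a partial-sequence attribute predictor scored per candidate token; both input tensors are placeholders rather than outputs of a specific library.

import torch

def fudge_reweight(lm_log_probs, attr_log_probs):
    """lm_log_probs: log P_LM(x_t | prefix) over the vocabulary.
    attr_log_probs: log P(attribute holds | prefix + x_t), one score per
    candidate token from the partial-sequence predictor."""
    combined = lm_log_probs + attr_log_probs    # Bayes-style combination in log space
    return torch.softmax(combined, dim=-1)      # renormalized next-token distribution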