Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
DOI: 10.18653/v1/2023.acl-long.493

The CRINGE Loss: Learning what language not to model

Abstract: Standard language model training employs gold human documents or human-human interaction data, and treats all training data as positive examples. Growing evidence shows that even with very large amounts of positive training data, issues remain that can be alleviated with relatively small amounts of negative data, i.e., examples of what the model should not do. In this work, we propose a novel procedure to train with such data called the CRINGE loss (ContRastive Iterative Negative GEneration). We show the effectiveness…
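
The abstract describes training against negative examples with a contrastive objective, but the excerpt above does not spell out the mechanics. The following is only a rough sketch of a contrastive negative-token loss in that spirit: for each token of a sequence labelled as negative, its logit is contrasted against a "positive" token sampled from the model's own top-k predictions at the same position. The function name, the top-k sampling choice, and the value of k are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_negative_loss(logits, negative_tokens, k=5):
    """Sketch of a contrastive loss on a sequence labelled as negative.

    For each position, the token from the negative sequence is contrasted
    against a "positive" token sampled from the model's own top-k
    predictions, pushing probability mass away from the negative token.

    logits:          (seq_len, vocab) model outputs for the negative sequence
    negative_tokens: (seq_len,) token ids of the negative example
    """
    topk_vals, _ = logits.topk(k, dim=-1)                     # (seq_len, k)
    # Sample one contrast logit per position from the model's top-k.
    probs = F.softmax(topk_vals, dim=-1)
    sampled = torch.multinomial(probs, num_samples=1)         # (seq_len, 1)
    pos_logit = topk_vals.gather(-1, sampled).squeeze(-1)     # (seq_len,)
    neg_logit = logits.gather(-1, negative_tokens.unsqueeze(-1)).squeeze(-1)
    # Binary contrastive objective: prefer the sampled token over the negative one.
    pair = torch.stack([pos_logit, neg_logit], dim=-1)        # (seq_len, 2)
    target = torch.zeros_like(negative_tokens)                # index 0 = positive logit
    return F.cross_entropy(pair, target)
```

In practice such a term would be added to the standard cross-entropy loss on positive data, and, as the name "Iterative Negative GEneration" suggests, the procedure would repeatedly generate from the model, label the generations, and feed the undesirable ones back in as new negatives.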

Cited by 2 publications (3 citation statements)
References 19 publications (40 reference statements)

Citation statements (ordered by relevance):

“…Unlikelihood training has been used in controllable text generation applications to avoid undesirable tokens with a high probability (Welleck et al, 2019). CLICK (Zheng et al, 2023), SLiC (Zhao et al, 2022), BRIO and CRINGE (Adolphs et al, 2022) also use unlikelihood training for various text generation applications such as summarization and sentiment control. Despite the popularity of unlikelihood training in text generation, it has not been widely applied to the text style transfer task.…”
Section: Penalizing Negative Examples
Citation type: mentioning
Confidence: 99%
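
The passage above refers to unlikelihood training (Welleck et al.), which penalizes the probability mass a model places on unwanted tokens. A minimal sketch of that term, assuming per-position negative token ids are available; variable names are illustrative:

```python
import torch

def unlikelihood_loss(logits, negative_tokens, eps=1e-8):
    """Unlikelihood term: penalize probability assigned to unwanted tokens.

    logits:          (seq_len, vocab) model outputs
    negative_tokens: (seq_len,) ids of tokens the model should NOT produce
    """
    probs = torch.softmax(logits, dim=-1)
    p_neg = probs.gather(-1, negative_tokens.unsqueeze(-1)).squeeze(-1)
    # -log(1 - p) grows as the model puts more mass on the unwanted token.
    return -torch.log((1.0 - p_neg).clamp_min(eps)).mean()
```

This term is typically mixed with the ordinary maximum-likelihood loss on positive data rather than used on its own.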

“…F1. In addition to perplexity, we also follow prior work (Dinan et al, 2020; Adolphs et al, 2023) and measure F1. Namely, using 2,000 Wikipedia sentences as prompts, we measure the harmonic mean between precision and recall of our model's output, where precision is the fraction of […] Note that our interventions depend on how much we scale each vector (α).…”
Section: Interventions Using Toxic Vectors
Citation type: mentioning
Confidence: 99%
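
The quote above is cut off before it finishes defining precision. For reference, the F1 used in this line of work (e.g., Dinan et al., 2020) is usually a unigram F1 between the model's output and a gold reference; the sketch below assumes simple whitespace tokenization and lower-casing, which may differ from the cited papers' exact preprocessing.

```python
from collections import Counter

def unigram_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall (whitespace tokenization)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```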

“…Perhaps most commonly, human feedback data is used (Stiennon et al, 2020; Ouyang et al, 2022; Touvron et al, 2023) for methods such as PPO (Schulman et al, 2017) or DPO (Rafailov et al, 2023). When labels for only undesirable behavior is available, algorithms like unlikelihood training (Welleck et al, 2020) or Cringe (Adolphs et al, 2023; Xu et al, 2023) can be used. We study DPO because it is easy to use and currently widely used.…”
Section: Alignment Algorithms
Citation type: mentioning
Confidence: 99%
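
The passage contrasts preference-based alignment methods with negative-example losses such as unlikelihood and CRINGE. As a point of reference, here is a minimal sketch of the DPO objective it mentions (Rafailov et al., 2023); the beta value and the assumption that per-response log-probabilities are precomputed are illustrative choices.

```python
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a tensor of summed log-probabilities of a full response
    under the trainable policy or the frozen reference model.
    """
    chosen_reward = beta * (policy_logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (policy_logp_rejected - ref_logp_rejected)
    # Maximize the margin between chosen and rejected responses.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```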