2022
DOI: 10.48550/arxiv.2203.07472
Preprint

Uncertainty Estimation for Language Reward Models

Abstract: Language models can learn a range of capabilities from unsupervised training on text corpora. However, to solve a particular problem (such as text summarization) it is typically necessary to fine-tune them on a task-specific dataset. It is often easier for humans to choose between options than to provide labeled data, and prior work has achieved state-of-the-art performance by training a reward model from such preference comparisons. However, collecting a large preference comparison dataset is still expensive…

Cited by 2 publications (5 citation statements)
References 26 publications
“…Still, it can be safely stated already that the two Softmax Ensemble techniques (KLD and VE) perform far worse than any other technique. This confirms the work of Gleave and Irving (2022), where the usefulness of softmax ensemble methods for Transformer models was investigated, with the same conclusion as ours: ensemble methods perform far worse than even random sampling for Transformer models.…”
Section: Test Accuracies (supporting)
confidence: 91%
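The two softmax-ensemble scores named in the quoted statement (KLD and VE) both measure disagreement between ensemble members' softmax outputs. A minimal NumPy sketch is given below; the variance-based and KL-based formulas used here are illustrative assumptions chosen to match the names, not the exact definitions from the cited papers.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_softmax_uncertainty(member_logits):
    """Per-sample uncertainty from an ensemble of softmax classifiers.

    member_logits: array of shape (n_members, n_samples, n_classes).
    Returns a VE-style score (mean variance of class probabilities across
    members) and a KLD-style score (mean KL divergence of each member from
    the ensemble mean). Both are zero when all members agree exactly.
    """
    probs = softmax(member_logits)                  # (M, N, C)
    mean_probs = probs.mean(axis=0)                 # (N, C)
    # VE-style: variance across members, averaged over classes.
    ve = probs.var(axis=0).mean(axis=-1)            # (N,)
    # KLD-style: KL(member || ensemble mean), averaged over members.
    kld = (probs * (np.log(probs + 1e-12)
                    - np.log(mean_probs + 1e-12))).sum(axis=-1).mean(axis=0)
    return ve, kld
```

Higher scores flag samples where ensemble members disagree; the quoted finding is that, for Transformer models, ranking samples by such scores underperforms even random sampling.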
“…The aforementioned uncertainty measures can directly be used in AL strategies to sort the pool of unlabeled samples, selecting exactly those samples for labeling that have the lowest confidence/highest uncertainty. As repeatedly reported by others (Karamcheti et al., 2021; Gleave and Irving, 2022), … We propose three easily implementable methods for improving uncertainty-based AL strategies by preventing potentially harmful outliers from being selected for labeling. An uncertainty-based AL strategy always selects those samples for labeling first where the uncertainty is the highest.…”
Section: Uncertainty-Clipping (UC) (mentioning)
confidence: 86%
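The selection rule described in the quoted statement — label the highest-uncertainty samples first, but skip the extreme tail where harmful outliers concentrate — can be sketched as follows. The `clip_fraction` parameter and the skip-the-top clipping scheme are illustrative assumptions in the spirit of the quoted uncertainty-clipping idea, not the cited method's exact procedure.

```python
import numpy as np

def select_for_labeling(uncertainties, batch_size, clip_fraction=0.0):
    """Pick indices of unlabeled samples for annotation.

    uncertainties: 1-D array, one uncertainty score per unlabeled sample.
    batch_size: number of samples to select.
    clip_fraction: fraction of the most-uncertain samples to skip before
        selecting, to avoid outliers at the extreme tail (0.0 disables
        clipping and recovers plain uncertainty-based selection).
    """
    order = np.argsort(uncertainties)[::-1]       # most uncertain first
    n_clip = int(len(order) * clip_fraction)      # outliers to skip
    return order[n_clip:n_clip + batch_size]
```

With `clip_fraction=0.0` this is the standard uncertainty-based strategy the statement describes; a small positive value drops the very top of the ranking before filling the labeling batch.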
“…While scholars have studied model uncertainty, prior work has focused on more accurately extracting model confidence (Kuhn et al., 2023; Sun et al., 2022; Gleave and Irving, 2022), measuring model calibration (Kwiatkowski et al., 2019b; Radford et al., 2019; Liang et al., 2023), and improving it. … (2022) teach models to be linguistically calibrated when answering math questions.…”
Section: Related Work (mentioning)
confidence: 99%