Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.157

How Certain is Your Transformer?

Abstract: In this work, we consider the problem of uncertainty estimation for Transformer-based models. We investigate the applicability of uncertainty estimates based on dropout usage at the inference stage (Monte Carlo dropout). The series of experiments on natural language understanding tasks shows that the resulting uncertainty estimates improve the quality of detection of error-prone instances. Special attention is paid to the construction of computationally inexpensive estimates via Monte Carlo dropout and Determi…

Citations: cited by 23 publications (15 citation statements)
References: 23 publications (20 reference statements)
“…(He et al 2020) combined mix-up, self-ensembling and dropout to achieve more accurate uncertainty score for text classification. (Shelmanov et al 2021) proposed to incorporate determinantal point process (DPP) to MC dropout to quantify the uncertainty of transformers. Different to the above-mentioned approaches, we inject stochasticity into the vanilla transformer with Gumbel-Softmax tricks.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although transformers show excellent capability in processing long sequences of data, one of their main drawbacks is that they are not able to provide mathematically grounded estimates of their uncertainty for predictions. To address this issue, Bayesian transformers have been proposed [22,25,32] with the ability to quantify their uncertainty. Among various Bayesian approaches, Monte Carlo Dropout (MCD) [9] has become a wide-spread Bayesian inference scheme [7,22,25,27].…”
Section: Background and Related Work 2.1 Bayesian Transformer (mentioning)
confidence: 99%
“…Invalid uncertainty estimates can result in overconfident and uncalibrated decisions, which present hazards for deploying NNs in safety-critical applications such as in healthcare or autonomous driving [12,16]. To overcome this drawback, Bayesian transformers [11,22,32] have been introduced with the mathematical grounding for reliable uncertainty estimation. An illustrative example is presented in Figure 1.…”
Section: Introduction (mentioning)
confidence: 99%
“…The other popular alternative is dropout which adds stochasticity to a standard neural network via randomly setting some of the weights to zero. This technique leads to the regularization of training [22] and can provide uncertainty estimates if applied at prediction time [6,24,23,21].…”
Section: Introduction and Related Work (mentioning)
confidence: 99%
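
The idea the statements above keep returning to, leaving dropout switched on at prediction time and reading uncertainty off the spread of repeated stochastic passes, can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the method of the paper or of any citing work: the two-layer toy classifier, the function names (`forward`, `mc_dropout`), the weights, and the choice of scores (one minus the maximum mean probability, and the mean per-class variance) are all hypothetical, and the dropout mask is applied to hidden activations, which is the usual form of dropout.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, W1, W2, p_drop, rng):
    """One stochastic pass of a toy 2-layer classifier with dropout left on."""
    hidden = []
    for j in range(len(W1[0])):
        a = max(0.0, sum(x[i] * W1[i][j] for i in range(len(x))))  # ReLU unit
        keep = rng.random() >= p_drop
        # Inverted dropout: rescale kept units so the expected activation is unchanged.
        hidden.append(a / (1.0 - p_drop) if keep else 0.0)
    logits = [
        sum(hidden[j] * W2[j][k] for j in range(len(hidden)))
        for k in range(len(W2[0]))
    ]
    return softmax(logits)


def mc_dropout(x, W1, W2, p_drop=0.1, n_samples=20, seed=0):
    """Average n_samples stochastic passes and derive two uncertainty scores.

    Assumes 0 <= p_drop < 1.
    """
    rng = random.Random(seed)
    samples = [forward(x, W1, W2, p_drop, rng) for _ in range(n_samples)]
    n_classes = len(samples[0])
    mean_probs = [sum(s[k] for s in samples) / n_samples for k in range(n_classes)]
    # Score 1: how far the averaged prediction is from full confidence.
    max_prob_uncertainty = 1.0 - max(mean_probs)
    # Score 2: per-class variance across passes, averaged over classes.
    variance = sum(
        sum((s[k] - mean_probs[k]) ** 2 for s in samples) / n_samples
        for k in range(n_classes)
    ) / n_classes
    return mean_probs, max_prob_uncertainty, variance


# Hypothetical weights for a 2-input, 3-hidden-unit, 2-class model.
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
W2 = [[1.0, -1.0], [0.5, 0.2], [-0.3, 0.7]]
mean_probs, uncertainty, variance = mc_dropout([1.0, 2.0], W1, W2, p_drop=0.5, n_samples=50)
```

Inputs with a higher score would be the ones flagged as error-prone. With `p_drop=0.0` every pass is identical, so the variance collapses to zero and only the maximum-probability score remains informative.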