2019
DOI: 10.48550/arxiv.1911.13273
Preprint

Confidence Calibration and Predictive Uncertainty Estimation for Deep Medical Image Segmentation

Alireza Mehrtash,
William M. Wells,
Clare M. Tempany
et al.

Abstract: Fully convolutional neural networks (FCNs), and in particular U-Nets, have achieved state-of-the-art results in semantic segmentation for numerous medical imaging applications. Moreover, batch normalization and Dice loss have been used successfully to stabilize and accelerate training. However, these networks are poorly calibrated, i.e., they tend to produce overconfident predictions in both correct and erroneous classifications, making them unreliable and hard to interpret. In this paper, we study predictive un…

Citations: cited by 4 publications (4 citation statements)
References: 36 publications (54 reference statements)
“…The authors recommend the use of this uncertainty measure compared to variance based ones since it is more interpretable as the range of uncertainty values is between 0 and 1 (0 very certain, 1 very uncertain). In another work, Mehrtash et al (2019) compared calibrated and uncalibrated segmenta-…”
Section: Uncertainty Analysis (mentioning)
confidence: 99%
“…Other Approaches. Mehrtash et al [38] found that model ensembling improves confidence calibration for medical image segmentation. Karimi et al [26] showed that multi-task learning could yield better-calibrated predictions than dedicated models trained separately.…”
Section: Related Work (mentioning)
confidence: 99%
“…The following key limitations of existing calibration approaches still need to be addressed: (1) Most of the probability calibration approaches are designed for classifications, thus are not guaranteed to work well for semantic segmentation; (2) Most methods are designed to work for binary classifications and approach multi-class problems by a decomposition into k one-vs-rest binary calibrations (where k denotes the number of classes). However, such a decomposition does not guarantee overall calibration (only for the individual subproblems before normalization) and the classification accuracy of the trained model may change after calibration as the probability order of labels may change; (3) Limited work discusses the probability calibration of semantic segmentation, but this work either only applies to specific types of models (e.g., Bayesian neural networks [24]) or only implicitly improves calibration performance (e.g., via model ensembling [38] or multi-task learning [26]).…”
Section: Introduction (mentioning)
confidence: 99%
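
Limitation (2) in the statement above can be illustrated with a toy example. The sketch below is not taken from any of the cited papers: the per-class power maps, the exponents, and the helper names `calibrate_one_vs_rest` and `argmax` are hypothetical and chosen only for illustration. Each per-class map is monotone, so it preserves the ordering within its own one-vs-rest subproblem, yet the predicted label can still change once the calibrated scores are renormalized.

```python
# Toy illustration (assumed setup, not a method from the cited works):
# one-vs-rest calibration applies a separate monotone map to each class score,
# then renormalizes so the scores sum to 1.

def calibrate_one_vs_rest(probs, exponents):
    """Apply a hypothetical per-class map p -> p**a, then renormalize."""
    calibrated = [p ** a for p, a in zip(probs, exponents)]
    total = sum(calibrated)
    return [c / total for c in calibrated]

def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Uncalibrated class probabilities for a single pixel (3 classes).
probs = [0.50, 0.45, 0.05]

# Hypothetical calibration exponents: class 0 is judged overconfident
# and shrunk (a = 2), the other classes are left unchanged (a = 1).
exponents = [2.0, 1.0, 1.0]

renormalized = calibrate_one_vs_rest(probs, exponents)

# The predicted label moves from class 0 to class 1 after calibration.
print(argmax(probs), argmax(renormalized))  # prints: 0 1
```

In this toy case the calibrated scores before normalization are 0.25, 0.45, and 0.05, so class 1 overtakes class 0 even though no individual map reorders anything within its own binary subproblem.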
“…In the past years, UQ for image segmentation and object detection has been initiated in a series of works [16][17][18][19][20][21] focusing on false positive instances and false negative instances [22,23], respectively, we also refer to a survey [17]. While many works focus on street scene recognition for autonomous driving, [16,24] cover medical image segmentation applications as well.…”
Section: Introduction (mentioning)
confidence: 99%